v1 instance APIs as per RFD-322 #1957

just-be-dev · 2022-11-17T20:33:43Z

This PR establishes the new pattern for how the v1 of oxide's external API will be shaped. The details here are driven by the determinations of RFD-322.

To recap RFD-322, we want our API to be more ID focused but to also support names. Through the resolution of that RFD we've decided to go with a flatter API that uses query parameters to support names if needed.

The notion of being ID first is something we'd like to flow through the rest of Nexus too. Where possible, IDs should always be favored over names.

Overview

Instead of breaking the old API, the new, flattened endpoints are being added under the /v1/ path, which we agreed to introduce in RFD-299.
Establishes the practice that names don't flow beyond the http_entrypoints file. New <resource>_lookup methods are introduced that resolve to the authz identifier of a given resource, and that identifier is passed around instead.
Introduces a notion of selectors (i.e. ProjectSelector and InstanceSelector), which are data structures that store all possible representations of the parameters required to look up a resource.

Example

Old way of fetching instance

GET /organizations/<org_name>/projects/<project_name>/instances/<instance_name>
GET /by-id/instances/<instance_id>

New ways of fetching instance

GET /v1/instances/<instance_id>
GET /v1/instances/<instance_name>?project=<project_id>
GET /v1/instances/<instance_name>?project=<project_name>&organization=<org_id>
GET /v1/instances/<instance_name>?project=<project_name>&organization=<org_name>

Old way of creating an instance

POST /organizations/<org_name>/projects/<project_name>/instances { ...body }

New way of creating an instance

POST /v1/instances?project=<project_id> { ...body }
POST /v1/instances?project=<project_name>&organization=<org_id> { ...body }
POST /v1/instances?project=<project_name>&organization=<org_name> { ...body }

Note that query parameters are always used to scope down an API call regardless of its method.
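
As a rough illustration of how the four GET forms above collapse into one selector, here is a minimal standard-library sketch. The names `looks_like_uuid` and `instance_selector` are hypothetical, and the ID is kept as a plain string rather than a real UUID type; the actual types in this PR live in params.rs and may differ.

```rust
/// Stand-in for the real `NameOrId`; the ID is kept as a string here to
/// avoid depending on a UUID library.
#[derive(Debug, Clone, PartialEq)]
pub enum NameOrId {
    Id(String),
    Name(String),
}

/// Hypothetical helper: true if `s` has the 8-4-4-4-12 hex shape of a UUID.
fn looks_like_uuid(s: &str) -> bool {
    let groups: Vec<&str> = s.split('-').collect();
    let lens = [8, 4, 4, 4, 12];
    groups.len() == lens.len()
        && groups
            .iter()
            .zip(lens.iter())
            .all(|(g, &n)| g.len() == n && g.chars().all(|c| c.is_ascii_hexdigit()))
}

impl NameOrId {
    /// Anything shaped like a UUID is treated as an ID, everything else as a name.
    pub fn parse(s: &str) -> NameOrId {
        if looks_like_uuid(s) {
            NameOrId::Id(s.to_string())
        } else {
            NameOrId::Name(s.to_string())
        }
    }
}

/// The path segment plus the optional `project` and `organization`
/// query parameters, each of which may be a name or an ID.
#[derive(Debug, PartialEq)]
pub struct InstanceSelector {
    pub instance: NameOrId,
    pub project: Option<NameOrId>,
    pub organization: Option<NameOrId>,
}

pub fn instance_selector(
    path_segment: &str,
    project: Option<&str>,
    organization: Option<&str>,
) -> InstanceSelector {
    InstanceSelector {
        instance: NameOrId::parse(path_segment),
        project: project.map(NameOrId::parse),
        organization: organization.map(NameOrId::parse),
    }
}
```

With this shape, `GET /v1/instances/<instance_id>` yields a selector with no project or organization, while the name-based forms fill in whichever query parameters were supplied.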

common/src/api/external/mod.rs

nexus/src/external_api/http_entrypoints.rs

just-be-dev · 2022-11-20T23:45:50Z

nexus/src/external_api/http_entrypoints.rs

+        let project_id = nexus
+            .project_lookup_id(
                &opctx,
-                &organization_name,
-                &project_name,
-                &new_instance_params,
+                params::ProjectSelector::ProjectAndOrg {
+                    project_name: project_name.clone().into(),
+                    organization_name: organization_name.clone().into(),
+                },
            )
            .await?;
+        let instance = nexus
+            .project_create_instance(&opctx, &project_id, &new_instance_params)
+            .await?;


One of the implicit goals of RFD-322 is to reduce our overall reliance on names in Nexus. Effectively we only want names to be utilities to simplify lookups for humans but to not flow deeper down than API level. With this new selector lookup approach, we resolve names to relevant ids as the first step. Internal nexus apis should therefore only take IDs.

just-be-dev · 2022-11-20T23:55:21Z

nexus/src/external_api/http_entrypoints.rs

+            .instance_lookup_id(
+                &opctx,
+                query.selector.to_instance_selector(path.instance.into()),
+            )
+            .await?;


Even though these top-level instance endpoints take in the project selector via query params, we transform it into an instance selector for more uniform matching. This does mean that the instance_lookup_id method isn't aware of where it's being called from and thus can only return a generic error message. We should probably have a custom error type for this method so that the call site here can give the user accurate error feedback (e.g. you're missing a query param).
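
One possible shape for such an error type, as a sketch only. `SelectorError` and `status_code` are hypothetical names and not part of this PR; the point is that the lookup can report what was missing and the call site can decide how to present it.

```rust
use std::fmt;

/// Hypothetical error type for a selector-based lookup.
#[derive(Debug, PartialEq)]
pub enum SelectorError {
    /// A resource was given by name without the query parameter that scopes it.
    MissingQueryParam { resource: &'static str, param: &'static str },
    /// The selector was well formed but nothing matched.
    NotFound { resource: &'static str },
}

impl fmt::Display for SelectorError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SelectorError::MissingQueryParam { resource, param } => write!(
                f,
                "{} was given by name, so the `{}` query parameter is required",
                resource, param
            ),
            SelectorError::NotFound { resource } => write!(f, "{} not found", resource),
        }
    }
}

impl std::error::Error for SelectorError {}

/// Maps the error to an HTTP status code at the call site.
pub fn status_code(err: &SelectorError) -> u16 {
    match err {
        SelectorError::MissingQueryParam { .. } => 400,
        SelectorError::NotFound { .. } => 404,
    }
}
```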

just-be-dev · 2022-11-21T00:14:17Z

nexus/src/external_api/http_entrypoints.rs

+    #[serde(flatten)]
+    selector: params::ProjectSelector,


I think the flatten here causes there to be duplicates in the output because the selector is an untagged enum. Need to figure out how to fix that.

Looks like this is a serde issue: serde-rs/serde#2200

Actually, I believe I was incorrect. I think this is likely a schemars issue. I was able to work around it in d0a19c6 by using #[schemars(skip)] for the duplicate names, but we should dig into where this is failing and make the relevant PR.

ahl

still need to look at the meat of this PR, but wanted to post this so I didn't forget

common/src/api/external/mod.rs

david-crespo · 2022-11-25T17:14:08Z

nexus/types/src/external_api/params.rs

+        project_name: Name,
+        organization_name: Name,
+    },
+    None {},


Does this None need to be built into the enum because of how deserialization works? Otherwise it seems like a natural spot for Option or Result. Alternatively, I feel like in the None case the endpoint should automatically 400 at the parsing step, which would mean you don't need the None in the enum itself.

So an untagged enum tries to match all of the options and errors out if it can't find a match. I think wrapping it in an Option might be viable. I didn't take that tack because we always need to match against the enum anyway, and having a None variant (which essentially means there were no matches in the untagged enum) makes the handling easier.

If a route has a path parameter and that path parameter is not an ID then the selector having a matching arm is required. Given that two types are involved here the optional isn't really valid. In this case we can actually document None as a part of the API where it's understood at what point you shouldn't include a selector.
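
To make the trade-off concrete, here is a hand-rolled approximation of that matching behaviour using only the standard library. It is not how serde is implemented; it just tries each variant in order and falls back to `None {}`. The `ProjectId` variant and the query parameter names are assumptions for illustration; only `ProjectAndOrg` and `None {}` appear in the diff above.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum ProjectSelector {
    // Assumed variant, not shown in the diff.
    ProjectId { project_id: String },
    ProjectAndOrg { project_name: String, organization_name: String },
    // Catch-all: no other variant matched the supplied parameters.
    None {},
}

/// Tries each variant in declaration order, the way an untagged enum would,
/// and returns the first one whose fields are all present.
pub fn from_query(q: &HashMap<&str, &str>) -> ProjectSelector {
    if let Some(id) = q.get("project_id") {
        return ProjectSelector::ProjectId { project_id: id.to_string() };
    }
    if let (Some(p), Some(o)) = (q.get("project_name"), q.get("organization_name")) {
        return ProjectSelector::ProjectAndOrg {
            project_name: p.to_string(),
            organization_name: o.to_string(),
        };
    }
    ProjectSelector::None {}
}

/// Handlers match on the enum either way, so `None {}` is one more arm
/// rather than an extra layer of `Option`.
pub fn describe(sel: &ProjectSelector) -> &'static str {
    match sel {
        ProjectSelector::ProjectId { .. } => "look up project by id",
        ProjectSelector::ProjectAndOrg { .. } => "look up organization, then project, by name",
        ProjectSelector::None {} => "no selector supplied",
    }
}
```

Note that a partial selector (a project name without an organization) also lands in `None {}` here, which is the case where a 400 at parse time would arguably be more helpful.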

just-be-dev · 2022-12-07T02:36:55Z

nexus/tests/integration_tests/endpoints.rs

-        VerifyEndpoint {
-            url: "/by-id/instances/{id}",
-            visibility: Visibility::Protected,
-            unprivileged_access: UnprivilegedAccess::None,
-            allowed_methods: vec![
-                AllowedMethod::Get,
-            ],
-        },
-


Given that the endpoints are now collapsed this mechanism of having two VerifyEndpoint definitions won't work. Essentially any endpoint that is verified here will be consumed from the spec so if you try to verify the same endpoint twice you'll end up with a situation where it says the spec doesn't contain the other instance of the endpoint (because the previous verify consumed it).

After the API is converted this suite may need to be revisited.
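
A toy model of the failure mode, assuming the test keeps a set of operations from the spec and removes each one as it is verified. `SpecOperations` is a hypothetical stand-in, not the real test harness.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the set of operations read from the OpenAPI spec.
pub struct SpecOperations {
    // (method, url template) pairs that have not been verified yet.
    remaining: BTreeSet<(String, String)>,
}

impl SpecOperations {
    pub fn new(ops: &[(&str, &str)]) -> Self {
        let remaining = ops
            .iter()
            .map(|(m, u)| (m.to_string(), u.to_string()))
            .collect();
        SpecOperations { remaining }
    }

    /// Verifying an operation removes it, so a second verification of the
    /// same (method, url) pair reports it as missing from the spec.
    pub fn verify(&mut self, method: &str, url: &str) -> Result<(), String> {
        if self.remaining.remove(&(method.to_string(), url.to_string())) {
            Ok(())
        } else {
            Err(format!("spec has no unverified operation {} {}", method, url))
        }
    }
}
```

Once by-name and by-id access share a single URL template, the second definition has nothing left to consume.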

just-be-dev · 2022-12-07T03:47:37Z

nexus/tests/integration_tests/unauthorized_coverage.rs

-    assert_eq!(expected_uncovered_endpoints, uncovered_endpoints);
+    let mut unexpected_uncovered_endpoints = "These endpoints were expected to be covered by the unauthorized_coverage test but were not:\n".to_string();
+    let mut has_uncovered_endpoints = false;
+    for endpoint in uncovered_endpoints.lines() {
+        if !expected_uncovered_endpoints.contains(endpoint) {
+            unexpected_uncovered_endpoints
+                .push_str(&format!("\t{}\n", endpoint));
+            has_uncovered_endpoints = true;
+        }
+    }
+    assert_eq!(
+        has_uncovered_endpoints, false,
+        "{}\nMake sure you've added a test for this endpoint in unauthorized.rs.",
+        unexpected_uncovered_endpoints
+    )


I have been on the other end of debugging this test one too many times. If you click through to the comment, it's sort of helpful in explaining what's happening, but the test output is entirely illegible most of the time. This just updates the test output to give you a list of endpoints you need to write tests for instead of an inscrutable mismatch of file contents.

I have been bitten by this before too, THANK YOU for updating to a more clear version

just-be-dev · 2022-12-07T03:48:26Z

nexus/tests/output/uncovered-authz-endpoints.txt

+
+Deprecated API endpoints to be removed at the end of the V1 migration


Because I updated the test not to be an exact file match we can add extra lines of text in here.

just-be-dev · 2022-12-07T17:15:17Z

nexus/tests/integration_tests/endpoints.rs

    pub static ref DEMO_INSTANCE_NICS_URL: String =
-        format!("{}/network-interfaces", *DEMO_INSTANCE_URL);
+        format!("/organizations/{}/projects/{}/instances/{}/network-interfaces", *DEMO_ORG_NAME, *DEMO_PROJECT_NAME, *DEMO_INSTANCE_NAME);
    pub static ref DEMO_INSTANCE_EXTERNAL_IPS_URL: String =
-        format!("{}/external-ips", *DEMO_INSTANCE_URL);
-    pub static ref DEMO_INSTANCE_SERIAL_URL: String =
-        format!("{}/serial-console", *DEMO_INSTANCE_URL);
-    pub static ref DEMO_INSTANCE_SERIAL_STREAM_URL: String =
-        format!("{}/serial-console/stream", *DEMO_INSTANCE_URL);
+        format!("/organizations/{}/projects/{}/instances/{}/external-ips", *DEMO_ORG_NAME, *DEMO_PROJECT_NAME, *DEMO_INSTANCE_NAME);


The new URL structure doesn't lend itself well to the same composition as before. You have parts of the hierarchy that come after whatever we're appending to the structure. Instead of taking that tack I'm planning to just flatten this whole tree of variables. I think ultimately it'll make it clearer as to what's actually being defined.

just-be-dev · 2022-12-07T19:26:43Z

Unless there's strong feedback, my plan is to merge this PR by EOD. I will, of course, address any issues that come up after merge. There's some urgency around getting this API work done so that clients can be updated (and new client features worked on) before our next customer demo.

smklein

I find this new structure really readable, and the separation of "lookup" vs "authenticated usage" is great.

There are a couple areas where clearly we need to follow-up, but I understand why we're not doing them in this PR:

Marking the non-v1 endpoints as deprecated (presumably waiting on oxidecomputer/dropshot#503)
Deleting the non-v1 endpoints
Getting v1 endpoints up for all APIs, not just instances

Can we get up issues (or at least a larger, tracking issue) to make sure we're covering all these things? I know it's probably obvious in your head right now, but I wanna make sure we're recording this just in case the last 10% takes us some time to complete.

Again, great job with this PR, I'm excited to not need to puzzle over the codebase to figure out whether I should or should not have an "authz" object or a name.

nexus/src/app/instance.rs

augustuswm · 2022-12-07T20:50:13Z

nexus/types/src/external_api/params.rs

+impl InstanceSelector {
+    pub fn new(
+        instance: NameOrId,
+        project_selector: &Option<ProjectSelector>,


If new immediately clones the elements of ProjectSelector, it is probably worth passing ownership and letting the caller call clone if they need to.
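
A sketch of the by-value variant of that constructor. The field layout is assumed and the types are simplified stand-ins for the ones in params.rs.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum NameOrId {
    Id(String),
    Name(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ProjectSelector {
    pub project: NameOrId,
}

#[derive(Debug, PartialEq)]
pub struct InstanceSelector {
    pub instance: NameOrId,
    pub project_selector: Option<ProjectSelector>,
}

impl InstanceSelector {
    /// Takes the selector by value, so no clone happens inside `new`.
    /// A caller that still needs its copy clones explicitly.
    pub fn new(instance: NameOrId, project_selector: Option<ProjectSelector>) -> Self {
        InstanceSelector { instance, project_selector }
    }
}
```

A caller that is done with its selector passes it directly; one that needs it afterwards writes `InstanceSelector::new(instance, project_selector.clone())`, which keeps the cost visible at the call site.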

david-crespo

heroic

just-be-dev · 2022-12-08T04:03:02Z

Can we get up issues (or at least a larger, tracking issue) to make sure we're covering all these things? I know it's probably obvious in your head right now, but I wanna make sure we're recording this just in case the last 10% takes us some time to complete.

Absolutely, I'll create that now.

Edit: #2028

just-be-dev · 2022-12-08T15:02:24Z

I'm going to merge this PR and move forward for now. I would still like post merge feedback from @davepacheco. Even though I'm merging, we can still revisit decisions made here if necessary.

davepacheco

Thanks for taking this on! Sorry this took a little bit but it sounded like the feedback was still wanted.

I found a bunch of things (noted above) that I think would be resolved if the foo_lookup() functions returned lookup::Foo instead of authz::Foo:

The lookup functions themselves could directly use the parent lookup function (i.e., instance_lookup() could call project_lookup()) rather than duplicating the logic. (I'm worried this logic duplication (and boilerplate) grows with the depth of the tree. It'll also be more than it looks like here once we handle the extra error cases. )
The caller of the lookup function (or, more likely, subsequent code) can decide which authz check to do. This preserves the previous coupling between lookup and authz check that I hope makes it harder to get the authz check wrong (or forget one). It would also avoid doing multiple redundant authz checks in every path. (There are still some code paths with extra checks (just like before), but with the current pattern, I think every path would have redundant checks.)
The caller of the lookup function (or subsequent code) can decide whether they want the database part (fetch()) or not (lookup()). Otherwise, there are many code paths that do multiple lookups just to re-fetch the database part that was just fetched and thrown away.
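
To illustrate the composition point, here is a toy model in which a lookup is a description of how to find something rather than a resolved result. This is not the real LookupPath API; `Db`, `ProjectLookup`, and `InstanceLookup` are hypothetical, and authz is left out entirely.

```rust
use std::collections::HashMap;

/// Toy in-memory "database".
pub struct Db {
    pub projects: HashMap<String, u32>,         // project name -> project id
    pub instances: HashMap<(u32, String), u32>, // (project id, instance name) -> instance id
}

impl Db {
    pub fn new() -> Self {
        Db { projects: HashMap::new(), instances: HashMap::new() }
    }
}

/// Describes how to find a project; nothing is resolved yet.
pub enum ProjectLookup {
    ById(u32),
    ByName(String),
}

/// Describes how to find an instance, reusing the parent lookup.
pub enum InstanceLookup {
    ById(u32),
    ByName { project: ProjectLookup, name: String },
}

impl ProjectLookup {
    pub fn lookup(&self, db: &Db) -> Option<u32> {
        match self {
            ProjectLookup::ById(id) => db.projects.values().find(|v| *v == id).copied(),
            ProjectLookup::ByName(name) => db.projects.get(name).copied(),
        }
    }
}

impl InstanceLookup {
    /// Resolution happens only when the caller asks for it, and the
    /// by-name case delegates to the project lookup instead of repeating it.
    pub fn lookup(&self, db: &Db) -> Option<u32> {
        match self {
            InstanceLookup::ById(id) => db.instances.values().find(|v| *v == id).copied(),
            InstanceLookup::ByName { project, name } => {
                let project_id = project.lookup(db)?;
                db.instances.get(&(project_id, name.clone())).copied()
            }
        }
    }
}
```

Because the handle is unresolved until the caller acts on it, the same value could back either a cheap existence check or a full fetch, which is the choice the third point above asks to leave with the caller.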

I'm curious what you think.

Thanks again for taking this on. Long live ids!

davepacheco · 2022-12-09T00:50:03Z

nexus/src/app/instance.rs

 const MAX_KEYS_PER_INSTANCE: u32 = 8;

 impl super::Nexus {
+    pub async fn instance_lookup(


Is the thought that we'll eventually create a foo_lookup() for every type of object? And that it would include a match block with all possible combinations of the query params? It seems like there's a lot that can go wrong there. But it feels like this is the kind of problem we solved (or tried to solve?) with LookupPath and I wonder if we can clean this up by passing around lookup::Instance or the like (discussed a little here, which I see you considered). What if we had the http_entrypoints functions use LookupPath directly and pass the lookup::Foo into app-level functions? I think that would make them more composable, so that instance_lookup() could use project_lookup() (which would use organization_lookup()) rather than duplicating that logic. What do you think?

Yeah, I like that idea a lot. I'll explore it a little more. There've been some iterations on this that have slowly started to feel better. It went from names, to ids, to authz objects. Maybe the lookups are the last step to really make it compositional. Thanks for the suggestion.

davepacheco · 2022-12-09T00:51:33Z

nexus/src/app/instance.rs

+    ) -> LookupResult<authz::Instance> {
+        match instance_selector {
+            params::InstanceSelector { instance: NameOrId::Id(id), .. } => {
+                // TODO: 400 if project or organization are present


For what it's worth, I think these cases are valuable to fix early because they can help catch otherwise nasty client bugs, and fixing them later seems likely to be a breaking change.
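
A minimal sketch of what resolving that TODO could look like, assuming a flat selector with optional project and organization fields. `reject_scoped_id` is a hypothetical name, and the returned string stands in for whatever 400 error type the handler would use.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum NameOrId {
    Id(String),
    Name(String),
}

pub struct InstanceSelector {
    pub instance: NameOrId,
    pub project: Option<NameOrId>,
    pub organization: Option<NameOrId>,
}

/// Rejects an ID-based selector that also carries scoping parameters.
/// The `Err` message is meant to be returned to the client with a 400.
pub fn reject_scoped_id(sel: &InstanceSelector) -> Result<(), String> {
    match sel {
        InstanceSelector { instance: NameOrId::Id(_), project: None, organization: None } => Ok(()),
        InstanceSelector { instance: NameOrId::Id(_), .. } => Err(
            "when instance is an ID, project and organization must not be specified".to_string(),
        ),
        // Name-based selectors are validated elsewhere in this sketch.
        InstanceSelector { instance: NameOrId::Name(_), .. } => Ok(()),
    }
}
```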

davepacheco · 2022-12-09T00:53:26Z

nexus/src/app/instance.rs

-            .project_name(project_name)
-            .lookup_for(authz::Action::CreateChild)
-            .await?;
+        opctx.authorize(authz::Action::CreateChild, authz_project).await?;


I gather we've basically decoupled the lookup (which is now done by the caller) from the authz check. It makes sense, just a little sad because coupling those was intentional to make it harder to mess up the authz part. (And I see now why you were talking about wanting to express the authz checks that were done in the type.)

Actually, I think if you have the foo_lookup() functions return lookup::Foo, that will address this problem too because the top-level caller doesn't have to decide which authz check to do.

As this is now, I think we'll also wind up doing 2x the authz checks now and we may wind up wanting to explore caching those. As I think about it, what we probably want to cache are the roles and not the result of the authz check itself. That's because when you call foo_lookup(), you're only checking for read (because the caller doesn't know what authz checks will be needed by other functions it's going to call later), but later functions may wind up needing other permissions, so the authz checks won't look exactly the same.
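
A rough sketch of the "cache the roles, not the decision" idea. Everything here is hypothetical (`RoleCache`, the two roles, the two actions, and the policy mapping between them); it only shows why a read check followed by a different check can share one role load.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Role {
    Viewer,
    Collaborator,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Read,
    CreateChild,
}

/// Hypothetical per-request cache of roles, keyed by resource id.
pub struct RoleCache {
    roles: HashMap<u32, HashSet<Role>>,
    /// How many times the (pretend) database was consulted.
    pub loads: u32,
}

impl RoleCache {
    pub fn new() -> Self {
        RoleCache { roles: HashMap::new(), loads: 0 }
    }

    fn roles_for(&mut self, resource: u32, load: impl FnOnce() -> HashSet<Role>) -> &HashSet<Role> {
        if !self.roles.contains_key(&resource) {
            self.loads += 1;
            self.roles.insert(resource, load());
        }
        &self.roles[&resource]
    }

    /// Different actions are evaluated against the same cached roles.
    pub fn authorize(
        &mut self,
        resource: u32,
        action: Action,
        load: impl FnOnce() -> HashSet<Role>,
    ) -> bool {
        let roles = self.roles_for(resource, load);
        match action {
            Action::Read => roles.contains(&Role::Viewer) || roles.contains(&Role::Collaborator),
            Action::CreateChild => roles.contains(&Role::Collaborator),
        }
    }
}

/// Sample loader used for illustration: the caller only has the viewer role.
pub fn viewer_only() -> HashSet<Role> {
    std::iter::once(Role::Viewer).collect()
}
```

Caching the boolean result of the first `Read` check would not help the later `CreateChild` check, whereas caching the roles lets both be answered from one load.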

davepacheco · 2022-12-09T00:58:20Z

nexus/src/app/instance.rs

-        &self,
-        opctx: &OpContext,
-        instance_id: &Uuid,
+        authz_instance: &authz::Instance,


I think this pattern of function makes a lot less sense after this change. (That's not a bad thing.) We presumably just did a lookup of the Instance, dropped the database part, and now we're doing a whole other lookup just to fetch the database part again. It seems like wherever we're using foo_fetch() now we should just have it use LookupPath()...fetch() instead of what it's currently doing, which must be LookupPath()...lookup(). This might be tricky if we're using the new Nexus::foo_lookup() pattern, but I think that's another reason to have those functions return lookup::Foo instead of authz::Foo (so the caller can decide if they want to use lookup() or fetch()).

davepacheco · 2022-12-09T01:00:43Z

nexus/src/app/instance.rs

-                .organization_name(organization_name)
-                .project_name(project_name)
-                .instance_name(instance_name)
+                .instance_id(authz_instance.id())


Same here as above -- we already did a lookup in the caller. It'd be nice to just use that here instead of doing another lookup. (You could even wind up with a different result here than the caller got.)

davepacheco · 2022-12-09T01:01:22Z

nexus/src/app/sagas/instance_create.rs

 #[derive(Clone, Debug, Deserialize, Serialize)]
 pub struct Params {
    pub serialized_authn: authn::saga::Serialized,
-    pub organization_name: Name,


🎉 (I love dropping names from saga stuff)

davepacheco · 2022-12-09T01:02:04Z

nexus/tests/integration_tests/instances.rs

    );
 }

+#[nexus_test]


I'm assuming this isn't as exhaustive a test as what we had for the old APIs. Is there a plan for converting them?

Yes, absolutely.

davepacheco · 2022-12-09T01:06:27Z

nexus/tests/integration_tests/unauthorized_coverage.rs

        std::fs::read_to_string("tests/output/uncovered-authz-endpoints.txt")
            .expect("failed to load file of allowed uncovered endpoints");
-    assert_eq!(expected_uncovered_endpoints, uncovered_endpoints);
+    let mut unexpected_uncovered_endpoints = "These endpoints were expected to be covered by the unauthorized_coverage test but were not:\n".to_string();


It seems like we're no longer checking whether there are things in the allowlist file that are covered. That seems bad? It feels like it undermines the point of an allowlist. (why wouldn't we just put every endpoint into the allowlist?)

This is bending my brain a little. So, this would happen when you've set something in the allowlist that you expect to not be covered by tests but the tests cover it anyway. So the uncovered_endpoints would be smaller than the expected_uncovered_endpoints meaning you might need to remove something from the allowlist.

That's fair. What I was trying to solve here is that the failure experience isn't the best. I've hit up against this test several times and it always takes me a while to figure out what went wrong and what I need to do. Ultimately my goal here is just to make it more explicit as to what needs to be done. I'll see if I can work backwards and test both ways without making it messier.

That sounds good. Yeah, I'm all for making this easier. We could also use expectorate here like we do in other places. I think the suggestion not to was based on not making it seem like the right thing for people to add things to this list.
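
One way to check both directions while keeping the readable output, sketched with the standard library only. `diff_coverage` is a hypothetical helper, and the rule that a line without an opening parenthesis is a comment is an assumption made so that header lines in the allowlist file are ignored.

```rust
use std::collections::BTreeSet;

/// Result of comparing the allowlist file with what the test run observed.
#[derive(Debug, PartialEq)]
pub struct CoverageDiff {
    /// Uncovered endpoints that are not in the allowlist: tests are missing.
    pub missing_tests: Vec<String>,
    /// Allowlist entries that are now covered: they should be removed.
    pub stale_allowlist: Vec<String>,
}

/// Keeps only lines that look like endpoint entries. Treating lines without
/// a '(' as comments is an assumption of this sketch.
fn endpoint_lines(text: &str) -> BTreeSet<&str> {
    text.lines()
        .map(str::trim)
        .filter(|l| l.contains('('))
        .collect()
}

pub fn diff_coverage(allowlist: &str, uncovered: &str) -> CoverageDiff {
    let allowed = endpoint_lines(allowlist);
    let actual = endpoint_lines(uncovered);
    CoverageDiff {
        missing_tests: actual.difference(&allowed).map(|s| s.to_string()).collect(),
        stale_allowlist: allowed.difference(&actual).map(|s| s.to_string()).collect(),
    }
}
```

The test could then fail with one message per non-empty list, so the allowlist keeps its teeth and the output still says exactly what to do.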

davepacheco · 2022-12-09T01:07:24Z

nexus/types/src/external_api/params.rs

 use std::{net::IpAddr, str::FromStr};
 use uuid::Uuid;

+#[derive(Deserialize, JsonSchema)]


At one point I think we had talked about describing more precisely the set of things that was allowed, so that it would be impossible to express for example an InstanceSelector that had an instance name and no project part. Did we decide not to do that because it can't be expressed using query parameters?

In a previous iteration of this I defined the shape of the selector as an enum of only valid combinations. There were a few challenges with this (like a bug in schemars that caused duplicate schema output). There was added implementation complexity there that ultimately still yielded the same schema output. I had floated some ideas in the RFD about how to make the schema more expressive in terms of typing, but generating that output is a highly non-trivial endeavor that would eat up more time than I have to give to this initiative.
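
For reference, a sketch of what an "only valid combinations" selector could look like at the type level, ignoring the schema-generation problem entirely. The variant names and `from_parts` are hypothetical and are not the shape used in the earlier iteration.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum NameOrId {
    Id(String),
    Name(String),
}

/// Only the combinations listed in the PR description can be constructed,
/// e.g. there is no variant for an instance name without a project.
#[derive(Debug, PartialEq)]
pub enum InstanceSelector {
    ById { instance_id: String },
    InProject { instance_name: String, project_id: String },
    InNamedProject { instance_name: String, project_name: String, organization: NameOrId },
}

/// Converts loosely typed inputs into a valid selector, or `None` when the
/// combination is not one of the supported forms.
pub fn from_parts(
    instance: NameOrId,
    project: Option<NameOrId>,
    organization: Option<NameOrId>,
) -> Option<InstanceSelector> {
    match (instance, project, organization) {
        (NameOrId::Id(instance_id), None, None) => Some(InstanceSelector::ById { instance_id }),
        (NameOrId::Name(instance_name), Some(NameOrId::Id(project_id)), None) => {
            Some(InstanceSelector::InProject { instance_name, project_id })
        }
        (NameOrId::Name(instance_name), Some(NameOrId::Name(project_name)), Some(organization)) => {
            Some(InstanceSelector::InNamedProject { instance_name, project_name, organization })
        }
        _ => None,
    }
}
```

The validation still has to happen somewhere; this just moves it to the boundary so that code past the entrypoint cannot hold an invalid selector.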

davepacheco · 2022-12-09T01:07:55Z

nexus/tests/output/uncovered-authz-endpoints.txt

 logout                                   (post   "/logout")
+
+Deprecated API endpoints to be removed at the end of the V1 migration
+instance_delete                          (delete "/organizations/{organization_name}/projects/{project_name}/instances/{instance_name}")


Why aren't we covering these any more?

@davepacheco

… `authz::*` (#2042) This PR is mostly in response to @davepacheco's feedback on #1957. Special thanks to @smklein for helping work through some Rust lifetime issues here. Previously the `instance_lookup` and `project_lookup` functions returned an `authz::Instance` or `authz::Project` respectively that was authorized with read permissions. The challenge here is that this meant that in several places we required duplicate lookups to be able to resolve the instance with the right permissions. To avoid that we're passing around the lookup step itself instead.

Test impl of new endpoint format

233dc6d

just-be-dev changed the title ~~Test impl of new endpoint format~~ PoC implementing RFD-322 Nov 17, 2022


just-be-dev added 9 commits November 17, 2022 15:54

ResourceIdentifier -> NameOrId

0be1778

Add docs, remove unneeded chunk

6e3210f

Use v1 prefix, convert instance delete

a2681b9

Push up diplay error

18bc9cf

ReferenceIdentifer -> NameOrId in nexus.json

07ec450

Update nexus.json to fix build issues

5b2f217

Change v1 to a prefix instead of postfix

8631e9f

Normalize id lookups and path selection

41672aa

Add v1 create instance api

954ffdb


just-be-dev added 8 commits November 21, 2022 12:02

Add instance list endpoint

c8d191b

Add instance migrate endpoint

6708bac

Add instance reboot endpoint

26b5651

Add instance start endpoint

f70f84f

Add instance stop endpoint

e6374fe

Add serial console endpoints

cdaa977

Workaround for multiple emits

d0a19c6

Implement NameOrId for display to try resolve dropshot error

f439f59


just-be-dev mentioned this pull request Nov 22, 2022

Enums unusable as path params due to not deriving Display oxidecomputer/progenitor#253

Closed

Flip the order of id/name in the NameOrId enum

b4e818a

ahl reviewed Nov 24, 2022


Add untagged to NameOrId

7bdfeb0

david-crespo reviewed Nov 25, 2022


just-be-dev added 2 commits December 6, 2022 12:13

Merge branch 'main' into rfd-322-poc

2092012

Update unauthz coverage

96165f5

just-be-dev requested a review from davepacheco December 6, 2022 20:56

just-be-dev marked this pull request as ready for review December 6, 2022 21:05

just-be-dev added 2 commits December 6, 2022 16:34

Fix up some failed testcases

e616c03

Add view by id to skip list

3c13edc


just-be-dev added 2 commits December 7, 2022 12:49

Merge branch 'main' into rfd-322-poc

f26c18a

Update lookups to return authz refs instead of uuids

12bc365

just-be-dev removed the request for review from davepacheco December 7, 2022 18:48

just-be-dev changed the title ~~PoC implementing RFD-322~~ v1 instance APIs as per RFD-322 Dec 7, 2022

smklein approved these changes Dec 7, 2022


augustuswm reviewed Dec 7, 2022


augustuswm reviewed Dec 7, 2022


ahl approved these changes Dec 8, 2022


david-crespo approved these changes Dec 8, 2022


Address PR feedback

fc517af

just-be-dev mentioned this pull request Dec 8, 2022

Update API to match resolution of RFD-322 #2028

Open

22 tasks

just-be-dev merged commit cb6a75d into main Dec 8, 2022

just-be-dev deleted the rfd-322-poc branch December 8, 2022 15:02

davepacheco reviewed Dec 9, 2022


just-be-dev mentioned this pull request Dec 9, 2022

Update lookup fns introduced in the V1 API to return lookup::* over authz::* #2042

Merged

just-be-dev mentioned this pull request Dec 14, 2022

RFD-322: Add /v1/ endpoints for organizations and projects #2050

Merged

