@david-crespo (Contributor) commented Jul 28, 2021

Add VPCView model and endpoints to list, create, get, and delete VPCs. No use of sagas. No conditions on deletion yet. The VPC model doesn't have anything on it except ID metadata and a project ID, so there's no logic to test beyond existence and non-existence, uniqueness of name, etc. There's also nothing to update, so I left out that endpoint.

@david-crespo force-pushed the networking-endpoints branch 3 times, most recently from 9542772 to 02058e6, on July 28, 2021 at 17:52
@david-crespo force-pushed the networking-endpoints branch 2 times, most recently from 4f9ea44 to c85d470, on July 28, 2021 at 19:15
@david-crespo force-pushed the networking-endpoints branch from c85d470 to 8f7ddce on July 28, 2021 at 20:11
@david-crespo force-pushed the networking-endpoints branch from 078b93d to 293a11a on July 28, 2021 at 20:39
@david-crespo force-pushed the networking-endpoints branch from 8602307 to 343fe84 on July 28, 2021 at 21:08
@david-crespo marked this pull request as ready for review on July 28, 2021 at 21:13
@david-crespo requested a review from smklein on July 28, 2021 at 21:20
@david-crespo force-pushed the networking-endpoints branch from f243585 to 21384bd on July 28, 2021 at 22:53
@david-crespo force-pushed the networking-endpoints branch from 21384bd to 6d482e5 on July 29, 2021 at 00:48
@david-crespo force-pushed the networking-endpoints branch from 0e1c1fc to a3b5dc7 on July 29, 2021 at 02:05
@smklein (Collaborator) left a comment:
Hey, this looks really solid - thanks for putting this together so quickly, and appreciate the robust test case!

Comment on lines 95 to 101
#[derive(Clone, Debug)]
pub struct VPC {
    /** common identifying metadata */
    pub identity: IdentityMetadata,
    /** id for the project containing this VPC */
    pub project_id: Uuid,
}
@smklein (Collaborator):

Out of curiosity, what's the point of re-defining the VPC object within internal::nexus?

It's used as part of the interface in omicron-nexus/src/http_entrypoints_external.rs, and not "http_entrypoints_internal.rs" - from what I can tell, that means it genuinely is used as part of the external API, but not the internal API.

I'm pretty sure if we replace the usage of internal::nexus::VPC with external::VPC below, nothing breaks - WDYT?

@smklein (Collaborator):

(if we need a separate type for storage in the database, that could be a valid reason to create a new struct somewhere, but I figured "api/internal" would be for "types passed via HTTP interfaces between internal components" - which mostly means "in the requests/responses of http_entrypoints_internal.rs")

@david-crespo (Contributor, Author):

I misunderstood internal. I was thinking of it as "internal to the API" as opposed to "used by internal API endpoints". Right now they're the same anyway. If we had properties on the model we didn't want to include in the JSON, where would that live? Is that still a case for the *View concept?

@smklein (Collaborator):

I misunderstood internal. I was thinking of it as "internal to the API" as opposed to "used by internal API endpoints". Right now they're the same anyway.

ah, actually, they'll end up having some distinct properties. "used by internal API endpoints" may have versioning implications if the things communicating have different update schedules. As an example, we upgrade a Sled, but old Nexus is still running somewhere else. There are upgrade considerations with which we'd need to be cautious - the two components would need to have a compatible format.

If we had properties on the model we didn't want to include in the JSON, where would that live?

If it's just an in-memory structure which isn't passed between components, I'd say "make a struct somewhere in Nexus".

As mentioned earlier though, we could reference the external::VPC object everywhere in this PR - what additional properties are you trying to define?

Is that still a case for the *View concept?

I don't think so - "*view" was just a synonym for "external".

) -> ListResult<api::internal::nexus::VPC> {
    let client = self.pool.acquire().await?;
    sql_fetch_page_by::<
        LookupByUniqueNameInProject,
@smklein (Collaborator):

We're looking up VPCs by name here, but the name isn't explicitly a unique identifier for VPCs, is it?

CREATE TABLE omicron.public.VPC (
    /* Identity metadata */
    id UUID PRIMARY KEY,
    name STRING(63) NOT NULL,
    description STRING(512) NOT NULL,
    time_created TIMESTAMPTZ NOT NULL,
    time_modified TIMESTAMPTZ NOT NULL,
    /* Indicates that the object has been deleted */
    time_deleted TIMESTAMPTZ,
    project_id UUID NOT NULL
);

The primary key is the UUID - shouldn't we be looking up VPCs by ID, not name?

I get that the implementation of project_create_vpc should be preventing the names from overlapping, but it seems like it would be nice to rely on the structure of the DB for the uniqueness, rather than an implicit constraint.

Caveat: It's totally possible I'm misinterpreting how sql_fetch_page_by is working here...

/**
 * Implementation of [`LookupKey`] for looking up objects within a project by
 * the project_id and the object's name
 */
pub struct LookupByUniqueNameInProject;
impl<'a, R: ResourceTable> LookupKey<'a, R> for LookupByUniqueNameInProject {
    type ScopeKey = (&'a Uuid,);
    const SCOPE_KEY_COLUMN_NAMES: &'static [&'static str] = &["project_id"];
    type ItemKey = Name;
    const ITEM_KEY_COLUMN_NAME: &'static str = "name";
    fn where_select_error(
        _scope_key: Self::ScopeKey,
        item_key: &Self::ItemKey,
    ) -> Error {
        Error::not_found_by_name(R::RESOURCE_TYPE, item_key)
    }
}

But wouldn't we want a LookupByUniqueIDInProject that uses ItemKey = ID?

(FYI @davepacheco - I see LookupByUniqueNameInProject is also used for project_list_instances; I think my same question applies there too)

@smklein (Collaborator):

Follow-up: Is this relying on the unique index in dbinit.sql?

CREATE UNIQUE INDEX ON omicron.public.VPC (
    name
) WHERE
    time_deleted IS NULL;

@david-crespo (Contributor, Author) commented Jul 29, 2021:

Good news: I already wrote LookupByUniqueIDInProject and got rid of it in 67eccbc89, so we can easily put it back. I don't think the DB is constraining uniqueness. To be honest, I'm also not sure where the constraint is coming from (though the constraint is clearly there; the test attempting to create a second VPC with the same name proves it).

@smklein (Collaborator):

Yeah, I saw the test, but I'm still trying to wrap my head around this - the unique_value parameter passed to sql_insert_unique_idempotent is only used in the error message!

pub async fn sql_insert_unique_idempotent<'a, R>(
    client: &'a tokio_postgres::Client,
    values: &'a SqlValueSet,
    unique_value: &'a str,
    ignore_conflicts_on: &'static str,
) -> Result<(), Error>
where
    R: ResourceTable,
{
    let mut sql = SqlString::new();
    let param_names = values
        .values()
        .iter()
        .map(|value| sql.next_param(*value))
        .collect::<Vec<String>>();
    let column_names = values.names().iter().cloned().collect::<Vec<&str>>();
    sql.push_str(
        format!(
            "INSERT INTO {} ({}) VALUES ({}) ON CONFLICT ({}) DO NOTHING",
            R::TABLE_NAME,
            column_names.join(", "),
            param_names.join(", "),
            ignore_conflicts_on,
        )
        .as_str(),
    );
    sql_execute(client, sql.sql_fragment(), sql.sql_params())
        .await
        .map_err(|e| sql_error_on_create(R::RESOURCE_TYPE, unique_value, e))
        .map(|_| ())
}

So I'm still kinda uncertain what provides the "names must be unique" property, if not the unique DB index.

@smklein (Collaborator):

Having some conversations in chat, but figured I'd post the TL;DR here too:

If the CREATE UNIQUE INDEX statement on omicron.public.VPC is what's actually enforcing this uniqueness, wouldn't that imply that names are globally unique? (Example: two different projects both want to create a VPC named "Foo"; one of them would get a collision.)

Seems the approach taken by instances is creating a unique index by the tuple (project_id, name). Seems like we'll need to update dbinit.sql's indices for VPC objects to guarantee similar scoping.

@david-crespo (Contributor, Author):

Confirmed in the docs that the index is the official way to enforce uniqueness for a pair of columns:

Indexes can also be used to enforce uniqueness of a column's value, or the uniqueness of the combined values of more than one column.

@davepacheco (Collaborator):

I think y'all figured most of this out already! I'll try to explain a little more why things are the way they are.


In terms of desired behavior, RFD 4 says this about the "name" property of something that has identity metadata:

Names of resources must be unique under the scope of the parent resource. For example, Instance names are required to be unique to a project and project names are required to be unique to an Organization.

RFD 21 explains that VPCs are scoped to projects. So yes, their name should be unique within the project, and not necessarily globally unique.


In terms of implementing this: yes, we use CockroachDB unique indexes to guarantee uniqueness. That's the natural place to do it, since the database has such a facility and the behavior needs to be strongly consistent.

As @smklein points out, the unique index for Instances is on the (project_id, name) tuple, not just name. You want something similar for VPCs here. (Since we're organizing everything around Projects, nearly everything should probably look like this. Projects themselves should be scoped to an Organization, and Organization names I believe will wind up being globally unique, but we haven't implemented Organizations yet so Project names are currently globally unique.)

There's a close relationship between the fields in the unique indexes on a table, the Lookup* structs supported by the table, and the ScopeKey/ItemKey in those lookups.


@smklein wrote:

Yeah, I saw the test, but I'm still trying to wrap my head around this - the unique_value parameter passed to sql_insert_unique_idempotent is only used in the error message!

Yeah, so when the block comment on sql_insert_unique_idempotent talks about "conflict", it's referring not to some vague notion of some constraint being violated (that the function itself might be expected to check) but a specific condition that databases like CockroachDB and PostgreSQL call "conflict" and handle with an ON CONFLICT clause on the INSERT statement. In retrospect maybe that's not obvious if you haven't dug into the implementation of the function or dealt with UPSERTs before. The comment currently says:

/// Idempotently insert a database record.  This is intended for cases where the
/// record may conflict with an existing record with the same id, which should
/// be ignored, or with an existing record with the same name, which should
/// produce an [`Error::ObjectAlreadyExists`] error.

It might be more precise to say:

///
/// Idempotently insert a database record.  More precisely, this is intended
/// for cases where:
///
/// * the caller wants to insert a database row for some particular object
/// * the object is identified by an immutable, unique id
/// * the caller wants the operation to succeed if a row already exists for this
///   object (i.e., a row exists with the same unique id, which is by definition
///   a row for the same object)
/// * the caller wants to fail if inserting the record would violate some other
///   constraint, usually a name conflict (which is necessarily a record
///   for a _different_ object, since if it were the same object, it would have
///   the same id, and the operation would succeed without inserting anything)
///
/// That might sound incredibly specific or even contrived, but it's the common
/// case for sagas that want to create database records for API resources.  For
/// example, when provisioning an Instance, the provision saga will generate a
/// unique id and use this function to insert an initial Instance object into
/// the database.  Since the id is generated in the saga, the only way to have a
/// conflict on the object's id is if the saga action is being rerun.  Since
/// actions are supposed to be idempotent, the correct behavior is to succeed
/// despite the conflict.  On the other hand, if the new Instance's name
/// conflicts with another one in the same project, that will raise a different
/// constraint violation that we do want to package up and bubble up to the
/// caller.  This condition means somebody tried to create an Instance with the
/// same name as one that's already running, so we package this up as an
/// [`Error::ObjectAlreadyExists`] error.
///

This is a long way of explaining why the unique value is not used in the function aside from the error message. (Why bother putting it in the error message? The ObjectAlreadyExists error requires you to say what the conflicting value was, and that's because I've found specific messages like that to be a huge improvement in user experience.)


@smklein wrote:

But wouldn't we want a LookupByUniqueIDInProject that uses ItemKey = ID?

(FYI @davepacheco - I see LookupByUniqueNameInProject is also used for project_list_instances; I think my same question applies there too)

I think you got this answered but I'm not sure? I wasn't quite sure why you expected to use a lookup by id. Is it just because that's the primary key in the database? Anyway, I expect API operations to do lookups by name rather than id. (RFD 4 is a little confusing in that it uses identifiers like "projectId" in all the URLs. But it also says "the endpoints that reference an individual resource work on either the uuid or the name." If I remember right, we only implement name lookups in Nexus for most resources today.) Given this, for any resource that's scoped to a project, I'd expect us to use LookupByUniqueNameInProject and not LookupByUniqueIdInProject.

The general ideas here are:

  • The uuid of an object is immutable and should uniquely identify it across all time, all types of objects, and all Oxide systems. (Like many systems, we rely on the astronomically small probability of a UUIDv4 collision.)
  • The name of an object is mutable and uniquely identifies the object within its parent scope and while it is still live. We will (do already?) use soft deletes for many objects, and there may be multiple objects with the same name in the same scope as long as no more than one is alive. This is why the unique indexes include a WHERE time_deleted IS NULL clause.
  • From the user's perspective, since they can only see live objects, both identifiers are equally good. I expect most people to want to use "name" most of the time. Software that wants to be robust against other clients renaming things that they're working on might prefer to use ids instead.
  • Internally, we usually want to resolve the name to an id when we receive a request and do all of our internal work using these ids. Otherwise, we can wind up with nasty concurrency issues if somebody renames a resource while we're working on it. This can even lead to security vulnerabilities, if we were to do an access check of some resource and then somebody renames it and we wind up operating on a different resource than the one we checked.

I'd been assuming all this mostly based on past experience, but assuming it makes sense, we should probably document it.

Does all that make sense?

@david-crespo (Contributor, Author):

That all makes sense. Most of it we were able to infer from the code, but it is helpful to see it spelled out. Especially:

Internally, we usually want to resolve the name to an id when we receive a request and do all of our internal work using these ids.

I was mostly able to get there by copying the patterns I saw used for instances.

Comment on lines 27 to 28
// let apictx = &cptestctx.server.apictx;
// let nexus = &apictx.nexus;
@smklein (Collaborator):

Nit: Dunno if this is WIP, but dead code here

// let nexus = &apictx.nexus;

/* Create a project that we'll use for testing. */
let project_name = "springfield-squidport";
@smklein (Collaborator):

A+ naming choice

@david-crespo (Contributor, Author):

copied from dap in test_instances 😉

@david-crespo merged commit 320ae22 into main on July 29, 2021
@david-crespo deleted the networking-endpoints branch on July 29, 2021 at 17:47
CREATE UNIQUE INDEX ON omicron.public.Vpc (
    name,
    project_id
@davepacheco (Collaborator) commented Jul 30, 2021:

This isn't quite right. It needs to be (project_id, name) rather than (name, project_id). This is critical to be able to use this index to list the VPCs in a Project. (Think of it this way: we want the database to be able to quickly find the first VPC in the Project. If the index is sorted by project_id first, that's easy. Once it finds that record, the next N values in the index will be the N VPCs in that project, sorted by name -- perfect. If the index is sorted by (name, project_id), this doesn't work, and there's no efficient way to find the first VPC (or any other VPC) in a Project.)

Edit: I really wish there were some way to automatically test this. This is probably working in your change because CockroachDB is doing a table scan rather than using this index. I kind of want CockroachDB to refuse to do queries that always necessitate table scans. I'm not really sure how best to do this.

@david-crespo (Contributor, Author):

Got it. Didn't realize the index works like a nested dictionary.

/* Subnet and network interface names are unique per VPC, not project */
CREATE UNIQUE INDEX ON omicron.public.VpcSubnet (
    name,
@davepacheco (Collaborator):

Similar to above, this needs to be (vpc_id, name), not (name, vpc_id).

-- TODO: add project_id to index
CREATE UNIQUE INDEX ON omicron.public.NetworkInterface (
    name,
@davepacheco (Collaborator):

Same as above.
