RPW for OPTE v2p Mappings #5568

Open: wants to merge 29 commits into main

Conversation

@internet-diglett (Contributor) commented Apr 19, 2024

TODO

  • Extend db view to include probe v2p mappings
  • Update sagas to trigger rpw activation instead of directly configuring v2p mappings
  • Test that the delete functionality cleans up v2p mappings

Possible Optimizations

  • Expose an internal API that a sled-agent can call to pre-emptively refresh v2p mappings? (This would allow individual sleds to have v2p entries repopulated more rapidly upon restarting, as long as Nexus is running; a rough sketch follows below.)
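
A hypothetical sketch of what such an internal endpoint could look like; the path, context type, and activation mechanism here are illustrative assumptions, not anything defined in this PR:

use dropshot::{
    endpoint, HttpError, HttpResponseUpdatedNoContent, RequestContext,
};

/// Placeholder standing in for Nexus's real server context (assumption).
struct ApiContext {
    // The v2p RPW is assumed to be woken whenever this channel changes.
    v2p_activate: tokio::sync::watch::Sender<()>,
}

#[endpoint { method = POST, path = "/v2p/refresh" }]
async fn v2p_refresh(
    rqctx: RequestContext<ApiContext>,
) -> Result<HttpResponseUpdatedNoContent, HttpError> {
    // Nudge the RPW so it reconciles v2p mappings soon instead of waiting
    // for its next periodic activation.
    let _ = rqctx.context().v2p_activate.send(());
    Ok(HttpResponseUpdatedNoContent())
}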

Related

Resolves #5214
Resolves #4259
Resolves #3107

@davepacheco (Collaborator) commented:

Ooh, will this also resolve #4259? We had talked about using an RPW for the v2p mappings in that ticket as well.

@internet-diglett (Contributor, Author) replied:

> Ooh, will this also resolve #4259? We had talked about using an RPW for the v2p mappings in that ticket as well.

@davepacheco Why yes, yes it will! I love sweeping up multiple bugs / issues into a single fix!

@internet-diglett (Contributor, Author) commented:

Quick tests are showing that our initial implementation is working

Desired State

VPC 14487919
----------------------------------------------------------------------

IPv4 mappings
----------------------------------------------------------------------
VPC_IP      VPC MAC ADDR       UNDERLAY IP
172.30.0.5  A8:40:25:FD:DA:2B  fd00:1122:3344:102::1
172.30.0.6  A8:40:25:FE:89:1F  fd00:1122:3344:103::1
172.30.0.7  A8:40:25:F8:CA:88  fd00:1122:3344:101::1
172.30.0.8  A8:40:25:F6:66:01  fd00:1122:3344:104::1

Push the v2p state away from the desired state

root@g3:~# /opt/oxide/opte/bin/opteadm set-v2p 172.30.0.7 A8:40:25:FE:89:1F fd00:1122:3344:103::1 14487919

v2p state is now incorrect

VPC 14487919
----------------------------------------------------------------------

IPv4 mappings
----------------------------------------------------------------------
VPC_IP      VPC MAC ADDR       UNDERLAY IP
172.30.0.5  A8:40:25:FD:DA:2B  fd00:1122:3344:102::1
172.30.0.6  A8:40:25:FE:89:1F  fd00:1122:3344:103::1
172.30.0.7  A8:40:25:FE:89:1F  fd00:1122:3344:103::1
172.30.0.8  A8:40:25:F6:66:01  fd00:1122:3344:104::1

The RPW corrects the drift

17:47:32.713Z INFO e7ef9388-d6e1-4f1c-8a8b-0c1a0987e391 (ServerContext): v2p mappings to add
    background_task = v2p_manager
    file = nexus/src/app/background/v2p_mappings.rs:147
    mappings = [SetVirtualNetworkInterfaceHost { physical_host_ip: fd00:1122:3344:101::1, virtual_ip: 172.30.0.7, virtual_mac: MacAddr(MacAddr6([168, 64, 37, 248, 202, 136])), vni: Vni(14487919) }]
    sled = g3

v2p state is now correct again

VPC 14487919
----------------------------------------------------------------------

IPv4 mappings
----------------------------------------------------------------------
VPC_IP      VPC MAC ADDR       UNDERLAY IP
172.30.0.5  A8:40:25:FD:DA:2B  fd00:1122:3344:102::1
172.30.0.6  A8:40:25:FE:89:1F  fd00:1122:3344:103::1
172.30.0.7  A8:40:25:F8:CA:88  fd00:1122:3344:101::1
172.30.0.8  A8:40:25:F6:66:01  fd00:1122:3344:104::1
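
For reference, the per-VPC mapping tables above are OPTE's v2p table dump; on a sled they can be retrieved with something like the following (the exact subcommand is from memory and may differ):

root@g3:~# /opt/oxide/opte/bin/opteadm dump-v2p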

@internet-diglett (Contributor, Author) commented:

Going to shift gears and finish up the delete implementation on the OPTE side of things.

@internet-diglett marked this pull request as ready for review April 29, 2024 23:45

@jmpesp (Contributor) left a comment:

🚀

illumos-utils/src/opte/port_manager.rs (review thread, outdated; resolved)
illumos-utils/src/opte/port_manager.rs (review thread, outdated; resolved)
nexus/db-queries/src/db/datastore/network_interface.rs (review thread, outdated; resolved)

use omicron_common::api::external::ListResultVec;

impl DataStore {
    pub async fn v2p_mappings(

Contributor:

does this also need a paginated query?

@internet-diglett (Contributor, Author) replied:

Well... I admittedly tend to use DataPageParams::max_page() in RPWs whenever a db query requires pagination parameters, and that will return up to u32::MAX (4_294_967_295) entries, so I'd say it's kind of moot here, unless we want to standardize on a lower limit for db queries in background tasks (which, honestly, it wouldn't shock me if we decided that was a good idea!).
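
For illustration, roughly what that shortcut looks like (the datastore method name and signature here are assumptions, not the code in this PR):

// Effectively "give me everything in a single page".
let pagparams = DataPageParams::max_page();
let mappings = datastore.v2p_mappings(&opctx, &pagparams).await?;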

Contributor:

Under @davepacheco's direction we've started using this pattern in reconfigurator RPWs:

opctx.check_complex_operations_allowed()?;
let mut zpools = Vec::new();
let mut paginator = Paginator::new(SQL_BATCH_SIZE);
while let Some(p) = paginator.next() {
    let batch = self
        .zpool_list_all_external(opctx, &p.current_pagparams())
        .await?;
    paginator = p.found_batch(&batch, &|(z, _)| z.id());
    zpools.extend(batch);
}

check_complex_operations_allowed() fails if the opctx is associated with an external API client, and then we do still list all the items, but in paginated batches to avoid a giant result set coming from crdb.

@internet-diglett (Contributor, Author) replied:

I have updated the query to use this pattern.

nexus/src/app/background/v2p_mappings.rs (review thread, outdated; resolved)
nexus/src/app/background/v2p_mappings.rs (review thread, outdated; resolved)
nexus/src/app/background/v2p_mappings.rs (review thread; resolved)

Comment on lines 87 to 89
// - it means that delete calls are required as well as set calls,
// meaning that now the ordering of those matters (this may also
// necessitate a generation number for V2P mappings)

Contributor:

I don't think generation numbers are required with the approach of using a periodic background task to correct drift... what do you think?

@internet-diglett (Contributor, Author) replied:

I think it mostly depends on whether or not it is acceptable to send / read the entire configuration during a reconciliation loop. For things like NAT, the total configuration can be quite large, so we employ generation numbers so we can send only the required updates instead of the full set. For v2p mappings, the individual messages are much smaller. My napkin math from before showed that if we had a VM per core, with a VNIC per VM, a full set of updates for a single sled would be something in the ballpark of 213 KB, which seems reasonable.
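
One hedged reconstruction of that napkin math (the sled count, core count, and per-entry size below are illustrative assumptions, not figures stated in this thread):

// Back-of-the-envelope estimate of a full v2p update set for one sled.
const SLEDS: usize = 32; // sleds whose VMs a given sled needs mappings for
const VMS_PER_SLED: usize = 128; // one VM (and one VNIC) per core
const BYTES_PER_MAPPING: usize = 52; // IPv6 + IPv4 + MAC + VNI, plus padding

fn main() {
    let total = SLEDS * VMS_PER_SLED * BYTES_PER_MAPPING;
    // 32 * 128 * 52 = 212_992 bytes, i.e. roughly 213 KB.
    println!("{total} bytes");
}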

Contributor:

👍

Comment on lines 43 to 49
let results = dsl::v2p_mapping_view
    .select(V2PMappingView::as_select())
    .load_async(&*self.pool_connection_authorized(opctx).await?)
    .await
    .map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?;

Ok(results)

Contributor:

I think you're duplicating the above query here, and then discarding the paginated results in mapping?

@internet-diglett (Contributor, Author) replied:

😅 I forgot to remove the old query after adding the paginated one, thanks for catching this!


    .await
    .map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?;

paginator = p.found_batch(&batch, &|mapping| mapping.nic_id);

Contributor:

mapping.nic_id here is telling the paginator to fetch the next page based on NIC ID, but the paginated call above is using dsl::sled_id. I think these need to match - it should be possible to craft a test that fails with the code as written, I think? (Or maybe the current tests would fail after removing the non-paginated query below that @FelixMcFelix pointed out?)
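
For illustration, a rough sketch of the consistency being suggested here; method and column names follow the snippets in this thread, but the exact code is an assumption:

let batch = paginated(
    dsl::v2p_mapping_view,
    dsl::nic_id, // the column the query pages over...
    &p.current_pagparams(),
)
.select(V2PMappingView::as_select())
.load_async(&*self.pool_connection_authorized(opctx).await?)
.await
.map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?;

// ...must be the same field used as the marker for the next page.
paginator = p.found_batch(&batch, &|mapping| mapping.nic_id);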

Contributor:

I think if we moved these columns to typed UUIDs, this would fail to compile entirely - @sunshowers do you know if DbTypedUuid can be used through paginated?

@internet-diglett (Contributor, Author) replied:

@jgallagher @sunshowers if it's possible to use typed UUIDs to guard against this without too much rework, I'd like to do so :D

@internet-diglett (Contributor, Author) commented:

Quick update on the status of this PR: some changes in main have added a background-task dependency on notify_instance_updated. In this PR that method has been changed to use the background_task object, but we can't pass background_task from within a background task, so now I need to refactor some things. 😂

@internet-diglett enabled auto-merge (squash) May 18, 2024 01:32

@internet-diglett (Contributor, Author) commented:

All outstanding comments have been resolved, so I'm going ahead and setting this to auto-merge unless anyone has an objection. I'd like to get this soaking on Dogfood sooner rather than later for the upcoming release :)

@internet-diglett (Contributor, Author) commented:

Sigh, looks like a test is flaky:

--- STDERR:              omicron-nexus::test_all integration_tests::instances::test_instance_v2p_mappings ---
log file: /tmp/test_all-0075ccc75846a7de-test_instance_v2p_mappings.15117.0.log
note: configured to log to "/tmp/test_all-0075ccc75846a7de-test_instance_v2p_mappings.15117.0.log"
DB URL: postgresql://root@[::1]:35454/omicron?sslmode=disable
DB address: [::1]:35454
log file: /tmp/test_all-0075ccc75846a7de-test_instance_v2p_mappings.15117.2.log
note: configured to log to "/tmp/test_all-0075ccc75846a7de-test_instance_v2p_mappings.15117.2.log"
log file: /tmp/test_all-0075ccc75846a7de-test_instance_v2p_mappings.15117.3.log
note: configured to log to "/tmp/test_all-0075ccc75846a7de-test_instance_v2p_mappings.15117.3.log"
thread 'integration_tests::instances::test_instance_v2p_mappings' panicked at nexus/tests/integration_tests/instances.rs:4561:10:
v2p mappings should be empty: PermanentError("v2p mappings are still present")

@internet-diglett enabled auto-merge (squash) May 21, 2024 19:02

@internet-diglett (Contributor, Author) commented:

Fixed the flaky test. I was just using the tooling wrong 😛
