Adding a previously-expunged sled leaves it expunged #5625

Open · jgallagher opened this issue Apr 25, 2024 · 2 comments

@jgallagher (Contributor)

On an a4x2 testbed, I went through this sequence:

  • Set up with three sleds (g0, g1, g3)
  • Add sled g2
  • Step through blueprints to give g2 NTP and Crucible zones
  • Use omdb db reconfigurator-save, reconfigurator-cli, and omdb nexus blueprints import to craft a blueprint that gives g2 a Nexus zone
  • Confirm the Nexus zone comes online and is externally reachable
  • Manually clean-slate the sled (disable sled-agent and mg-ddm, remove relevant files in /pool/int/*)
  • a4x2 hyperstop g2 to shut off the sled
  • omdb db nexus sleds expunge ... to expunge the sled
  • a4x2 hyperstart g2 to bring the sled back online
  • Reenable sled-agent and mg-ddm

At this point the sled shows up as uninitialized:

```
root@oxz_switch:~# omdb nexus sleds list-uninitialized
RACK_ID                              CUBBY SERIAL PART  REVISION
73d99599-d25c-41d6-a78b-246ceba80faa 2     g2     i86pc 0
```

And adding it appears to succeed:

```
root@oxz_switch:~# omdb -w nexus sleds add g2 i86pc
added sled g2 (i86pc)
```

However, after doing so, it still shows up in list-uninitialized:

```
root@oxz_switch:~# omdb nexus sleds list-uninitialized
RACK_ID                              CUBBY SERIAL PART  REVISION
73d99599-d25c-41d6-a78b-246ceba80faa 2     g2     i86pc 0
```

and we can apparently add it again (which also appears to succeed):

```
root@oxz_switch:~# omdb -w nexus sleds add g2 i86pc
added sled g2 (i86pc)
```

but it remains in the list-uninitialized output.

We expect this to work: the add should create a new sled_id for the same baseboard we've seen before. But looking in CRDB, we only have one row for g2 (the one created when we added it the first time), and it is still expunged:

```
root@[fd00:1122:3344:103::3]:32221/omicron> select id,time_created,time_modified,serial_number,sled_policy from sled;
                   id                  |         time_created          |         time_modified         | serial_number | sled_policy
---------------------------------------+-------------------------------+-------------------------------+---------------+--------------
  353582e7-6a8d-4de4-8aba-ea97ecd2f11b | 2024-04-25 17:24:32.016339+00 | 2024-04-25 19:06:50.980526+00 | g2            | expunged
  692fd79d-6f93-4787-9fa4-bbc47a3cc494 | 2024-04-25 17:18:30.451498+00 | 2024-04-25 17:18:30.451498+00 | g3            | in_service
  cacc195b-5c78-4e08-833c-5af089815eba | 2024-04-25 17:18:28.702021+00 | 2024-04-25 17:18:28.702021+00 | g0            | in_service
  fca3f869-15ce-49ee-a872-7a4129877efb | 2024-04-25 17:18:31.583653+00 | 2024-04-25 17:18:31.583653+00 | g1            | in_service
(4 rows)
```

Tracing through the add-sled flow, I think the problem is in DataStore::allocate_sled_underlay_subnet_octets(). This is the method that creates a new sled ID (line 309); however, hw_baseboard_id is tied to the baseboard, which hasn't changed. If we find a matching allocation on lines 315-318, we stop and return that prior allocation. In CRDB, we can see that we do have a prior allocation for this baseboard (as expected, since we've added this sled before!):

```
root@[fd00:1122:3344:103::3]:32221/omicron> select * from sled_underlay_subnet_allocation;
            hw_baseboard_id            |               rack_id                |               sled_id                | subnet_octet
---------------------------------------+--------------------------------------+--------------------------------------+---------------
  a41243b4-dc81-4db4-9223-4dd83dd00eea | 73d99599-d25c-41d6-a78b-246ceba80faa | 353582e7-6a8d-4de4-8aba-ea97ecd2f11b |           33
(1 row)

root@[fd00:1122:3344:103::3]:32221/omicron> select * from hw_baseboard_id where id = 'a41243b4-dc81-4db4-9223-4dd83dd00eea';
                   id                  | part_number | serial_number
---------------------------------------+-------------+----------------
  a41243b4-dc81-4db4-9223-4dd83dd00eea | i86pc       | g2
(1 row)
```

I think (?) that if allocate_sled_underlay_subnet_octets() finds a prior allocation with a matching baseboard, it should check the sled table for that allocation's sled_id; if that sled is expunged, it should ignore the allocation and continue looping, resulting in it either finding a non-expunged match or exiting the loop and returning new_allocation.
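
For illustration, here's a minimal sketch of that proposed fix. This is not the real Omicron code (the actual method works against CRDB via Diesel); `Allocation`, `allocate_octet`, and `sled_is_expunged` are hypothetical stand-ins:

```rust
use uuid::Uuid;

#[derive(Clone)]
struct Allocation {
    hw_baseboard_id: Uuid,
    sled_id: Uuid,
    subnet_octet: u8,
}

fn allocate_octet(
    existing: &[Allocation], // prior rows, ordered by subnet_octet
    baseboard_id: Uuid,
    sled_is_expunged: impl Fn(Uuid) -> bool, // lookup against the `sled` table
) -> Allocation {
    let mut new_octet = 33; // first octet handed to added sleds (cf. octet 33 above)
    for alloc in existing {
        if alloc.hw_baseboard_id == baseboard_id {
            // Today we return this allocation unconditionally, which is the
            // bug. The fix: only reuse it if its sled is still in service.
            if !sled_is_expunged(alloc.sled_id) {
                return alloc.clone();
            }
            // Expunged sled: ignore this allocation and keep looping.
        }
        // Track the next free octet past any existing allocation.
        if alloc.subnet_octet >= new_octet {
            new_octet = alloc.subnet_octet + 1;
        }
    }
    // No reusable allocation found: mint a fresh sled ID and octet,
    // which is what the issue expects for a re-added baseboard.
    Allocation {
        hw_baseboard_id: baseboard_id,
        sled_id: Uuid::new_v4(),
        subnet_octet: new_octet,
    }
}
```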

@jgallagher jgallagher added this to the 8 milestone Apr 25, 2024

@davepacheco (Collaborator)

Would implementing #5554 fix this?

@jgallagher (Contributor, Author)

> Would implementing #5554 fix this?

Probably, with at least two caveats:

@jgallagher jgallagher self-assigned this Apr 25, 2024
jgallagher added a commit that referenced this issue May 1, 2024
I ran into this as part of #5625, where adding a previously-expunged
sled appeared to succeed but didn't actually add anything new.

Today if we try to add a sled that is already _running_, we get a 500,
because Nexus fails when it tries to tell the sled-agent to start. But
with this PR, we fail earlier: adding a sled that already has a subnet
allocation fails before we even try to talk to the sled-agent, because
it means someone has already added this sled:

```
root@oxz_switch:~# omdb -w nexus sleds add g2 i86pc
added sled g2 (i86pc): 90413e40-8139-43b4-9081-365dab6e5579
root@oxz_switch:~# omdb -w nexus sleds add g2 i86pc
Error: adding sled

Caused by:
    Error Response: status: 400 Bad Request; headers: {"content-type": "application/json", "x-request-id": "9eb95a9f-3fe0-4f75-8846-13490b95500e", "content-length": "188", "date": "Tue, 30 Apr 2024 20:54:49 GMT"}; value: Error { error_code: Some("ObjectAlreadyExists"), message: "already exists: sled \"g2 / i86pc (90413e40-8139-43b4-9081-365dab6e5579)\"", request_id: "9eb95a9f-3fe0-4f75-8846-13490b95500e" }
```

This does change the external API slightly (204 -> 201 created, and we
now return the ID), but I think (?) that's probably fine since we have
no real consumers of that yet.
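
A hedged sketch of the early check this commit message describes, reusing the hypothetical `Allocation` type from the sketch above; the real change lives in Nexus's add-sled path and returns a typed `ObjectAlreadyExists` error, not a `String`:

```rust
use uuid::Uuid;

fn add_sled(
    baseboard_id: Uuid,
    existing: &[Allocation], // rows from sled_underlay_subnet_allocation
) -> Result<Uuid, String> {
    if existing.iter().any(|a| a.hw_baseboard_id == baseboard_id) {
        // A prior allocation means someone already added this sled; fail
        // with an "already exists" error before contacting the sled-agent,
        // instead of a 500 partway through.
        return Err("already exists: sled".to_string());
    }
    // ...otherwise allocate an octet, insert the sled row, and tell the
    // sled-agent on that baseboard to start.
    Ok(Uuid::new_v4())
}
```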
jgallagher added a commit that referenced this issue May 13, 2024
This adds the `sled_state` map described in RFD 457 to `Blueprint`. The
motivating driver here is decommissioning expunged sleds, which is on
the path to addressing #5625, but this PR only adds the field - the
planner carries forward states from its inputs, but otherwise makes no
changes, and the executor doesn't read the new field at all.
@morlandi7 morlandi7 modified the milestones: 8, 9 May 13, 2024
jgallagher added a commit that referenced this issue May 13, 2024
This builds on (and is currently pointed at) #5663, and is on the path
to fixing #5625, although there is still more work to do there. (Should
be small, and I'll start on it while this is in review to make sure it
really fixes it.) I don't plan on landing this before R8 is out the
door, but wanted to go ahead and open it up to review.

Much of this is fallout from our discussions about what it means to
decommission a sled. Major changes are:

* `SledFilter::All` has been renamed to `SledFilter::Commissioned`, and
no longer returns sleds that are in `SledState::Decommissioned`.
* The blueprint planner will now update the desired sled state to
`Decommissioned` for sleds which satisfy our conditions. (See
`do_plan_decommission()`.)
* The blueprint planner will carry forward the Omicron zones of
decommissioned sleds to child blueprints. Pruning these is #5552.
* The blueprint planner will _not_ carry forward a desired sled state of
`Decommissioned` once the inputs report that the sled has already been
decommissioned.
* The blueprint executor will decommission sleds that the planner said
to.
* Decommissioning a sled implicitly decommissions all its disks. (This
matches what happens with sled expungement, and should not interfere
with region replacement, which keys off of policy, not state.)
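
To make the planner bullets concrete, here is a rough sketch of the shape of `do_plan_decommission()`. The actual conditions live in the Omicron planner and may differ; the enums and the `all_zones_expunged` input are assumptions, not the real types:

```rust
#[derive(Clone, Copy, PartialEq)]
enum SledPolicy { InService, Expunged }

#[derive(Clone, Copy, PartialEq)]
enum SledState { Active, Decommissioned }

fn do_plan_decommission(
    policy: SledPolicy,
    current_state: SledState,
    all_zones_expunged: bool, // every Omicron zone on the sled is gone
) -> SledState {
    // Only an expunged sled whose zones have all been expunged moves to
    // Decommissioned; otherwise the planner carries the state forward.
    // (Per the bullets above, once the inputs report the sled as already
    // decommissioned, the desired state is no longer carried forward.)
    if policy == SledPolicy::Expunged
        && current_state == SledState::Active
        && all_zones_expunged
    {
        SledState::Decommissioned
    } else {
        current_state
    }
}
```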