I suspect fixing this problem will more or less require instance start to become a saga. Once sled selection is fixed, starting an instance will involve a lot of attendant work (reserving space on a sled, setting up V2P mappings) that we need to be able to retry if interrupted and that may need to be undone if the entire attempt to start the instance fails.
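To make the "undo if the whole attempt fails" requirement concrete, here is a minimal sketch of the saga pattern in Rust. Everything in it (`Step`, `run_saga`, the shape of the state) is hypothetical and for illustration only; it is not the saga framework Nexus actually uses, and it leaves out the persistence and retry-after-interruption that a real saga needs.

```rust
/// One step of a hypothetical saga: a forward action plus the undo
/// action that reverses it. These names are illustrative assumptions,
/// not omicron types.
pub struct Step<S> {
    pub name: &'static str,
    pub action: fn(&mut S) -> Result<(), String>,
    pub undo: fn(&mut S),
}

/// Runs the steps in order. If a step fails, the undo actions of all
/// previously completed steps run in reverse order and the error is
/// returned. The failed step itself is assumed to have left no effects.
pub fn run_saga<S>(state: &mut S, steps: &[Step<S>]) -> Result<(), String> {
    for (i, step) in steps.iter().enumerate() {
        if let Err(e) = (step.action)(state) {
            // Unwind only the steps that completed, newest first.
            for done in steps[..i].iter().rev() {
                (done.undo)(state);
            }
            return Err(format!("{} failed: {}", step.name, e));
        }
    }
    Ok(())
}
```

With steps such as "reserve space on a sled" followed by "set up V2P mappings", a failure in the second step would release the reservation made by the first, so a failed start leaves nothing behind.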
The Nexus external API's "instance start" command passes through to Nexus::instance_start_runtime:

omicron/nexus/src/external_api/http_entrypoints.rs, line 3088 in 9d1bd55

omicron/nexus/src/app/instance.rs, lines 358 to 377 in 9d1bd55
If I'm reading things right, this function selects the sled to which to send the runtime state update by looking at the instance record in CRDB without regard for the instance's current state:
omicron/nexus/src/app/instance.rs, lines 608 to 619 in 9d1bd55
This seems like the right thing to do if the instance is already incarnated on a sled somewhere. But if the instance is stopped and doesn't exist on any sled, this will try to create the instance on the sled on which it most recently ran, which might not have capacity for it (even though some other sled might). This function should distinguish the "instance already incarnated" and "instance stopped" cases and select a new sled in the latter case.
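As a rough illustration of the distinction proposed here, the following Rust sketch separates the two cases. The types (`InstanceState`, `Instance`, `Sled`), the memory-only capacity check, and the "most free memory" placement policy are all simplifying assumptions made for this example; none of them are the real omicron types or the real placement logic.

```rust
/// Hypothetical, simplified instance state; not the real omicron type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstanceState {
    /// The instance is already incarnated on a sled.
    Incarnated,
    /// The instance is stopped and exists on no sled.
    Stopped,
}

/// Hypothetical view of the instance record stored in the database.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Instance {
    pub state: InstanceState,
    /// Sled the instance most recently ran on.
    pub last_sled: u32,
    pub required_memory_gib: u64,
}

/// Hypothetical sled with a single capacity dimension.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Sled {
    pub id: u32,
    pub free_memory_gib: u64,
}

/// Picks the sled that should receive the runtime state update.
/// Returns `None` when the instance is stopped and no sled has room.
pub fn select_sled(instance: &Instance, sleds: &[Sled]) -> Option<u32> {
    match instance.state {
        // Already incarnated: keep talking to the sled recorded in the database.
        InstanceState::Incarnated => Some(instance.last_sled),
        // Stopped: ignore the previous sled and choose one with enough capacity
        // (assumed policy: the sled with the most free memory).
        InstanceState::Stopped => sleds
            .iter()
            .filter(|s| s.free_memory_gib >= instance.required_memory_gib)
            .max_by_key(|s| s.free_memory_gib)
            .map(|s| s.id),
    }
}
```

In this sketch, a stopped instance whose previous sled is full is placed on another sled with room, while an incarnated instance stays where the database record says it is.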