Return sentinels from sled_insert_resource_query#10399
Return sentinels from sled_insert_resource_query#10399jmpesp merged 8 commits intooxidecomputer:mainfrom
sled_insert_resource_query#10399Conversation
The failure mode seen in oxidecomputer#10353 involves multiple contending sled reservations and lead to a sled reservation iterating over all possible potential allocation mappings, causing the instance-start to appear to hang when in reality it was exploring all potential permutations. The fix for that was to update the iterator's stored information so that it could recognize when the sled insert query it was about to execute could never work. This fix was insufficient as further testing lead to more instance-start sagas that appeared stuck. During contention, there are a few broad categories to how the sled reservation insert query could fail: - the target sled could no longer have the available resources - putting a VMM on the target sled would now violate affinity or anti-affinity rules - the pools affected by requested local storage mapping no longer have the required available space Updating the iterator's stored information would only let Nexus detect the last category where the instance-start saga hang seen today fell into the first. Nexus needs to know which part of the insert query failed so that it can decide what to do based on if local storage allocations are required or not, and this lead to this commit: using the "true or cast error" pattern, return sentinels from the sled insert resource query, and bail out of searching for a valid local storage allocation permutation if the sled target is no longer valid. Fixes oxidecomputer#10393
| |query| { | ||
| query.sql("EXISTS(SELECT 1 FROM sled_has_space)"); | ||
| }, | ||
| "SLED_HAS_SPACE", |
There was a problem hiding this comment.
this feels like a nitpick, but seems like actually important future-proofing:
Can we do:
const SPACE_SENTINEL: &static str = "SLED HAS SPACE";
const SLED_INSERT_QUERY_SENTINELS: [&'static str; 3] =
[SPACE_SENTINEL, THE OTHER ONES ];The contents of these strings is load-bearing; I'd like to define them in a single spot
| // as errors as this branch does not have any | ||
| // requested local storage allocations. Ignore these | ||
| // and proceed to the next sled_target. | ||
| info!(&log, "reservation failed due to {sentinel}"); |
There was a problem hiding this comment.
In other variants of this, we use the "sentinel" as a "dense but clear" indicator of the class of error, and then in rust code, we "Match on that variant to return a more readable error".
E.g., seeing BANNED_SLEDS in the logs with no context requires that someone go dig into the code. Might be nice to have a match statement going from "sentinel" -> "what does it mean in english though"
|
|
||
| info!( | ||
| &log, | ||
| "reservation failed due to {sentinel}", |
There was a problem hiding this comment.
Same comment as above about having a decoder ring for the sentinels
| { | ||
| // The only part of the `insert_valid` section of | ||
| // the insertion query that can fail are the same | ||
| // places where these sentinels are cast and thrown | ||
| // as errors as this branch does not have any | ||
| // requested local storage allocations. Ignore these | ||
| // and proceed to the next sled_target. |
There was a problem hiding this comment.
nit:
| { | |
| // The only part of the `insert_valid` section of | |
| // the insertion query that can fail are the same | |
| // places where these sentinels are cast and thrown | |
| // as errors as this branch does not have any | |
| // requested local storage allocations. Ignore these | |
| // and proceed to the next sled_target. | |
| { | |
| // The only part of the `insert_valid` section of | |
| // the insertion query that can fail are the same | |
| // places where these sentinels are cast and thrown | |
| // as errors, as this branch does not have any | |
| // requested local storage allocations. Ignore these | |
| // and proceed to the next sled_target. |
| info!(&log, "reservation failed"); | ||
| Err(e) => { | ||
| if let Some(sentinel) = | ||
| matches_sentinel(&e, &SLED_INSERT_QUERY_SENTINELS) |
There was a problem hiding this comment.
So, I had not seen matches_sentinel before and was going to leave some criticism of it, but I guess it's actually code that was here before this PR. Basically, I was just going to suggest that rather than doing this, we could have an error enum and convert the query error into an enum of SledHasSpace, BannedSleds, RequiredSleds, DbError(e) and match on that here...but, since this is a pre-existing pattern, I'm not going to make you go change it up as part of this bugfix. Carry on!
| // as errors as this branch does not have any | ||
| // requested local storage allocations. Ignore these | ||
| // and proceed to the next sled_target. | ||
| info!(&log, "reservation failed due to {sentinel}"); |
There was a problem hiding this comment.
turbo nit: i feel like "reservation failed due to SLED_HAS_SPACE" is kind of a weird sentence, and i wonder if we ought to have a way to turn those sentinels into more human-able descriptions. on the other hand, i think including the sentinel that we hit in the log line is a good idea as it's greppable in the source when debugging.
maybe just turning the sentinel into a structured field would make it feel less weird?
| info!(&log, "reservation failed due to {sentinel}"); | |
| info!(&log, "reservation failed"; "sentinel" => sentinel); |
| // Alternatively, checking affinity related | ||
| // conditions could now also invalidate this | ||
| // sled_target. |
There was a problem hiding this comment.
nit: maybe:
| // Alternatively, checking affinity related | |
| // conditions could now also invalidate this | |
| // sled_target. | |
| // Alternatively, concurrent reservations | |
| // could have allocated an instance which | |
| // makes this sled_target invalid for the | |
| // instance we are trying to allocate due | |
| // to affinity/anti-affinity constraints. |
| // sled_target. | ||
| // | ||
| // This isn't a problem when _not_ performing | ||
| // local storage allocations because the insert |
There was a problem hiding this comment.
| // local storage allocations because the insert | |
| // local storage allocations, because the insert |
| // be chosen right away. With local storage | ||
| // (just like the failure mode mentioned in a | ||
| // previous block comment), the iterator will | ||
| // erroneously keep returning different | ||
| // permutations, meanwhile the insert query | ||
| // isn't failing due to the available local | ||
| // storage space. |
There was a problem hiding this comment.
Sorry, I'm not totally sure if I get what this comment is trying to say: you're saying that if we are allocating an instance with local storage, and we hit a sentinel indicating that a concurrent allocation rendered this sled target invalid, the insert query won't fail? That seems counter to what's actually happening here. It seems like the point this comment is actually trying to make is that when an insert query fails, we must break out of the iteration so that we don't keep trying other a whole bunch of other possible local storage allocations on the same sled which will also fail. Am I understanding correctly?
There was a problem hiding this comment.
you're saying that if we are allocating an instance with local storage, and we hit a sentinel indicating that a concurrent allocation rendered this sled target invalid, the insert query won't fail?
I wasn't precise with the language here: the insert will fail in the sense that it returns one of the sentinel errors. We have to break out of the iteration only when we see one of the sentinels. I rewrote the comment in 7c00158
| "SLED_HAS_SPACE", | ||
| ); | ||
| query.sql(" AND "); | ||
|
|
||
| query.true_or_cast_error( | ||
| |query| { | ||
| query | ||
| .sql("NOT(EXISTS(SELECT 1 FROM banned_sleds WHERE sled_id = ") | ||
| .param() | ||
| .bind::<sql_types::Uuid, _>( | ||
| resource.sled_id.into_untyped_uuid(), | ||
| ) | ||
| .sql("))"); | ||
| }, | ||
| "BANNED_SLEDS", | ||
| ); | ||
| query.sql(" AND "); | ||
|
|
||
| query.true_or_cast_error( | ||
| |query| { | ||
| query | ||
| .sql("(EXISTS(SELECT 1 FROM required_sleds WHERE sled_id = ") | ||
| .param() | ||
| .bind::<sql_types::Uuid, _>( | ||
| resource.sled_id.into_untyped_uuid(), | ||
| ) | ||
| .sql(") OR NOT EXISTS (SELECT 1 FROM required_sleds))"); | ||
| }, | ||
| "REQUIRED_SLEDS", |
There was a problem hiding this comment.
nit, perhaps not that important, but: since the string value of the sentinels is both (a) load-bearing and (b) would not change the validity of the SQL query itself if typo'd, I'd kind of rather that these were string constants defined in this module, and the array of sentinels used in sled.rs was defined in terms of those constants. so here, we'd write:
| "SLED_HAS_SPACE", | |
| ); | |
| query.sql(" AND "); | |
| query.true_or_cast_error( | |
| |query| { | |
| query | |
| .sql("NOT(EXISTS(SELECT 1 FROM banned_sleds WHERE sled_id = ") | |
| .param() | |
| .bind::<sql_types::Uuid, _>( | |
| resource.sled_id.into_untyped_uuid(), | |
| ) | |
| .sql("))"); | |
| }, | |
| "BANNED_SLEDS", | |
| ); | |
| query.sql(" AND "); | |
| query.true_or_cast_error( | |
| |query| { | |
| query | |
| .sql("(EXISTS(SELECT 1 FROM required_sleds WHERE sled_id = ") | |
| .param() | |
| .bind::<sql_types::Uuid, _>( | |
| resource.sled_id.into_untyped_uuid(), | |
| ) | |
| .sql(") OR NOT EXISTS (SELECT 1 FROM required_sleds))"); | |
| }, | |
| "REQUIRED_SLEDS", | |
| SLED_HAS_SPACE, | |
| ); | |
| query.sql(" AND "); | |
| query.true_or_cast_error( | |
| |query| { | |
| query | |
| .sql("NOT(EXISTS(SELECT 1 FROM banned_sleds WHERE sled_id = ") | |
| .param() | |
| .bind::<sql_types::Uuid, _>( | |
| resource.sled_id.into_untyped_uuid(), | |
| ) | |
| .sql("))"); | |
| }, | |
| BANNED_SLEDS, | |
| ); | |
| query.sql(" AND "); | |
| query.true_or_cast_error( | |
| |query| { | |
| query | |
| .sql("(EXISTS(SELECT 1 FROM required_sleds WHERE sled_id = ") | |
| .param() | |
| .bind::<sql_types::Uuid, _>( | |
| resource.sled_id.into_untyped_uuid(), | |
| ) | |
| .sql(") OR NOT EXISTS (SELECT 1 FROM required_sleds))"); | |
| }, | |
| REQUIRED_SLEDS, |
and then somewhere at the top of this file:
/// The sled insert query will return a sentinel (using the pattern of an
/// erroneous cast of a string into a bool) if the insert is no longer valid due
/// to three of the conditions related to a sled's available hardware resources,
/// or affinity rules.
pub const SLED_INSERT_QUERY_SENTINELS: [&'static str; 3] =
[SLED_HAS_SPACE, BANNED_SLEDS, REQUIRED_SLEDS];
const SLED_HAS_SPACE: &'static str = "SLED_HAS_SPACE";
const BANNED_SLEDS: &'static str = "BANNED_SLEDS";
const REQUIRED_SLEDS: &'static str = "REQUIRED_SLEDS";This way, we are guaranteeing that the sentinel value in the query and the one that we are checking for in the sled-allocation logic are always the same string. I realize that you didn't typo them here, but I think it's maybe worth establishing the pattern now so that future code which might add a new sentinel value also does it this way?
hawkw
left a comment
There was a problem hiding this comment.
some annoying nits. overall, though, i think the change makes sense!
| // [`TrueOrCastError`] exists but requires an expression that evaluates to | ||
| // sql_type::Bool. This function applies the same pattern but for the | ||
| // QueryBuilder. | ||
| pub fn true_or_cast_error<F>(&mut self, condition: F, error: &'static str) | ||
| where | ||
| F: FnOnce(&mut QueryBuilder), | ||
| { | ||
| self.sql("CAST(IF("); | ||
| self.sql("("); | ||
| condition(self); | ||
| self.sql("),"); | ||
| self.sql("'TRUE', '"); | ||
| self.sql(error); | ||
| self.sql("') AS BOOL)"); | ||
| } |
There was a problem hiding this comment.
this is cute! i wonder if it might make sense for TrueOrCastError to be reimplemented using this but LET'S DO THAT IN A SEPARATE PR FROM THE BUGFIX. :)
There was a problem hiding this comment.
nit: this is not actually something that changed in this PR, but i noticed that there is no longer a loop labeled 'outer, as neither loop is labeled. mayhaps we should update that?
| "reservation failed due to {sentinel}", | ||
| ); | ||
|
|
||
| break; |
There was a problem hiding this comment.
so, this is doing the correct thing and breaking out of the inner loop, so this isn't a bug or anything, but...part of me would kind of like to see a label on the inner loop and use a labeled break here (like break 'local_storage_allocations or something?). mainly, there is just a lot of nesting going on here and (especially because the loop keywords are several lines above here due to the very long comments (which , for the record, are good and we should keep!) but it's a bit hard to tell which loop is being breaked here.
hawkw
left a comment
There was a problem hiding this comment.
Thanks for addressing the feedback, this looks good to me! I had a couple more little nitpicks, but nothing worth blocking the fix over.
| use crate::db::queries::sled_reservation::BANNED_SLEDS_SENTINEL; | ||
| use crate::db::queries::sled_reservation::BANNED_SLEDS_SENTINEL_REASON; | ||
| use crate::db::queries::sled_reservation::LocalStorageAllocation; | ||
| use crate::db::queries::sled_reservation::LocalStorageAllocationRequired; | ||
| use crate::db::queries::sled_reservation::REQUIRED_SLEDS_SENTINEL; | ||
| use crate::db::queries::sled_reservation::REQUIRED_SLEDS_SENTINEL_REASON; | ||
| use crate::db::queries::sled_reservation::SLED_HAS_SPACE_SENTINEL; | ||
| use crate::db::queries::sled_reservation::SLED_HAS_SPACE_SENTINEL_REASON; |
There was a problem hiding this comment.
huh, i find it a bit odd that all the sentinels and their reasons are defined in sled_reservation but the SLED_INSERT_QUERY_SENTINELS array and sentinel_to_reason are defined here...I kinda feel like we could import less stuff if we put all the sentinel-related code in that module? then you would just need SLED_INSERT_QUERY_SENTINELS and SENTINEL_TO_REASON...
not a big deal though.
| SLED_HAS_SPACE_SENTINEL => SLED_HAS_SPACE_SENTINEL_REASON, | ||
|
|
||
| BANNED_SLEDS_SENTINEL => BANNED_SLEDS_SENTINEL_REASON, | ||
|
|
||
| REQUIRED_SLEDS_SENTINEL => REQUIRED_SLEDS_SENTINEL_REASON, | ||
|
|
There was a problem hiding this comment.
weird whitespace here
| SLED_HAS_SPACE_SENTINEL => SLED_HAS_SPACE_SENTINEL_REASON, | |
| BANNED_SLEDS_SENTINEL => BANNED_SLEDS_SENTINEL_REASON, | |
| REQUIRED_SLEDS_SENTINEL => REQUIRED_SLEDS_SENTINEL_REASON, | |
| SLED_HAS_SPACE_SENTINEL => SLED_HAS_SPACE_SENTINEL_REASON, | |
| BANNED_SLEDS_SENTINEL => BANNED_SLEDS_SENTINEL_REASON, | |
| REQUIRED_SLEDS_SENTINEL => REQUIRED_SLEDS_SENTINEL_REASON, | |
|
|
||
| REQUIRED_SLEDS_SENTINEL => REQUIRED_SLEDS_SENTINEL_REASON, | ||
|
|
||
| _ => unreachable!("unrecognized sentinel"), |
There was a problem hiding this comment.
I think I'd kinda rather we not panic here, even though I agree that it should be unreachable in real life. We could just return the string instead?
| _ => unreachable!("unrecognized sentinel"), | |
| _ => "unrecognized sentinel", |
There was a problem hiding this comment.
frankly we could also "just return the sentinel" in this case. the point is just "improve readability" in this case
There was a problem hiding this comment.
i like Sean's thing, you should do that
| ) { | ||
| // Concurrent sled reservations could have | ||
| // allocated enough hardware threads, RSS RAM, | ||
| // and/ or reservoir RAM that means this |
There was a problem hiding this comment.
| // and/ or reservoir RAM that means this | |
| // and/or reservoir RAM that means this |
| // permutations, meanwhile the insert query | ||
| // isn't succeeding for reasons other than the |
There was a problem hiding this comment.
| // permutations, meanwhile the insert query | |
| // isn't succeeding for reasons other than the | |
| // permutations. Meanwhile, the insert query | |
| // will not succeed for reasons other than the |
| // available space for local storage. In this | ||
| // case, one of the sentinels will be returned, | ||
| // and if we see those then bail out of the | ||
| // search right away and pick another | ||
| // sled_target. |
There was a problem hiding this comment.
thanks, this is definitely clearer!
|
|
||
| REQUIRED_SLEDS_SENTINEL => REQUIRED_SLEDS_SENTINEL_REASON, | ||
|
|
||
| _ => unreachable!("unrecognized sentinel"), |
There was a problem hiding this comment.
frankly we could also "just return the sentinel" in this case. the point is just "improve readability" in this case
The failure mode seen in #10353 involves multiple contending sled reservations and lead to a sled reservation iterating over all possible potential allocation mappings, causing the instance-start to appear to hang when in reality it was exploring all potential permutations.
The fix for that was to update the iterator's stored information so that it could recognize when the sled insert query it was about to execute could never work. This fix was insufficient as further testing lead to more instance-start sagas that appeared stuck.
During contention, there are a few broad categories to how the sled reservation insert query could fail:
the target sled could no longer have the available resources
putting a VMM on the target sled would now violate affinity or anti-affinity rules
the pools affected by requested local storage mapping no longer have the required available space
Updating the iterator's stored information would only let Nexus detect the last category where the instance-start saga hang seen today fell into the first. Nexus needs to know which part of the insert query failed so that it can decide what to do based on if local storage allocations are required or not, and this lead to this commit: using the "true or cast error" pattern, return sentinels from the sled insert resource query, and bail out of searching for a valid local storage allocation permutation if the sled target is no longer valid.
Fixes #10393