scheduler: ensure dup alloc names are fixed before plan submit. #18873
This change fixes a bug in the generic scheduler that allowed duplicate alloc indexes (names) to be submitted to the plan applier and written to state. The bug originates in the placement calculation, which assumes that the name of an allocation being replaced can be blindly copied to its replacement. This is not correct in all cases, particularly when dealing with canaries. The fix updates the alloc name index tracker to include minor duplicate tracking. This can be used when computing placements to ensure duplicates are found and a new name is picked before the plan is submitted. The name index tracking is now passed from the reconciler to the generic scheduler via the results, so it does not have to be regenerated and no additional data structure is needed.
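As a rough illustration of the idea described above, the following self-contained sketch tracks used indexes alongside a duplicate count and picks a fresh name when a replacement would reuse a duplicated index. It is not the code from this PR: the `nameIndex` type, `newNameIndex`, `Resolve`, the `[]bool` standing in for the bitmap, and the `job.group[index]` name format are all assumptions made for the example.

```go
package main

import "fmt"

// nameIndex is a hypothetical, simplified stand-in for the alloc name index
// tracker. A []bool plays the role of the bitmap used by the real scheduler.
type nameIndex struct {
	job, group string
	used       []bool       // used[i] is true once index i has been seen
	duplicates map[uint]int // extra sightings of an index that was already used
}

// newNameIndex records every index in inUse, counting repeats as duplicates.
func newNameIndex(job, group string, count uint, inUse []uint) *nameIndex {
	n := &nameIndex{
		job:        job,
		group:      group,
		used:       make([]bool, count),
		duplicates: make(map[uint]int),
	}
	for _, idx := range inUse {
		if idx >= count {
			continue // out-of-range indexes are ignored in this sketch
		}
		if n.used[idx] {
			n.duplicates[idx]++
		} else {
			n.used[idx] = true
		}
	}
	return n
}

// IsDuplicate reports whether idx was seen more than once.
func (n *nameIndex) IsDuplicate(idx uint) bool { return n.duplicates[idx] > 0 }

// Resolve returns the name a replacement for idx should use. A unique index
// keeps its name; a duplicated one is moved to the lowest free index.
func (n *nameIndex) Resolve(idx uint) string {
	if !n.IsDuplicate(idx) {
		return n.name(idx)
	}
	for free, taken := range n.used {
		if !taken {
			n.used[free] = true
			n.duplicates[idx]--
			return n.name(uint(free))
		}
	}
	return n.name(idx) // no free index left, so keep the original name
}

// name formats an alloc name; the format is an assumption for this example.
func (n *nameIndex) name(idx uint) string {
	return fmt.Sprintf("%s.%s[%d]", n.job, n.group, idx)
}

func main() {
	idx := newNameIndex("example", "web", 5, []uint{0, 1, 1, 2})
	fmt.Println(idx.Resolve(0)) // index 0 is unique, so its name is kept
	fmt.Println(idx.Resolve(1)) // index 1 is duplicated, so a free index is picked
}
```

Decrementing the duplicate count after a rename is a choice made for this sketch so that only the surplus allocations are renamed; whether the actual change does the same is not stated in the description.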
Nice work on this @jrasell. The fix is tightly scoped and very clear, and the testing and extra commentary are really helpful.
Great work!
allocIndex := a.Index()

if bitmap.Check(allocIndex) {
	duplicates[allocIndex]++
} else {
	bitmap.Set(allocIndex)
}
I think `duplicates` replaces `bitmap` entirely. `bitmap` is "a map of bools" when we need "a map of ints", hence the new `duplicates` mapping. However, I think ideally both `bitmap` and `duplicates` would be something like a slice of ints where the offset is the alloc index and the value is the number of allocs seen at that index: 0 indicating the index is unused, 1 indicating it is used, and >1 indicating it has duplicates. (This is all awkward to describe, since "alloc index" and "index of the slice" are two different concepts that could actually be the same thing!)
That being said, even if my suggestion works, it would be functionally identical to what you've implemented, so we should go with whatever the lowest-risk, easiest-to-read implementation is. Since this is in the parallelized part of scheduling, and CPU time is usually plentiful on servers, I'm far more concerned with correctness than performance.
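For reference, the slice-of-ints idea could look roughly like the sketch below. This is only an illustration of the suggestion, not code from the PR; the `indexCounts` type and its methods are hypothetical names.

```go
package main

import "fmt"

// indexCounts is a hypothetical sketch of the suggestion: the slice offset is
// the alloc index and the value is how many allocs were seen at that index.
type indexCounts []int

// newIndexCounts counts each seen index, ignoring any that fall out of range.
func newIndexCounts(size int, seen []int) indexCounts {
	c := make(indexCounts, size)
	for _, i := range seen {
		if i >= 0 && i < size {
			c[i]++
		}
	}
	return c
}

func (c indexCounts) inRange(i int) bool { return i >= 0 && i < len(c) }

// IsUsed reports whether at least one alloc holds index i.
func (c indexCounts) IsUsed(i int) bool { return c.inRange(i) && c[i] > 0 }

// IsDuplicate reports whether more than one alloc holds index i.
func (c indexCounts) IsDuplicate(i int) bool { return c.inRange(i) && c[i] > 1 }

func main() {
	c := newIndexCounts(4, []int{0, 1, 1})
	for i := range c {
		fmt.Printf("index %d: used=%t duplicate=%t\n", i, c.IsUsed(i), c.IsDuplicate(i))
	}
}
```

A single slice answers both "is this index used?" and "is it duplicated?", at the cost of one int per index instead of one bit.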
I agree that the name index tracking could be updated to include duplicates without using a bitmap. However, I think that rework should be done in a follow-up. This change seems fairly low risk and targeted, whereas a larger rewrite would require substantial regression testing. I'll raise an issue to discuss the idea of reworking the tracking, which I could noodle on casually.
Sharing some comments early in case I don't have a chance to come back for a deeper review later today, but great work!
// Pull the allocation name as a new variables, so we can alter
// this as needed without making changes to the original
// object.
newAllocName := missing.Name()
Maybe I'm misunderstanding it, but this comment seems a bit inaccurate. `newAllocName` doesn't seem to be modified, and even if it were, strings are immutable, so it shouldn't affect `missing`? 🤔
`newAllocName` is potentially modified on L654 if it is found to be a duplicate.
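Both points in this exchange can be seen in a small standalone sketch: the local copy of the name may be reassigned, and doing so leaves the original object untouched because Go strings are immutable values. The `alloc` type and `pickName` function are hypothetical and only mirror the shape of the snippet under review.

```go
package main

import "fmt"

// alloc is a hypothetical stand-in for the object that owns the name.
type alloc struct{ name string }

// Name returns a copy of the string value held by the alloc.
func (a *alloc) Name() string { return a.name }

// pickName copies the name into a local variable and reassigns that local
// when a replacement is supplied. The alloc itself is never written to.
func pickName(a *alloc, replacement string) string {
	newAllocName := a.Name()
	if replacement != "" {
		newAllocName = replacement
	}
	return newAllocName
}

func main() {
	missing := &alloc{name: "example.web[1]"}
	picked := pickName(missing, "example.web[3]")
	fmt.Println("original:", missing.Name())
	fmt.Println("picked:  ", picked)
}
```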
// future version of Nomad.
if taskGroupNameIndex.IsDuplicate(allocIndex) {
	oldAllocName := newAllocName
	newAllocName = taskGroupNameIndex.Next(1)[0]
It's not immediately obvious to me how this could cause any problems, but given that we're trying to avoid duplicate names, it could be useful to investigate further whether this bit of code could cause problems:
nomad/scheduler/reconcile_util.go
Lines 739 to 744 in fdde8a5
// We have exhausted the free set, now just pick overlapping indexes
var i uint
for i = 0; i < remainder; i++ {
	next = append(next, structs.AllocName(a.job, a.taskGroup, i))
	a.b.Set(i)
}
Maybe if, somehow, the `count` value differs between the time the `allocNameIndex` is created and the call to `Next()` (like in a job update? or a version revert?) we could maybe hit an overlap? 🤔
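To make the concern concrete, here is a simplified, self-contained model of a `Next`-style function that falls back to overlapping indexes once the free set is exhausted. It only approximates the quoted snippet: the `nameIndex` type, the `[]bool` used in place of the bitmap, and the name format are assumptions for illustration.

```go
package main

import "fmt"

// nameIndex is a hypothetical model of the tracker; len(used) stands in for
// the task group count that was known when the index was created.
type nameIndex struct {
	job, group string
	used       []bool
}

// Next returns n names. Free indexes are handed out first; once they run out,
// the remainder is filled by starting again from index 0, which overlaps with
// names that are already in use.
func (a *nameIndex) Next(n uint) []string {
	next := make([]string, 0, n)
	remainder := n
	for i := range a.used {
		if remainder == 0 {
			break
		}
		if !a.used[i] {
			a.used[i] = true
			next = append(next, a.name(uint(i)))
			remainder--
		}
	}
	// The free set is exhausted, so pick overlapping indexes.
	for i := uint(0); i < remainder; i++ {
		next = append(next, a.name(i))
	}
	return next
}

// name formats an alloc name; the format is an assumption for this example.
func (a *nameIndex) name(idx uint) string {
	return fmt.Sprintf("%s.%s[%d]", a.job, a.group, idx)
}

func main() {
	// Indexes 0 and 2 are taken, so only index 1 is free.
	idx := &nameIndex{job: "example", group: "web", used: []bool{true, false, true}}
	fmt.Println(idx.Next(2)) // the second name has to reuse an index already in use
}
```

In this model, asking for more names than there are free indexes always produces an overlap, which is the situation the question above is probing.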
That's a good thread to pull on and something I will take a look at once this PR has been merged. This resolves a reproducible manifestation of the bug, so I would like to get that fixed given the time pressures before opening up new investigations.
scheduler/reconcile.go
Outdated
// Duplicate allocation indexes can be caused due to the way this piece of
// code works. The reproduction involved canaries, and performing both a
Just to make sure I understood: this hot path can return early with duplicate alloc names? Is there a way to avoid that so that callers don't have to handle them?
The comment has been clarified, hopefully that makes things clearer? I spent a while looking to see if I could fix the problem at the source, but I wasn't able to figure out a way.
LGTM!
closes #10727
Reviewer Notes
The new test `TestServiceSched_JobModify_ProposedDuplicateAllocIndex` mimics the reproduction behaviour and can be used to see how the code change has fixed the bug. This reproduction can be used to test the code change manually. In my testing, the reproduction worked 100% of the time.