-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
overlord: allow max 500 changes in "ready" state to avoid growing changes for 24h #2545
Conversation
6f747de
to
37d86f5
Compare
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
37d86f5
to
3d9ce1d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@@ -363,11 +364,16 @@ func (s *State) tasksIn(tids []string) []*Task { | |||
// Prune removes changes that became ready for more than pruneWait | |||
// and aborts tasks spawned for more than abortWait. | |||
// It also removes tasks unlinked to changes after pruneWait. | |||
func (s *State) Prune(pruneWait, abortWait time.Duration) { | |||
func (s *State) Prune(pruneWait, abortWait time.Duration, maxReadyChanges int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"maxReadyChanges" seems misleading since the comparison below looks at the total count instead of just the ready changes. This should probably be "pruneReadyOver" or similar, and the documentation above should be updated to mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I updated the name and the documentation now.
@@ -379,14 +385,16 @@ func (s *State) Prune(pruneWait, abortWait time.Duration) { | |||
} | |||
continue | |||
} | |||
if readyTime.Before(pruneLimit) { | |||
// change old or we have too many changes | |||
if readyTime.Before(pruneLimit) || (chg.Status().Ready() && len(s.changes) > maxReadyChanges) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change statuses aren't super cheap to compute. Best to avoid it in a frequent global iteration like this.
There's also no need in this case I believe. It has to be ready, otherwise readyTime would be zero and we wouldn't be here.
The logic here doesn't seem quite healthy though. This may end up killing changes as soon as they're done, to the point of the client being unable to even see their result.
Let's please talk about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, I removed the chg.Status().Ready() check now.
Good catch on the logic, if 100 changes are in flight, this will always kill the one change that just became ready, not what we want! As a strawman I implemented a real maxReadyChanges that only counts the changes that are ready, not the total changes. Do you think that is reasonable? Or should we do something else instead (like a min-duration of e.g. 10min that we allow a change to stay even if the limit is reached?).
On Jan 17, 2017 19:25, "Gustavo Niemeyer" <notifications@github.com> wrote:
*@niemeyer* requested changes on this pull request.
------------------------------
In overlord/state/state.go
<#2545 (review)>:
@@ -363,11 +364,16 @@ func (s *State) tasksIn(tids []string) []*Task {
// Prune removes changes that became ready for more than pruneWait
// and aborts tasks spawned for more than abortWait.
// It also removes tasks unlinked to changes after pruneWait.
-func (s *State) Prune(pruneWait, abortWait time.Duration) {
+func (s *State) Prune(pruneWait, abortWait time.Duration,
maxReadyChanges int) {
"maxReadyChanges" seems misleading since the comparison below looks at the
total count instead of just the ready changes. This should probably be
"pruneReadyAfter" or similar, and the documentation above should be updated
to mention it.
I missed noticing this... maybe we really want to implement maxReadyChanges
instead, hmm
|
sort.Sort(byReadyTime(changes)) | ||
|
||
// used just for couting | ||
readyChanges := map[string]bool{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering, isn't a counter enough? also given the ordering can't we just count backward in the main loop (maybe a bit obscure)
Changes are kept until they are pruned after 24h. This means that we keep all the changes of the last 24h around in memory. This is not ideal as LP:1642068 shows. This PR limits the amount to 100 total changes that are ready.
LP: #1642068