New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
daemon, overlord/state: warnings pipeline #5514
Conversation
This is the backend work for the warnings pipeline. It has some (possibly ephemeral) debug verbs added as well, to let you play with it via http if that's your thing. Missing is: the frontend work: namely 'snap warnings', 'snap okay', and the messages after any command. deleting warnings when they expire. --- This is actually the third iteration, but the overlord/state code remains unchanged: the change has been to simplify the way the warnings summary is returned, thanks to pedronis's insightful insight.
Codecov Report
@@ Coverage Diff @@
## master #5514 +/- ##
==========================================
+ Coverage 78.94% 78.99% +0.05%
==========================================
Files 518 519 +1
Lines 39441 39619 +178
==========================================
+ Hits 31137 31298 +161
- Misses 5773 5786 +13
- Partials 2531 2535 +4
Continue to review full report at Codecov.
|
overlord/state/warning.go
Outdated
for k, w := range s.warnings { | ||
if w.IsDeletable(now) { | ||
delete(s.warnings, k) | ||
s.modified = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we do this in other places, Prune for example calls writing in a similar situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, will change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed the state bits, bunch of comments/questions
overlord/state/warning.go
Outdated
// the warning text itself. Only one of these in the system at a time. | ||
message string | ||
// the first time one of these messages was created | ||
firstSeen time.Time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seen here is a bit ambiguous with Shown, Emitted instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the function is AddWarning, so maybe s/seen/added/ ?
overlord/state/warning.go
Outdated
lastSeen time.Time | ||
// the last time one of these was shown to the user | ||
lastShown time.Time | ||
// how much time since one of these was last seen should we drop the message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again seen is unclear here, also we probably want it in the name deleteAfterLastEmitted ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also we tend to use deadlines and not duration, not sure how if that's relevant here though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these durations are relative to the deadlines in the same struct so I think they are ok
overlord/state/warning.go
Outdated
// how much time since one of these was last seen should we drop the message | ||
deleteAfter time.Duration | ||
// how much time since one of these was last shown should we repeat it | ||
repeatAfter time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeatAfterLastShown ?
overlord/state/warning.go
Outdated
return nil | ||
} | ||
|
||
func (w *Warning) IsDeletable(now time.Time) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should /delete/expire/ and call this Expired() ?
overlord/state/warning.go
Outdated
type byLastSeen []*Warning | ||
|
||
func (a byLastSeen) Len() int { return len(a) } | ||
func (a byLastSeen) Swap(i, j int) { a[i], a[j] = a[j], a[i] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coverage says this is not tested (or not always)
overlord/state/warning.go
Outdated
n := 0 | ||
for _, w := range s.warnings { | ||
if w.IsShowable(t) { | ||
w.lastShown = t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this always moves lastShown forward because of the def of IsShowable right? not sure it's obvious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
always? no... a warning with firstAdded > t
, or with lastShown + repeatAfter > t
, would not have been returned by WarningsToShow(t)
and will not have its lastShown
bumped by OkayWarnings(t)
func (s *State) UnshowAllWarnings() { | ||
s.writing() | ||
for _, w := range s.warnings { | ||
w.lastShown = time.Time{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't seem to be tested directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now it is
overlord/state/warning.go
Outdated
} | ||
|
||
// WarningsSummary returns the number of warnings that are ready to be | ||
// shown to the user, and the current timestamp (useful for ACKing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it say Okaying ?
overlord/state/warning.go
Outdated
|
||
n := 0 | ||
for _, w := range s.warnings { | ||
if w.IsShowable(t) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsShowable uses/checks firstSeen, but here at Okaying should we use lastSeen perhaps? should we okay or not warnings that were emitting again since t?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
translating to the new names (just so I don't get confused),
IsShowable
uses/checksfirstAdded
, but here at Okaying should we uselastAdded
perhaps? should we okay or not warnings that were emitting again sincet
?
A warning is showable at time t
iff
- it was known at
t
(that is,firstAdded
is not aftert
), and - it had not been shown at
t
(that is,lastShown
is zero), or it had been shown long beforet
(repeatAfter
before).
if a warning that would be showable at t
is added again at tʹ > t
, then IsShowable(t)
will be true
, and its lastShown
will be updated, but it will now have lastAdded == tʹ
so it will be again showable after repeatAfter
has elapsed.
This is, as I understand it, part of the design: if the system is throwing up a warning periodically, snap okay
should shut the warning up for a while (until repeatAfter
).
8148ed3
to
6cb16f3
Compare
6cb16f3
to
9700719
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really like the implementation, thank you!
A few easy comments:
daemon/api.go
Outdated
st.AddWarning(a.Message) | ||
return SyncResponse(true, nil) | ||
case "unshow-warnings": | ||
st.UnshowAllWarnings() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per comments below, we can probably make the terminology a bit more consistent across frontend and backend around the "ack" and "pending" terminology. This would become "ack-warnings", and perhaps the method could be the same AckWarnings, if it can accept a 0 timestamp for "all" maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note this is debug, and it's not "ack"ing the warnings; it's unsetting the last shown timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.. unshow is awkward, but sounds right for this.
daemon/api.go
Outdated
var ( | ||
stateOkayWarnings = (*state.State).OkayWarnings | ||
stateAllWarnings = (*state.State).AllWarnings | ||
stateWarningsToShow = (*state.State).WarningsToShow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ack verb also used here feels a bit more clear than "Okay":
How about:
- AckWarnings
- AllWarnings
- PendingWarnings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the user command would still be okay
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after talking on IRC, we're keeping "okay" everywhere -- having it be Ack internally but snap okay
would cause confusion, even more so because snap ack
exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unresolving because of the ToShow
-> Pending
rename
daemon/api.go
Outdated
switch sel { | ||
case "all": | ||
all = true | ||
case "to-show", "": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/to-show/pending/ in an attempt to have a common term there as well.
daemon/daemon.go
Outdated
} | ||
if rsp.Type != ResponseTypeError { | ||
st.Lock() | ||
rsp.addWarningsToMeta(st.WarningsSummary) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious.. I'd have expected PendingWarnings (WarningsToShow) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, the discussion on that as reflected on the whiteboard was a bit confusing. Initially we thought of printing the actual warnings after each command (“Warnings show up at the bottom of other commands, but, If more than 3, say WARNING: see snap warnings
”), but then as we thought of it more it got cut back to “On commands, if last warning is more recent than last seen show WARNING: There are new messages. See snap warnings
”. This sending the warning summary supports the latter directly, and the client can implement the former by hitting the warnings endpoint if the summary indicates it needs to.
overlord/state/warning.go
Outdated
} | ||
|
||
func (w *Warning) IsShowable(t time.Time) bool { | ||
return (w.lastShown.IsZero() || w.lastShown.Add(w.repeatAfter).Before(t)) && !w.firstAdded.After(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could it have been last shown before it was first added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when lastShown.IsZero()
, then firstAdded
can be after t
.
So I could rewrite this as
if w.lastShown.IsZero() {
return !w.firstAdded.After(t)
}
return w.lastShown.Add(w.repeatAfter).Before(t)
is that clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it was never shown, shouldn't we just say "yes, show it"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsShowable(t)
tells you whether it is or was showable at the given time. You can then decide things based on that. If a warning was not known at time t
, it would not have been showable at that time.
the t
argument is always in the past; very recent past, for when getting a list of everything we want to show to the user, or not so recent for when the user asks us to okay
everything that was shown to them before.
From that point of view perhaps your suggestion of ShowAfter
is even more confusing than IsShowable
(perhaps WasShowable
would be more accurate? dunno)
overlord/state/warning.go
Outdated
} | ||
|
||
// flattenWarning loops over the warnings map, and returns all | ||
// warnings therein as a flat list, for serialising. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice of a map for the messages feels a bit unusual. The slice feels slightly more natural for the in memory representation, and while it would be more expensive to find strings, it would avoid the cost of flattening it.
With that said, this is not a blocker as it's just the in-memory representation and easy to go one way or the other later as we learn more. So let's go with your choice.
overlord/state/warning.go
Outdated
// message it'll be added (with its firstAdded and lastAdded set to the | ||
// current time), otherwise the existing one will have its lastAdded | ||
// updated. | ||
func (s *State) AddWarning(message string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest "Warning" as the function name, and something a bit more comfortable:
func (s *State) Warning(format string, args ...interface{}) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want warnings to have arguments like this. That is, we don't want "You only have 200MB of disk space left"; we want "Your disk space is running low". Otherwise the "only hold one warning of a given type" idea goes down the drain.
If, as an iteration, we move to use the format
as the key, and have extra data in warnings where we store the args, then this would work. But I'd like to leave it as an iteration of the api (and call it Warningf
? I'm bad with names)
Warning
on its own seems strange to me; as it's a noun and not a verb I'd expect it to return a warning, not add it (like state.Task(id)
returns a task). But we've established that I'm bad with names 🤷♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this should really be Warn instead of Warning (or Warnf if you want). Can we go with that? There's not much reason to wait.. it's just a Sprintf inside the function.
I agree with your point about being careful with arguments so we're not bothering people and losing the benefits of the system, but we do want arguments anyway so we can talk to people about specific things instead of just general messages. For example, colloquially:
- Your snap "foo" is broken
- Partition /foo is running low on space
- Snapshot file at /.../foo is corrupted
etc.
overlord/state/warning.go
Outdated
w.firstAdded = t | ||
if err := w.validate(); err != nil { | ||
// programming error! | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please just log this? I'm concerned about us finding warnings in the field that we don't have tests for, and breaking down a system permanently as every time it starts it warns again and breaks down again.
overlord/state/warning.go
Outdated
} | ||
|
||
// DeleteExpired deletes warnings that have expired. | ||
func (s *State) DeleteExpired() int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function name lacks context. I also suggest making this a private function and baking it into Prune instead. It's goal is already to garbage collect old stuff, and it's already called opportunistically.
Thanks to @niemeyer and @pedronis for their feedback. This changes AddWarning to Warnf (and has it take args like Sprintf), changes WarningsToShow to PendingWarnings (including as an action on the daemon), renames Expired to ExpiredBefore and IsShowable to ShowAfter, and lastly drops DeleteExpired and adds expired warning pruning to flatten/unflatten and State.Prune itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, bunch of small comments/wonderings/nitpicks
overlord/state/warning.go
Outdated
} | ||
|
||
func (w *Warning) ShowAfter(t time.Time) bool { | ||
return (w.lastShown.IsZero() || w.lastShown.Add(w.repeatAfter).Before(t)) && !w.firstAdded.After(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had an hard time spotting the parenthesis initially, maybe swap the two side of the &&
overlord/state/warning.go
Outdated
}, time.Now().UTC()) | ||
} | ||
|
||
func (s *State) addWarningFull(w Warning, t time.Time) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this still have the Full in the name given we don't have AddWarning itself anymore ?
overlord/state/warning_test.go
Outdated
} | ||
|
||
for _, t := range []T1{ | ||
// snaity check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
overlord/state/warning_test.go
Outdated
for _, t := range []T1{ | ||
// snaity check | ||
{`{"message": "x", "first-added": "2006-01-02T15:04:05Z", "expire-after": "1h", "repeat-after": "1h"}`, nil}, | ||
// remove one at a time: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove?
overlord/state/warning.go
Outdated
DefaultRepeatAfter = time.Hour * 24 | ||
DefaultExpireAfter = time.Hour * 24 * 28 | ||
|
||
errNoMessage = errors.New("warning has no message") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bit strange that the error variables' names don't mention warning
flat := make([]*Warning, 0, len(s.warnings)) | ||
for _, w := range s.warnings { | ||
if w.ExpiredBefore(now) { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this bit tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep :-) one of the AllWarnings check happens after a flatten pass
s.warnings = make(map[string]*Warning, len(flat)) | ||
for _, w := range flat { | ||
if w.ExpiredBefore(now) { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this bit tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no; I'd have to play with times a little too much (it'd have to not have expired during serialise, and then expire before unserialise; or I'd have to do it by hand? i guess i could do that).
daemon/response.go
Outdated
@@ -72,14 +73,31 @@ func (r *resp) transmitMaintenance(kind errorKind, message string) { | |||
} | |||
} | |||
|
|||
func (r *resp) addWarningsToMeta(ws func() (int, time.Time)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand why does it take a function and not count, stamp ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change it to do this. Whatever the reason was it seems silly.
After confusing multiple reviewers I refactored ShowAfter in an attempt to make it easier to follow.
added comments; renamed addWarningFull to just addWarning; added Warning to the error names.
This is the backend work for the warnings pipeline. It has some
(possibly ephemeral) debug verbs added as well, to let you play with
it via http if that's your thing.
Missing is:
the frontend work: namely 'snap warnings', 'snap okay', and the
messages after any command.
deleting warnings when they expire.
This is actually the third iteration, but the overlord/state code
remains unchanged: the change has been to simplify the way the
warnings summary is returned, thanks to pedronis's insightful insight.