Support async saving of hosts' last seen time #5640
Codecov Report
Coverage Diff

| | main | #5640 | +/- |
|---|---|---|---|
| Coverage | 58.16% | 58.16% | -0.01% |
| Files | 358 | 359 | +1 |
| Lines | 33028 | 33144 | +116 |
| Hits | 19212 | 19279 | +67 |
| Misses | 11888 | 11925 | +37 |
| Partials | 1928 | 1940 | +12 |
    if !t.AsyncEnabled {
        t.seenHostSet.addHostID(hostID)
        return nil
    }
I'd prefer the caller do this. It would be more readable, and it's a bit unexpected that async.Task.SomeMethod could run synchronously depending on config/creation-arg (same case with labels and policies).
This way you also don't need to define "synchronized" related stuff on the async package (// seenHostSet implements synchronized storage for the set of seen hosts.)
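For readers following along, here is a minimal sketch of what a "synchronized" set of seen hosts could look like. Only the names `seenHostSet` and `addHostID` appear in this PR; the field layout and the `getAndClearHostIDs` helper are assumptions for illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// seenHostSet is a hypothetical mutex-guarded set of host IDs.
// The real type in the PR may be laid out differently.
type seenHostSet struct {
	mu      sync.Mutex
	hostIDs map[uint]struct{}
}

// addHostID records a host as seen. Safe for concurrent use.
func (s *seenHostSet) addHostID(id uint) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.hostIDs == nil {
		s.hostIDs = make(map[uint]struct{})
	}
	s.hostIDs[id] = struct{}{}
}

// getAndClearHostIDs (assumed helper) returns the IDs collected so far,
// sorted for determinism, and resets the set for the next flush.
func (s *seenHostSet) getAndClearHostIDs() []uint {
	s.mu.Lock()
	defer s.mu.Unlock()
	ids := make([]uint, 0, len(s.hostIDs))
	for id := range s.hostIDs {
		ids = append(ids, id)
	}
	s.hostIDs = make(map[uint]struct{})
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	var set seenHostSet
	var wg sync.WaitGroup
	// 100 concurrent check-ins from 10 distinct hosts.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(id uint) {
			defer wg.Done()
			set.addHostID(id % 10)
		}(uint(i))
	}
	wg.Wait()
	fmt.Println(len(set.getAndClearHostIDs())) // distinct hosts seen
	fmt.Println(len(set.getAndClearHostIDs())) // empty after clearing
}
```

Keeping a type like this next to the async task is what makes the package own both code paths, which is the trade-off being discussed in this thread.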
Does this also apply to FlushHostsLastSeen below? I kind of like the idea of either completely separating "synchronized" vs "non-synchronized" code OR keeping all the !task.AsyncEnabled checks in a centralized place as much as possible (e.g. moving the check that is done in serve.go here).
I agree that it's not ideal API-wise, but it's quite useful as a way to have the full feature implemented in one location instead of scattered across files/packages. To be honest, I don't think it's really cleaner to have this condition check at the call-site. It's also how the other 2 async features are implemented, so for consistency I think it's best for all of them to take full "ownership" of the feature. Maybe the bigger issue is the naming instead of the code separation, though.
(+1 to this being a naming issue, that was my first thought)
> Does this also apply to FlushHostsLastSeen below? I kind of like the idea of either completely separating "synchronized" vs "non-synchronized" code OR keeping all the !task.AsyncEnabled checks in a centralized place as much as possible (e.g. moving the check that is done in serve.go here).

Yeah, I'd like that too, although in that case I wanted to avoid creating a goroutine for nothing, hence the check inside serve.go. That said, it wouldn't be a big deal to have it start and run a no-op every 1-10s. Let me know how y'all feel about it, I don't mind either way!
I'd trust your instinct! The code as-is LGTM!
If we agree about the naming issue, I'll address this in a subsequent PR. Suggestions welcome: I think task or job is too generic. Maybe asyncable? It conveys that it can be, but is not necessarily, async.
+1 to naming change for now (And doing it on a separate PR sounds good too.)
    name:         "collect_last_seen",
    pool:         t.Pool,
    ds:           t.Datastore,
    execInterval: t.CollectorInterval,
Does it make sense for host seen times to have the same collector interval as the other two? (30s is the default IIRC.) Maybe we should allow users to configure it per feature?
Yeah eventually I think we may want to expand the configs so each async feature has its own options (at least for the interval), but for now I think it's ok at least to try this out in load testing and see if we see the gains we expect.
lucasmrod left a comment
LGTM! Left a couple of comments.
Editor pass for: - #5640
#5536
Checklist for submitter
If some of the following don't apply, delete the relevant line.
- changes/ and/or orbit/changes/).
- Documented any API changes (docs/Using-Fleet/REST-API.md)
- Documented any permissions changes
- Added support on fleet's osquery simulator cmd/osquery-perf for new osquery data ingestion features.
- Manual QA for all new/changed functionality

Note: to be validated/tested in load testing environment, to see if this unblocks further scaling regarding number of hosts.