Speed up 25-cache-service.t with a shorter worker timeout #4362
@@ -93,7 +93,7 @@ sub cache_minion_worker {
     require OpenQA::CacheService;
     local $ENV{MOJO_MODE} = 'test';
     note('Starting cache minion worker');
-    OpenQA::CacheService::run(qw(run));
+    OpenQA::CacheService::run(qw(run --dequeue-timeout 1));
Wow, what is that? Can we speed up everything by a factor of 10 by making this `.1`? :) You could go into more detail in the git commit message. That would be nice.
Unfortunately there's a point where performance gets worse again (implementation detail upstream...). 😁
The Minion worker is designed as a daemon with a blocking mainloop that primarily blocks when waiting for new jobs to be dequeued. By default `Minion::Backend::SQLite` blocks for 5 seconds before handing control back to `Minion::Worker`, where it does housekeeping such as reaping failed jobs before blocking to dequeue again. That option limits the time it blocks to 1 second. So our workers can stop 4 seconds earlier on average, and this test stops a lot of workers.
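To make the effect of the timeout concrete, here is a minimal sketch of the loop described above, written as a deterministic simulation in Python rather than as Minion code. It assumes a simplified model in which no jobs arrive, each dequeue call blocks for the full timeout, and a stop request is only noticed between dequeue calls. The function name `stop_latency` and its parameters are hypothetical and not part of Minion or openQA.

```python
def stop_latency(dequeue_timeout, stop_requested_at):
    """Seconds between a stop request and the loop actually exiting.

    Simplified model (an assumption, not Minion's real implementation):
    the loop blocks in dequeue for `dequeue_timeout` seconds per pass
    because the queue is empty, and it only checks for a stop request
    between two blocking dequeue calls.
    """
    if dequeue_timeout <= 0:
        raise ValueError("this model needs a positive dequeue timeout")
    now = 0.0  # simulated clock, no real sleeping
    while True:
        # Housekeeping and the stop check happen here, between dequeues.
        if now >= stop_requested_at:
            return now - stop_requested_at
        # Blocking dequeue on an empty queue: the full timeout elapses.
        now += dequeue_timeout


# A stop request 0.5 s after the worker started blocking:
print(stop_latency(5, 0.5))  # default timeout of 5 seconds
print(stop_latency(1, 0.5))  # timeout of 1 second, as in this PR
```

In this model the worker with the 5-second timeout exits 4.5 seconds after the request and the one with the 1-second timeout exits after 0.5 seconds. The exact saving depends on when the stop request lands within a blocking period, but the latency never exceeds the timeout.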
A `0` timeout might work, but is a little risky since the mainloop could start eating a lot of CPU. 🤔
Yes, `0` is "a little" better.
All tests successful.
Files=1, Tests=23, 32 wallclock secs ( 0.04 usr 0.01 sys + 21.66 cusr 3.95 csys = 25.66 CPU)
Result: PASS
But at the cost of almost 5 times the CPU cycles. I think `1` has the best balance between runtime and CPU cost.
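The trade-off can be illustrated with a rough back-of-the-envelope model: the shorter the timeout, the more often the mainloop wakes up and does its per-pass work. The sketch below is only an illustration. The fixed per-pass overhead of 10 ms is a made-up figure, `loop_iterations` is a hypothetical helper, and the numbers are not measurements from this test.

```python
def loop_iterations(window_ms, dequeue_timeout_ms, overhead_ms=10):
    """Approximate number of mainloop passes within `window_ms`.

    Assumes an idle queue, so every pass costs the full dequeue timeout
    plus a fixed, hypothetical per-pass overhead (`overhead_ms`).
    Integer milliseconds avoid floating-point rounding surprises.
    """
    return window_ms // (dequeue_timeout_ms + overhead_ms)


# Passes during a 10-second idle window for three timeouts:
for timeout_ms in (5000, 1000, 0):
    print(timeout_ms, loop_iterations(10_000, timeout_ms))
```

Under these assumptions a timeout of 0 makes the loop spin orders of magnitude more often than a timeout of 1 second, which is the kind of busy-waiting that shows up as extra CPU time, while going from 5 seconds to 1 second adds only a handful of extra passes.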
The option `--dequeue-timeout` itself comes from `Minion::Command::minion::worker`, which `::run()` calls indirectly.
Codecov Report
@@ Coverage Diff @@
## master #4362 +/- ##
=======================================
Coverage 97.94% 97.94%
=======================================
Files 371 371
Lines 33705 33705
=======================================
Hits 33012 33012
Misses 693 693
|
As a bonus Before:
After:
|
Setting the timeout to
|
That's very impressive and a great side effect. Still, I would appreciate it if you could put these details into the git commit message.
The `--dequeue-timeout` option comes from `Minion::Command::minion::worker` and is used to decide how long the Minion backend will block when waiting for the next job to be dequeued. By reducing it from 5 seconds to 1 second we can speed up the worker shutdown in our tests. While it is possible to reduce the value further to 0 seconds, and speed up tests a little more, that also has the negative side effect of increasing CPU usage very significantly. So it is probably not worth doing.
Force-pushed from 5302593 to 7900679.
OK, I've added some more information to the commit message.
Turns out the test spends most of its time in a blocking SQL loop in `Minion::Backend::SQLite`.
Before:
After:
Progress: https://progress.opensuse.org/issues/102221