Remove resource allocator #1990
Codecov Report
@@            Coverage Diff             @@
##           master    #1990      +/-   ##
===========================================
+ Coverage    71.8%   88.89%   +17.09%
===========================================
  Files         132      153      +21
  Lines        9614    10356     +742
===========================================
+ Hits         6903     9206    +2303
+ Misses       2711     1150    -1561
Codecov Report
@@            Coverage Diff             @@
##           master    #1990      +/-   ##
===========================================
+ Coverage    71.8%   88.97%   +17.17%
===========================================
  Files         132      153      +21
  Lines        9614    10340     +726
===========================================
+ Hits         6903     9200    +2297
+ Misses       2711     1140    -1571
Please give this a good test with multi-machine jobs on staging.
This already looks good and complete, with all the adjustments to the spec file and bootstrap script. Let's see how it works on staging.
+1
The multi-machine tests will take a little more time, I'm afraid, since our staging machines aren't working properly at the moment (and I'm still learning while trying to fix them 😉).
Your learning about them has priority over merging this :)
Small progress update: I have run the CaaSP multi-machine tests, and so far everything looks fine.
I'm wondering on which instance. Only openqa-staging-1 has CaaSP jobs, but they failed as incomplete.
@Martchus I restarted the CaaSP jobs right away yesterday to try again after they were successful, but it looks like the API keys expired around the same time...
@ldevulder also ran some independent HA/HPC tests successfully in his lab with this pull request applied.
Not sure why http://openqa-staging-1.qa.suse.de/tests/756 failed but I guess it is beyond the scope of removing the resource allocator :-)
When I started replacing the dbus methods in OpenQA::ResourceAllocator with a redis alternative, I noticed that dbus was only used for blocking RPC calls. That means we could just as well have used OpenQA::Resource::Locks/Jobs directly from the API controllers, making the resource allocator obsolete. So that's what I ended up doing. This pull request removes the resource allocator completely, and I will have to use another service for introducing redis into openQA (most likely the scheduler).
The only noticeable difference should be the removal of the single-process bottleneck that blocking RPC calls to OpenQA::ResourceAllocator caused. That also means there is a small risk that new race conditions will be introduced, now that locks/barriers can be managed by the prefork daemon. But looking through the code in OpenQA::Resource::Locks/Jobs (and having done a few tests), it seems rather defensive, and I'm cautiously optimistic that it will "just work". 😉 Worst case, we need to add one or two transactions to OpenQA::Resource::Locks later.
Progress: https://progress.opensuse.org/issues/46778
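To illustrate the kind of race the description refers to, here is a minimal sketch of an atomic "take the lock only if it is free" operation that stays safe when several prefork worker processes handle requests concurrently. It is written in Python with SQLite purely for illustration and is not openQA's Perl code: the `job_locks` table, its columns, and the function names are all hypothetical.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table: one row per named lock, locked_by is NULL when free.
    conn.execute(
        "CREATE TABLE job_locks (name TEXT PRIMARY KEY, locked_by INTEGER)"
    )


def create_lock(conn, name):
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.execute(
            "INSERT INTO job_locks (name, locked_by) VALUES (?, NULL)", (name,)
        )


def try_lock(conn, name, job_id):
    # A single conditional UPDATE avoids the read-then-write gap in which
    # two processes could both see the lock as free and both take it.
    with conn:
        cur = conn.execute(
            "UPDATE job_locks SET locked_by = ? "
            "WHERE name = ? AND locked_by IS NULL",
            (job_id, name),
        )
        return cur.rowcount == 1


def unlock(conn, name, job_id):
    # Only the job that holds the lock may release it.
    with conn:
        cur = conn.execute(
            "UPDATE job_locks SET locked_by = NULL "
            "WHERE name = ? AND locked_by = ?",
            (name, job_id),
        )
        return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    create_lock(conn, "barrier_setup")
    print(try_lock(conn, "barrier_setup", 1))  # first job gets the lock
    print(try_lock(conn, "barrier_setup", 2))  # second job is refused
```

The design choice here is to let the database decide the winner in one statement instead of checking the lock state in application code first. Whether the existing code in OpenQA::Resource::Locks already behaves this way is not something this sketch claims.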