Display running state for testing in Bodhi #4853
Comments
I can probably work on this if I can find time. Note, we should also do |
@AdamWill yeah, I need to check how this works in downstream, but I believe the resultsdb running/queued "results" should be available in the greenwave response, so I would take it from there. Afaik this is how CI dashboard consumes those results, and shows them as running. |
This feature would be super useful, because I am annoyed by this update right now. It says "fedora-ci.koji-build.tier0.functional is a required test" for the libselinux and pcs components. However, ATM I have no idea what that actually means, because it does not even give me a hint whether the test is running, where it is running, or how I can check its state. I was just told that I can check resultsdb, but without asking, I had no chance to figure this out. BTW all the documentation refers to ResultsDB, but links to the sources instead of the actual instances. I wish this was changed. |
Just FTR, the update is a mass rebuild of Ruby packages. I have no idea whether some of the components have test suites or gating or whatnot. So better reporting of the CI status would really have helped me understand whether I should wait or just waive the tests. |
If a CI test is required for a package - that is, a test whose name starts with 'fedora-ci' - that means the package maintainer has manually configured it that way and they really want that test to pass. No CI tests are gating by default as of yet; to gate a package on a CI test requires manual action on the part of the packager. (On the other hand, openQA tests - tests whose names start with 'updates', for updates - are gating by default for all critical path updates outside of Rawhide). If a required test doesn't run, either CI ought to be running it but isn't, or the packager shouldn't have required it (maybe it's an old test which is no longer run, but they haven't updated the package's gating config).

However, in this case, the tests all failed, they aren't missing. If you mouse over the 'failed' icon (the minus sign in a red circle) it gives you the status, which would be MISSING (IIRC) if the test had not run or not yet completed, but is actually FAILED, which means it ran and failed.

Also, clicking on the line for a failed test should take you to the results of that test (the target of the link is controlled by the system submitting the result to resultsdb). For CI tests it takes you to Jenkins, which is always fun. e.g. clicking on the libselinux failure takes you to https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/203316/ .

My first step to try and figure out what the hell is going on in Jenkins is usually to click Console Output, which shows us https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/203316/console ; in there is the magic message "ERROR: Test environment installation failed: reason unknown, please escalate", which usually means "something is broken in CI, bug the CI admins about it". |
@AdamWill I am really scratching my head over how to respond, because I really do appreciate the time and effort you have spent on this response (and all your work generally). However, it just highlights the poor state of the matter. It should be much more intuitive than needing 4 paragraphs to describe just some basics. But trying to stick to the context of this ticket and not diving into other dark corners of our CI:
OT: I have reported this ticket trying to address the specific test failures. However, I have no idea if I reported it at the right place, because I don't know where I could find the sources for the specific test case. The CI is unfortunately generally frustrating and I always feel very incapable trying to solve even the tiniest problem. You might remember this ticket. |
Well, I was just trying to help you deal with the problem you're having right now. I'm not saying it means everything's fine right now.

Yes, it's expected that MISSING lines aren't clickable, because nothing knows where to send you if you were to click on such a line. The thing that knows the result is "missing" is Greenwave; this is where Bodhi gets the information from. But all Greenwave "knows" is that there has to be a passed result for that test. Greenwave has no knowledge about the test itself or the test system that ought to provide it. All it has is an 'identifier' for the test result to look up in resultsdb; when it looks it up and doesn't find anything, it's MISSING.

When it looks up the test and finds a result, it has more information - all the information in the test result. Both the test systems we care about (openQA and Fedora CI) include a URL when filing results to resultsdb. So for a failed (or passed, for that matter) test, the result in resultsdb includes a URL which Greenwave finds and passes on to Bodhi, so Bodhi can make the row for the failed result point to that URL. But for a missing test, there's just no way - at present - for Bodhi to know where to point.

Resolving this ticket would resolve that, because the way we intend to resolve this ticket is to have test systems file "results" when a test is scheduled and/or running; those "results" will also have URLs, so Greenwave and Bodhi will have somewhere to point to even for tests which have not completed. At that point, a result will only be "missing" if a test is listed in the gating config but not actually scheduled by any test system, which obviously would be an error on someone's part.

To point 3 - well, the "line" includes the "square". 
By "line" I mean one row in the table of results - the whole thing, with the green or red or yellow or blue background, the red square or green check or whatever, the name of the test, sometimes a little lozenge to indicate the test's "environment", and the time. The whole row is clickable, not just the icon. On a computer, when you mouse over a row that acts as a hyperlink, the background color of the row goes darker and the pointer changes, just like when you mouse over a "regular" hyperlink; this is intended to communicate that the row acts as a hyperlink. Were you testing on a computer, saw this, but it still didn't clearly indicate "hyperlink"? Or were you testing on a mobile device, where I guess those clues aren't available? I didn't design the UI, so I can't make any claims about why it was designed any given way. I guess you could instead have a "Results" link between the test name or lozenge and the time. I don't know if that would be better - I guess it might be more obvious, though a smaller target. |
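To illustrate why a MISSING row has nothing to link to while a failed or passed row does, here is a minimal Python sketch of the lookup described above. It is not Greenwave's or Bodhi's actual code: the `build_row` function, the shape of the `results` mapping, the `ref_url` field name, and the example URL and second test name are all assumptions made for illustration.

```python
from typing import Optional


def build_row(testcase: str, results: dict) -> dict:
    """Describe one row of the results table for a required test.

    `results` is a hypothetical mapping from test case name to the
    latest filed result; a test with no filed result has no entry.
    """
    result: Optional[dict] = results.get(testcase)
    if result is None:
        # Nothing was filed, so there is no URL the row could point to.
        return {"testcase": testcase, "status": "MISSING", "url": None}
    # A filed result carries the URL supplied by the test system.
    return {
        "testcase": testcase,
        "status": result["outcome"],
        "url": result.get("ref_url"),
    }


# Hypothetical data: one filed result, one required test with no result.
results = {
    "fedora-ci.koji-build.tier0.functional": {
        "outcome": "FAILED",
        "ref_url": "https://example.invalid/job/1/",
    },
}
failed = build_row("fedora-ci.koji-build.tier0.functional", results)
missing = build_row("example.unscheduled_test", results)
```

Once test systems also file QUEUED or RUNNING "results" with a URL, the same lookup would return a link for tests that have not finished yet.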
Thx for elaborating. The envisioned design is certainly a step in the right direction and I look forward to it 👍
Right. This is what a regular hyperlink looks like: i.e. the URL is displayed by the browser. That is not the case for the results page: |
I kinda started looking into this today. We actually have more of the plumbing in openQA than I remembered; we already publish messages (in both openQA's own style and the standardized CI style) when jobs are queued, mostly. So it should be easy enough to extend the message consumer that already publishes the results to resultsdb to also publish 'results' when a job is queued. We don't currently do 'running' status, I don't think, but I can look into that later. However, I did find a relatively important case where openQA doesn't emit a message (so we can't publish fedmsgs, and can't report a result): https://progress.opensuse.org/issues/123625 . I'll try and get that resolved and work on the result reporting at the same time. |
Update on this, I now have openQA in staging reporting QUEUED 'results' in most cases (not the RETRY case linked above). I'll use that to work on the Bodhi side of things tomorrow if I can. |
Working on this further today, here's another fun thing I hit: with current greenwave, Bodhi cannot possibly tell from a non-verbose query response whether a result is truly "missing" (there is no result for it at all in resultsdb) or "incomplete" (there is a QUEUED or RUNNING result filed in resultsdb). I also ran into an inconsistency between verbose and non-verbose greenwave query responses which is caused by greenwave caching results (the cache is bypassed on verbose queries). Still thinking what to do with that. |
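The distinction being asked for here can be sketched in a few lines of Python. This is only an illustration of the "missing" versus "incomplete" idea, not Greenwave's response format or Bodhi's code; `classify_requirement`, `INCOMPLETE_OUTCOMES`, the `latest_outcomes` mapping and the second test name are hypothetical.

```python
# Outcomes that mean "a test system has picked this up but not finished".
INCOMPLETE_OUTCOMES = frozenset({"QUEUED", "RUNNING"})


def classify_requirement(testcase: str, latest_outcomes: dict) -> str:
    """Classify a required test from the latest outcome filed for it.

    `latest_outcomes` is a hypothetical mapping of test case name to
    the most recent outcome string; no entry means no result at all.
    """
    outcome = latest_outcomes.get(testcase)
    if outcome is None:
        return "missing"      # nothing filed in resultsdb
    if outcome in INCOMPLETE_OUTCOMES:
        return "incomplete"   # scheduled or running, not yet finished
    return "complete"         # a final outcome such as PASSED or FAILED


latest_outcomes = {
    "fedora-ci.koji-build.tier0.functional": "RUNNING",
    "example.finished_test": "PASSED",
}
states = {
    name: classify_requirement(name, latest_outcomes)
    for name in (
        "fedora-ci.koji-build.tier0.functional",
        "example.finished_test",
        "example.never_scheduled",
    )
}
```

The point of the comment above is that a non-verbose response does not expose enough to make the first two branches distinguishable.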
So, interestingly, Bodhi does actually kinda already show these. They show up in the automated results table as grey-background rows with no icon, because the row style looks up a CSS class for the background color and an icon from tables based on the outcome, and there's no entry for QUEUED or RUNNING in either table. A small change would give us blue-background rows with appropriate icons:
I'm gonna sit on that and not submit it yet, though, till I hear back from lukas about the problem mentioned above. It'd be nice to send a PR which does more than just that, but we need to be able to tell a 'missing' requirement from an 'incomplete' requirement in the greenwave requirements list in order to be able to do that. |
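The table-lookup behaviour described in this comment can be sketched as follows. This is not the actual Bodhi change, which is not reproduced in this thread; the table names, CSS class names and icon names below are all invented for illustration, and Python is used only to show the fallback logic.

```python
# Hypothetical lookup tables keyed by outcome (names are illustrative).
ROW_CLASSES = {"PASSED": "table-success", "FAILED": "table-danger"}
ROW_ICONS = {"PASSED": "check-circle", "FAILED": "minus-circle"}
DEFAULT_ROW_CLASS = "table-secondary"  # grey fallback for unknown outcomes


def row_style(outcome, classes=ROW_CLASSES, icons=ROW_ICONS):
    """Return (css_class, icon) for an outcome; icon is None if unknown."""
    return classes.get(outcome, DEFAULT_ROW_CLASS), icons.get(outcome)


# Without entries for QUEUED/RUNNING: grey row, no icon.
before = row_style("RUNNING")

# Adding entries to both tables yields a blue row with an icon.
EXTENDED_CLASSES = {**ROW_CLASSES, "QUEUED": "table-info", "RUNNING": "table-info"}
EXTENDED_ICONS = {**ROW_ICONS, "QUEUED": "clock", "RUNNING": "spinner"}
after = row_style("RUNNING", EXTENDED_CLASSES, EXTENDED_ICONS)
```

The fallback explains the grey, icon-less rows: an outcome missing from both tables still renders, just with the default class and no icon.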
This is mostly fixed in 7.1.1. However, I messed up the icon names - #5187 fixes that. I think tooltips for the icons may still not be working even after that fix, though I'm not sure why. |
Should be fixed now in Bodhi 7.2.0 deployed on prod. |
We would like to improve the experience by showing the running state in Bodhi.
ResultsDB now contains a running state for some of the results:
https://resultsdb.fedoraproject.org/results?&testcases=fedora-ci.koji-build.tier0.functional
It would be great if Bodhi could display this state, so the user has a clearer idea of what state the request is in.