Display running state for testing in Bodhi #4853

Closed
thrix opened this issue Dec 7, 2022 · 15 comments

Comments

@thrix

thrix commented Dec 7, 2022

We would like to improve the user experience by showing the running state in Bodhi.

Obviously, resultsdb now contains running state for some of the results.

https://resultsdb.fedoraproject.org/results?&testcases=fedora-ci.koji-build.tier0.functional

It would be great if Bodhi could display this state, so the user has a clearer idea of what state the request is in.

@AdamWill
Contributor

AdamWill commented Dec 7, 2022

I can probably work on this if I can find time. Note, we should also do QUEUED, not just RUNNING. Bodhi gets results via greenwave, so this may require changes to greenwave (not sure if greenwave's current query retrieves/passes along QUEUED or RUNNING results).

@thrix
Author

thrix commented Dec 14, 2022

@AdamWill yeah, I need to check how this works in downstream, but I believe the resultsdb running/queued "results" should be available in the greenwave response, so I would take it from there.

AFAIK this is how the CI dashboard consumes those results and shows them as running.

@voxik

voxik commented Jan 4, 2023

This feature would be super useful, because I am annoyed by this update right now. It says "fedora-ci.koji-build.tier0.functional is a required test" for the libselinux and pcs components. However, ATM I have no idea what that actually means, because it does not even give me a hint as to whether the test is running, where it is running, or how I can check its state.

I was just told that I can check resultsdb, but without asking, I had no chance of figuring this out.

BTW all the documentation refers to ResultsDB, but to its sources instead of actual instances. I wish this was changed.

@voxik

voxik commented Jan 4, 2023

Just FTR, the update is a mass rebuild of Ruby packages. I have no idea if some of the components have test suites or gating or whatnot. So better reporting of the CI status would really have helped me understand whether I should wait or just waive the tests.

@AdamWill
Contributor

AdamWill commented Jan 4, 2023

If a CI test is required for a package - that is, a test whose name starts with 'fedora-ci' - that means the package maintainer has manually configured it that way and they really want that test to pass. No CI tests are gating by default as of yet; to gate a package on a CI test requires manual action on the part of the packager. (On the other hand, openQA tests - tests whose names start with 'updates', for updates - are gating by default for all critical path updates outside of Rawhide).

If a required test doesn't run, either CI ought to be running it but isn't, or the packager shouldn't have required it (maybe it's an old test which is no longer run, but they haven't updated the package's gating config).

However, in this case, the tests all failed, they aren't missing. If you mouse over the 'failed' icon (the minus sign in a red circle) it gives you the status, which would be MISSING (IIRC) if the test had not run or not yet completed, but is actually FAILED, which means it ran and failed. Also, clicking on the line for a failed test should take you to the results of that test (the target of the link is controlled by the system submitting the result to resultsdb). For CI tests it takes you to Jenkins, which is always fun. e.g. clicking on the libselinux failure takes you to https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/203316/ .

My first step to try and figure out what the hell is going on in Jenkins is usually to click Console Output, which shows us https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/203316/console ; in there is the magic message "ERROR: Test environment installation failed: reason unknown, please escalate", which usually means "something is broken in CI, bug the CI admins about it".

@voxik

voxik commented Jan 5, 2023

@AdamWill I am really scratching my head over how to respond, because I really do appreciate the time and effort you have spent on this response (and all your work generally). However, it just highlights the poor state of things. It should be intuitive enough not to need four paragraphs to describe just the basics.

But trying to stick to the context of this ticket and not diving into other dark corners of our CI:

  1. It is totally confusing that, while the tests are probably running somewhere, I am shown a 'MISSING' status.
  2. The tests are finished now and it seems that the "red failed squares" are clickable (probably the "line for a failed test" in your terminology). But I don't think they were clickable when they said 'MISSING'. I was not given any additional information about what the 'MISSING' is based upon.
  3. Even if they were clickable, the UI is terrible (sorry, no offense). It is not obvious that I could click on the "red failed square" to get more details. It does not even show the target URL to make it somehow obvious. Instead it displays some icons with pop-overs, which makes it even less obvious that one should click on the line. Why do we even have to discuss whether it is a "red failed square" or a "line for a failed test"? Standard UIs have links and buttons. This would probably deserve its own ticket, dunno.

OT: I have reported this ticket trying to address the specific test failures. However, I have no idea if I reported it at the right place, because I don't know where I could find the sources for the specific test case. The CI is unfortunately generally frustrating and I always feel very incapable trying to solve even the tiniest problem. You might remember this ticket.

@AdamWill
Contributor

AdamWill commented Jan 5, 2023

Well, I was just trying to help you deal with the problem you're having right now. I'm not saying it means everything's fine right now.

Yes, it's expected that MISSING lines aren't clickable, because nothing knows where to send you if you were to click on such a line. The thing that knows the result is "missing" is Greenwave; this is where Bodhi gets the information from. But all Greenwave "knows" is that there has to be a passed result for that test. Greenwave has no knowledge about the test itself or the test system that ought to provide it. All it has is an 'identifier' for the test result to look up in resultsdb; when it looks it up and doesn't find anything, it's MISSING.

When it looks up the test and finds a result, it has more information - all the information in the test result. Both the test systems we care about (openQA and Fedora CI) include a URL when filing results to resultsdb. So for a failed (or passed, for that matter) test, the result in resultsdb includes a URL which Greenwave finds and passes on to Bodhi, so Bodhi can make the row for the failed result point to that URL. But for a missing test, there's just no way - at present - for Bodhi to know where to point. Resolving this ticket would resolve that, because the way we intend to resolve this ticket is to have test systems file "results" when a test is scheduled and/or running; those "results" will also have URLs, so Greenwave and Bodhi will have somewhere to point to even for tests which have not completed. At that point, a result will only be "missing" if a test is listed in the gating config but not actually scheduled by any test system, which obviously would be an error on someone's part.
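
The lookup described above can be sketched roughly as follows. This is only an illustration, not Bodhi's or Greenwave's actual code: the function name `rowForRequirement`, the second test name, and the object fields (`testcase`, `outcome`, `url`) are assumptions made up for the example.

```javascript
// Hypothetical sketch: build one table row for a required test.
// `results` stands in for whatever result data was found for the update;
// the field names are assumptions, not the real Greenwave schema.
function rowForRequirement(testcase, results) {
  const found = results.find((r) => r.testcase === testcase);
  if (!found) {
    // Nothing was found for this test, so there is no URL to point to.
    return { testcase, status: 'MISSING', href: null };
  }
  // A filed result carries the URL supplied by the test system, so the
  // row can link to it. This would also hold for QUEUED/RUNNING "results".
  return { testcase, status: found.outcome, href: found.url || null };
}

const results = [
  {
    testcase: 'fedora-ci.koji-build.tier0.functional',
    outcome: 'FAILED',
    url: 'https://osci-jenkins-1.ci.fedoraproject.org/job/fedora-ci/job/dist-git-pipeline/job/master/203316/',
  },
];

const failedRow = rowForRequirement('fedora-ci.koji-build.tier0.functional', results);
const missingRow = rowForRequirement('some.other.required.test', results);
```

Here `failedRow` gets a link target because a result was filed, while `missingRow` has nothing to link to.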

To point 3 - well, the "line" includes the "square". By "line" I mean one row in the table of results - the whole thing, with the green or red or yellow or blue background, the red square or green check or whatever, the name of the test, sometimes a little lozenge to indicate the test's "environment", and the time. The whole row is clickable, not just the icon. On a computer, when you mouse over a row that acts as a hyperlink, the background color of the row goes darker and the pointer changes, just like when you mouse over a "regular" hyperlink; this is intended to communicate that the row acts as a hyperlink. Were you testing on a computer, saw this, but it still didn't clearly indicate "hyperlink"? Or were you testing on a mobile device, where I guess those clues aren't available?

I didn't design the UI, so I can't make any claims about why it was designed any given way. I guess you could instead have a "Results" link between the test name or lozenge and the time. I don't know if that would be better - I guess it might be more obvious, though a smaller target.

@voxik

voxik commented Jan 5, 2023

Thx for elaborating. The envisioned design is certainly a step in the right direction and I look forward to it 👍

> Were you testing on a computer, saw this, but it still didn't clearly indicate "hyperlink"?

Right. This is what a regular hyperlink looks like:

Screenshot from 2023-01-05 21-04-24

i.e. the URL is displayed by the browser. That is not the case for the results page:

Screenshot from 2023-01-05 21-07-23

@AdamWill
Contributor

I kinda started looking into this today. We actually have more of the plumbing in openQA than I remembered; we already publish messages (in both openQA's own style and the standardized CI style) when jobs are queued, mostly. So it should be easy enough to extend the message consumer that already publishes the results to resultsdb to also publish 'results' when a job is queued. We don't currently do 'running' status, I don't think, but I can look into that later.

However, I did find a relatively important case where openQA doesn't emit a message (so we can't publish fedmsgs, and can't report a result): https://progress.opensuse.org/issues/123625 . I'll try and get that resolved and work on the result reporting at the same time.

@AdamWill
Contributor

AdamWill commented Feb 9, 2023

Update on this, I now have openQA in staging reporting QUEUED 'results' in most cases (not the RETRY case linked above). I'll use that to work on the Bodhi side of things tomorrow if I can.

@AdamWill
Contributor

AdamWill commented Feb 9, 2023

Working on this further today, here's another fun thing I hit: with current greenwave, Bodhi cannot possibly tell from a non-verbose query response whether a result is truly "missing" (there is no result for it at all in resultsdb) or "incomplete" (there is a QUEUED or RUNNING result filed in resultsdb).

I also ran into an inconsistency between verbose and non-verbose greenwave query responses which is caused by greenwave caching results (the cache is bypassed on verbose queries). Still thinking what to do with that.
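
To make the "missing" versus "incomplete" distinction concrete, here is a minimal sketch. It assumes the caller already knows the latest outcome filed in resultsdb for a required test (or `null` if nothing was filed); `classifyRequirement` and the label `'complete'` are hypothetical and not part of Greenwave or Bodhi.

```javascript
// Outcomes that mean a test has been scheduled but has not finished yet.
const IN_PROGRESS = new Set(['QUEUED', 'RUNNING']);

// Hypothetical helper: classify a required test from its latest outcome.
// `latestOutcome` is null when resultsdb holds no result at all for the test.
function classifyRequirement(latestOutcome) {
  if (latestOutcome == null) {
    return 'missing'; // no result of any kind was filed
  }
  if (IN_PROGRESS.has(latestOutcome)) {
    return 'incomplete'; // only a QUEUED or RUNNING "result" so far
  }
  return 'complete'; // a final outcome such as PASSED or FAILED
}

const noResult = classifyRequirement(null);
const queued = classifyRequirement('QUEUED');
const failed = classifyRequirement('FAILED');
```

The point of the issue described above is that a non-verbose response does not give Bodhi enough information to make the first two cases distinguishable.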

@AdamWill
Contributor

AdamWill commented Feb 9, 2023

So, interestingly, Bodhi does actually kinda already show these. They show up in the automated results table as grey-background rows with no icon, because the row style looks up a CSS class for the background color and an icon from tables based on the outcome, and there's no entry for QUEUED or RUNNING in either table.

A small change would give us blue-background rows with appropriate icons:

diff --git a/bodhi-server/bodhi/server/templates/update.html b/bodhi-server/bodhi/server/templates/update.html
index 12e06280..a32f9017 100644
--- a/bodhi-server/bodhi/server/templates/update.html
+++ b/bodhi-server/bodhi/server/templates/update.html
@@ -755,6 +755,8 @@ ${parent.javascript()}
       ABORTED: 'warning',
       CRASHED: 'warning',
       ABSENT: 'danger',
+      QUEUED: 'info',
+      RUNNING: 'info',
     }
     var icons = {
       PASSED: 'check-circle',
@@ -764,6 +766,8 @@ ${parent.javascript()}
       ABORTED: 'trash',
       CRASHED: 'fire', // no joke.
       ABSENT: 'question-circle',
+      QUEUED: 'hourglass-top',
+      RUNNING: 'hourglass-split'
     }
 
     var update = '${update.alias}';

I'm gonna sit on that and not submit it yet, though, till I hear back from lukas about the problem mentioned above. It'd be nice to send a PR which does more than just that, but we need to be able to tell a 'missing' requirement from an 'incomplete' requirement in the greenwave requirements list in order to be able to do that.
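
The table lookup behind this can be sketched in isolation. The snippet below is a simplified stand-in, not the code in update.html: the helper `rowStyle` and the variable names are hypothetical, and only a few table entries from the diff above are reproduced (including its icon names, which a later comment in this thread says were wrong).

```javascript
// Simplified stand-ins for the two lookup tables touched by the diff.
const classes = { ABORTED: 'warning', CRASHED: 'warning', ABSENT: 'danger' };
const icons = {
  PASSED: 'check-circle',
  ABORTED: 'trash',
  CRASHED: 'fire',
  ABSENT: 'question-circle',
};

// Hypothetical helper: what a row gets for a given outcome.
function rowStyle(outcome, classTable, iconTable) {
  return {
    cssClass: classTable[outcome] ?? null, // null: no background class, so the default grey row
    icon: iconTable[outcome] ?? null, // null: no icon is rendered
  };
}

// Without entries for QUEUED, the row falls back to grey with no icon.
const before = rowStyle('QUEUED', classes, icons);

// With the entries added by the diff, the row gets a class and an icon.
const patchedClasses = { ...classes, QUEUED: 'info', RUNNING: 'info' };
const patchedIcons = { ...icons, QUEUED: 'hourglass-top', RUNNING: 'hourglass-split' };
const after = rowStyle('QUEUED', patchedClasses, patchedIcons);
```

This is why adding two entries per table is enough to change how such rows render, without touching the rendering logic itself.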

@AdamWill
Contributor

#5139

@AdamWill
Contributor

This is mostly fixed in 7.1.1. However, I messed up the icon names - #5187 fixes that. I think tooltips for the icons may still not be working for some reason even after that fix, I'm not sure why not though.

@mattiaverga
Contributor

Should be fixed now in Bodhi 7.2.0 deployed on prod.
