Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix longitudinal report #5783

Merged
merged 3 commits into from Feb 9, 2022
Merged

Fix longitudinal report #5783

merged 3 commits into from Feb 9, 2022

Conversation

ian-r-rose
Copy link
Collaborator

There was some weird variation in behavior here depending on precise versions of libraries that I didn't really track down, but the heart of the issue was that, a few days ago, a test failure message included a full HTML string that was not escaped, and messed up the resulting HTML document.

Fix is live on my fork

@github-actions
Copy link
Contributor

github-actions bot commented Feb 9, 2022

Unit Test Results

       18 files  +       6         18 suites  +6   9h 39m 9s ⏱️ + 3h 12m 7s
  2 595 tests +       2    2 511 ✔️ +     12       80 💤  -   11  3 ±0  1 🔥 +1 
23 221 runs  +7 711  21 744 ✔️ +7 129  1 472 💤 +580  4 +1  1 🔥 +1 

For more details on these failures and errors, see this check.

Results for commit 78779bd. ± Comparison against base commit 65f4151.

♻️ This comment has been updated with latest results.

Copy link
Member

@fjetter fjetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a test failure message included a full HTML string that was not escaped, and messed up the resulting HTML document.

I know we made the entire report more robust, primarily for missing artifacts. Is there a way to spot these kind of errors?

@@ -172,7 +173,7 @@ def dedup(group):
for w in workflows
if (
pandas.to_datetime(w["created_at"])
> pandas.Timestamp.now(tz="UTC") - pandas.Timedelta(days=90)
> pandas.Timestamp.now(tz="UTC") - pandas.Timedelta(days=60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why reduce this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to be concerned about bumping up against GitHub's rate limit for the use of GITHUB_TOKEN in actions (currently 1000 requests per hour). Roughly speaking, we are downloading ~10 artifacts per workflow run, plus a slew of administrative requests here. An alternative would be to just choose the most recent ~60-80 runs or thereabouts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, got it.

I think what we're looking for is indeed "last X runs" and not "last X days". The way I understand this code, that would be very easy to change, wouldn't it?

@ian-r-rose
Copy link
Collaborator Author

I know we made the entire report more robust, primarily for missing artifacts. Is there a way to spot these kind of errors?

I don't think there is an easy way -- the test failure in question was checking for a specific string in the HTML for the dask dashboard.

To be clear: this fix shouldn't be necessary. Vega specs should be robust to unusual strings, and indeed, this probably represents a security hole in Vega where a malicious user could inject HTML/JS to do something bad to a user. I suspect a bug was recently introduced in the Altair+Vega+VegaLite stack somewhere, but I wasn't able to find it in a timely fashion, and I wanted to unblock this visualization.

Copy link
Member

@fjetter fjetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. If you're up for it you may change the Xdays to Xruns but I'm fine with merging this either way

@@ -172,7 +173,7 @@ def dedup(group):
for w in workflows
if (
pandas.to_datetime(w["created_at"])
> pandas.Timestamp.now(tz="UTC") - pandas.Timedelta(days=90)
> pandas.Timestamp.now(tz="UTC") - pandas.Timedelta(days=60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, got it.

I think what we're looking for is indeed "last X runs" and not "last X days". The way I understand this code, that would be very easy to change, wouldn't it?

@fjetter
Copy link
Member

fjetter commented Feb 9, 2022

As always, thanks @ian-r-rose for fixing these builds so quickly

@ian-r-rose
Copy link
Collaborator Author

I think what we're looking for is indeed "last X runs" and not "last X days". The way I understand this code, that would be very easy to change, wouldn't it?

That's right -- I'll push a change for this right now

@ian-r-rose
Copy link
Collaborator Author

Done

@fjetter fjetter merged commit 8a5e014 into dask:main Feb 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test report broken
2 participants