Skip to content

Conversation

@j-c-c
Copy link
Collaborator

@j-c-c j-c-c commented May 16, 2025

This PR resolves the downloader mismatched hash reported in #1276. This PR

  1. Updates hashes
  2. Adds logic to warn on hash mismatch instead of raise
  3. Adds a scheduled_workflow.yml, scheduled pytest marker, and downloader test marked "scheduled" that will fail the workflow in the case of a hash mismatch.

If we'd prefer to just merge in the updated hashes I can break off the other adds into a separate PR.

@j-c-c j-c-c self-assigned this May 16, 2025
@j-c-c j-c-c added bug Something isn't working CI Continuous Integration cleanup support User Support labels May 16, 2025
@codecov
Copy link

codecov bot commented May 16, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 90.60%. Comparing base (845a750) to head (363f02e).
Report is 11 commits behind head on develop.

Files with missing lines Patch % Lines
src/aspire/downloader/data_fetcher.py 50.00% 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1279      +/-   ##
===========================================
- Coverage    90.63%   90.60%   -0.03%     
===========================================
  Files          132      132              
  Lines        14174    14181       +7     
===========================================
+ Hits         12846    12849       +3     
- Misses        1328     1332       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@j-c-c j-c-c requested a review from garrettwrong May 19, 2025 14:54
@j-c-c
Copy link
Collaborator Author

j-c-c commented May 20, 2025

@garrettwrong Oof, now I'm remembering why we didn't use scheduled jobs before. Scheduled jobs have to run off the default branch, ie. main. So adding the logic to have the job run on develop only makes it so the workflow never runs.

I think our options are to either

  1. Change the scheduled_workflow to a manual_workflow that we can run manually by using the workflow_dispatch trigger. The downside here would be that we need to remember to trigger the workflow periodically.
  2. Scrap the scheduled_workflow and add the test to the expensive suite. The downside there would be we are attempting to download our whole registry (~3G) on review ready PRs. Upside is it's automatic and we can catch any changes to the hashes and make a patch.

Thoughts? Any other options you can think of?

@garrettwrong
Copy link
Collaborator

2 is a hard no.

If I am reading the documentation correctly... the workflow (yaml) must be committed to default branch. However, you can use the checkout action as per usual to checkout any branch you want (eg develop). This will probably over-ride the 'manually selected checkout branch in the web ui dropdown' widget, but I don't think we care much about that...

The most unfortunate thing about that is you will want to be sure everything is correct before it hits main and gets queued to run .... I see you added the positive test case I mentioned; combined with manually testing on a fork that is probably sufficient to avoid errors in the ci workflow syntax itself. The tests you can test directly locally after your refactor. Maybe it makes more sense why I suggested that now.

@j-c-c
Copy link
Collaborator Author

j-c-c commented May 21, 2025

Ok. Looks like I was not properly checking out develop. Fixed that and am currently waiting to see if the scheduled job runs as expected on my fork. Sorry about that.

@j-c-c j-c-c requested a review from garrettwrong May 21, 2025 17:25
@j-c-c
Copy link
Collaborator Author

j-c-c commented May 21, 2025

This is ready for another look. Tested on a fork and everything is running as expected. I removed the run_workflow button as discussed the dev meeting.

Copy link
Collaborator

@garrettwrong garrettwrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks!

In case it is useful in future... You could have tested the error case by using a mock function to force the warning. That would exercise the capturing code and except branch (which should go on to download the correct file). If you wanted to avoid the download most of the time, that could be mocked as well, to return the file that exists in cache already. Just a different way to approach it.

@j-c-c
Copy link
Collaborator Author

j-c-c commented May 22, 2025

In case it is useful in future... You could have tested the error case by using a mock function to force the warning. That would exercise the capturing code and except branch (which should go on to download the correct file). If you wanted to avoid the download most of the time, that could be mocked as well, to return the file that exists in cache already. Just a different way to approach it.

Ok, thanks for the suggestion! I'll consider it in the future :)

@j-c-c j-c-c marked this pull request as ready for review May 22, 2025 18:17
@j-c-c j-c-c requested a review from janden as a code owner May 22, 2025 18:17
@j-c-c j-c-c merged commit 88b2110 into develop Jun 3, 2025
44 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working CI Continuous Integration cleanup support User Support

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants