Update Downloader Hash #1279

j-c-c · 2025-05-16T18:36:32Z

This PR resolves the downloader hash mismatch reported in #1276. This PR:

Updates hashes
Adds logic to warn on a hash mismatch instead of raising
Adds a scheduled_workflow.yml, a scheduled pytest marker, and a downloader test marked "scheduled" that fails the workflow if a hash mismatch occurs.
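
For illustration, the warn-instead-of-raise behaviour could look roughly like the sketch below. The names sha256_of and check_hash are hypothetical and are not the functions in data_fetcher.py; the sketch only assumes that the registry stores a SHA256 hex digest per file.

```python
import hashlib
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def sha256_of(path):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 1 MiB at a time so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_hash(path, expected_hash):
    """
    Compare a file's SHA256 against the expected registry value.

    Hypothetical helper: logs a warning and returns False on a mismatch
    instead of raising, so the caller can still use the downloaded file.
    """
    actual = sha256_of(path)
    if actual != expected_hash:
        logger.warning(
            "Hash mismatch for %s: expected %s, got %s.",
            Path(path).name,
            expected_hash,
            actual,
        )
        return False
    return True
```

A scheduled test can then treat that warning (or a False return) as a failure, while regular users only see a log message.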

If we'd prefer to merge only the updated hashes, I can split the other additions into a separate PR.

codecov · 2025-05-16T18:53:26Z

Codecov Report

Attention: Patch coverage is 50.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 90.60%. Comparing base (845a750) to head (363f02e).
Report is 11 commits behind head on develop.

Files with missing lines	Patch %	Lines
src/aspire/downloader/data_fetcher.py	50.00%	4 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1279      +/-   ##
===========================================
- Coverage    90.63%   90.60%   -0.03%     
===========================================
  Files          132      132              
  Lines        14174    14181       +7     
===========================================
+ Hits         12846    12849       +3     
- Misses        1328     1332       +4

.github/workflows/scheduled_workflow.yml

j-c-c · 2025-05-20T19:51:37Z

@garrettwrong Oof, now I'm remembering why we didn't use scheduled jobs before. Scheduled jobs have to run off the default branch, i.e. main, so adding logic to make the job run only on develop means the workflow never runs.

I think our options are to either

1. Change the scheduled_workflow to a manual_workflow that we run by hand using the workflow_dispatch trigger. The downside is that we would need to remember to trigger the workflow periodically.
2. Scrap the scheduled_workflow and add the test to the expensive suite. The downside is that we would attempt to download our whole registry (~3 GB) on review-ready PRs. The upside is that it's automatic, so we can catch any changes to the hashes and make a patch.

Thoughts? Any other options you can think of?

garrettwrong · 2025-05-21T12:30:48Z

2 is a hard no.

If I am reading the documentation correctly... the workflow (yaml) must be committed to the default branch. However, you can use the checkout action as usual to check out any branch you want (e.g. develop). This will probably override the 'manually selected checkout branch in the web UI dropdown' widget, but I don't think we care much about that...

The most unfortunate thing about that is you will want to be sure everything is correct before it hits main and gets queued to run... I see you added the positive test case I mentioned; combined with manually testing on a fork, that is probably sufficient to avoid errors in the CI workflow syntax itself. The tests themselves you can run locally after your refactor. Maybe it makes more sense now why I suggested that.

j-c-c · 2025-05-21T12:53:15Z

Ok. Looks like I was not properly checking out develop. Fixed that and am currently waiting to see if the scheduled job runs as expected on my fork. Sorry about that.

j-c-c · 2025-05-21T17:26:09Z

This is ready for another look. Tested on a fork and everything is running as expected. I removed the run_workflow button as discussed in the dev meeting.

garrettwrong

Great thanks!

In case it is useful in future... You could have tested the error case by using a mock function to force the warning. That would exercise the capturing code and except branch (which should go on to download the correct file). If you wanted to avoid the download most of the time, that could be mocked as well, to return the file that exists in cache already. Just a different way to approach it.
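
A minimal sketch of that mock-based approach, for reference. Fetcher, its methods, and the file name are hypothetical stand-ins and not ASPIRE's downloader API; the sketch assumes only that a fetch downloads a file, compares its hash with the registry, and logs a warning on mismatch. Both the hash and the download are mocked, so nothing touches the network.

```python
import logging
from unittest import mock

logger = logging.getLogger("downloader_sketch")


class Fetcher:
    """Hypothetical stand-in for a downloader; not ASPIRE's actual API."""

    def __init__(self, registry):
        self.registry = registry  # maps file name -> expected hash

    def compute_hash(self, path):
        raise NotImplementedError("real hashing is mocked out in this sketch")

    def download(self, name):
        raise NotImplementedError("real download is mocked out in this sketch")

    def fetch(self, name):
        path = self.download(name)
        if self.compute_hash(path) != self.registry[name]:
            # Warn instead of raising so the file is still returned.
            logger.warning("Hash mismatch for %s", name)
        return path


def test_mismatch_warns_but_still_returns_file():
    fetcher = Fetcher({"cached_file.map": "expected-hash"})
    # Mock the download to return an already-cached path, and mock the
    # hash so that it never matches the registry value.
    with mock.patch.object(Fetcher, "download", return_value="/cache/cached_file.map"):
        with mock.patch.object(Fetcher, "compute_hash", return_value="different-hash"):
            with mock.patch.object(logger, "warning") as warn:
                path = fetcher.fetch("cached_file.map")
    warn.assert_called_once()
    assert path == "/cache/cached_file.map"


def test_match_does_not_warn():
    fetcher = Fetcher({"cached_file.map": "expected-hash"})
    with mock.patch.object(Fetcher, "download", return_value="/cache/cached_file.map"):
        with mock.patch.object(Fetcher, "compute_hash", return_value="expected-hash"):
            with mock.patch.object(logger, "warning") as warn:
                fetcher.fetch("cached_file.map")
    warn.assert_not_called()
```

Under pytest, the caplog fixture could replace the patched logger.warning if the test should inspect the actual log record.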

j-c-c · 2025-05-22T18:17:31Z

In case it is useful in future... You could have tested the error case by using a mock function to force the warning. That would exercise the capturing code and except branch (which should go on to download the correct file). If you wanted to avoid the download most of the time, that could be mocked as well, to return the file that exists in cache already. Just a different way to approach it.

Ok, thanks for the suggestion! I'll consider it in the future :)

j-c-c self-assigned this May 16, 2025

j-c-c added the bug, CI, cleanup, and support labels May 16, 2025

j-c-c requested a review from garrettwrong May 19, 2025 14:54

garrettwrong requested changes May 19, 2025

.github/workflows/scheduled_workflow.yml (outdated, resolved)

tests/test_downloader.py (outdated, resolved)

garrettwrong reviewed May 19, 2025

.github/workflows/scheduled_workflow.yml (resolved)

j-c-c added 11 commits May 21, 2025 11:18

proceed with download on hash mismatch, with warning.

c991a22

update hash in registry

2cc6eb8

stack_level. unused variable

b58cada

add scheduled_workflow and scheduled downloader test.

23a8b1e

same py version

cb237b7

Use logger warning. Update test to fail on hash mismatch warning.

fa1b431

Workflow updates: Run on develop, remove check, remove fail on warnings.

6cbbf33

Workflow updates: Checkout develop, set cron off-hour, remove needs field.

c48ad9f

test hash mismatch warning works

679b2bf

checkout on develop. mark test as scheduled.

77c9ad7

remove manual run-workflow button

363f02e

j-c-c force-pushed the downloader branch from e6fe9d6 to 363f02e Compare May 21, 2025 15:18

j-c-c requested a review from garrettwrong May 21, 2025 17:25

garrettwrong approved these changes May 22, 2025

j-c-c marked this pull request as ready for review May 22, 2025 18:17

j-c-c requested a review from janden as a code owner May 22, 2025 18:17

janden approved these changes Jun 3, 2025

j-c-c merged commit 88b2110 into develop Jun 3, 2025
44 checks passed

j-c-c mentioned this pull request Jun 3, 2025

Mismatched (or obsolete) SHA256 hash on aspire.downloader.emdb_14621() call #1276

Closed
