Prevalence runs failing #1654

Closed
apiology opened this issue Nov 26, 2022 · 7 comments

Comments

@apiology
Member

Our daily prevalence updates have failed for the last four days (11/23-11/26). This is an example log, which is unfortunately hidden from folks who aren't part of the org:

Run yarn prevalence -b -c -k *** -v /home/runner/.virtualenvs/.venv
yarn run v1.22.19
$ ./scripts/prevalence_helper.sh -b -c -k *** -v /home/runner/.virtualenvs/.venv
Using a manual virtualenv directory: /home/runner/.virtualenvs/.venv
Branch will be based on the currently checked out branch
Switched to a new branch 'auto-update-prevalence-2022-11-23--14-07-30'
remote: 
remote: Create a pull request for 'auto-update-prevalence-2022-11-23--14-07-30' on GitHub by visiting:        
remote:      https://github.com/microCOVID/microCOVID/pull/new/auto-update-prevalence-2022-11-23--14-07-30        
remote: 
To https://github.com/microCOVID/microCOVID
 * [new branch]          auto-update-prevalence-2022-11-23--14-07-30 -> auto-update-prevalence-2022-11-23--14-07-30
branch 'auto-update-prevalence-2022-11-23--14-07-30' set up to track 'origin/auto-update-prevalence-2022-11-23--14-07-30'.
Created branch auto-update-prevalence-2022-11-23--14-07-30
Activating the virtualenv
Activating virtulenv: /home/runner/.virtualenvs/.venv/bin/activate
Running prevalence script
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/UID_ISO_FIPS_LookUp_Table.csv...
read 4322 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-22-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-21-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-20-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-19-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-18-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-17-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-16-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-15-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-14-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-13-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-12-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-11-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-10-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-09-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-08-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-07-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/11-06-2022.csv...
read 4017 objects
Fetching https://raw.githubusercontent.com/govex/COVID-19/master/data_tables/vaccine_data/global_data/time_series_covid19_vaccine_global.csv...
read 123105 objects
Traceback (most recent call last):
  File "update_prevalence.py", line 1863, in <module>
    main()
  File "update_prevalence.py", line 1746, in main
    parse_jhu_vaccines_global(cache, data)
  File "update_prevalence.py", line 1457, in parse_jhu_vaccines_global
    raise ValueError(f"Not able to gain data from {JHUVaccinesTimeseriesGlobal.SOURCE}")
ValueError: Not able to gain data from https://raw.githubusercontent.com/govex/COVID-19/master/data_tables/vaccine_data/global_data/time_series_covid19_vaccine_global.csv
Sentry is attempting to send 2 pending error messages
Waiting up to 2 seconds
Press Ctrl-C to quit
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
Error: Process completed with exit code 1.
@apiology
Member Author

If anyone thinks they could take a crack at figuring out some Python code and would like to pair on fixing this, please reach out, or just book some time with me.

There's some information on setting up local development in our README to get started.

@shawnbiesan2
Collaborator

shawnbiesan2 commented Nov 27, 2022

Eyeballing it, it looks like the world vaccine data (https://raw.githubusercontent.com/govex/COVID-19/master/data_tables/vaccine_data/global_data/time_series_covid19_vaccine_global.csv) has not been updated since the 21st.

For issues like this, is it typical to follow up with the source to figure out why it hasn't been updated (changes in release cadence, no longer maintained, etc.)? Or is it more usual to just make the script handle the lack of data and move on?
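A quick way to confirm this kind of staleness without eyeballing the file is to pull the newest date out of the CSV and compare it with today. This is only a sketch: the `Date` column name, the ISO `YYYY-MM-DD` format, the sample rows, and both helper names are assumptions for illustration, not taken from `update_prevalence.py` or verified against the upstream file.

```python
import csv
import io
from datetime import date, datetime
from typing import Optional


def latest_date_in_csv(csv_text: str, date_column: str = "Date") -> Optional[date]:
    """Return the most recent date in date_column, or None if there are no rows.

    Assumes ISO-formatted dates (YYYY-MM-DD) in a column named "Date".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    dates = [
        datetime.strptime(row[date_column], "%Y-%m-%d").date()
        for row in reader
        if row.get(date_column)
    ]
    return max(dates) if dates else None


def days_stale(csv_text: str, today: date) -> Optional[int]:
    """Number of days between the newest row and today, or None if no rows."""
    latest = latest_date_in_csv(csv_text)
    return None if latest is None else (today - latest).days


# Made-up sample rows; the real file has more columns and far more rows.
sample = "Date,Country_Region,Doses_admin\n2022-11-20,US,100\n2022-11-21,US,110\n"
print(days_stale(sample, today=date(2022, 11, 26)))  # 5
```

Passing `today` in explicitly keeps the check easy to test; a real run would pass `date.today()`.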

@apiology
Member Author

Yeah, I've made a practice of following up with the upstream source when things like this happen, which has been pretty effective in general.

Note that population vaccination numbers currently don't affect the risk values in the model much, because it's easy to catch and spread Omicron even when you're vaccinated.

Given that, I wouldn't have a problem making changes to the safety check, especially if it's done in a way that balances the risk of things failing silently as a result. We've encountered data feeds being retired, data formats radically changing, upstream providers having issues they don't fix until we talk to them, etc...having something to tell us about those is useful.
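One possible shape for a softer safety check is a two-threshold rule: warn and carry on when the data is a little old, and still fail loudly when it is very old or missing. This is a sketch under assumptions, not the existing check in `update_prevalence.py`; `check_freshness`, `StaleDataError`, and the 2-day and 14-day thresholds are all hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("prevalence")


class StaleDataError(ValueError):
    """Raised when a data source is too old (or too empty) to trust."""


def check_freshness(source: str, days_old: Optional[int],
                    warn_after: int = 2, fail_after: int = 14) -> bool:
    """Return True if the data is fresh, False if it is stale but tolerable.

    Raises StaleDataError when there is no data at all or when it is older
    than fail_after days, so a retired or broken feed cannot fail silently.
    """
    if days_old is None or days_old > fail_after:
        raise StaleDataError(
            f"No usable recent data from {source} (days old: {days_old})"
        )
    if days_old > warn_after:
        logger.warning(
            "Data from %s is %d days old; continuing with stale values",
            source, days_old,
        )
        return False
    return True
```

The hard limit is what keeps this from hiding a feed that has been retired: a few days of lag only produces a warning, but a long gap still stops the run.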

Ideally we'd have a low-noise way to publish warnings about things like this without failing prevalence entirely. We don't today - the Sentry references in the code aren't configured to go to an account I have access to. I've thought about adding a Sentry Slack integration or even just a direct Slack integration from the Python script, so we can at least get those piped into places that active contributors can see.
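A direct Slack integration from the Python script could be as small as a POST to a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below uses only the standard library. The `SLACK_WEBHOOK_URL` environment variable and both function names are assumptions for illustration; nothing like this exists in the repo today.

```python
import json
import os
import urllib.request


def build_slack_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_warning(message: str) -> bool:
    """Send a warning to Slack; return False instead of raising on any failure.

    Reads the webhook URL from SLACK_WEBHOOK_URL (a hypothetical variable name).
    A notification problem should never be what breaks the prevalence run.
    """
    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
    if not webhook_url:
        return False
    try:
        request = build_slack_request(webhook_url, message)
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except OSError:  # urllib.error.URLError is a subclass of OSError
        return False
```

In a GitHub Actions run the webhook URL would need to be supplied as a secret, in the same way the existing `-k ***` argument is.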

@shawnbiesan2
Collaborator

Left a GitHub issue comment for the upstream source, but given the state of the other issues I'm not expecting a near-term response 🤞

Gotcha, makes sense. Sentry does allow open source projects to apply for a free account via https://sentry.io/for/open-source if a new account for current contributors is needed. I'm assuming the Slack you refer to is an instance used for contributors?

@apiology
Member Author

> Left a GitHub issue comment for the upstream source, but given the state of the other issues I'm not expecting a near-term response 🤞

Thanks for filing that!

They don't provide source code or logs for their data ingestion pipeline. That said, I notice in their README they list three sources for the upstream data:

Wonder if there's an obvious point where things are stuck upstream.

> Sentry does allow open source projects to apply for a free account via https://sentry.io/for/open-source if a new account for current contributors is needed. I'm assuming the Slack you refer to is an instance used for contributors?

Right on. Be aware that their free plan does have a 50k monthly error limit, which I suspect we'd blow through with the current configuration on what gets logged. Maybe the open source plan has a higher limit...

Yeah, we have a Slack instance we can use - I can get you access if you like. It's a ghost town in terms of actual discussion, but may be useful for integrations like this to post into.

@coachnate
Collaborator

I'm happy to take a look. I'm not a Python expert, but I know my way around well enough. I only took a cursory look, but I didn't see any try/except handling going on. @apiology I'm going to grab some time on Calendly with you for later this week to get better acquainted with the code. I also know GitHub Actions, FWIW.

@apiology
Member Author

Fixed upstream--thanks to @shawnbiesan2 for alerting folks!
