-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1401655 - Run clients_daily and experiments_daily #176
Conversation
Per issue mozilla#155
1757336
to
f59fd16
Compare
Change the Makefile to pass through environment variables for AWS credentials, and update the docs to mention running the db migration target.
f59fd16
to
ae6ef5a
Compare
I was able to run both tasks locally using the updated instructions / Makefile, however both tasks had the same strange behaviour:
After a while, both encountered "Resetting dropped connection", then didn't update any further. Meanwhile, the task itself appeared to run to successful completion on EMR (with the Also note that I gave all the existing tasks meaningful names in a separate commit before adding the two new tasks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, just the one comment from me.
This isn't really about airflow, but it seems odd to me that experiments_daily takes longer to run than clients_daily, provided the logic is similar since the volume should be way higher on clients_daily
@@ -30,7 +30,7 @@ redis-cli: | |||
docker-compose run redis redis-cli -h redis | |||
|
|||
run: | |||
docker-compose run web airflow $(COMMAND) | |||
docker-compose run -e AWS_SECRET_ACCESS_KEY -e AWS_ACCESS_KEY_ID web airflow $(COMMAND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more generally applicable to add the env variables in https://github.com/mozilla/telemetry-airflow/blob/master/docker-compose.yml -- that way the keys will be passed in to all the compose-based commands
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm going to punt on this for now, since I'm not sure if we want to automatically pass the through more generally when spinning up the stack. I'll file an issue to follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #178
Experiments daily currently processes all the experiments data each time (not incrementally by day). Changing it to process incrementally is on my todo list. |
Adding a comment for posterity -- the errors are probably fine: we see these a lot in the normal course of a job, and I think because of the way jobs are normally run (via celery workers) they're resilient to these dropped connections while the test run is not so much. |
This also addresses #155, at least for the
main_summary
DAG.This needs to land after the associated PR over in python_mozetl, and I need to test this first, so please hold off on reviewing.