feat(dags): Add DAP Collector dag for PPA Dev #1976
Compare: 38e85a9 to 9bcd194
    default_args=default_args,
    doc_md=DOCS,
    schedule_interval="*/15 * * * *",
    tags=tags,
You might want to explicitly set catchup=False so the DAG doesn't trigger every run since the start_date or the last run, if that isn't necessary. False might be the default value, but I can't remember.
Suggested change:
    - tags=tags,
    + tags=tags,
    + catchup=False,
Yeah, I can add that.
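To show why the catchup setting matters at a 15-minute cadence, here is a minimal sketch that counts how many schedule intervals would be waiting if catch-up were active. This is plain arithmetic, not Airflow's scheduler; the function name and both dates are hypothetical. Whether catch-up is on by default depends on the Airflow version and its catchup_by_default setting, which is one more reason to set it explicitly.

```python
from datetime import datetime, timedelta, timezone


def missed_intervals(start_date, now, interval):
    """Count the complete schedule intervals between start_date and now."""
    if now <= start_date:
        return 0
    # timedelta // timedelta yields an int: the number of whole intervals.
    return (now - start_date) // interval


# Hypothetical dates: the DAG's start_date and the moment it is first enabled.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
enabled_at = datetime(2024, 1, 2, tzinfo=timezone.utc)

backlog = missed_intervals(start, enabled_at, timedelta(minutes=15))
print(backlog)  # 96 runs for a single day of backlog
```

With catchup=False, only the most recent interval would be scheduled instead of the whole backlog.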
dags/dap_collector_ppa_dev.py
Outdated
    "retries": 1,
    "retry_delay": timedelta(minutes=15),
The retry delay is the same as the schedule interval, so you'll have two runs going concurrently if one fails. Would it make sense to wait for the next run instead of retrying? Or is the continuity important?
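The timing concern can be checked with a small sketch. It takes the two 15-minute values from the diff and asks whether a retry would start while the next scheduled run is still executing. The function name, the failure time, and the run durations are hypothetical, and the model ignores scheduler latency and any concurrency limits set on the DAG.

```python
from datetime import datetime, timedelta, timezone

SCHEDULE_INTERVAL = timedelta(minutes=15)  # from schedule_interval="*/15 * * * *"
RETRY_DELAY = timedelta(minutes=15)        # from "retry_delay" in default_args


def retry_overlaps_next_run(run_start, failed_after, next_run_duration):
    """Return True if the retry starts while the next scheduled run is executing."""
    retry_start = run_start + failed_after + RETRY_DELAY
    next_start = run_start + SCHEDULE_INTERVAL
    return next_start <= retry_start < next_start + next_run_duration


# Hypothetical run: starts at 12:00 and fails two minutes in.
run_start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
failed_after = timedelta(minutes=2)

# The retry fires at 12:17; the next scheduled run started at 12:15.
print(retry_overlaps_next_run(run_start, failed_after, timedelta(minutes=5)))  # True
print(retry_overlaps_next_run(run_start, failed_after, timedelta(minutes=1)))  # False
```

Because the retry delay equals the schedule interval, the retry always starts at or after the next scheduled start; whether the two actually run side by side depends on how long each run takes.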
That makes sense. I copied this from another job and did not evaluate the retry logic.
If a job fails, I don't think a retry or a subsequent run would work, because the job depends on the {{ ts }} variable of the DAG run and only looks back 15 minutes.
I think what I can do is remove the retry and run the script manually if a job fails.
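To make the dependence on {{ ts }} concrete, here is a minimal sketch of how a job might turn the rendered timestamp into a 15-minute collection window. The function name, the direction of the window (the 15 minutes ending at ts), and the sample timestamp are all assumptions; the actual job in mozilla/docker-etl#189 may compute this differently.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # assumed to match the 15-minute schedule


def collection_window(ts):
    """Return (start, end) of the assumed 15-minute window ending at ts.

    ts is an ISO 8601 string with a UTC offset, the shape Airflow
    renders for {{ ts }}.
    """
    end = datetime.fromisoformat(ts)
    return end - WINDOW, end


# Hypothetical rendered value of {{ ts }} for one DAG run.
start, end = collection_window("2024-01-01T00:15:00+00:00")
print(start.isoformat(), end.isoformat())
```

Under these assumptions, a manual re-run would need the failed run's timestamp passed in by hand so that the same window is collected.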
Description
This PR adds a new PPA Dev DAP Collector DAG to run the new docker-etl job here: mozilla/docker-etl#189. This DAG will be used to test out the ads use case in the DAP infrastructure.
Related Tickets & Documents