Send Slack alerts for DAG failures #249

Merged · 1 commit · Jan 5, 2024

Conversation

@pankajkoti (Contributor) commented Jan 4, 2024

This PR adds an on_failure_callback for all the Ask Astro DAGs
to notify the desired alerts Slack channel of DAG failures, leveraging
the Slack Notifier from the Apache Airflow Slack provider.
For the alerts to work, a connection of type slack needs to be
created in the deployment. If the connection ID for this connection
is different from the default slack_api_default, it needs to be set
in the deployment as an environment variable named
ASK_ASTRO_ALERT_SLACK_CONN_ID.

closes: #231
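A minimal sketch of the connection-ID resolution described above (the helper name is hypothetical; the actual PR wires Airflow's Slack Notifier into the DAG-level on_failure_callback):

```python
import os


def get_slack_conn_id() -> str:
    """Hypothetical helper mirroring the PR description: the Slack
    connection ID defaults to the provider's slack_api_default, and a
    deployment can override it via the ASK_ASTRO_ALERT_SLACK_CONN_ID
    environment variable."""
    return os.environ.get("ASK_ASTRO_ALERT_SLACK_CONN_ID", "slack_api_default")
```

In a DAG file, this ID would then be passed to the Slack provider's notifier, roughly as `SlackNotifier(slack_conn_id=get_slack_conn_id(), ...)`, set as the DAG's on_failure_callback.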

@josh-fell (Contributor) left a comment

Just curious, to cut down the code duplication throughout, would it make sense to have a utility function/class and import it into these DAGs instead?

@pankajkoti (Contributor, Author) commented Jan 5, 2024

@josh-fell I thought about it, but wondered what would be a good place in the repo to keep such a util. I then decided to keep it in the DAG to make the DAG self-sufficient 🙂

Would it make sense to create a new utils dir under airflow/include and keep it there?

cc: @sunank200 @Lee-W @pankajastro

@sunank200 (Collaborator) left a comment

You might want to rebase this PR.

> @josh-fell I thought about it, but wondered what would be a good place to keep such a util in the repo. Then thought to keep it in the DAG to make the DAG self-sufficient 🙂
>
> Would creating a new dir utils under airflow/include make sense and keeping it in there?
>
> cc: @sunank200 @Lee-W @pankajastro

@pankajkoti @josh-fell We could have a monitoring folder inside airflow/include. We should use the TaskFlow API, which calls this common method.

@pankajkoti (Contributor, Author) commented Jan 5, 2024

> We should use the TaskFlow API, which calls this common method.

@sunank200 We're using on_failure_callback at the DAG level to leverage the Slack Notifier, which is already a parameter of the dag decorator. I don't think I understand what you mean by the TaskFlow API here. Could you please explain?


cloudflare-pages bot commented Jan 5, 2024

Deploying with Cloudflare Pages
Latest commit: 0ce3d12
Status: ✅ Deploy successful!
Preview URL: https://4c7f2537.ask-astro.pages.dev
Branch Preview URL: https://231-send-dag-failure-alerts.ask-astro.pages.dev

@pankajkoti (Contributor, Author) commented Jan 5, 2024

@josh-fell Thanks for the suggestion. I created the required utility in the include/utils/slack module.

@sunank200 I have resolved the conflicts but I did not understand the second point in your previous comment.

Requesting re-reviews, please.

Sample alert: [screenshot attached: Screenshot 2024-01-05 at 5 11 34 PM]
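The shared utility in include/utils/slack might contain something like the following sketch (the function name and message format here are illustrative assumptions, not the PR's actual code):

```python
def build_failure_message(dag_id: str, run_id: str) -> str:
    """Hypothetical message builder for a shared Slack-alert utility:
    compose the alert text for a failed DAG run. The emoji and wording
    are assumptions about what such an alert might look like."""
    return (
        f":red_circle: DAG {dag_id} failed.\n"
        f"Run ID: {run_id}"
    )
```

In an on_failure_callback, the text could be built from the callback context, e.g. `build_failure_message(context["dag"].dag_id, context["run_id"])`, and passed to the Slack Notifier.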

@pankajastro (Collaborator) left a comment

LGTM 👍

@pankajkoti requested review from utkarsharma2 and removed the request for josh-fell · January 5, 2024 11:59
@sunank200 (Collaborator) left a comment

This approach looks good to me. I thought you would be using some other custom method for this implementation that would look for any error in the DAG run.

Have you tested this implementation? If yes, I think it's good to be merged.

@utkarsharma2 (Collaborator) left a comment

LGTM

@pankajkoti (Contributor, Author)

> I thought you would be using some other custom method for this implementation that would look for any error in the DAG run.

The on_failure_callback at the DAG level sends an alert when any of the tasks in the DAG fails. This approach seemed easier and cleaner than adding a custom method.
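The DAG-level semantics described above can be illustrated with a minimal plain-Python stand-in (this is not Airflow code; it only sketches how a single callback covers every task in the run):

```python
class MiniDag:
    """Toy stand-in for a DAG with a DAG-level on_failure_callback:
    one callback handles failures for every task, so no per-task
    wiring is needed."""

    def __init__(self, dag_id, on_failure_callback):
        self.dag_id = dag_id
        self.on_failure_callback = on_failure_callback

    def run(self, tasks):
        """tasks: list of (task_id, callable). Returns True if all
        tasks succeed; on the first failure, fires the DAG-level
        callback with a context dict and returns False."""
        for task_id, fn in tasks:
            try:
                fn()
            except Exception as exc:
                self.on_failure_callback(
                    {"dag_id": self.dag_id, "task_id": task_id, "exception": exc}
                )
                return False
        return True
```

Running a MiniDag where the second task raises would invoke the callback exactly once, with the failing task's ID in the context, mirroring how one DAG-level callback catches a failure in any task.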

> Have you tested this implementation? If yes, I think it's good to be merged.

Yes, and I pasted a sample alert here: #249 (comment).
I follow the convention of testing changes before marking a PR ready for review 🙂. However, as mentioned in the PR description, for the alerts to work we need to create the Slack API connection in each deployment from which we would like to send these alerts.

@josh-fell I am merging the PR. I appreciate your review; please add any further comments and I will address them in a subsequent PR.

@pankajkoti merged commit 65eb4fe into main on Jan 5, 2024
7 checks passed
@pankajkoti deleted the 231-send-dag-failure-alerts branch on January 5, 2024 12:19
@josh-fell (Contributor)

@pankajkoti This is exactly what I was envisioning. Looks great!

@pankajkoti (Contributor, Author)

Thanks @josh-fell, appreciate your review 🙏🏽

@pankajkoti mentioned this pull request on Jan 11, 2024
Linked issue: Send alerts for Ask Astro DAG failures to Slack channel