Add dag run_id to Audit Log #37597

Closed
1 of 2 tasks
bbovenzi opened this issue Feb 21, 2024 · 4 comments · Fixed by #37731

Comments

@bbovenzi
Contributor

Description

In event logs, we keep track of dag_id and task_id when relevant. However, any task action happens to a specific task instance, and it is hard to tell which task instance it was. Sometimes we use execution_date, but it would be better to switch to run_id.

Use case/motivation

With a run_id field:

  • We could filter the audit log and see only events that happened in a single dag run.
  • We could link directly to the task instance in question. Right now we can only link to the general dag.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!


@SamWheating
Contributor

👋 Hey Brent, this is a good idea.

Feel free to assign to me if you want a hand, I've got some capacity in the next week or two.

@bbovenzi
Contributor Author

Thanks @SamWheating. We migrated most things from execution_date to run_id but we seemingly forgot this one. So hopefully we can look up those PRs too for help. But doing this fully could require a big migration of the audit logs table.

@SamWheating
Contributor

> But doing this fully could require a big migration of the audit logs table.

I guess that depends: should we add the empty column and start populating it going forward, or should we aim to backfill the run_id for previous Log rows based on a join with the DagRun table (similar to this)?

Not really sure what the standard is here, and this migration might be pretty hefty due to the potentially high volume of the Log table.

Thoughts, @bbovenzi ?

@bbovenzi
Contributor Author

bbovenzi commented Feb 26, 2024

> > But doing this fully could require a big migration of the audit logs table.
>
> I guess that depends, should we add the empty column and then start populating it moving forwards, or should we aim to repopulate the run_id for previous Log rows based on a join with the DagRun table (similar to this)
>
> Not really sure what the standard is here, and this migration might be pretty hefty due to the potentially high volume of the Log table.
>
> Thoughts, @bbovenzi ?

I'm not opposed to only populating it moving forward. Going through old logs and translating execution_date to run_id will be a heavy lift. Currently, we do not record execution_date often enough, so many logs are already lacking information.
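
For reference, the two options discussed above can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database, not the actual Airflow migration: the `log` and `dag_run` tables here are simplified stand-ins with just the columns needed for the example, and the helper names (`make_db`, `add_run_id_column`, `backfill_run_id`, `events_for_run`) are hypothetical. The real change would presumably go through an Alembic migration and the ORM models.

```python
import sqlite3


def make_db():
    """Build a toy schema: simplified stand-ins for the log and dag_run tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, run_id TEXT);
        CREATE TABLE log (
            id INTEGER PRIMARY KEY,
            dag_id TEXT,
            task_id TEXT,
            event TEXT,
            execution_date TEXT
        );
        INSERT INTO dag_run VALUES
            ('example_dag', '2024-02-20T00:00:00', 'scheduled__2024-02-20T00:00:00');
        INSERT INTO log (dag_id, task_id, event, execution_date) VALUES
            ('example_dag', 'extract', 'clear', '2024-02-20T00:00:00'),
            ('example_dag', NULL, 'trigger', NULL);
        """
    )
    return conn


def add_run_id_column(conn):
    """Option 1: add a nullable column; old rows simply keep run_id = NULL."""
    conn.execute("ALTER TABLE log ADD COLUMN run_id TEXT")


def backfill_run_id(conn):
    """Option 2 (optional): fill run_id for old rows via a join on dag_run.

    Only rows that recorded an execution_date can be matched, which is why
    many existing rows would stay NULL either way.
    """
    cur = conn.execute(
        """
        UPDATE log
        SET run_id = (
            SELECT dr.run_id
            FROM dag_run AS dr
            WHERE dr.dag_id = log.dag_id
              AND dr.execution_date = log.execution_date
        )
        WHERE execution_date IS NOT NULL
        """
    )
    return cur.rowcount


def events_for_run(conn, run_id):
    """Filter the audit log down to the events of a single dag run."""
    rows = conn.execute(
        "SELECT event FROM log WHERE run_id = ? ORDER BY id", (run_id,)
    )
    return [event for (event,) in rows]


conn = make_db()
add_run_id_column(conn)
updated = backfill_run_id(conn)
print(updated, events_for_run(conn, "scheduled__2024-02-20T00:00:00"))
```

Populating only going forward corresponds to running just the `add_run_id_column` step and writing run_id on new rows. The backfill is a single UPDATE here, but on a large Log table it would touch every row that has an execution_date, which is the cost being weighed in this thread.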
