Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add indexes for CASCADE deletes for task_instance #24488

Merged
merged 5 commits into from
Jun 16, 2022

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented Jun 16, 2022

When we add foreign keys with ON DELETE CASCADE, and we delete rows in the foreign table, the database needs to join back to the referencing table. If there's no suitable index, then it can be slow to perform the deletes.

cc @wolfier

closes #24484

@dstandish dstandish changed the title Add indexes for CASCADE deletes Add indexes for CASCADE deletes for task_instance Jun 16, 2022
@dstandish dstandish force-pushed the add-indexes-for-cascade-deletes branch from 0701258 to 55a576d Compare June 16, 2022 14:07
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Jun 16, 2022
When we add foreign keys with ON DELETE CASCADE, and we delete rows in the foreign table, the database needs to join back to the referencing table.  If there's no suitable index, then it can be slow to perform the deletes.
@dstandish dstandish force-pushed the add-indexes-for-cascade-deletes branch from a464bb5 to a350845 Compare June 16, 2022 17:39
@dstandish dstandish merged commit 677c422 into apache:main Jun 16, 2022
@dstandish dstandish deleted the add-indexes-for-cascade-deletes branch June 16, 2022 18:41
@jedcunningham jedcunningham added this to the Airflow 2.3.3 milestone Jun 16, 2022
@jedcunningham jedcunningham added the type:bug-fix Changelog: Bug Fixes label Jun 16, 2022
ephraimbuddy pushed a commit that referenced this pull request Jun 29, 2022
When we add foreign keys with ON DELETE CASCADE, and we delete rows in the foreign table, the database needs to join back to the referencing table.  If there's no suitable index, then it can be slow to perform the deletes.

(cherry picked from commit 677c422)
@calfzhou
Copy link
Contributor

calfzhou commented Jul 11, 2022

Upgrading airflow 2.3.2 to 2.3.3, db upgrade failed:

Running upgrade 3c94c427fdf6 -> f5fcbda3e651, Add indexes for CASCADE deletes on task_instance
...
MySQLdb.OperationalError: (1067, "Invalid default value for 'end_date'")

(MySQL 8.0)

I checked that my task_reschedule table is empty.

@potiuk
Copy link
Member

potiuk commented Jul 11, 2022

@calfzhou - you might have your mysql configured with "NO ZERO DATE" and "STRICT MODE".
https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_no_zero_date.

Can you please confirm that?

If so, you might want to disable it - https://stackoverflow.com/questions/9192027/invalid-default-value-for-create-date-timestamp-field.

@calfzhou
Copy link
Contributor

@calfzhou - you might have your mysql configured with "NO ZERO DATE" and "STRICT MODE". https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_no_zero_date.

Can you please confirm that?

If so, you might want to disable it - https://stackoverflow.com/questions/9192027/invalid-default-value-for-create-date-timestamp-field.

Thanks @potiuk , that works! I removed NO_ZERO_DATE from sql mode.

@potiuk
Copy link
Member

potiuk commented Jul 11, 2022

Cool!

One question @calfzhou - maybe you would like to add a note about it as PR to https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html?highlight=database+setup#setting-up-a-mysql-database ? We already have a few watch-outs there and adding this one might be useful for others.

It's super easy - just click "Suggest a change on this page" and you will get a PR where you will be able to directly edit the page sources and you could just use similar approach as other warnings there ?

@calfzhou
Copy link
Contributor

PR posted #24983

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge kind:documentation type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

airflow db clean task_instance takes a long time
5 participants