Skip to content

Conversation

@uranusjr
Copy link
Member

As discussed previously.

Also improved the dag_run relationship to use the dag_run_id field directly because why not?

@uranusjr uranusjr requested review from XD-DENG, ashb and kaxil as code owners March 17, 2022 06:35
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Mar 17, 2022
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@uranusjr
Copy link
Member Author

Forgot to mention, we can do this directly because the prior select in the migration already filtered out all xcom rows without a matching dag_run row, and since task_instance already has a foreign key to dag_run, it is not possible to have dangling xcom rows when we add the fk constraint in the migration.

@uranusjr
Copy link
Member Author

Eh, need to fix tests now though, we are currently doing a lot of dangling inserts.

@uranusjr uranusjr marked this pull request as draft March 17, 2022 08:19
@uranusjr
Copy link
Member Author

Alright hopefully this fixes everything. Mostly changing the test code, but I also needed to change how airflow task test works conceptually to make this foreign key work (see the commit referencing this command). Currently airflow task test would create a “fake” DAG run in-memory to run just that task, but this does not work well with database-level foreign key. So instead I make this command always create a “real” DAG run in the database, and delete it (with everything the run generates because ON DELETE CASCADE) after the task is run. Personally I think this is better because now this command is no longer a weird outlier on database integrity, but not sure if this is an acceptable semantic change.

Also improved the dag_run relationship to use the dag_run_id field
directly because why not?
Previously, 'airflow task test' creates a in-memory DAG run and task
instance to run the task. However, with XCom establishing a database
level foreign key, this violates integrity and causes issues when the
task pushes to XCom. Therefore, we rewrite the command to create a
"real" DAG run (and task instance) instead and delete it afterwards.
Since both task instance and XCom tables declare the foreign key with
ON DELETE CASCADE, this also cleans them up properly and leaves no trace
in the database after the command is complete.
@uranusjr uranusjr marked this pull request as ready for review March 18, 2022 05:19
@ashb
Copy link
Member

ashb commented Mar 18, 2022

Some good to me

@uranusjr uranusjr merged commit 71c980a into apache:main Mar 21, 2022
@uranusjr uranusjr deleted the xcom-ti-fk branch March 21, 2022 03:50
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 8, 2022
@jedcunningham jedcunningham added this to the Airflow 2.3.0 milestone Apr 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dynamic-task-mapping AIP-42 changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants