Skip to content

Airflow database migration fails in MySQL group replication mode #27082

@tirkarthi

Description

@tirkarthi

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

We are running MySQL in group replication mode and as per docs a primary key is required in a table. We noticed during our migration from Airflow 2.1.x to 2.3.4 that Airflow creates a temporary table to perform xcom related migrations by inserting the records and then renaming the table __airflow_tmp_xcom to xcom. The insert statement fails since the primary key is created after insertions. We have made a fix to move the primary key creation to be on the temporary table. Another possible solution would be to set the pragma as per docs. I have opened this issue for discussion since this might be helpful to others who might be running a similar MySQL setup and face this issue with this and possibly other migrations to 2.4.x .

op.execute(f"INSERT INTO __airflow_tmp_xcom {query.selectable.compile(op.get_bind())}")
if is_sqlite:
op.execute("PRAGMA foreign_keys=off")
op.drop_table("xcom")
if is_sqlite:
op.execute("PRAGMA foreign_keys=on")
op.rename_table("__airflow_tmp_xcom", "xcom")
with op.batch_alter_table("xcom") as batch_op:
batch_op.create_primary_key("xcom_pkey", ["dag_run_id", "task_id", "map_index", "key"])

https://dev.mysql.com/doc/refman/8.0/en/group-replication-requirements.html

Primary Keys. Every table that is to be replicated by the group must have a defined primary key, or primary key equivalent where the equivalent is a non-null unique key. Such keys are required as a unique identifier for every row within a table, enabling the system to determine which transactions conflict by identifying exactly which rows each transaction has modified. Group Replication has its own built-in set of checks for primary keys or primary key equivalents, and does not use the checks carried out by the sql_require_primary_key system variable. You may set sql_require_primary_key=ON for a server instance where Group Replication is running, and you may set the REQUIRE_TABLE_PRIMARY_KEY_CHECK option of the CHANGE REPLICATION SOURCE TO | CHANGE MASTER TO statement to ON for a Group Replication channel. However, be aware that you might find some transactions that are permitted under Group Replication's built-in checks are not permitted under the checks carried out when you set sql_require_primary_key=ON or REQUIRE_TABLE_PRIMARY_KEY_CHECK=ON.

What you think should happen instead

The migration should succeed without any issues.

How to reproduce

Perform airflow db upgrade to 2.3.0 with MySQL in group replication mode.

Operating System

Redhat

Versions of Apache Airflow Providers

No response

Deployment

Other Docker-based deployment

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions