Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 123: Repeated reserialization of DAGs #125

Merged
merged 1 commit into from
Aug 2, 2024

Conversation

rafidka
Copy link
Contributor

@rafidka rafidka commented Jul 29, 2024

Issue #, if available: #124

Description of changes

Issue #124 happens because we are always calling airflow db migrate to make sure the database is migrated. The assumption was that the operation is a no-op if the DB is already initialized. However, this is not the case, and a reserialization of DAGs is happening every time this command is called. To avoid this, I am changing the code for executing migration to check if migration is needed.

Testing

I did multiple tests using the Docker Compose setup. More testing by Amazon MWAA developers before this PR is merged. Suggested testing:

  1. Create an internal 2.9.2 environment and make sure it is functioning normally. Also check ECS logs to ensure there are no issues.
  2. Create a pre-2.9.2 environment and do an in-place upgrade to 2.9.2 and make sure the process is flawless.
  3. Run a 2.9.2 environment with auto scaling enabled (ensuring that workers are being created and terminated continuously) and make sure migrations are not causing any issues.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Issue aws#124 happens because we are always calling `airflow db migrate` to
make sure the database is migrated. The assumption was that the
operation is a no-op if the DB is already initialized. However, this is
not the case, and a reserialization of DAGs is happening every time this
command is called. To avoid this, I am changing the code for executing
migration to check if migration is needed.
@jaklan
Copy link

jaklan commented Aug 2, 2024

When it's merged, will MWAA 2.9.2-based instances be automatically updated? We are also affected on DEV & UAT environments by apache/airflow#40082 and waiting for the fix before we propagate upgrade from 2.8 to 2.9 on PRD

@rafidka
Copy link
Contributor Author

rafidka commented Aug 2, 2024

When it's merged, will MWAA 2.9.2-based instances be automatically updated? We are also affected on DEV & UAT environments by apache/airflow#40082 and waiting for the fix before we propagate upgrade from 2.8 to 2.9 on PRD

Unfortunately no, you would need to trigger an environment update to have the Docker containers updated. Even a no-op update would do, though, so it should be pretty easy to do.

@jaklan
Copy link

jaklan commented Aug 2, 2024

@rafidka sure, that's not an issue. Are changes immediately propagated to MWAA after merging PR or is there any build- / deployment-related delay we have to take into consideration before we can trigger an update and test new containers?

@rafidka
Copy link
Contributor Author

rafidka commented Aug 2, 2024

Are changes immediately propagated to MWAA after merging PR or is there any build- / deployment-related delay we have to take into consideration before we can trigger an update and test new containers?

After the PR is merged, we sync it internally and deploy it via internal pipelines where it goes through testing, and then becomes available to different regions around the world gradually. Usually, assuming no test failures or pipeline blockages, this takes a couple of days. I am currently out of office, but @dhegberg should be able to help you with status updates. Daniel, we need to have this PR merged asap, as it is a production issue for our customers.

@Mercury2699 Mercury2699 merged commit e0412e4 into aws:main Aug 2, 2024
1 check passed
@rafidka rafidka deleted the fix-124 branch August 2, 2024 22:00
@jaklan
Copy link

jaklan commented Aug 6, 2024

PR is merged, so asking for the status update then 😄 Can we proceed with the re-deployment or not yet?

@shubham22
Copy link

@Mercury2699 - can you confirm if this has been deployed to all the waves/regions?

@Mercury2699
Copy link

@jaklan Yes, this fix has been deployed to all regions. Customers can trigger environment update to receive the latest image.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants