Added 'extend_existing' to Sqla Table object #626
Conversation
Codecov Report

```
@@           Coverage Diff           @@
##             main     #626   +/-   ##
=======================================
  Coverage   93.30%   93.30%
=======================================
  Files          41       41
  Lines        1672     1672
  Branches      211      211
=======================================
  Hits         1560     1560
  Misses         89       89
  Partials       23       23
```
SGTM 👍
@utkarsharma2 isn't the root cause of the issue that concurrent tests are trying to create a table with the same name?
Wouldn't it be better to change the example DAG to create tables with unique names? Would that be enough to fix the problem? If we didn't set the names of these tables explicitly, it would also have the advantage that the cleanup would delete the temporary tables with unique names by the end of the execution.
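For context, a minimal sketch of the unique-name approach suggested here (the helper and prefix are hypothetical, not part of the astro-sdk API):

```python
import uuid

# Hypothetical helper: give each test run its own table name so that
# concurrent CI jobs never collide on the same table.
def unique_table_name(prefix: str = "tmp_astro") -> str:
    return f"{prefix}_{uuid.uuid4().hex[:8]}"

print(unique_table_name())  # e.g. tmp_astro_1a2b3c4d
```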
```python
autoload_with=self.sqlalchemy_engine,
extend_existing=True,
```
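The diff only shows the two new keyword arguments; here is a sketch of what the surrounding SQLAlchemy call could look like (the function and parameter names are assumed for context, not taken from the PR):

```python
from sqlalchemy import MetaData, Table
from sqlalchemy.engine import Engine

# Assumed call-site shape; only autoload_with and extend_existing
# come from the PR diff itself.
def reflect_table(name: str, metadata: MetaData, engine: Engine) -> Table:
    return Table(
        name,
        metadata,
        autoload_with=engine,  # reflect the columns from the live database
        extend_existing=True,  # allow redefining a table already registered in MetaData
    )
```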
Without these arguments, the Python SDK would raise an exception if concurrent processes tried to create the same table (which is what was happening in the CI).
By introducing this change, we may be hiding this type of issue, making it harder for users to troubleshoot the anti-pattern of having two processes trying to create a table with the same name.
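To make the failure mode concrete, here is a minimal, self-contained demonstration of the exception SQLAlchemy raises when the same table name is defined twice on one MetaData instance, and how `extend_existing` changes that:

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.exc import InvalidRequestError

metadata = MetaData()
Table("users", metadata, Column("id", Integer))

try:
    # Second definition of the same name on the same MetaData instance.
    Table("users", metadata, Column("id", Integer))
except InvalidRequestError as exc:
    print(exc)  # Table 'users' is already defined for this MetaData instance...

# With extend_existing=True the second call updates the existing Table
# object instead of raising.
Table("users", metadata, Column("id", Integer), extend_existing=True)
```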
@tatiana This was my first guess as well. I converted the persistent tables to temp tables, but the issue still persisted - https://github.com/astronomer/astro-sdk/runs/7765749178?check_suite_focus=true
And I don't think bypassing the cache will hide the issue; if anything, it will surface the issue sooner, since we will now be referring to the database directly and not the cache. I don't understand how this is an anti-pattern.
However, this comes at the cost of running a query to fetch the table columns every time we create the table object. Since we are not handling all table operations via SQLA, SQLA has no knowledge of updates made to a table by some operator. So, in my opinion, this should be added to keep such situations in check.
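A small SQLite-based sketch of the trade-off being described: re-reflecting with `extend_existing` costs an extra query per `Table` construction, but it picks up out-of-band schema changes that the in-memory definition knows nothing about (SQLite stands in for the warehouse here; the CI issue itself involved Snowflake):

```python
from sqlalchemy import MetaData, Table, create_engine, text

engine = create_engine("sqlite://")
metadata = MetaData()

with engine.begin() as conn:
    conn.execute(text("CREATE TABLE t (id INTEGER)"))

# First reflection: one query, and the Table object lives on in `metadata`.
t = Table("t", metadata, autoload_with=engine)
print(t.columns.keys())  # ['id']

# A change made outside SQLAlchemy, e.g. by some operator.
with engine.begin() as conn:
    conn.execute(text("ALTER TABLE t ADD COLUMN extra TEXT"))

# Re-reflecting with extend_existing=True costs another query but picks
# up the new column; without the flag, this call would raise instead.
t = Table("t", metadata, autoload_with=engine, extend_existing=True)
print(t.columns.keys())  # ['id', 'extra']
```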
@tatiana But I do understand your point: having temp tables would be better, and the chances of collision would be reduced. I can make this change on top of the existing changes. WDYT?
As per the discussion on a call with @tatiana, we concluded the following:
- The `extend_existing` flag is required; its absence was the root cause of the issue in CI.
- Persistent tables in the example DAG can also lead to conflicts when parallel test suites are running.

Based on the above points, I have updated the PR to reflect those changes.
Amazing, thank you very much, @utkarsharma2 !
Description
What is the current behavior?
Quite often this test group fails to run in the CI:
```
nox -s "test-3.8(airflow='2.3')" -- --splits 12 --group 1 --cov=src --cov-report=xml --cov-branch
```
With:
```
=========================== short test summary info ============================
FAILED tests/test_example_dags.py::test_example_dag[example_snowflake_partial_table_with_append]
==== 1 failed, 29 passed, 329 deselected, 22 warnings in 294.40s (0:04:54) =====
```
Even if there were no changes that affected this test/our code-base, for instance:
#480
https://github.com/astronomer/astro-sdk/runs/7231300739
closes: #516
What is the new behavior?
The issue was with SQLA's reflection cache, which keeps track of the tables created and maintains all of them in a map keyed as `schema_columns[self.normalize_name(table_name)]`. When this test is run individually on a local machine there are no issues, since the SQLA cache is initialized just for that test; that is why this was an intermittent issue in CI. The fix is to introduce the parameter `extend_existing`, which ensures the cache is not used.

I ran the test multiple times to ensure this is working:
https://github.com/astronomer/astro-sdk/actions/runs/2832783413
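For readers unfamiliar with the cache being described, here is a deliberately simplified, hypothetical model of a reflection cache keyed by normalized table name (illustrative only; not the real dialect internals):

```python
# Hypothetical, simplified model of the reflection cache described above.
# The attribute names mirror the map mentioned in the description, but
# this is NOT the actual dialect implementation.
class DialectReflectionCache:
    def __init__(self):
        self.schema_columns = {}

    def normalize_name(self, name):
        return name.lower()

    def get_columns(self, table_name, fetch_from_db):
        key = self.normalize_name(table_name)
        if key not in self.schema_columns:
            # Only the first lookup hits the database; subsequent lookups
            # return the cached columns even if the table changed since.
            self.schema_columns[key] = fetch_from_db(table_name)
        return self.schema_columns[key]
```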
Does this introduce a breaking change?
Nope