Added 'extend_existing' to Sqla Table object #626

utkarsharma2 · 2022-08-10T12:56:52Z

Description

What is the current behavior?

Quite often this test group fails to run in the CI:

nox -s "test-3.8(airflow='2.3')" -- --splits 12 --group 1 --cov=src --cov-report=xml --cov-branch

With:

=========================== short test summary info ============================
FAILED tests/test_example_dags.py::test_example_dag[example_snowflake_partial_table_with_append]
==== 1 failed, 29 passed, 329 deselected, 22 warnings in 294.40s (0:04:54) =====

This happens even when no changes affected this test or our code base, for instance:
#480
https://github.com/astronomer/astro-sdk/runs/7231300739

closes: #516

What is the new behavior?

The issue was with SQLA's reflection cache, which keeps track of the tables created and maintains them in a map, schema_columns[self.normalize_name(table_name)]. When this test is run individually on a local machine there are no issues, since the SQLA cache is initialized just for this test. That is why the failure showed up only intermittently in CI, where the test runs together with others in the same group. The fix is to introduce the parameter extend_existing, which ensures the cache is not used.
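
For readers who want to see the effect in isolation, here is a minimal sketch, assuming an in-memory SQLite database in place of Snowflake and made-up column names. It is not the code from src/astro/databases/base.py. It only illustrates how a Table that is already registered on a MetaData object keeps its old columns until extend_existing=True asks SQLAlchemy to reflect again.

```python
from sqlalchemy import MetaData, Table, create_engine, text

# Assumption: in-memory SQLite stands in for the real database.
engine = create_engine("sqlite://")
metadata = MetaData()

with engine.begin() as conn:
    conn.execute(text("CREATE TABLE homes (id INTEGER, sell INTEGER)"))

# First reflection registers the Table in metadata.tables.
first = Table("homes", metadata, autoload_with=engine)
cols_first = sorted(c.name for c in first.columns)

# The table changes behind SQLAlchemy's back (raw SQL here).
with engine.begin() as conn:
    conn.execute(text("ALTER TABLE homes ADD COLUMN price INTEGER"))

# Without extend_existing the already-registered Table is returned as is.
stale = Table("homes", metadata, autoload_with=engine)
cols_stale = sorted(c.name for c in stale.columns)

# With extend_existing=True the columns are reflected from the database again.
fresh = Table("homes", metadata, autoload_with=engine, extend_existing=True)
cols_fresh = sorted(c.name for c in fresh.columns)

print(cols_first, cols_stale, cols_fresh)
```

Note that all three names refer to the same Table object; only the last call updates its columns, at the cost of an extra reflection query.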

Ran the test multiple times to ensure this is working:
https://github.com/astronomer/astro-sdk/actions/runs/2832783413

Does this introduce a breaking change?

Nope

codecov · 2022-08-10T13:17:18Z

Codecov Report

Merging #626 (580dea0) into main (1288291) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #626   +/-   ##
=======================================
  Coverage   93.30%   93.30%           
=======================================
  Files          41       41           
  Lines        1672     1672           
  Branches      211      211           
=======================================
  Hits         1560     1560           
  Misses         89       89           
  Partials       23       23

Impacted Files	Coverage Δ
src/astro/databases/base.py	`96.12% <ø> (ø)`


feluelle

SGTM 👍

tatiana · 2022-08-10T14:24:41Z

@utkarsharma2 isn't the root cause of the issue that concurrent tests are trying to create tables with the same names (homes and homes2)?

astro-sdk/example_dags/example_snowflake_partial_table_with_append.py

Line 88 in 367f94b

name="homes",

Wouldn't it be better to change the example DAG to create tables with unique names? Would this be enough to fix the problem? If we didn't set the names of these tables explicitly, it would also have the advantage that clean up would delete the temporary tables with unique names by the end of the execution.
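
One way to picture the unique-name idea is the sketch below. The helper unique_table_name is hypothetical and not part of astro-sdk; it only shows how a random suffix keeps parallel test runs from targeting the same table.

```python
import uuid


def unique_table_name(prefix: str = "homes") -> str:
    """Return a table name that is unlikely to collide across parallel runs.

    Hypothetical helper for illustration only; not an astro-sdk API.
    """
    # uuid4().hex is 32 random hex characters; 12 keeps the name short.
    return f"{prefix}_{uuid.uuid4().hex[:12]}"


name_a = unique_table_name()
name_b = unique_table_name()
print(name_a, name_b)
```

Tables named this way would still need to be dropped at the end of the run, which is the cleanup advantage mentioned above when the names are not set explicitly.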

tatiana · 2022-08-10T14:30:34Z

src/astro/databases/base.py

+            autoload_with=self.sqlalchemy_engine,
+            extend_existing=True,


Without these arguments, the Python SDK would raise an exception if concurrent processes tried to create the same table (what was happening in the CI).
By introducing this change, we may be hiding this type of issue, making it harder for users to troubleshoot this anti-pattern (having two processes trying to create a table with the same name).

@tatiana This was my first guess as well. I converted the persistent tables to temp tables, but the issue still persisted: https://github.com/astronomer/astro-sdk/runs/7765749178?check_suite_focus=true

And I don't think busting the cache will hide the issue. If anything, this will bring up the issue sooner, since now you will be referring to the database directly and not the cache. I don't understand how this is an anti-pattern.

However, this comes at the cost of running a query to get the table columns every time we create the table object. But since we are not handling all table operations via SQLA, SQLA has no knowledge of updates made to a table by some operator. So, in my opinion, this should be added to keep such situations in check.

@tatiana But I do understand your point: having temp tables would be better, and the chances of collision would be reduced. I can make this change on top of the existing changes. WDYT?

As per the discussion on a call with @tatiana, we concluded the following:

The extend_existing is required and was the root cause of the issue in CI.

Persistent tables in the example DAG can also lead to conflicts when parallel test suites are running.

Based on the above points I have updated the PR to reflect those changes.

Amazing, thank you very much, @utkarsharma2 !

example_dags/example_snowflake_partial_table_with_append.py

Noticed an issue after approving


Added 'extend_existing' to Sqla Table object

5d3fb0c

utkarsharma2 requested review from dimberman, tatiana, sunank200, pankajastro, pankajkoti and feluelle as code owners August 10, 2022 12:56

utkarsharma2 marked this pull request as draft August 10, 2022 12:57

Merge branch 'main' into FixFlakyTests

7dd949b

utkarsharma2 marked this pull request as ready for review August 10, 2022 14:15

pankajkoti approved these changes Aug 10, 2022

feluelle approved these changes Aug 10, 2022

tatiana reviewed Aug 10, 2022

Convert persistent tables to temp tables

580dea0

tatiana previously approved these changes Aug 11, 2022

tatiana reviewed Aug 11, 2022

example_dags/example_snowflake_partial_table_with_append.py

tatiana approved these changes Aug 11, 2022

utkarsharma2 merged commit cd5c5e9 into main Aug 11, 2022

utkarsharma2 deleted the FixFlakyTests branch August 11, 2022 13:39
