Make schema field in source-snowflake mean a subset of the specified o… #20465

Merged
rodireich merged 14 commits into master from 20018-snowflake-source-connector-schema-not-used on Jan 12, 2023

Conversation

@rodireich rodireich commented Dec 14, 2022

What

The schema field in source-snowflake is not very useful today:
regardless of what the user enters in schema, tables from all schemas are discovered and included in the catalog.
This change brings its behavior in line with other source connectors.

How

If schema is specified, catalog discovery is limited to the tables in that schema.
If the field is left empty, tables from all schemas are included in the catalog, as in today's discovery.
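
The discovery rule above can be illustrated with a small, self-contained sketch. It is not the connector's actual code: `TableRef` and `filterBySchema` are hypothetical names, and exact, case-sensitive schema matching is an assumption made only for this example.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class SchemaFilterSketch {

  // Hypothetical stand-in for a discovered table; not an Airbyte type.
  record TableRef(String schema, String name) {}

  // Keeps only tables in the configured schema.
  // A missing or blank schema means "no filter", so every table is returned.
  // Assumption: schema names are compared exactly (case-sensitive).
  static List<TableRef> filterBySchema(List<TableRef> discovered, Optional<String> configuredSchema) {
    Optional<String> schema = configuredSchema.map(String::trim).filter(s -> !s.isEmpty());
    if (schema.isEmpty()) {
      return discovered;
    }
    return discovered.stream()
        .filter(t -> t.schema().equals(schema.get()))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<TableRef> all = List.of(
        new TableRef("SALES", "ORDERS"),
        new TableRef("SALES", "CUSTOMERS"),
        new TableRef("HR", "EMPLOYEES"));
    System.out.println(filterBySchema(all, Optional.of("SALES")).size()); // 2
    System.out.println(filterBySchema(all, Optional.empty()).size());     // 3
  }
}
```

Treating a blank value the same as a missing one mirrors the "field left empty" case described above; whether the real connector trims or normalizes the value is not stated in this PR.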

Recommended reading order

  1. spec.json
  2. SnowflakeDataSourceUtils.java

🚨 User Impact 🚨

This is going to have an impact: source-snowflake connectors today have a mandatory schema field, but replication might include tables from outside the specified schema.
These customers need to be instructed to leave the schema field empty upon upgrade (the field is mandatory today).
Existing connections that only sync tables from the specified schema should not have a problem.

>>>>>>>> We need to be careful when publishing this so that we do not break existing connections

Do not merge until we have a go-ahead from TCS!!

cc: @erica-airbyte

@rodireich rodireich linked an issue Dec 14, 2022 that may be closed by this pull request

github-actions bot commented Dec 14, 2022

Affected Connector Report

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to do the following as needed:

  • Run integration tests
  • Bump connector or module version
  • Add changelog
  • Publish the new version

❌ Sources (30)

Connector | Version | Changelog | Publish
source-alloydb | 1.0.34 | |
source-alloydb-strict-encrypt | 1.0.34 | 🔵 (ignored) | 🔵 (ignored)
source-bigquery | 0.2.3 | |
source-clickhouse | 0.1.14 | |
source-clickhouse-strict-encrypt | 0.1.14 | 🔵 (ignored) | 🔵 (ignored)
source-cockroachdb | 0.1.18 | |
source-cockroachdb-strict-encrypt | 0.1.18 | 🔵 (ignored) | 🔵 (ignored)
source-db2 | 0.1.16 | |
source-db2-strict-encrypt | 0.1.16 | 🔵 (ignored) | 🔵 (ignored)
source-dynamodb | 0.1.0 | |
source-e2e-test | 2.1.3 | |
source-e2e-test-cloud | 2.1.1 | 🔵 (ignored) | 🔵 (ignored)
source-elasticsearch | 0.1.1 | |
source-jdbc | 0.3.5 | 🔵 (ignored) | 🔵 (ignored)
source-kafka | 0.2.3 | |
source-mongodb-strict-encrypt | 0.1.19 | 🔵 (ignored) | 🔵 (ignored)
source-mongodb-v2 | 0.1.19 | |
source-mssql | 0.4.26 | |
source-mssql-strict-encrypt | 0.4.26 | 🔵 (ignored) | 🔵 (ignored)
source-mysql | 1.0.18 | |
source-mysql-strict-encrypt | 1.0.18 | 🔵 (ignored) | 🔵 (ignored)
source-oracle | 0.3.21 | |
source-oracle-strict-encrypt | 0.3.21 | 🔵 (ignored) | 🔵 (ignored)
source-postgres | 1.0.35 | |
source-postgres-strict-encrypt | 1.0.35 | 🔵 (ignored) | 🔵 (ignored)
source-redshift | 0.3.15 | |
source-scaffold-java-jdbc | 0.1.0 | 🔵 (ignored) | 🔵 (ignored)
source-sftp | 0.1.2 | |
source-snowflake | 0.1.28 | | (diff seed version)
source-tidb | 0.2.1 | |
  • See "Actionable Items" below for how to resolve warnings and errors.

✅ Destinations (0)

Connector Version Changelog Publish
  • See "Actionable Items" below for how to resolve warnings and errors.

✅ Other Modules (0)

Actionable Items

Category | Status | Actionable Item
Version | mismatch | The version of the connector is different from its normal variant. Please bump the version of the connector.
Version | doc not found | The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
Changelog | doc not found | The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug.
Changelog | changelog missing | There is no changelog for the current version of the connector. If you are the author of the current version, please add a changelog.
Publish | not in seed | The connector is not in the seed file (e.g. source_definitions.yaml), so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that it is not a bug.
Publish | diff seed version | The connector exists in the seed file, but the latest version is not listed there. This usually means that the latest version is not published. Please use the /publish command to publish the latest version.

rodireich commented Dec 14, 2022

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3699515235
❌ connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3699515235
🐛 https://gradle.com/s/gstpokxsslwiw

Build Failed

Test summary info:

Could not find result summary

.collect(Collectors.toMap(AirbyteStream::getName, s -> s));
.collect(Collectors.toMap(
    s ->
        "%s.%s".formatted(s.getNamespace(), s.getName()),

@rodireich rodireich Dec 14, 2022

This change makes the test more robust when previous runs have left junk schemas behind or when multiple instances of the acceptance test are running at the same time.
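
A minimal sketch of why keying by namespace and name helps, assuming two schemas happen to contain a table with the same name. `StreamInfo`, `byName`, and `byQualifiedName` are hypothetical stand-ins for illustration, not the test's actual types or helpers; only `Collectors.toMap` and `String.formatted` are real JDK calls.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class StreamKeySketch {

  // Hypothetical stand-in for AirbyteStream, reduced to the two fields used here.
  record StreamInfo(String namespace, String name) {}

  // Keyed by name only: Collectors.toMap throws IllegalStateException on duplicate keys.
  static Map<String, StreamInfo> byName(List<StreamInfo> streams) {
    return streams.stream().collect(Collectors.toMap(StreamInfo::name, s -> s));
  }

  // Keyed by "namespace.name": same-named tables in different schemas no longer collide.
  static Map<String, StreamInfo> byQualifiedName(List<StreamInfo> streams) {
    return streams.stream()
        .collect(Collectors.toMap(
            s -> "%s.%s".formatted(s.namespace(), s.name()),
            s -> s));
  }

  public static void main(String[] args) {
    List<StreamInfo> streams = List.of(
        new StreamInfo("TEST_SCHEMA", "USERS"),
        new StreamInfo("LEFTOVER_SCHEMA", "USERS"));

    System.out.println(byQualifiedName(streams).keySet().size()); // 2

    try {
      byName(streams);
    } catch (IllegalStateException e) {
      System.out.println("duplicate key when keyed by name only");
    }
  }
}
```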

rodireich commented Dec 14, 2022

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3699740148
❌ connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3699740148
🐛 https://gradle.com/s/i5tpwjwha5k6q

Build Failed

Test summary info:

=========================== short test summary info ============================
ERROR test_core.py::TestDiscovery::test_backward_compatibility[inputs0] - doc...
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_incremental.py:26: `future_state` not specified, skipping.
============== 28 passed, 1 skipped, 1 error in 62.30s (0:01:02) ===============

rodireich commented Dec 15, 2022

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3700775419

rodireich commented Dec 15, 2022

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3700872496
❌ connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3700872496
🐛 https://gradle.com/s/hvkpfoqcklumy

Build Failed

Test summary info:

=========================== short test summary info ============================
ERROR test_core.py::TestDiscovery::test_backward_compatibility[inputs0] - doc...
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_incremental.py:26: `future_state` not specified, skipping.
============== 28 passed, 1 skipped, 1 error in 63.08s (0:01:03) ===============

@rodireich rodireich temporarily deployed to more-secrets December 15, 2022 03:43 — with GitHub Actions Inactive

rodireich commented Dec 15, 2022

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3701376021
✅ connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3701376021
Python tests coverage:

	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       140      5    96%   87, 93, 238, 242-243
	 source_acceptance_test/conftest.py                     208     92    56%   36, 42-44, 49, 54, 77, 83, 89-91, 110, 115-117, 123-125, 131-132, 137-138, 143, 149, 158-167, 173-178, 193, 217, 248, 254, 262-267, 275-280, 288-301, 306-312, 319-330, 337-353
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              398    111    72%   53, 58, 87-95, 100-107, 111-112, 116-117, 299, 337-354, 363-371, 375-380, 386, 419-424, 462-469, 512-514, 517, 582-590, 602-605, 610, 666-667, 673, 676, 712-722, 735-760
	 source_acceptance_test/tests/test_incremental.py       158     14    91%   52-59, 64-77, 240
	 source_acceptance_test/utils/asserts.py                 39      2    95%   62-63
	 source_acceptance_test/utils/common.py                  94     10    89%   16-17, 32-38, 72, 75
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       133     33    75%   24-27, 46-47, 50-54, 57-58, 73-75, 78-80, 83-85, 88-90, 93-95, 124-125, 159-161, 208
	 source_acceptance_test/utils/json_schema_helper.py     107     13    88%   30-31, 38, 41, 65-68, 96, 120, 192-194
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1599    332    79%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_incremental.py:26: `future_state` not specified, skipping.
=================== 29 passed, 1 skipped in 70.39s (0:01:10) ===================

@@ -12691,7 +12691,7 @@
"$schema": "http://json-schema.org/draft-07/schema#",

@grishick, @evantahler do we need to make changes to connector_catalog as well?

@rodireich rodireich temporarily deployed to more-secrets December 15, 2022 06:25 — with GitHub Actions Inactive
…-schema-not-used' into 20018-snowflake-source-connector-schema-not-used
@rodireich rodireich temporarily deployed to more-secrets December 15, 2022 06:30 — with GitHub Actions Inactive

rodireich commented Dec 16, 2022

Going over Metabase, I'm seeing the following active cloud source-snowflake connections:

  1. 40ea7f191a87: Only syncing tables of the specified schema. Won't break ✔
  2. 5588f84cedd0: Won't break ✔
  3. 8cb1f1336d6d: Won't break ✔
  4. 1109283f47c7: Won't break ✔
  5. d845f68fa43c: Won't break ✔
  6. e34c23e109cf: Won't break ✔

@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 05:56 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 05:57 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 06:07 — with GitHub Actions Inactive

rodireich commented Jan 6, 2023

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3853261129
❌ connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3853261129
🐛 https://gradle.com/s/a54klqihmqlw4

Build Failed

Test summary info:

Could not find result summary

rodireich commented Jan 6, 2023

/test connector=connectors/source-snowflake

🕑 connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3853437386
✅ connectors/source-snowflake https://github.com/airbytehq/airbyte/actions/runs/3853437386
Python tests coverage:

	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       141      5    96%   87, 93, 239, 243-244
	 source_acceptance_test/conftest.py                     211     95    55%   36, 42-44, 49, 54, 77, 83, 89-91, 110, 115-117, 123-125, 131-132, 137-138, 143, 149, 158-167, 173-178, 193, 217, 248, 254, 262-267, 275-285, 293-306, 311-317, 324-335, 342-358
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              402    115    71%   53, 58, 93-104, 109-116, 120-121, 125-126, 308, 346-363, 376-387, 391-396, 402, 435-440, 478-485, 528-530, 533, 598-606, 618-621, 626, 682-683, 689, 692, 728-738, 751-776
	 source_acceptance_test/tests/test_incremental.py       160     14    91%   56-63, 68-81, 244
	 source_acceptance_test/utils/asserts.py                 39      2    95%   62-63
	 source_acceptance_test/utils/common.py                  94     10    89%   16-17, 32-38, 72, 75
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       133     33    75%   24-27, 46-47, 50-54, 57-58, 73-75, 78-80, 83-85, 88-90, 93-95, 124-125, 159-161, 208
	 source_acceptance_test/utils/json_schema_helper.py     107     13    88%   30-31, 38, 41, 65-68, 96, 120, 192-194
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1609    339    79%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:377: The previous and actual discovered catalogs are identical.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_incremental.py:30: `future_state` not specified, skipping.
=================== 28 passed, 2 skipped in 72.17s (0:01:12) ===================

@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 07:20 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 07:21 — with GitHub Actions Inactive

rodireich commented Jan 6, 2023

/publish connector=connectors/source-snowflake

🕑 Publishing the following connectors:
connectors/source-snowflake
https://github.com/airbytehq/airbyte/actions/runs/3853527039


Connector Did it publish? Were definitions generated?
connectors/source-snowflake

If you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 07:44 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 07:45 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 08:05 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 6, 2023 09:02 — with GitHub Actions Inactive
@rodireich rodireich temporarily deployed to more-secrets January 11, 2023 17:14 — with GitHub Actions Inactive
@rodireich rodireich merged commit 94c84f7 into master Jan 12, 2023
@rodireich rodireich deleted the 20018-snowflake-source-connector-schema-not-used branch January 12, 2023 17:30
jbfbell pushed a commit that referenced this pull request Jan 13, 2023
…o… (#20465)

* Make schema field in source-postgres mean a subset of the specified of schema when during discover(). update UI

* Add missing file

* Fix failing acceptance test

* sanity

* update doc

* typo

* version bump and release note

* Fix failing test

* fix format

sheshan-doye-konvergeai commented Feb 27, 2023

I'm trying snowflake source version 1.28,
and it seems that nothing is discovered now if a schema is provided.
Response:
{"catalog":{"streams":[]},"jobInfo":{"id":"c2af2abf-05c8-4525-aea2-ec10002a6832","configType":"discover_schema","configId":"Optional[e2d65910-8c8b-40a1-ae7d-ee2416b2bfa2]","createdAt":1677493097784,"endedAt":1677493103991,"succeeded":true

Am I missing anything?

Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/snowflake
Development

Successfully merging this pull request may close these issues.

Source Snowflake - connector Schema not used
6 participants