Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[low-code cdk] Allow for read command to be run on low code connector streams w/o a schema file #18532

Merged
merged 5 commits into from
Nov 3, 2022

Conversation

brianjlai
Copy link
Contributor

@brianjlai brianjlai commented Oct 27, 2022

What

Right now, connectors built using the low-code framework require that each stream have a schema defined in the schemas/<stream_schema>.json. For low-code connector iterative development, schemas are not strictly necessary for read. To make the Connector Builder UI scope smaller, this change allows read to be performed w/o schemas

How

  • Defines a MockSchemaLoader which returns an empty schema back to the underlying CDK. This MockSchemaLoader is the new default for low code connectors that don't define a schema_loader. New connectors built in the Connector Builder UI will not need schemas defined to run read.
  • Renamed JsonSchema to JsonFileSchemaLoader for better clarity
  • We don't have a great versioning story, but we only have a few low code connectors, so I went ahead and migrated all of them to the new name. I can redeploy them after I publish the CDK.

Recommended reading order

  1. mock_schema_loader.py
  2. json_file_schema_loader.py
  3. declarative_stream.py
  4. The existing low code connector yaml that are switching to the new name

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this cause all discover operations to return empty schemas unless the developer specifically specifies a Json schema loader for the connector? if so why not wrap the json loader in a "fallback" or "EmptyOnError" loader or something, which defaults to empty if the loader it wraps fails?

@brianjlai
Copy link
Contributor Author

would this cause all discover operations to return empty schemas unless the developer specifically specifies a Json schema loader for the connector? if so why not wrap the json loader in a "fallback" or "EmptyOnError" loader or something, which defaults to empty if the loader it wraps fails?

Given our discussion during planning specifically that an empty schema would be a part of the initial development flow before exporting the YAML and adding schemas before running source acceptance tests, we should be okay on this point

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non blocking coments

"ListStreamSlicer": ListStreamSlicer,
"MinMaxDatetime": MinMaxDatetime,
"MockSchemaLoader": MockSchemaLoader,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I'd replace Mock with Empty in the name

@@ -62,9 +63,10 @@
"HttpRequester": HttpRequester,
"InterpolatedBoolean": InterpolatedBoolean,
"InterpolatedString": InterpolatedString,
"JsonSchema": JsonSchema,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a worry that this might break existing users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I scanned through our existing low code implementations and i'm going to migrate them to the new name JsonFileSchemaLoader after the CDK is published first. However, on second thought, I think instead of doing the rename now, we'll just bundle this up into a series of cosmetic changes we have slated in November. Even though Hacktober fest is over, some contributors are still making adjustment and we don't want to add unnecessary friction if it can wait

@brianjlai
Copy link
Contributor Author

brianjlai commented Nov 3, 2022

/publish-cdk dry-run=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/3382684219
https://github.com/airbytehq/airbyte/actions/runs/3382684219

@brianjlai
Copy link
Contributor Author

brianjlai commented Nov 3, 2022

/publish-cdk dry-run=false

🕑 https://github.com/airbytehq/airbyte/actions/runs/3383169255
https://github.com/airbytehq/airbyte/actions/runs/3383169255

@brianjlai
Copy link
Contributor Author

brianjlai commented Nov 3, 2022

/test connector=connectors/source-greenhouse

🕑 connectors/source-greenhouse https://github.com/airbytehq/airbyte/actions/runs/3386946834
❌ connectors/source-greenhouse https://github.com/airbytehq/airbyte/actions/runs/3386946834
🐛 https://gradle.com/s/njbewgfnia5pw

Build Failed

Test summary info:

	 =========================== short test summary info ============================
	 FAILED unit_tests/test_streams.py::test_ignore_403 - requests.exceptions.JSON...
	 �[31m================= �[31m�[1m1 failed�[0m, �[32m10 passed�[0m, �[33m1520 warnings�[0m�[31m in 53.94s�[0m�[31m =================�[0m

@brianjlai
Copy link
Contributor Author

brianjlai commented Nov 3, 2022

/test connector=connectors/source-courier

🕑 connectors/source-courier https://github.com/airbytehq/airbyte/actions/runs/3386955355
✅ connectors/source-courier https://github.com/airbytehq/airbyte/actions/runs/3386955355
Python tests coverage:

	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       133      3    98%   87, 93, 230
	 source_acceptance_test/conftest.py                     196     97    51%   35, 41-43, 48, 54, 60, 66, 72-74, 80-95, 100, 105-107, 113-115, 121-122, 127-128, 133, 139, 148-157, 163-168, 232, 238, 244-250, 258-263, 271-284, 289-295, 302-313, 320-336
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              329    106    68%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 377-379, 382, 447-455, 484-485, 491, 494, 530-540, 553-578
	 source_acceptance_test/tests/test_incremental.py       145     20    86%   21-23, 29-31, 36-43, 48-61, 224
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     10    87%   15-16, 24-30, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/config_migration.py        23     23     0%   5-37
	 source_acceptance_test/utils/connector_runner.py       112     50    55%   23-26, 32, 36, 39-68, 71-73, 76-78, 81-83, 86-88, 91-93, 96-114, 148-150
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1479    376    75%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: not found in the config.
============ 26 passed, 1 skipped, 27 warnings in 90.48s (0:01:30) =============

@brianjlai
Copy link
Contributor Author

note: greenhouse has an existing test failure

@brianjlai brianjlai merged commit cf2f23d into master Nov 3, 2022
@brianjlai brianjlai deleted the brian/low_code_read_without_schema_file branch November 3, 2022 16:05
girarda pushed a commit that referenced this pull request Nov 3, 2022
… streams w/o a schema file (#18532)

* use mockschemaloader as the default and rename JsonSchema to JsonFileSchemaLoader

* rename mock to EmptySchemaLoader

* retain existing JsonSchema while hacktoberfest is still in progress

* bump version and changelog
mlavoie-sm360 pushed a commit to mlavoie-sm360/airbyte that referenced this pull request Nov 8, 2022
… streams w/o a schema file (airbytehq#18532)

* use mockschemaloader as the default and rename JsonSchema to JsonFileSchemaLoader

* rename mock to EmptySchemaLoader

* retain existing JsonSchema while hacktoberfest is still in progress

* bump version and changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants