CDK: Add schema normalization to declarative stream #32786
Confirmed this works as expected.
example record without casting ID to a string:
{"type": "RECORD", "record": {"stream": "email_templates", "data": {"id": 4071586003,
example with the normalization enabled and the schema defining id as a string:
{"type": "RECORD", "record": {"stream": "email_templates", "data": {"id": "4071585003",
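To illustrate the behavior confirmed above, here is a minimal sketch of schema-driven casting. It is not the CDK's implementation: `normalize_record` and the `casters` table are hypothetical names, and the sketch only handles top-level fields with simple JSON schema types.

```python
from typing import Any, Dict

# Hypothetical mapping from JSON schema type names to Python casters.
casters = {"string": str, "integer": int, "number": float}


def normalize_record(record: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Cast top-level values to the type declared in the stream schema."""
    properties = schema.get("properties", {})
    normalized = dict(record)
    for key, value in record.items():
        declared = properties.get(key, {}).get("type")
        # Schemas often declare nullable types as lists, e.g. ["null", "string"].
        if isinstance(declared, list):
            non_null = [t for t in declared if t != "null"]
            declared = non_null[0] if len(non_null) == 1 else None
        caster = casters.get(declared)
        if value is None or caster is None:
            continue  # nothing declared, or nothing to cast
        try:
            normalized[key] = caster(value)
        except (TypeError, ValueError):
            pass  # leave the original value if it cannot be cast
    return normalized


schema = {"type": "object", "properties": {"id": {"type": ["null", "string"]}}}
print(normalize_record({"id": 4071586003}, schema))  # {'id': '4071586003'}
```

With a schema that declares `id` as a string, the integer ID comes out as a string, which matches the second record shown above.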
description: Responsible for normalization according to the schema.
type: string
enum:
- no
let's call this NO_TRANSFORM
I assume this is the option the user will see in the UI, right? Wouldn't "No Transform" look better in that case?
good question. @lmossman does the FE rename enums or show them as they are in the schema?
The Connector Builder currently just shows enum values as they are, e.g. for HTTP Method we show GET and POST in the UI.
But it also would be pretty low effort to rename the values when adding them to the builder UI
thanks Lake! @keu let's go with something user-friendly then.
Thinking more about the name, I sort of dislike normalization because the name clashes with another Airbyte concept.
How about
None
Default
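As a rough sketch of how the two proposed values could be modeled, assuming they end up as string enum members in the component schema. The names `SchemaNormalizationOption` and `should_normalize` are hypothetical and only serve the illustration.

```python
from enum import Enum


class SchemaNormalizationOption(Enum):
    """Hypothetical enum mirroring the two user-facing values proposed above."""

    NONE = "None"        # records are emitted as extracted
    DEFAULT = "Default"  # records are cast according to the stream schema


def should_normalize(option: str) -> bool:
    """Return True when the configured option asks for schema-based casting."""
    # Enum lookup by value raises ValueError for anything outside the enum.
    return SchemaNormalizationOption(option) is SchemaNormalizationOption.DEFAULT


print(should_normalize("Default"))
print(should_normalize("None"))
```

Keeping the user-facing strings as the enum values means the Connector Builder can display them as they are, without a rename step.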
@@ -707,17 +709,15 @@ def create_http_requester(self, model: HttpRequesterModel, config: Config, *, na
parameters=model.parameters or {},
)

model_http_method = (
@girarda removed this after adding the --set-default-enum-member flag
Warning: Soft code freeze is in effect until 2024-01-02. Please avoid merging to master. #freedom-and-responsibility
this is great thanks @keu !
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com> Co-authored-by: Yevhenii Kurochkin <ykurochkin@flyaps.com> Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
What
Solves #30737.
The implementation follows this proposal.
It applies schema_normalization at the very end, after records have been extracted and transformed.
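The ordering described above can be sketched as follows. This is a simplified illustration, not the CDK's record selector: `select_records` and its parameters are hypothetical, and the extractor, transformations, and normalizer are plain callables.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def select_records(
    response_body: Any,
    extract: Callable[[Any], Iterable[Record]],
    transformations: List[Callable[[Record], Record]],
    normalize: Callable[[Record], Record],
) -> Iterator[Record]:
    """Extract records, apply transformations, then normalize as the last step."""
    for record in extract(response_body):      # 1. extraction
        for transformation in transformations:
            record = transformation(record)    # 2. transformations
        yield normalize(record)                # 3. schema normalization, last


# Toy inputs, all assumptions made for the example.
response = {"data": [{"id": 1}, {"id": 2}]}
extract = lambda body: body["data"]
add_account = lambda record: {**record, "account_id": 42}
stringify = lambda record: {key: str(value) for key, value in record.items()}

print(list(select_records(response, extract, [add_account], stringify)))
```

Because normalization runs last, fields added by transformations (here `account_id`) are cast along with the extracted ones.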