[Low-Code CDK] Generate Pydantic models from the handwritten component manifest schema #20044

Closed
3 tasks
brianjlai opened this issue Dec 3, 2022 · 1 comment

brianjlai commented Dec 3, 2022

Low Code Schema Refactor Phase 2

Tell us about the problem you're trying to solve

As part of processing an incoming connector manifest, we want to be able to parse a manifest into related Pydantic models. In order to be able to do that, we first need to have the Pydantic models defined.

Describe the solution you’d like

Rather than handwriting every Pydantic model, we should leverage our existing tooling, which can generate Pydantic models from an input schema. Using the previously written component manifest schema as input, a gradle command should generate all the Pydantic models and store them within airbyte-cdk/python/airbyte_cdk/sources/declarative/models.

Implementation Details

We already have prior art to work from: a similar process transforms the Airbyte protocol into Pydantic models in airbyte-cdk/python/bin/generate-protocol-files.sh.

We should leverage the existing airbyte/code-generator, which under the hood installs and uses the datamodel-code-generator library.

From a spike on the level of effort: one option we may want to use during model generation is --enum-field-as-literal one. It would let us enforce a Literal on the type name when a component definition is ambiguous. For example, OffsetIncrement and PageIncrement have equivalent fields, so Pydantic will choose the first one that matches while parsing. However, we can use Literal['OffsetIncrement'] on the type field to ensure the correct component gets resolved.
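A minimal sketch of the Literal idea, not the generated output itself. The single page_size field and the Paginator wrapper model are assumptions made for illustration. Only the Literal on the type name comes from the description above.

```python
from typing import Literal, Union

from pydantic import BaseModel


class OffsetIncrement(BaseModel):
    # The Literal pins this model to manifests that declare type: OffsetIncrement.
    type: Literal["OffsetIncrement"]
    page_size: int  # assumed field, for illustration only


class PageIncrement(BaseModel):
    # Same fields as OffsetIncrement; only the Literal tells the two apart.
    type: Literal["PageIncrement"]
    page_size: int  # assumed field, for illustration only


class Paginator(BaseModel):
    # Hypothetical wrapper so the Union is resolved during normal model parsing.
    pagination_strategy: Union[OffsetIncrement, PageIncrement]


page = Paginator(pagination_strategy={"type": "PageIncrement", "page_size": 50})
offset = Paginator(pagination_strategy={"type": "OffsetIncrement", "page_size": 50})

print(type(page.pagination_strategy).__name__)
print(type(offset.pagination_strategy).__name__)
```

Because OffsetIncrement rejects any type value other than its own name, validation falls through to PageIncrement for the first input, even though OffsetIncrement is listed first in the Union.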

Acceptance Criteria

  • Pydantic models are defined within the airbyte-cdk declarative package and should be accessible at runtime
  • Pydantic models are to be derived from the component manifest schema
  • A gradle command exists to easily regenerate models when the schema is changed
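As a rough sketch of what the regeneration step might run underneath the gradle command, the helper below assembles a datamodel-codegen invocation. The helper name and both file names are hypothetical. The real task may instead go through the airbyte/code-generator image, as the protocol script does.

```python
from pathlib import Path
from typing import List


def build_codegen_command(schema_path: Path, output_path: Path) -> List[str]:
    """Assemble a datamodel-codegen call for the component manifest schema."""
    return [
        "datamodel-codegen",
        "--input", str(schema_path),
        "--input-file-type", "jsonschema",
        "--output", str(output_path),
        # Single-value enums become Literal types, e.g. Literal["OffsetIncrement"].
        "--enum-field-as-literal", "one",
    ]


# Hypothetical file names; only the models directory comes from the issue text.
models_dir = Path("airbyte-cdk/python/airbyte_cdk/sources/declarative/models")
command = build_codegen_command(
    schema_path=Path("component_schema.yaml"),
    output_path=models_dir / "generated_models.py",
)
print(" ".join(command))
```

Passing the resulting list to subprocess.run(command, check=True) would execute it, assuming datamodel-code-generator is installed in the environment.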

brianjlai commented Dec 6, 2022

grooming notes:

  • run it in the build pipeline and fail if a diff is detected
  • any time we mess with the CI pipeline, there are always some unknowns
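One way the "fail if a diff is detected" check could look, as a sketch under assumptions: the pipeline regenerates the models into a scratch file and compares it with the committed one. The function names and the file layout are hypothetical.

```python
import difflib
from pathlib import Path
from typing import List


def models_diff(committed: str, regenerated: str) -> List[str]:
    """Return unified-diff lines; an empty list means the models are current."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            regenerated.splitlines(),
            fromfile="committed",
            tofile="regenerated",
            lineterm="",
        )
    )


def check_models_up_to_date(committed_path: Path, regenerated_path: Path) -> int:
    """Print any diff and return a process exit code (0 = clean, 1 = stale)."""
    diff = models_diff(committed_path.read_text(), regenerated_path.read_text())
    for line in diff:
        print(line)
    return 1 if diff else 0
```

A build step could call sys.exit(check_models_up_to_date(...)) so that stale generated models fail the pipeline and the printed diff shows what changed.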

maxi297 added a commit that referenced this issue Dec 15, 2022
octavia-approvington pushed a commit that referenced this issue Dec 20, 2022
* handwritten low code manifest example components

* add MinMaxDatetime to jsonschema

* add a basic gradle command to generate manifest components

* Add auth components to handwritten component schema

- ApiKeyAuthenticator
- BasicHttpAuthenticator
- BearerAuthenticator
- DeclarativeOauth2Authenticator
- NoAuth

* Respect optional properties in DeclarativeOauth2Authenticator

* Fix `Dict[str, Any]` mapping in auth components

* add default error handler composite error handler and http response filter components

* [low code component schema] adding backoff strategies to schema

* [low code component schema] fix float types

* [low code component schema] add RecordFilter

* Remove `config` from auth components

* [low code component schema] add Interpolation (with pending question on 'type' not being defined)

* Add CartesianProductStreamSlicer & DatetimeStreamSlicer

* Add ListStreamSlicer, and fix nesting of DatetimeStreamSlicer

* [low code component schema] add InterpolatedRequestOptionsProvider

* Add slicer components, and fix a couple of components after reviewing output

* [low code component schema] adding transformations and adding type to interpolators

* adding spec and a few small tweaks

* Add DefaultSchemaLoader

* [low code component schema] attempt on custom class

* Add descriptions for auth components

* add RequestOption

* remove interpolated objects from the schema in favor of strings only

* a few schema fixes and adding some custom pagination and stream slicer

* [low code component schema] fix CustomBackoffStrategy

* Add CustomRecordExtractor

* add some description and add additional properties

* insert a transformer to hydrate default manifest components and perform validation against the handwritten schema

* [low code component schema] validating existing schemas

* [low code component schema] clean validation script

* add manifest transformer tests and a few tweaks to the schema

* Revert "[low code component schema] clean validation script"

This reverts commit 2408f41.

* Revert "[low code component schema] validating existing schemas"

This reverts commit 9d39977.

* [low code component schema] integrate validation script to gradle

* [low code component schema] updating validation script permissions

* remove a few model gen spike files and clean up comments

* default types should take parent type into account and a few schema changes

* [ISSUE #20044] generate pydantic models from handwritten schema

* [ISSUE #20044] code review

* [ISSUE #20044] re-generating declarative component schema files

Co-authored-by: brianjlai <brian.lai@airbyte.io>
Co-authored-by: Catherine Noll <noll.catherine@gmail.com>
maxi297 added a commit that referenced this issue Dec 21, 2022