Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#17047 Airbyte CDK: Improve error for returning non-iterable from connectors parse_response #17626

Conversation

Gitznik
Copy link
Contributor

@Gitznik Gitznik commented Oct 5, 2022

What

The change makes it easier to debug when the user returns the response dict directly from the parse_response method in the python cdk, instead of yielding it. Before this change, it leads to a Pydantic validation error that does not help pointing you in the right direction:
pydantic.main.BaseModel.__init__\npydantic.error_wrappers.ValidationError: 1 validation error for AirbyteRecordMessage\ndata\n value is not a valid dict (type=type_error.dict)

Closes #17047

How

Added a validator that runs before the default pydantic validators that checks if the data is a valid dict and if not, raises a ValueError with a pointer to the return type of the parse_response method.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Airbyte Python CDK Improvement

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Unit & integration tests added and passing: 914 passed, 669 warnings in 10.04s. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • PR name follows PR naming conventions

Tests

Unit

Test output is too long to add it here.

============================================================================================================================================== 914 passed, 669 warnings in 10.54s ===============================================================================================================================================

@Gitznik Gitznik requested a review from a team as a code owner October 5, 2022 19:56
@github-actions github-actions bot added the CDK Connector Development Kit label Oct 5, 2022
@Gitznik Gitznik changed the title #17047 airbyte cdk improve invalid message data type error #17047 airbyte python CKD - improve invalid message data type error Oct 5, 2022
@Gitznik Gitznik changed the title #17047 airbyte python CKD - improve invalid message data type error #17047 Airbyte CDK: Improve error for returning non-iterable from connectors parse_response Oct 5, 2022
Copy link
Contributor

@girarda girarda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! Two small comments then we should be good to merge

def data_is_dict(cls: AirbyteRecordMessage, value: Dict[str, Any]):
if isinstance(value, dict):
return value
raise ValueError("Data object is not a dictionary. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we show the object as part of the exception message to help troubleshooting the root cause?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. Added.

@@ -800,3 +800,10 @@ def test_checkpoint_state_from_stream_instance():
assert actual_message == _as_state(
{"teams": {"updated_at": "2022-09-11"}, "managers": {"updated": "expected_here"}}, "managers", {"updated": "expected_here"}
)


def test_airbyte_record_message_custom_data_validation_error():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's also add a test case for the valid path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing.

@marcosmarxm
Copy link
Member

@girarda can you review this again?

@marcosmarxm marcosmarxm self-assigned this Oct 6, 2022
@girarda
Copy link
Contributor

girarda commented Oct 6, 2022

/publish-cdk dry-run=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/3201190745
https://github.com/airbytehq/airbyte/actions/runs/3201190745

@girarda
Copy link
Contributor

girarda commented Oct 6, 2022

Love it! Thank you for your contribution @Gitznik ❤️

I'll publish the CDK in a follow-up PR

@girarda girarda merged commit d9ad272 into airbytehq:master Oct 6, 2022
girarda added a commit that referenced this pull request Oct 7, 2022
…from connectors parse_response (#17626)"

This reverts commit d9ad272.
girarda added a commit that referenced this pull request Oct 7, 2022
…from connectors parse_response (#17707)

* Bump cdk version

* Revert "#17047 Airbyte CDK: Improve error for returning non-iterable from connectors parse_response (#17626)"

This reverts commit d9ad272.

* Bump
letiescanciano added a commit that referenced this pull request Oct 7, 2022
…vation

* master: (32 commits)
  fixed octavia position and z-index on onboarding page (#17708)
  Revert "Revert "Do not wait the end of a reset to return an update (#17591)" (#17640)" (#17669)
  source-google-analytics-v4: use hits metric for check (#17717)
  Source linkedin-ads: retry 429/5xx when refreshing access token (#17724)
  🐛 Source Mixpanel: solve cursor field none expected array (#17699)
  🎉 8890 Source MySql: Fix large table issue by fetch size (#17236)
  Test e2e testing tool commands (#17722)
  fixed escape character i18n error (#17706)
  Docs: adds missing " in transformations-with-airbyte.md (#17723)
  Change Osano token to new project (#17720)
  Source Github: improve 502 handling for `comments` stream (#17715)
  #17506 source snapchat marketing: retry failed request for refreshing access token (#17596)
  MongoDb Source: Increase performance of discover (#17614)
  Testing tool commands for run scenarios (#17550)
  Kustomize: Missing NORMALIZATION_JOB_* environment variables in stable-with-resource-limits overlays (#17713)
  Fix console errors (#17696)
  Revert: #17047 Airbyte CDK: Improve error for returning non-iterable from connectors parse_response (#17707)
  #17047 Airbyte CDK: Improve error for returning non-iterable from connectors parse_response (#17626)
  📝 Postgres source: document occasional full refresh under cdc mode (#17705)
  Bump Airbyte version from 0.40.12 to 0.40.13 (#17682)
  ...
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
… from connectors parse_response (airbytehq#17626)

* Improve airbyte cdk invalid message data type error message

* Test cdk invalid message data type custom error is raised

* Fix test to pass stream as a string

* Add valid record message data input type test

* Add object type and value to AirbyteRecordMessage validator message

Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
…iterable from connectors parse_response (airbytehq#17707)

* Bump cdk version

* Revert "airbytehq#17047 Airbyte CDK: Improve error for returning non-iterable from connectors parse_response (airbytehq#17626)"

This reverts commit d9ad272.

* Bump
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Airbyte CDK: Improve error for returning non-iterable from connectors parse_response
4 participants