🎉 🐛 Source Airtable: cast native Airtable Types to JSONSchema Types #21962

Merged: 7 commits from baz/source-airtable-cast-schema into master on Jan 28, 2023

Conversation

@bazarnov (Collaborator) commented on Jan 27, 2023

What

Resolving: https://github.com/airbytehq/alpha-beta-issues/issues/299

How

  • Refactored schema_helpers.py to cast native Airtable field types to the corresponding JSON Schema types automatically (see the sketch below).
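
A minimal sketch of the idea, for illustration only (this is not the connector's actual schema_helpers.py; the field-type names and helper name are assumptions): each Airtable field type maps onto a JSON Schema property definition, and anything unrecognized falls back to the previous behaviour of a nullable string.

# Illustrative sketch only; not the real source_airtable/schema_helpers.py.
# Hypothetical mapping from Airtable field types to JSON Schema property definitions.
AIRTABLE_TO_JSON_SCHEMA = {
    "checkbox": {"type": ["null", "boolean"]},
    "number": {"type": ["null", "number"]},
    "percent": {"type": ["null", "number"]},
    "date": {"type": ["null", "string"], "format": "date"},
    "dateTime": {"type": ["null", "string"], "format": "date-time"},
    "multipleSelects": {"type": ["null", "array"], "items": {"type": ["null", "string"]}},
}

# Unknown field types keep the old behaviour: a nullable string.
DEFAULT_PROPERTY = {"type": ["null", "string"]}


def to_json_schema_property(airtable_field_type: str) -> dict:
    """Return the JSON Schema property definition for a single Airtable field type."""
    return AIRTABLE_TO_JSON_SCHEMA.get(airtable_field_type, DEFAULT_PROPERTY)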

🚨 User Impact 🚨

@erica-airbyte
This is a BREAKING CHANGE.

What should customers do?
After the update, customers should refresh the schema and reset their data to pick up the correct schema types and avoid issues on the destination side. Unfortunately, this reset is unavoidable.

Reason for the breaking change:
See https://github.com/airbytehq/alpha-beta-issues/issues/299

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Unit & integration tests added and passing. Community members, please provide proof of local success, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow the instructions in the README. For Java connectors, run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md. See the changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

@bazarnov (Collaborator, Author) commented on Jan 27, 2023

/test connector=connectors/source-airtable

🕑 connectors/source-airtable https://github.com/airbytehq/airbyte/actions/runs/4023898212
✅ connectors/source-airtable https://github.com/airbytehq/airbyte/actions/runs/4023898212
Python tests coverage:

Name                                Stmts   Miss  Cover
-------------------------------------------------------
source_airtable/__init__.py             2      0   100%
source_airtable/streams.py             92      1    99%
source_airtable/source.py              38      1    97%
source_airtable/schema_helpers.py      41      5    88%
source_airtable/auth.py                17      6    65%
-------------------------------------------------------
TOTAL                                 190     13    93%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          12      4    67%   16-19
	 source_acceptance_test/config.py                       141      5    96%   87, 93, 239, 243-244
	 source_acceptance_test/conftest.py                     211     95    55%   36, 42-44, 49, 54, 77, 83, 89-91, 110, 115-117, 123-125, 131-132, 137-138, 143, 149, 158-167, 173-178, 193, 217, 248, 254, 262-267, 275-285, 293-306, 311-317, 324-335, 342-358
	 source_acceptance_test/plugin.py                        69     25    64%   22-23, 31, 36, 120-140, 144-148
	 source_acceptance_test/tests/test_core.py              402    115    71%   53, 58, 93-104, 109-116, 120-121, 125-126, 308, 346-363, 376-387, 391-396, 402, 435-440, 478-485, 528-530, 533, 598-606, 618-621, 626, 682-683, 689, 692, 728-738, 751-776
	 source_acceptance_test/tests/test_incremental.py       160     14    91%   58-65, 70-83, 246
	 source_acceptance_test/utils/asserts.py                 39      2    95%   62-63
	 source_acceptance_test/utils/common.py                  94     10    89%   16-17, 32-38, 72, 75
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       133     33    75%   24-27, 46-47, 50-54, 57-58, 73-75, 78-80, 83-85, 88-90, 93-95, 124-125, 159-161, 208
	 source_acceptance_test/utils/json_schema_helper.py     107     13    88%   30-31, 38, 41, 65-68, 96, 120, 192-194
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1609    339    79%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: Incremental syncs are not supported on this connector.
SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:94: The previous and actual specifications are identical.
SKIPPED [2] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:386: Backward compatibility tests are disabled for version 1.0.2.
=================== 38 passed, 4 skipped in 73.74s (0:01:13) ===================

@octavia-squidington-iv added the area/documentation label on Jan 27, 2023
@bazarnov requested review from a team and lazebnyi on January 27, 2023 at 11:09
@erohmensing (Contributor) left a comment

Looks legit to me - just a few questions in the review. TCS might want to reach out to users to let them know of the breaking change, in case they have pipelines consuming the data in string format - have you all aligned on any follow-up necessary?

@bazarnov (Collaborator, Author) commented on Jan 27, 2023

Looks legit to me - just a few questions in the review. TCS might want to reach out to users to let them know of the breaking change, in case they have pipelines consuming the data in string format - have you all aligned on any follow-up necessary?

We can safely push this one into master because:

  • We hold the Cloud version at 1.0.1 until the 1.0.2 server-side components for the OAuth part are updated, merged into Cloud master, and deployed.
  • For OSS users, we can safely release 2.0.0 and let them try the new functionality.
  • IMPORTANT: we need to reach out to existing customers and announce that a big update is coming and that they should be ready to reset their data and refresh the schema.
  • Once the Cloud part is ready, we simply unpin the version for source-airtable and ship OAuth, dynamic schema discovery, and the cast to JSON Schema types at the same time, so it lands as a single major update and minimizes the customers' effort.

@erohmensing (Contributor) left a comment

Once the Cloud part is ready, we simply unpin the version for source-airtable and ship OAuth, dynamic schema discovery, and the cast to JSON Schema types at the same time, so it lands as a single major update and minimizes the customers' effort.

Thank you for the thorough overview! Bundling a lot of changes together like this sounds like a good idea.

IMPORTANT: we need to reach out to existing customers and announce that a big update is coming and that they should be ready to reset their data and refresh the schema.

👍🏻 please make sure TCS is aware of that on the original issue

docs/integrations/sources/airtable.md (review thread resolved)
@bazarnov (Collaborator, Author) commented on Jan 27, 2023

/publish connector=connectors/source-airtable

🕑 Publishing the following connectors:
connectors/source-airtable
https://github.com/airbytehq/airbyte/actions/runs/4027890555


Connector Did it publish? Were definitions generated?
connectors/source-airtable

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@bazarnov (Collaborator, Author) commented on Jan 28, 2023

/publish connector=connectors/source-airtable

🕑 Publishing the following connectors:
connectors/source-airtable
https://github.com/airbytehq/airbyte/actions/runs/4029352514


Connector Did it publish? Were definitions generated?
connectors/source-airtable

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@octavia-squidington-iii temporarily deployed to more-secrets on January 28, 2023 at 01:16 (GitHub Actions)
@octavia-squidington-iii temporarily deployed to more-secrets on January 28, 2023 at 01:17 (GitHub Actions)
@airbytehq deleted a comment from the github-actions bot on Jan 28, 2023
@bazarnov merged commit f981668 into master on Jan 28, 2023
@bazarnov deleted the baz/source-airtable-cast-schema branch on January 28, 2023 at 07:16
@bazarnov (Collaborator, Author)

@erica-airbyte The server-side changes for Airtable OAuth are deployed to Cloud as of the latest release and Cloud version. Are we good to unpin the source-airtable version so it picks up the latest 2.0.0? That would mean we should reach out to the customers before this goes live.

@erica-airbyte (Contributor)

@bazarnov Yes, we need to reach out to customers if this change will cause their syncs to start failing. Will this change have any downstream effect on their destination data (e.g. stream names changing, field names changing, etc.)?

@bazarnov (Collaborator, Author) commented on Jan 30, 2023

@bazarnov Yes, we need to reach out to customers if this change will cause their syncs to start failing. Will this change have any downstream effect on their destination data (e.g. stream names changing, field names changing, etc.)?

@erica-airbyte The change touches only the data types in the stream schemas; the column names remain the same for customers who have already updated to 1.0.1 or later. Updating from 0.1.3 will be completely breaking and will require setting up the connector again.
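
For illustration only (the field name and schemas below are hypothetical, not taken from the connector), only the declared type of a column changes in the discovered schema; the column name itself stays the same:

# Hypothetical before/after for a single column in a discovered stream schema.
schema_before = {"properties": {"amount": {"type": ["null", "string"]}}}  # pre-2.0.0: values typed as strings
schema_after = {"properties": {"amount": {"type": ["null", "number"]}}}   # 2.0.0: native Airtable type cast to a JSON Schema type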

@erica-airbyte (Contributor)

Ok, thanks @bazarnov, we will send this outreach to customers tomorrow.

Can we plan to have this update go out on Thursday?

@erica-airbyte (Contributor)

Comms have been sent. @davydov-d, please deploy this tomorrow.

@davydov-d (Collaborator)

@erica-airbyte done

@tybernstein (Contributor)

@davydov-d I had a user who tried to refresh the source schema after this was deployed and encountered the following errors:
[Screenshot: 2023-02-01 at 10:28:47 AM]
[Screenshot: 2023-02-01 at 10:24:28 AM]

@davydov-d (Collaborator)

@bazarnov, as you worked on this PR, do you know what could have gone wrong in the screenshots above?

@erica-airbyte (Contributor)

cc @bazarnov, can you take a look at this real quick? Customers are not able to get around this.

@erica-airbyte (Contributor)

@pedroslopez, as Airbyte OC, are you able to take a look here? It looks like the latest update that went out this AM might have broken customers; is this an easy fix, or should we roll this back?

@tybernstein (Contributor)

On a test workspace I changed the auth method to OAuth and gave access to all workspaces. Is there a chance that this update is not compatible with users who are using API keys?

@bazarnov (Collaborator, Author) commented on Feb 1, 2023

@erica-airbyte @pedroslopez I've just tested using prod - all works as expected.

@TBernstein4 could the customer please follow these instructions to create a new PAT with all the necessary scopes and permissions?
From the setup guide:

Standard scopes required for successful authentication:

data.records
data.recordComments
schema.bases

Also, could you please share this customer's workspace link so we can examine the input config?

@tybernstein (Contributor)

I will ask them about the PAT. I will mention you in the on-call issue that has the affected workspace links.

Labels: area/connectors, area/documentation, connectors/source/airtable