
destination-redshift should fail syncs if records or properties are too large, rather than silently skipping records and succeeding #27993

Merged

merged 8 commits into master from evan/redshift-fails-on-big-data Jul 14, 2023

Conversation

evantahler
Contributor

@evantahler evantahler commented Jul 5, 2023

Closes #19990, albeit in a less-than-ideal way. At least now, users will be notified when source data won't fit in the destination. From there, a view or other filtering mechanism on the source can be used to adjust the incoming records so they fit within Redshift.
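For context, the limits discussed in this thread are 65,535 bytes for a single VARCHAR / SUPER value and roughly 1 MB per row. A minimal sketch of the kind of size check this PR introduces, in Python for illustration only (the actual connector is Java, and the constants and function names here are assumptions, not the connector's real API):

```python
import json

# Limits as discussed in this thread; the real connector may use
# different constants.
MAX_FIELD_BYTES = 65535          # Redshift VARCHAR / single SUPER value limit
MAX_RECORD_BYTES = 1024 * 1024   # approximate per-row limit discussed below

def oversized_parts(record: dict) -> list:
    """Return names of fields whose serialized size exceeds the per-value
    limit, plus '<record>' if the whole row is too large."""
    problems = []
    for name, value in record.items():
        if len(json.dumps(value).encode("utf-8")) > MAX_FIELD_BYTES:
            problems.append(name)
    if len(json.dumps(record).encode("utf-8")) > MAX_RECORD_BYTES:
        problems.append("<record>")
    return problems

# With this PR's behavior, any non-empty result fails the sync instead
# of silently dropping the record.
```

For example, `oversized_parts({"id": 1, "payload": "x" * 70000})` flags the `payload` field, since its serialized form exceeds 65,535 bytes.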

@evantahler evantahler marked this pull request as ready for review July 5, 2023 22:35
@evantahler evantahler requested a review from a team as a code owner July 5, 2023 22:35
@github-actions
Contributor

github-actions bot commented Jul 5, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan and you've followed all steps in the Breaking Changes Checklist
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • The connector tests are passing in CI
  • You've updated the connector's metadata.yaml file (new!)
  • If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.

@evantahler evantahler changed the title from "destination-redshift will fail syncs if records or properties are too large, rather than silently skipping records and succeding" to "destination-redshift will fail syncs if records or properties are too large, rather than silently skipping records and succeeding" Jul 5, 2023
@octavia-squidington-iii octavia-squidington-iii added the area/documentation label (Improvements or additions to documentation) Jul 5, 2023
@evantahler evantahler changed the title from "destination-redshift will fail syncs if records or properties are too large, rather than silently skipping records and succeeding" to "destination-redshift should fail syncs if records or properties are too large, rather than silently skipping records and succeeding" Jul 5, 2023
@octavia-squidington-iii
Collaborator

destination-redshift test report (commit 116f297c83) - ❌

⏲️ Total pipeline duration: 30mn19s

Step Result
Validate airbyte-integrations/connectors/destination-redshift/metadata.yaml
Connector version semver check
QA checks
Build connector tar
Build destination-redshift docker image for platform linux/x86_64
Build airbyte/normalization-redshift:dev
./gradlew :airbyte-integrations:connectors:destination-redshift:integrationTest

🔗 View the logs here

Please note that tests only run on PRs that are ready for review. Set your PR to draft mode to avoid flooding the CI engine and upstream services on subsequent commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool, using the following command:

airbyte-ci connectors --name=destination-redshift test

@octavia-squidington-iii
Collaborator

destination-redshift test report (commit 8398de38f4) - ❌

⏲️ Total pipeline duration: 28mn55s

Contributor

@edgao edgao left a comment


what do you think of keeping this behavior available as a config option? I'm nervous about sources that don't have the ability to do that sort of filtering

if not: We can also delete implementsRecordSizeLimitChecks and testSyncVeryBigRecords from DestinationAcceptanceTest / RedshiftStagingS3DestinationAcceptanceTest

either way - this diff lgtm

@evantahler
Contributor Author

evantahler commented Jul 6, 2023

Thanks for the review @edgao!

what do you think of keeping this behavior available as a config option? I'm nervous about sources that don't have the ability to do that sort of filtering

Given the choice, I didn't want to go with a config option, because whenever possible I want the behavior of our destinations to match. Snowflake and BQ have the same problem today, and one day we should tackle this problem holistically, rather than with a piecemeal per-destination solution.

For example, once we get one-stream-one-table released (#26028), we could add a feature to every destination that handles per-field and per-record size limitations. We could just write the PK and cursor for the record, and then include an _airbyte_metadata.error = ['data for column "foo" too large']. This would work for every destination, and not require any config settings. 1s1t solves every problem ;)
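The idea above can be sketched quickly. This is a hypothetical illustration of the proposed behavior, not a committed API: the field names (`_airbyte_meta`, `errors`), the limit constant, and the function are all assumptions for the sake of the example.

```python
import json

MAX_FIELD_BYTES = 65535  # illustrative per-value limit from this thread

def scrub_record(record: dict, pk: str, cursor: str) -> dict:
    """Null out oversized columns but keep the PK and cursor, recording one
    error per dropped column in an _airbyte_meta-style field (hypothetical)."""
    errors = []
    out = {}
    for name, value in record.items():
        if len(json.dumps(value).encode("utf-8")) > MAX_FIELD_BYTES:
            out[name] = None
            errors.append(f'data for column "{name}" too large')
        else:
            out[name] = value
    # PK and cursor are always written so dedup/incremental still work.
    out[pk] = record[pk]
    out[cursor] = record[cursor]
    out["_airbyte_meta"] = {"errors": errors}
    return out
```

Under this scheme a record with a 70 KB `payload` field would land with `payload` set to NULL and a per-record error message, instead of failing the whole sync or being silently dropped.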

@octavia-squidington-iii
Collaborator

destination-oracle-strict-encrypt test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 12mn27s

@edgao
Contributor

edgao commented Jul 6, 2023

sgtm :shipit:

@octavia-squidington-iii
Collaborator

destination-pubsub test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 06mn46s

@octavia-squidington-iii
Collaborator

destination-vertica test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 01mn31s

@octavia-squidington-iii
Collaborator

destination-mqtt test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 07mn44s

@octavia-squidington-iii
Collaborator

destination-azure-blob-storage test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 06mn30s

@octavia-squidington-iii
Collaborator

destination-keen test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 04mn47s

@octavia-squidington-iii
Collaborator

destination-starburst-galaxy test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 12mn02s

@octavia-squidington-iii
Collaborator

destination-yugabytedb test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 11mn28s

@octavia-squidington-iii
Collaborator

destination-exasol test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 06mn57s

@octavia-squidington-iii
Collaborator

destination-dev-null test report (commit 8e65341d0e) - ❌

⏲️ Total pipeline duration: 03mn33s

@octavia-squidington-iii
Collaborator

destination-redshift test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 01mn33s

@octavia-squidington-iii
Collaborator

destination-doris test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 04mn14s

@octavia-squidington-iii
Collaborator

destination-dynamodb test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 16mn00s

@octavia-squidington-iii
Collaborator

destination-gcs test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 09mn16s

@octavia-squidington-iii
Collaborator

destination-r2 test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 06mn01s

@octavia-squidington-iii
Collaborator

destination-azure-blob-storage test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 08mn43s

@octavia-squidington-iii
Collaborator

destination-s3-glue test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 08mn58s

@octavia-squidington-iii
Collaborator

destination-mysql test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 41mn14s

@octavia-squidington-iii
Collaborator

destination-redpanda test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 08mn40s

@octavia-squidington-iii
Collaborator

destination-iceberg test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 533mn08s

@octavia-squidington-iii
Collaborator

destination-bigquery test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 27mn31s

@octavia-squidington-iii
Collaborator

destination-snowflake test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 38mn46s

@octavia-squidington-iii
Collaborator

destination-kinesis test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 09mn37s

@octavia-squidington-iii
Collaborator

destination-csv test report (commit c6b4ef0732) - ✅

⏲️ Total pipeline duration: 05mn33s

@octavia-squidington-iii
Collaborator

destination-kafka test report (commit c6b4ef0732) - ❌

⏲️ Total pipeline duration: 09mn46s

@evantahler
Contributor Author

Enough of the tests for other connectors are passing, and the builds all seem to be OK... so... Merging!

@evantahler
Contributor Author

/approve-and-merge reason="enough of the connector tests are passing"

@octavia-approvington
Contributor

It's time
fine lets go

@octavia-approvington octavia-approvington merged commit b81cc03 into master Jul 14, 2023
26 of 70 checks passed
@octavia-approvington octavia-approvington deleted the evan/redshift-fails-on-big-data branch July 14, 2023 19:27
efimmatytsin pushed a commit to scentbird/airbyte that referenced this pull request Jul 27, 2023
… too large, rather than silently skipping records and succeeding (airbytehq#27993)

* `destination-redshift` will fail syncs if records or properties are too large, rather than silently skipping records and succeding

* Bump version

* remove tests that don't matter any more

* more test removal

* more test removal

---------

Co-authored-by: Augustin <augustin@airbyte.io>
@alexnikitchuk
Contributor

Thanks for the review @edgao!

what do you think of keeping this behavior available as a config option? I'm nervous about sources that don't have the ability to do that sort of filtering

Given the choice, I didn't want to go with a config option, because whenever possible I want the behavior of our destinations to match. Snowflake and BQ have the same problem today, and one day we should tackle this problem holistically, rather than with a piecemeal per-destination solution.

For example, once we get one-stream-one-table released (#26028), we could add a feature to every destination that handles per-field and per-record size limitations. We could just write the PK and cursor for the record, and then include an _airbyte_metadata.error = ['data for column "foo" too large']. This would work for every destination, and not require any config settings. 1s1t solves every problem ;)

This looks like a breaking change. IMO better to have the option to switch back to the old behavior.

@jablonskijakub
Contributor

jablonskijakub commented Aug 21, 2023

Thanks @evantahler for your PR. tbh I was not aware of the issue with Redshift's limits on the SUPER type until I upgraded my connector to your version.
Unfortunately it is not quite the solution I would prefer, as I need to load all rows from a source db. I was wondering if it would be possible to allow copying when a row is < 1 MB but varchar fields exceed the limit, and to block normalisation so that the data remains in the raw format? What could be a solution here otherwise?

ok, I already know that it is not possible. https://docs.aws.amazon.com/redshift/latest/dg/limitations-super.html#

a workaround would be appreciated though :)

@evantahler
Contributor Author

We've got a feature on the roadmap to use our new features in Destinations V2 (#26028) to provide a workaround! We'll import the columns we can, and include an error message, per record, for any column which has a field that is too large for the destination.

Would you prefer the too-large field with the error to have NULL or a truncated value?

@jablonskijakub
Contributor

Sounds great! Personally I would prefer to have it as NULL, since that would indicate that the given field was too big for the destination; with a truncated value, I suspect it might slip through without being noticed.

If I understood you correctly, Destinations V2 would be used instead of the normalisation step which is being used atm. I believe my issue is slightly different, as my payload field is in a jsonb format and is being read as a single string value (please correct me if I'm wrong), thus exceeding 65535 bytes for a single value within a SUPER data type.
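The jsonb-as-single-string case above can be checked directly: what matters is the UTF-8 byte length of the serialized value, which is what trips the per-value SUPER limit. A quick illustrative check (the constant and function are assumptions for this example, not connector code):

```python
import json

SUPER_VALUE_LIMIT = 65535  # bytes, per the Redshift SUPER limitations page

def fits_in_super_value(payload) -> bool:
    """True if the payload, serialized as a single JSON value, stays within
    Redshift's per-value SUPER limit (as discussed in this thread)."""
    return len(json.dumps(payload).encode("utf-8")) <= SUPER_VALUE_LIMIT

# A jsonb blob read as one string fails this check as soon as its
# serialized form exceeds 65,535 bytes, regardless of total row size.
```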

Labels
area/connectors (Connector related issues), area/documentation (Improvements or additions to documentation), connectors/destination/redshift
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Redshift reports successful sync after ignoring some of the records
7 participants