Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Destination MSSQL: fix issue with unicode symbols handling (ex.\u2028 showed as ?) #3671

Merged
merged 5 commits into from
Jun 1, 2021

Conversation

etsybaev
Copy link
Contributor

@etsybaev etsybaev commented May 27, 2021

What

#3552 MSSQL Destination: Fix newline handling

How

Fixed MSSQL _airbyte_data field's type to support unicode.

This is for Microsoft SQL Server:
NVARCHAR is Unicode - 2 bytes per character, therefore max. of 1 billion characters; will handle East Asian, Arabic, Hebrew, Cyrillic etc. characters just fine.
VARCHAR is non-Unicode - 1 byte per character, max. capacity is 2 billion characters, but limited to the character set you're SQL Server is using, basically - no support for those languages mentioned before

Pre-merge Checklist

  • Run integration tests
  • Publish Docker images

Recommended reading order

  1. test.java
  2. component.ts
  3. the rest

@auto-assign auto-assign bot requested review from cgardens and davinchia May 27, 2021 19:46
@etsybaev etsybaev linked an issue May 27, 2021 that may be closed by this pull request
Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for also updating the docs!

Once you've addressed the one comment, please bump the connector version like described here and update your PR with the new version: https://docs.airbyte.io/contributing-to-airbyte/building-new-connector#updating-a-connector

then publish the connector like described in the link.

@@ -53,7 +53,7 @@ public String createTableQuery(String schemaName, String tableName) {
+ "WHERE s.name = '%s' AND t.name = '%s') "
+ "CREATE TABLE %s.%s ( \n"
+ "%s VARCHAR(64) PRIMARY KEY,\n"
+ "%s VARCHAR(MAX),\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChristopheDuong do we also need to do the same change in DBT i.e: writing strings as NVARCHARs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normalization is not handling MSSQL yet (so nothing to do there, unless implementing it...), otherwise I don't know how the mssql-dbt adapter plugin handle strings atm

@@ -53,7 +53,7 @@ public String createTableQuery(String schemaName, String tableName) {
+ "WHERE s.name = '%s' AND t.name = '%s') "
+ "CREATE TABLE %s.%s ( \n"
+ "%s VARCHAR(64) PRIMARY KEY,\n"
+ "%s VARCHAR(MAX),\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etsybaev can you leave a comment in the code describing why NVARCHAR is used since we typically use VARCHAR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada done. Added comment to code and documentation. Thanks

@etsybaev
Copy link
Contributor Author

Hi @sherifnada.
I've updated versions but found it and updated in 2 places. Not sure if I also had to update it in airbyte-config/init/src/main/resources/seed/destination_definitions.yaml ?
Could you please also clarify what is the workflow for pushing images? the command "./tools/bin/check_images_exist.sh" works for me locally, but obviously failed in the build for this PR as the image has not been released globally. What are the next steps from my side?
Many thanks in advance!

@sherifnada
Copy link
Contributor

@etsybaev to publish an image you need to:

  1. before bumping the image version, have a passing build and PR approvals
  2. bump the image in Dockerfile, YAML, and JSON config files. The build will fail -- it's OK because we know the only change we made was version bump.
  3. Run the /publish command
  4. once publish is successful, merge to master

@etsybaev
Copy link
Contributor Author

etsybaev commented Jun 1, 2021

/test connector=connectors/destination-mssql

🕑 connectors/destination-mssql https://github.com/airbytehq/airbyte/actions/runs/894927119
✅ connectors/destination-mssql https://github.com/airbytehq/airbyte/actions/runs/894927119

@etsybaev
Copy link
Contributor Author

etsybaev commented Jun 1, 2021

/publish connector=connectors/destination-mssql

🕑 connectors/destination-mssql https://github.com/airbytehq/airbyte/actions/runs/894970067
✅ connectors/destination-mssql https://github.com/airbytehq/airbyte/actions/runs/894970067

@etsybaev etsybaev merged commit 1fed4b0 into master Jun 1, 2021
@etsybaev etsybaev deleted the tsybaiev/3552 branch June 1, 2021 06:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MSSQL Destination: Fix newline handling
3 participants