[Source-Mssql] Format Datetime and Datetime2 datatypes to 6-digit microsecond precision #32573

Merged (9 commits) on Nov 30, 2023

Contributor

@nguyenaiden nguyenaiden commented Nov 15, 2023

Description: SQL Server's datetime2 data type supports up to 7 digits of fractional-second precision. Some destinations, BigQuery in particular, reject a record sent with this precision (e.g. 2023-11-08T01:20:11.3733338) and throw an error. The record is effectively dropped on the destination side, because the entry shows up as null.
This PR also formats datetime with the same precision for consistency.

The temporary solution is to enforce 6-digit precision after reading the record from the source and before emitting it.

| Mssql Record | Airbyte Record |
| --- | --- |
| 9999-12-31T13:00:04.12345 | 9999-12-31T13:00:04.123450 |
| 2023-11-08T01:20:11.3733338 | 2023-11-08T01:20:11.373333 |

Note: This is to temporarily unblock a potential customer. This will be reverted when Destination/Platform start to handle this issue on their end.
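
The conversion in the table above can be sketched in plain Java. This is a minimal illustration, not the connector code: the class name, the `toMicros` helper, and the simplified pattern `yyyy-MM-dd'T'HH:mm:ss.SSSSSS` are assumptions made for the example. It shows that a six-letter `S` field pads shorter fractions with zeros and truncates (rather than rounds) longer ones.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical class for illustration only; not part of the connector.
class MicrosecondPrecisionSketch {

  // Assumed pattern: always prints exactly six fractional digits.
  private static final DateTimeFormatter MICROS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSSSS");

  // Parses an ISO-8601 local date-time (0 to 9 fractional digits)
  // and re-renders it with microsecond precision.
  static String toMicros(final String mssqlValue) {
    return LocalDateTime.parse(mssqlValue).format(MICROS);
  }

  public static void main(final String[] args) {
    // 5 fractional digits are padded to 6: 9999-12-31T13:00:04.123450
    System.out.println(toMicros("9999-12-31T13:00:04.12345"));
    // 7 fractional digits are truncated to 6: 2023-11-08T01:20:11.373333
    System.out.println(toMicros("2023-11-08T01:20:11.3733338"));
  }
}
```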



@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Nov 15, 2023
@nguyenaiden nguyenaiden marked this pull request as ready for review November 15, 2023 20:37
@nguyenaiden nguyenaiden requested a review from a team as a code owner November 15, 2023 20:37
@nguyenaiden nguyenaiden changed the title [Source-Mssql] Format Datetime and Datetime2 datatypes to 6 digit Microsecond precision [Source-Mssql] Format Datetime and Datetime2 datatypes to 6-digit microsecond precision Nov 15, 2023
Contributor

@stephane-airbyte stephane-airbyte left a comment

just one small comment

try {
  node.put(columnName, getObject(resultSet, index, LocalDateTime.class).format(microsecondsFormatter));
} catch (final Exception e) {
  // for backward compatibility

Contributor

What does that mean? Should we at least log something here?

Contributor Author

After thinking about it, I'm going to remove this section altogether. This was done inside AbstractJdbcCompatibleSourceOperations#putTimestamp for generic JDBC sources because we didn't know whether we could parse the result into the appropriate Datetime/Date/Timestamp Java type. But since we know what an MSSQL datetime result looks like, it isn't necessary here.

private final Set<String> BINARY = Set.of("VARBINARY", "BINARY");
private final Set<String> DATETIME_TYPES = Set.of("DATETIME", "DATETIME2", "SMALLDATETIME");

Contributor

I also see date, datetimeoffset, and time.
Are any others relevant?

Contributor Author

Those 3 are the only ones that get translated into timestamp. Datetimeoffset is timestamp with timezone, and the remaining two, date and time, have their own translations.
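
As a rough sketch of the split described in this reply, the snippet below classifies a SQL Server column type name into the kind of value it would be emitted as. The `DATETIME_TYPES` set is taken from the diff; the class name, the `Target` enum, and the `classify` helper are hypothetical and are not Airbyte's actual types or API.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical class for illustration only; not part of the connector.
class MssqlTemporalTypeSketch {

  // From the diff: the types that are translated into a timestamp without a time zone.
  private static final Set<String> DATETIME_TYPES = Set.of("DATETIME", "DATETIME2", "SMALLDATETIME");

  // Assumed labels for the example; not Airbyte's real type names.
  enum Target { TIMESTAMP, TIMESTAMP_WITH_TIMEZONE, DATE, TIME, OTHER }

  // Maps a column type name (case-insensitive) to its assumed translation target.
  static Target classify(final String columnTypeName) {
    final String name = columnTypeName.toUpperCase(Locale.ROOT);
    if (DATETIME_TYPES.contains(name)) {
      return Target.TIMESTAMP;
    }
    switch (name) {
      case "DATETIMEOFFSET":
        return Target.TIMESTAMP_WITH_TIMEZONE;
      case "DATE":
        return Target.DATE;
      case "TIME":
        return Target.TIME;
      default:
        return Target.OTHER;
    }
  }

  public static void main(final String[] args) {
    System.out.println(classify("datetime2"));      // TIMESTAMP
    System.out.println(classify("DATETIMEOFFSET")); // TIMESTAMP_WITH_TIMEZONE
    System.out.println(classify("date"));           // DATE
  }
}
```

Only the first branch would need the 6-digit formatting discussed in this PR; the other branches are shown just to make the distinction concrete.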

private static final String DATETIMEOFFSET = "DATETIMEOFFSET";
private static final String TIME_TYPE = "TIME";
private static final String SMALLMONEY_TYPE = "SMALLMONEY";
private static final String GEOMETRY = "GEOMETRY";
private static final String GEOGRAPHY = "GEOGRAPHY";
private static final String DEBEZIUM_DATETIMEOFFSET_FORMAT = "yyyy-MM-dd HH:mm:ss XXX";

private static final String DATETIME_FORMAT_MICROSECONDS = "yyyy-MM-dd'T'HH:mm:ss[.][SSSSSS]";

Contributor

Why 6?

Contributor Author

We're hard-coding it to 6 to fit most of the popular destinations for now. This work is slated to be reverted once Destination/platform establish a precision standard on their end.
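
To make the hard-coded 6 concrete, here is a hedged sketch of what a configurable precision could look like if a standard is established later. Nothing here comes from the PR: the class name and the `withFractionDigits` helper are hypothetical, and the sketch relies only on the standard `DateTimeFormatterBuilder.appendFraction` call.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical class for illustration only; not part of the connector.
class FractionPrecisionSketch {

  // Builds a formatter that prints exactly `digits` fractional digits (1 to 9).
  // Extra digits are truncated, missing digits are padded with zeros.
  static DateTimeFormatter withFractionDigits(final int digits) {
    return new DateTimeFormatterBuilder()
        .appendPattern("yyyy-MM-dd'T'HH:mm:ss")
        .appendFraction(ChronoField.NANO_OF_SECOND, digits, digits, true)
        .toFormatter();
  }

  public static void main(final String[] args) {
    final LocalDateTime value = LocalDateTime.parse("2023-11-08T01:20:11.3733338");
    System.out.println(value.format(withFractionDigits(6))); // 2023-11-08T01:20:11.373333
    System.out.println(value.format(withFractionDigits(3))); // 2023-11-08T01:20:11.373
    System.out.println(value.format(withFractionDigits(7))); // 2023-11-08T01:20:11.3733338
  }
}
```

With a helper like this, the precision becomes a single parameter rather than a pattern string, which would make a later change of the standard a one-line edit.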

@@ -6,7 +6,7 @@ plugins {
 airbyteJavaConnector {
     cdkVersionRequired = '0.5.0'
     features = ['db-sources']
-    useLocalCdk = false
+    useLocalCdk = true

Contributor

ℹ️ Reminder to change before merging

@nguyenaiden nguyenaiden force-pushed the mssql-datetime-patch branch 2 times, most recently from adc9e2b to ddcddc7 Compare November 30, 2023 18:43
Contributor Author

nguyenaiden commented Nov 30, 2023

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/7051227850
✅ Successfully published Java CDK version=0.6.2!

Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation CDK Connector Development Kit connectors/source/mssql connectors/source/mssql-strict-encrypt