Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

馃悰 Source Snowflake: Timestamps coerced to UTC and lose the original timezone offsets #31568

Closed
1 task
aaronsteers opened this issue Oct 18, 2023 · 0 comments 路 Fixed by #31631
Closed
1 task
Labels
area/connectors Connector related issues autoteam connectors/source/snowflake needs-triage team/db-dw-sources Backlog for Database and Data Warehouse Sources team type/bug Something isn't working

Comments

@aaronsteers
Copy link
Collaborator

aaronsteers commented Oct 18, 2023

Connector Name

source-snowflake

Connector Version

0.2.1

What step the error happened?

None

Relevant information

The Snowflake Source connector uses a timestamp conversion that loses the original offset because Timestamp and getTimestamp() are not offset-aware:

protected void putTimestampWithTimezone(final ObjectNode node, final String columnName, final ResultSet resultSet, final int index)
throws SQLException {
final Timestamp timestamp = resultSet.getTimestamp(index);
node.put(columnName, DateTimeConverter.convertToTimestampWithTimezone(timestamp));
}

The base implementation looks correct, so this might be as easy as deleting Snowflake's override of the base implementation:

  protected void putTimestampWithTimezone(final ObjectNode node, final String columnName, final ResultSet resultSet, final int index)
      throws SQLException {
    final OffsetDateTime timestamptz = getObject(resultSet, index, OffsetDateTime.class);
    final LocalDate localDate = timestamptz.toLocalDate();
    node.put(columnName, resolveEra(localDate, timestamptz.format(TIMESTAMPTZ_FORMATTER)));
  }

More context from: #29072 (comment)

@evantahler - Thanks for pinging me here. I had not previously seen this issue.

Regarding this point:

we explicitly got dinged for remapping timestamp_tz to UTC for all records though

In slack, I gather that this might be a source problem, not a destination problem. If that's the case, we can go back to writing the right data type, and solve the timezone thing on the sources?

I confirmed there is a problem in the Snowflake Source: the base class implementation for the source looks fine, but the Snowflake subclass overrides it in a way that coerces all timezones to UTC.

https://airbytehq-team.slack.com/archives/C046LMDDY6N/p1697506520126359?thread_ts=1695064900.052369&cid=C046LMDDY6N

Regarding #2 (loss of original timezone), I confirmed in the source code that source-snowflake is casting the value to a java.sql.Timestamp object via resultSet.getTimestamp(), and this type does not support timezone. To preserve original timezone, we'd need to change this to cast into something like the OffsetDateTime class instead.
The base class's implementation in the CDK looks correct, so we may just need to delete the Snowflake override.

Note: The fact that this exists in the Source doesn't necessarily mean there isn't also a problem in the destination. But it's safe to say that the simpler explanation to say this is a source problem at this point, not a destination.

I will re-open if I find any other issue.

Additional context here (internal link):

https://airbytehq-team.slack.com/archives/C046LMDDY6N/p1697506520126359?thread_ts=1695064900.052369&cid=C046LMDDY6N

Relevant log output

No response

Contribute

  • Yes, I want to contribute
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues autoteam connectors/source/snowflake needs-triage team/db-dw-sources Backlog for Database and Data Warehouse Sources team type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants