Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destination Snowflake v2: doublecheck type mapping #29072

Closed
edgao opened this issue Aug 3, 2023 · 5 comments
Closed

Destination Snowflake v2: doublecheck type mapping #29072

edgao opened this issue Aug 3, 2023 · 5 comments
Labels
team/destinations Destinations team's backlog

Comments

@edgao
Copy link
Contributor

edgao commented Aug 3, 2023

couple things from https://docs.google.com/document/d/1DyI8-V0H8fisEowhnmfpfXXbAQqhsAbaGkkvcbRd2S8/edit#heading=h.f9wnipto58qj

  • time with timezone - we map to varchar, is there a better type?
  • timestamp_tz / timestamp_ltz
    • make sure we're not changing the TZ to UTC (iirc this was happening in source-postgres, not the destination)
    • map to timestamp_ltz instead of timestamp_ntz? (what's the difference?)
  • number/integer/decimal - map to better types?
    • integer should be integer, obviously
    • for numeric columns - what should we do about precision/scale?
@edgao edgao added the team/destinations Destinations team's backlog label Aug 3, 2023
@evantahler
Copy link
Contributor

I think I would prefer to keep time-with-zone a timestamp-ish type, even if it is forced to UTC. Not having functions like DAY(...) work seems rough.

@edgao
Copy link
Contributor Author

edgao commented Aug 4, 2023

we explicitly got dinged for remapping timestamp_tz to UTC for all records though :/ so I'm not actually sure what they want

For TIMESTAMP_TZ [...] It's changing the timezone offset to UTC format for all the records

maybe we could map airbyte type time_with_timestamp to snowflake type timestamp_tz, and just prepend 1970-01-01 (probably confusing for users)? or... map to snowflake time, and add a second column the_column_ab_tz (probably also confusing, and makes everyones' lives more complicated)?

@evantahler
Copy link
Contributor

we explicitly got dinged for remapping timestamp_tz to UTC for all records though

In slack, I gather that this might be a source problem, not a destination problem. If that's the case, we can go back to writing the right data type, and solve the timezone thing on the sources?

@evantahler
Copy link
Contributor

@aaronsteers recently looked into this... I think things are OK.

@aaronsteers
Copy link
Collaborator

aaronsteers commented Oct 18, 2023

@evantahler - Thanks for pinging me here. I had not previously seen this issue.

Regarding this point:

we explicitly got dinged for remapping timestamp_tz to UTC for all records though

In slack, I gather that this might be a source problem, not a destination problem. If that's the case, we can go back to writing the right data type, and solve the timezone thing on the sources?

I confirmed there is a problem in the Snowflake Source: the base class implementation for the source looks fine, but the Snowflake subclass overrides it in a way that coerces all timezones to UTC.

https://airbytehq-team.slack.com/archives/C046LMDDY6N/p1697506520126359?thread_ts=1695064900.052369&cid=C046LMDDY6N

Regarding #2 (loss of original timezone), I confirmed in the source code that source-snowflake is casting the value to a java.sql.Timestamp object via resultSet.getTimestamp(), and this type does not support timezone. To preserve original timezone, we'd need to change this to cast into something like the OffsetDateTime class instead.
The base class's implementation in the CDK looks correct, so we may just need to delete the Snowflake override.

Note: The fact that this exists in the Source doesn't necessarily mean there isn't also a problem in the destination. But it's safe to say that the simpler explanation to say this is a source problem at this point, not a destination.

I will re-open if I find any other issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team/destinations Destinations team's backlog
Projects
None yet
Development

No branches or pull requests

3 participants