Support for data type timezone conversion to UTC #249
Hi, thanks for the great user story! I already try to map timestamps as best as I can, and I fully empathise with your use case. You wouldn't happen to know which ODBC data types I need to test / investigate? I partly fear that these might be types which are not part of the ODBC standard, but driver-specific extensions. If they are part of the standard, it is most likely just an oversight on my part not handling them accordingly.

Driver-specific types are more tricky. Their meaning and interpretation is not agreed on across drivers (well, I guess that is what makes them specific). I would be more hesitant to implement logic for that. Not saying no just now, but it is something to think about carefully. My bandwidth is a bit low at the moment, so it may be a while before I get to investigate this.

I don't know about your ODBC or Rust skills, but if you want I can give you pointers on how to drive this story a bit. Otherwise, just querying the table you likely have, with verbose logging enabled, would help. If you want to go the extra mile, use …

Cheers, Markus
For SQL Server it seems to be a vendor/driver-specific implementation with a type called `SQL_SS_TIMESTAMPOFFSET_STRUCT`:

```c
typedef struct tagSS_TIMESTAMPOFFSET_STRUCT {
    SQLSMALLINT  year;
    SQLUSMALLINT month;
    SQLUSMALLINT day;
    SQLUSMALLINT hour;
    SQLUSMALLINT minute;
    SQLUSMALLINT second;
    SQLUINTEGER  fraction;
    SQLSMALLINT  timezone_hour;
    SQLSMALLINT  timezone_minute;
} SQL_SS_TIMESTAMPOFFSET_STRUCT;
```
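To illustrate what normalizing such a value could look like, here is a minimal Rust sketch (hypothetical, not odbc2parquet code) that takes the struct's wall-clock fields plus the timezone offset and produces a UTC Unix timestamp, using the standard civil-date conversion. The function names and the decision to ignore the `fraction` field are assumptions for the sake of the example:

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (standard civil-date conversion algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let yoe = y - era * 400; // year of era, [0, 399]
    let mp = (m + 9) % 12; // month shifted so March = 0, February = 11
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// UTC Unix timestamp (seconds) from local wall-clock fields plus the
/// stored offset. `tz_hour`/`tz_minute` mirror the struct above; since
/// the offset means "local = UTC + offset", we subtract it.
/// The sub-second `fraction` field is ignored here for brevity.
fn to_utc_unix(
    year: i16, month: u16, day: u16,
    hour: u16, minute: u16, second: u16,
    tz_hour: i16, tz_minute: i16,
) -> i64 {
    let days = days_from_civil(year as i64, month as i64, day as i64);
    let local = days * 86_400
        + hour as i64 * 3_600
        + minute as i64 * 60
        + second as i64;
    let offset = tz_hour as i64 * 3_600 + tz_minute as i64 * 60;
    local - offset
}

fn main() {
    // 2023-06-15 14:30:00 +02:00 is 2023-06-15 12:30:00 UTC.
    println!("{}", to_utc_unix(2023, 6, 15, 14, 30, 0, 2, 0));
}
```

Whatever the real implementation looks like, the key point is the same: once the offset is folded in, the value fits a plain Parquet TIMESTAMP with UTC semantics.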
I couldn't find any official documentation for Postgres right away. I will discuss with the team what the best option is to tackle this issue.
I see, this is why the timestamp logic of … We could then introduce database-specific logic to handle "other" types. Just a thought. I'll sleep on it.
👍
If you could find out what the struct for PostgreSQL would look like, we may be able to support that, too.
I could not find any specific documentation about that, I guess because … I guess the best way would be to try this out. I currently don't have the time to build a test setup for PostgreSQL with ODBC …
Hello @leo-schick, … Cheers, Markus
Tests with Postgres have shown that, to an ODBC client, both …
@pacman82 There is a way to tell Postgres to export the data in a specific timezone, e.g. UTC. The …
In my tests it seems that Postgres does convert the timestamp to UTC. It is just that from the client side I have no way of telling whether this has instant semantics or not. So the …

Any ideas for configuring ODBC are better placed with the maintainers of the PostgreSQL ODBC driver. Yet, as written before, at least to me it seems to already do what you want.
@leo-schick Almost forgot to ask. Does the tool work for you now?
In TSQL/SQL Server, the `datetimeoffset` type (which is practically a datetime with a timezone offset) is currently converted into a Parquet string. I don't like that because the data stored there is quite large and might be bad to sort. Postgres has a similar data type called `timestamp with time zone`.

Unfortunately, Parquet does not support a logical data type which carries a time zone, but there are several considerations on how to deal with different time zones, see here: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp One suggestion is to convert everything to UTC.
I would like to have the option to tell odbc2parquet that all `datetimeoffset` / `timestamp with time zone` data types shall be converted into a UTC TIMESTAMP in the Parquet export instead of into a Parquet string. I suggest adding this option as an additional parameter.
The benefit is that filtering or sorting on a Parquet timestamp is much faster than on a Parquet string.
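As a rough illustration of why (a standalone Rust sketch, not odbc2parquet code): a timestamp column compares fixed-width integers, while a string column compares variable-length byte sequences, even when the two orderings agree. The `as_string_key` helper here is purely hypothetical, zero-padding the number so lexicographic order matches numeric order:

```rust
/// Render a nonnegative Unix timestamp the way a string column might store
/// it, zero-padded so lexicographic order matches numeric order.
fn as_string_key(t: i64) -> String {
    format!("{t:020}")
}

fn main() {
    let ints: Vec<i64> = vec![1_686_832_200, 0, 951_782_400];

    // Sorting integers: one cheap i64 comparison per pair.
    let mut by_int = ints.clone();
    by_int.sort_unstable();

    // Sorting strings: byte-by-byte comparison over 20-char keys.
    let mut by_str: Vec<String> = ints.iter().map(|&t| as_string_key(t)).collect();
    by_str.sort_unstable();

    // Both orderings agree; the integer sort just does far less work
    // per comparison (and the column is smaller on disk, too).
    let reencoded: Vec<String> = by_int.iter().map(|&t| as_string_key(t)).collect();
    assert_eq!(reencoded, by_str);
    println!("orders agree: {by_int:?}");
}
```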