[Python] Add timezone information when printing TimestampArray #39315
Comments
It looks like we already print the timezone?
arrow::as_arrow_array(Sys.time())
#> Array
#> <timestamp[us, tz=America/Halifax]>
#> [
#> 2023-12-20 13:33:42.632165
#> ]
...and it also looks like our abbreviated printer displays it too:
dplyr::glimpse(arrow::arrow_table(ts = Sys.time()))
#> Table
#> 1 rows x 1 columns
#> $ ts <timestamp[us, tz=America/Halifax]> 2023-12-20 09:34:44 The |
Thanks for the info Dewey! |
While we are improving the repr of arrays, we can probably leave out the boilerplate in |
Some variants:
(maybe the first is actually fine, and also the most succinct) |
I think this and other improvements to the way timezone-aware timestamps are printed would be very helpful for users. The way PyArrow currently prints timezone-aware timestamp values can be very confusing. For example, you might try to create a Table like this:
from datetime import datetime
import pyarrow as pa
t = pa.table(
    {'ts': [datetime(1969, 1, 1, 1, 1, 1)]},
    schema=pa.schema([("ts", pa.timestamp("us", tz="America/New_York"))])
)
When you print it, it looks like the time represents 01:01:01 EST:
But on closer inspection, it actually represents 01:01:01 UTC, which converts to 20:01:01 EST on the previous day:
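As a quick sanity check of that conversion, here is a minimal sketch using only the Python standard library. It assumes a fixed UTC-5 offset for EST (which holds for January 1969) rather than a tz database lookup, and the variable names are illustrative, not taken from PyArrow.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for America/New_York in January 1969.
EST = timezone(timedelta(hours=-5), "EST")

# The value as stored: the naive datetime from the example, read as UTC.
stored = datetime(1969, 1, 1, 1, 1, 1, tzinfo=timezone.utc)

# The same instant expressed as New York wall time.
wall = stored.astimezone(EST)

print(stored.strftime("%Y-%m-%d %H:%M:%S %Z"))
print(wall.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

The second line printed is the wall time a user in New York would expect to see, which falls on the previous calendar day.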
|
We could display local time (using |
That would require |
Fair point. It seems it is not. I suppose the same logic could be implemented in vanilla Python to avoid new dependencies. |
The main issue (as was discussed in the original issue #30117, before we closed that after adding the "Z" suffix) is that showing the local time requires a timezone database to be present, and this is not guaranteed. (I think the requirement on Compute could probably be fixed by moving or replicating the relevant logic.)
At the time the original issue was discussed, the tz database wasn't yet supported on Windows, but that has since improved (although the user still needs to download it manually and either put it in the correct location or point pyarrow to it).
We could decide to print wall time if a tzdb is available, and otherwise fall back on showing the UTC values with a "Z" suffix. That would be an annoying inconsistency in the pretty printing, but it would at least make the output less confusing for many users on Linux/macOS. |
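The fallback idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not how Arrow's C++ PrettyPrint works: `format_timestamp_us` is a hypothetical helper, the standard-library `zoneinfo` module stands in for the tz database lookup, and the exact output format is a guess.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_timestamp_us(value_us: int, tz_name: str) -> str:
    """Format microseconds since the Unix epoch (hypothetical helper)."""
    utc_value = EPOCH + timedelta(microseconds=value_us)
    try:
        zone = ZoneInfo(tz_name)
    except (ZoneInfoNotFoundError, ValueError):
        # No usable tz database entry: show the UTC value with a "Z" suffix.
        return utc_value.strftime("%Y-%m-%d %H:%M:%S") + "Z"
    # tz database entry found: show wall time together with its UTC offset.
    return utc_value.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S%z")


# 1969-01-01 01:01:01 UTC expressed as microseconds since the epoch.
value_us = (datetime(1969, 1, 1, 1, 1, 1, tzinfo=timezone.utc) - EPOCH) // timedelta(microseconds=1)

print(format_timestamp_us(value_us, "America/New_York"))
print(format_timestamp_us(value_us, "Not/AZone"))  # unknown zone, takes the fallback path
```

Printing the UTC offset next to the wall time keeps the two output forms distinguishable, which softens the inconsistency mentioned above.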
Thanks for the reminder Joris! (I've forgotten about that discussion) |
Describe the enhancement requested
PrettyPrint for Timestamp is currently printing values in UTC even when the timezone is defined. This can get confusing, and there is a PR open with a simple fix of adding "Z" to the end of the string when a timezone is defined: #39272
This way we can at least see which timezone the data is printed in.
It would also be good to add timezone information when printing the array, for example:
In Python this can be done by adding a separate __repr__ to the TimestampArray class.
Would something similar also be needed for R, or is timezone information available when printing an Array? cc @paleolimbot
Component(s)
Python, R