-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-4505: [C++] adding pretty print for dates, times, and timestamps #4268
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
It looks almost good to me.
88de6cb
to
b53c013
Compare
b53c013
to
9086c9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think the timezone (which is an arbitrary string) shouldn't be appended to the pretty-printed values.
cpp/src/arrow/pretty_print.cc
Outdated
Status WriteDataValues(const TimestampArray& array) { | ||
const int64_t* data = array.raw_values(); | ||
const auto type = static_cast<const TimestampType*>(array.type().get()); | ||
const auto fmt = "%F %T" + type->timezone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timezone is part of the type, it shouldn't be appended to the pretty-printed values IMHO.
cpp/src/arrow/pretty_print-test.cc
Outdated
1970-01-02 03:04:05.000678Z, | ||
null | ||
])expected"; | ||
CheckPrimitive<TimestampType, int64_t>(timestamp(TimeUnit::MICRO, "Z"), {0, 10}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While "Z" is a valid suffix to ISO 8601 timestamps, the name of the timezone wouldn't be "Z", it would be something like "UTC".
Besides, if the timezone is named e.g. "Europe/Paris", you don't want to append that to the formatted timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Not for this PR, but it would then be nice to actually somehow show the type information in the repr of the array:
In [55]: pa.array([pd.Timestamp("2012-01-01", tz='Europe/Paris')])
Out[55]:
<pyarrow.lib.TimestampArray object at 0x7ffa5d5060e8>
[
1325376000000000
]
because currently you don't see the timezone, and also not the unit of the timestamp, without actually printing the array's type as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche That looks more like a Python-specific issue.
Thank you @bkietz. This can be merged when CI is green. |
<TimeUnit>
since epoch)<TimeUnit>
since midnight)<TimeUnit>
since midnight)