Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-30117: [C++][Python] Add "Z" to the end of timestamp print string when tz defined #39272

Merged
merged 7 commits into from
Jan 8, 2024

Conversation

AlenkaF
Copy link
Member

@AlenkaF AlenkaF commented Dec 18, 2023

What changes are included in this PR?

This PR updates the PrettyPrint for Timestamp type so that "Z" is printed at the end of the output string if the timezone has been defined. This way we add minimum information about the values being stored in UTC.

Are these changes tested?

Yes.

Are there any user-facing changes?

There is a change in how TimestampArray prints out the data. With this change "Z" would be added to the end of the string if the timezone is defined.

Copy link

⚠️ GitHub issue #30117 has been automatically assigned in GitHub to PR creator.

@AlenkaF AlenkaF marked this pull request as ready for review December 18, 2023 10:36
Copy link
Member

@rok rok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good. Let's just check that weird zones don't cause issues.

python/pyarrow/tests/test_types.py Outdated Show resolved Hide resolved
cpp/src/arrow/util/formatting_util_test.cc Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Dec 18, 2023
Co-authored-by: Rok Mihevc <rok@mihevc.org>
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Dec 18, 2023
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Dec 18, 2023
Co-authored-by: Rok Mihevc <rok@mihevc.org>
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Dec 18, 2023
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Dec 18, 2023
@rok
Copy link
Member

rok commented Dec 18, 2023

Forgot to ask: does this currently print timestamp's type when printing the array? And if it does can you add a test that it prints the correct timezone. If not we can leave it for the follow-up.

@AlenkaF
Copy link
Member Author

AlenkaF commented Dec 18, 2023

Forgot to ask: does this currently print timestamp's type when printing the array? And if it does can you add a test that it prints the correct timezone. If not we can leave it for the follow-up.

No, when printing the array there is no info about the type, one should still inspect the type separately:

>>> import pyarrow as pa
>>> a = pa.array([0], pa.timestamp('s', tz='+02:00'))
>>> a
<pyarrow.lib.TimestampArray object at 0x131399fc0>
[
  1970-01-01 00:00:00Z
]

>>> a.type
TimestampType(timestamp[s, tz=+02:00])

as mentioned on the issue, we can add this if find important.

@rok
Copy link
Member

rok commented Dec 18, 2023

as mentioned on the issue, we can add this if find important.

If there's an easy way to include timezone it would be great!

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting merge Awaiting merge labels Dec 19, 2023
@jorisvandenbossche
Copy link
Member

does this currently print timestamp's type when printing the array?

Given this is a completely separate code path (this is something defined in the python repr, not the C++ pretty printer), let's keep that for a separate PR?
(but yes, I agree it would be good to do that!)

@rok
Copy link
Member

rok commented Dec 19, 2023

Given this is a completely separate code path (this is something defined in the python repr, not the C++ pretty printer), let's keep that for a separate PR? (but yes, I agree it would be good to do that!)

Agreed, let's make another issue for it.

@AlenkaF
Copy link
Member Author

AlenkaF commented Dec 19, 2023

Given this is a completely separate code path (this is something defined in the python repr, not the C++ pretty printer), let's keep that for a separate PR? (but yes, I agree it would be good to do that!)

Agreed, let's make another issue for it.

It would be good to add that to R also, if I am not mistaken. Will create an issue for it 👍

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Dec 20, 2023
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Dec 20, 2023
@AlenkaF
Copy link
Member Author

AlenkaF commented Dec 20, 2023

Opened an issue for printing timezone information in TimestampArray: #39315
cc @rok @jorisvandenbossche

@jorisvandenbossche jorisvandenbossche merged commit a288364 into apache:main Jan 8, 2024
33 of 35 checks passed
@jorisvandenbossche jorisvandenbossche removed the awaiting change review Awaiting change review label Jan 8, 2024
@github-actions github-actions bot added the awaiting merge Awaiting merge label Jan 8, 2024
@jorisvandenbossche
Copy link
Member

Thanks!

@AlenkaF AlenkaF deleted the gh-30117-print-tz branch January 8, 2024 13:32
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit a288364.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.

clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
…tring when tz defined (apache#39272)

### What changes are included in this PR?

This PR updates the PrettyPrint for Timestamp type so that "Z" is printed at the end of the output string if the timezone has been defined. This way we add minimum information about the values being stored in UTC.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

There is a change in how `TimestampArray` prints out the data. With this change "Z" would be added to the end of the string if the timezone is defined.
* Closes: apache#30117

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Rok Mihevc <rok@mihevc.org>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…tring when tz defined (apache#39272)

### What changes are included in this PR?

This PR updates the PrettyPrint for Timestamp type so that "Z" is printed at the end of the output string if the timezone has been defined. This way we add minimum information about the values being stored in UTC.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

There is a change in how `TimestampArray` prints out the data. With this change "Z" would be added to the end of the string if the timezone is defined.
* Closes: apache#30117

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Rok Mihevc <rok@mihevc.org>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Feb 28, 2024
…tring when tz defined (apache#39272)

### What changes are included in this PR?

This PR updates the PrettyPrint for Timestamp type so that "Z" is printed at the end of the output string if the timezone has been defined. This way we add minimum information about the values being stored in UTC.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

There is a change in how `TimestampArray` prints out the data. With this change "Z" would be added to the end of the string if the timezone is defined.
* Closes: apache#30117

Lead-authored-by: AlenkaF <frim.alenka@gmail.com>
Co-authored-by: Alenka Frim <AlenkaF@users.noreply.github.com>
Co-authored-by: Rok Mihevc <rok@mihevc.org>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
felipecrv added a commit that referenced this pull request Apr 6, 2024
…' when formatting timestamps (#41045)

### What changes are included in this PR?

A test that reproduces an issue found by the fuzzer and a fix for it.

### Are these changes tested?

 - A test
 - Comments clarifying somethings around `formatting.h`
 - Increasing the size of the local buffer used to format timestamps

The issue was introduced only recently (unreleased): #39272
* GitHub Issue: #41044

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
tolleybot pushed a commit to tmct/arrow that referenced this pull request May 2, 2024
…the 'Z' when formatting timestamps (apache#41045)

### What changes are included in this PR?

A test that reproduces an issue found by the fuzzer and a fix for it.

### Are these changes tested?

 - A test
 - Comments clarifying somethings around `formatting.h`
 - Increasing the size of the local buffer used to format timestamps

The issue was introduced only recently (unreleased): apache#39272
* GitHub Issue: apache#41044

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…the 'Z' when formatting timestamps (apache#41045)

### What changes are included in this PR?

A test that reproduces an issue found by the fuzzer and a fix for it.

### Are these changes tested?

 - A test
 - Comments clarifying somethings around `formatting.h`
 - Increasing the size of the local buffer used to format timestamps

The issue was introduced only recently (unreleased): apache#39272
* GitHub Issue: apache#41044

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++][Python][R] PrettyPrint ignores timezone
3 participants