Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORC-763: [C++] Fix ORC timestamp inconsistencies with Java ORC #661

Merged
merged 2 commits into from Mar 23, 2021

Conversation

wgtmac
Copy link
Member

@wgtmac wgtmac commented Mar 22, 2021

What changes were proposed in this pull request?

The Java ORC reader takes one second off from timestamp values with negative second and nanosecond greater than 999999. The C++ ORC reader/writer should use same logic to be consistent.

Why are the changes needed?

WIthout this fix, we may see data inconsistency of timestamp values from different ORC readers.

How was this patch tested?

It can be tested manually by C++/Java ORC tools.

Copy link
Contributor

@pgaref pgaref left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @wgtmac !
Looks to be consistent with https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java#L1227-L1229

Shall we also add a test with the TS around 1970 as per ticket to check behaviour?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @wgtmac !
+1, LGTM (and +1 for @pgaref 's comment)

@dongjoon-hyun
Copy link
Member

Thank you for adding the test case.

@dongjoon-hyun dongjoon-hyun merged commit ba45bcc into apache:master Mar 23, 2021
@dongjoon-hyun
Copy link
Member

Thank you, @dongjoon-hyun and @pgaref .
Merged to master.

dongjoon-hyun pushed a commit that referenced this pull request Mar 23, 2021
### What changes were proposed in this pull request?
The Java ORC reader takes one second off from timestamp values with negative second and nanosecond greater than 999999. The C++ ORC reader/writer should use same logic to be consistent.

### Why are the changes needed?
WIthout this fix, we may see data inconsistency of timestamp values from different ORC readers.

### How was this patch tested?
It can be tested manually by C++/Java ORC tools.

(cherry picked from commit ba45bcc)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Copy link
Member

This is backported to branch-1.6, too.

@pgaref
Copy link
Contributor

pgaref commented Mar 23, 2021

Thanks for the fix @wgtmac !
I was thinking about the backport too, thanks @dongjoon-hyun !

@wgtmac wgtmac deleted the ORC-763 branch February 23, 2022 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants