HIVE-25104: Backward incompatible timestamp serialization in Parquet for certain timezones #2282
Merged
zabetak
commented
May 17, 2021
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
zabetak
commented
May 17, 2021
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
kgyrtkirk
added
tests unstable
tests pending
and removed
tests pending
tests unstable
labels
May 17, 2021
kgyrtkirk
added
tests pending
tests unstable
tests passed
and removed
tests unstable
tests pending
labels
May 18, 2021
jcamachor
reviewed
May 24, 2021
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
Outdated
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
@klcopp, could you take a look too? Thanks
kgyrtkirk
added
tests pending
tests failed
and removed
tests passed
tests pending
labels
May 27, 2021
…eter The conversion can be skipped by passing an appropriate timezone so the boolean parameter is redundant and makes the code harder to understand.
…meter The conversion can be skipped by passing an appropriate timezone, so the boolean parameter is redundant and makes the code harder to understand. Adapt callers to pass the appropriate timezone (to perform or skip the conversion) so that the old behavior is retained.
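To illustrate why a boolean "skip conversion" parameter is redundant, here is a minimal sketch using plain `java.time` rather than Hive's own timestamp classes. The class `ZoneConversionSketch` and the method `toUtc` are hypothetical names made up for this example and are not part of the patch. The point is only that converting from UTC to UTC is the identity, so a caller can opt out of the conversion by choosing the zone it passes.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class ZoneConversionSketch {

  /**
   * Hypothetical helper: interprets the wall-clock value in sourceZone and
   * returns the wall-clock value of the same instant in UTC.
   */
  static LocalDateTime toUtc(LocalDateTime ts, ZoneId sourceZone) {
    return ts.atZone(sourceZone)
        .withZoneSameInstant(ZoneOffset.UTC)
        .toLocalDateTime();
  }

  public static void main(String[] args) {
    LocalDateTime ts = LocalDateTime.of(2021, 5, 17, 12, 0);

    // Conversion performed: the value is shifted by the zone's offset.
    System.out.println(toUtc(ts, ZoneId.of("Asia/Singapore")));

    // Conversion "skipped": passing UTC as the source zone leaves the value
    // unchanged, so no extra boolean flag is needed.
    System.out.println(toUtc(ts, ZoneOffset.UTC));
  }
}
```

With this shape, a caller that previously passed `skipConversion = true` would pass the UTC zone instead, which keeps the signature smaller without changing the result.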
1. Add new read/write config properties to control legacy zone conversions in Parquet.
2. Deprecate the hive.parquet.timestamp.legacy.conversion.enabled property, since it is not clear whether it applies to conversion during read or write.
3. Exploit file metadata and property to choose between new/old conversion rules.
4. Update existing tests to remove usages of the now deprecated hive.parquet.timestamp.legacy.conversion.enabled property.
…operty (Optional)
Use the old hive.parquet.timestamp.legacy.conversion.enabled property to control legacy conversion when reading timestamps from Parquet files. When the user has explicitly set the legacy conversion flag using the old property name, Hive needs to take it into account; otherwise data may be corrupted.
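The choice described in the commit messages above (file metadata plus configuration property deciding between new and old conversion rules on read) could be sketched roughly as follows. This is not the code from the patch. The class name, the method name, the metadata key `example.writer.legacy.conversion`, and the precedence order (file metadata first, then an explicitly set property, then a default) are all assumptions made for illustration. Only the property name `hive.parquet.timestamp.legacy.conversion.enabled` comes from the text.

```java
import java.util.Map;

public class LegacyConversionChoiceSketch {

  // Hypothetical metadata key; the real key written by Hive is not shown in this page.
  static final String FILE_META_KEY = "example.writer.legacy.conversion";

  // Property name taken from the commit message.
  static final String OLD_PROPERTY = "hive.parquet.timestamp.legacy.conversion.enabled";

  /**
   * Decides whether to apply legacy zone conversion when reading a file.
   * Assumed precedence: what the file says about how it was written, then an
   * explicit user setting, then the supplied default.
   */
  static boolean useLegacyConversion(Map<String, String> fileMetadata,
                                     Map<String, String> conf,
                                     boolean defaultValue) {
    String fromFile = fileMetadata.get(FILE_META_KEY);
    if (fromFile != null) {
      return Boolean.parseBoolean(fromFile);
    }
    String explicit = conf.get(OLD_PROPERTY);
    if (explicit != null) {
      // The user set the old property explicitly, so honor it.
      return Boolean.parseBoolean(explicit);
    }
    return defaultValue;
  }
}
```

The design point is that a value recorded by the writer is the most reliable signal, while the property acts as a fallback for files that carry no such metadata.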
… callers The method was used only in two places and doesn't significantly improve readability.
…dd-util Code didn't compile after fd6e701 was merged to master.
kgyrtkirk
added
tests pending
tests unstable
tests passed
and removed
tests failed
tests pending
tests unstable
labels
May 27, 2021
jcamachor
approved these changes
Jun 4, 2021
ashish-kumar-sharma
pushed a commit
to ashish-kumar-sharma/hive
that referenced
this pull request
Mar 31, 2022
…for certain timezones (Stamatis Zampetakis, reviewed by Jesus Camacho Rodriguez) Closes apache#2282
amanraj2520
pushed a commit
to amanraj2520/hive
that referenced
this pull request
Jan 23, 2024
…for certain timezones (Stamatis Zampetakis, reviewed by Jesus Camacho Rodriguez) Closes apache#2282
What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change?
No.
How was this patch tested?
TestParquetTimestampsHive2Compatibility
parquet_int96_legacy_compatibility_timestamp.q
hive.parquet.timestamp.write.legacy.conversion.enabled=true