HIVE-23829: Compute Stats Incorrect for Binary Columns #1313
serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySerDeParameters.java
LGTM pending test
… cases where debugging is not enabled
Added base64 decode option to hive conf for lazysimple serde
Updated LazySimpleDeserializeRead to use base64 decode option
@HunterL Really great stuff. Need one test with Edit: The default is
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
@HunterL Merged to master. Thanks!
Updated the LazySimple SerDe so that it no longer attempts to auto-detect whether Binary column values are Base64-encoded; it now relies on a table property instead. The previous auto-detection was expensive and did not correctly check whether the values were valid Base64, which in niche cases could cause statistics to be computed incorrectly.
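
To illustrate why auto-detection can skew statistics, here is a minimal sketch. It is not the Hive code from this PR: the class `BinaryDecodeSketch`, the methods `autoDetect` and `withFlag`, and the boolean `decodeBase64` (standing in for the table property) are all hypothetical names chosen for the example. It only shows that raw binary data can happen to look like valid Base64, so a guessing decoder changes the value and its length, while an explicit flag leaves it alone.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BinaryDecodeSketch {

    // Hypothetical heuristic (an assumption, not Hive's previous code):
    // treat the value as Base64 whenever the decoder accepts it.
    static byte[] autoDetect(byte[] raw) {
        try {
            return Base64.getDecoder().decode(raw);
        } catch (IllegalArgumentException e) {
            // Not decodable as Base64, so keep the bytes unchanged.
            return raw;
        }
    }

    // Explicit option: decode only when the caller says the column is Base64.
    // decodeBase64 stands in for the table property described above.
    static byte[] withFlag(byte[] raw, boolean decodeBase64) {
        return decodeBase64 ? Base64.getDecoder().decode(raw) : raw;
    }

    public static void main(String[] args) {
        // Four raw bytes that also happen to form a valid Base64 string.
        byte[] raw = "abcd".getBytes(StandardCharsets.US_ASCII);

        // The heuristic decodes them, so the value shrinks from 4 bytes to 3.
        System.out.println(autoDetect(raw).length);      // prints 3

        // With the flag off, the stored bytes are kept as written.
        System.out.println(withFlag(raw, false).length); // prints 4
    }
}
```

Length-based column statistics (for example, a maximum or average length) would differ between the two paths for this value, which is the kind of niche case the description refers to.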