HIVE-27522: Iceberg: Bucket partition transformation date type support #4507

Merged
merged 3 commits into from Jul 25, 2023

Conversation

Member

@deniskuzZ deniskuzZ commented Jul 20, 2023

What changes were proposed in this pull request?

Add DATE type support to the Iceberg bucket partition transformation.

Why are the changes needed?

Without this change, calling ICEBERG_BUCKET() on a DATE column fails with:
UDFArgumentException: ICEBERG_BUCKET() only takes STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE types as first argument, got DATE

Does this PR introduce any user-facing change?

No

Is the change a dependency upgrade?

No

How was this patch tested?

qtest

@aturoczy aturoczy left a comment

+1

@deniskuzZ deniskuzZ changed the title Bucket partition transformation date type support HIVE-27522: Iceberg: Bucket partition transformation date type support Jul 21, 2023
Contributor

@SourabhBadhya SourabhBadhya left a comment

LGTM. 1 minor comment.

Transform<Integer, Integer> dateTransform = Transforms.bucket(Types.DateType.get(), numBuckets);
evaluator = arg -> {
DateWritableV2 val = (DateWritableV2) converter.convert(arg.get());
result.set(dateTransform.apply(val.get().toEpochDay()));
Contributor

How about val.getDays() instead of val.get().toEpochDay()?
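
For context, both spellings are expected to produce the same number: the count of days since 1970-01-01, which is the value the bucket transform hashes for a date. The sketch below illustrates that quantity with the JDK's java.time.LocalDate as a stand-in for Hive's date type. EpochDaySketch and epochDays are hypothetical names used only for illustration, and the equivalence of DateWritableV2.getDays() with the epoch-day value is an assumption here rather than something this snippet verifies.

```java
import java.time.LocalDate;

// Illustrative only: uses java.time.LocalDate as a stand-in for Hive's date type.
class EpochDaySketch {

    // Returns the number of days between 1970-01-01 and the given calendar date.
    // Dates before the epoch yield negative values.
    static int epochDays(int year, int month, int day) {
        return Math.toIntExact(LocalDate.of(year, month, day).toEpochDay());
    }

    public static void main(String[] args) {
        System.out.println(epochDays(1970, 1, 1));   // prints 0
        System.out.println(epochDays(2017, 11, 16)); // prints 17486
        System.out.println(epochDays(1969, 12, 31)); // prints -1
    }
}
```

If the two accessors do agree, the shorter one avoids materializing an intermediate date object on every row.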

Member Author

@deniskuzZ deniskuzZ Jul 21, 2023

👍, fixed

result.set(dateTransform.apply(val.getDays()));
};
break;

default:
throw new UDFArgumentException(
" ICEBERG_BUCKET() only takes STRING/CHAR/VARCHAR/BINARY/INT/LONG/DECIMAL/FLOAT/DOUBLE" +
Contributor

Should we add the DATE type to the UDFArgumentException message?

Member Author

nice catch, thanks!

case DATE:
converter = new PrimitiveObjectInspectorConverter.DateConverter(argumentOI,
PrimitiveObjectInspectorFactory.writableDateObjectInspector);
Transform<Integer, Integer> dateTransform = Transforms.bucket(Types.DateType.get(), numBuckets);
Contributor

Nit: Transform<T, Integer> bucket(Type type, int numBuckets) will be removed in the future.
https://github.com/apache/iceberg/blob/e7d21a948865dc182a718f779757cb5f181856cb/api/src/main/java/org/apache/iceberg/transforms/Transforms.java#L196-L208

Maybe we can use this instead:
Function<Object, Integer> dateTransform = Transforms.bucket(numBuckets).bind(Types.DateType.get());
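
For readers unfamiliar with what the bound transform computes: as I understand the Iceberg spec, a date is bucketed by taking its days-since-epoch value, hashing it as a long with 32-bit Murmur3 (seed 0), clearing the sign bit, and taking the remainder modulo the bucket count. The sketch below reimplements those steps by hand with only the JDK so that it runs without Iceberg on the classpath. DateBucketSketch, murmur3Long, and bucket are hypothetical names, the hashing details are an assumption based on my reading of the spec, and the patch itself should keep calling Transforms.bucket(numBuckets).bind(...).

```java
import java.time.LocalDate;

// Hand-rolled illustration of date bucketing; not the Iceberg implementation.
class DateBucketSketch {

    // 32-bit Murmur3 (x86 variant, seed 0) over the 8 little-endian bytes of a long.
    static int murmur3Long(long value) {
        int low = (int) value;            // first 4-byte block
        int high = (int) (value >>> 32);  // second 4-byte block
        int h1 = 0;                       // seed
        h1 = mixH1(h1, mixK1(low));
        h1 = mixH1(h1, mixK1(high));
        // Finalization: fold in the input length (8 bytes), then avalanche.
        h1 ^= 8;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    private static int mixK1(int k1) {
        k1 *= 0xcc9e2d51;
        k1 = Integer.rotateLeft(k1, 15);
        k1 *= 0x1b873593;
        return k1;
    }

    private static int mixH1(int h1, int k1) {
        h1 ^= k1;
        h1 = Integer.rotateLeft(h1, 13);
        return h1 * 5 + 0xe6546b64;
    }

    // Assumed bucketing rule: (hash & Integer.MAX_VALUE) % numBuckets,
    // applied to the date's days-since-epoch value widened to a long.
    static int bucket(LocalDate date, int numBuckets) {
        int hash = murmur3Long(date.toEpochDay());
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2017, 11, 16);
        System.out.println("bucket of " + date + " with 16 buckets: " + bucket(date, 16));
    }
}
```

Clearing the sign bit before the modulo keeps the result in the range 0 to numBuckets - 1 even when the hash is negative.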

Member Author

changed

sonarcloud bot commented Jul 24, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 8 (rating A)

No coverage information
No duplication information

Contributor

@zhangbutao zhangbutao left a comment

+1. Change looks good to me.

Member

@ayushtkn ayushtkn left a comment

LGTM

@deniskuzZ deniskuzZ merged commit 25826cc into apache:master Jul 25, 2023
5 checks passed
@deniskuzZ deniskuzZ deleted the bucket_date_column branch July 25, 2023 16:09
tarak271 pushed a commit to tarak271/hive-1 that referenced this pull request Dec 19, 2023
…t (Denys Kuzmenko, reviewed by Attila Turoczy, Ayush Saxena, Butao Zhang, Sourabh Badhya)

Closes apache#4507
6 participants