Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-9377: [Java] Support unsigned dictionary indices #7817

Closed
wants to merge 3 commits into from

Conversation

liyafan82
Copy link
Contributor

@github-actions
Copy link

try (VarCharVector dictionaryVector = new VarCharVector("dict vector", allocator)) {
setVector(dictionaryVector, "0", "1", "2", "3", "4", "5", "6", "7", "8", "9");
Dictionary dictionary1 = new Dictionary(dictionaryVector,
new DictionaryEncoding(/*id=*/10L, /*ordered=*/false, /*indexType=*/ new ArrowType.Int(8, false)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please provide a parameter comment for bit_width and "false". here and below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter comments added. Thanks for your kind reminder.

try (UInt2Vector vector = new UInt2Vector("vector", allocator)) {
vector.allocateNew(2);

vector.setWithPossibleTruncate(0, 0xffffL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a constant someplace (or can this be moved to a constant on Uint8 vector) to better indicating you are using MAX_INT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!
I've extracted constants for maximum unsigned integers, and placed them in corresponding UIntXvector classes.

@Test
public void testUInt1RoundTrip() throws IOException {
try (UInt1Vector encodedVector1 = (UInt1Vector) createEncodedVector(8)) {
setVector(encodedVector1, (byte) 1, (byte) 3, (byte) 5, (byte) 7, (byte) 9);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should max value round trips be be added here and below as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable. I have revised the test case to include max for UInt1 and UInt2.

public void init() {
allocator = new RootAllocator(Long.MAX_VALUE);
dictionaryVector = new VarCharVector("dict vector", allocator);
setVector(dictionaryVector, "0", "1", "2", "3", "4", "5", "6", "7", "8", "9");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be nice to verify dictionary for at least uint1 work for vectors of length > 127

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. I have reivsed the test case to to make vector length > 127. Pls check.

Copy link
Contributor

@emkornfield emkornfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments on improving test coverage otherwise I think this seems reasonable.

@emkornfield
Copy link
Contributor

Thanks @liyafan82

emkornfield pushed a commit to emkornfield/arrow that referenced this pull request Oct 16, 2020
See https://issues.apache.org/jira/browse/ARROW-9377

Closes apache#7817 from liyafan82/fly_0721_uns

Authored-by: liyafan82 <fan_li_ya@foxmail.com>
Signed-off-by: Micah Kornfield <emkornfield@gmail.com>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
See https://issues.apache.org/jira/browse/ARROW-9377

Closes apache#7817 from liyafan82/fly_0721_uns

Authored-by: liyafan82 <fan_li_ya@foxmail.com>
Signed-off-by: Micah Kornfield <emkornfield@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants