ARROW-799: [Java] Change readerIndex/writerIndex to readerIndex()/wri…#540
ARROW-799: [Java] Change readerIndex/writerIndex to readerIndex()/wri…#540icexelloss wants to merge 1 commit intoapache:masterfrom
Conversation
…terIndex() in ArrowBuf to make it shade/relocate friendly
julienledem
left a comment
There was a problem hiding this comment.
I'm not sure that we need this.
If shading and relocating netty, one should shade and relocate this particular class in the same way so that the packages match.
| } | ||
|
|
||
| return ByteBufUtil.decodeString(this, index, length, charset); | ||
| return super.toString(index, length, charset); |
There was a problem hiding this comment.
Yes. super.toString(index, length, charset); calls ByteBufUtil.decodeString(this, index, length, charset)
| PlatformDependent.putShort(addr(writerIndex), (short) value); | ||
| writerIndex += 2; | ||
| PlatformDependent.putShort(addr(writerIndex()), (short) value); | ||
| writerIndex(writerIndex() + 2); |
There was a problem hiding this comment.
We need to make sure this doesn't impact performance.
There was a problem hiding this comment.
Good point. I ran this code and didn't see performance diffs, are there other tests you'd like me to run?
public class TestArrowBufPerformance {
@Test
public void test() {
int count = 10 * 1024 * 1024;
RootAllocator allocator = new RootAllocator(8 * count);
for (int j = 0; j < 100; ++j) {
ArrowBuf buf = allocator.buffer(4 * count);
long start = System.currentTimeMillis();
for (int i = 0; i < count; ++i) {
buf.writeInt(i);
}
System.out.println(System.currentTimeMillis() - start);
buf.close();
}
allocator.close();
}
}
before:
count 100.000000
mean 40.650000
std 8.048307
min 29.000000
25% 34.000000
50% 38.500000
75% 45.250000
max 64.000000
after:
count 100.000000
mean 40.930000
std 7.002532
min 31.000000
25% 36.000000
50% 39.000000
75% 47.000000
max 65.000000
There was a problem hiding this comment.
calling writerIndex() instead of setting the fields adds extra checks. It shows in your test that we are adding some overhead.
How about you shade by adding a class prefix instead of a package prefix? I think that would remove the need for this.
There was a problem hiding this comment.
There was a problem hiding this comment.
See: https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#relocations
probably something like:
<relocations>
<relocation>
<pattern>io.netty.buffer.</pattern>
<shadedPattern>io.netty.buffer.Arrow_</shadedPattern>
</relocation>
</relocations>
There was a problem hiding this comment.
Julien, thanks for the tip! Yes this works. We can close this then.
|
The use case here is to use arrow as a uberjar. If we relocate io.netty, then ArrowBuf will be moved to shaded.io.netty.buffer.ArrowBuf. This will break user code as they will likely With this, at least we can relocate io.netty but exclude ArrowBuf from relocation. This will work if user don't import/use other buffer classes under java/vector/src/io/netty/buffer/directly, which I think is reasonable with the current arrow java api. If you think this is still not necessary, I am ok with closing this. We can probably do some work around on our side. But I do think shading netty in arrow is quite difficult and this makes it a bit easier. |
|
@julienledem What do you think of this? |
julienledem
left a comment
There was a problem hiding this comment.
Hi @icexelloss, sorry for being a little slow to get back to you. see my suggestion bellow.
| PlatformDependent.putShort(addr(writerIndex), (short) value); | ||
| writerIndex += 2; | ||
| PlatformDependent.putShort(addr(writerIndex()), (short) value); | ||
| writerIndex(writerIndex() + 2); |
There was a problem hiding this comment.
calling writerIndex() instead of setting the fields adds extra checks. It shows in your test that we are adding some overhead.
How about you shade by adding a class prefix instead of a package prefix? I think that would remove the need for this.
|
Awesome! Thanks @icexelloss. Would you mind posting your final solution to the mailing list? Other people were looking into doing something similar. |
|
Should we add this to the documentation (now that we have a better website)? |
|
I am happy to write some documentation for this.
I will write something this weekend and let you guys know.
Li
…On Fri, Apr 28, 2017 at 3:18 PM, Wes McKinney ***@***.***> wrote:
Should we add this to the documentation (now that we have a better
website)?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#540 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAwbrK3lzn9dicKtyeRgW6LDMV4gzlz4ks5r0jt1gaJpZM4M99Mr>
.
|
|
I changed the title for ARROW-799. We should start a Markdown-based Java user guide (to accompany the existing javadoc) that we can publish on the website. We can put it at site/_docs/java |
Bumps org.apache.orc:orc-core from 2.0.3 to 2.1.0. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Change readerIndex/writerIndex to readerIndex()/writerIndex() to make ArrowBuf shade/relocation friendly.