Skip to content

ARROW-799: [Java] Change readerIndex/writerIndex to readerIndex()/wri…#540

Closed
icexelloss wants to merge 1 commit intoapache:masterfrom
icexelloss:feature/arrowbuf-relocate-friendly
Closed

ARROW-799: [Java] Change readerIndex/writerIndex to readerIndex()/wri…#540
icexelloss wants to merge 1 commit intoapache:masterfrom
icexelloss:feature/arrowbuf-relocate-friendly

Conversation

@icexelloss
Copy link
Copy Markdown
Contributor

@icexelloss icexelloss commented Apr 14, 2017

Change readerIndex/writerIndex to readerIndex()/writerIndex() to make ArrowBuf shade/relocation friendly.

…terIndex() in ArrowBuf to make it shade/relocate friendly
Copy link
Copy Markdown
Member

@julienledem julienledem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that we need this.
If shading and relocating netty, one should shade and relocate this particular class in the same way so that the packages match.

}

return ByteBufUtil.decodeString(this, index, length, charset);
return super.toString(index, length, charset);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this equivalent?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. super.toString(index, length, charset); calls ByteBufUtil.decodeString(this, index, length, charset)

PlatformDependent.putShort(addr(writerIndex), (short) value);
writerIndex += 2;
PlatformDependent.putShort(addr(writerIndex()), (short) value);
writerIndex(writerIndex() + 2);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make sure this doesn't impact performance.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I ran this code and didn't see performance diffs, are there other tests you'd like me to run?

public class TestArrowBufPerformance {
    @Test
    public void test() {
        int count = 10 * 1024 * 1024;
        RootAllocator allocator = new RootAllocator(8 * count);

        for (int j = 0; j < 100; ++j) {
            ArrowBuf buf = allocator.buffer(4 * count);
            long start = System.currentTimeMillis();
            for (int i = 0; i < count; ++i) {
                buf.writeInt(i);
            }
            System.out.println(System.currentTimeMillis() - start);
            buf.close();
        }
        allocator.close();
    }
}

before:

count 	100.000000
mean 	40.650000
std 	8.048307
min 	29.000000
25% 	34.000000
50% 	38.500000
75% 	45.250000
max 	64.000000

after:

count 	100.000000
mean 	40.930000
std 	7.002532
min 	31.000000
25% 	36.000000
50% 	39.000000
75% 	47.000000
max 	65.000000

Copy link
Copy Markdown
Member

@julienledem julienledem Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling writerIndex() instead of setting the fields adds extra checks. It shows in your test that we are adding some overhead.
How about you shade by adding a class prefix instead of a package prefix? I think that would remove the need for this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

@julienledem julienledem Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#relocations
probably something like:

              <relocations>
                <relocation>
                  <pattern>io.netty.buffer.</pattern>
                  <shadedPattern>io.netty.buffer.Arrow_</shadedPattern>
                </relocation>
              </relocations>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Julien, thanks for the tip! Yes this works. We can close this then.

@icexelloss
Copy link
Copy Markdown
Contributor Author

icexelloss commented Apr 18, 2017

The use case here is to use arrow as a uberjar. If we relocate io.netty, then ArrowBuf will be moved to shaded.io.netty.buffer.ArrowBuf. This will break user code as they will likely

import io.netty.buffer.ArrowBuf

With this, at least we can relocate io.netty but exclude ArrowBuf from relocation. This will work if user don't import/use other buffer classes under java/vector/src/io/netty/buffer/directly, which I think is reasonable with the current arrow java api.

If you think this is still not necessary, I am ok with closing this. We can probably do some work around on our side. But I do think shading netty in arrow is quite difficult and this makes it a bit easier.

@wesm
Copy link
Copy Markdown
Member

wesm commented Apr 20, 2017

@julienledem?

@icexelloss
Copy link
Copy Markdown
Contributor Author

@julienledem What do you think of this?

Copy link
Copy Markdown
Member

@julienledem julienledem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @icexelloss, sorry for being a little slow to get back to you. see my suggestion bellow.

PlatformDependent.putShort(addr(writerIndex), (short) value);
writerIndex += 2;
PlatformDependent.putShort(addr(writerIndex()), (short) value);
writerIndex(writerIndex() + 2);
Copy link
Copy Markdown
Member

@julienledem julienledem Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling writerIndex() instead of setting the fields adds extra checks. It shows in your test that we are adding some overhead.
How about you shade by adding a class prefix instead of a package prefix? I think that would remove the need for this.

@icexelloss icexelloss closed this Apr 27, 2017
@julienledem
Copy link
Copy Markdown
Member

Awesome! Thanks @icexelloss. Would you mind posting your final solution to the mailing list? Other people were looking into doing something similar.

@wesm
Copy link
Copy Markdown
Member

wesm commented Apr 28, 2017

Should we add this to the documentation (now that we have a better website)?

@icexelloss
Copy link
Copy Markdown
Contributor Author

icexelloss commented Apr 28, 2017 via email

@wesm
Copy link
Copy Markdown
Member

wesm commented May 5, 2017

I changed the title for ARROW-799. We should start a Markdown-based Java user guide (to accompany the existing javadoc) that we can publish on the website. We can put it at site/_docs/java

@wesm wesm deleted the feature/arrowbuf-relocate-friendly branch October 19, 2017 01:42
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
Bumps org.apache.orc:orc-core from 2.0.3 to 2.1.0.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants