Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-9325] Override proper write method in UnownedOutputStream #11263

Merged
merged 4 commits into from
Mar 31, 2020

Conversation

lukemin89
Copy link

@lukemin89 lukemin89 commented Mar 30, 2020

org.apache.beam.sdk.util.UnownedOutputStream does not override the method
public void write(byte b[], int off, int len) throws IOException
resulting in extremely slow writing speed.
This is because java.io.FilteredOutputStream does not provide a proper method.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • [R: @iemejia ] Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- Build Status --- --- Build Status
Java Build Status Build Status Build Status
Build Status
Build Status
Build Status
Build Status
Build Status Build Status Build Status
Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status
Build Status
Build Status
--- --- Build Status
XLang --- --- --- Build Status --- --- Build Status

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website
Non-portable Build Status Build Status
Build Status
Build Status Build Status
Portable --- Build Status --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious on how did you find that the reason for the performance issue you mention is the lack of this method, and if you have seen a performance? (the implementation does not check for boundares so that's an issue to fix).

@iemejia
Copy link
Member

iemejia commented Mar 30, 2020

CC: @lukecwik you may be interested on taking a quick look since it seems you authored UnownedOutputStream

@lukemin89
Copy link
Author

Thanks for the review!

I was looking to improve my 10+TB GBK steps and happened to find this. I just decided to fix it as it should be an effortless fix.

I'm not sure what you mean by if I've seen the performance. In my pipeline, it would be difficult to find bottleneck even if I use Pprof. Especially, Dataflow PProf does not show much about GBK step.

In case you meant if I ran benchmark, I just ran a short benchmark using jmh showing

Benchmark                                                 Mode  Cnt      Score       Error   Units
Main.reuseWrite1kChunksByteArrayOutputStream             thrpt    5  25487.387 ±  2144.764  ops/ms
Main.reuseWrite1kChunksCustomNonSynchronousOutputStream  thrpt    5  53692.972 ± 11194.620  ops/ms
Main.reuseWrite1kChunksUnownedOutputStream               thrpt    5     56.419 ±     1.125  ops/ms
Main.reuseWrite1kChunksUnownedOutputStreamOverride       thrpt    5  16714.851 ±  3091.762  ops/ms

Benchmark                                                  Mode  Cnt      Score       Error   Units
Main.reuseWrite256ChunksByteArrayOutputStream             thrpt    5  43725.940 ±  6729.958  ops/ms
Main.reuseWrite256ChunksCustomNonSynchronousOutputStream  thrpt    5  70449.452 ± 11155.332  ops/ms
Main.reuseWrite256ChunksUnownedOutputStream               thrpt    5    221.629 ±     2.325  ops/ms
Main.reuseWrite256ChunksUnownedOutputStreamOverride       thrpt    5  24123.512 ±  5081.709  ops/ms

Benchmark                                                 Mode  Cnt       Score      Error   Units
Main.reuseWrite32ChunksByteArrayOutputStream             thrpt    5   48755.580 ± 2890.195  ops/ms
Main.reuseWrite32ChunksCustomNonSynchronousOutputStream  thrpt    5  231230.842 ± 8340.905  ops/ms
Main.reuseWrite32ChunksUnownedOutputStream               thrpt    5    1780.829 ±   47.172  ops/ms
Main.reuseWrite32ChunksUnownedOutputStreamOverride       thrpt    5   26909.471 ±  381.218  ops/ms

The numbers are all over because it was on my laptop, but you can roughly see.

@lukecwik
Copy link
Member

lukecwik commented Mar 31, 2020

What a terrible choice for the FilterOutputStream implementation. Reading the javadoc they clearly state that everyone who subclasses it needs to provide the optimal write(byte[], int, int) method. Also double checked grepcode for the source and they write one byte at a time.

I was unaware of the FilterOutputStream problem when writing this.

@lukecwik
Copy link
Member

Please also fix JAXBCoder.java, as it too uses a FilteredOutputStream:

private static class CloseIgnoringOutputStream extends FilterOutputStream {

expected.write(data1, 0, data1.length);
osActual.write(data1, 0, data1.length);

assertArrayEquals(expected.toByteArray(), actual.toByteArray());
Copy link
Member

@lukecwik lukecwik Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't actually test that the singular version of the method was called since if FilteredOutputStream wrote one byte at a time you would still get the expected result. You'll need to use a mock and validate that #write(byte[], int, int) was called the correct number of times.

Copy link
Author

@lukemin89 lukemin89 Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added CallCountOutputStream to test the proper number of call count.

@iemejia
Copy link
Member

iemejia commented Mar 31, 2020

What a terrible choice for the FilterOutputStream implementation. Reading the javadoc they clearly state that everyone who subclasses it needs to provide the optimal write(byte[], int, int) method. Also double checked grepcode for the source and they write one byte at a time.

I was unaware of the FilterOutputStream problem when writing this.

Yes this was a surprise to me too, kind of unexpected, but glad that @lukemin89 found the issue. Did you find it via some static analysis or just by performance 'luck' ?

@lukemin89
Copy link
Author

lukemin89 commented Mar 31, 2020

I wish I found it in a fancier way, but I just found it by luck.

@lukecwik
Copy link
Member

retest this please

Copy link
Member

@iemejia iemejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks @lukemin89 I really like this type of PRs simple but that fix a hidden problem. Great find!

@iemejia
Copy link
Member

iemejia commented Mar 31, 2020

Mmm seems tests are not running on this one, weird.

@iemejia
Copy link
Member

iemejia commented Mar 31, 2020

Now they are! Time to wait to and then merge.

@iemejia
Copy link
Member

iemejia commented Mar 31, 2020

I wish I found it in a fancier way, but I just found it by luck.

Curious this looks like something that can be matched 'easily' by static analyzers.

@lukecwik lukecwik merged commit e15a41d into apache:master Mar 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants