Improve loadBytes performance #5027

Merged
merged 1 commit into from Apr 24, 2017

Conversation

Projects
None yet
2 participants
@gohai
Contributor

gohai commented Apr 24, 2017

I was surprised that loading a 54 MB file with loadBytes took as long as it did (2.7 seconds). Hit the code with a hammer a little bit, and now it's down to 0.4 seconds. And seems to works for URLs still...

Improve loadBytes performance
Before, loading a 54 MB file took 2.7 seconds on my Macbook Air. With this change applied, it only takes 0.4 seconds.

@gohai gohai requested a review from benfry Apr 24, 2017

@benfry

This comment has been minimized.

Show comment
Hide comment
@benfry

benfry Apr 24, 2017

Member

Hm, this is literally what BufferedInputStream should be doing, which IIRC is why it was originally written in this goofy manner.

Actually, I think the performance problem may be that createInput() now (3.3.1 vs 3.3) creates a BufferedInputStream by default, so the code was creating a BufferedInputStream of a BufferedInputStream… What happens if you just remove the extra BufferedInputStream wrapper and keep reading byte by byte?

Member

benfry commented Apr 24, 2017

Hm, this is literally what BufferedInputStream should be doing, which IIRC is why it was originally written in this goofy manner.

Actually, I think the performance problem may be that createInput() now (3.3.1 vs 3.3) creates a BufferedInputStream by default, so the code was creating a BufferedInputStream of a BufferedInputStream… What happens if you just remove the extra BufferedInputStream wrapper and keep reading byte by byte?

@gohai

This comment has been minimized.

Show comment
Hide comment
@gohai

gohai Apr 24, 2017

Contributor

When I get rid of the return new BufferedInputStream(input) in both createInput variants, and replace them by a simple return input, this still takes the same time as before for me.

Contributor

gohai commented Apr 24, 2017

When I get rid of the return new BufferedInputStream(input) in both createInput variants, and replace them by a simple return input, this still takes the same time as before for me.

@benfry benfry merged commit dacaf5b into processing:master Apr 24, 2017

@benfry

This comment has been minimized.

Show comment
Hide comment
@benfry

benfry Apr 24, 2017

Member

Alright, thanks for checking. (Having done various rounds of optimization on those functions, I'm surprised that it was using the previous method…) Also, did you try 8192 or other buffer sizes? (Usually 8k is more of the sweet spot…)

Member

benfry commented Apr 24, 2017

Alright, thanks for checking. (Having done various rounds of optimization on those functions, I'm surprised that it was using the previous method…) Also, did you try 8192 or other buffer sizes? (Usually 8k is more of the sweet spot…)

@gohai

This comment has been minimized.

Show comment
Hide comment
@gohai

gohai Apr 25, 2017

Contributor

I just tried - they seem to be about the same as 4k on my machine, but hard to measure with page caching and all.

Contributor

gohai commented Apr 25, 2017

I just tried - they seem to be about the same as 4k on my machine, but hard to measure with page caching and all.

@benfry

This comment has been minimized.

Show comment
Hide comment
@benfry

benfry Apr 25, 2017

Member

Ok, thanks. Also checked in some additional optimized versions that will be used when loading from a file or URL (where we can get the size beforehand).

Member

benfry commented Apr 25, 2017

Ok, thanks. Also checked in some additional optimized versions that will be used when loading from a file or URL (where we can get the size beforehand).

@gohai gohai deleted the gohai:loadBytes branch May 4, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment