
use rayon_core::ThreadPool + threads fallback on WASM #203

Merged: 6 commits into master, Mar 10, 2023

Conversation

johannesvollmer (Owner)

No description provided.

johannesvollmer (Owner, Author) commented Mar 6, 2023

previously:

Running benches\read.rs (target\release\deps\read-346953cee12d2063.exe)
test read_single_image_rle_all_channels               ... bench:  23,077,480 ns/iter (+/- 5,920,839)
test read_single_image_rle_non_parallel_all_channels  ... bench:  37,191,290 ns/iter (+/- 13,142,675)
test read_single_image_rle_non_parallel_rgba          ... bench:  39,958,790 ns/iter (+/- 11,766,276)
test read_single_image_rle_rgba                       ... bench:  27,534,700 ns/iter (+/- 15,942,424)
test read_single_image_uncompressed_non_parallel_rgba ... bench:  20,412,270 ns/iter (+/- 3,284,810)
test read_single_image_uncompressed_rgba              ... bench:  20,208,720 ns/iter (+/- 2,113,310)
test read_single_image_zips_non_parallel_rgba         ... bench: 102,007,620 ns/iter (+/- 19,710,591)
test read_single_image_zips_rgba                      ... bench:  34,430,980 ns/iter (+/- 6,738,088)

Running benches\write.rs (target\release\deps\write-ee295502a3d29255.exe)
test write_nonparallel_zip1_to_buffered      ... bench: 478,018,990 ns/iter (+/- 60,906,865)
test write_parallel_any_channels_to_buffered ... bench:  39,375,920 ns/iter (+/- 11,970,844)
test write_parallel_zip16_to_buffered        ... bench: 125,661,040 ns/iter (+/- 19,316,207)
test write_parallel_zip1_to_buffered         ... bench: 107,721,620 ns/iter (+/- 17,027,000)
test write_uncompressed_to_buffered          ... bench:  36,058,700 ns/iter (+/- 15,324,005)

parallel speed (non-parallel time divided by parallel time): 161% (rle all channels read), 145% (rle rgba read), 443% (zip1 write), ...
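These ratios simply divide the non-parallel ns/iter by the parallel ns/iter of the same benchmark. A standalone check (not part of the crate), using the numbers from the run above:

// standalone check of the quoted speedups: non-parallel ns/iter divided by
// parallel ns/iter, truncated to whole percent
fn main() {
    let speedup = |non_parallel: f64, parallel: f64| (non_parallel / parallel * 100.0) as u64;

    println!("rle all channels read: {}%", speedup(37_191_290.0, 23_077_480.0)); // 161%
    println!("rle rgba read:         {}%", speedup(39_958_790.0, 27_534_700.0)); // 145%
    println!("zip1 write:            {}%", speedup(478_018_990.0, 107_721_620.0)); // 443%
}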

now with rayon:

Running benches\read.rs (target\release\deps\read-0137d5553932e671.exe)
test read_single_image_rle_all_channels               ... bench:  28,930,690 ns/iter (+/- 5,038,888)
test read_single_image_rle_non_parallel_all_channels  ... bench:  37,150,920 ns/iter (+/- 11,987,478)
test read_single_image_rle_non_parallel_rgba          ... bench:  39,809,670 ns/iter (+/- 9,344,255)
test read_single_image_rle_rgba                       ... bench:  32,667,300 ns/iter (+/- 2,025,493)
test read_single_image_uncompressed_non_parallel_rgba ... bench:  19,300,970 ns/iter (+/- 933,911)
test read_single_image_uncompressed_rgba              ... bench:  19,479,210 ns/iter (+/- 6,858,961)
test read_single_image_zips_non_parallel_rgba         ... bench: 102,712,370 ns/iter (+/- 34,694,717)
test read_single_image_zips_rgba                      ... bench:  33,905,040 ns/iter (+/- 4,875,561)

Running benches\write.rs (target\release\deps\write-c4636ec9e59cae60.exe)
test write_nonparallel_zip1_to_buffered      ... bench: 522,813,110 ns/iter (+/- 144,004,227)
test write_parallel_any_channels_to_buffered ... bench:  55,153,260 ns/iter (+/- 22,642,415)
test write_parallel_zip16_to_buffered        ... bench: 152,128,800 ns/iter (+/- 21,509,143)
test write_parallel_zip1_to_buffered         ... bench: 118,787,040 ns/iter (+/- 11,839,229)
test write_uncompressed_to_buffered          ... bench:  34,574,950 ns/iter (+/- 25,650,534)

parallel speed (same ratios): 128% (rle all channels read), 122% (rle rgba read), 440% (zip1 write), ...

we might want to try the non-FIFO spawn calls on the thread pool; the FIFO variants might be what is slowing things down
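For context, rayon's thread pool offers both spawn flavors. A minimal standalone sketch (not the exrs code, assuming only rayon_core as a dependency) of the call being discussed:

// standalone sketch of the two spawn flavors on a rayon_core thread pool:
// spawn_fifo runs tasks in submission order, while spawn uses the default
// LIFO-per-worker scheduling that the comment above suggests trying
use rayon_core::ThreadPoolBuilder;
use std::sync::mpsc;

fn main() {
    let pool = ThreadPoolBuilder::new().build().expect("could not build thread pool");
    let (sender, receiver) = mpsc::channel();

    for task_index in 0..8 {
        let sender = sender.clone();

        // swap spawn_fifo for spawn to compare the two scheduling strategies
        pool.spawn_fifo(move || {
            sender.send(task_index).expect("receiver dropped");
        });
    }

    drop(sender); // close the channel so the iterator below terminates
    println!("completed: {:?}", receiver.iter().collect::<Vec<usize>>());
}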

johannesvollmer (Owner, Author) commented Mar 7, 2023

master:

Running benches\read.rs (target\release\deps\read-e1f4d1352c653dce.exe)
test read_single_image_rle_all_channels               ... bench:  18,840,020 ns/iter (+/- 2,707,747)
test read_single_image_rle_non_parallel_all_channels  ... bench:  29,031,200 ns/iter (+/- 2,341,532)
test read_single_image_rle_non_parallel_rgba          ... bench:  31,188,730 ns/iter (+/- 2,227,412)
test read_single_image_rle_rgba                       ... bench:  21,799,280 ns/iter (+/- 2,330,220)
test read_single_image_uncompressed_non_parallel_rgba ... bench:  16,437,840 ns/iter (+/- 2,210,454)
test read_single_image_uncompressed_rgba              ... bench:  17,363,250 ns/iter (+/- 2,278,051)
test read_single_image_zips_non_parallel_rgba         ... bench:  76,905,370 ns/iter (+/- 3,415,472)
test read_single_image_zips_rgba                      ... bench:  20,967,470 ns/iter (+/- 2,909,308)

Running benches\write.rs (target\release\deps\write-9d3ba8636b780ab2.exe)
test write_nonparallel_zip1_to_buffered      ... bench: 317,108,770 ns/iter (+/- 12,161,251)
test write_parallel_any_channels_to_buffered ... bench:  30,446,700 ns/iter (+/- 4,009,221)
test write_parallel_zip16_to_buffered        ... bench:  61,044,410 ns/iter (+/- 4,958,282)
test write_parallel_zip1_to_buffered         ... bench:  52,370,310 ns/iter (+/- 4,008,664)
test write_uncompressed_to_buffered          ... bench:  26,831,360 ns/iter (+/- 4,751,253)

which means a parallel gain of 154% (rle all channels read), 143% (rle rgba read), 606% (zip1 write), ...

without fifo:

Running benches\read.rs (target\release\deps\read-ea6aead5236060bf.exe)
test read_single_image_rle_all_channels               ... bench:  20,310,830 ns/iter (+/- 7,008,828)
test read_single_image_rle_non_parallel_all_channels  ... bench:  30,482,600 ns/iter (+/- 7,425,710)
test read_single_image_rle_non_parallel_rgba          ... bench:  32,887,670 ns/iter (+/- 11,718,651)
test read_single_image_rle_rgba                       ... bench:  22,851,780 ns/iter (+/- 6,029,902)
test read_single_image_uncompressed_non_parallel_rgba ... bench:  17,642,540 ns/iter (+/- 6,100,866)
test read_single_image_uncompressed_rgba              ... bench:  18,237,210 ns/iter (+/- 5,941,470)
test read_single_image_zips_non_parallel_rgba         ... bench:  81,840,630 ns/iter (+/- 6,128,765)
test read_single_image_zips_rgba                      ... bench:  22,718,730 ns/iter (+/- 4,300,432)

Running benches\write.rs (target\release\deps\write-b56af3ef55d097f6.exe)
test write_nonparallel_zip1_to_buffered      ... bench: 349,552,540 ns/iter (+/- 63,847,946)
test write_parallel_any_channels_to_buffered ... bench:  34,021,550 ns/iter (+/- 8,007,533)
test write_parallel_zip16_to_buffered        ... bench:  61,634,630 ns/iter (+/- 6,244,950)
test write_parallel_zip1_to_buffered         ... bench:  53,021,430 ns/iter (+/- 4,410,378)
test write_uncompressed_to_buffered          ... bench:  28,873,260 ns/iter (+/- 9,948,521)

which means a parallel gain of 150% (rle all channels read), 144% (rle rgba read), 659% (zip1 write), ...

This was linked to issues on Mar 7, 2023
notgull commented Mar 9, 2023

Thanks for doing this! Is this a breaking change?

johannesvollmer (Owner, Author) replied

Absolutely! It's fine though, don't worry. This part of the API is pretty deep in the guts, so I'm sure not too many projects use it directly. Considering the wins for WASM, this is absolutely worth the trouble.
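For readers wondering what the WASM fallback looks like in spirit, here is a minimal sketch; it is not the actual exrs implementation, and process_blocks/decompress are made-up names. The idea: try to build a rayon thread pool, and if that fails (as it can on wasm32 targets without thread support), do the work sequentially on the current thread.

// minimal sketch of the fallback idea, not the actual exrs code:
// build a rayon_core thread pool if possible, otherwise run sequentially
use rayon_core::ThreadPoolBuilder;

// hypothetical helper: decompress every block, in parallel when a pool is available
fn process_blocks(blocks: Vec<Vec<u8>>, decompress: fn(&[u8]) -> Vec<u8>) -> Vec<Vec<u8>> {
    match ThreadPoolBuilder::new().build() {
        Ok(pool) => {
            let mut results: Vec<Option<Vec<u8>>> = vec![None; blocks.len()];

            // scoped tasks may borrow `blocks` and write into `results`
            pool.scope(|scope| {
                for (slot, block) in results.iter_mut().zip(&blocks) {
                    scope.spawn(move |_| *slot = Some(decompress(block)));
                }
            });

            results.into_iter().map(|block| block.expect("task finished")).collect()
        }

        // no threads available (for example on wasm32): fall back to sequential work
        Err(_) => blocks.iter().map(|block| decompress(block)).collect(),
    }
}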

johannesvollmer merged commit f4e1d0d into master on Mar 10, 2023
Successfully merging this pull request may close these issues:
WASM support
Port from threadpool to rayon