Inflate: not faster than miniz_oxide on exr benchmark #46

Closed
Shnatsel opened this issue Jan 1, 2023 · 6 comments

Comments

@Shnatsel
Contributor

Shnatsel commented Jan 1, 2023

git clone https://github.com/Shnatsel/exrs
cd exrs
git checkout zune-inflate
cargo bench read_single_image_non_parallel_zips_rgba

The Cargo.toml file pins zune-inflate to 0.2.1. Edit it to point to 0.2.2 and you'll see a performance regression, to the point that miniz_oxide, as used on the master branch, becomes slightly faster than zune-inflate.

This is not reflected in the zune-inflate benchmark, so the exr benchmark apparently exercises significantly different data patterns.

Edit: actually 0.2.1 doesn't pass tests, you have to upgrade to 0.2.2 for correctness.

@Shnatsel changed the title from "Inflate: Performance regression from 0.2.1 to 0.2.2 on exr benchmark" to "Inflate: not faster than miniz_oxide on exr benchmark" on Jan 1, 2023
@Shnatsel
Contributor Author

Shnatsel commented Jan 1, 2023

Improved by #41

@Shnatsel
Contributor Author

Shnatsel commented Jan 1, 2023

Half the time here is spent in various parts of RLE decoding, so I'm closing this in favor of #41.

@Shnatsel closed this as completed on Jan 1, 2023
@etemesi254
Owner

Looking into it

@Shnatsel
Contributor Author

Shnatsel commented Jan 1, 2023

If you profile the benchmark with Samply (on either Linux or macOS), you can double-click items in the flame graph to view the source code and see, line by line, where the time goes within a function. I've found it helpful for understanding where time is spent.

This only works while Samply is running; it does not work in profiles shared on the web. That's how I got this image.
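
For anyone who wants to reproduce that kind of profile, here is a rough sketch of the steps wrapped in a shell function. The function name `profile_bench` is made up for illustration. The sketch assumes that cargo is installed, that the current directory is the exrs checkout from the first comment, and that Samply also captures the benchmark process that cargo spawns. If it does not on your platform, build the benchmark with `cargo bench --no-run` and pass the binary path that cargo prints to `samply record` instead.

```shell
# Hypothetical helper (not part of exrs or Samply): profile one benchmark.
profile_bench() {
    # Benchmark filter; defaults to the benchmark named in this issue.
    filter="${1:-read_single_image_non_parallel_zips_rgba}"

    # Install the profiler if it is not already on PATH.
    command -v samply >/dev/null 2>&1 || cargo install samply || return 1

    # Debug info is needed for the line-by-line source view, so enable it
    # for the bench profile through cargo's profile override variable.
    # On Linux, samply may also ask you to relax
    # /proc/sys/kernel/perf_event_paranoid before recording works.
    CARGO_PROFILE_BENCH_DEBUG=true samply record cargo bench "$filter"
}

# Usage, from inside the exrs checkout:
#   profile_bench
#   profile_bench some_other_benchmark_filter
```

Once the run finishes, Samply opens the profile in the browser. Keep the `samply` process running while you inspect the flame graph, since the source view depends on it.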

@etemesi254
Owner

I didn't know about Samply.

Thanks for that, I was wondering how you actually got those values.

@Shnatsel
Contributor Author

Shnatsel commented Jan 1, 2023

I've updated the article about bounds checks with instructions for using Samply on Linux.
