Optimize image conversion for half and float formats. #92291
Force-pushed from 7ad67d6 to d260a2b.
Looks great!
Tested locally, it works as expected.
Benchmark
Using an optimized editor binary (optimize=speed lto=full).
Testing project: image-half-convert-benchmark.zip
(only source format is taken into account in the benchmark)
PC specifications
- CPU: Intel Core i9-13900K
- GPU: NVIDIA GeForce RTX 4090
- RAM: 64 GB (2×32 GB DDR5-5800 C30)
- SSD: Solidigm P44 Pro 2 TB
- OS: Linux (Fedora 39)
The time taken to create an image and convert its format 1,000 times (the image is recreated on every iteration) seems identical before and after this PR. Here it is for reference:
Format | After (this PR) |
---|---|
RH | 77.5 ms |
RGH | 99.5 ms |
RGBH | 119.5 ms |
RGBAH | 140.8 ms |
RF | 65.5 ms |
RGF | 77.1 ms |
RGBF | 75.2 ms |
RGBAF | 79.9 ms |
I'm a bit surprised I don't see any performance improvement in this scenario, but at least it's not slower than before.
Note that the stripped binary size also grows by 8 KB with this PR, likely due to the addition of a template function. It's not a huge issue but I thought it'd be worth mentioning nonetheless.
Thank you for the review!
The first conversion happens between the source 8-bit uint image and the specified half/float format. In order to get the correct results, only the second conversion/match statement has to be measured. Here's the benchmark with the different measurements: image-half-convert-benchmark.zip
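To illustrate the measurement point above, here is a minimal sketch of a timing loop that excludes the setup step (image creation plus the first conversion) and accumulates only the time of the step under test. The helper name `time_measured_step_ms` and its parameters are hypothetical and are not taken from the benchmark project or from Godot.

```cpp
#include <chrono>
#include <functional>

// Hypothetical timing helper (not part of Godot or the benchmark project).
// Runs `setup` untimed and `measured` timed, `iterations` times, and returns
// the accumulated time spent in `measured`, in milliseconds.
double time_measured_step_ms(int iterations,
        const std::function<void()> &setup,
        const std::function<void()> &measured) {
    using clock = std::chrono::steady_clock;
    std::chrono::duration<double, std::milli> total{0.0};
    for (int i = 0; i < iterations; ++i) {
        setup(); // e.g. create the image and do the first (8-bit to half/float) conversion
        const auto start = clock::now();
        measured(); // e.g. the second conversion, the one this PR optimizes
        total += clock::now() - start;
    }
    return total.count();
}
```

With this shape, the cost of recreating the image on every iteration no longer dominates the reported number.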
Thanks!
Optimizes conversion between RGBA variants of half and float image formats.
On average, this makes the conversion process ~7 times faster, depending on the formats. This is especially important for RGB images, which need to be converted to RGBA by most GPUs.
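As a rough illustration of the kind of work involved, the sketch below expands or narrows the channel count of a tightly packed float pixel buffer with a template function, so that each source/destination pair gets its own specialized loop. This is not the code from this PR: the name `convert_channels`, the use of `std::vector<float>`, and the fill values (0 for missing colour channels, 1 for missing alpha) are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not Godot's actual implementation.
// Converts a tightly packed float pixel buffer from SrcChannels to
// DstChannels per pixel. Assumed fill policy: missing colour channels
// become 0, a missing alpha channel becomes 1 (opaque).
template <std::size_t SrcChannels, std::size_t DstChannels>
std::vector<float> convert_channels(const std::vector<float> &src) {
    static_assert(SrcChannels >= 1 && SrcChannels <= 4, "1 to 4 channels");
    static_assert(DstChannels >= 1 && DstChannels <= 4, "1 to 4 channels");
    const std::size_t pixel_count = src.size() / SrcChannels;
    std::vector<float> dst(pixel_count * DstChannels);
    for (std::size_t i = 0; i < pixel_count; ++i) {
        const float *in = src.data() + i * SrcChannels;
        float *out = dst.data() + i * DstChannels;
        // The channel counts are compile-time constants, so the compiler
        // can unroll this inner loop for each instantiation.
        for (std::size_t c = 0; c < DstChannels; ++c) {
            if (c < SrcChannels) {
                out[c] = in[c];
            } else {
                out[c] = (c == 3) ? 1.0f : 0.0f;
            }
        }
    }
    return dst;
}
```

For example, `convert_channels<3, 4>(rgb)` turns an RGBF-style buffer into an RGBAF-style one with opaque alpha. Each instantiation adds code to the binary, which is the usual trade-off of this approach.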
Build - Windows 64-bit, production
CPU: Ryzen 9 5900X 12-Core
RAM: 64 GB DDR4 3000 MHz
4096x4096 image:
MRP: image-half-convert.zip