-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance issue with the redundancy operations #3
Comments
That's outrageous number! Just a sanity test, did you benchmark it with --release? |
Anyway, welcome! |
@tyt2y3 Yes, I am testing it with the --release and set opt-level = 3 in the Cargo.toml file. According to the output file from the objdump, this is not optimized because this view() function stores these values from the registers on the stack and then takes them out of the stack when they are needed. So I believe that this inline keyword does not change anything, and my test also shows that the running time did not change after adding this inline keyword for view() function. |
Sad. Thank you for the investigation. PR would be welcomed. Though I would prefer keeping the view() function in place because other crates depended on it. |
Hi, I submitted a pull request. Here's the link #5 (comment) |
visioncortex/src/color_clusters/builder.rs
Line 401 in 402bd07
This view() function will create a ClustersView structure inside a loop, but for the rest code of this loop, only the width, height and clusters_indices will be used. And for every loop iteration, this view() function will create the same values again and again which introduce a lot of redundancy operations. Perhaps we can dispense with this view() function and instead directly reference the self first and then get the needed variables inside these functions.
If we change the code of
visioncortex/src/color_clusters/builder.rs
Line 416 in 402bd07
Into
(Need to do some modifications with the neighbours() function)
Change
visioncortex/src/color_clusters/builder.rs
Lines 438 to 439 in 402bd07
Into
(Need to modified the deepen function too.)
Change
visioncortex/src/color_clusters/builder.rs
Line 443 in 402bd07
Into
Then comment out the line 401 and 436.
visioncortex/src/color_clusters/builder.rs
Line 401 in 402bd07
visioncortex/src/color_clusters/builder.rs
Line 436 in 402bd07
Then it can avoid these redundant operations.
According to my test it will get a 43.8% speedup (7.3s to 5.1s) with the vtracer application (https://github.com/visioncortex/vtracer).
Hope this information helps!
The text was updated successfully, but these errors were encountered: