"Improve lodepng" dc1033a makes ECT much slower #4

Closed
AlyoshaVasilieva opened this issue Mar 21, 2016 · 6 comments

Comments

@AlyoshaVasilieva
Contributor

Comparing builds of dc1033a and 83a1fd9 shows that, at least for me, dc1033a is much slower.

I compiled with GCC 5.3.0 (msys2) on Windows 7 x64, adding "-flto -march=native -mtune=native" to C(XX)FLAGS, on an AVX2-capable Intel CPU. The resulting dc1033a binary is approximately 2-3x slower than the 83a1fd9 one. I am testing on large (7MB, ~8 megapixel) 24-bit PNGs.

Command line used: ect -9 --strict --mt-deflate=12 image.png. Hopefully this can be fixed.

Thanks for making ECT.

@fhanau
Owner

fhanau commented Mar 21, 2016

Thank you for the report, can you send me an example picture?

@AlyoshaVasilieva
Contributor Author

I cannot distribute the test images, but I can reproduce the slowdown, to a lesser extent, on this 4000x2000 gradient from Photoshop: https://i.imgur.com/dktlptr.png

83a1fd9 -9 speed: 150 seconds
dc1033a -9 speed: 191 seconds

83a1fd9 -5 speed: 62 seconds
dc1033a -5 speed: 83 seconds

Both commits create identical output.
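
For a sense of scale, the timings above can be turned into slowdown ratios. The following is only a minimal sketch using the four figures reported here; the `slowdown` helper and the `timings` layout are illustrative names, not part of ECT. It works out to roughly 1.27x at -9 and 1.34x at -5, which is noticeably less than the 2-3x seen on the original test images.

```python
# Timings in seconds, copied from the figures reported above.
timings = {
    "-9": {"83a1fd9": 150, "dc1033a": 191},
    "-5": {"83a1fd9": 62, "dc1033a": 83},
}


def slowdown(old_seconds, new_seconds):
    """Return how many times longer the new run took than the old one."""
    return new_seconds / old_seconds


for level, t in timings.items():
    ratio = slowdown(t["83a1fd9"], t["dc1033a"])
    # Print the ratio and the equivalent percentage increase in run time.
    print(f"{level}: {ratio:.2f}x ({(ratio - 1) * 100:.0f}% slower)")
```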

@ghost

ghost commented Mar 23, 2016

Great improvement for 0.3: it's faster and it compresses better, and it's even better without dc1033a. According to my tests, this commit makes ect slower (for the same results).

@fhanau
Owner

fhanau commented Mar 23, 2016

I tested the image. The commit does indeed decrease performance, but only on Windows. On OS X, the performance stays the same (tested with gcc and clang). The problem appears to be in the new lodepng_inflate function, which used to call lodepng's inflate implementation but now uses zlib's.

@fhanau
Owner

fhanau commented Mar 24, 2016

Performance should be back to the old level now, as the patch has been partially reverted.

@ghost

ghost commented Mar 24, 2016

for ect -3 :

v0.3 without dc1033a
3,23 MB (3 395 870 Bytes) --> 43.165s
3,24 MB (3 400 691 Bytes) --> 41.652s (--mt-deflate)

v0.3 with e438159
3,23 MB (3 395 870 Bytes) --> 39.811s
3,24 MB (3 400 691 Bytes) --> 39.171s (--mt-deflate)


for ect -7 :

v0.3 without dc1033a
3,16 MB (3 315 624 Bytes) --> 132.491s
3,16 MB (3 315 624 Bytes) --> 132.398s (--mt-deflate)

v0.3 with e438159
3,16 MB (3 315 624 Bytes) --> 130.370s
3,16 MB (3 315 624 Bytes) --> 130.338s (--mt-deflate)
