Decompression error at higher optimization levels #4

Open
nemequ opened this issue Sep 16, 2015 · 5 comments

@nemequ
Contributor

nemequ commented Sep 16, 2015

I'm pretty sure that this is basically the same issue as lz4/lz4#79; the discussion around that issue should be informative.

With GCC 5.1.1 (at least) at -O3 (or -O2 -ftree-loop-vectorize -fvect-cost-model) on x86_64, wfLZ segfaults during decompression at wfLZ.c:563, which is

*( (ureg_t*)dst ) = *( (ureg_t*)cpySrc );

Note that Yann blogged about some research he did into accessing unaligned data which you might be interested in. The resulting GCC bug report may also be interesting.
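
For illustration, here is a minimal sketch of one common way to avoid this class of crash: do the word-sized copy through `memcpy` instead of dereferencing a cast pointer. The cast tells the compiler that both pointers are aligned for `ureg_t`, so a vectorizer may emit aligned SIMD loads and stores that fault on unaligned addresses. The names `example_ureg_t` and `example_copy_word` are hypothetical and not part of wfLZ, and this is not necessarily the fix the maintainer would choose.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for wfLZ's ureg_t: one machine word. */
typedef uintptr_t example_ureg_t;

/*
 * Copy one machine word from src to dst without assuming that either
 * pointer is aligned. memcpy carries no alignment assumption, so the
 * compiler cannot legally turn this into an aligned vector access.
 */
static void example_copy_word(uint8_t *dst, const uint8_t *src)
{
    example_ureg_t tmp;
    memcpy(&tmp, src, sizeof tmp); /* unaligned-safe load  */
    memcpy(dst, &tmp, sizeof tmp); /* unaligned-safe store */
}
```

With optimization enabled, compilers typically lower a fixed-size `memcpy` like this to a plain load and store on x86_64, so it should cost little compared with the pointer cast, though that is worth confirming with a benchmark.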

@ShaneYCG
Owner

Thanks for the report. I will review this and likely apply a GCC+ARM check to disable unaligned access. I have an ARMv7 board lying around that I've slacked on setting up; now seems like a good time!

@nemequ
Contributor Author

nemequ commented Sep 16, 2015

It isn't just ARM. This happens on x86_64.

@ShaneYCG
Owner

Ah, thanks. I will set up a Linux/BSD box with a newer GCC to run my tests on too.

@nemequ
Contributor Author

nemequ commented Sep 18, 2015

If you want I can give you SSH access to an x86_64 machine which runs into this. It's an old budget laptop (a Toshiba Satellite A205-S5805) which I really only use these days to run the Squash Benchmark. It's running Fedora 21 (with GCC 4.9.1).

@xcrh
xcrh commented Nov 16, 2015

I guess this could also fail on Windows, once mingw ships a reasonably recent gcc. You could get mingw(-64) and compile with -O3 to see what happens.

While -O3 is fairly aggressive and known to be unsafe at times, it is what actually makes things FAST, and in a compression algo that is really handy. E.g. -O3 seems to be "best of the best" for LZ4, which plays in the same league of blazing-fast LZ algos. And according to the LZ4 author, gcc even beats MSVS by 10% these days, at least for his algo.

3 participants