-connect (ClusterConnected()) behaves strangely when compiled without -g #2

Closed
jewettaij opened this issue Mar 5, 2019 · 3 comments

Comments

@jewettaij
Owner

"Heisenbug" ?
The "-connect" argument of "filter_mrc" (which invokes the "ClusterConnected()" function in filter3d.hpp) is behaving strangely, but only when compiled with gcc with optimizations and OpenMP enabled.

If you compile it using the settings located in "for_debugging_and_profiling/setup_gcc_linux_dbg.sh" (which uses the -g3 flag), these problems go away. This is a serious bug, partly because running the code without OpenMP makes it almost intolerably slow. I will look into this soon.

jewettaij added a commit that referenced this issue Mar 6, 2019
…). I'll try to fix whatever is wrong with ClusterConnect() so that we can put them back in eventually, because the code is ~4 times slower without optimization.
@jewettaij
Owner Author

jewettaij commented Mar 6, 2019

The problem actually had nothing to do with the "-g" flag or OpenMP or multithreading (whew). Instead, the problem occurred when the code was compiled with optimizations (the -O1, -O2, or -O3 compiler flags) using GCC.

I removed these flags from the compiler options (currently located in the "setup_gcc.sh" files). This enabled me to go back to using OpenMP, but it still resulted in performance that was roughly 4x slower than before.

For now, using the CLANG compiler instead of GCC seems to totally fix this problem. (The resulting binaries seem to be about 20% faster too.)

Incidentally, I ran valgrind on "filter_mrc" using:
valgrind --tool=memcheck --leak-check=yes --show-reachable=yes --num-callers=20 --track-fds=yes filter_mrc ...
and it did not find any errors. So memory errors of the kind memcheck detects do not appear to be the source of the problem. I'll keep playing with valgrind's other tools to see if I can track this down.

It's possible there is a bug in the code, but it could also be a compiler glitch. (It would not be the first time I encountered one in gcc.) For now, I'm going to paper over this problem by moving to CLANG.

@jewettaij
Owner Author

More thorough checking with other valgrind tools failed to discover any problems. Perhaps this is copping out, but I'm leaning towards calling this a bug in GCC optimization. (Again, it would not be the first glitch I've run into with GCC. The old pre-3.0 compilers were a nightmare.) Either way, it would be nice if this code worked on all compilers. If I have time, I'll try tinkering with the code in "ClusterConnected()" to try and coax this code into behaving nicely with GCC. For now, use CLANG.
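
One way to narrow down a problem like this, short of removing -O1/-O2/-O3 from the whole build, is to switch off optimization for a single suspect function and check whether the misbehavior follows that function. The sketch below is only an illustration under stated assumptions: NO_OPTIMIZE and CountAboveThreshold() are hypothetical names that do not appear in filter3d.hpp, and the function body is a placeholder standing in for a routine such as "ClusterConnected()". It relies on GCC's "optimize" function attribute (which the GCC manual recommends for debugging only) and Clang's "optnone" attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper macro (not part of the project): disables optimization
// for one function so the rest of the program can still be built with -O2/-O3.
// Clang also defines __GNUC__, so it must be checked first.
#if defined(__clang__)
#  define NO_OPTIMIZE __attribute__((optnone))
#elif defined(__GNUC__)
#  define NO_OPTIMIZE __attribute__((optimize("O0")))
#else
#  define NO_OPTIMIZE
#endif

// Placeholder for a suspect routine. The real function would keep its own
// body; only the NO_OPTIMIZE annotation would be added in front of it.
NO_OPTIMIZE
std::size_t CountAboveThreshold(const std::vector<float>& voxels,
                                float threshold) {
  std::size_t count = 0;
  for (float v : voxels) {
    if (v > threshold) {
      ++count;
    }
  }
  return count;
}
```

If the strange behavior disappears once the suspect function is annotated this way, the problem is probably localized to how that function is optimized. That alone does not say whether the cause is undefined behavior in the code or a compiler bug.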

jewettaij added a commit that referenced this issue Jul 14, 2021
…ory-read errors which were harmless and probably not the cause of the "Heisenbug" problems I had earlier with the gcc compiler (issue #2 in the github tracker).
@jewettaij
Owner Author

jewettaij commented Aug 10, 2021

Since upgrading my compiler from gcc version 7.5 to gcc 9.3, this problem seems to have mysteriously corrected itself.

(Incidentally, the function where the problem occurred has been renamed from "ClusterConnected()" to "LabelConnected()". I don't know whether the change in compiler or the small changes to the code since 2019 fixed the problem, or whether it has really been fixed at all.)
