-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[0.2.2] -D_GLIBCXX_ASSERTIONS unit test crash + performance issues #917
Comments
Hi! Thanks for the detailed info in the report! I'll look further into this a little later in the week, but as noted in #916, we haven't tested or attempted to support installing Arbor as a shared library. |
The tests crash even with a static library build. Logs attached: |
The crash is actually an assertion failure, because Fedora builds with -D_GLIBCXX_ASSERTIONS to find misuses of the C++ STL classes. This particular assertion failure means a vector access specified an index that is >= the size of the vector. I've seen this happen in a lot of projects. Most often it is because the code attempted to access element 0 of an empty vector. That is the case here. The problem is this code in arbor/include/arbor/schedule.hpp:
That can be called with an empty vector v. Index 0 is invalid in that case. |
Thank you for the diagnosis, @jamesjer! |
That analysis is spot on. You can see why this doesn't cause any runtime errors, but it is highly dubious behaviour. |
We test regularly with valgrind and address sanitizer, and they didn't flag this as an issue. For the release, would you consider compiling with that flag turned off, because it will hurt performance. Some quick tests on my laptop show at least a 25% slow down. |
That's a big impact! That flag has been in use since Fedora 28. You can see the writeup on it here: https://fedoraproject.org/wiki/Changes/HardeningFlags28. Note that the proposers referred to "cheap" range checks. I would not call a 25% slow down cheap. @sanjayankur31 perhaps we should bring this up on fedora-devel-list. |
Just to confirm, I believe using |
Fixes arbor-sim#917. * Avoid use of empty ranges determined by expressions `&vector[i]` where i equals vector.size(). Replace with expressions using `vector.data()` or subrange views.
I've started a thread now: Performance penalties caused by -D_GLIBCXX_ASSERTIONS - devel - Fedora Mailing-Lists Let's see how that goes, and we'll report back here :) |
Avoid use of empty ranges determined by expressions `&vector[i]` where i equals vector.size(). Replace with expressions using `vector.data()` or subrange views. Fixes #917.
The time measurements I gave quite high level: the total time taken in the time stepping loop. |
I did find a link describing the sort of optimization issue that Florian mentions in the reply on the mailing list: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85929 |
We did some digging, and found that some key loops are up to 3x slower with the flag enabled. There are two scenarios where the compiler can't hoist or remove the bounds checks. void foo(std::vector<double>& a, const std::vector<int>& idx) {
for (size_t i=0; i<idx.size(); ++i) {
a[idx[i]] += 42.;
}
} The second is when we access multiple using vec = std::vector<double>;
void foo(vec& a) {
for (size_t i=0; i<a.size(); ++i) {
a[i] += 42.;
}
} The compiler generates less efficient code. // Compiler generates different code
// without vectorization and with bounds checks.
void foo(vec& a, vec& b) {
for (size_t i=0; i<a.size(); ++i) {
a[i] += 42.;
b[i] += 42.;
}
} We verified that this was "fixed" in the code by rewriting the equivalent to the second loop as follows: // Compiler generates optimized code with the flag set
void foo(vec& a, vec& b) {
double* _a = a.data();
double* _b = a.data();
for (size_t i=0; i<a.size(); ++i) {
_a[i] += 42.;
_b[i] += 42.;
}
} For this reason we don't recommend compiling with this flag for release builds, but it is a good idea to use when compiling and running tests (we will add it to our Travis builds!). c.f. the following GodBolt example: |
Just as an addendum, it turns out that the check also breaks auto-vectorization when iterating over a sub-range, e.g. the size check is not hoisted in code like the following:
|
In the code of #917 (comment) (bcumming, here about 12 hours ago) there are no |
@jresler, these are all good points, however the circumstances in which the compiler can hoist/eliminate the check using The examples I gave all used const std::size_t n = a.size();
// compiler can't assert that n==b.size(), so no hoisting of check.
for (size_t i=0; i<n; ++i) {
a[i] += 42.;
b[i] += 42.;
} Multhreading doesn't change the compiler's behavior when we pass a vector by To get optimizations to work, we would have to extract raw pointers using For reference, the function where we observed 3x slowdown is below. We fixed it with the flag enabled by using pointers: The workaround is to either take the performance hit, or use raw pointers, which ironically would make compiling with |
Strong aliasing guarantees allow us (and the compiler) not to have to worry about this happening in this circumstance unless the vectors were over e.g.
In principle there is enough information for the compiler to optimize the |
@bcumming As written, the subroutine with two vectors has a bug, and the bug directly inhibits optimization of the assertion checks. Fixing the bug gives the compiler enough hints to optimize the assertions. @halfflat Strong aliasing restrictions do inhibit problems due to the code of the function itself, but still allow a separate thread to interfere. When I compile this modified code (using
then I get the following code in which the compiler has done moderately well at hoisting the assertion checking:
|
I'm going to open a new issue for the |
I'm seeing a crash while running the unit tests:
Hardware information:
Compiler:
Complete list of packages attached. The complete build log is also attached:
arbor-build-log.txt
arbor-root-log.txt
Additional question: is libarborsup.so meant to be a public library for use? It isn't installed in /usr/lib64 etc, but is required by
unit
to run the tests. Here, I've specifiedLD_LIBRARY_PATH
to ensure it is found.Thanks,
The text was updated successfully, but these errors were encountered: