asan issue with Clang 6 and std::stable_sort w/ indirect_adapter #139
Comments
I switched the C++17 branch to Clang 7 and it seems that the error disappeared, which is quite surprising considering that it occurred at every single run with Clang 6. Two possibilities come to my mind:
Whatever it is I'm keeping this issue open for now. |
On my ubuntu 18.04 with clang
I cannot reproduce this error: `CXX=clang++-6.0 CPPSORT_SANITIZE=undefined cmake -B build . -DCPPSORT_SANITIZE=undefined,address -DCMAKE_CXX_COMPILER=/usr/bin/clang++-6.0 -DCMAKE_EXPORT_COMPILE_COMMANDS=On`
`(cd build && ctest -I 304,305)` Prints
Full output is: Like you said, with later versions it already wasn't happening. I assume that - unless I'm missing some obvious flag - this is no longer an issue on |
I also repeated using clang-6.0 on that old commit:
Of course with the old variable naming in cmake invocation
Now it repros
Looking at it |
Interesting, so some magic change made the bug disappear since then? |
The bug or false-positive. I'll keep you posted. |
Okay, so I sat with it first (before just bisecting, which will likely find that problem is gone with 2ab06c1). For many hours.

There's bad news. The problem seems traceable to the indirection projection. This is an immutable lambda, but still it ostensibly changes during execution (as part of `projection_compare`). I really can't decide yet whether that's a miscompilation, or I'm missing some source of UB lurking there.

I added a unit test `issue_139` on branch `issue_139`. The unit test does not repro the ASAN diagnostic, but it *does* manifest the source of the UB, where the captured `&proj` reference (originally include/cpp-sort/adapters/indirect_adapter.h:L123) gets clobbered.

The good news is you can see this happening. The bad news is it's not just with Clang 6. (In fact, with GCC-10 + ASAN it actually reports _AddressSanitizer: stack-buffer-overflow_.)

What follows is some of the thinking/observations that went into making the repro:

-----

- I figured it might have been down to lambda codegen, but rewriting it as a callable struct made no difference. (Omitted from the unit test for brevity.)
- Next I figured it might be due to some weird codegen issue with empty-element `tuple<>` optimization in `projection_compare` (because replacing the tuple by its elements **did** remove the symptom). But sketching an "adhoc" projection_compare in terms of `tuple` does **not** display the same symptom. [Included in the repro code in case you want to check]

Among the weirder observations (if that's not enough for you):

- depending on how I sequence the tests, the first invocation of `projection_compare::compare()` does the right thing (`&proj == projptr`), but a second invocation fails.
- depending on whether I phrase taking the address of `proj` as `&proj` or `std::addressof(proj)`, I can get different diagnostics some of the time: at times ASAN croaked about invoking a member on a null pointer when using `std::addressof`. ¯\\_(ツ)_/¯
- It's weird that this manifested with std_sort only. Then again, the repro case fails with other compilers/clang versions just as well, even though that never tripped ASAN before.

I fail to see something wrong with the actual code. Could it really be just... miscompilation? Pretty scary. Maybe I'm just going crazy and the error is right before our eyes. I'd appreciate a second pair of eyes. Meanwhile, I'm sleeping on it. Maybe tomorrow brings fresh ideas.

-----

Apparent workarounds:

- capturing `proj` by value
- replacing `tuple` projection_compare::data by its parts works
- perhaps it is gone with 2ab06c1 (I still wish to understand the phenomenon I'm seeing)
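For readers who want to see what the "capturing `proj` by value" workaround changes, here is a minimal sketch. It is not the code from `indirect_adapter.h`: `add_offset`, `indirect_by_reference` and `indirect_by_value` are hypothetical names, and the lambdas only assume that the indirection projection dereferences an iterator before applying the user's projection.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stateful projection, used only to make the capture mode observable
struct add_offset
{
    int offset;
    int operator()(int value) const { return value + offset; }
};

// The closure refers to the caller's object: it is only valid
// for as long as that object is alive and at the same address
template<typename Projection>
auto indirect_by_reference(Projection& proj)
{
    return [&proj](auto it) { return proj(*it); };
}

// The closure owns its own copy of the projection,
// so its validity no longer depends on the caller's object
template<typename Projection>
auto indirect_by_value(Projection proj)
{
    return [proj](auto it) { return proj(*it); };
}
```

Mutating the original projection after building both closures shows which one still aliases it. The by-value form trades a copy for independence from the lifetime of whatever `proj` was bound to.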
Sleep brought me the convincing realization that there /must/ be UB if so many compilers/versions agree. Is it worth putting this through CI? I want to rule out mis-builds due to my build system. |
Ah. Got it. I think. The tuple elements in Also checking back:
Fires the second static_assert. I did check earlier what the actual deduced types were. However, (a) it looked unsuspicious and (b) just tracing the names made the ASAN diagnostic go away. By now, since I've established my own way of detecting the badness in the UT, I don't need the ASAN diagnostic to know it went bad. So I can see it. Will prepare a PR if still relevant to existing release(s). |
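The kind of deduction pitfall discussed here can be shown in isolation. This is a sketch under assumptions, not cpp-sort's implementation: `naive_holder`, `make_naive`, `make_decayed`, `less_like` and `identity_like` are hypothetical names, and it assumes the holder's tuple element types come straight from forwarding-reference deduction.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical holder: stores whatever types it is instantiated with
template<typename Compare, typename Projection>
struct naive_holder
{
    std::tuple<Compare, Projection> data;

    naive_holder(Compare comp, Projection proj):
        data(std::forward<Compare>(comp), std::forward<Projection>(proj))
    {}
};

// Forwards the deduced types unchanged: lvalue arguments deduce T&,
// so the tuple ends up holding references to the caller's objects
template<typename Compare, typename Projection>
auto make_naive(Compare&& comp, Projection&& proj)
    -> naive_holder<Compare, Projection>
{
    return { std::forward<Compare>(comp), std::forward<Projection>(proj) };
}

// Decays the deduced types first: the tuple always owns copies
template<typename Compare, typename Projection>
auto make_decayed(Compare&& comp, Projection&& proj)
    -> naive_holder<std::decay_t<Compare>, std::decay_t<Projection>>
{
    return { std::forward<Compare>(comp), std::forward<Projection>(proj) };
}

struct less_like { bool operator()(int a, int b) const { return a < b; } };
struct identity_like { int operator()(int x) const { return x; } };

inline void deduction_checks()
{
    less_like comp;
    identity_like proj;

    auto naive = make_naive(comp, proj);
    static_assert(std::is_same<decltype(naive.data),
                               std::tuple<less_like&, identity_like&>>::value,
                  "lvalue arguments are stored as references");

    auto decayed = make_decayed(comp, proj);
    static_assert(std::is_same<decltype(decayed.data),
                               std::tuple<less_like, identity_like>>::value,
                  "decayed arguments are stored by value");
}
```

With the naive factory, a holder built from lvalues that go out of scope before the holder is used would carry dangling references, which is the class of bug a sanitizer may or may not catch depending on stack layout.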
Thanks a lot for analyzing this so thoroughly. After taking a look at the library, I likely have the issue in the following places:
Apparently the bug in |
It is not relevant anymore, you fixed it in 89eb52d
It's not the fix I had in mind though. I had this:
Why remove a reference that shouldn't be there? And in the process unqualify cv as well |
I guess that the Cargo cult if you will. |
Yeah I had a hunch that |
I virtually don't have users, so I don't plan to fix old versions unless someone specifically asks for it :p I guess that your solution is strictly better than the one using Thanks again for the investigation, without that the issue might have remained open forever 😄 |
Cheers. I would not retain the unit test. Even though it was able to pick up on this UB before ASAN did some of the time, it will not detect unexpected issues anyways. It was a narrow tool for closed analysis :) Likewise, I thought better of adding the If you are positive that there will not be a use for storing projection/comparator by reference I'm actually happier with the remove_reference_t approach¹. I guess it comes down to the question whether decaying plays well with e.g. constant_function etc. I didn't delve into that territory of the library, so I don't know all the ramifications. ¹(/OT is there a practical difference between |
To answer the last question: I replaced I don't have a use case for storing comparator/projection by reference, so I'll probably keep using |
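Since the thread weighs `remove_reference_t` against decaying, the standard traits can be compared side by side. The sketch below uses only standard `<type_traits>` facilities. `strip_cvref_t` is a hypothetical alias written out by hand (equivalent in effect to C++20's `std::remove_cvref_t`), and nothing here is taken from cpp-sort itself.

```cpp
#include <type_traits>

// Hypothetical helper: removes the reference, then top-level cv-qualifiers
template<typename T>
using strip_cvref_t = std::remove_cv_t<std::remove_reference_t<T>>;

// const lvalue reference to an object type
using cref = const int&;
static_assert(std::is_same<std::remove_reference_t<cref>, const int>::value,
              "remove_reference_t keeps the const");
static_assert(std::is_same<strip_cvref_t<cref>, int>::value,
              "stripping cv and ref yields the bare type");
static_assert(std::is_same<std::decay_t<cref>, int>::value,
              "decay_t agrees for plain object types");

// reference to a function, e.g. a free function passed as a comparator
using fn = bool(int, int);
static_assert(std::is_same<std::remove_reference_t<fn&>, fn>::value,
              "a function type is left as-is and cannot be a data member");
static_assert(std::is_same<strip_cvref_t<fn&>, fn>::value,
              "same for the cv/ref-stripping alias");
static_assert(std::is_same<std::decay_t<fn&>, fn*>::value,
              "decay_t turns it into a storable function pointer");

// reference to an array
static_assert(std::is_same<strip_cvref_t<int(&)[3]>, int[3]>::value,
              "the array type is preserved");
static_assert(std::is_same<std::decay_t<int(&)[3]>, int*>::value,
              "decay_t turns it into a pointer");
```

For function objects and lambdas the three traits differ only in whether `const` survives. The practical difference shows up when a plain function is passed, where only the decayed type can be stored in a tuple or as a member.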
Asan finds a stack buffer overflow with the following test and the `std::stable_sort` equivalent:

I've no idea why. Instinctively I would say that it's linked to `projection_compare` because `std_sorter` has to use it, but it could also be linked to the new `indirect_adapter` that accepts projection-only sorters. It's probably a mix of both, but I can't find the actual issue.

Here is the failing job with the full asan log: https://travis-ci.org/Morwenn/cpp-sort/jobs/471096685