Fix few tsan issues #16710
fe7bc07 to 1a23c28
1a23c28 to 5ddaa6b
threads.emplace_back([&driver, initialObserver, initialState, iters,
                      threadSeed = gen(), result = &results[thrIdx]] {
  // the random device is thread_local
  RandomGenerator::initialize(RandomGenerator::RandomType::MERSENNE);
RandomGenerator::initialize resets the thread_local pointer to null and sets the global enum type to Mersenne.
So this call makes no sense here, because:
- it causes a data race on type, which we can initialize before the threads are created;
- the thread_local pointer is already null when a thread is created, and this thread is created right here.
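The two points above can be sketched in isolation. This is only an illustration, not ArangoDB's RandomGenerator: `g_type`, `t_device`, `Device` and `runWorkers` are hypothetical names. The global is written once before any thread starts, and every new thread finds its own thread_local pointer null without any reset.

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the generator state discussed in the comment.
enum class RandomType { Mersenne, Other };
struct Device {
  RandomType type;
};

RandomType g_type = RandomType::Other;          // process-wide, not atomic
thread_local std::unique_ptr<Device> t_device;  // one per thread, starts null

// Returns how many workers found their thread_local pointer null on entry.
int runWorkers(int n) {
  g_type = RandomType::Mersenne;  // single write, before any worker exists
  std::atomic<int> sawNull{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([&sawNull] {
      if (t_device == nullptr) {
        sawNull.fetch_add(1);  // fresh thread: there is nothing to reset
      }
      // Workers only read g_type; thread creation orders the write above
      // before this read, so there is no data race.
      t_device = std::make_unique<Device>(Device{g_type});
    });
  }
  for (auto& t : threads) {
    t.join();
  }
  return sawNull.load();
}
```

Calling `runWorkers(4)` is expected to return 4, since each worker starts with a null pointer of its own.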
@@ -27,6 +27,7 @@
#include "Aql/AqlTransaction.h"
#include "Aql/ExecutionEngine.h"
#include "Aql/Timing.h"
#include "Aql/QueryCache.h"
Some magic (I don't understand how it worked before), but I can't build without this include (incomplete type). I changed only the shared_ptr creation.
@@ -62,6 +62,8 @@ enum class TraversalProfileLevel : uint8_t {
struct QueryOptions {
  QueryOptions();
  explicit QueryOptions(arangodb::velocypack::Slice);
  QueryOptions(QueryOptions&&) noexcept = default;
It is moved in some place but didn't have a move constructor :(
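One common way a type ends up without a move constructor is a user-declared copy constructor, which suppresses the implicit move constructor so that `std::move` silently copies. Whether that was the cause for QueryOptions is not stated here; the sketch below uses hypothetical structs (`OptionsNoMove`, `OptionsWithMove`) purely to show the effect of adding the defaulted `noexcept` move constructor.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type: the user-declared copy constructor suppresses the
// implicitly generated move constructor.
struct OptionsNoMove {
  OptionsNoMove() = default;
  OptionsNoMove(OptionsNoMove const& other)
      : name(other.name), copies(other.copies + 1) {}
  std::string name;
  int copies = 0;  // counts how often this value went through a copy
};

// Same type, with the move constructor explicitly defaulted.
struct OptionsWithMove {
  OptionsWithMove() = default;
  OptionsWithMove(OptionsWithMove const& other)
      : name(other.name), copies(other.copies + 1) {}
  OptionsWithMove(OptionsWithMove&&) noexcept = default;
  std::string name;
  int copies = 0;
};

// Without a move constructor, "moving" falls back to the copy constructor.
static_assert(!std::is_nothrow_move_constructible_v<OptionsNoMove>);
static_assert(std::is_nothrow_move_constructible_v<OptionsWithMove>);

// Constructs a T from an rvalue and reports how many copies were made.
template <class T>
int copiesAfterMove() {
  T a;
  a.name = "query";
  T b(std::move(a));
  return b.copies;
}
```

Here `copiesAfterMove<OptionsNoMove>()` reports one copy, while `copiesAfterMove<OptionsWithMove>()` reports none.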
tsan_arangodb_suppressions.txt
Outdated
# TODO Fix data race in arangodump
race:arangodump

# TODO Fix data race in arangorestore
race:arangorestore
A lot of races in random code
deadlock:consensus::Agent::setPersistedState

# TODO Fix data race in arangodbtests
race:DummyConnection::sendRequest
It's a silly race, but difficult to fix because the tests need to be corrected.
Right now they are badly written.
thread:CacheManagerFeature::start

# TODO Fix lock order inversion
deadlock:consensus::Agent::setPersistedState
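For context on the suppression above: a lock order inversion arises when two code paths take the same pair of locks in opposite orders. The sketch below is not ArangoDB's agency code; `Agent`, `State` and the functions are hypothetical stand-ins. It shows one possible way to make both call directions safe, by acquiring both mutexes together with `std::scoped_lock`, which uses a deadlock-avoidance algorithm. Agreeing on a single fixed lock order would be another option.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins; each object owns its own mutex.
struct Agent {
  std::mutex mtx;
  int term = 0;
};
struct State {
  std::mutex mtx;
  int persisted = 0;
};

// Path that starts in the agent and then touches the state.
void agentCallsState(Agent& a, State& s) {
  std::scoped_lock lock(a.mtx, s.mtx);  // both locks acquired together
  ++a.term;
  s.persisted = a.term;
}

// Path that starts in the state and then touches the agent.
void stateCallsAgent(State& s, Agent& a) {
  std::scoped_lock lock(s.mtx, a.mtx);  // argument order does not matter
  ++a.term;
  s.persisted = a.term;
}

// Runs both paths concurrently and returns the final persisted value.
int runBothPaths(int iterations) {
  Agent a;
  State s;
  std::thread t1([&] {
    for (int i = 0; i < iterations; ++i) agentCallsState(a, s);
  });
  std::thread t2([&] {
    for (int i = 0; i < iterations; ++i) stateCallsAgent(s, a);
  });
  t1.join();
  t2.join();
  return s.persisted;
}
```

With nested `std::lock_guard`s taken in opposite orders, the same two threads could block each other forever.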
Should be fixed by the cluster team.
"Agent calls State", but sometimes "State calls Agent".
# TODO Fix known thread leaks
thread:ClusterFeature::startHeartbeatThread
thread:CacheManagerFeature::start
Should be fixed by the cluster team.
Also not critical: the threads are leaked only after the application stops.
4df7985 to 7bc64a2
std::mutex m;
std::condition_variable cv;
This is a bug. What if the timeout expires and the thenFinal callback still has not been called?
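Assuming the concern is that the completion callback refers to a mutex and condition variable that live on the waiter's stack, a timed-out waiter would return and destroy them while the callback can still fire later. A minimal sketch of one way around this keeps the synchronization state in a `std::shared_ptr` co-owned by the callback and reports the timeout to the caller. `WaitState`, `WaitStatus`, `makeCompletion` and `waitFor` are hypothetical names, not the futures API in this PR.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>

enum class WaitStatus { Ready, Timeout };

// Synchronization state shared between the waiter and the completion callback.
struct WaitState {
  std::mutex m;
  std::condition_variable cv;
  bool done = false;
};

// Builds the callback that marks completion. It co-owns the state, so it
// stays valid even if it runs after the waiter has already timed out.
std::function<void()> makeCompletion(std::shared_ptr<WaitState> state) {
  return [state = std::move(state)] {
    {
      std::lock_guard<std::mutex> guard(state->m);
      state->done = true;
    }
    state->cv.notify_one();
  };
}

// Waits until completion or until the timeout expires, and says which.
WaitStatus waitFor(std::shared_ptr<WaitState> const& state,
                   std::chrono::milliseconds timeout) {
  std::unique_lock<std::mutex> lock(state->m);
  bool ready = state->cv.wait_for(lock, timeout, [&] { return state->done; });
  return ready ? WaitStatus::Ready : WaitStatus::Timeout;
}
```

The predicate overload of `wait_for` also handles spurious wakeups, and the return value lets the caller tell a timeout from a real completion.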
FutureStatus wait_until(
    const std::chrono::time_point<Clock, Duration>& timeout_time) {
  detail::waitImpl(*this, timeout_time);
  return FutureStatus::Ready;
Nice)
/// waits for the result, returns if it is not available
/// for the specified timeout duration. Future must be valid
template<class Rep, class Period>
FutureStatus wait_for(
    const std::chrono::duration<Rep, Period>& timeout_duration) {
  return wait_until(std::chrono::steady_clock::now() + timeout_duration);
}

/// waits for the result, returns if it is not available until
/// specified time point. Future must be valid
template<class Clock, class Duration>
FutureStatus wait_until(
    const std::chrono::time_point<Clock, Duration>& timeout_time) {
  detail::waitImpl(*this, timeout_time);
  return FutureStatus::Ready;
Used only in its own tests.
@@ -51,7 +51,7 @@ static void tryToConnectExpectFailure(f::EventLoopService& eventLoopService,
      [&](f::Error, std::unique_ptr<f::Request>,
          std::unique_ptr<f::Response>) {});

  auto success = wg.wait_for(std::chrono::seconds(5));
This failed locally for VST.
arangod/Aql/ClusterQuery.cpp
Outdated
try {
  _traversers.clear();
} catch (...) {
}
IMHO this should be kept in the ClusterQuery dtor as before (then we can also keep the traversers private).
Ok
Done
35fbf4f to 556d501
(cherry picked from commit af11ab0)
* Fix few tsan issues (#16710)
(cherry picked from commit af11ab0)
* Fix data race on query vptr
* Update arangod/Aql/Query.h
Co-authored-by: Jan <jsteemann@users.noreply.github.com>
* Apply suggestions from code review
* Destroy query before destroy traverses
Co-authored-by: Jan <jsteemann@users.noreply.github.com>
Co-authored-by: Vadim Kondratyev <vadim@arangodb.com>