New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: avoid race condition in tracing code #25624

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@addaleax
Copy link
Member

addaleax commented Jan 21, 2019

json_trace_writer_ is protected by stream_mutex_,
but one access to it was not guarded by a lock on said mutex.

Refs: #25512

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Jan 21, 2019

Stress test CI for parallel/test-trace-events-fs-sync with this PR: https://ci.nodejs.org/job/node-stress-single-test/2142/
Regular CI: https://ci.nodejs.org/job/node-test-pull-request/20250/

@bnoordhuis
Copy link
Member

bnoordhuis left a comment

Aside: is the call to WriteFile() on line 136 sound? FlushPrivate() reads num_write_requests_ while holding the lock, then passes that value to WriteFile() after dropping the lock. It looks like you could observe duplicate request ids.

if (!json_trace_writer_) {
return;
{
Mutex::ScopedLock stream_mutex_lock(stream_mutex_);

This comment has been minimized.

@bnoordhuis

bnoordhuis Jan 22, 2019

Member

Is it necessary to nest locks?1 And if so, can you perhaps add a comment explaining why? It's currently kind of hard to eyeball whether this is correct or a deadlock waiting to happen.

It would be nice to have an abstraction that enforces that locks are only taken out in a specific order so you don't run into AB-BA deadlocks.

1 I think the answer is 'yes' because otherwise there's a race window.

This comment has been minimized.

@addaleax

addaleax Jan 27, 2019

Author Member

Is it necessary to nest locks?1

I think so, yes, for the reason you mentioned. But I am not a hundred percent sure either. I have a hard time figuring out this code myself as well.

And if so, can you perhaps add a comment explaining why? It's currently kind of hard to eyeball whether this is correct or a deadlock waiting to happen.

Added a comment; I did check that the locks are never required in another order, so this shouldn’t be making anything worse

addaleax added some commits Jan 21, 2019

src: avoid race condition in tracing code
`json_trace_writer_` is protected by `stream_mutex_`,
but one access to it was not guarded by a lock on said mutex.

Refs: #25512
@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Jan 27, 2019

is the call to WriteFile() on line 136 sound? FlushPrivate() reads num_write_requests_ while holding the lock, then passes that value to WriteFile() after dropping the lock. It looks like you could observe duplicate request ids.

I can’t say for sure. /cc @nodejs/trace-events

@addaleax addaleax force-pushed the addaleax:tracing-race branch from faf7361 to acabd3e Jan 27, 2019

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Jan 27, 2019

@addaleax

This comment has been minimized.

Copy link
Member Author

addaleax commented Jan 27, 2019

Landed in 06da364

@addaleax addaleax closed this Jan 27, 2019

@addaleax addaleax deleted the addaleax:tracing-race branch Jan 27, 2019

addaleax added a commit that referenced this pull request Jan 27, 2019

src: avoid race condition in tracing code
`json_trace_writer_` is protected by `stream_mutex_`,
but one access to it was not guarded by a lock on said mutex.

Refs: #25512

PR-URL: #25624
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>

addaleax added a commit that referenced this pull request Jan 28, 2019

src: avoid race condition in tracing code
`json_trace_writer_` is protected by `stream_mutex_`,
but one access to it was not guarded by a lock on said mutex.

Refs: #25512

PR-URL: #25624
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>

@targos targos referenced this pull request Jan 29, 2019

Merged

v11.9.0 proposal #25802

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment