Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in AsyncFileImpl writesOutstanding #2012

Closed
purplefox opened this issue Jun 8, 2017 · 10 comments · Fixed by #2017
Closed

Race condition in AsyncFileImpl writesOutstanding #2012

purplefox opened this issue Jun 8, 2017 · 10 comments · Fixed by #2017
Assignees

Comments

@purplefox
Copy link
Contributor

If an AsyncFileImpl is written to from a thread different to the context that created then the long counter writesOutstanding can get into an invalid state. This is because the counter is updated from both the caller's thread and also on the context when a write is complete.

Fix is to use an AtomicLong for writesOutstanding:

https://gist.github.com/purplefox/723fa69a1330fb4f4f74d4a5dd5e6658

@vietj
Copy link
Member

vietj commented Jun 8, 2017

a couple of questions:

  • can you tell the unsynchronized write that happens ? is it the handler passed to the doWrite method that seems unsynchronized when called back
  • is the atomicity of the long necessary ?
  • could it be fixed using synchronized instead or setting the field as volatile ?

@vietj
Copy link
Member

vietj commented Jun 8, 2017

          context.runOnContext((v) -> {
            writesOutstanding -= buff.limit();
            handler.handle(Future.succeededFuture());
          });

@purplefox
Copy link
Contributor Author

I think the issue is that writeOutstanding can be increased here:

https://github.com/eclipse/vert.x/blob/master/src/main/java/io/vertx/core/file/impl/AsyncFileImpl.java#L375

At the same time it is decreased here:

https://github.com/eclipse/vert.x/blob/master/src/main/java/io/vertx/core/file/impl/AsyncFileImpl.java#L395

If write is called from a different thread to the context thread.

I tried using synchronized to fix initially but this resulted in a deadlock. Volatile can't be used as volatile adds aren't atomic.

@vietj
Copy link
Member

vietj commented Jun 9, 2017

can you provide a reproducer ?

@purplefox
Copy link
Contributor Author

Sure, will try and put one together :)

@purplefox
Copy link
Contributor Author

Here's a reproducer that can run inside vertx-core FileSystemTest:

https://gist.github.com/purplefox/6de14775d331395ceeb01168367715f4

It needs to be run with many iterations to reproduce (hence the @repeat)

@purplefox
Copy link
Contributor Author

If you run the test enough times you will find it hangs eventually, this is because the close() method never completes. It doesn't complete because it is waiting for an invalid value of outstanding writes.

@vietj
Copy link
Member

vietj commented Jun 9, 2017

makes sense thank!

@vietj
Copy link
Member

vietj commented Jun 11, 2017

I pushed a branch with a fix, I used synchronized because that's the class synchronization pattern and also it was necessary to read the closeDeferred field that is set in closeInternal using synchronization. The synchronized block does not propagate to handlers to avoid deadlocks.

@vietj
Copy link
Member

vietj commented Jun 12, 2017

thanks for the report, the patch and the reproducer!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants