Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISPN-3665 SingleFileStore is not thread-safe for passivation #2180

Closed
wants to merge 1 commit into from

Conversation

pferraro
Copy link
Member

No description provided.

@pferraro pferraro closed this Oct 26, 2013
@pferraro pferraro reopened this Oct 26, 2013
<xs:attribute name="fsyncMode" type="tns:fsyncMode" default="DEFAULT">
<xs:annotation>
<xs:documentation>
Configures how the file changes will be synchronized with the underlying file system. This property has three possible values (The default mode configured is DEFAULT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated to be 2 instead of 3.

@danberindei
Copy link
Member

The PR should also include the test.

@pferraro
Copy link
Member Author

pferraro commented Nov 4, 2013

@danberindei I've rewritten the fix to address the underlying thread-safety issue, rather than working around it via forcing disk syncing. I've also included the test that reproduced the issue and validates the fix.

@mmarkus
Copy link
Contributor

mmarkus commented Nov 7, 2013

@pferraro test is failing on my machine https://gist.github.com/mmarkus/7352945

@pferraro
Copy link
Member Author

pferraro commented Nov 7, 2013

I see the problem. Another thread can still modify the file entry after FileEntry.waitUnlocked() succeeds. So the waitUnlocked() AND file entry invalidation needs to happen while holding the fileList monitor.
Also, waitUnlocked() returns silently if interrupted, which is also wrong. The caller would assume that no thread holds a lock, but that's not necessarily true.
In general, the threading logic in this class is really sloppy - synchronizing on the file entry itself should be enough to prevent threads from writing to the entry. I recommend someone on your team revisit this at some point.

@wburns
Copy link
Member

wburns commented Nov 8, 2013

Looking closer this is a bit more complicated. I just created https://issues.jboss.org/browse/ISPN-3693 which is probably causing quite a bit of confusion as there is an issue with concurrent get and evicts with passivation when eviction is done manually. Basically it can cause the entry to be lost in that it is neither in the data container or in the store. Once that is fixed this should probably be revisited.

@pferraro
Copy link
Member Author

pferraro commented Nov 8, 2013

I'm not denying that there aren't other issues with eviction, but nevertheless ISPN-3693 doesn't invalidate this fix.

@wburns
Copy link
Member

wburns commented Nov 8, 2013

@pferraro Oh don't get me wrong, I wasn't trying to say this fix was not needed. I was just saying that there are other issues other than just the SingleFileStore and the test you provided will not work properly unless ISPN-3693 is fixed as well.

And I totally agree that SingleFileStore should be revisited to be even more thread safe than it is atm 👍

@pferraro
Copy link
Member Author

pferraro commented Nov 8, 2013

@mmarkus Is this test still failing for you? I've run in many times in a row with 100% success.

@danberindei
Copy link
Member

@pferraro I have finally managed to reproduce a failure with a test that doesn't involve the ActivationInterceptor, and it looks like the fix here is not sufficient. What happens is that _load() releases the read lock on the FileEntry before unmarshalling, but keeps accessing its members keyLen, dataLen, and metadataLen. Another thread can write the entry to another location, add the FileEntry instance to the free list, and then reuse the FileEntry to store another cache entry. I'm closing this PR and I'll open another one with the new fix and the SingleFileStore-only test.

@wburns Do you still need Paul's test for PR #2203? If possible, I'd prefer a test that uses DummyInMemoryStore instead.

@pferraro
Copy link
Member Author

@danberindei Thanks for looking into this. I suggest replacing the keyLen, dataLen, and metadataLen with a volatile reference to an immutable object. It would help improve the throughput of the store, I think, if concurrent readers didn't have to sync on the same monitor to guarantee thread-safe access to these values.

@wburns
Copy link
Member

wburns commented Nov 12, 2013

@danberindei Created https://issues.jboss.org/browse/ISPN-3699 to cover the tests for the other changes since this PR is closed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants