ISPN-4626 Race condition in the LocalEntryRetriever between iterator() a... #2791

gustavocoding

@@ -333,34 +333,33 @@ public Itr(int batchSize) {
@Override
public boolean hasNext() {
boolean hasNext = !queue.isEmpty();
Member

This no longer needs an initial value.

@anistor
Member

anistor commented Aug 11, 2014

I believe this PR also solves ISPN-4594, right?

@gustavocoding
Author

Yes, and also all the failures in the indexless query tests that sometimes return no results. I'm not sure whether they are all being tracked in JIRA.

@anistor
Member

anistor commented Aug 11, 2014

I've removed the initial value of hasNext and integrated. Thanks!

@anistor anistor closed this Aug 11, 2014
@@ -333,34 +333,33 @@ public Itr(int batchSize) {
@Override
public boolean hasNext() {
boolean hasNext = !queue.isEmpty();
if (!hasNext && !completed) {
Member

What is the exact case here? I think we are adding a lot of extra contention by always acquiring the lock. I think what was actually happening is the following, which is different from what you stated:

1. thread 1 sees nothing in the queue, so hasNext is false
2. thread 1 is put to sleep
3. thread 2 adds values and then completes the iterator
4. thread 1 wakes up and sees completed as true
5. thread 1 then evaluates this if condition, which is false because completed is true, so the queue is never rechecked and the stale false is returned

I think the only change required is to remove `&& !completed`. That way it would force a recheck of the queue.
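To make the suggestion concrete, here is a minimal sketch of a `hasNext()` that rechecks the queue whenever the first look found it empty. It is not the Infinispan code: the class name `BlockingItrSketch`, the `add`/`complete` producer methods, and the lock and condition fields are assumptions made for illustration. Only the `queue`, `completed`, and `hasNext` names come from the diff above.

```java
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the iterator under review; not the Infinispan class.
class BlockingItrSketch<E> {
    private final Queue<E> queue = new ConcurrentLinkedQueue<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean completed; // read and written only while holding the lock

    // Producer side: publish one value and wake any waiting consumer.
    public void add(E value) {
        lock.lock();
        try {
            queue.add(value);
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Producer side: mark the end of the data and wake any waiting consumer.
    public void complete() {
        lock.lock();
        try {
            completed = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public boolean hasNext() {
        boolean hasNext = !queue.isEmpty(); // lock-free fast path
        if (!hasNext) { // the racy variant was: if (!hasNext && !completed)
            lock.lock();
            try {
                // Recheck the queue under the lock, so values added just before
                // completion are not missed; wait only while there is neither
                // data nor a completion signal.
                while (!(hasNext = !queue.isEmpty()) && !completed) {
                    changed.awaitUninterruptibly();
                }
            } finally {
                lock.unlock();
            }
        }
        return hasNext;
    }

    public E next() {
        E value = queue.poll();
        if (value == null) {
            throw new NoSuchElementException();
        }
        return value;
    }
}
```

In this sketch the lock is taken only when the fast path finds the queue empty, so a consumer that keeps finding data never contends with the producer.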

Author

The case is this: in a foreach loop over the entryRetriever, iterator() and hasNext() are called implicitly, one right after the other. Constructing the iterator schedules a thread that starts iterating through the data container entries. Sometimes that thread gets delayed, so the call to hasNext() returns false and the client code fails, because the body of the foreach loop is never executed.
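The scenario can be reproduced deterministically with a toy example. This is only a sketch under stated assumptions: `NaiveRetriever` and `DelayedProducerDemo` are hypothetical names, the latch stands in for the scheduling delay, and the non-waiting `hasNext()` is a deliberate simplification, not the LocalEntryRetriever implementation.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

class DelayedProducerDemo {
    // Hypothetical Iterable: iterator() starts a background producer, which a
    // latch holds back to simulate the delayed thread.
    static class NaiveRetriever implements Iterable<String> {
        final CountDownLatch producerMayRun = new CountDownLatch(1);

        @Override
        public Iterator<String> iterator() {
            final Queue<String> queue = new ConcurrentLinkedQueue<>();
            Thread producer = new Thread(() -> {
                try {
                    producerMayRun.await(); // simulated scheduling delay
                    queue.add("entry");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            producer.setDaemon(true);
            producer.start();
            return new Iterator<String>() {
                @Override
                public boolean hasNext() {
                    return !queue.isEmpty(); // never waits for the producer
                }

                @Override
                public String next() {
                    String value = queue.poll();
                    if (value == null) {
                        throw new NoSuchElementException();
                    }
                    return value;
                }
            };
        }
    }

    // Counts how often the foreach body runs while the producer is held back.
    static int countWhileProducerDelayed() {
        NaiveRetriever retriever = new NaiveRetriever();
        int seen = 0;
        for (String ignored : retriever) { // iterator(), then hasNext(), back to back
            seen++;
        }
        retriever.producerMayRun.countDown(); // producer runs now, too late for the loop
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(countWhileProducerDelayed()); // prints 0
    }
}
```

The loop body never runs because the only `hasNext()` call happens before the producer has added anything, which matches the symptom of queries that sometimes return no results.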

Member

The only way I can see false being returned erroneously is the case I described, unless you thought of something else? The sleep doesn't even have to occur; it could just be really good timing :)

@anistor anistor reopened this Aug 11, 2014
@gustavocoding
Author

PR updated with @wburns's suggestion: it seems to solve the race condition with less contention.

@wburns
Member

wburns commented Aug 11, 2014

Looks like this needs a rebase and the previous change undone.

@gustavocoding
Author

Rebased

@wburns
Member

wburns commented Aug 11, 2014

Pulling...

@wburns
Member

wburns commented Aug 11, 2014

Integrated into master, thanks @gustavonalle !

@wburns wburns closed this Aug 11, 2014
@gustavocoding gustavocoding deleted the entryretriever-indexless branch November 27, 2014 10:13