Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#3675] Fix livelock issue in MpscLinkedQueue #3684

Closed
wants to merge 1 commit into from
Closed

Conversation

normanmaurer
Copy link
Member

Motivation:

All read operations should be safe to execute from multiple threads which was not the case and so could produce a livelock.

Modifications:

Modify methods so these are safe to be called from multiple threads.

Result:

No more livelock.

@normanmaurer normanmaurer self-assigned this Apr 21, 2015
@normanmaurer normanmaurer added this to the 4.0.28.Final milestone Apr 21, 2015
@normanmaurer
Copy link
Member Author

@trustin @Scottmitch please review...

@gsoltis please check if this fix the problem you reported in #3675


@Override
public Object[] toArray() {
return toList(16).toArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the runtime type of toList(..) have to be ArrayList<>? Can you just change the signature of toList(..) to accept a List<..> since it is private? That way we don't have to duplicate the 16 is the default size logic here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Scottmitch sorry but I not understand what you are proposing.. IF I understood you right you may want to have toList accept a List<...>, I just not see why this is better then what we have now. Can you explain ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did 16 come from? Is this a value we have explicitly decided is good...or are we just using it because it is also the default value of some common collection implementations in Java? If we have justification why this makes sense...then never mind me (but add a comment :) ). Otherwise if we don't have any rational for the size and just choose it because that is what default implementations choose then lets just let the implementation choose the default (i.e. new ArrayList<>()) and change the parameter from a int to a List<>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Scottmitch The 16 came from the mighty @normanmaurer =P Joking aside I just thought it may not be a bad fit as queues often have some items in there. That said I not care too much about this method as we not use any of these methods that call it in our code-base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we not use any of these methods that call it in our code-base.

Sounds like the mighty @normanmaurer is setting a trap for our users :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly it is in the internal package... That said if you have a better idea shoot 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed not a big deal...but why have magic numbers when you can not have them...

Change this:

private List<E> toList(int initialCapacity) {
..
}
@Override
public Object[] toArray() {
    return toList(16).toArray();
}

to this:

private List<E> toList(List<E> elements) {
..
}
private List<E> toList(int initialCapacity) {
  return toList(new ArrayList<E>(initialCapacity));
}
private List<E> toList() {
  return toList(new ArrayList<E>());
}
@Override
public Object[] toArray() {
  return toList().toArray();
}

then also search and replace the other magic 16 number floating around in this file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@normanmaurer
Copy link
Member Author

@netkins build

@normanmaurer
Copy link
Member Author

@gsoltis do you have any updates?

@gsoltis
Copy link
Contributor

gsoltis commented Apr 27, 2015

Unfortunately, no updates yet from my end. The only thing I can add is that
a few more servers hit the issue, this time after about 2 weeks. Still
working on getting the fix into production.

On Mon, Apr 27, 2015 at 12:18 PM, Norman Maurer notifications@github.com
wrote:

@gsoltis https://github.com/gsoltis do you have any updates?


Reply to this email directly or view it on GitHub
#3684 (comment).

@normanmaurer
Copy link
Member Author

Ok thanks!

On 27 Apr 2015, at 21:26, Greg Soltis notifications@github.com wrote:

Unfortunately, no updates yet from my end. The only thing I can add is that
a few more servers hit the issue, this time after about 2 weeks. Still
working on getting the fix into production.

On Mon, Apr 27, 2015 at 12:18 PM, Norman Maurer notifications@github.com
wrote:

@gsoltis https://github.com/gsoltis do you have any updates?


Reply to this email directly or view it on GitHub
#3684 (comment).


Reply to this email directly or view it on GitHub #3684 (comment).

@@ -197,22 +211,16 @@ public boolean contains(Object o) {
@Override
public Iterator<E> iterator() {
return new Iterator<E>() {
private MpscLinkedQueueNode<E> node = peekNode();
private final Iterator<E> it = toList(16).iterator();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@normanmaurer
Copy link
Member Author

@trustin please review as well

@normanmaurer
Copy link
Member Author

@trustin ping :)

@trustin
Copy link
Member

trustin commented May 4, 2015

@normanmaurer Looks good to me. I wonder if the upstream already has a better fix for this issue. Did you have a chance to talk with him?

@normanmaurer
Copy link
Member Author

@normanmaurer
Copy link
Member Author

@trustin that said as far as I know the upstream queue impl does not reuse "nodes" as head so I think we need to be extra careful, and that is why I think we should use my change ;)

WDYT ?

@trustin
Copy link
Member

trustin commented May 4, 2015

@normanmaurer OK. No worries then.

@normanmaurer
Copy link
Member Author

@trustin cool :) Cherry-pick then ?

@normanmaurer
Copy link
Member Author

@trustin btw let me also add an integer overflow check just to be 100% safe

Motivation:

All read operations should be safe to execute from multiple threads which was not the case and so could produce a livelock.

Modifications:

Modify methods so these are safe to be called from multiple threads.

Result:

No more livelock.
@normanmaurer
Copy link
Member Author

cherry-picked into 4.0 (9f6e06d) , 4.1 (cf66edb) and master (3642fc4)

@normanmaurer normanmaurer deleted the live_lock_fix branch May 6, 2015 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants