DefaultPromise may notify Listeners in wrong order #2157

Closed
wants to merge 3 commits into
base: 4.0
from

normanmaurer commented Jan 27, 2014

This pull request is not complete yet, but it shows a race in DefaultPromise that can cause FutureListeners to be notified in the wrong order.

One way to fix this is to enlarge the scope of the synchronization, but I would like to find a better way to fix it.

This can happen if addListener is called from a thread other than the EventExecutor used by the DefaultPromise, and the DefaultPromise is notified while these listeners are being added.
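
To make the ordering problem concrete, here is a single-threaded replay of one interleaving that could produce it. The interleaving itself is an assumption (the description above does not spell it out), and `ListenerOrderRaceReplay`, `executorQueue`, `pending` and `notified` are hypothetical names; nothing here is Netty code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical replay of an assumed interleaving; it does not use Netty.
final class ListenerOrderRaceReplay {

    static List<String> replay() {
        // Stand-in for the executor's FIFO task queue.
        Queue<Runnable> executorQueue = new ArrayDeque<>();
        // Listeners stored on the promise before it completes.
        List<String> pending = new ArrayList<>();
        // Order in which listeners end up being notified.
        List<String> notified = new ArrayList<>();

        // Thread A: addListener(l1) before completion, so l1 is stored.
        pending.add("l1");

        // Thread B: marks the promise as done, but has not yet queued
        // the task that notifies the stored listeners.
        boolean done = true;

        // Thread A: addListener(l2) sees the promise as done and queues
        // a notification task for l2 right away.
        if (done) {
            executorQueue.add(() -> notified.add("l2"));
        }

        // Thread B: only now queues the task for the stored listeners.
        executorQueue.add(() -> notified.addAll(pending));

        // Executor thread: runs the queued tasks in FIFO order.
        Runnable task;
        while ((task = executorQueue.poll()) != null) {
            task.run();
        }
        return notified;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [l2, l1]
    }
}
```

In this replay l2 was added after l1 but is notified first, which is the kind of reordering the pull request is about.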

normanmaurer commented Jan 27, 2014

@trustin any good ideas?

ghost commented Jan 27, 2014

Build result for #2157 at f60eeac: Success

ghost commented Feb 3, 2014

Build result for #2157 at 6ca8e87b461260f37f4ada9a4b48aac5d398c6ea: Failure

normanmaurer commented Feb 3, 2014

@trustin please review... We could also use Unsafe (if present) to replace Atomic*FieldUpdater, as Unsafe is faster because it does not do any validation (which Atomic*FieldUpdater does).
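
For reference, a minimal sketch of the standard JDK `AtomicIntegerFieldUpdater` pattern being discussed. `FieldUpdaterSketch`, `Counter` and `size` are hypothetical names chosen for illustration, and the Unsafe-based replacement proposed above is not shown.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical example class; not taken from the pull request.
final class FieldUpdaterSketch {

    static final class Counter {
        // One shared updater per class instead of one AtomicInteger per instance.
        private static final AtomicIntegerFieldUpdater<Counter> SIZE_UPDATER =
                AtomicIntegerFieldUpdater.newUpdater(Counter.class, "size");

        // The updated field must be a volatile int.
        private volatile int size;

        int increment() {
            return SIZE_UPDATER.incrementAndGet(this);
        }

        int decrement() {
            return SIZE_UPDATER.decrementAndGet(this);
        }
    }

    static int demo() {
        Counter counter = new Counter();
        counter.increment();
        counter.increment();
        return counter.decrement();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1
    }
}
```

The appeal of this pattern is that each instance only carries a plain `volatile int` field; the comment above argues that the updater's per-call validation is the cost an Unsafe-based variant would avoid.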

normanmaurer commented Feb 3, 2014

The performance difference is also discussed here:
https://groups.google.com/forum/#!msg/mechanical-sympathy/X-GtLuG0ETo/_KcpM7T8ZX4J

ghost commented Feb 3, 2014

Build result for #2157 at 60489c1940fe3ff2e308bda4412c97efb51d993d: Success

normanmaurer commented Feb 4, 2014

@daschl please also review

ghost commented Feb 4, 2014

Build result for #2157 at 6ccad82e3d8de2a3b0ae49bfa3706b73a074707c: Success

ghost commented Feb 4, 2014

Build result for #2157 at 72482f0858673a4432a28e8bee8552bf4b3e105a: Success

normanmaurer commented Feb 4, 2014

Also, regarding performance: with this branch I get slightly better performance (reproducible every time), most likely because of the Atomic*FieldUpdater changes:

This branch:

[nmaurer@ear]~/wrk% ./wrk  -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 8 --pipeline 16 http://localhost:8080/plaintext
Running 2m test @ http://localhost:8080/plaintext
  8 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.09ms    2.96ms  55.53ms   89.68%
    Req/Sec   223.13k    77.56k  403.55k    64.58%
  199582048 requests in 2.00m, 26.95GB read
Requests/sec: 1663182.82
Transfer/sec:    229.99MB

4.0 branch:

[nmaurer@ear]~/wrk% ./wrk  -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 8 --pipeline 16 http://localhost:8080/plaintext
Running 2m test @ http://localhost:8080/plaintext
  8 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.08ms    2.86ms  52.51ms   89.43%
    Req/Sec   222.70k    77.23k  410.67k    64.37%
  199282272 requests in 2.00m, 26.91GB read
Requests/sec: 1660680.58
Transfer/sec:    229.64MB

trustin commented Feb 4, 2014

Could UnsafeAtomic*FieldUpdater be extracted into separate .java files?

common/src/test/java/io/netty/util/concurrent/DefaultPromiseTest.java (outdated)

trustin commented Feb 4, 2014

I wish it did not create a new entry when there is only one listener.


normanmaurer commented Feb 5, 2014

I will see what I can do... Apart from your comments, does the rest look good and ready to pull in once I have addressed them?

On 05.02.2014 at 00:36, Trustin Lee notifications@github.com wrote:

I wish it does not create a new entry when there's only one listener.

ghost commented Feb 5, 2014

Build result for #2157 at 91d413322f407e9c49556289ce2a2d9d4f2f7c62: Success

ghost commented Feb 5, 2014

Build result for #2157 at 85a6b905f3589ac334257929e1b779e4f8ca57c7: Success

normanmaurer commented Feb 5, 2014

@trustin addressed all your comments. Please review again and let me know what you think now ;)

ghost commented Feb 5, 2014

Build result for #2157 at 7de37ca4c29d4ed8bdf59d2d7cc9403167bb3abb: Success

normanmaurer commented Feb 6, 2014

@trustin ok to merge?

ghost commented Feb 6, 2014

Build result for #2157 at a8e84db1eabcb2e6559a23c4f655ce411f163afc: Success

Norman Maurer added some commits Jan 27, 2014

Norman Maurer
Rework the DefaultPromise to eliminate race reported as [#2157]
This also removes some more synchronization and now makes heavy use of atomic operations. The downside is that it creates more objects when FutureListeners are added, but this was the only performant way I could find to keep the notification order correct without heavy synchronization.
Norman Maurer
Provide an optimized AtomicIntegerFieldUpdater, AtomicLongFieldUpdater and AtomicReferenceFieldUpdater

Make use of these in the new DefaultPromise but also in other areas.

ghost commented Feb 6, 2014

Build result for #2157 at 2fbc42a: Success

normanmaurer commented Feb 6, 2014

Squashed into 2 commits

if (!isDone()) {
    if (listeners == null) {
        listeners = listener;
        Entry newEntry = new Entry(listener);

@trustin

trustin Feb 6, 2014

Member

Looks like you create a new Entry even if the listener is the first one.

PROGRESSIVE_SIZE_UPDATER.decrementAndGet(this);
}
break;
}

@trustin

trustin Feb 6, 2014

Member

This loop looks odd to me. The first if (entry == null || ...) block and the second if (...) { if (...) { block are pretty much the same, except that the second one decrements the counter, which means the second block will never be evaluated.

}
boolean isRemoved() {
    return curr == REMOVED_LISTENER;

@trustin

trustin Feb 6, 2014

Member

Could you help me understand why null is not used here?

// This can either be GenericListener or Entry

@trustin

trustin Feb 6, 2014

Member
  • This comment should be for entry.
  • s/GenericListener/GenericFutureListener/
  • Please use /** ... */ and {@link ...}
}
}
return this;
entry = entry.next;

@trustin

trustin Feb 6, 2014

Member
  • This loop is vulnerable to a race when more than one thread attempts to append a listener. When one thread succeeds in appending a listener, the other thread can overwrite that reference, and thus we lose the listener.
  • Also, you are traversing the linked list sequentially, which is potentially slow. Perhaps we could maintain a reference to the last listener?

@normanmaurer

normanmaurer Feb 6, 2014

Member

Yes... but this will make it even harder to avoid synchronized, and I thought it would not be a big problem. Maybe I could also just use synchronized and be happy ;)

trustin commented Feb 6, 2014

How about this:

  1. Add a volatile field that tells whether the primary notification has actually been made. It will be set from the executor thread once all listeners added so far have been notified (see notifyListeners()).
  2. In addListener() we check whether the volatile field is set. If it is set, it is safe to use the current execution path. If it is not set, always use executor.execute() for notification.
  3. In the notification task we create in addListener(), we always check the volatile field first. If the primary notification has not been made yet, we re-schedule the task instead of notifying. If the volatile field is set, it is safe to notify immediately.

This basically uses executor's task queue for ordering.
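
The three steps above can be sketched roughly as follows. This is a toy model under stated assumptions, not Netty's DefaultPromise and not the fix that was eventually merged: `OrderedPromiseSketch`, `LateNotification`, `markDone()` and `schedulePrimaryNotification()` are hypothetical names, a single-threaded JDK `ExecutorService` stands in for the EventExecutor, and step 2 is simplified so that a listener added after completion always goes through the executor and the task itself checks the flag. Completion is split into two methods only so that `demo()` can force the problematic interleaving.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical toy promise; names and structure are assumptions.
final class OrderedPromiseSketch {
    private final ExecutorService executor;                   // single-threaded stand-in for the EventExecutor
    private final List<Runnable> pending = new ArrayList<>(); // guarded by "this"
    private boolean done;                                     // guarded by "this"
    private volatile boolean primaryNotified;                 // step 1: set on the executor thread

    OrderedPromiseSketch(ExecutorService executor) {
        this.executor = executor;
    }

    void addListener(Runnable listener) {
        synchronized (this) {
            if (!done) {
                pending.add(listener);
                return;
            }
        }
        // Step 2 (simplified): already done, so notify via the executor.
        executor.execute(new LateNotification(listener));
    }

    void markDone() {
        synchronized (this) {
            done = true;
        }
    }

    void schedulePrimaryNotification() {
        executor.execute(() -> {
            List<Runnable> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(pending);
                pending.clear();
            }
            for (Runnable listener : snapshot) {
                listener.run();
            }
            primaryNotified = true; // step 1: primary notification is complete
        });
    }

    private final class LateNotification implements Runnable {
        private final Runnable listener;

        LateNotification(Runnable listener) {
            this.listener = listener;
        }

        @Override
        public void run() {
            if (primaryNotified) {
                listener.run();         // step 3: safe to notify now
            } else {
                executor.execute(this); // step 3: re-schedule behind the primary task
            }
        }
    }

    static List<String> demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch latch = new CountDownLatch(2);
        OrderedPromiseSketch promise = new OrderedPromiseSketch(executor);
        try {
            promise.addListener(() -> { order.add("l1"); latch.countDown(); });
            promise.markDone();
            // Added after completion but before the primary task is queued.
            promise.addListener(() -> { order.add("l2"); latch.countDown(); });
            promise.schedulePrimaryNotification();
            if (!latch.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("listeners were not notified in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [l1, l2]
    }
}
```

One trade-off visible in the sketch: a late notification task keeps re-queuing itself until the primary notification has run, so it can spin briefly if the completing thread is slow to schedule the primary task.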

ghost commented Feb 6, 2014

Build result for #2157 at d3a1274: Success

normanmaurer commented Feb 6, 2014

@trustin I tried this and it has not worked out so far. Not sure if we are just trying to be too smart..

normanmaurer commented Feb 6, 2014

I will look into this issue again this evening..

trustin commented Feb 7, 2014

#2186 shows an alternative fix

normanmaurer commented Feb 7, 2014

Was fixed by 309ee68

@normanmaurer normanmaurer deleted the promise_race branch Feb 8, 2014
