Someone interested in this result? #55

Closed
qinxian opened this issue Jun 12, 2013 · 9 comments

qinxian commented Jun 12, 2013

Here are some results from OnePublisherToOneProcessorRawBatchThroughputTest:
Run 0, Disruptor=806,451,612 ops/sec
Run 1, Disruptor=821,018,062 ops/sec
Run 2, Disruptor=1,122,964,626 ops/sec
Run 3, Disruptor=1,164,144,353 ops/sec
Run 4, Disruptor=1,133,144,475 ops/sec
Run 5, Disruptor=1,186,239,620 ops/sec
Run 6, Disruptor=1,175,088,131 ops/sec
Run 7, Disruptor=1,143,510,577 ops/sec
Run 8, Disruptor=1,174,398,120 ops/sec
Run 9, Disruptor=1,153,402,537 ops/sec
Run 10, Disruptor=1,154,068,090 ops/sec
Run 11, Disruptor=1,175,088,131 ops/sec
Run 12, Disruptor=1,133,144,475 ops/sec
Run 13, Disruptor=1,049,868,766 ops/sec
Run 14, Disruptor=1,094,690,749 ops/sec
Run 15, Disruptor=1,164,144,353 ops/sec
Run 16, Disruptor=1,186,239,620 ops/sec
Run 17, Disruptor=1,219,512,195 ops/sec
Run 18, Disruptor=1,207,729,468 ops/sec
Run 19, Disruptor=1,196,888,090 ops/sec
From memory, the best result I have seen is about 1,500M ops/sec, just from a bit of experimentation.
// my busy-spin variant
Run 0, Disruptor=1,583,531,274 ops/sec
Run 1, Disruptor=1,454,545,454 ops/sec
Run 2, Disruptor=1,803,426,510 ops/sec
Run 3, Disruptor=1,728,608,470 ops/sec
Run 4, Disruptor=1,777,777,777 ops/sec
Run 5, Disruptor=1,728,608,470 ops/sec
Run 6, Disruptor=1,908,396,946 ops/sec
Run 7, Disruptor=1,754,385,964 ops/sec
Run 8, Disruptor=1,937,984,496 ops/sec
Run 9, Disruptor=1,706,484,641 ops/sec
Run 10, Disruptor=1,801,801,801 ops/sec
Run 11, Disruptor=1,776,198,934 ops/sec
Run 12, Disruptor=1,855,287,569 ops/sec
Run 13, Disruptor=1,828,153,564 ops/sec
Run 14, Disruptor=1,471,670,345 ops/sec
Run 15, Disruptor=1,801,801,801 ops/sec
Run 16, Disruptor=1,752,848,378 ops/sec
Run 17, Disruptor=1,910,219,675 ops/sec
Run 18, Disruptor=1,828,153,564 ops/sec
Run 19, Disruptor=1,855,287,569 ops/sec

Haha, and this is with my new modest-lock applied to both ends:
Run 0, Concurrentor=1,644,736,842 ops/sec
Run 1, Concurrentor=1,640,689,089 ops/sec
Run 2, Concurrentor=1,968,503,937 ops/sec
Run 3, Concurrentor=1,968,503,937 ops/sec
Run 4, Concurrentor=1,968,503,937 ops/sec
Run 5, Concurrentor=1,968,503,937 ops/sec
Run 6, Concurrentor=1,998,001,998 ops/sec
Run 7, Concurrentor=1,968,503,937 ops/sec
Run 8, Concurrentor=1,968,503,937 ops/sec
Run 9, Concurrentor=1,968,503,937 ops/sec
Run 10, Concurrentor=2,000,000,000 ops/sec
Run 11, Concurrentor=1,968,503,937 ops/sec
Run 12, Concurrentor=1,968,503,937 ops/sec
Run 13, Concurrentor=1,968,503,937 ops/sec
Run 14, Concurrentor=1,968,503,937 ops/sec
Run 15, Concurrentor=2,000,000,000 ops/sec
Run 16, Concurrentor=2,000,000,000 ops/sec
Run 17, Concurrentor=2,000,000,000 ops/sec
Run 18, Concurrentor=1,968,503,937 ops/sec
Run 19, Concurrentor=1,968,503,937 ops/sec

So, is this result of any interest?

mikeb01 commented Jun 12, 2013

Actually that test is a little bit of an inside joke on my part, demonstrating how you can lie with a benchmark. All it is doing is testing how fast the sequencer can signal the consumer, but only on every 10th update. It doesn't do any actual useful work.

How does the modest-lock fare with the OnePublisherToOneProcessorUniCastThroughputTest? Also, do you have a link to the code?

If all I wanted to do was improve that test I could just have one thread polling a sequence with another thread updating it in batches of 10.
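
A minimal sketch of that polling idea (a reconstruction for illustration, not the exact code behind the numbers below; the class name, iteration count and output format are made up): one thread busy-polls an AtomicLong while another publishes in batches of 10, so only one visible write happens per batch.

import java.util.concurrent.atomic.AtomicLong;

public class SimplePollingSketch
{
    static final long ITERATIONS = 1_000_000_000L;
    static final int BATCH_SIZE = 10;
    static final AtomicLong sequence = new AtomicLong(-1);

    public static void main(String[] args) throws InterruptedException
    {
        Thread consumer = new Thread(new Runnable()
        {
            public void run()
            {
                // Busy-poll until the final batch has been published.
                while (sequence.get() < ITERATIONS - 1)
                {
                    // spin
                }
            }
        });

        long start = System.nanoTime();
        consumer.start();

        // Publish in batches of 10: only one visible write per batch.
        for (long i = BATCH_SIZE - 1; i < ITERATIONS; i += BATCH_SIZE)
        {
            sequence.lazySet(i);
        }

        consumer.join();
        long opsPerSec = ITERATIONS * 1_000_000_000L / (System.nanoTime() - start);
        System.out.printf("Simple polling=%,d ops/sec%n", opsPerSec);
    }
}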

Existing Disruptor:
Starting Disruptor tests
Run 0, Disruptor=1,998,001,998 ops/sec
Run 1, Disruptor=2,079,002,079 ops/sec
Run 2, Disruptor=2,157,497,303 ops/sec
Run 3, Disruptor=2,114,164,904 ops/sec
Run 4, Disruptor=2,152,852,529 ops/sec
Run 5, Disruptor=2,205,071,664 ops/sec
Run 6, Disruptor=3,577,817,531 ops/sec
Run 7, Disruptor=3,546,099,290 ops/sec
Run 8, Disruptor=3,610,108,303 ops/sec

Simple polling code:
Starting Disruptor tests
Run 0, Disruptor=6,191,950,464 ops/sec
Run 1, Disruptor=6,042,296,072 ops/sec
Run 2, Disruptor=6,369,426,751 ops/sec
Run 3, Disruptor=6,289,308,176 ops/sec
Run 4, Disruptor=6,389,776,357 ops/sec

qinxian commented Jun 13, 2013

OnePublisherToOneProcessorRawBatchThroughputTest:
It seems both use a batch of 10, yet there is only a little improvement on my machine. Strange! I expected roughly 10x.
Run 0, Disruptor=871,839,581 ops/sec
Run 1, Disruptor=811,030,008 ops/sec
Run 2, Disruptor=1,231,527,093 ops/sec
Run 3, Disruptor=1,320,132,013 ops/sec
Run 4, Disruptor=1,320,132,013 ops/sec
Run 5, Disruptor=1,185,536,455 ops/sec
Run 6, Disruptor=1,143,510,577 ops/sec
Run 7, Disruptor=1,067,235,859 ops/sec
Run 8, Disruptor=1,243,008,079 ops/sec
Run 9, Disruptor=1,268,230,818 ops/sec
Run 10, Disruptor=1,391,788,448 ops/sec
Run 11, Disruptor=1,334,222,815 ops/sec
Run 12, Disruptor=1,320,132,013 ops/sec
Run 13, Disruptor=1,292,824,822 ops/sec
Run 14, Disruptor=1,255,492,780 ops/sec
Run 15, Disruptor=1,267,427,122 ops/sec
Run 16, Disruptor=1,219,512,195 ops/sec
Run 17, Disruptor=1,243,008,079 ops/sec
Run 18, Disruptor=1,164,144,353 ops/sec
Run 19, Disruptor=1,085,187,194 ops/sec

OK, let's get back to the 10:1 pattern.
Indeed, the modest-lock is very simple, just like this:

if ((counter & 1) == 1) Thread.yield();
return counter - 1;
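
For context, a rough sketch of how those two lines might sit inside a busy-wait loop (this is a guess at the surrounding shape, modelled on the counter-based wait methods in the Disruptor's yielding strategies; the real code is in the gist linked in a later comment, and the names here are illustrative). The effect is simply to call Thread.yield() on every other spin.

import java.util.concurrent.atomic.AtomicLong;

final class ModestSpin
{
    // Yield on every other call; otherwise just burn one spin iteration.
    private static long applyModestWait(final long counter)
    {
        if ((counter & 1) == 1)
        {
            Thread.yield();
        }
        return counter - 1;
    }

    // Spin until the cursor reaches the sequence we are waiting for.
    static long waitFor(final long sequence, final AtomicLong cursor)
    {
        long counter = 0;
        long available;
        while ((available = cursor.get()) < sequence)
        {
            counter = applyModestWait(counter);
        }
        return available;
    }
}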

These are the results after applying the modest-lock to this test, still using the write-10:read-1 pattern:
Run 0, Disruptor=1,257,071,024 ops/sec
Run 1, Disruptor=1,292,824,822 ops/sec
Run 2, Disruptor=1,579,778,830 ops/sec
Run 3, Disruptor=1,542,020,046 ops/sec
Run 4, Disruptor=1,506,024,096 ops/sec
Run 5, Disruptor=1,581,027,667 ops/sec
Run 6, Disruptor=1,523,229,246 ops/sec
Run 7, Disruptor=1,506,024,096 ops/sec
Run 8, Disruptor=1,471,670,345 ops/sec
Run 9, Disruptor=1,471,670,345 ops/sec
Run 10, Disruptor=1,506,024,096 ops/sec
Run 11, Disruptor=1,542,020,046 ops/sec
Run 12, Disruptor=1,542,020,046 ops/sec
Run 13, Disruptor=1,543,209,876 ops/sec
Run 14, Disruptor=1,506,024,096 ops/sec
Run 15, Disruptor=1,542,020,046 ops/sec
Run 16, Disruptor=1,506,024,096 ops/sec
Run 17, Disruptor=1,454,545,454 ops/sec
Run 18, Disruptor=1,543,209,876 ops/sec
Run 19, Disruptor=1,506,024,096 ops/sec

Next, change SingleProducerSequencer so its next() reserve operation spins with Thread.yield() instead of parkNanos (see the sketch after these results):
Run 0, Disruptor=1,257,071,024 ops/sec
Run 1, Disruptor=1,292,824,822 ops/sec
Run 2, Disruptor=1,579,778,830 ops/sec
Run 3, Disruptor=1,542,020,046 ops/sec
Run 4, Disruptor=1,506,024,096 ops/sec
Run 5, Disruptor=1,581,027,667 ops/sec
Run 6, Disruptor=1,523,229,246 ops/sec
Run 7, Disruptor=1,506,024,096 ops/sec
Run 8, Disruptor=1,471,670,345 ops/sec
Run 9, Disruptor=1,471,670,345 ops/sec
Run 10, Disruptor=1,506,024,096 ops/sec
Run 11, Disruptor=1,542,020,046 ops/sec
Run 12, Disruptor=1,542,020,046 ops/sec
Run 13, Disruptor=1,543,209,876 ops/sec
Run 14, Disruptor=1,506,024,096 ops/sec
Run 15, Disruptor=1,542,020,046 ops/sec
Run 16, Disruptor=1,506,024,096 ops/sec
Run 17, Disruptor=1,454,545,454 ops/sec
Run 18, Disruptor=1,543,209,876 ops/sec
Run 19, Disruptor=1,506,024,096 ops/sec
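
A simplified sketch of the spot being changed (my reading of the spin loop in SingleProducerSequencer.next(); field names are simplified and the gating sequences are reduced to a single AtomicLong): the producer spins until the slowest consumer has moved far enough that claiming the next slot cannot wrap the ring buffer, and only the back-off inside that spin differs between the variants discussed here.

import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

final class ProducerSpinSketch
{
    enum Backoff { PARK_NANOS, YIELD, MODEST }

    // Spin until the gating (consumer) sequence has passed wrapPoint, i.e. until
    // claiming the next slot can no longer overwrite an unconsumed entry.
    static long waitForCapacity(final long wrapPoint, final AtomicLong gatingSequence, final Backoff backoff)
    {
        long counter = 0;
        long minSequence;
        while (wrapPoint > (minSequence = gatingSequence.get()))
        {
            switch (backoff)
            {
                case PARK_NANOS:
                    LockSupport.parkNanos(1L);   // original behaviour
                    break;
                case YIELD:
                    Thread.yield();              // first variant above
                    break;
                case MODEST:
                    if ((counter & 1) == 1)      // modest-lock: yield on every other spin
                    {
                        Thread.yield();
                    }
                    counter--;
                    break;
            }
        }
        return minSequence;
    }
}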

Apply the modest-lock to the next() reserve operation:
Run 0, Disruptor=1,490,312,965 ops/sec
Run 1, Disruptor=1,489,203,276 ops/sec
Run 2, Disruptor=1,662,510,390 ops/sec
Run 3, Disruptor=1,683,501,683 ops/sec
Run 4, Disruptor=1,662,510,390 ops/sec
Run 5, Disruptor=1,683,501,683 ops/sec
Run 6, Disruptor=1,683,501,683 ops/sec
Run 7, Disruptor=1,662,510,390 ops/sec
Run 8, Disruptor=1,662,510,390 ops/sec
Run 9, Disruptor=1,662,510,390 ops/sec
Run 10, Disruptor=1,661,129,568 ops/sec
Run 11, Disruptor=1,662,510,390 ops/sec
Run 12, Disruptor=1,662,510,390 ops/sec
Run 13, Disruptor=1,662,510,390 ops/sec
Run 14, Disruptor=1,683,501,683 ops/sec
Run 15, Disruptor=1,683,501,683 ops/sec
Run 16, Disruptor=1,662,510,390 ops/sec
Run 17, Disruptor=1,684,919,966 ops/sec
Run 18, Disruptor=1,683,501,683 ops/sec
Run 19, Disruptor=1,684,919,966 ops/sec

BTW, a strange feeling: in the end we are just guessing at what the OS scheduler will do.
All we really need is the right scheduler, but ...

qinxian commented Jun 13, 2013

I created a gist here: https://gist.github.com/qinxian/5771879

mikeb01 commented Jun 13, 2013

Are you running with HyperThreading enabled?

qinxian commented Jun 13, 2013

No!
It's an AMD X3 :)

qinxian commented Jun 13, 2013

BTW, I tried the JDK 8 @Contended annotation on the field.
The padded volatile long field seems faster than the long[] implementation.
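
To make the comparison concrete, a rough sketch of the two layouts being compared (not the Disruptor's actual Sequence class; the @Contended variant assumes JDK 8 with -XX:-RestrictContended so the annotation is honoured on application classes, and AtomicLongArray stands in here for the Unsafe-based array access the real code uses):

import java.util.concurrent.atomic.AtomicLongArray;
import sun.misc.Contended;

final class PaddedCounters
{
    // JDK 8 approach: let the JVM pad the field so it sits on its own cache line.
    static final class ContendedSequence
    {
        @Contended
        volatile long value = -1L;
    }

    // Array approach: put the value in the middle of an oversized array so that
    // neighbouring hot fields cannot share its cache line.
    static final class ArrayPaddedSequence
    {
        private final AtomicLongArray paddedValue = new AtomicLongArray(15);

        long get()        { return paddedValue.get(7); }
        void set(long v)  { paddedValue.lazySet(7, v); }
    }
}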

mikeb01 commented Jun 14, 2013

The difference I see with ModestLock is not as marked as in your results, and the difference on the OnePublisherToOneProcessorUniCastThroughputTest is lower than the noise. I think these small optimisations will vary between hardware platforms. One of the reasons we made the WaitStrategy pluggable is to allow these types of optimisations. If it speeds up your system end to end, then go for it, but don't base your decision on the OnePublisherToOneProcessorRawBatchThroughputTest, as it doesn't test anything useful; base it on your own macro-benchmarks.
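
To show what "pluggable" means in practice, a minimal wiring sketch (Disruptor 3.x DSL; the event class, buffer size and strategy choice are illustrative, not a recommendation): the wait strategy is just a constructor argument, so an application can drop in its own implementation without touching the library.

import java.util.concurrent.Executors;

import com.lmax.disruptor.EventFactory;
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.YieldingWaitStrategy;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.dsl.ProducerType;

public class WaitStrategyWiring
{
    static final class MyEvent { long value; }

    public static void main(String[] args)
    {
        Disruptor<MyEvent> disruptor = new Disruptor<MyEvent>(
            new EventFactory<MyEvent>()
            {
                public MyEvent newInstance() { return new MyEvent(); }
            },
            1024,                               // ring buffer size (power of two)
            Executors.newCachedThreadPool(),    // runs the event processors
            ProducerType.SINGLE,
            new YieldingWaitStrategy());        // <-- swap in any WaitStrategy here

        disruptor.handleEventsWith(new EventHandler<MyEvent>()
        {
            public void onEvent(MyEvent event, long sequence, boolean endOfBatch) { /* consume */ }
        });

        disruptor.start();
    }
}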

I've also had a go with @Contended. It didn't make a massive difference, but it should be a little bit quicker as it would remove one indirection. Unfortunately it will be a while before Java 8 is the standard. I might do a Java 8 specific version if there is enough interest.

qinxian commented Jun 14, 2013

I expected the P10-C10 pattern to show roughly a 2x improvement over the P10-C1 pattern in your results.

Indeed, I always use JDK 8 with Windows 8 on AMD hardware.
In these cases, if both ends employ the modest-lock, that version shows about a 2x effect,
so one speculative deduction is that multiple publishers might be able to profit in a similar way.
From the messages above it seems you test on Intel with HyperThreading, so perhaps it comes down to Intel HT vs. AMD.
For now, though, I am still interested in the modest-lock results on Intel HT.

Of course, these results come from just one test case; whether that is useful or useless depends.
But it is some kind of reference point, right?

BTW, as with my earlier "guessing" remark, there is some sadness about the kernel. Someone like me only works at the high level, with no real willingness (or perhaps ability) to go lower; maybe I can't, maybe the kernel can't. The real world!

BTW, do you plan to refactor WaitStrategy into something more general? I have done some work on that.

mikeb01 commented Jun 15, 2013

I'm going to close this as it's not really an issue, just a discussion, which can happen on the Google Groups page.

mikeb01 closed this as completed Jun 15, 2013