Make the rexi:stream2 interface unacked message limit configurable#2360

Merged
kocolosk merged 3 commits into apache:master from ksnavely:rexi-stream-limit-configurability
Jan 8, 2020

Conversation


@ksnavely ksnavely commented Dec 13, 2019

Overview

Experimentation with the rexi:stream2 interface's unacked message limit demonstrates possible performance gains for multiple database behaviors.

The rexi:stream2 interface is used in a few different behaviors:

  • replication document streaming
  • _changes feed streaming
  • view rows streaming
  • mango/view rows streaming

Each area deserves performance benchmarking. This change introduces the stream limit configurability but leaves the default value unchanged at 10 unacked messages. Credit to Adam K. and others at Cloudant for recommending the change; I just benchmarked it.
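As a sketch of the configuration surface (assuming the knob lives in a `[rexi]` section under the key `stream_limit`, matching the config lookup discussed in the review), the entry in `rel/overlay/etc/default.ini` might look like:

```ini
[rexi]
; Maximum number of unacknowledged messages a rexi stream2 worker may
; have in flight before it must wait for an ack from the coordinator.
; 10 is the historical hardcoded value and remains the default here.
stream_limit = 10
```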

Testing recommendations

The change itself is pretty simple. The performance impact needs more study but looks promising. I performed some benchmarking for view queries, but more testing is recommended before actually tuning this parameter in production.

View Query throughput

I benchmarked view queries with some infrastructure we have at Cloudant. This was against a 12-node cluster where each node had 24 cores (2x E5-2690 v3 dodeca-core CPUs), 110 GiB of memory, and 4 ~1TB Intel SSDSC2BA012T4 SSDs in a RAID 10 array. 1000 keys were randomly requested from a 10M-document database using the startkey and limit query parameters for 3 minutes.

The default stream limit is 10 unacked messages. I saw interesting results when I lowered that limit to just 5.

| q   | stream_limit | mean rows   | std-dev % | ratio      | 3 std-dev % |
|-----|--------------|-------------|-----------|------------|-------------|
| 8   | 10           | 173614106.7 | 0.12      |            |             |
| 8   | 5            | 187940682   | 0.56      | 1.08251965 | 1.718138528 |
| 32  | 10           | 77188191.67 | 0.22      |            |             |
| 32  | 5            | 99207830    | 0.37      | 1.28527211 | 1.291394595 |
| 96  | 10           | 20762084    | 0.24      |            |             |
| 96  | 5            | 30782750.67 | 0.35      | 1.48264262 | 1.27314571  |
| 192 | 10           | 8904000     | 0.59      |            |             |
| 192 | 5            | 14036990    | 0.21      | 1.57648136 | 1.878776197 |

At q=192 we see a 57% view row throughput increase. A sample of one iteration at rexi.stream_limit = 10 vs. 5 demonstrates an improvement in individual request latency, not just better parallelization. Latency units are microseconds.

stream_limit = 5
elapsed,     window,     n,       ops,  min,      mean,       median,   95th,     99th,     99_9th,   max,      errors
10.000281,   10.000281,  561000,  561,  1134573,  1666722.1,  1617352,  2273697,  2353689,  2389964,  2392113,  0
20.00129,    10.001009,  745000,  745,  1144932,  1335189.9,  1326701,  1462245,  1520764,  1567102,  1573410,  0
30.001287,   9.999997,   802000,  802,  1042538,  1258929.2,  1251155,  1402165,  1453030,  1528361,  1536619,  0
40.001317,   10.00003,   800000,  800,  1032665,  1245629.6,  1248567,  1376081,  1415524,  1452826,  1484954,  0
50.001298,   9.999981,   802000,  802,  1014799,  1252657.8,  1253937,  1367812,  1417968,  1481127,  1481189,  0
60.001318,   10.00002,   797000,  797,  1047839,  1254061.5,  1251303,  1393906,  1450908,  1540917,  1562104,  0
70.001353,   10.000035,  803000,  803,  1045775,  1246969.1,  1246831,  1367293,  1421043,  1470623,  1507962,  0
80.001317,   9.999964,   801000,  801,  1025772,  1245143.2,  1242988,  1373814,  1418670,  1465759,  1485672,  0
90.001304,   9.999987,   793000,  793,  1015760,  1255433.4,  1254121,  1376767,  1450516,  1491983,  1496072,  0
100.001315,  10.000011,  795000,  795,  1069197,  1268790.9,  1262832,  1394740,  1450224,  1495376,  1498029,  0
110.001324,  10.000009,  796000,  796,  1056648,  1253952.9,  1250055,  1372161,  1422495,  1454542,  1458376,  0
120.001316,  9.999992,   794000,  794,  1054098,  1252099.9,  1250818,  1375524,  1427505,  1474370,  1486870,  0
130.001286,  9.99997,    801990,  802,  668611,   1248346.3,  1246323,  1374547,  1429222,  1482653,  1491607,  0
140.001319,  10.000033,  787000,  787,  1065847,  1264999.3,  1257818,  1400785,  1463176,  1529058,  1537051,  0
150.001302,  9.999983,   794000,  794,  1051476,  1262683.0,  1258843,  1406323,  1461536,  1512772,  1526105,  0
160.001321,  10.000019,  790000,  790,  1063021,  1268442.6,  1264129,  1401099,  1453433,  1502549,  1531900,  0
170.001315,  9.999994,   788000,  788,  1056898,  1275569.1,  1269725,  1413619,  1458781,  1502920,  1523379,  0
180.001284,  9.999969,   789000,  789,  1024621,  1259355.8,  1255821,  1392030,  1438607,  1495039,  1504616,  0
180.003949,  0.002665,   0,       0,    1024621,  1259355.8,  1255821,  1392030,  1438607,  1495039,  1504616,  0
stream_limit = 10
elapsed,     window,     n,       ops,  min,      mean,       median,   95th,     99th,     99_9th,   max,      errors
10.000518,   10.000518,  386000,  386,  1656105,  2369116.8,  2192431,  3291076,  3462750,  3668092,  3668092,  0
20.00146,    10.000942,  449000,  449,  1728864,  2140757.7,  2115720,  2492391,  2595034,  2706904,  2708989,  0
30.001481,   10.000021,  493000,  493,  1610482,  2046360.9,  2031778,  2376983,  2568696,  2675727,  2710861,  0
40.001396,   9.999915,   507000,  507,  1565729,  1980248.6,  1968823,  2317640,  2457478,  2542027,  2600528,  0
50.001403,   10.000007,  508000,  508,  1581850,  1982387.3,  1963656,  2306622,  2501209,  2547937,  2598457,  0
60.001474,   10.000071,  505000,  505,  1462136,  1993608.8,  1973884,  2336177,  2563048,  2662358,  2743854,  0
70.001446,   9.999972,   500000,  500,  1532174,  1991161.0,  1981341,  2331385,  2494894,  2703529,  2743854,  0
80.001459,   10.000013,  506000,  506,  1477446,  1996017.6,  1970280,  2346799,  2525981,  2612735,  2646609,  0
90.001481,   10.000022,  495000,  495,  1555075,  1991687.4,  1976678,  2320477,  2454469,  2569207,  2607561,  0
100.001478,  9.999997,   510000,  510,  1546882,  1975390.7,  1964618,  2308398,  2436471,  2518250,  2602680,  0
110.00147,   9.999992,   508000,  508,  1544731,  1976731.7,  1958935,  2322606,  2540712,  2679522,  2694847,  0
120.001395,  9.999925,   494000,  494,  1542079,  1998231.5,  1986667,  2325963,  2480229,  2547717,  2611612,  0
130.001477,  10.000082,  505000,  505,  1567037,  1997824.8,  1987945,  2323675,  2466577,  2497205,  2659075,  0
140.001444,  9.999967,   499000,  499,  1595522,  1984275.0,  1955481,  2337220,  2496933,  2593482,  2659075,  0
150.001447,  10.000003,  496000,  496,  1556501,  2011598.6,  1991053,  2371781,  2482612,  2565511,  2699609,  0
160.001468,  10.000021,  495000,  495,  1565492,  2025735.4,  1998663,  2379759,  2559441,  2638932,  2715489,  0
170.001483,  10.000015,  493000,  493,  1523741,  2026562.1,  2012797,  2360097,  2549477,  2606274,  2709912,  0
180.001402,  9.999919,   493000,  493,  1610915,  2014766.8,  1985032,  2373448,  2531224,  2597175,  2628447,  0
180.003842,  0.00244,    1000,    1,    1610915,  2014251.3,  1984686,  2373448,  2531224,  2597175,  2628447,  0

Related Issues or Pull Requests

None to my knowledge.

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation

ksnavely commented:

FWIW, 1 of the 12 nodes of this cluster was down for maintenance during testing.

@nickva nickva left a comment

Good stuff!

```diff
 stream2(Msg) ->
-    stream2(Msg, 10, 300000).
+    Limit = config:get_integer("rexi", "stream_limit", 10),
+    stream2(Msg, Limit, 300000).
```
A contributor commented on the diff:

Minor nit, but it might be prudent to go stream/1 --> stream/2 --> stream/3 rather than stream/1 --> stream/3 on the off chance the timeout param in stream/2 is converted from a hardcoded value to a config lookup like you did here (we also shouldn't be duplicating magic numbers).

ksnavely (author) replied:

I see what you mean and I agree on not repeating constants.
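The suggested arity chain might look like the following sketch, so that each default value lives in exactly one place (the `stream2` name and the `config:get_integer` lookup are from this PR; everything else here is illustrative):

```erlang
%% Sketch: stream2/1 -> stream2/2 -> stream2/3 delegation. Each lower
%% arity fills in one default, so no constant is duplicated.
stream2(Msg) ->
    Limit = config:get_integer("rexi", "stream_limit", 10),
    stream2(Msg, Limit).

stream2(Msg, Limit) ->
    %% 300000 ms is the existing hardcoded timeout; if it later becomes
    %% a config lookup, only this clause needs to change.
    stream2(Msg, Limit, 300000).
```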

@ksnavely ksnavely force-pushed the rexi-stream-limit-configurability branch from 87020f7 to 2aa2a01 on December 13, 2019 at 22:54

ksnavely commented Dec 17, 2019

I've been performing additional benchmarks recently, focused on replication. I was not able to demonstrate a replication performance gain (e.g. from faster _changes streaming). I was able to demonstrate the opposite, however, finding that a q=192, 1M-doc DB takes ~80% longer to index with the stream limit set to 100.

The stream limit 5 and 10 replication time results were within error of each other for q=192 and 1M docs. I believe this indicates replication's dominant bottleneck is not rexi's streaming performance.


ksnavely commented Jan 6, 2020

Alright, @wenli200133 ran some partitioned query benchmarks for this recently, here is what she found:

| q | stream_limit | mean rows  | std-dev % | ratio    | 3 std-dev % |
|---|--------------|------------|-----------|----------|-------------|
| 1 | 5            | 519961.67  | 0.48      |          |             |
| 1 | 10           | 519948     | 0.82      | 0.999974 | 2.8504736   |
| 1 | 100          | 385213.67  | 30.3      | 0.74087  | 90.933281   |
| 1 | 1000         | 388600.33  | 28        | 1.008792 | 123.76918   |
| 8 | 5            | 2739565.67 | 0.23      |          |             |
| 8 | 10           | 2750676.33 | 1.01      | 1.004056 | 3.1075714   |
| 8 | 100          | 2753663.67 | 0.21      | 1.001086 | 3.0948021   |
| 8 | 1000         | 2763239.67 | 0.55      | 1.003478 | 1.7661823   |

For higher Q values, there was no statistically significant variation in performance with the stream limit. Note the large instability at limits of 100 and 1000 for q=1. Again, this was for 1000 keys queried using startkey and limit.


wohali commented Jan 6, 2020

@ksnavely what does that mean for q=1? This is a pretty common scenario in desktop/embedded CouchDB environments.


ksnavely commented Jan 6, 2020

@wohali for q=1 I don't believe we're going to see a performance change from, e.g., a 10 -> 5 unacked message limit for rexi:stream2. The data collected so far indicates the performance differences taper off as q drops, for this range of the unacked message limit.

Based on testing I would not recommend increasing the limit -- the q=1 case is precisely where Li saw instability at stream limits of 100 and 1000.


kocolosk commented Jan 7, 2020

This is good data, especially because it's a surprising result! (Our hypothesis was that increasing the stream_limit would deliver better throughput for partitioned databases and databases with small Q).

Given that stream_limit=5 matches stream_limit=10 at low Q and outperforms it by 30-50% at high Q, shouldn't we just make that the default?


ksnavely commented Jan 7, 2020

+1 to update the default to 5, but I think we should leave the new configurability in case Couch users encounter unexpected behavior in production environments.

@kocolosk kocolosk merged commit 08d6538 into apache:master Jan 8, 2020
wohali pushed a commit that referenced this pull request Jan 8, 2020
Make the rexi:stream2 interface unacked message limit configurable (#2360)

Also lower the default stream_limit to 5 based on the results of
performance testing.

Co-authored-by: Adam Kocoloski <kocolosk@apache.org>
Co-authored-by: Kyle Snavely <kjsnavely@gmail.com>