parsec/perf: consensus only sync events (up to 90%) #273

Merged
3 commits merged into maidsafe:master from jeanphilippeD:skip_non_sync_events_for_consensus, Mar 12, 2019

jeanphilippeD commented Mar 12, 2019

The core of the functionality is in:
perf/parsec: process_event for sync event only (Up to 89%)

Additional benchmarks are added for 8192 events in the commit before.
A follow-up commit fixes the main remaining bottleneck for large numbers of peers.

Process our graph as if it only had sync events (Requesting/Request/
Response) and these events contained all the observations in the chain
of self parents back to the previous sync event.

This provides a significant part of the benefits that would have come
from bundling all observations together, without the problematic changes
to the structure of Observation and Vote. It also opens the way to
further improvements for the remaining inefficiencies, and improves both
the SuperMajority and Single ConsensusMode.

Note:
Except for the self last ancestor, all the last_ancestor entries are
Requesting or Request sync events, so only the use of the self parent
needs to be special-cased to find the self_sync_parent.
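
The idea of treating the graph as sync events only can be sketched as below. This is a minimal illustration, not the code from this PR: the `Kind` and `Event` types, the index-based `self_parent` link, and the `self_sync_parent` function signature are all assumptions made for the example. It walks the self-parent chain back from an event and collects the observations found before reaching the previous sync event.

```rust
/// Kind of event in the gossip graph. The variant names follow the PR text;
/// the data layout is hypothetical.
#[derive(Clone, Debug, PartialEq)]
enum Kind {
    Requesting,
    Request,
    Response,
    Observation(String),
}

/// Simplified event: only the fields this sketch needs.
#[derive(Clone, Debug)]
struct Event {
    kind: Kind,
    /// Index of the previous event by the same creator, if any.
    self_parent: Option<usize>,
}

/// Walk the self-parent chain from `start` (exclusive) back to the previous
/// sync event. Returns the index of that sync event, if one exists, together
/// with the observations passed on the way, oldest first.
fn self_sync_parent(events: &[Event], start: usize) -> (Option<usize>, Vec<String>) {
    let mut observations = Vec::new();
    let mut current = events[start].self_parent;
    while let Some(index) = current {
        let event = &events[index];
        match &event.kind {
            // Non-sync event: fold its observation into the sync event at `start`.
            Kind::Observation(payload) => observations.push(payload.clone()),
            // Any other kind is a sync event: this is the self sync parent.
            _ => {
                observations.reverse();
                return (Some(index), observations);
            }
        }
        current = event.self_parent;
    }
    // Reached the start of the chain without meeting a sync event.
    observations.reverse();
    (None, observations)
}
```

With this view, consensus processing only has to visit sync events, and each one carries the observations made since the previous sync event of the same creator.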

jeanphilippeD added some commits Mar 12, 2019

perf/parsec: process_event for sync event only (Up to 89%)
Process our graph as if it only had sync events (Requesting/Request/
Response) and these events contained all the observations in the chain
of self parents back to the previous sync event.

This provides a significant part of the benefits that would have come
from bundling all observations together, without the problematic changes
to the structure of Observation and Vote. It also opens the way to
further improvements for the remaining inefficiencies, and improves both
the SuperMajority and Single ConsensusMode.

Note:
Except for the self last ancestor, all the last_ancestor entries are
Requesting or Request sync events, so only the use of the self parent
needs to be special-cased to find the self_sync_parent.

Note on benchmarks:
Benchmarks for different numbers of peers for 1024 and 8192 events
simulate a constant per-network gossip event rate. This means that
a peer in a 4 node network will see 8 times more events between its
gossip requests than a peer in a 32 node network. This probably explains
why the larger networks get a smaller improvement. It also means that
with a higher event rate, the larger networks will see that improvement.

time cargo bench --features=testing -- --baseline=pr270

Most interesting:
a_node4_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [34.225 ms 36.167 ms 36.547 ms]
                        change: [-63.495% -62.111% -61.027%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [63.649 ms 63.937 ms 64.551 ms]
                        change: [-67.051% -66.822% -66.584%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node16_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [146.67 ms 147.18 ms 148.20 ms]
                        change: [-53.128% -52.322% -51.294%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node32_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [414.51 ms 422.52 ms 426.94 ms]
                        change: [-17.420% -15.272% -13.147%] (p = 0.03 < 0.05)
                        Performance has improved.

a_node4_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [486.23 ms 487.95 ms 494.42 ms]
                        change: [-85.447% -85.233% -85.030%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [941.50 ms 944.90 ms 950.27 ms]
                        change: [-89.063% -89.016% -88.953%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node16_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [1.8589 s 1.8634 s 1.8654 s]
                        change: [-79.882% -79.832% -79.785%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node32_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [3.3108 s 3.3142 s 3.3333 s]
                        change: [-39.731% -39.528% -39.294%] (p = 0.02 < 0.05)
                        Performance has improved.

Other results:
minimal - benches       time:   [863.75 us 865.74 us 868.51 us]
                        change: [-12.561% -12.084% -11.501%] (p = 0.02 < 0.05)
                        Performance has improved.
static - benches        time:   [2.9964 ms 3.0166 ms 3.0234 ms]
                        change: [-16.762% -12.236% -9.4093%] (p = 0.03 < 0.05)
                        Change within noise threshold.
dynamic - benches       time:   [2.1336 ms 2.1678 ms 2.3269 ms]
                        change: [-16.184% -11.844% -7.3155%] (p = 0.03 < 0.05)
                        Change within noise threshold.

a_node4_opaque_evt8 - bench_section_size_evt8
                        time:   [1.0932 ms 1.1058 ms 1.1124 ms]
                        change: [-22.677% -21.470% -20.252%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node4_opaque_evt8 - bench_section_size_evt8_single
                        time:   [958.76 us 974.01 us 1.0071 ms]
                        change: [-14.934% -11.627% -8.1626%] (p = 0.02 < 0.05)
                        Change within noise threshold.
a_node8_opaque_evt8 - bench_section_size_evt8
                        time:   [4.3838 ms 4.4863 ms 4.6149 ms]
                        change: [-21.485% -13.778% -7.3450%] (p = 0.05 > 0.05)
                        No change in performance detected.
a_node8_opaque_evt8 - bench_section_size_evt8_single
                        time:   [5.8706 ms 5.8805 ms 5.9326 ms]
                        change: [-8.1477% -6.7443% -5.4966%] (p = 0.02 < 0.05)
                        Change within noise threshold.
a_node16_opaque_evt8 - bench_section_size_evt8
                        time:   [22.615 ms 23.299 ms 24.866 ms]
                        change: [-3.4956% +0.4349% +6.0293%] (p = 0.91 > 0.05)
                        No change in performance detected.
a_node16_opaque_evt8 - bench_section_size_evt8_single
                        time:   [16.088 ms 16.310 ms 16.982 ms]
                        change: [-4.6727% -1.8646% +0.5703%] (p = 0.29 > 0.05)
                        No change in performance detected.
a_node32_opaque_evt8 - bench_section_size_evt8
                        time:   [200.34 ms 205.51 ms 216.85 ms]
                        change: [-17.264% -9.1455% +0.1371%] (p = 0.20 > 0.05)
                        No change in performance detected.
a_node32_opaque_evt8 - bench_section_size_evt8_single
                        time:   [94.832 ms 101.22 ms 106.51 ms]
                        change: [-22.803% -17.463% -11.818%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node48_opaque_evt8 - bench_section_size_evt8
                        time:   [390.80 ms 391.79 ms 400.77 ms]
                        change: [-15.324% -10.462% -6.5824%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node48_opaque_evt8 - bench_section_size_evt8_single
                        time:   [319.57 ms 326.45 ms 333.85 ms]
                        change: [-13.801% -11.611% -9.5961%] (p = 0.02 < 0.05)
                        Change within noise threshold.

a_node4_opaque_evt16 - bench_section_size_evt16
                        time:   [1.3346 ms 1.3367 ms 1.3407 ms]
                        change: [-29.416% -28.526% -27.025%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node4_opaque_evt16 - bench_section_size_evt16_single
                        time:   [1.0017 ms 1.0022 ms 1.0086 ms]
                        change: [-32.463% -31.021% -29.568%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt16 - bench_section_size_evt16
                        time:   [7.5234 ms 7.5367 ms 7.5693 ms]
                        change: [-16.962% -15.763% -14.077%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt16 - bench_section_size_evt16_single
                        time:   [4.0923 ms 4.1153 ms 4.1342 ms]
                        change: [-7.4917% -6.9419% -6.5670%] (p = 0.02 < 0.05)
                        Change within noise threshold.
a_node16_opaque_evt16 - bench_section_size_evt16
                        time:   [30.776 ms 30.884 ms 30.928 ms]
                        change: [-11.310% -11.015% -10.718%] (p = 0.03 < 0.05)
                        Performance has improved.
a_node16_opaque_evt16 - bench_section_size_evt16_single
                        time:   [18.011 ms 18.015 ms 18.040 ms]
                        change: [-16.933% -13.834% -11.874%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node32_opaque_evt16 - bench_section_size_evt16
                        time:   [149.16 ms 149.49 ms 151.27 ms]
                        change: [-8.9866% -7.2361% -5.7857%] (p = 0.02 < 0.05)
                        Change within noise threshold.
a_node32_opaque_evt16 - bench_section_size_evt16_single
                        time:   [146.74 ms 148.35 ms 149.14 ms]
                        change: [-5.1797% -3.7823% -2.1465%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node48_opaque_evt16 - bench_section_size_evt16
                        time:   [792.21 ms 807.59 ms 887.43 ms]
                        change: [-16.255% -9.0132% -1.0866%] (p = 0.14 > 0.05)
                        No change in performance detected.
a_node48_opaque_evt16 - bench_section_size_evt16_single
                        time:   [199.52 ms 199.87 ms 202.69 ms]
                        change: [-0.9024% +2.1478% +4.3763%] (p = 0.25 > 0.05)
                        No change in performance detected.

PublicIdname754598-001 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [98.431 ms 98.856 ms 99.746 ms]
                        change: [-6.7036% -3.6814% -1.4846%] (p = 0.07 > 0.05)
                        No change in performance detected.
PublicIdname754598-002 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [210.17 ms 211.22 ms 213.73 ms]
                        change: [-2.5255% -1.7659% -0.8347%] (p = 0.03 < 0.05)
                        Change within noise threshold.
PublicIdname754598-003 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [9.1307 ms 9.1523 ms 9.1818 ms]
                        change: [-5.8631% -4.7028% -3.8745%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-001 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [96.557 ms 97.063 ms 98.219 ms]
                        change: [-4.2174% -3.1694% -2.1104%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-002 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [165.38 ms 165.80 ms 166.68 ms]
                        change: [-1.5194% -1.1500% -0.7799%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-003 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [28.050 ms 28.090 ms 28.157 ms]
                        change: [-7.7499% -7.0346% -6.3209%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-004 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [19.703 ms 19.725 ms 19.896 ms]
                        change: [-10.512% -8.0772% -5.8919%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-005 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [8.7149 ms 8.7765 ms 8.8017 ms]
                        change: [-5.0975% -4.4885% -4.1496%] (p = 0.02 < 0.05)
                        Change within noise threshold.
fix/parsec: use efficient look up for consensused key
Profiling showed this was a significant bottleneck for a_node32_opaque_evt8192.
Provides an additional 30% improvement for this test case.

time cargo bench --features=testing -- --baseline=pr270

Most relevant:
a_node4_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [445.93 ms 448.52 ms 451.45 ms]
                        change: [-86.675% -86.479% -86.313%] (p = 0.02 < 0.05)
                        Performance has improved.

a_node8_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [728.25 ms 730.40 ms 731.71 ms]
                        change: [-91.534% -91.509% -91.485%] (p = 0.02 < 0.05)
                        Performance has improved.

a_node16_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [1.1881 s 1.1955 s 1.1989 s]
                        change: [-87.129% -87.076% -87.019%] (p = 0.02 < 0.05)
                        Performance has improved.

a_node32_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [1.8734 s 1.8740 s 1.8797 s]
                        change: [-65.922% -65.840% -65.758%] (p = 0.02 < 0.05)
                        Performance has improved.

Other results:
minimal - benches       time:   [868.85 us 870.34 us 877.36 us]
                        change: [-12.055% -11.421% -10.767%] (p = 0.02 < 0.05)
                        Performance has improved.
static - benches        time:   [2.9915 ms 3.0860 ms 3.1350 ms]
                        change: [-15.862% -11.266% -7.2462%] (p = 0.03 < 0.05)
                        Change within noise threshold.
dynamic - benches       time:   [2.1319 ms 2.2236 ms 2.2647 ms]
                        change: [-16.034% -12.491% -8.7934%] (p = 0.02 < 0.05)
                        Change within noise threshold.

a_node4_opaque_evt8 - bench_section_size_evt8
                        time:   [1.1055 ms 1.1192 ms 1.1534 ms]
                        change: [-21.859% -19.973% -17.698%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node4_opaque_evt8 - bench_section_size_evt8_single
                        time:   [943.56 us 945.36 us 949.63 us]
                        change: [-17.112% -14.841% -11.531%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt8 - bench_section_size_evt8
                        time:   [4.3567 ms 4.3715 ms 4.3980 ms]
                        change: [-23.508% -15.763% -10.625%] (p = 0.03 < 0.05)
                        Performance has improved.
a_node8_opaque_evt8 - bench_section_size_evt8_single
                        time:   [5.9579 ms 6.0012 ms 6.2261 ms]
                        change: [-6.2598% -3.9327% -1.5599%] (p = 0.05 > 0.05)
                        No change in performance detected.
a_node16_opaque_evt8 - bench_section_size_evt8
                        time:   [21.861 ms 22.174 ms 22.210 ms]
                        change: [-6.7347% -5.8173% -5.0704%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node16_opaque_evt8 - bench_section_size_evt8_single
                        time:   [16.264 ms 16.360 ms 16.584 ms]
                        change: [-3.6290% -2.5460% -1.7460%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node32_opaque_evt8 - bench_section_size_evt8
                        time:   [193.11 ms 194.45 ms 201.96 ms]
                        change: [-21.082% -13.400% -4.7305%] (p = 0.06 > 0.05)
                        No change in performance detected.
a_node32_opaque_evt8 - bench_section_size_evt8_single
                        time:   [94.242 ms 94.314 ms 94.446 ms]
                        change: [-24.489% -22.341% -17.815%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node48_opaque_evt8 - bench_section_size_evt8
                        time:   [386.16 ms 387.52 ms 388.61 ms]
                        change: [-16.745% -11.979% -8.1798%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node48_opaque_evt8 - bench_section_size_evt8_single
                        time:   [314.21 ms 315.55 ms 317.78 ms]
                        change: [-16.306% -14.798% -13.569%] (p = 0.02 < 0.05)
                        Performance has improved.

a_node4_opaque_evt16 - bench_section_size_evt16
                        time:   [1.3306 ms 1.3324 ms 1.3364 ms]
                        change: [-29.682% -28.800% -27.322%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node4_opaque_evt16 - bench_section_size_evt16_single
                        time:   [1.0028 ms 1.0049 ms 1.0054 ms]
                        change: [-32.433% -31.014% -29.585%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt16 - bench_section_size_evt16
                        time:   [7.4620 ms 7.4781 ms 7.4898 ms]
                        change: [-17.614% -16.470% -14.846%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt16 - bench_section_size_evt16_single
                        time:   [4.0954 ms 4.1094 ms 4.1157 ms]
                        change: [-7.3918% -7.1215% -6.9166%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node16_opaque_evt16 - bench_section_size_evt16
                        time:   [30.597 ms 30.705 ms 30.873 ms]
                        change: [-11.834% -11.420% -11.005%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node16_opaque_evt16 - bench_section_size_evt16_single
                        time:   [18.083 ms 18.121 ms 18.530 ms]
                        change: [-15.914% -12.801% -10.300%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node32_opaque_evt16 - bench_section_size_evt16
                        time:   [149.96 ms 153.14 ms 154.75 ms]
                        change: [-8.2212% -6.2451% -4.2259%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node32_opaque_evt16 - bench_section_size_evt16_single
                        time:   [147.66 ms 148.26 ms 149.16 ms]
                        change: [-4.8407% -3.6341% -1.9090%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node48_opaque_evt16 - bench_section_size_evt16
                        time:   [789.91 ms 790.47 ms 793.77 ms]
                        change: [-19.714% -13.642% -8.2363%] (p = 0.03 < 0.05)
                        Change within noise threshold.
a_node48_opaque_evt16 - bench_section_size_evt16_single
                        time:   [188.30 ms 189.29 ms 189.53 ms]
                        change: [-6.6240% -3.7598% -1.9640%] (p = 0.05 < 0.05)
                        Change within noise threshold.

a_node4_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [33.873 ms 34.009 ms 34.064 ms]
                        change: [-63.871% -63.737% -63.629%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node8_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [64.243 ms 64.532 ms 66.733 ms]
                        change: [-66.744% -66.247% -65.505%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node16_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [139.01 ms 139.17 ms 140.99 ms]
                        change: [-55.649% -54.852% -53.821%] (p = 0.02 < 0.05)
                        Performance has improved.
a_node32_opaque_evt1024 - bench_section_size_evt1024_interleave
                        time:   [391.29 ms 392.06 ms 394.22 ms]
                        change: [-22.471% -20.500% -19.037%] (p = 0.02 < 0.05)
                        Performance has improved.

PublicIdname754598-001 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [95.567 ms 95.782 ms 96.263 ms]
                        change: [-9.8561% -6.9618% -4.9651%] (p = 0.03 < 0.05)
                        Change within noise threshold.
PublicIdname754598-002 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [205.70 ms 206.15 ms 207.55 ms]
                        change: [-4.4919% -4.0008% -3.5076%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname754598-003 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [8.9738 ms 8.9819 ms 9.0386 ms]
                        change: [-7.5666% -6.3666% -5.5021%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-001 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [94.533 ms 94.867 ms 94.899 ms]
                        change: [-6.5344% -5.6461% -5.0402%] (p = 0.03 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-002 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [161.00 ms 161.78 ms 162.57 ms]
                        change: [-4.1497% -3.6312% -3.2090%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-003 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [27.668 ms 27.765 ms 27.802 ms]
                        change: [-8.9997% -8.2817% -7.5652%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-004 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [19.410 ms 19.551 ms 19.620 ms]
                        change: [-11.656% -9.3311% -7.2150%] (p = 0.02 < 0.05)
                        Change within noise threshold.
PublicIdname93b63e-005 - bench_routing/mock_crust_merge_merge_three_sections_...
                        time:   [8.7049 ms 8.7325 ms 8.7830 ms]
                        change: [-5.2068% -4.6517% -4.3455%] (p = 0.02 < 0.05)
                        Change within noise threshold.
test/parsec: add benchmarks for 8192 events (8 times denser than 1024)
8 times more events, 8 times fewer gossip events per step:
This simulates 8 times more events in the same time period.

Avoid the expensive check_unexpected_accusations at each step in network.rs.
Do it only at the end if !options.intermediate_consistency_checks.
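
The deferred check described above can be sketched as follows. This is only an illustration under assumptions: the `Options` struct is reduced to the one flag named in the commit message, and `run_simulation` with its `check` callback is a hypothetical stand-in for the loop in network.rs and for check_unexpected_accusations.

```rust
/// Hypothetical reduction of the test options to the single flag used here.
struct Options {
    intermediate_consistency_checks: bool,
}

/// Run `steps` simulation steps. The expensive `check` runs after every step
/// only when intermediate checks are requested; otherwise it runs once at the end.
fn run_simulation(options: &Options, steps: usize, mut check: impl FnMut()) {
    for _ in 0..steps {
        // ... one simulation step would execute here ...
        if options.intermediate_consistency_checks {
            check();
        }
    }
    if !options.intermediate_consistency_checks {
        // Single final check replaces the per-step checks.
        check();
    }
}
```

For a benchmark with many steps this replaces one check per step with a single check, while tests that need early failure detection can still opt in to the per-step behaviour.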

Test:

time cargo bench --features=testing -- --save-baseline=pr270 8192

a_node4_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [3.2884 s 3.3444 s 3.3689 s]
a_node8_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [8.5874 s 8.6039 s 8.6202 s]
a_node16_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [9.2225 s 9.2276 s 9.2536 s]
a_node32_opaque_evt8192 - bench_section_size_evt8192_interleave
                        time:   [5.4791 s 5.4896 s 5.5005 s]
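
These baseline timings can be cross-checked against the "change" figures reported for the later commits. The helper below is a small sketch (the function name is hypothetical) that computes the relative change between two point estimates. Criterion derives its own change estimate from the samples, so the result is expected to agree only approximately with the reported value.

```rust
/// Relative change in percent from a baseline time to a new time.
/// Negative values mean the new run is faster.
fn relative_change_percent(baseline_secs: f64, new_secs: f64) -> f64 {
    (new_secs - baseline_secs) / baseline_secs * 100.0
}

/// a_node8_opaque_evt8192: baseline 8.6039 s, 944.90 ms after the
/// process_event commit. Returns roughly -89.0, close to the reported -89.016%.
fn node8_evt8192_change() -> f64 {
    relative_change_percent(8.6039, 0.94490)
}
```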

@jeanphilippeD jeanphilippeD requested review from Fraser999 and fizyk20 Mar 12, 2019

@jeanphilippeD jeanphilippeD removed the request for review from pierrechevalier83 Mar 12, 2019

@@ -314,10 +314,11 @@ impl MetaElection {
self.meta_events.retain(|event_index, _| {
event_index.topological_index() >= new_consensus_start_index
});
let decided_keys_lookup: FnvHashSet<_> = decided_keys.iter().collect();

@fizyk20

fizyk20 Mar 12, 2019

Contributor

Is this necessary? Seems like an additional allocation that doesn't gain us anything.

@jeanphilippeD

jeanphilippeD Mar 12, 2019

Author Contributor

Yes, it is in its own commit as a follow up.
There can be 100's or 1000's of keys: 30% improvement.

@fizyk20

fizyk20 Mar 12, 2019

Contributor

Ah, I see now. This probably changes .contains() from linear to constant time? Makes sense, then 👍
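
The point made in this thread can be illustrated with a small sketch. It is not the code from the PR: the diff uses FnvHashSet, while this example uses the standard library HashSet to stay dependency-free, and the `u64` key type and both function names are assumptions for illustration.

```rust
use std::collections::HashSet;

/// Each lookup scans `decided_keys`, so the total cost grows with
/// pending.len() * decided_keys.len().
fn retain_undecided_linear(pending: &mut Vec<u64>, decided_keys: &[u64]) {
    pending.retain(|key| !decided_keys.contains(key));
}

/// Build the lookup set once (one extra allocation), after which each
/// lookup takes constant time on average.
fn retain_undecided_hashed(pending: &mut Vec<u64>, decided_keys: &[u64]) {
    let decided_keys_lookup: HashSet<u64> = decided_keys.iter().copied().collect();
    pending.retain(|key| !decided_keys_lookup.contains(key));
}
```

Both functions produce the same result. The extra allocation pays off once there are hundreds or thousands of decided keys, which is the case described in the reply above.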

@pierrechevalier83 pierrechevalier83 merged commit a75fcb4 into maidsafe:master Mar 12, 2019

1 check passed

continuous-integration/appveyor/pr AppVeyor build succeeded

@jeanphilippeD jeanphilippeD deleted the jeanphilippeD:skip_non_sync_events_for_consensus branch Mar 12, 2019
