clientv3: Upgrade to round robin balancer based on gRPC 1.12 balancer API #9860

Merged
merged 39 commits into from Jun 16, 2018

jpbetz
Contributor

@jpbetz jpbetz commented Jun 15, 2018

To simplify balancer failover logic, leverage gRPC's new load balancer API, and ease gRPC dependency upgrades, we've rewritten the etcd clientv3 load balancer implementation. This PR merges the new load balancer development branch into master.

Design: docs/client-architecture.rst

Benchmark: https://github.com/coreos/dbtester/tree/master/test-results/2018Q2-02-etcd-client-balancer

Key changes:

  • Round Robin load balancing
  • Use gRPC's new load balancer API
  • Interceptor based retries
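
As a rough illustration of how round-robin picking and interceptor-style retries fit together, here is a minimal sketch using only the Go standard library. It is a toy model, not the clientv3 implementation and not the gRPC balancer API: the `roundRobin` type, the `call` signature, `withRetry`, and the endpoint names are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// roundRobin hands out endpoints in rotation. It is a toy stand-in for a
// balancer picker and does not use the gRPC balancer API.
type roundRobin struct {
	mu        sync.Mutex
	endpoints []string
	next      int
}

// pick returns the next endpoint in rotation.
func (r *roundRobin) pick() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	ep := r.endpoints[r.next]
	r.next = (r.next + 1) % len(r.endpoints)
	return ep
}

// call is a hypothetical RPC that is told which endpoint to contact.
type call func(endpoint string) error

// withRetry wraps a call the way an interceptor wraps an RPC: on failure it
// picks the next endpoint and tries again, up to maxAttempts times.
func withRetry(r *roundRobin, maxAttempts int, c call) error {
	var err error
	for i := 0; i < maxAttempts; i++ {
		if err = c(r.pick()); err == nil {
			return nil
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", maxAttempts, err)
}

var errUnavailable = errors.New("endpoint unavailable")

func main() {
	rr := &roundRobin{endpoints: []string{"a:2379", "b:2379", "c:2379"}}
	var tried []string
	err := withRetry(rr, 3, func(ep string) error {
		tried = append(tried, ep)
		if ep == "a:2379" { // pretend the first member is down
			return errUnavailable
		}
		return nil
	})
	fmt.Println(tried, err)
}
```

In this model the retry logic lives outside the picker, so failover is just "ask the picker again". That separation is the general idea behind moving retries into an interceptor.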

gyuho and others added 30 commits June 15, 2018 13:41
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
…happy-path load balancer test

Author:    Joe Betz <jpbetz@google.com>
Date:      Wed Mar 28 15:51:33 2018 -0700
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Otherwise, "grpc.Dial" blocks when "grpc.WithTimeout" dial
option gets deprecated.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Otherwise, grpc.DialContext would just return before
connection is up.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
"grpc.WithTimeout" dial option is being deprecated.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
…ng, fix tests to block on dial when required.
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
@jpbetz jpbetz added this to the etcd-v3.4 milestone Jun 15, 2018
@gyuho
Contributor

gyuho commented Jun 15, 2018

We have already tested this branch extensively. The benchmark against the current master branch shows no regression, with (slightly) better average read throughput.

# Write 1M keys, 256-byte key, 1KB value, Best Throughput (etcd 1K clients with 100 conns)
+---------------------------------------+-----------------------------+---------------------------------+
|                                       | etcd-v3.4-b241e383-go1.10.3 | etcd-v3.4-balancer0615-go1.10.3 |
+---------------------------------------+-----------------------------+---------------------------------+
|                         TOTAL-SECONDS |                 31.1256 sec |                     31.2477 sec |
|                  TOTAL-REQUEST-NUMBER |                   1,000,000 |                       1,000,000 |
|                        MAX-THROUGHPUT |              33,760 req/sec |                  34,587 req/sec |
|                        AVG-THROUGHPUT |              32,127 req/sec |                  32,002 req/sec |
|                        MIN-THROUGHPUT |               4,965 req/sec |                  10,454 req/sec |
|                       FASTEST-LATENCY |                   4.6587 ms |                       2.4888 ms |
|                           AVG-LATENCY |                  31.0604 ms |                      31.2033 ms |
|                       SLOWEST-LATENCY |                 117.5620 ms |                     114.9492 ms |
|                           Latency p10 |                13.431526 ms |                    14.691959 ms |
|                           Latency p25 |                17.993337 ms |                    19.586467 ms |
|                           Latency p50 |                24.734914 ms |                    25.571253 ms |
|                           Latency p75 |                42.801499 ms |                    42.138762 ms |
|                           Latency p90 |                57.777309 ms |                    55.289961 ms |
|                           Latency p95 |                65.311487 ms |                    60.855029 ms |
|                           Latency p99 |                78.819013 ms |                    75.192049 ms |
|                         Latency p99.9 |                97.808156 ms |                    92.254135 ms |
|      SERVER-TOTAL-NETWORK-RX-DATA-SUM |                      5.2 GB |                          5.2 GB |
|      SERVER-TOTAL-NETWORK-TX-DATA-SUM |                      3.9 GB |                          4.0 GB |
|           CLIENT-TOTAL-NETWORK-RX-SUM |                      258 MB |                          324 MB |
|           CLIENT-TOTAL-NETWORK-TX-SUM |                      1.5 GB |                          1.6 GB |
|                  SERVER-MAX-CPU-USAGE |                    440.30 % |                        537.67 % |
|               SERVER-MAX-MEMORY-USAGE |                      1.2 GB |                          1.2 GB |
|                  CLIENT-MAX-CPU-USAGE |                    570.00 % |                        593.00 % |
|               CLIENT-MAX-MEMORY-USAGE |                       95 MB |                          171 MB |
|                    CLIENT-ERROR-COUNT |                           0 |                               0 |
|  SERVER-AVG-READS-COMPLETED-DELTA-SUM |                           0 |                              73 |
|    SERVER-AVG-SECTORS-READS-DELTA-SUM |                           0 |                               0 |
| SERVER-AVG-WRITES-COMPLETED-DELTA-SUM |                     103,846 |                         109,864 |
|  SERVER-AVG-SECTORS-WRITTEN-DELTA-SUM |                  23,873,928 |                      20,586,688 |
|           SERVER-AVG-DISK-SPACE-USAGE |                      2.7 GB |                          2.7 GB |
+---------------------------------------+-----------------------------+---------------------------------+
# Read 3M same keys, 256-byte key, 1KB value, Best Throughput (etcd 1K clients with 100 conns)
+---------------------------------------+-----------------------------+---------------------------------+
|                                       | etcd-v3.4-b241e383-go1.10.3 | etcd-v3.4-balancer0615-go1.10.3 |
+---------------------------------------+-----------------------------+---------------------------------+
|                         TOTAL-SECONDS |                 17.8744 sec |                     17.8226 sec |
|                  TOTAL-REQUEST-NUMBER |                   3,000,000 |                       3,000,000 |
|                        MAX-THROUGHPUT |             176,763 req/sec |                 172,164 req/sec |
|                        AVG-THROUGHPUT |             167,837 req/sec |                 168,325 req/sec |
|                        MIN-THROUGHPUT |              38,290 req/sec |                   7,453 req/sec |
|                       FASTEST-LATENCY |                   0.5131 ms |                       0.5025 ms |
|                           AVG-LATENCY |                   4.6043 ms |                       4.6358 ms |
|                       SLOWEST-LATENCY |                  37.8623 ms |                      29.7872 ms |
|                           Latency p10 |                 1.729814 ms |                     2.372096 ms |
|                           Latency p25 |                 2.383698 ms |                     3.036887 ms |
|                           Latency p50 |                 3.961112 ms |                     4.055946 ms |
|                           Latency p75 |                 6.137971 ms |                     5.684766 ms |
|                           Latency p90 |                 8.458589 ms |                     7.767217 ms |
|                           Latency p95 |                10.006860 ms |                     9.068512 ms |
|                           Latency p99 |                13.232563 ms |                    12.085174 ms |
|                         Latency p99.9 |                18.042299 ms |                    16.128133 ms |
|      SERVER-TOTAL-NETWORK-RX-DATA-SUM |                      1.2 GB |                          1.3 GB |
|      SERVER-TOTAL-NETWORK-TX-DATA-SUM |                      4.5 GB |                          4.6 GB |
|           CLIENT-TOTAL-NETWORK-RX-SUM |                      4.4 GB |                          4.8 GB |
|           CLIENT-TOTAL-NETWORK-TX-SUM |                      1.2 GB |                          1.3 GB |
|                  SERVER-MAX-CPU-USAGE |                    891.33 % |                        867.33 % |
|               SERVER-MAX-MEMORY-USAGE |                       58 MB |                           68 MB |
|                  CLIENT-MAX-CPU-USAGE |                   1453.00 % |                       1510.00 % |
|               CLIENT-MAX-MEMORY-USAGE |                      158 MB |                          255 MB |
|                    CLIENT-ERROR-COUNT |                           0 |                               0 |
|  SERVER-AVG-READS-COMPLETED-DELTA-SUM |                           0 |                               0 |
|    SERVER-AVG-SECTORS-READS-DELTA-SUM |                           0 |                               0 |
| SERVER-AVG-WRITES-COMPLETED-DELTA-SUM |                          51 |                              96 |
|  SERVER-AVG-SECTORS-WRITTEN-DELTA-SUM |                         448 |                           1,112 |
|           SERVER-AVG-DISK-SPACE-USAGE |                       64 MB |                           64 MB |
+---------------------------------------+-----------------------------+---------------------------------+

ref. https://github.com/coreos/dbtester/tree/master/test-results/2018Q2-02-etcd-client-balancer
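
To make the comparison easier to read, the relative change between the two columns can be computed directly. The sketch below copies a few rows from the tables above. The `pctChange` helper and the choice of rows are illustrative only and are not part of dbtester.

```go
package main

import "fmt"

// pctChange returns the relative change from base to candidate, in percent.
func pctChange(base, candidate float64) float64 {
	return (candidate - base) / base * 100
}

func main() {
	// Values copied from the tables above:
	// base = etcd-v3.4-b241e383, candidate = etcd-v3.4-balancer0615.
	rows := []struct {
		name            string
		base, candidate float64
	}{
		{"write AVG-THROUGHPUT (req/sec)", 32127, 32002},
		{"read AVG-THROUGHPUT (req/sec)", 167837, 168325},
		{"write CLIENT-MAX-MEMORY-USAGE (MB)", 95, 171},
		{"read CLIENT-MAX-MEMORY-USAGE (MB)", 158, 255},
	}
	for _, r := range rows {
		fmt.Printf("%-36s %+.2f%%\n", r.name, pctChange(r.base, r.candidate))
	}
}
```

By this measure, average throughput moves by well under one percent in either direction, which may be within run-to-run noise. The client's maximum memory usage shows a much larger relative increase.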

Once CI passes, it should be safe to merge, and we will keep testing after the merge.

The design doc will be served at https://etcd.readthedocs.io/en/latest.

Thanks a lot @jpbetz!

@WIZARD-CXY
Contributor

nice

hexfusion added a commit to hexfusion/etcd that referenced this pull request Aug 5, 2019
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
hexfusion added a commit to hexfusion/etcd that referenced this pull request Aug 5, 2019
These changes were originally fixed in etcd-io#9860 commit 9304d1a

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
@hexfusion hexfusion mentioned this pull request Aug 6, 2019
4 tasks
hexfusion added a commit to hexfusion/etcd that referenced this pull request Aug 6, 2019
These changes were originally fixed in etcd-io#9860 commit 9304d1a

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
@horkhe
Contributor

horkhe commented Jan 6, 2020

We noticed that different cluster nodes have different store revisions, so the same key has a different create revision depending on which node you ask. It looks like with the new load balancer, the client does not stick with a connection to a particular node, but sends requests in round-robin fashion. As a result, it is impossible to rely on key revisions in queries. How do you suggest we deal with this problem? (server v3.3.10 + Go client v3.3.18)

@jpbetz
Contributor Author

jpbetz commented Jan 6, 2020

It’s a major bug for revisions to differ across nodes for the same write. Please open a separate issue for it and provide as much information as you can about how etcd got into this state.

@WIZARD-CXY
Contributor

WIZARD-CXY commented Jan 9, 2020

@horkhe I think you can use linearizable reads for consistency.

@horkhe
Contributor

horkhe commented Jan 9, 2020

I have no idea how we got into that state, but 3 out of the 4 clusters that we operate turned out to be in it. We have been observing some weird behaviour in our applications for a while, which can now be explained by inconsistent revisions in etcd. In the end we fixed all our clusters by removing the follower nodes, wiping their data, and re-adding them to the cluster.
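
The kind of per-node comparison described in this thread can be sketched as a small check. It assumes the create revision of one key has already been collected from each member separately, for example with one client per endpoint and a serializable read such as `clientv3.WithSerializable()`. That collection step is an assumption and is not shown. The `inconsistent` helper, the endpoint names, and the revision numbers are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// inconsistent groups endpoints by the create revision they reported for a
// key. It returns the groups when members disagree, or nil when all agree.
func inconsistent(createRevByEndpoint map[string]int64) map[int64][]string {
	groups := make(map[int64][]string)
	for ep, rev := range createRevByEndpoint {
		groups[rev] = append(groups[rev], ep)
	}
	if len(groups) <= 1 {
		return nil
	}
	for _, eps := range groups {
		sort.Strings(eps) // sort in place so each group prints deterministically
	}
	return groups
}

func main() {
	// Made-up observations for one key, one entry per cluster member.
	observed := map[string]int64{
		"node1:2379": 1041,
		"node2:2379": 1041,
		"node3:2379": 1038,
	}
	if groups := inconsistent(observed); groups != nil {
		for rev, eps := range groups {
			fmt.Printf("create_revision=%d reported by %v\n", rev, eps)
		}
	}
}
```

A single mismatch may only mean that a follower is briefly behind. A mismatch that persists after the cluster is idle is the situation worth reporting in a separate issue, as suggested above.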


4 participants