Multiple endpoints #55

dariko · 2019-02-25T09:19:41Z

This commit adds a new `endpoints` parameter to the `BaseClient` class.
This parameter, which must be used as an alternative to `host` and `port`,
can be set to a list of `EtcdEndpoint`, a simple data class with
`address` and `port` attributes.

A new decorator is present: `retry_all_hosts`. When applied to a
method of a class descended from `BaseClient`, it retries the call
against all the endpoints if an exception occurs.
If one of the tries returns without errors, its return value is propagated;
if all the tries throw exceptions of the same type, the first one is
propagated; if they throw different exceptions, an `Etcd3Exception` is
thrown. All failed tries are logged.
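The propagation rules described above could look roughly like the sketch below. It is not the code from this PR: `Etcd3Exception` is redefined as a stand-in, `FakeClient` and its `failures` mapping are hypothetical, and the decorator is assumed to select an endpoint by setting `self.current_endpoint`.

```python
import functools
import logging

log = logging.getLogger(__name__)


class Etcd3Exception(Exception):
    """Stand-in for the library's exception type (assumed for this sketch)."""


def retry_all_hosts(func):
    """Retry the wrapped method against every endpoint in self.endpoints."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        errors = []
        # Iterate over a copy so concurrent changes cannot alter the loop.
        for endpoint in list(self.endpoints):
            self.current_endpoint = endpoint
            try:
                # First successful try wins: its value is propagated.
                return func(self, *args, **kwargs)
            except Exception as e:
                log.warning("%s failed on %s: %r", func.__name__, endpoint, e)
                errors.append(e)
        if not errors:
            raise Etcd3Exception("no endpoints configured")
        # Same exception type everywhere: propagate the first one.
        if len({type(e) for e in errors}) == 1:
            raise errors[0]
        # Mixed exception types: wrap them in a single Etcd3Exception.
        raise Etcd3Exception("all endpoints failed: %r" % (errors,))
    return wrapper


class FakeClient(object):
    """Hypothetical client used only to exercise the decorator."""

    def __init__(self, endpoints, failures):
        self.endpoints = endpoints
        self.failures = failures  # endpoint -> exception class to raise
        self.current_endpoint = endpoints[0]

    @retry_all_hosts
    def get(self, key):
        exc = self.failures.get(self.current_endpoint)
        if exc is not None:
            raise exc(self.current_endpoint)
        return (self.current_endpoint, key)
```

With `FakeClient(["a:1", "b:2"], {"a:1": ConnectionError})`, a call to `get` fails on the first endpoint, is logged, and succeeds on the second.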

A new utility class `EtcdCluster` is added to manage the lifecycle of
an etcd cluster run in containers. This simplifies the testing code and
lays the groundwork for testing request failover.
This class replaces the methods in `docker_cli` and `etcd_go_cli`.

A set of fixtures (`etcd_cluster`, `etcd_cluster_ssl`, `client` and
`io_client`) manages the lifecycle of the test resources.

All the tests are set to run using the `BaseClient`'s `endpoints` parameter;
additional tests have been added to cover `host` and `port`.

A fix for the `__aiter__` methods on Python 3.7 is included.

TODO: docs, failover testing

dariko · 2019-02-25T09:21:56Z

This fixes the "auto retry between endpoints" part of #25.

codecov · 2019-02-25T10:14:11Z

Codecov Report

Merging #55 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master      #55   +/-   ##
=======================================
  Coverage   90.41%   90.41%           
=======================================
  Files          52       52           
  Lines        2890     2890           
  Branches      324      324           
=======================================
  Hits         2613     2613           
  Misses        170      170           
  Partials      107      107


Revolution1 · 2019-02-25T10:46:34Z

@stupidchen


dariko · 2019-02-25T11:05:14Z

This still needs work on the stateful classes.


Revolution1 · 2019-02-25T11:51:23Z

etcd3/baseclient.py

+                ret = func(self, *args, **kwargs)
+                got_result = True
+                break
+            except Exception as e:


Only retry when the connection fails.

Just catch these errors: `aiohttp.ClientError`, `requests.RequestException`, `urllib3.exceptions.HTTPError`.
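A minimal sketch of that narrowing, assuming a hypothetical `call_with_failover` helper. The built-in `ConnectionError` stands in for the three exception classes named above so the example runs without third-party packages; in the real client the `retry_on` tuple would hold those classes instead.

```python
def call_with_failover(call, endpoints, retry_on=(ConnectionError,)):
    """Try call(endpoint) on each endpoint, retrying only on retry_on errors."""
    last_error = None
    for endpoint in endpoints:
        try:
            return call(endpoint)
        except retry_on as e:
            # Connection-level failure: remember it and try the next endpoint.
            last_error = e
    if last_error is None:
        raise ValueError("no endpoints given")
    raise last_error
```

Any exception outside `retry_on` (a bug in the caller, an etcd-level error) propagates immediately instead of being retried on the remaining endpoints.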


Revolution1 · 2019-02-25T19:14:36Z

tests/envs.py

+NO_DOCKER_SERVICE = True
+try:  # pragma: no cover
+    import docker  # noqa
+    NO_DOCKER_SERVICE = False


I intended to mock the whole API, so I added NO_ETCD_SERVICE to make the tests runnable when no etcd server is available.

You can just delete this, since NO_DOCKER_SERVICE is always false now.

Revolution1 · 2019-02-25T19:44:35Z

etcd3/utils.py

@@ -382,3 +382,12 @@ def find_executable(executable, path=None):  # pragma: no cover
            f = os.path.join(p, execname)
            if os.path.isfile(f):
                return f
+
+
+class EtcdEndpoint():


This class contains only a host and a port, but it makes creating a client less friendly.

Is there any further design planned for it?

It would be better to put this into etcd3/__init__.py.


Revolution1 · 2019-02-25T20:26:22Z

Under the current design, a multi-endpoint client has to be created like this:

client = Client(endpoints=[EtcdEndpoint(), EtcdEndpoint()])

That does not seem very friendly.

Here is some other code for your reference:

sync endpoints:
https://github.com/etcd-io/etcd/blob/5effa154b464faa6a9ca88296df831eb7f0b8955/clientv3/client.go#L162
auto sync endpoints:
https://github.com/etcd-io/etcd/blob/5effa154b464faa6a9ca88296df831eb7f0b8955/clientv3/client.go#L162
endpoint picking policy(clientv3)
https://github.com/etcd-io/etcd/blob/5effa154b464faa6a9ca88296df831eb7f0b8955/clientv3/balancer/picker/picker_policy.go#L22
switch endpoint after fail, for next request (clientv2)
https://github.com/etcd-io/etcd/blob/5effa154b464faa6a9ca88296df831eb7f0b8955/client/client.go#L389
switch endpoint after connection fail (python-etcd)
https://github.com/jplana/python-etcd/blob/b227f496c038b2b856c4d76c9525b3547e5c8dc4/src/etcd/client.py#L862

It seems other clients do not "auto switch" endpoints within a single request.

Maybe we should discuss the impact of "auto switch".

dariko · 2019-02-25T22:35:27Z

@Revolution1
Thank you for the references!

The constructor can still be called as before; the only change is that the arguments need to be named (`[Aio]Client(host='127.0.0.1', port=2379)`). Calling it like this populates the endpoints list with a single EtcdEndpoint (see https://github.com/dariko/etcd3-py/blob/multiple_endpoints/etcd3/baseclient.py#L98-L105), and the list will be expanded from there once I have integrated server discovery.
Still, I think it's important to allow many endpoints to be specified, as this prevents a situation in which a client is configured to use only an unavailable node.
I'm open to suggestions on how to accept more endpoints with fewer keystrokes. Right now the only idea I can propose is to accept a list of strings:

Client(endpoints=[EtcdEndpoint(host='127.0.0.1', port=1234),
                  EtcdEndpoint(host='127.0.0.2', port=5678)])
# would become
Client(endpoints=['127.0.0.1:1234', '127.0.0.2:5678'])

but I'm not sure it's really better... maybe accept both of them?
I don't think a cluster can have nodes replying to clients with different SSL configurations, but if that is possible, the string format could become too restrictive.
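One way to accept both forms is to normalize the list in the constructor. The sketch below is only an illustration: `normalize_endpoints` is a hypothetical helper, `EtcdEndpoint` is redefined here as a namedtuple stand-in with the `host` and `port` fields used in the example above, and 2379 is assumed as the default port. IPv6 literals are not handled.

```python
import collections

# Stand-in for the PR's EtcdEndpoint class (assumed shape: host and port).
EtcdEndpoint = collections.namedtuple("EtcdEndpoint", ["host", "port"])


def normalize_endpoints(endpoints, default_port=2379):
    """Turn a mixed list of 'host:port' strings and EtcdEndpoint objects
    into a list of EtcdEndpoint objects."""
    normalized = []
    for item in endpoints:
        if isinstance(item, EtcdEndpoint):
            normalized.append(item)
            continue
        # Split on the last colon; a string without one is a bare host.
        host, sep, port = item.rpartition(":")
        if not sep:
            host, port = item, default_port
        normalized.append(EtcdEndpoint(host=host, port=int(port)))
    return normalized
```

Keeping the object form available leaves room for per-endpoint settings later, while the string form covers the common case.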

About the error detection, my idea was to have something like the code you linked from python-etcd (which I think was my inspiration): react to some Exception types on an API request by failing over to the next node.
Automatically retrying should have little or no consequences for some operations (those that do not change data, such as kv.get, or those that can be considered, for lack of a better word, 'transient', such as lease.renew, watch.create ...), but the situation can be less obvious with other operations (kv.put, transactions ...).
Right now I'd like to get to the point where it's acceptable to enable failover by default for reading methods. I'm pretty sure supporting writing methods requires handling more corner cases, which I think are better left to each application's logic, at least as a default.

As for the interface, I was thinking of giving the user the ability either to completely disable request failover or to use a whitelist of safe methods (a `failover_whitelist: List[string]` parameter which defaults to the names of the 'safe' methods).


Every call to a decorated function will now try the call against the endpoint currently in use before trying to fail over. Failover is attempted only if the exception thrown is in failover_exceptions and the API method called is in failover_whitelist. The failover process is blocking: Client and AioClient each define their own Lock-compatible object.

Revolution1 · 2019-03-02T14:48:09Z

tests/test_py3/test_aio_client.py

@@ -62,5 +62,5 @@ def event_loop():  # pragma: no cover
 async def test_aio_client_host_port(etcd_cluster_ssl):
    endpoint = etcd_cluster_ssl.get_endpoints()[0]
    aio_client = AioClient(host=endpoint.host, port=endpoint.port,
-                           cert=(CERT_PATH, KEY_PATH), verify=CA_PATH)
+                           cert=(CERT_PATH, KEY_PATH), verify=False)


dariko · 2019-03-02T14:58:33Z

I'll re-enable certificate validation if needed; doing so requires regenerating the certificates.

I reworked the wrapper/endpoints logic. Now:

baseclient.baseurl refers to self.current_endpoint
self.current_endpoint is initially set to self.current_endpoints[0]

When a decorated function is called, the call is first tried against self.current_endpoint.
If the call raises an exception included in failover_exceptions AND the wrapped method is in self.failover_whitelist, then the failover process starts.
The failover process is serialized with a lock by Client and AioClient: it tries the call against all the defined endpoints, moving on to the next one when it receives an exception defined in failover_exceptions. If a call returns, the return value is propagated; otherwise an exception is thrown.
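The flow described above can be sketched for the synchronous case as follows. This is not the PR's implementation: `FailoverClient`, its `transport` callable, and the default whitelist entries are hypothetical, `ConnectionError` stands in for the real `failover_exceptions`, and `threading.Lock` is assumed as the lock type (an asyncio client would presumably use an `asyncio.Lock`). Updating `current_endpoint` after a successful failover is also an assumption of this sketch.

```python
import threading


class FailoverClient(object):
    # Stand-in for the connection-level exceptions that trigger failover.
    failover_exceptions = (ConnectionError,)

    def __init__(self, endpoints, transport, failover_whitelist=("get", "status")):
        self.endpoints = list(endpoints)
        self.current_endpoint = self.endpoints[0]
        self.failover_whitelist = set(failover_whitelist)
        self._transport = transport  # callable(endpoint, method, *args)
        self._failover_lock = threading.Lock()

    def call(self, method, *args):
        # First try the endpoint currently in use.
        try:
            return self._transport(self.current_endpoint, method, *args)
        except self.failover_exceptions:
            # Methods outside the whitelist never fail over.
            if method not in self.failover_whitelist:
                raise
        # Serialize failover so concurrent callers do not race on the switch.
        with self._failover_lock:
            last_error = None
            for endpoint in list(self.endpoints):
                try:
                    result = self._transport(endpoint, method, *args)
                except self.failover_exceptions as e:
                    last_error = e
                    continue
                # Remember the working endpoint for subsequent calls.
                self.current_endpoint = endpoint
                return result
            raise last_error
```

A whitelisted read such as `call("get", key)` moves on to the next healthy endpoint, while a write such as `call("put", key, value)` surfaces the original connection error to the application.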


Revolution1 · 2019-03-02T15:50:57Z

BTW: you can add your name in AUTHORS.rst 😄

dariko mentioned this pull request Feb 25, 2019

support multi endpoint #25

Open

2 tasks

dariko added 6 commits February 25, 2019 11:53

python 3.7 compatibility

4b1de04

https://docs.python.org/3/reference/datamodel.html#async-iterators

use etcd version from envs.py

c744c1a

Only include async fixtures when running on python3

7a61e92

clarify retry loop

7b33a71

dariko force-pushed the multiple_endpoints branch from b2a30fc to 7b33a71 Compare February 25, 2019 10:54

Repository owner deleted a comment Feb 25, 2019

Revolution1 reviewed Feb 25, 2019


iterate over a copy of the endpoint, preventing concurrent ops to change them in flight

41492ff

dariko added 11 commits February 25, 2019 21:41

allow EtcdCluster.etcdctl to failover to a working node

cad609a

more stable cluster, containers status detection

4406119

more stable cluster and containers status detection

ea4814f

add delay before asserting callback was called

a6c50b7

allow test_snapshot to be self-consistent

cebbedb

write snapshot data in docker-shared directory

f4e46dd

test watch util during etcd cluster rolling restart

2e38ccb

python 2 compat

1951e1c

remove useless decorator

a66edf9

use first endpoint data as default

6998a42

create shared directory for containers with permissive mode

78d0a65

Repository owner deleted a comment Feb 25, 2019

dariko added 11 commits February 27, 2019 13:57

disable certificate validation on tests

aead4ea

disable certificate validation on tests

95c7d24

Construct cluster endpoints based on container addresses

ba4ea4f

this is needed to realistically test discovery

move retry_all_hosts to utils.py, initial whitelist support

339f4aa

remove validation preventing minimal call format

ae94b43

replace deprecated log.warn with log.warning

7a07e22

add status to failover_waitlist

681d39c

cleanup watch util failover test

0151244

tests

d47f9ed

python 2 compatibility

709f74e

Revolution1 reviewed Mar 2, 2019


prevent mutable in parameter default

3aa610e

https://app.codacy.com/app/revol/etcd3-py/pullRequest?prid=3161986

Repository owner deleted a comment Mar 2, 2019

remove unused imports

8206a66

dariko force-pushed the multiple_endpoints branch from 545cda5 to 8206a66 Compare March 2, 2019 15:22

Repository owner deleted a comment Mar 2, 2019
