Client 4.0 #217

mdumandag · 2020-09-07T15:56:04Z

This PR contains the changes required for the client to work with 4.0 clusters: the ownerless client implementation, protocol changes, the connection strategy config, and part of the serialization improvements.
This PR is pretty big, so I will try to summarize the changes I have made.

Documentation is updated with the recent client implementation, logs, and config changes.
Code samples are updated. That includes the changes required for client 4.0 and the removal of the boilerplate code that stops the test runner from importing and executing them. With this change, they are easier to read, with less indentation. The removed boilerplate is the following part:

if __name__ == "__main__":
    # example code
    pass

Config names are shortened. For example, to access an element of SSLConfig, one had to write something like config.network_config.ssl_config.enabled, which is a mouthful. Now that becomes config.network.ssl.enabled. We also have a PRD to improve the configuration, so this may change again within the scope of that PRD.
Some public APIs that should be private are converted to private (like client.statistics, client.id, etc.).
Services on the client object are renamed to more verbose alternatives, for example client.invoker -> client.invocation_service and client.proxy -> client.proxy_manager.
Load balancer configuration is added, with RoundRobinLB and RandomLB as the provided implementations. (It now defaults to round-robin; it was random before.)
Support for client labels is added.
ConnectionStrategyConfig and ConnectionRetryConfig are added with their respective implementations to provide exponential backoff over connection attempts.
IndexConfig is added with the same validations as the Java client.
The ability to shuffle the member list is added as a property. Its implementation also supports primary and secondary addresses to prioritize some addresses, such as prioritizing 5701 over 5702 and 5703 when no address is provided.
A client message reader that is backed by io.BytesIO is implemented instead of extending and copying a bytearray while dealing with client messages. It supports batch reads and can handle fragmented messages.
hazelcast.exception module is renamed as hazelcast.errors
Client message is customized according to the Python client's needs. It has a single buffer for outbound messages and the same framed approach as the Java client for inbound messages.
Scripts used in tests (start-rc.sh and run-tests.sh) are updated to close the remote controller on exit.
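To illustrate the exponential backoff mentioned above for connection retries, here is a minimal sketch. The function name and the parameter names (`initial_backoff`, `max_backoff`, `multiplier`, `jitter`) are assumptions made for this example and are not taken from the PR; the actual ConnectionRetryConfig may differ.

```python
import random
from itertools import islice


def backoff_delays(initial_backoff=1.0, max_backoff=30.0, multiplier=2.0,
                   jitter=0.0, rng=random.random):
    """Yield wait times (seconds) between connection attempts, forever.

    Hypothetical helper: names and defaults are illustrative only.
    """
    current = initial_backoff
    while True:
        # Spread the delay uniformly in [current * (1 - jitter), current * (1 + jitter)].
        yield current + current * jitter * (2.0 * rng() - 1.0)
        # Grow geometrically, but never beyond the configured cap.
        current = min(current * multiplier, max_backoff)


# Without jitter the sequence is deterministic.
first_six = list(islice(backoff_delays(), 6))
print(first_six)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

A caller would typically stop iterating once an overall connect timeout is exceeded; that part is left out to keep the sketch short.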

For more, please refer to the changes.

Protocol PR: hazelcast/hazelcast-client-protocol#341

sancar · 2020-09-21T10:10:08Z

hazelcast/config.py

-        self.serialization_config = SerializationConfig()
-        """Hazelcast serialization configuration"""
-
        self.logger_config = LoggerConfig()


Is there any special reason why we left the _config suffix in logger_config while we have removed it from all the others?

Possibly leftover, fixed in #219

sancar · 2020-09-21T11:24:22Z

hazelcast/connection.py

+        self.live = False
+        if self._connect_all_members_timer:
+            self._connect_all_members_timer.cancel()
+


Does _connect_all_members_timer wait for an already running action to finish?
If not, we can end up with an open connection, if I am not mistaken.

We are canceling the timer.

If the timer has not started executing yet, it won't run.

If the timer is executing right now, it will try to finish its execution (in the reactor thread, but we are not waiting for it there). In the timer, before trying to get_or_connect to a member, we have a lifecycle.running check (which will be False after the shutdown call). If the timer passes this check in the reactor thread, it can in fact create a connection. We do have cleanup code in reactor.shutdown, but it was cleaning up the connections before executing the timers, so a connection could be created after entering the loop in which we iterate over the connections. Therefore, I moved waiting for the timers to finish before cleaning up the connections. That should solve the problem.
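The ordering described above can be sketched with the standard library. This is only an illustration: `threading.Timer` stands in for the reactor's own timers, and `MiniConnectionManager` with all of its attributes is a hypothetical toy, not the client's code. The point is that `cancel()` only prevents a timer that has not fired yet, so shutdown waits for an in-flight action before cleaning up connections.

```python
import threading
import time


class MiniConnectionManager:
    """Toy model of a 'connect to all members' timer plus shutdown."""

    def __init__(self):
        self.running = True  # stand-in for the lifecycle.running check
        self.open_connections = []
        self.action_started = threading.Event()
        self._lock = threading.Lock()
        self._timer = threading.Timer(0, self._connect_all_members)
        self._timer.start()

    def _connect_all_members(self):
        if not self.running:
            return
        # The check has passed; a shutdown that happens now is too late to stop us.
        self.action_started.set()
        time.sleep(0.05)  # simulate a slow connection attempt
        with self._lock:
            self.open_connections.append("connection-to-member")

    def shutdown(self):
        self.running = False
        self._timer.cancel()  # stops a timer that has not fired; does not interrupt a running one
        self._timer.join()    # wait for an in-flight action to finish
        with self._lock:
            # Only now is it safe to close whatever the timer may have opened.
            self.open_connections.clear()


manager = MiniConnectionManager()
manager.action_started.wait(2)  # make sure the action is mid-flight
manager.shutdown()
print(manager.open_connections)  # []
```

If the cleanup ran before the join, the late connection would be appended after the list was cleared and would stay open.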

sancar · 2020-09-21T12:06:11Z

hazelcast/errors.py

+    if error_class:
+        return error_class(message, _create_error(error_holders, idx + 1))
+    else:
+        return UndefinedErrorCodeError(message, error_holder.class_name)


I have added a cause to UndefinedErrorCodeError on the Java side. You can do the same here as well.
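A rough sketch of what that suggestion could look like, assuming a constructor that accepts a `cause`. Everything below is illustrative: `ErrorHolder`, the registry contents, the error code `1`, and the message format are stand-ins invented for this example, not the client's actual definitions.

```python
from collections import namedtuple

# Hypothetical stand-in for a decoded error holder; only the fields used here.
ErrorHolder = namedtuple("ErrorHolder", ["error_code", "class_name", "message"])


class HazelcastError(Exception):
    def __init__(self, message=None, cause=None):
        super().__init__(message)
        self.cause = cause


class IllegalStateError(HazelcastError):
    pass


class UndefinedErrorCodeError(HazelcastError):
    def __init__(self, message, class_name, cause=None):
        # Message format is an assumption made for this sketch.
        text = "Got error %r of unknown class %s" % (message, class_name)
        super().__init__(text, cause)
        self.class_name = class_name


# Illustrative registry; the code 1 is made up for the example.
_ERROR_CODE_TO_ERROR = {1: IllegalStateError}


def create_error(error_holders, idx=0):
    """Build the error chain recursively, keeping the cause for unknown codes too."""
    if idx == len(error_holders):
        return None
    holder = error_holders[idx]
    cause = create_error(error_holders, idx + 1)
    error_class = _ERROR_CODE_TO_ERROR.get(holder.error_code)
    if error_class:
        return error_class(holder.message, cause)
    return UndefinedErrorCodeError(holder.message, holder.class_name, cause)


holders = [
    ErrorHolder(999, "com.example.CustomException", "outer"),
    ErrorHolder(1, "java.lang.IllegalStateException", "inner"),
]
error = create_error(holders)
print(type(error).__name__, "caused by", type(error.cause).__name__)
```

With this shape, an unknown outer error no longer hides a known inner one.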

sancar · 2020-09-21T12:12:28Z

hazelcast/invocation.py

-
    def on_timeout(self):
-        self.set_exception(TimeoutError("Request timed out after %d seconds." % self._invocation_timeout))
+        self.set_exception(HazelcastTimeoutError("Request timed out."))


This is different from the Java side's behavior. There, we do not cancel the invocation when invocation.timeout.seconds is reached.

Good catch. Right now, the client removes the pending connection on timeout. I also added a test that verifies that there is nothing on the pending list after a timeout.

I actually meant that the Java client does not do anything when the invocation timeout is reached.
It could be the case that the invocation is a long-running task. We should check the timeout only when an exception occurs. Otherwise, we should not take any action.

Oh, I thought that property meant what I implemented, since we previously had a timer for the timeout in the Python client. It was added 5 years ago (d962b88); maybe it meant something like this back then 😄

I will remove the timer altogether then.
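The behavior agreed on in this thread, checking the timeout only when an exception occurs, might look roughly like the sketch below. The `Invocation` class, the `should_retry` function, and the `retryable` flag are hypothetical names used only for illustration; the real invocation service is more involved.

```python
import time


class HazelcastTimeoutError(Exception):
    """Stand-in for the client's timeout error type."""


class Invocation:
    def __init__(self, timeout_seconds, now=None):
        started_at = time.monotonic() if now is None else now
        self.deadline = started_at + timeout_seconds
        self.error = None  # set once the invocation fails for good


def should_retry(invocation, error, retryable, now=None):
    """Decide what to do after an invocation attempt failed with `error`.

    The deadline is consulted only here, on the error path. A slow but
    healthy invocation is never failed by a timer.
    """
    now = time.monotonic() if now is None else now
    if not retryable:
        invocation.error = error
        return False
    if now >= invocation.deadline:
        invocation.error = HazelcastTimeoutError("Request timed out.")
        return False
    return True


invocation = Invocation(timeout_seconds=120, now=0.0)
print(should_retry(invocation, IOError("connection lost"), retryable=True, now=10.0))   # True
print(should_retry(invocation, IOError("connection lost"), retryable=True, now=121.0))  # False
```

The `now` parameter exists only so the example can be exercised without sleeping.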

sancar · 2020-09-21T12:25:54Z

hazelcast/cluster.py

+        self._member_list_snapshot = _EMPTY_SNAPSHOT
+        self._initial_list_fetched = threading.Event()
+
+    def start(self, membership_listeners):


Does this leak to the public API?

Yes, that should be a private API, but since we are using it inside client.start, I made it public. The convention that the user should follow is, if it is not documented, don't use it.

We could make this protected and still be able to call it inside the client module, but that could annoy the linters (we don't have any right now, but I am planning to integrate pylint and black later) due to access to protected members. Anyway, we could disable that check for this access or remove the method altogether. There are other places in which I did something similar to this, so I am going to create an issue to track that. There are still too many moving parts, so I am planning to go over it once the structure of the client stabilizes a little bit. Is that OK?

The convention that the user should follow is, if it is not documented, don't use it.

If this is a well-known convention for Python users, then it is OK.
It still does not feel right: what happens if we want to document something that is a private API?
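For reference, the "protected" alternative discussed above could be sketched as follows. The class and method names are made up for the example; `protected-access` is the pylint check that would otherwise flag the call, and disabling it locally is one of the options mentioned in the thread.

```python
class ClusterService:
    def __init__(self):
        self.started = False

    def _start(self):
        """Internal: called by the client during startup, not part of the public API.

        A leading underscore marks it as non-public, yet it can still carry a docstring.
        """
        self.started = True

    def get_members(self):
        """Public, documented API."""
        return []


class Client:
    def __init__(self):
        self.cluster_service = ClusterService()

    def start(self):
        # The client owns the service, so this access is deliberate.
        # pylint: disable=protected-access
        self.cluster_service._start()


client = Client()
client.start()
print(client.cluster_service.started)  # True
```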

mdumandag added 30 commits September 2, 2020 17:34

Add 4.0 codecs and implement new client message

71512f8

implement new client message reader

061b926

implement partition service

8761684

implement cluster service

6e71e52

implement heartbeat manager

5459267

change connection implementation

fe3c0ce

update address providers & translators

ac2f930

fix cyclic imports

e4ba05b

initial config and connection manager changes

c18c93a

more connection manager changes

e75f318

finish connection manager implementation

fdf5b4b

change listener service

f5b793c

change incovation service

ca34d50

make the client work

04b2792

fix proxy implementations

b6afff7

fix proxies and their tests

998ed39

update remote controller, fix serialization and ssl tests

4740a28

reconnect fix and more tests

9022d9b

more test fixes

bfe04f5

fix more tests

ef9a17e

add index config, its tests and connection strategy tests

431875e

more cleanup

d240418

add map#set_ttl

339e109

rename invoker to invocation_service

adf3ad6

more cleanup

ede2120

update readme

26ae618

more cleanup

86f5780

fix more tests

a2c4f8d

move enums to config

4050349

fix run-tests.sh

9d2f882

mdumandag self-assigned this Sep 8, 2020

mdumandag added this to the 4.0 milestone Sep 8, 2020

mdumandag added Source: Internal Type: Documentation Type: Enhancement labels Sep 8, 2020

mdumandag mentioned this pull request Sep 9, 2020

Add UuidSerializer and improve serialization service and its test suite #218

Merged

sancar reviewed Sep 21, 2020

double check active connections for hot-path without the lock

a320e14

sancar reviewed Sep 21, 2020

wait for timers to finish in the reactor shutdown

b0edeb7

sancar reviewed Sep 21, 2020

cancel invocation on timeout

266e1ed

mdumandag mentioned this pull request Sep 21, 2020

Define the public API properly #220

Closed

mdumandag added 4 commits September 21, 2020 16:35

don't timeout invocations if there is no exception

cd46a0e

seperate private API from the public API

c9e33e9

rename logger config

da8d7cd

rename connection manager and client info

1ecbef1

sancar approved these changes Sep 25, 2020

mdumandag merged commit 00e19fd into hazelcast:master Oct 5, 2020

mdumandag deleted the 4.0 branch October 5, 2020 06:54
