I would like to request an improvement to the documentation for the as_completed class, as found here.
My use case involves two different Client instances: client_1 and client_2. When I use as_completed to gather futures from client_1, the loop defaults to that of the most recently created Client, which happens to be client_2. As a result, the futures are not returned properly and the tasks remain stuck in worker memory after completion. The symptoms range from no message at all to a pickling error, like this:
distributed.protocol.pickle - INFO - Failed to deserialize bytearray(b'\....`)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 66, in loads
    return pickle.loads(x)
_pickle.UnpicklingError: invalid load key, '\x00'.
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/core.py", line 130, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 302, in deserialize
    return loads(header, frames)
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 64, in pickle_loads
    return pickle.loads(x, buffers=buffers)
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 66, in loads
    return pickle.loads(x)
_pickle.UnpicklingError: invalid load key, '\x00'.
It took me a long time, and a review of the code here, to realize that the loop attribute needs to be set to the loop of the corresponding Client object.
I think this could be made clearer to users by stating it explicitly in the documentation for as_completed.
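For reference, here is a minimal sketch of the workaround, assuming the `loop` keyword of as_completed and the `loop` attribute of Client behave as described above. The helper name `results_in_completion_order` and the `demo` function are hypothetical and exist only for illustration. The in-process clusters (`processes=False`) are likewise an assumption made to keep the example small. Whether the original failure reproduces may depend on the installed version of distributed.

```python
def results_in_completion_order(client, futures):
    """Yield results of `futures`, all of which were submitted through `client`.

    Hypothetical helper: it passes the owning client's loop explicitly,
    so as_completed does not fall back to the default (most recently
    created) client.
    """
    # Imported here so the helper can be read and imported without a cluster.
    from distributed import as_completed

    for future in as_completed(futures, loop=client.loop):
        yield future.result()


def demo():
    """Illustrative only: two clients, futures gathered from the first one."""
    from distributed import Client

    client_1 = Client(processes=False)
    client_2 = Client(processes=False)  # created last, so it is the default client
    try:
        futures = client_1.map(lambda x: x * 2, range(4))
        return sorted(results_in_completion_order(client_1, futures))
    finally:
        client_2.close()
        client_1.close()
```

Calling `demo()` starts two local clusters, submits four small tasks through client_1, and collects their results while client_2 is the default client.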