There's also a strange thing - in the daemon log file, I see different errors (for different nodes). E.g. I have this:
02/10/2021 11:09:57 PM <97679> plumpy.processes: [ERROR] Process<112337> failed to register as an RPC subscriber
Traceback (most recent call last):
  File "/home/pizzi/.virtualenvs/aiida-dev/lib/python3.7/site-packages/plumpy/processes.py", line 296, in init
    identifier = self._communicator.add_rpc_subscriber(self.message_receive, identifier=str(self.pid))
  File "/home/pizzi/.virtualenvs/aiida-dev/lib/python3.7/site-packages/plumpy/communications.py", line 120, in add_rpc_subscriber
    return self._communicator.add_rpc_subscriber(converted, identifier)
  File "/home/pizzi/.virtualenvs/aiida-dev/lib/python3.7/site-packages/kiwipy/rmq/threadcomms.py", line 184, in add_rpc_subscriber
    self._communicator.add_rpc_subscriber(self._wrap_subscriber(subscriber), identifier)
  File "/home/pizzi/.virtualenvs/aiida-dev/lib/python3.7/site-packages/pytray/aiothreads.py", line 155, in await_
    return self.await_submit(awaitable).result(timeout=self.task_timeout)
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 437, in result
    raise TimeoutError()
concurrent.futures._base.TimeoutError
but verdi process report 112337 shows no log messages... Similarly, if I look for 109485 in the log file (one of those with an error above), I only find this line in the logs:
so no error... Any idea why? (even the timing is different) I'm confused!
EDIT: (A note on timing: there could be a 1h shift due to UTC vs local time, and a few minutes difference between the submission and the exception - still it's not clear to me why the messages in the log file and in the process report are not the same)
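To make the timing comparison concrete, here is a minimal sketch of how a daemon-log timestamp could be normalised to UTC before comparing it with a time from the process report. The format string is inferred from the log line above and assumes month/day/year order; the UTC+1 offset and the helper names `log_time_to_utc` and `roughly_same_event` are assumptions for illustration, not part of AiiDA.

```python
from datetime import datetime, timedelta, timezone

# Format inferred from the daemon log line, e.g. "02/10/2021 11:09:57 PM".
# Assumption: month/day/year order.
LOG_FORMAT = "%m/%d/%Y %I:%M:%S %p"

# Assumption: the daemon log is written in local time at UTC+1.
ASSUMED_LOCAL_TZ = timezone(timedelta(hours=1))


def log_time_to_utc(stamp, local_tz=ASSUMED_LOCAL_TZ):
    """Parse a daemon-log timestamp and express it in UTC."""
    naive = datetime.strptime(stamp, LOG_FORMAT)
    return naive.replace(tzinfo=local_tz).astimezone(timezone.utc)


def roughly_same_event(log_stamp, report_utc, tolerance=timedelta(minutes=10)):
    """Return True if the two timestamps differ by at most `tolerance`."""
    return abs(log_time_to_utc(log_stamp) - report_utc) <= tolerance


if __name__ == "__main__":
    print(log_time_to_utc("02/10/2021 11:09:57 PM").isoformat())
```

If two messages still do not line up after this normalisation and a tolerance of a few minutes, the 1h shift alone does not explain the mismatch.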
> There's also a strange thing - in the daemon log file, I see different errors (for different nodes). E.g. I have this
> so no error... Any idea why? (even the timing is different) I'm confused!
It is also worth noting that if the RPC/broadcast subscribers fail to register when the process is being created on the daemon, these processes will not be able to receive kill/pause/play/status messages.
Catching of the TimeoutError was added in aiidateam/plumpy#81
@sphuber/@muhrin do you think this is ideal behaviour or, if the process is being recreated on the daemon, could/should this raise an exception instead, so that the task can be re-queued by RMQ?
Anyway, I think this should be opened as a separate, specific issue.
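The two options discussed here can be sketched roughly as follows. This is not plumpy's actual code: `FlakyCommunicator`, `WorkingCommunicator`, `register_rpc` and the `reraise` flag are hypothetical names for illustration. Only the `add_rpc_subscriber(subscriber, identifier)` call shape and the `concurrent.futures.TimeoutError` are taken from the traceback above.

```python
import concurrent.futures
import logging

LOGGER = logging.getLogger("sketch.processes")


class FlakyCommunicator:
    """Hypothetical stand-in for a communicator whose broker call times out."""

    def add_rpc_subscriber(self, subscriber, identifier=None):
        raise concurrent.futures.TimeoutError()


class WorkingCommunicator:
    """Hypothetical stand-in for a communicator that registers successfully."""

    def __init__(self):
        self.subscribers = {}

    def add_rpc_subscriber(self, subscriber, identifier=None):
        self.subscribers[identifier] = subscriber
        return identifier


def register_rpc(communicator, pid, on_message, reraise=False):
    """Try to register `on_message` as the RPC subscriber for process `pid`.

    reraise=False: log the timeout and carry on; the process then runs but
                   cannot receive kill/pause/play/status messages.
    reraise=True:  propagate the timeout so the caller can decide to fail
                   and let the task be picked up again.
    """
    try:
        return communicator.add_rpc_subscriber(on_message, identifier=str(pid))
    except concurrent.futures.TimeoutError:
        LOGGER.exception("Process<%s> failed to register as an RPC subscriber", pid)
        if reraise:
            raise
        return None


if __name__ == "__main__":
    print(register_rpc(WorkingCommunicator(), 112337, print))
    print(register_rpc(FlakyCommunicator(), 112337, print))
```

With `reraise=False` the failure is only visible in the daemon log, which would match the symptom of an error in the log file with nothing in the process report. Whether that is the actual cause here is an assumption.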
Originally posted by @giovannipizzi in #4745 (comment)