New worker fails to connect until master restart #2302

Closed
ykvch opened this issue Feb 22, 2023 · 2 comments · Fixed by #2309

Contributor

ykvch commented Feb 22, 2023

Issue description

  • When an existing worker tries to reconnect to the master, the following exception occasionally appears on the master host:
locust-1  | Traceback (most recent call last):
locust-1  |   File "src/gevent/greenlet.py", line 908, in gevent._gevent_cgreenlet.Greenlet.run
locust-1  |   File "/opt/venv/lib/python3.10/site-packages/locust/runners.py", line 1090, in client_listener
locust-1  |     self.clients[msg.node_id].state = STATE_SPAWNING
locust-1  |   File "/opt/venv/lib/python3.10/site-packages/locust/runners.py", line 639, in __getitem__
locust-1  |     return self._worker_nodes[k]
locust-1  | KeyError: 'worker.perf.local_881aaac5d5124549a12e8ccc01c3d36b'
locust-1  | 2023-02-20T12:22:08Z <Greenlet at 0x7fa740e77f60: <bound method MasterRunner.client_listener of <locust.runners.MasterRunner object at 0x7fa740dea4a0>>> failed with KeyError
locust-1  | 
locust-1  | [2023-02-20 12:22:08,909] locust.perf.local/CRITICAL/locust.runners: Unhandled exception in greenlet: <Greenlet at 0x7fa740e77f60: <bound method MasterRunner.client_listener of <locust.runners.MasterRunner object at 0x7fa740dea4a0>>>
locust-1  | Traceback (most recent call last):
locust-1  |   File "src/gevent/greenlet.py", line 908, in gevent._gevent_cgreenlet.Greenlet.run
locust-1  |   File "/opt/venv/lib/python3.10/site-packages/locust/runners.py", line 1090, in client_listener
locust-1  |     self.clients[msg.node_id].state = STATE_SPAWNING
locust-1  |   File "/opt/venv/lib/python3.10/site-packages/locust/runners.py", line 639, in __getitem__
locust-1  |     return self._worker_nodes[k]
locust-1  | KeyError: 'worker.perf.local_881aaac5d5124549a12e8ccc01c3d36b'
  • After that, no new workers are able to connect to the master.
  • A master restart is required to recover.

Fix proposal

runners.py:1090:

try:
    self.clients[msg.node_id].state = STATE_SPAWNING
except KeyError:
    logger.warning(f"Got spawning message from unknown worker {msg.node_id}. Asking worker to quit.")
    self.server.send_to_client(Message("quit", None, msg.node_id))
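
To illustrate why the guard matters, here is a minimal, self-contained sketch that does not use Locust or gevent. It assumes the listener is a single loop, so an uncaught KeyError ends the loop and later messages are never processed, which would match the symptom above. The names `Worker`, `listen`, the `"ready"` message type and the tuple message format are hypothetical stand-ins, not Locust's API.

```python
import logging

logger = logging.getLogger("listener_sketch")

STATE_SPAWNING = "spawning"


class Worker:
    """Hypothetical stand-in for the master's record of a worker node."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.state = "ready"


def listen(messages, clients, guarded):
    """Process (msg_type, node_id) tuples; return ids that were asked to quit."""
    asked_to_quit = []
    for msg_type, node_id in messages:
        if msg_type == "ready":
            # A new worker registers itself with the master.
            clients[node_id] = Worker(node_id)
        elif msg_type == "spawning":
            if not guarded:
                # Unguarded lookup: an unknown node_id raises KeyError,
                # which propagates out of the loop and stops the listener.
                clients[node_id].state = STATE_SPAWNING
                continue
            try:
                clients[node_id].state = STATE_SPAWNING
            except KeyError:
                # Guarded lookup: log, record the quit request, keep listening.
                logger.warning("Got spawning message from unknown worker %s", node_id)
                asked_to_quit.append(node_id)
    return asked_to_quit


# A stale worker sends "spawning" before a new worker tries to register.
messages = [("spawning", "ghost"), ("ready", "w1")]

unguarded_clients = {}
try:
    listen(messages, unguarded_clients, guarded=False)
except KeyError:
    pass  # the loop died; "w1" was never registered

guarded_clients = {}
quit_requests = listen(messages, guarded_clients, guarded=True)
```

In the unguarded run the loop stops at the first message, so `w1` is never registered; in the guarded run the unknown worker is recorded in `quit_requests` and `w1` registers normally.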

ykvch added the bug label Feb 22, 2023

@tntC4stl3

I am encountering the same issue. In my scenario, it occurs when I add a batch of new workers to an existing test run. Note that --enable-rebalancing is enabled on my master.

Collaborator

cyberw commented Feb 24, 2023

Sounds like a good fix, please make a PR!

cyberw added a commit that referenced this issue Feb 27, 2023
Fix #2302 unknown worker spawning message