Agents show EXITED in supervisorctl  #438

@nancyc12

Issue

Some agents on the TEL agent hosts were found in the EXITED state and did not recover on their own.

ubuntu@juju-462031-0:~$ sudo supervisorctl status | grep EXITED
hp-eliteone800-g6-all-in-one-pc-c28167                        EXITED    Jan 06 11:01 AM
rpi3bp009                                                     EXITED    Jan 06 11:12 AM
rpi4b1g001                                                    EXITED    Jan 06 11:09 AM
rpi4b2g001                                                    EXITED    Jan 06 02:30 PM
rpi4b2g002                                                    EXITED    Jan 06 11:06 AM
rpi4b4g001                                                    EXITED    Jan 06 02:29 PM
rpi4b4g002                                                    EXITED    Jan 06 11:04 AM
rpi4b8g001                                                    EXITED    Jan 06 02:22 PM
rpi4b8g002                                                    EXITED    Jan 06 11:08 AM

ubuntu@juju-47ff0e-1:~$ sudo supervisorctl status | grep EXITED
rpi400001                                                      EXITED    Jan 06 02:32 PM

ubuntu@juju-ccf3c0-0:~$ sudo supervisorctl status | grep EXITED
cm3p001                                                        EXITED    Jan 06 11:21 AM
dell-precision7550-c27818                                      EXITED    Jan 09 08:29 AM
rpi400002                                                      EXITED    Jan 06 11:09 AM

Log

ubuntu@juju-462031-0:~$ sudo supervisorctl tail -f rpi4b8g002
==> Press Ctrl-C to exit <==
68c75fc38c/events (Caused by ResponseError('too many 503 error responses'))
[25-01-06 11:08:04]   ERROR: (cmd.py:32)| 'NoneType' object has no attribute 'status_code'
Traceback (most recent call last):
  File "/srv/testflinger-venv/lib/python3.10/site-packages/testflinger_agent/cmd.py", line 27, in main
    start_agent()
  File "/srv/testflinger-venv/lib/python3.10/site-packages/testflinger_agent/__init__.py", line 139, in start_agent
    agent.process_jobs()
  File "/srv/testflinger-venv/lib/python3.10/site-packages/testflinger_agent/agent.py", line 328, in process_jobs
    event_emitter.emit_event(TestEvent.CLEANUP_START)
  File "/srv/testflinger-venv/lib/python3.10/site-packages/testflinger_agent/event_emitter.py", line 55, in emit_event
    self.client.post_status_update(
  File "/srv/testflinger-venv/lib/python3.10/site-packages/testflinger_agent/client.py", line 448, in post_status_update
    % (status_update_uri, job_request.status_code)
AttributeError: 'NoneType' object has no attribute 'status_code'
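
Reading the traceback, the error-reporting path in post_status_update appears to format job_request.status_code even when the request never returned a response (here, after too many 503 responses), so job_request is None. The snippet below is only a minimal sketch of a None-safe way to build such a message. The names FakeResponse and describe_post_failure and the message wording are hypothetical and are not taken from testflinger_agent.

```python
from typing import Optional


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; only status_code is modelled."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code


def describe_post_failure(uri: str, response: Optional[FakeResponse]) -> str:
    """Build an error message without assuming a response object exists."""
    if response is None:
        # The request failed before any response came back (for example,
        # retries were exhausted), so there is no status code to report.
        return "Unable to post status update to: %s (no response received)" % uri
    return "Unable to post status update to: %s (error: %d)" % (
        uri,
        response.status_code,
    )
```

With a guard like this, a failed status update would be logged rather than raising AttributeError and taking the agent process down.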

Workaround

Manually start each agent.

sudo supervisorctl start rpi4b8g002
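
When many agents are affected, the same workaround could be scripted. The following is a rough sketch, assuming the `supervisorctl status` output has the column layout shown above (name, state, then the timestamp); the helper names exited_agents and start_exited_agents are made up for illustration.

```python
import subprocess
from typing import List


def exited_agents(status_output: str) -> List[str]:
    """Return the names of processes whose state column reads EXITED."""
    names = []
    for line in status_output.splitlines():
        fields = line.split()
        # Expected layout: <name> <STATE> <description...>
        if len(fields) >= 2 and fields[1] == "EXITED":
            names.append(fields[0])
    return names


def start_exited_agents() -> List[str]:
    """Start every EXITED process on this host and return their names."""
    # check=False: supervisorctl status may exit non-zero when some
    # processes are not running, which is exactly the case handled here.
    status = subprocess.run(
        ["sudo", "supervisorctl", "status"],
        capture_output=True,
        text=True,
        check=False,
    )
    names = exited_agents(status.stdout)
    if names:
        # supervisorctl start accepts several process names at once.
        subprocess.run(["sudo", "supervisorctl", "start", *names], check=True)
    return names
```

Calling start_exited_agents() on each agent host would restart the affected agents in one go. It does not address the underlying crash.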

Labels: bug
