nicholaskuechler
Collaborator

@nicholaskuechler nicholaskuechler commented Feb 20, 2025

The code doesn't have mappings for all possible Ironic provision states. If we hit an Ironic state with no mapping, the result is an exception:

 kubectl logs -f ironic-node-update-28fgn
2025-02-19 14:23:54 +0000 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): nautobot-default.nautobot.svc.cluster.local:80
2025-02-19 14:23:54 +0000 - urllib3.connectionpool - DEBUG - http://nautobot-default.nautobot.svc.cluster.local:80 "GET /api/ HTTP/1.1" 200 977
2025-02-19 14:23:55 +0000 - urllib3.connectionpool - DEBUG - http://nautobot-default.nautobot.svc.cluster.local:80 "GET /api/dcim/devices/6cc75fc1-756a-4b19-bbab-fe8e63eee45b/ HTTP/1.1" 200 2243
2025-02-19 14:23:55 +0000 - urllib3.connectionpool - DEBUG - http://nautobot-default.nautobot.svc.cluster.local:80 "PATCH /api/dcim/devices/6cc75fc1-756a-4b19-bbab-fe8e63eee45b/ HTTP/1.1" 400 424
Traceback (most recent call last):
  File "/opt/venv/bin/sync-provision-state", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/understack_workflows/main/sync_provision_state.py", line 60, in main
    do_action(
  File "/opt/venv/lib/python3.11/site-packages/understack_workflows/main/sync_provision_state.py", line 44, in do_action
    nautobot.update_cf(
  File "/opt/venv/lib/python3.11/site-packages/understack_workflows/nautobot.py", line 76, in update_cf
    response = device.save()
               ^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/pynautobot/core/response.py", line 421, in save
    if req.patch({i: serialized[i] for i in diff}):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/pynautobot/core/query.py", line 433, in patch
    return self._make_call(verb="patch", data=data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/pynautobot/core/query.py", line 294, in _make_call
    raise RequestError(req)
pynautobot.core.query.RequestError: The request failed with code 400 Bad Request: {'__all__': ["Invalid value for custom field 'ironic_provision_state': Invalid choice (clean failed). Available choices are: active, adopting, available, clean wait, cleaning, deleting, deploy failed, deploying, enroll, error, inspect failed, inspect wait, inspecting, manageable, rescue, rescue failed, rescue wait, rescuing, service failed, service wait, servicing, unrescue failed, unrescuing, verifying, wait call-back"]}
time="2025-02-19T14:23:56.116Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1

The Argo workflow fails, and the workflow pod that ran this code goes into the Error state because of the exception above. The errored pod then triggers Prometheus alerts, and we have to clean up the failed pods manually.

@nicholaskuechler nicholaskuechler force-pushed the sync-provision-state-20250220 branch 3 times, most recently from fee6de3 to 09c65c1 February 20, 2025 20:28
Collaborator

@skrobul skrobul left a comment

Quarantine does not accurately describe what's happening. The fallback to None was built for a reason.

Normally, it should not make a call to Nautobot if the mapped state is None. There is obviously a bug in that code: the nautobot.update_cf call should also be under that if block.
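
A minimal sketch of the guard described above, not the actual understack-workflows code. The mapping contents, the `RecordingNautobot` stand-in, the `sync_provision_state` function name, and the argument lists of `update_cf` and `update_device_status` are all assumptions made for illustration; only the idea (no Nautobot write at all when the mapped state is None) comes from this thread.

```python
from typing import Optional

# Hypothetical, partial mapping from Ironic provision states to Nautobot
# statuses. The values are illustrative. States missing from this dict
# (for example "clean failed") resolve to None.
PROVISION_STATE_TO_STATUS = {
    "active": "Active",
    "available": "Available",
    "deploying": "Provisioning",
}


class RecordingNautobot:
    """Stand-in client that records calls instead of talking to Nautobot."""

    def __init__(self) -> None:
        self.calls = []

    def update_cf(self, device_id: str, field_name: str, field_value: str) -> None:
        self.calls.append(("update_cf", device_id, field_name, field_value))

    def update_device_status(self, device_id: str, device_status: str) -> None:
        self.calls.append(("update_device_status", device_id, device_status))


def sync_provision_state(nautobot, device_id: str, provision_state: str) -> bool:
    """Push the state to Nautobot only when a mapping exists.

    Returns True if Nautobot was updated, False if the state was skipped.
    """
    status: Optional[str] = PROVISION_STATE_TO_STATUS.get(provision_state)
    if status is None:
        # Unmapped state: make no Nautobot call, so nothing can be rejected
        # with a 400 and the workflow pod exits cleanly.
        return False

    # Both writes sit behind the same guard.
    nautobot.update_cf(device_id, "ironic_provision_state", provision_state)
    nautobot.update_device_status(device_id, status)
    return True
```

With this shape, an unmapped state such as `clean failed` produces no request, while a mapped state such as `active` produces both writes.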

@nicholaskuechler nicholaskuechler force-pushed the sync-provision-state-20250220 branch from 09c65c1 to e4b963e February 20, 2025 21:03
@nicholaskuechler nicholaskuechler changed the title from "fix(understack-workflows): set a default status which exists in Nautobot instead of None which throws exception" to "fix(understack-workflows): don't try to set nautobot status if it doesn't exist" Feb 20, 2025
@nicholaskuechler
Collaborator Author

> Quarantine does not accurately describe what's happening. The fallback to None was built for a reason.
>
> Normally, it should not make a call to Nautobot if the mapped state is None. There is obviously a bug in that code: the nautobot.update_cf call should also be under that if block.

Fixed up and updated. Thanks!

@nicholaskuechler nicholaskuechler force-pushed the sync-provision-state-20250220 branch from e4b963e to 19ced87 February 21, 2025 20:20
@skrobul skrobul added this pull request to the merge queue Feb 24, 2025
Merged via the queue into main with commit 53f214e Feb 24, 2025
27 checks passed
@skrobul skrobul deleted the sync-provision-state-20250220 branch February 24, 2025 12:13