cephadm: have agent check for errors before json loading mgr response #56961

Merged
merged 1 commit into ceph:main on Apr 30, 2024

@adk3798 adk3798 commented Apr 17, 2024

Currently, the agent calls json.loads on the response payload before checking the return code, so when the request fails it reports

Failed to send metadata to mgr: the JSON object must be str, bytes or bytearray, not ConnectionRefusedError

which masks the actual failure.

Fixes: https://tracker.ceph.com/issues/65553
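
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It is not the cephadm code: the function names, the `(status, payload)` return shape, and the 200 check are assumptions made for illustration, with the failed request's payload assumed to be the exception object itself (as the error message suggests).

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical transport signature: returns (status_code, payload). On failure
# the payload is assumed to be an exception object or error text, not JSON.
Query = Callable[[], Tuple[int, Any]]


def parse_then_check(query: Query) -> Dict[str, Any]:
    """Problematic order: parse first, check the status afterwards."""
    status, payload = query()
    body = json.loads(payload)  # raises TypeError if payload is an exception
    if status != 200:
        raise RuntimeError(f'mgr returned status {status}: {body}')
    return body


def check_then_parse(query: Query) -> Dict[str, Any]:
    """Safer order: check the status first, parse only on success."""
    status, payload = query()
    if status != 200:
        # Include the raw payload so the underlying failure stays visible.
        raise RuntimeError(
            f'request to mgr failed with status {status}: {payload}'
        )
    return json.loads(payload)


def refused() -> Tuple[int, Any]:
    # Simulates a failed request whose payload is the underlying exception.
    return 500, ConnectionRefusedError(111, 'Connection refused')


def ok() -> Tuple[int, Any]:
    # Simulates a successful request with a JSON body.
    return 200, '{"result": "ok"}'
```

With `refused`, `parse_then_check` fails inside `json.loads` with a `TypeError` about the payload type, while `check_then_parse` raises a `RuntimeError` that carries the status and the connection error text.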

@adk3798 adk3798 requested a review from a team as a code owner April 17, 2024 15:43
Currently, since it tries to json.loads the response
payload before checking the return code, if there was
an error it fails with

Failed to send metadata to mgr: the JSON object must be str, bytes or bytearray, not ConnectionRefusedError

which is masking the actual failure.

Also adds more context to the RuntimeError raised

Fixes: https://tracker.ceph.com/issues/65553

Signed-off-by: Adam King <adking@redhat.com>
@adk3798 adk3798 force-pushed the agent-check-error-before-json branch from 04f0835 to 287bd34 Compare April 22, 2024 15:07
adk3798 commented Apr 30, 2024

https://pulpito.ceph.com/adking-2024-04-30_05:42:49-orch:cephadm-wip-adk-testing-2024-04-29-2009-distro-default-smithi/

Most failures were cluster log failures, which are still in the process of being cleaned up.

Besides that, failures were:

  • 1 instance of https://tracker.ceph.com/issues/65718 (known issue)
  • 4 instances of mds_upgrade_sequence test failing (known issue)
  • 1 instance of staggered upgrade with agent failing (known issue)
  • some dead jobs, either a reimaging issue or getting stuck before the test has actually begun (seemingly on the nvme-loop task) (known issue)

@adk3798 adk3798 merged commit 3e1c144 into ceph:main Apr 30, 2024
10 of 12 checks passed