Fix issues with in-memory monitor stretch state #40835

Merged
merged 4 commits into ceph:master from wip-stretch-mon-state on Apr 13, 2021

gregsfortytwo
Member

This PR resolves https://tracker.ceph.com/issues/50308

My maintenance of the in-memory stretch states was quite sloppy, which could
result in non-leader monitors getting stuck in a half-degraded state, and
subsequently refusing to enable degraded states when they became leader.

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Signed-off-by: Greg Farnum <gfarnum@redhat.com>
…_mode

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
… calls!

Add header comment describing how this works now.

Fixes: https://tracker.ceph.com/issues/50308

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
@gregsfortytwo
Member Author

I did local testing to reproduce the not-changing-modes-correctly bug and verify this PR resolves it.

Ran the rados suite and it looks good: 462 jobs passed. Run:
https://pulpito.ceph.com/gregf-2021-04-13_09:22:15-rados-wip-stretch-mon-state-412-distro-basic-smithi/
Failures, none of which look related to this PR:
6043259: some cls_cas test issues; created ticket https://tracker.ceph.com/issues/50339
6043318: didn't finish scrub before timeout?
6043329: one monitor missed the first election after it finished synchronizing, but got in on the next one
6043351: known valgrind error, https://tracker.ceph.com/issues/50299
6043369: radosgw crash; nothing I could have impacted
6043442: cls_cas issue again
6043535: known valgrind error, https://tracker.ceph.com/issues/50299
6043582: this time test_cls_cas just timed out after the same initial failure
6043606: radosgw crash again
6043656: cls_cas issue again
6043691: https://tracker.ceph.com/issues/45721

@gregsfortytwo gregsfortytwo merged commit 05861ca into ceph:master Apr 13, 2021
@gregsfortytwo gregsfortytwo deleted the wip-stretch-mon-state branch April 13, 2021 22:38