pybind/mgr/prometheus: add StandbyModule and handle failed MON cluster #19744
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
src/pybind/mgr/prometheus/module.py
cherrypy.engine.start()
cherrypy.engine.start()
double call to start()
src/pybind/mgr/prometheus/module.py
cherrypy.engine.block()

def shutdown(self):
    self.serving = False
Looks like self.serving was never really used anywhere, so it can be removed here too.
Could you add a test similar to TestDashboard.test_standby in qa/tasks/mgr? I can't remember if you've worked on those tests before, but they're generally convenient to run in a vstart cluster with something like:
(just copied from my bash history)
c5a1c1a to d3c2f5f
Tidied up the module as per your comments, thanks for that.
That's true about teuthology, but you don't need anything running, just the python source: just clone it, run ./bootstrap in the clone dir, and then source ./virtualenv/bin/activate
Calling cherrypy.engine.block() in the standby module results in a failing mgr failover.
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
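For readers following along, here is a minimal sketch of the general idea behind that commit message: run the HTTP loop in a background thread and park the serving thread on an event that shutdown() can release, rather than on a call that blocks indefinitely. It uses the standard library's http.server as a stand-in for cherrypy, and StandbySketch, _Handler and the "standby" response body are hypothetical names for illustration only. This is not the code from this PR and not the ceph-mgr module API.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class _Handler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a fixed body."""

    def do_GET(self):
        body = b"standby\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


class StandbySketch:
    """Hypothetical stand-in for a standby module (assumption, not Ceph code)."""

    def __init__(self, host="127.0.0.1", port=0):
        # port=0 asks the OS for any free port; the real module uses a fixed one.
        self._httpd = HTTPServer((host, port), _Handler)
        self._shutdown_event = threading.Event()
        self.port = self._httpd.server_address[1]

    def serve(self):
        # Run the HTTP loop in a background thread ...
        worker = threading.Thread(target=self._httpd.serve_forever, daemon=True)
        worker.start()
        # ... and wait on an event that shutdown() can set from another thread.
        self._shutdown_event.wait()
        self._httpd.shutdown()
        worker.join()
        self._httpd.server_close()

    def shutdown(self):
        # Releases serve() so the caller can proceed (e.g. with a failover).
        self._shutdown_event.set()
```

With this shape, serve() returns promptly once shutdown() is called from another thread, which is the property a standby needs when it is asked to stop.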
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
d3c2f5f to 4a45b02
@jcsp ok this works for me now.
Added in ceph#19744
Signed-off-by: John Spray <john.spray@redhat.com>
Added the .yaml snippet to get the new prometheus test running in the lab environment here: #20047
This was throwing IOError("Port 9283 not free on '::'",) when trying to serve, since merging ceph#19744.
It's because the standbys (on the same node as the active) are now trying to listen too.
Signed-off-by: John Spray <john.spray@redhat.com>
This was throwing IOError("Port 9283 not free on '::'",) when trying to serve, since merging ceph#19744.
It's because the standbys (on the same node as the active) are now trying to listen too.
Fixes: https://tracker.ceph.com/issues/22755
Signed-off-by: John Spray <john.spray@redhat.com>
This was throwing IOError("Port 9283 not free on '::'",) when trying to serve, since merging ceph#19744.
It's because the standbys (on the same node as the active) are now trying to listen too.
Fixes: https://tracker.ceph.com/issues/22755
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit e2c68d5)
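The port conflict described in these commit messages can be reproduced outside Ceph. The sketch below is only an illustration under stated assumptions: it uses plain sockets rather than cherrypy, binds 127.0.0.1 with an OS-chosen port instead of '::' and 9283, and port_is_free is a hypothetical helper, not something from the module.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": someone else is listening.
            return False
        return True


# The "active" daemon takes a port (0 lets the OS pick a free one).
active = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
active.bind(("127.0.0.1", 0))
active.listen(1)
taken_port = active.getsockname()[1]

# A "standby" on the same host that tries the same port cannot bind it.
standby_can_listen = port_is_free(taken_port)

active.close()
```

A probe like this is inherently racy (the port can be taken between the check and the real bind), so it only demonstrates why two daemons on one node cannot share a listening port; it is not a suggested fix.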