pybind/mgr/prometheus: add StandbyModule and handle failed MON cluster #19744

Merged
merged 4 commits into from Jan 22, 2018

@jan--f jan--f commented Jan 2, 2018

No description provided.

Jan Fajerski added 2 commits January 2, 2018 18:18
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
@jcsp jcsp self-requested a review January 2, 2018 17:36
@jcsp jcsp added the mgr label Jan 2, 2018
cherrypy.engine.start()
cherrypy.engine.start()

double call to start()

cherrypy.engine.block()

def shutdown(self):
    self.serving = False

looks like self.serving was never really being used anywhere, so can remove it here too

jcsp commented Jan 10, 2018

Could you add a test similar to TestDashboard.test_standby in qa/tasks/mgr?

I can't remember if you've worked on those tests before, but they're generally convenient to run in a vstart cluster with something like:

LD_LIBRARY_PATH=lib/ PYTHONPATH=lib/cython_modules/lib.2 python ../qa/tasks/vstart_runner.py --create tasks.mgr.test_module_selftest.TestModuleSelftest.test_zabbix

(just copied from my bash history)

@jan--f jan--f force-pushed the mgr-prometheus-standby-mondown branch 2 times, most recently from c5a1c1a to d3c2f5f Compare January 13, 2018 16:29

jan--f commented Jan 13, 2018

Tidied up the module as per your comments, thanks for that.
I also added a test_prometheus.py in qa/tasks/mgr, taking test_dashboard.py as a template. I wasn't able to test it locally though. Your command needs teuthology in the PYTHONPATH I assume.

jcsp commented Jan 15, 2018

That's true about teuthology, but you don't need anything running, just the Python source: clone it, run ./bootstrap in the clone dir, and then source ./virtualenv/bin/activate

Jan Fajerski added 2 commits January 22, 2018 13:21
Calling cherrypy.engine.block() in the standby module results in a failing
mgr failover.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
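
To illustrate the commit message above: one way to avoid blocking on the engine is to have serve() wait on an event that shutdown() sets. The following is only a minimal sketch under assumptions, not the code from this PR. FakeEngine is a hypothetical stand-in for cherrypy.engine so the example runs without CherryPy, and StandbySketch and shutdown_event are illustrative names.

```python
import threading


class FakeEngine:
    """Hypothetical stand-in for cherrypy.engine (assumption, not the real API)."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class StandbySketch:
    """Illustrative standby module: serve() returns once shutdown() is called."""

    def __init__(self, engine):
        self.engine = engine
        self.shutdown_event = threading.Event()

    def serve(self):
        self.engine.start()
        # Wait on our own event instead of blocking inside the engine,
        # so a shutdown() call from another thread always releases serve().
        self.shutdown_event.wait()
        self.shutdown_event.clear()
        self.engine.stop()

    def shutdown(self):
        self.shutdown_event.set()


engine = FakeEngine()
module = StandbySketch(engine)
worker = threading.Thread(target=module.serve)
worker.start()
module.shutdown()        # simulates the mgr asking the standby to stop
worker.join(timeout=5)
```

With this shape, serve() returns as soon as shutdown() is called, which is the property a failover needs from a standby.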
@jan--f jan--f force-pushed the mgr-prometheus-standby-mondown branch from d3c2f5f to 4a45b02 Compare January 22, 2018 12:26

jan--f commented Jan 22, 2018

@jcsp ok this works for me now.

2018-01-22 13:23:32,905.905 INFO:__main__:Stopped test: test_urls (tasks.mgr.test_prometheus.TestPrometheus) in 21.410567s
2018-01-22 13:23:32,905.905 INFO:__main__:
2018-01-22 13:23:32,905.905 INFO:__main__:----------------------------------------------------------------------
2018-01-22 13:23:32,905.905 INFO:__main__:Ran 2 tests in 55.564s
2018-01-22 13:23:32,906.906 INFO:__main__:
2018-01-22 13:23:32,906.906 INFO:__main__:OK

@jcsp jcsp merged commit c05d963 into ceph:master Jan 22, 2018
jcsp pushed a commit to jcsp/ceph that referenced this pull request Jan 22, 2018
Added in ceph#19744

Signed-off-by: John Spray <john.spray@redhat.com>

jcsp commented Jan 22, 2018

Added the .yaml snippet to get the new prometheus test running in the lab environment here: #20047

jcsp pushed a commit to jcsp/ceph that referenced this pull request Jan 22, 2018
This was throwing IOError("Port 9283 not free on '::'",)
when trying to serve, since merging ceph#19744

It's because the standbys (on the same node as the active) are
now trying to listen too.

Signed-off-by: John Spray <john.spray@redhat.com>
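
The port conflict described in the commit message above can be reproduced with plain sockets. This is a minimal sketch of the underlying failure mode only, assuming a single host where two listeners ask for the same port. It does not use CherryPy and does not claim to show how the referenced commit fixed the test; try_listen is a hypothetical helper.

```python
import socket


def try_listen(port, host="127.0.0.1"):
    """Return a listening socket, or None if the port cannot be bound."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        # Address already in use: another listener holds this port.
        sock.close()
        return None


# The "active" listener takes a port (0 lets the OS pick a free one).
active = try_listen(0)
port = active.getsockname()[1]

# A "standby" on the same node asking for the same port is refused.
standby_same_port = try_listen(port)

# Asking for a different port succeeds.
standby_other_port = try_listen(0)
other_port = standby_other_port.getsockname()[1]

active.close()
standby_other_port.close()
```

Giving each daemon on a node its own port is one way to avoid the clash in a single-host test cluster.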
jcsp pushed a commit to jcsp/ceph that referenced this pull request Jan 23, 2018
Added in ceph#19744

Signed-off-by: John Spray <john.spray@redhat.com>
jcsp pushed a commit to jcsp/ceph that referenced this pull request Jan 23, 2018
This was throwing IOError("Port 9283 not free on '::'",)
when trying to serve, since merging ceph#19744

It's because the standbys (on the same node as the active) are
now trying to listen too.

Fixes: https://tracker.ceph.com/issues/22755
Signed-off-by: John Spray <john.spray@redhat.com>
cache-nez pushed a commit to cache-nez/ceph that referenced this pull request Feb 6, 2018
Added in ceph#19744

Signed-off-by: John Spray <john.spray@redhat.com>
cache-nez pushed a commit to cache-nez/ceph that referenced this pull request Feb 6, 2018
This was throwing IOError("Port 9283 not free on '::'",)
when trying to serve, since merging ceph#19744

It's because the standbys (on the same node as the active) are
now trying to listen too.

Fixes: https://tracker.ceph.com/issues/22755
Signed-off-by: John Spray <john.spray@redhat.com>
smithfarm pushed a commit to smithfarm/ceph that referenced this pull request Apr 9, 2018
This was throwing IOError("Port 9283 not free on '::'",)
when trying to serve, since merging ceph#19744

It's because the standbys (on the same node as the active) are
now trying to listen too.

Fixes: https://tracker.ceph.com/issues/22755
Signed-off-by: John Spray <john.spray@redhat.com>
(cherry picked from commit e2c68d5)