fix hanging check_mxruntime_health nagios check #12

Closed
wants to merge 1 commit into from

jtwaleson
Contributor

If the runtime admin port did not respond, the check would hang. Hung
checks would keep piling up until the system ran out of memory.

This introduces a default timeout of 60 seconds. Alternatively, a
ping could be done before checking the health, but a timeout is still
needed for edge cases.
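
To illustrate the idea, here is a minimal sketch of a health check that gives up after a timeout and reports a Nagios state instead of hanging. It is not the code from this pull request: the function name `check_health`, the JSON payload, and the `health` reply field are assumptions made for the example, and only the Python standard library is used.

```python
import json
import urllib.request

# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_health(url, timeout=60):
    """Ask the admin port for its health; return (nagios_state, message)."""
    # Hypothetical payload; the real admin API may expect something else.
    body = json.dumps({"action": "check_health", "params": {}}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # The admin port is assumed to be local, so ignore any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        # `timeout` bounds each blocking socket operation (connect, send, recv).
        with opener.open(request, timeout=timeout) as response:
            reply = json.loads(response.read().decode("utf-8"))
    except OSError as exc:
        # Timeouts, refused connections and HTTP error statuses end up here.
        return UNKNOWN, "admin port did not answer: %s" % exc
    except ValueError as exc:
        return UNKNOWN, "unparseable reply: %s" % exc
    # Hypothetical reply field, used only to show the state mapping.
    if isinstance(reply, dict) and reply.get("health") == "healthy":
        return OK, "healthy"
    return CRITICAL, "unhealthy: %r" % (reply,)
```

A Nagios wrapper would print the message and exit with the returned state. Note that the timeout applies per socket operation, so a server that trickles out bytes slowly could still exceed it in total; that is one of the edge cases a separate ping would not cover either.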

@knorrie
Member

knorrie commented Sep 7, 2017

Thanks. Did you test this?

@jtwaleson
Contributor Author

No :'( Only in my head.

@knorrie
Member

knorrie commented Nov 2, 2017

Actually, this is not the only place where a timeout would be good to add. Currently, if someone damages their running app so badly that it keeps accepting connections and requests but never answers them, you're also out of luck with the CLI. Starting m2ee will just hang on the first runtime_status call, and you won't get a prompt to try to stop the app.

I'd propose changing the timeout=None default in client.request(action, params, timeout) to a sane, low value that is sufficient for almost all admin actions, which should return instantly. The ones that can take longer, like start and stop, already define their own custom timeouts.
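
Roughly, the proposal could look like the sketch below. This is not the actual m2ee client: the class `AdminClient`, the `_send` helper, the payload shape, and both timeout values (10 and 600 seconds) are assumptions chosen for illustration. Only the `request(action, params, timeout)` signature and the `runtime_status` and `start` action names come from the discussion above.

```python
import json
import urllib.request

DEFAULT_TIMEOUT = 10  # seconds; hypothetical "sane low default"


class AdminClient:
    """Illustrative stand-in for the admin API client, not the real one."""

    def __init__(self, url, default_timeout=DEFAULT_TIMEOUT):
        self.url = url
        self.default_timeout = default_timeout

    def _send(self, body, timeout):
        # Hypothetical transport helper: POST the body, parse a JSON reply.
        request = urllib.request.Request(
            self.url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))

    def request(self, action, params=None, timeout=None):
        # timeout=None no longer means "wait forever"; it means "use the default".
        if timeout is None:
            timeout = self.default_timeout
        payload = {"action": action, "params": params or {}}
        return self._send(json.dumps(payload).encode("utf-8"), timeout)

    def runtime_status(self):
        # Should return instantly, so the low default applies.
        return self.request("runtime_status")

    def start(self, timeout=600):
        # Long-running action keeps its own, larger timeout (value assumed).
        return self.request("start", timeout=timeout)
```

With this shape, quick calls inherit the low default without any change at the call site, while slow actions keep passing their own value explicitly.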

@knorrie
Member

knorrie commented Jan 29, 2018

Moving to #26

@knorrie knorrie closed this Jan 29, 2018
@knorrie knorrie deleted the fix_hanging_mxruntime_health_check branch February 19, 2018 18:56
3 participants