Conversation
I personally like the approach.
kinto/core/views/heartbeat.py
Outdated
for name, callable in heartbeats.items():
    status[name] = callable(request)
seconds = float(request.registry.settings['heartbeat_timeout_seconds'])
with timeout(seconds):
nit: missing comment that it could raise TimeoutException
Can we add a test with the error message displayed to the user?
I would probably go for a global timeout as well as a per-heartbeat timeout.
That would be reimplementing harakiri :(
Too bad.
You should do it the other way: 1/ call each heartbeat function in a separate thread, using concurrent.futures.ThreadPoolExecutor. I am making the assumption that each heartbeat function is isolated enough to run in parallel with the other ones.
also: r- |
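A minimal sketch of the suggested approach, for illustration only: the helper name `run_heartbeats`, the 10-second default, and the `__heartbeat_name` attribute are assumptions, not the actual patch.

```python
# Sketch: run every heartbeat callable in its own thread and collect
# results, marking entries that exceed the timeout as failed (False).
from concurrent.futures import ThreadPoolExecutor, wait

def run_heartbeats(heartbeats, request, timeout=10.0):
    """Run heartbeat callables in parallel; return {name: bool}."""
    status = {}
    executor = ThreadPoolExecutor(max_workers=max(1, len(heartbeats)))
    futures = []
    for name, func in heartbeats.items():
        future = executor.submit(func, request)
        future.__heartbeat_name = name  # remember which backend this is
        futures.append(future)
    done, not_done = wait(futures, timeout=timeout)
    for future in done:
        status[future.__heartbeat_name] = future.result()
    for future in not_done:
        status[future.__heartbeat_name] = False  # exceeded the timeout
    executor.shutdown(wait=False)  # do not block on the slow ones
    return status
```

`wait()` returns the futures split into done and pending sets, so a hung backend only costs the shared timeout instead of blocking the whole view.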
e14bed7 to ad1bcad
docs/configuration/settings.rst
Outdated
| |                                                 |              | endpoint: ``/v1`` redirects to ``/v1/`` and ``/buckets/default/``        |
| |                                                 |              | to ``/buckets/default``. No redirections are made when turned off.       |
| +-------------------------------------------------+--------------+--------------------------------------------------------------------------+
| | kinto.heartbeat_timeout_seconds                 | ``2``        | The maximum duration of each heartbeat entry. Depending on the number of |
There was a problem hiding this comment.
2 seconds seems very low. Since we're running in parallel, I think we could use 10 seconds by default, maybe?
error_msg = "'%s' heartbeat has exceeded timeout of %s seconds."
logger.error(error_msg % (name, seconds))
# If any has failed, return a 503 error response.
there's one thing missing: when one (or several) heartbeat(s) fail, we need to catch the futures' exceptions and log them here with logger.exception so we can track down what happened. The future object holds that exception, so we just need to iterate over them and collect the tracebacks.
another option is to catch them in heartbeat_check and push them in an exceptions list
also: make sure you call logger.exception in the main thread, sequentially, on the collected tracebacks; otherwise you might get mangled exceptions, since two functions can fail in parallel at the same instant.
# A heartbeat is supposed to return True or False, and never raise.
# Just in case, go through results to spot any potential exception.
for future in done:
if we re-raise here we're getting a 500, I think. I think what you want is (not tested):

for future in done:
    exc = future.exception()
    if exc is not None:
        logger.error("%r failed" % future.__heartbeat_name)
        logger.error(exc)
Yes, I did not change the previous behaviour regarding heartbeats (the try/except is managed there)
ok I see, so there's the convention that the heartbeat functions should catch all errors. But we never know what might happen in external heartbeat functions.
Is that a behaviour we want to keep?
e.g. do we want the global heartbeat to completely fail when one heartbeat produces an error, or do we want to log that error and flag that backend as false in the result?
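The second option could look like this minimal sketch; the wrapper name `safe_heartbeat` is hypothetical.

```python
# Sketch: log any unexpected exception from an external heartbeat
# function and flag that backend as False, rather than letting the
# whole heartbeat view fail with a 500.
import logging

logger = logging.getLogger(__name__)

def safe_heartbeat(name, func, request):
    """Call one heartbeat function defensively; never raise."""
    try:
        return bool(func(request))
    except Exception:
        logger.exception("%r heartbeat raised unexpectedly.", name)
        return False
```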
kinto/core/views/heartbeat.py
Outdated
future.result()  # Will re-raise.
exc = future.exception()
if exc is not None:
    logger.error("'%s' heartbeat failed." % future.__heartbeat_name)
nit: you can use %r instead of '%s'
aaah that's why you used %r! :/
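For the record, `%r` formats with `repr()`, which adds the quotes that `'%s'` spells out by hand, a quick illustrative check:

```python
# %r applies repr(), so it quotes strings the way "'%s'" does manually.
name = "storage"
assert "'%s'" % name == "%r" % name  # both render the quoted name
```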
Looks great, r+