New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log looping call exceptions #4008

Merged
merged 4 commits into from Oct 9, 2018

Conversation

Projects
None yet
3 participants
@erikjohnston
Member

erikjohnston commented Oct 5, 2018

If a looping call function errors, then it kills the loop entirely.
Currently it throws away the exception logs, so we should make it
actually log them.

Fixes #3929

erikjohnston added some commits Oct 5, 2018

Log looping call exceptions
If a looping call function errors, then it kills the loop entirely.
Currently it throws away the exception logs, so we should make it
actually log them.

Fixes #3929

@erikjohnston erikjohnston requested a review from matrix-org/synapse-core Oct 5, 2018

@hawkowl

This comment has been minimized.

Contributor

hawkowl commented Oct 8, 2018

Should we not restart the looping call, vs logging and then breaking?

@hawkowl

This comment has been minimized.

Contributor

hawkowl commented Oct 8, 2018

(I guess, though, we do catch it lower, so this actually shouldn't ever happen. Hmmmmm.)

@functools.wraps(f)
def wrap(*args, **kwargs):
try:
logger.info("Running looping call")

This comment has been minimized.

@hawkowl

hawkowl Oct 8, 2018

Contributor

This doesn't seem like a useful thing to log at INFO.

This comment has been minimized.

@erikjohnston

erikjohnston Oct 8, 2018

Member

Woops. That's me failing to remove my debugging.....

@erikjohnston

This comment has been minimized.

Member

erikjohnston commented Oct 8, 2018

Yeah, opinions welcome if we just want to restart the loop. I'm a bit cautious about changing the behaviour vs. the in built twisted one.

@@ -66,7 +67,7 @@ def looping_call(self, f, msec):
f(function): The function to call repeatedly.
msec(float): How long to wait between calls in milliseconds.
"""
call = task.LoopingCall(f)
call = task.LoopingCall(_log_exception_wrapper(f))

This comment has been minimized.

@richvdh

richvdh Oct 8, 2018

Member

is there a reason for doing this rather than adding an errback to the deferred returned by call.start?

def wrap(*args, **kwargs):
try:
logger.info("Running looping call")
return f(*args, **kwargs)

This comment has been minimized.

@richvdh

richvdh Oct 8, 2018

Member

this won't actually catch any asynchronous failures.

@richvdh

This comment has been minimized.

Member

richvdh commented Oct 8, 2018

given this is intended as a last resort to catch utter brokenness, I'd advocate not restarting the broken task, and rather requiring that looping calls not to return failures if they want to keep running.

@erikjohnston erikjohnston requested a review from matrix-org/synapse-core Oct 8, 2018

@richvdh

also, the old log_failure is still sitting around unloved.

@@ -112,17 +114,30 @@ def batch_iter(iterable, size):
return iter(lambda: tuple(islice(sourceiter, size)), ())
def _log_exception_wrapper(f):
"""Used to wrap looping calls to log loudly if they get killed
def make_log_failure_errback(msg, consumeErrors=True):

This comment has been minimized.

@richvdh

richvdh Oct 8, 2018

Member

could you not have msg and consumeErrors as params to log_failure and then just do addErrback(log_failure, msg) ?

This comment has been minimized.

@erikjohnston

erikjohnston Oct 8, 2018

Member

... yes.

This PR is going terribly.

@erikjohnston erikjohnston requested a review from matrix-org/synapse-core Oct 8, 2018

@richvdh

richvdh approved these changes Oct 9, 2018

looks plausible. I assume you've tested the log_failure func does what you expect, somehow?

@erikjohnston

This comment has been minimized.

Member

erikjohnston commented Oct 9, 2018

looks plausible. I assume you've tested the log_failure func does what you expect, somehow?

Yup, and this time I actually tried an async failure!

@erikjohnston erikjohnston merged commit 9eb1a79 into develop Oct 9, 2018

6 checks passed

ci/circleci: sytestpy2merged Your tests passed on CircleCI!
Details
ci/circleci: sytestpy2postgresmerged Your tests passed on CircleCI!
Details
ci/circleci: sytestpy3merged Your tests passed on CircleCI!
Details
ci/circleci: sytestpy3postgresmerged Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

michaelkaye added a commit that referenced this pull request Oct 22, 2018

Merge tag 'v0.33.7' into dinsic
**Warning**: This release removes the example email notification templates from
`res/templates` (they are now internal to the python package). This should only
affect you if you (a) deploy your Synapse instance from a git checkout or a
github snapshot URL, and (b) have email notifications enabled.

If you have email notifications enabled, you should ensure that
`email.template_dir` is either configured to point at a directory where you
have installed customised templates, or leave it unset to use the default
templates.

The configuration parser will try to detect the situation where
`email.template_dir` is incorrectly set to `res/templates` and do the right
thing, but will warn about this.

Features
--------

- Ship the example email templates as part of the package ([\#4052](#4052))
- Add support for end-to-end key backup (MSC1687) ([\#4019](#4019))

Bugfixes
--------

- Fix bug which made get_missing_events return too few events ([\#4045](#4045))
- Fix bug in event persistence logic which caused 'NoneType is not iterable' ([\#3995](#3995))
- Fix exception in background metrics collection ([\#3996](#3996))
- Fix exception handling in fetching remote profiles ([\#3997](#3997))
- Fix handling of rejected threepid invites ([\#3999](#3999))
- Workers now start on Python 3. ([\#4027](#4027))
- Synapse now starts on Python 3.7. ([\#4033](#4033))

Internal Changes
----------------

- Log exceptions in looping calls ([\#4008](#4008))
- Optimisation for serving federation requests ([\#4017](#4017))
- Add metric to count number of non-empty sync responses ([\#4022](#4022))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment