Fix handling of failures when calling /event_auth. #5317
When processing an incoming event over federation, we may try and resolve any unexpected differences in auth events. This is a non-essential process and so should not stop the processing of the event if it fails (e.g. due to the remote disappearing or not implementing the necessary endpoints). Fixes #3330
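The idea in the description can be sketched in isolation as below. This is a minimal illustration, not the code from this PR: `resolve_auth_differences`, `process_incoming_event`, and `RemoteUnavailable` are hypothetical names, and the event is a plain dict rather than a Synapse event object.

```python
import logging

logger = logging.getLogger(__name__)


class RemoteUnavailable(Exception):
    """Hypothetical error standing in for a failed /event_auth call."""


def resolve_auth_differences(event):
    # Hypothetical stand-in for the optional auth-resolution step.
    # Here it always fails, as if the remote had disappeared.
    raise RemoteUnavailable("remote did not answer /event_auth")


def process_incoming_event(event):
    """Process an event; a failure in the optional step is logged, not raised."""
    try:
        resolve_auth_differences(event)
    except Exception:
        # Non-essential step: record the failure and carry on.
        logger.exception(
            "Failed to resolve auth differences for %s", event["event_id"]
        )
    # Essential processing continues regardless of the outcome above.
    return {"event_id": event["event_id"], "processed": True}
```

Calling `process_incoming_event({"event_id": "$abc"})` returns normally even though the optional step raised, which is the behaviour the description asks for.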
Codecov Report
@@ Coverage Diff @@
## develop #5317 +/- ##
===========================================
- Coverage 63% 62.99% -0.02%
===========================================
Files 342 342
Lines 35579 35590 +11
Branches 5828 5828
===========================================
+ Hits 22418 22419 +1
- Misses 11593 11601 +8
- Partials 1568 1570 +2
generally looks sane but I have logging nitpicks
synapse/handlers/federation.py
Outdated
except Exception:
    # We don't really mind if the above fails, so lets not fail
    # processing if it does.
    logger.exception("Failed to call _update_auth_events_and_context_for_auth")
should we really log a full stacktrace here? they look incredibly scary in the logs when they happen, so if it's something that we don't count as an error, we shouldn't log the stacktrace. just `logger.warning` with the type and value of the exception.
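For reference, the difference between the two logging styles under discussion can be seen with the standard `logging` module. This is a rough sketch using hypothetical logger names and a made-up `ConnectionError`; it captures output in memory so the two results can be compared, and is not the change made in this PR.

```python
import io
import logging


def make_logger(name):
    """Return a logger that writes only to an in-memory stream."""
    stream = io.StringIO()
    log = logging.getLogger(name)
    log.handlers = [logging.StreamHandler(stream)]
    log.setLevel(logging.DEBUG)
    log.propagate = False
    return log, stream


def fail():
    # Hypothetical failure standing in for the remote call going wrong.
    raise ConnectionError("remote went away")


# Style 1: logger.exception logs at ERROR level and appends the traceback.
noisy_log, noisy_out = make_logger("sketch.noisy")
try:
    fail()
except Exception:
    noisy_log.exception("Failed to call auth resolution")

# Style 2: logger.warning with only the exception type and value.
quiet_log, quiet_out = make_logger("sketch.quiet")
try:
    fail()
except Exception as e:
    quiet_log.warning(
        "Failed to call auth resolution: %s: %s", type(e).__name__, e
    )
```

The first style keeps the full stack trace, which helps when the exception is unexpected; the second produces a single line, which is quieter when the failure is considered routine.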
Hmm, maybe. We shouldn't have any expected exceptions (now), so I'm slightly worried about not having the stack trace at all.
lgtm
Synapse 1.0.0rc1 (2019-06-07)
=============================

Features
--------

- Synapse now more efficiently collates room statistics. ([\#4338](#4338), [\#5260](#5260), [\#5324](#5324))
- Add experimental support for relations (aka reactions and edits). ([\#5220](#5220))
- Ability to configure default room version. ([\#5223](#5223), [\#5249](#5249))
- Allow configuring a range for the account validity startup job. ([\#5276](#5276))
- CAS login will now hit the r0 API, not the deprecated v1 one. ([\#5286](#5286))
- Validate federation server TLS certificates by default (implements [MSC1711](https://github.com/matrix-org/matrix-doc/blob/master/proposals/1711-x509-for-federation.md)). ([\#5359](#5359))
- Update /_matrix/client/versions to reference support for r0.5.0. ([\#5360](#5360))
- Add a script to generate new signing-key files. ([\#5361](#5361))
- Update upgrade and installation guides ahead of 1.0. ([\#5371](#5371))
- Replace the `perspectives` configuration section with `trusted_key_servers`, and make validating the signatures on responses optional (since TLS will do this job for us). ([\#5374](#5374))
- Add ability to perform password reset via email without trusting the identity server. ([\#5377](#5377))
- Set default room version to v4. ([\#5379](#5379))

Bugfixes
--------

- Fixes client-server API not sending "m.heroes" to lazy-load /sync requests when a room's name or its canonical alias are empty. Thanks to @dnaf for this work! ([\#5089](#5089))
- Prevent federation device list updates breaking when processing multiple updates at once. ([\#5156](#5156))
- Fix worker registration bug caused by ClientReaderSlavedStore being unable to see get_profileinfo. ([\#5200](#5200))
- Fix race when backfilling in rooms with worker mode. ([\#5221](#5221))
- Fix appservice timestamp massaging. ([\#5233](#5233))
- Ensure that server_keys fetched via a notary server are correctly signed. ([\#5251](#5251))
- Show the correct error when logging out and access token is missing. ([\#5256](#5256))
- Fix error code when there is an invalid parameter on /_matrix/client/r0/publicRooms ([\#5257](#5257))
- Fix error when downloading thumbnail with missing width/height parameter. ([\#5258](#5258))
- Fix schema update for account validity. ([\#5268](#5268))
- Fix bug where we leaked extremities when we soft failed events, leading to performance degradation. ([\#5274](#5274), [\#5278](#5278), [\#5291](#5291))
- Fix "db txn 'update_presence' from sentinel context" log messages. ([\#5275](#5275))
- Fix dropped logcontexts during high outbound traffic. ([\#5277](#5277))
- Fix a bug where it is not possible to get events in the federation format with the request `GET /_matrix/client/r0/rooms/{roomId}/messages`. ([\#5293](#5293))
- Fix performance problems with the rooms stats background update. ([\#5294](#5294))
- Fix noisy 'no key for server' logs. ([\#5300](#5300))
- Fix bug where a notary server would sometimes forget old keys. ([\#5307](#5307))
- Prevent users from setting huge displaynames and avatar URLs. ([\#5309](#5309))
- Fix handling of failures when processing incoming events where calling `/event_auth` on remote server fails. ([\#5317](#5317))
- Ensure that we have an up-to-date copy of the signing key when validating incoming federation requests. ([\#5321](#5321))
- Fix various problems which made the signing-key notary server time out for some requests. ([\#5333](#5333))
- Fix bug which would make certain operations (such as room joins) block for 20 minutes while attempting to fetch verification keys. ([\#5334](#5334))
- Fix a bug where we could rapidly mark a server as unreachable even though it was only down for a few minutes. ([\#5335](#5335), [\#5340](#5340))
- Fix a bug where account validity renewal emails could only be sent when email notifs were enabled. ([\#5341](#5341))
- Fix failure when fetching batches of events during backfill, etc. ([\#5342](#5342))
- Add a new room version where the timestamps on events are checked against the validity periods on signing keys. ([\#5348](#5348), [\#5354](#5354))
- Fix room stats and presence background updates to correctly handle missing events. ([\#5352](#5352))
- Include left members in room summaries' heroes. ([\#5355](#5355))
- Fix `federation_custom_ca_list` configuration option. ([\#5362](#5362))
- Fix missing logcontext warnings on shutdown. ([\#5369](#5369))

Improved Documentation
----------------------

- Fix docs on resetting the user directory. ([\#5282](#5282))
- Fix notes about ACME in the MSC1711 faq. ([\#5357](#5357))

Internal Changes
----------------

- Synapse will now serve the experimental "room complexity" API endpoint. ([\#5216](#5216))
- The base classes for the v1 and v2_alpha REST APIs have been unified. ([\#5226](#5226), [\#5328](#5328))
- Simplifications and comments in do_auth. ([\#5227](#5227))
- Remove urllib3 pin as requests 2.22.0 has been released supporting urllib3 1.25.2. ([\#5230](#5230))
- Preparatory work for key-validity features. ([\#5232](#5232), [\#5234](#5234), [\#5235](#5235), [\#5236](#5236), [\#5237](#5237), [\#5244](#5244), [\#5250](#5250), [\#5296](#5296), [\#5299](#5299), [\#5343](#5343), [\#5347](#5347), [\#5356](#5356))
- Specify the type of reCAPTCHA key to use. ([\#5283](#5283))
- Improve sample config for monthly active user blocking. ([\#5284](#5284))
- Remove spurious debug from MatrixFederationHttpClient.get_json. ([\#5287](#5287))
- Improve logging for logcontext leaks. ([\#5288](#5288))
- Clarify that the admin change password API logs the user out. ([\#5303](#5303))
- New installs will now use the v54 full schema, rather than the full schema v14 and applying incremental updates to v54. ([\#5320](#5320))
- Improve docstrings on MatrixFederationClient. ([\#5332](#5332))
- Clean up FederationClient.get_events for clarity. ([\#5344](#5344))
- Various improvements to debug logging. ([\#5353](#5353))
- Don't run CI build checks until sample config check has passed. ([\#5370](#5370))
- Automatically retry buildkite builds (max twice) when an agent is lost. ([\#5380](#5380))