Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Only try to backfill event if we haven't tried before recently (exponential backoff) #13635

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Aug 26, 2022

Only try to backfill event if we haven't tried before recently (exponential backoff). No need to keep trying the same backfill point that fails over and over.

Fix #13622
Fix #8451

Follow-up to #13589

Part of #13356


The get_oldest_event_ids_with_depth_in_room tests are using this DAG layout:

flowchart BT

    3 --> A
    subgraph backfillPointsA
        b1:::not_backfilled_yet
        b2:::not_backfilled_yet
        b3:::not_backfilled_yet
    end

    subgraph main shoot
        5 --> 4 --> 3 --> 2:::not_backfilled_yet --> 1:::not_backfilled_yet
    end

    4 --> B
    subgraph backfillPointsB
        b4:::not_backfilled_yet
        b5:::not_backfilled_yet
        b6:::not_backfilled_yet
    end


    A --> b1 --> 2
    A --> b2 --> 2
    A --> b3 --> 2

    B --> b4 --> 3
    B --> b5 --> 3
    B --> b6 --> 3


    classDef not_backfilled_yet opacity: 0.75,stroke-dasharray: 5 5;
Loading

Dev notes

SYNAPSE_POSTGRES=1 SYNAPSE_TEST_LOG_LEVEL=DEBUG python -m twisted.trial tests.storage.test_event_federation.EventFederationWorkerStoreTestCase.test_get_backfill_points_in_room

SYNAPSE_TEST_LOG_LEVEL=DEBUG python -m twisted.trial tests.storage.test_event_federation.EventFederationWorkerStoreTestCase.test_get_backfill_points_in_room

Previous different graph
flowchart BT

    3 --> A
    subgraph backfillPointsA
        b1:::not_backfilled_yet
        b2:::not_backfilled_yet
        b3:::not_backfilled_yet
    end
    
    subgraph main shoot
        5 --> 4 --> 3 --> 2:::not_backfilled_yet --> 1:::not_backfilled_yet
    end

    4 --> B
    subgraph backfillPointsB
        b4:::not_backfilled_yet
        b5:::not_backfilled_yet
        b6:::not_backfilled_yet
    end


    A --> b1 --> 2
    A --> b2 --> 2
    A --> b3 --> 2

    B --> b4 ------> 2
    B --> b5 ------> 2
    B --> b6 ------> 2


    classDef not_backfilled_yet opacity: 0.75,stroke-dasharray: 5 5;

    %% alignment links
    B --- alignment1[ ]
    %% make the alignment links/nodes invisible
    style alignment1 visibility: hidden,color:transparent;
    linkStyle 18 stroke-width:2px,fill:none,stroke:none;
Loading

  • Complement Docker SQLite version: 3.34.1 sqlite3.sqlite_version_info=(3, 34, 1)
  • My local sqlite3 --version on macOS machine -> 3.37.0 2021-12-09 01:34:53 9ff244ce0739f8ee52a3e9671adb4ee54c83c640b02e3f9d185fd2f9a179aapl

‹
{
´
`
,
⹁

Todo

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

@MadLittleMods MadLittleMods added the A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) label Aug 26, 2022
Comment on lines 799 to 800
/* TODO: Why */
GROUP BY backward_extrem.event_id
Copy link
Contributor Author

@MadLittleMods MadLittleMods Aug 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we had this before? There is only ever one backward extremity with a given event_id (or one event with a given event_id)

Removing it and the query seems to behave just as good as before.

@MadLittleMods MadLittleMods requested a review from a team as a code owner September 14, 2022 21:06
MadLittleMods added a commit that referenced this pull request Sep 15, 2022
…e is invalid

Related to
 - #13622
    - #13635
 - #13676

Follow-up to #13815
which tracks event signature failures.

This PR aims to stop us from trying to
`_get_state_ids_after_missing_prev_event` after
we already know that the prev_event will fail
from a previous attempt

To explain an exact scenario around `/messages` -> backfill, we call `/backfill` and first check the signatures of the 100 events. We see bad signature for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` and `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` (both member events). Then we process the 98 events remaining that have valid signatures but one of the events references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event`. So we have to do the whole `_get_state_ids_after_missing_prev_event` rigmarole which pulls in those same events which fail again because the signatures are still invalid.

 - `backfill`
    - `outgoing-federation-request` `/backfill`
    - `_check_sigs_and_hash_and_fetch`
       - `_check_sigs_and_hash_and_fetch_one` for each event received over backfill
          - ❗ `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)`
          - ❗ `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)`
   - `_process_pulled_events`
      - `_process_pulled_event` for each validated event
         - ❗ Event `$Q0iMdqtz3IJYfZQU2Xk2WjB5NDF8Gg8cFSYYyKQgKJ0` references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event` which is missing so we try to get it
            - `_get_state_ids_after_missing_prev_event`
               - `outgoing-federation-request` `/state_ids`
               - ❗ `get_pdu` for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` which fails the signature check again
               - ❗ `get_pdu` for `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` which fails the signature check

With this PR, we no longer call the burdensome `_get_state_ids_after_missing_prev_event`
because the signature failure will count as an attempt before we try to run this.
@squahtx squahtx assigned squahtx and unassigned squahtx Sep 16, 2022
@erikjohnston erikjohnston self-assigned this Sep 22, 2022
@MadLittleMods MadLittleMods merged commit ac1a317 into develop Sep 23, 2022
@MadLittleMods MadLittleMods deleted the madlittlemods/13622-do-not-retry-backfilling-the-same-event-over-and-over branch September 23, 2022 19:01
@MadLittleMods
Copy link
Contributor Author

Thanks for the review @reivilibre and @erikjohnston! Also thanks to @squahtx and @richvdh for the tips! 🦆

Comment on lines +787 to +798
/**
* Exponential back-off (up to the upper bound) so we don't retry the
* same backfill point over and over. ex. 2hr, 4hr, 8hr, 16hr, etc.
*
* We use `1 << n` as a power of 2 equivalent for compatibility
* with older SQLites. The left shift equivalent only works with
* powers of 2 because left shift is a binary operation (base-2).
* Otherwise, we would use `power(2, n)` or the power operator, `2^n`.
*/
AND (
failed_backfill_attempt_info.event_id IS NULL
OR ? /* current_time */ >= failed_backfill_attempt_info.last_attempt_ts + /*least*/%s((1 << failed_backfill_attempt_info.num_attempts) * ? /* step */, ? /* upper bound */)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things here.

  1. If num_attempts is 10 then (1 << 10) ** 3600000 overflows a 32bit signed int and postgres errors. See CS-API /messages requests fail on develop when using postgres due to NumericValueOutOfRange if some events have repeatedly failed to backfill #13929 for the investigation.
  2. (Minor): the name "step" for the constant 3600000 (1 hour in ms) suggests that the first backoff period should be 1 hour: but as you noted above the first backoff period is 2 horus.

I'm attempting to PR a fix for (1).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR to fix this: #13936

MadLittleMods added a commit that referenced this pull request Sep 28, 2022
…se (#13879)

There is no need to grab thousands of backfill points when we only need 5 to make the `/backfill` request with. We need to grab a few extra in case the first few aren't visible in the history.

Previously, we grabbed thousands of backfill points from the database, then sorted and filtered them in the app. Fetching the 4.6k backfill points for `#matrix:matrix.org` from the database takes ~50ms - ~570ms so it's not like this saves a lot of time 🤷. But it might save us more time now that `get_backfill_points_in_room`/`get_insertion_event_backward_extremities_in_room` are more complicated after #13635 

This PR moves the filtering and limiting to the SQL query so we just have less data to work with in the first place.

Part of #13356
@MadLittleMods MadLittleMods changed the title Only try to backfill event if we haven't tried before recently Only try to backfill event if we haven't tried before recently (exponential backoff) Sep 29, 2022
squahtx pushed a commit that referenced this pull request Oct 4, 2022
Synapse 1.69.0rc1 (2022-10-04)
==============================

Please note that legacy Prometheus metric names are now deprecated and will be removed in Synapse 1.73.0.
Server administrators should update their dashboards and alerting rules to avoid using the deprecated metric names.
See the [upgrade notes](https://matrix-org.github.io/synapse/v1.69/upgrade.html#upgrading-to-v1690) for more details.

Features
--------

- Allow application services to set the `origin_server_ts` of a state event by providing the query parameter `ts` in [`PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey), per [MSC3316](matrix-org/matrix-spec-proposals#3316). Contributed by @lukasdenk. ([\#11866](#11866))
- Allow server admins to require a manual approval process before new accounts can be used (using [MSC3866](matrix-org/matrix-spec-proposals#3866)). ([\#13556](#13556))
- Exponentially backoff from backfilling the same event over and over. ([\#13635](#13635), [\#13936](#13936))
- Add cache invalidation across workers to module API. ([\#13667](#13667), [\#13947](#13947))
- Experimental implementation of [MSC3882](matrix-org/matrix-spec-proposals#3882) to allow an existing device/session to generate a login token for use on a new device/session. ([\#13722](#13722), [\#13868](#13868))
- Experimental support for thread-specific receipts ([MSC3771](matrix-org/matrix-spec-proposals#3771)). ([\#13782](#13782), [\#13893](#13893), [\#13932](#13932), [\#13937](#13937), [\#13939](#13939))
- Add experimental support for [MSC3881: Remotely toggle push notifications for another client](matrix-org/matrix-spec-proposals#3881). ([\#13799](#13799), [\#13831](#13831), [\#13860](#13860))
- Keep track when an event pulled over federation fails its signature check so we can intelligently back-off in the future. ([\#13815](#13815))
- Improve validation for the unspecced, internal-only `_matrix/client/unstable/add_threepid/msisdn/submit_token` endpoint. ([\#13832](#13832))
- Faster remote room joins: record _when_ we first partial-join to a room. ([\#13892](#13892))
- Support a `dir` parameter on the `/relations` endpoint per [MSC3715](matrix-org/matrix-spec-proposals#3715). ([\#13920](#13920))
- Ask mail servers receiving emails from Synapse to not send automatic replies (e.g. out-of-office responses). ([\#13957](#13957))

Bugfixes
--------

- Send push notifications for invites received over federation. ([\#13719](#13719), [\#14014](#14014))
- Fix a long-standing bug where typing events would be accepted from remote servers not present in a room. Also fix a bug where incoming typing events would cause other incoming events to get stuck during a fast join. ([\#13830](#13830))
- Fix a bug introduced in Synapse v1.53.0 where the experimental implementation of [MSC3715](matrix-org/matrix-spec-proposals#3715) would give incorrect results when paginating forward. ([\#13840](#13840))
- Fix access token leak to logs from proxy agent. ([\#13855](#13855))
- Fix `have_seen_event` cache not being invalidated after we persist an event which causes inefficiency effects like extra `/state` federation calls. ([\#13863](#13863))
- Faster room joins: Fix a bug introduced in 1.66.0 where an error would be logged when syncing after joining a room. ([\#13872](#13872))
- Fix a bug introduced in 1.66.0 where some required fields in the pushrules sent to clients were not present anymore. Contributed by Nico. ([\#13904](#13904))
- Fix packaging to include `Cargo.lock` in `sdist`. ([\#13909](#13909))
- Fix a long-standing bug where device updates could cause delays sending out to-device messages over federation. ([\#13922](#13922))
- Fix a bug introduced in v1.68.0 where Synapse would require `setuptools_rust` at runtime, even though the package is only required at build time. ([\#13952](#13952))
- Fix a long-standing bug where `POST /_matrix/client/v3/keys/query` requests could result in excessively large SQL queries. ([\#13956](#13956))
- Fix a performance regression in the `get_users_in_room` database query. Introduced in v1.67.0. ([\#13972](#13972))
- Fix a bug introduced in v1.68.0 bug where Rust extension wasn't built in `release` mode when using `poetry install`. ([\#14009](#14009))
- Do not return an unspecified `original_event` field when using the stable `/relations` endpoint. Introduced in Synapse v1.57.0. ([\#14025](#14025))
- Correctly handle a race with device lists when a remote user leaves during a partial join. ([\#13885](#13885))
- Correctly handle sending local device list updates to remote servers during a partial join. ([\#13934](#13934))

Improved Documentation
----------------------

- Add `worker_main_http_uri` for the worker generator bash script. ([\#13772](#13772))
- Update URL for the NixOS module for Synapse. ([\#13818](#13818))
- Fix a mistake in sso_mapping_providers.md: `map_user_attributes` is expected to return `display_name`, not `displayname`. ([\#13836](#13836))
- Fix a cross-link from the registration admin API to the `registration_shared_secret` configuration documentation. ([\#13870](#13870))
- Update the man page for the `hash_password` script to correct the default number of bcrypt rounds performed. ([\#13911](#13911), [\#13930](#13930))
- Emphasize the right reasons when to use `(room_id, event_id)` in a database schema. ([\#13915](#13915))
- Add instruction to contributing guide for running unit tests in parallel. Contributed by @ashfame. ([\#13928](#13928))
- Clarify that the `auto_join_rooms` config option can also be used with Space aliases. ([\#13931](#13931))
- Add some cross references to worker documentation. ([\#13974](#13974))
- Linkify urls in config documentation. ([\#14003](#14003))

Deprecations and Removals
-------------------------

- Remove the `complete_sso_login` method from the Module API which was deprecated in Synapse 1.13.0. ([\#13843](#13843))
- Announce that legacy metric names are deprecated, will be turned off by default in Synapse v1.71.0 and removed altogether in Synapse v1.73.0. See the upgrade notes for more information. ([\#14024](#14024))

Internal Changes
----------------

- Speed up creation of DM rooms. ([\#13487](#13487), [\#13800](#13800))
- Port push rules to using Rust. ([\#13768](#13768), [\#13838](#13838), [\#13889](#13889))
- Optimise get rooms for user calls. Contributed by Nick @ Beeper (@Fizzadar). ([\#13787](#13787))
- Update the script which makes full schema dumps. ([\#13792](#13792))
- Use shared methods for cache invalidation when persisting events, remove duplicate codepaths. Contributed by Nick @ Beeper (@Fizzadar). ([\#13796](#13796))
- Improve the `synapse.api.auth.Auth` mock used in unit tests. ([\#13809](#13809))
- Faster Remote Room Joins: tell remote homeservers that we are unable to authorise them if they query a room which has partial state on our server. ([\#13823](#13823))
- Carry IdP Session IDs through user-mapping sessions. ([\#13839](#13839))
- Fix the release script not publishing binary wheels. ([\#13850](#13850))
- Raise issue if complement fails with latest deps. ([\#13859](#13859))
- Correct the comments in the complement dockerfile. ([\#13867](#13867))
- Create a new snapshot of the database schema. ([\#13873](#13873))
- Faster room joins: Send device list updates to most servers in rooms with partial state. ([\#13874](#13874), [\#14013](#14013))
- Add comments to the Prometheus recording rules to make it clear which set of rules you need for Grafana or Prometheus Console. ([\#13876](#13876))
- Only pull relevant backfill points from the database based on the current depth and limit (instead of all) every time we want to `/backfill`. ([\#13879](#13879))
- Faster room joins: Avoid waiting for full state when processing `/keys/changes` requests. ([\#13888](#13888))
- Improve backfill robustness by trying more servers when we get a `4xx` error back. ([\#13890](#13890))
- Fix mypy errors with canonicaljson 1.6.3. ([\#13905](#13905))
- Faster remote room joins: correctly handle remote device list updates during a partial join. ([\#13913](#13913))
- Complement image: propagate SIGTERM to all workers. ([\#13914](#13914))
- Update an innaccurate comment in Synapse's upsert database helper. ([\#13924](#13924))
- Update mypy (0.950 -> 0.981) and mypy-zope (0.3.7 -> 0.3.11). ([\#13925](#13925), [\#13993](#13993))
- Use dedicated `get_local_users_in_room(room_id)` function to find local users when calculating users to copy over during a room upgrade. ([\#13960](#13960))
- Refactor language in user directory `_track_user_joined_room` code to make it more clear that we use both local and remote users. ([\#13966](#13966))
- Revert catch-all exceptions being recorded as event pull attempt failures (only handle what we know about). ([\#13969](#13969))
- Speed up calculating push actions in large rooms. ([\#13973](#13973), [\#13992](#13992))
- Enable update notifications from Github's dependabot. ([\#13976](#13976))
- Prototype a workflow to automatically add changelogs to dependabot PRs. ([\#13998](#13998), [\#14011](#14011), [\#14017](#14017), [\#14021](#14021), [\#14027](#14027))
- Fix type annotations to be compatible with new annotations in development versions of twisted. ([\#14012](#14012))
- Clear out stale entries in `event_push_actions_staging` table. ([\#14020](#14020))
- Bump versions of GitHub actions. ([\#13978](#13978), [\#13979](#13979), [\#13980](#13980), [\#13982](#13982), [\#14015](#14015), [\#14019](#14019), [\#14022](#14022), [\#14023](#14023))
MadLittleMods added a commit that referenced this pull request Oct 15, 2022
…ure is invalid (#13816)

While #13635 stops us from doing the slow thing after we've already done it once, this PR stops us from doing one of the slow things in the first place.

Related to
 - #13622
    - #13635
 - #13676

Part of #13356

Follow-up to #13815 which tracks event signature failures.

With this PR, we avoid the call to the costly `_get_state_ids_after_missing_prev_event` because the signature failure will count as an attempt before and we filter events based on the backoff before calling `_get_state_ids_after_missing_prev_event` now.

For example, this will save us 156s out of the 185s total that this `matrix.org` `/messages` request. If you want to see the full Jaeger trace of this, you can drag and drop this `trace.json` into your own Jaeger, https://gist.github.com/MadLittleMods/4b12d0d0afe88c2f65ffcc907306b761

To explain this exact scenario around `/messages` -> backfill, we call `/backfill` and first check the signatures of the 100 events. We see bad signature for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` and `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` (both member events). Then we process the 98 events remaining that have valid signatures but one of the events references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event`. So we have to do the whole `_get_state_ids_after_missing_prev_event` rigmarole which pulls in those same events which fail again because the signatures are still invalid.

 - `backfill`
    - `outgoing-federation-request` `/backfill`
    - `_check_sigs_and_hash_and_fetch`
       - `_check_sigs_and_hash_and_fetch_one` for each event received over backfill
          - ❗ `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)`
          - ❗ `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)`
   - `_process_pulled_events`
      - `_process_pulled_event` for each validated event
         - ❗ Event `$Q0iMdqtz3IJYfZQU2Xk2WjB5NDF8Gg8cFSYYyKQgKJ0` references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event` which is missing so we try to get it
            - `_get_state_ids_after_missing_prev_event`
               - `outgoing-federation-request` `/state_ids`
               - ❗ `get_pdu` for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` which fails the signature check again
               - ❗ `get_pdu` for `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` which fails the signature check
Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Oct 18, 2022
NOTE: this is absolutely *not* safe for Beeper usage as-is. I have merged
all of the Python code in but all our customizations to the base rules
and push rule evaluator are not yet present in the new Rust module. This
will fail tests as-is and future commits will re-apply our changes in
Rust.

Synapse 1.69.0 (2022-10-17)
===========================

Please note that legacy Prometheus metric names are now deprecated and will be removed in Synapse 1.73.0.
Server administrators should update their dashboards and alerting rules to avoid using the deprecated metric names.
See the [upgrade notes](https://matrix-org.github.io/synapse/v1.69/upgrade.html#upgrading-to-v1690) for more details.

No significant changes since 1.69.0rc4.

Synapse 1.69.0rc4 (2022-10-14)
==============================

Bugfixes
--------

- Fix poor performance of the `event_push_backfill_thread_id` background update, which was introduced in Synapse 1.68.0rc1. ([\matrix-org#14172](matrix-org#14172), [\matrix-org#14181](matrix-org#14181))

Updates to the Docker image
---------------------------

- Fix docker build OOMing in CI for arm64 builds. ([\matrix-org#14173](matrix-org#14173))

Synapse 1.69.0rc3 (2022-10-12)
==============================

Bugfixes
--------

- Fix an issue with Docker images causing the Rust dependencies to not be pinned correctly. Introduced in v1.68.0 ([\matrix-org#14129](matrix-org#14129))
- Fix a bug introduced in Synapse 1.69.0rc1 which would cause registration replication requests to fail if the worker sending the request is not running Synapse 1.69. ([\matrix-org#14135](matrix-org#14135))
- Fix error in background update when rotating existing notifications. Introduced in v1.69.0rc2. ([\matrix-org#14138](matrix-org#14138))

Internal Changes
----------------

- Rename the `url_preview` extra to `url-preview`, for compatability with poetry-core 1.3.0 and [PEP 685](https://peps.python.org/pep-0685/). From-source installations using this extra will need to install using the new name. ([\matrix-org#14085](matrix-org#14085))

Synapse 1.69.0rc2 (2022-10-06)
==============================

Deprecations and Removals
-------------------------

- Deprecate the `generate_short_term_login_token` method in favor of an async `create_login_token` method in the Module API. ([\matrix-org#13842](matrix-org#13842))

Internal Changes
----------------

- Ensure Synapse v1.69 works with upcoming database changes in v1.70. ([\matrix-org#14045](matrix-org#14045))
- Fix a bug introduced in Synapse v1.68.0 where messages could not be sent in rooms with non-integer `notifications` power level. ([\matrix-org#14073](matrix-org#14073))
- Temporarily pin build-system requirements to workaround an incompatibility with poetry-core 1.3.0. This will be reverted before the v1.69.0 release proper, see [\matrix-org#14079](matrix-org#14079). ([\matrix-org#14080](matrix-org#14080))

Synapse 1.69.0rc1 (2022-10-04)
==============================

Features
--------

- Allow application services to set the `origin_server_ts` of a state event by providing the query parameter `ts` in [`PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey), per [MSC3316](matrix-org/matrix-spec-proposals#3316). Contributed by @lukasdenk. ([\matrix-org#11866](matrix-org#11866))
- Allow server admins to require a manual approval process before new accounts can be used (using [MSC3866](matrix-org/matrix-spec-proposals#3866)). ([\matrix-org#13556](matrix-org#13556))
- Exponentially backoff from backfilling the same event over and over. ([\matrix-org#13635](matrix-org#13635), [\matrix-org#13936](matrix-org#13936))
- Add cache invalidation across workers to module API. ([\matrix-org#13667](matrix-org#13667), [\matrix-org#13947](matrix-org#13947))
- Experimental implementation of [MSC3882](matrix-org/matrix-spec-proposals#3882) to allow an existing device/session to generate a login token for use on a new device/session. ([\matrix-org#13722](matrix-org#13722), [\matrix-org#13868](matrix-org#13868))
- Experimental support for thread-specific receipts ([MSC3771](matrix-org/matrix-spec-proposals#3771)). ([\matrix-org#13782](matrix-org#13782), [\matrix-org#13893](matrix-org#13893), [\matrix-org#13932](matrix-org#13932), [\matrix-org#13937](matrix-org#13937), [\matrix-org#13939](matrix-org#13939))
- Add experimental support for [MSC3881: Remotely toggle push notifications for another client](matrix-org/matrix-spec-proposals#3881). ([\matrix-org#13799](matrix-org#13799), [\matrix-org#13831](matrix-org#13831), [\matrix-org#13860](matrix-org#13860))
- Keep track when an event pulled over federation fails its signature check so we can intelligently back-off in the future. ([\matrix-org#13815](matrix-org#13815))
- Improve validation for the unspecced, internal-only `_matrix/client/unstable/add_threepid/msisdn/submit_token` endpoint. ([\matrix-org#13832](matrix-org#13832))
- Faster remote room joins: record _when_ we first partial-join to a room. ([\matrix-org#13892](matrix-org#13892))
- Support a `dir` parameter on the `/relations` endpoint per [MSC3715](matrix-org/matrix-spec-proposals#3715). ([\matrix-org#13920](matrix-org#13920))
- Ask mail servers receiving emails from Synapse to not send automatic replies (e.g. out-of-office responses). ([\matrix-org#13957](matrix-org#13957))

Bugfixes
--------

- Send push notifications for invites received over federation. ([\matrix-org#13719](matrix-org#13719), [\matrix-org#14014](matrix-org#14014))
- Fix a long-standing bug where typing events would be accepted from remote servers not present in a room. Also fix a bug where incoming typing events would cause other incoming events to get stuck during a fast join. ([\matrix-org#13830](matrix-org#13830))
- Fix a bug introduced in Synapse v1.53.0 where the experimental implementation of [MSC3715](matrix-org/matrix-spec-proposals#3715) would give incorrect results when paginating forward. ([\matrix-org#13840](matrix-org#13840))
- Fix access token leak to logs from proxy agent. ([\matrix-org#13855](matrix-org#13855))
- Fix `have_seen_event` cache not being invalidated after we persist an event which causes inefficiency effects like extra `/state` federation calls. ([\matrix-org#13863](matrix-org#13863))
- Faster room joins: Fix a bug introduced in 1.66.0 where an error would be logged when syncing after joining a room. ([\matrix-org#13872](matrix-org#13872))
- Fix a bug introduced in 1.66.0 where some required fields in the pushrules sent to clients were not present anymore. Contributed by Nico. ([\matrix-org#13904](matrix-org#13904))
- Fix packaging to include `Cargo.lock` in `sdist`. ([\matrix-org#13909](matrix-org#13909))
- Fix a long-standing bug where device updates could cause delays sending out to-device messages over federation. ([\matrix-org#13922](matrix-org#13922))
- Fix a bug introduced in v1.68.0 where Synapse would require `setuptools_rust` at runtime, even though the package is only required at build time. ([\matrix-org#13952](matrix-org#13952))
- Fix a long-standing bug where `POST /_matrix/client/v3/keys/query` requests could result in excessively large SQL queries. ([\matrix-org#13956](matrix-org#13956))
- Fix a performance regression in the `get_users_in_room` database query. Introduced in v1.67.0. ([\matrix-org#13972](matrix-org#13972))
- Fix a bug introduced in v1.68.0 bug where Rust extension wasn't built in `release` mode when using `poetry install`. ([\matrix-org#14009](matrix-org#14009))
- Do not return an unspecified `original_event` field when using the stable `/relations` endpoint. Introduced in Synapse v1.57.0. ([\matrix-org#14025](matrix-org#14025))
- Correctly handle a race with device lists when a remote user leaves during a partial join. ([\matrix-org#13885](matrix-org#13885))
- Correctly handle sending local device list updates to remote servers during a partial join. ([\matrix-org#13934](matrix-org#13934))

Improved Documentation
----------------------

- Add `worker_main_http_uri` for the worker generator bash script. ([\matrix-org#13772](matrix-org#13772))
- Update URL for the NixOS module for Synapse. ([\matrix-org#13818](matrix-org#13818))
- Fix a mistake in sso_mapping_providers.md: `map_user_attributes` is expected to return `display_name`, not `displayname`. ([\matrix-org#13836](matrix-org#13836))
- Fix a cross-link from the registration admin API to the `registration_shared_secret` configuration documentation. ([\matrix-org#13870](matrix-org#13870))
- Update the man page for the `hash_password` script to correct the default number of bcrypt rounds performed. ([\matrix-org#13911](matrix-org#13911), [\matrix-org#13930](matrix-org#13930))
- Emphasize the right reasons when to use `(room_id, event_id)` in a database schema. ([\matrix-org#13915](matrix-org#13915))
- Add instruction to contributing guide for running unit tests in parallel. Contributed by @ashfame. ([\matrix-org#13928](matrix-org#13928))
- Clarify that the `auto_join_rooms` config option can also be used with Space aliases. ([\matrix-org#13931](matrix-org#13931))
- Add some cross references to worker documentation. ([\matrix-org#13974](matrix-org#13974))
- Linkify urls in config documentation. ([\matrix-org#14003](matrix-org#14003))

Deprecations and Removals
-------------------------

- Remove the `complete_sso_login` method from the Module API which was deprecated in Synapse 1.13.0. ([\matrix-org#13843](matrix-org#13843))
- Announce that legacy metric names are deprecated, will be turned off by default in Synapse v1.71.0 and removed altogether in Synapse v1.73.0. See the upgrade notes for more information. ([\matrix-org#14024](matrix-org#14024))

Internal Changes
----------------

- Speed up creation of DM rooms. ([\matrix-org#13487](matrix-org#13487), [\matrix-org#13800](matrix-org#13800))
- Port push rules to using Rust. ([\matrix-org#13768](matrix-org#13768), [\matrix-org#13838](matrix-org#13838), [\matrix-org#13889](matrix-org#13889))
- Optimise get rooms for user calls. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13787](matrix-org#13787))
- Update the script which makes full schema dumps. ([\matrix-org#13792](matrix-org#13792))
- Use shared methods for cache invalidation when persisting events, remove duplicate codepaths. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13796](matrix-org#13796))
- Improve the `synapse.api.auth.Auth` mock used in unit tests. ([\matrix-org#13809](matrix-org#13809))
- Faster Remote Room Joins: tell remote homeservers that we are unable to authorise them if they query a room which has partial state on our server. ([\matrix-org#13823](matrix-org#13823))
- Carry IdP Session IDs through user-mapping sessions. ([\matrix-org#13839](matrix-org#13839))
- Fix the release script not publishing binary wheels. ([\matrix-org#13850](matrix-org#13850))
- Raise issue if complement fails with latest deps. ([\matrix-org#13859](matrix-org#13859))
- Correct the comments in the complement dockerfile. ([\matrix-org#13867](matrix-org#13867))
- Create a new snapshot of the database schema. ([\matrix-org#13873](matrix-org#13873))
- Faster room joins: Send device list updates to most servers in rooms with partial state. ([\matrix-org#13874](matrix-org#13874), [\matrix-org#14013](matrix-org#14013))
- Add comments to the Prometheus recording rules to make it clear which set of rules you need for Grafana or Prometheus Console. ([\matrix-org#13876](matrix-org#13876))
- Only pull relevant backfill points from the database based on the current depth and limit (instead of all) every time we want to `/backfill`. ([\matrix-org#13879](matrix-org#13879))
- Faster room joins: Avoid waiting for full state when processing `/keys/changes` requests. ([\matrix-org#13888](matrix-org#13888))
- Improve backfill robustness by trying more servers when we get a `4xx` error back. ([\matrix-org#13890](matrix-org#13890))
- Fix mypy errors with canonicaljson 1.6.3. ([\matrix-org#13905](matrix-org#13905))
- Faster remote room joins: correctly handle remote device list updates during a partial join. ([\matrix-org#13913](matrix-org#13913))
- Complement image: propagate SIGTERM to all workers. ([\matrix-org#13914](matrix-org#13914))
- Update an innaccurate comment in Synapse's upsert database helper. ([\matrix-org#13924](matrix-org#13924))
- Update mypy (0.950 -> 0.981) and mypy-zope (0.3.7 -> 0.3.11). ([\matrix-org#13925](matrix-org#13925), [\matrix-org#13993](matrix-org#13993))
- Use dedicated `get_local_users_in_room(room_id)` function to find local users when calculating users to copy over during a room upgrade. ([\matrix-org#13960](matrix-org#13960))
- Refactor language in user directory `_track_user_joined_room` code to make it more clear that we use both local and remote users. ([\matrix-org#13966](matrix-org#13966))
- Revert catch-all exceptions being recorded as event pull attempt failures (only handle what we know about). ([\matrix-org#13969](matrix-org#13969))
- Speed up calculating push actions in large rooms. ([\matrix-org#13973](matrix-org#13973), [\matrix-org#13992](matrix-org#13992))
- Enable update notifications from Github's dependabot. ([\matrix-org#13976](matrix-org#13976))
- Prototype a workflow to automatically add changelogs to dependabot PRs. ([\matrix-org#13998](matrix-org#13998), [\matrix-org#14011](matrix-org#14011), [\matrix-org#14017](matrix-org#14017), [\matrix-org#14021](matrix-org#14021), [\matrix-org#14027](matrix-org#14027))
- Fix type annotations to be compatible with new annotations in development versions of twisted. ([\matrix-org#14012](matrix-org#14012))
- Clear out stale entries in `event_push_actions_staging` table. ([\matrix-org#14020](matrix-org#14020))
- Bump versions of GitHub actions. ([\matrix-org#13978](matrix-org#13978), [\matrix-org#13979](matrix-org#13979), [\matrix-org#13980](matrix-org#13980), [\matrix-org#13982](matrix-org#13982), [\matrix-org#14015](matrix-org#14015), [\matrix-org#14019](matrix-org#14019), [\matrix-org#14022](matrix-org#14022), [\matrix-org#14023](matrix-org#14023))

# -----BEGIN PGP SIGNATURE-----
#
# iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmNNMOMQHGVyaWtAbWF0
# cml4Lm9yZwAKCRClQuTtGw+sCVnjB/9jpJRVnicteEpDfVX9iLo2qfIfcO/GhUJK
# pJhv4yuY9whAldvJpmNw2f9tfUbAMcvrjlFvNrjihWmXcAGFazC6i3fNBjPgZW2e
# Sxsuuy8xc9X/OqH2EUpHtNZQX3FfSbdBS93Z62ZO3R8tEbCQvjw6FXBdjjjf5uLO
# y5Lsx94+41FJYOhs1Kt4fN92B9WMACR6e/O1YcsDjIXsoZI3uqO1h8filbQIZee7
# DTATE7eIPtShs2Ezaaeuc7tZGVDyPvgWIbuxuT6OGx20zmuChYJgIcVaD1me4UzJ
# i9bVigtpYN0eUxuWnjLf7YC6Ys/Y9wZ7/lhdgaBwdbQKEJdpi+S4
# =JWaO
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon Oct 17 11:39:31 2022 BST
# gpg:                using RSA key 053191DFF4670330465227F7A542E4ED1B0FAC09
# gpg:                issuer "erik@matrix.org"
# gpg: Can't check signature: No public key

# Conflicts:
#	docker/Dockerfile
#	pyproject.toml
#	synapse/_scripts/update_synapse_database.py
#	synapse/handlers/message.py
#	synapse/handlers/receipts.py
#	synapse/logging/context.py
#	synapse/push/baserules.py
#	synapse/push/bulk_push_rule_evaluator.py
#	synapse/push/push_rule_evaluator.py
#	synapse/replication/http/send_event.py
#	synapse/rest/admin/users.py
#	synapse/rest/client/read_marker.py
#	synapse/rest/client/receipts.py
#	synapse/rest/client/room.py
#	synapse/storage/_base.py
#	synapse/storage/databases/main/__init__.py
#	synapse/storage/databases/main/cache.py
#	synapse/storage/databases/main/events.py
#	synapse/storage/databases/main/receipts.py
#	tests/push/test_push_rule_evaluator.py
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Oct 29, 2022
Synapse 1.69.0 (2022-10-17)
===========================

Please note that legacy Prometheus metric names are now deprecated and will be removed in Synapse 1.73.0.
Server administrators should update their dashboards and alerting rules to avoid using the deprecated metric names.
See the [upgrade notes](https://matrix-org.github.io/synapse/v1.69/upgrade.html#upgrading-to-v1690) for more details.


Deprecations and Removals
-------------------------

- Remove the `complete_sso_login` method from the Module API which was
  deprecated in Synapse
  1.13.0. ([\#13843](matrix-org/synapse#13843))

- Announce that legacy metric names are deprecated, will be turned off
  by default in Synapse v1.71.0 and removed altogether in Synapse
  v1.73.0. See the upgrade notes for more
  information. ([\#14024](matrix-org/synapse#14024))

- Deprecate the `generate_short_term_login_token` method in favor of
  an async `create_login_token` method in the Module
  API. ([\#13842](matrix-org/synapse#13842))


Features
--------

- Allow application services to set the `origin_server_ts` of a state
  event by providing the query parameter `ts` in [`PUT
  /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey),
  per
  [MSC3316](matrix-org/matrix-spec-proposals#3316). Contributed
  by
  @lukasdenk. ([\#11866](matrix-org/synapse#11866))

- Allow server admins to require a manual approval process before new
  accounts can be used (using
  [MSC3866](matrix-org/matrix-spec-proposals#3866)). ([\#13556](matrix-org/synapse#13556))

- Exponentially backoff from backfilling the same event over and
  over. ([\#13635](matrix-org/synapse#13635),
  [\#13936](matrix-org/synapse#13936))

- Add cache invalidation across workers to module
  API. ([\#13667](matrix-org/synapse#13667),
  [\#13947](matrix-org/synapse#13947))

- Experimental implementation of
  [MSC3882](matrix-org/matrix-spec-proposals#3882)
  to allow an existing device/session to generate a login token for
  use on a new
  device/session. ([\#13722](matrix-org/synapse#13722),
  [\#13868](matrix-org/synapse#13868))

- Experimental support for thread-specific receipts
  ([MSC3771](matrix-org/matrix-spec-proposals#3771)). ([\#13782](matrix-org/synapse#13782),
  [\#13893](matrix-org/synapse#13893),
  [\#13932](matrix-org/synapse#13932),
  [\#13937](matrix-org/synapse#13937),
  [\#13939](matrix-org/synapse#13939))

- Add experimental support for [MSC3881: Remotely toggle push
  notifications for another
  client](matrix-org/matrix-spec-proposals#3881). ([\#13799](matrix-org/synapse#13799),
  [\#13831](matrix-org/synapse#13831),
  [\#13860](matrix-org/synapse#13860))

- Keep track when an event pulled over federation fails its signature
  check so we can intelligently back-off in the
  future. ([\#13815](matrix-org/synapse#13815))

- Improve validation for the unspecced, internal-only
  `_matrix/client/unstable/add_threepid/msisdn/submit_token`
  endpoint. ([\#13832](matrix-org/synapse#13832))

- Faster remote room joins: record _when_ we first partial-join to a
  room. ([\#13892](matrix-org/synapse#13892))

- Support a `dir` parameter on the `/relations` endpoint per
  [MSC3715](matrix-org/matrix-spec-proposals#3715). ([\#13920](matrix-org/synapse#13920))

- Ask mail servers receiving emails from Synapse to not send automatic
  replies (e.g. out-of-office
  responses). ([\#13957](matrix-org/synapse#13957))
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) A-Performance Performance, both client-facing and admin-facing T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Only try to backfill event if we haven't tried before recently Better handle invalid events when backfilling
6 participants