Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fix MultiWriterIdGenerator.current_position. #8257

Merged
merged 3 commits into from Sep 8, 2020

Conversation

erikjohnston
Copy link
Member

@erikjohnston erikjohnston commented Sep 4, 2020

It did not correctly handle IDs finishing being persisted out of order, resulting in the current_position lagging until new IDs are persisted.

Currently this is only used for the cache invalidation stream over replication, so this should not have had an observable impact (since we invalidate caches all the time). While #7986 had landed this caused flakey sytests (matrix-org/sytest#948) as they wouldn't write anything until the stream position had advanced.

It did not correctly handle IDs finishing being persisted out of
order, resulting in the `current_position` lagging until new IDs are
persisted.
@erikjohnston erikjohnston requested a review from a team September 4, 2020 14:19
@@ -425,7 +456,7 @@ def _add_persisted_position(self, new_id: int):
# We move the current min position up if the minimum current positions
# of all instances is higher (since by definition all positions less
# that that have been persisted).
min_curr = min(self._current_positions.values())
min_curr = min(self._current_positions.values(), default=0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling to see how this is directly related to this PR? The change looks fine though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, sorry, that is a somewhat unrelated fix that broke that particular sytest. Basically, if the first time a worker see's an ID go past it is out of order, you can get here without anything having been added to current positions, causing this to explode.

min_unfinshed = min(self._unfinished_ids)
for s in self._finished_ids:
if s < min_unfinshed:
if not new_cur or new_cur < s:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not expect new_cur to ever be 0, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not, though probably clearer to put is None

Comment on lines 370 to 379
for s in self._finished_ids:
if s < min_unfinshed:
if not new_cur or new_cur < s:
new_cur = s
else:
finished.add(s)

# We clear these out since they're now all less than the new
# position.
self._finished_ids = finished
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this essentially discarding all finished IDs that are less than the minimum unfinished ID? I'm guessing we can't discard in a loop (which would be clearer to me) because Python doesn't like you modifying things as you iterate them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all feels like it is doing a lot more work than before, but I guess both the old code and new are essentially iterating _unfinished_ids once (although this also builds a second set).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, basically we want to remove all items that are less than minimum unfinished ID. We could rewrite it to be a sorted list instead and then chop off the front of it, but its not obvious that that is better/quicker/easier.

Comment on lines +364 to +365
# If there are unfinished IDs then the new position will be the
# largest finished ID less than the minimum unfinished ID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if it is clearer, but I think something like:

new_cur = min_finished
for s in self._finished_ids:
    if s < min_funished:
        new_cur = max(new_cur, s)
    else:
        finished.add(s)

But that might make the handling later on harder.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, but then you iterate over the list twice (since you're doing a min(...))? You could write this as:

new_curr = min(self._finished_ids)
self._finished_ids = set(i for i in self._finished_ids if i > new_curr)

or something?

synapse/storage/util/id_generators.py Show resolved Hide resolved
@erikjohnston erikjohnston merged commit deedb91 into develop Sep 8, 2020
@erikjohnston erikjohnston deleted the erikj/fix_multi_writer_id_gen branch September 8, 2020 13:26
richvdh added a commit that referenced this pull request Oct 1, 2020
Synapse 1.21.0rc1 (2020-10-01)
==============================

Features
--------

- Require the user to confirm that their password should be reset after clicking the email confirmation link. ([\#8004](#8004))
- Add an admin API `GET /_synapse/admin/v1/event_reports` to read entries of table `event_reports`. Contributed by @dklimpel. ([\#8217](#8217))
- Consolidate the SSO error template across all configuration. ([\#8248](#8248), [\#8405](#8405))
- Add a configuration option to specify a whitelist of domains that a user can be redirected to after validating their email or phone number. ([\#8275](#8275), [\#8417](#8417))
- Add experimental support for sharding event persister. ([\#8294](#8294), [\#8387](#8387), [\#8396](#8396), [\#8419](#8419))
- Add the room topic and avatar to the room details admin API. ([\#8305](#8305))
- Add an admin API for querying rooms where a user is a member. Contributed by @dklimpel. ([\#8306](#8306))
- Add `uk.half-shot.msc2778.login.application_service` login type to allow appservices to login. ([\#8320](#8320))
- Add a configuration option that allows existing users to log in with OpenID Connect. Contributed by @BBBSnowball and @OmmyZhang. ([\#8345](#8345))
- Add prometheus metrics for replication requests. ([\#8406](#8406))
- Support passing additional single sign-on parameters to the client. ([\#8413](#8413))
- Add experimental reporting of metrics on expensive rooms for state-resolution. ([\#8420](#8420))
- Add experimental prometheus metric to track numbers of "large" rooms for state resolutiom. ([\#8425](#8425))
- Add prometheus metrics to track federation delays. ([\#8430](#8430))

Bugfixes
--------

- Fix a bug in the media repository where remote thumbnails with the same size but different crop methods would overwrite each other. Contributed by @deepbluev7. ([\#7124](#7124))
- Fix inconsistent handling of non-existent push rules, and stop tracking the `enabled` state of removed push rules. ([\#7796](#7796))
- Fix a longstanding bug when storing a media file with an empty `upload_name`. ([\#7905](#7905))
- Fix messages not being sent over federation until an event is sent into the same room. ([\#8230](#8230), [\#8247](#8247), [\#8258](#8258), [\#8272](#8272), [\#8322](#8322))
- Fix a longstanding bug where files that could not be thumbnailed would result in an Internal Server Error. ([\#8236](#8236), [\#8435](#8435))
- Upgrade minimum version of `canonicaljson` to version 1.4.0, to fix an unicode encoding issue. ([\#8262](#8262))
- Fix longstanding bug which could lead to incomplete database upgrades on SQLite. ([\#8265](#8265))
- Fix stack overflow when stderr is redirected to the logging system, and the logging system encounters an error. ([\#8268](#8268))
- Fix a bug which cause the logging system to report errors, if `DEBUG` was enabled and no `context` filter was applied. ([\#8278](#8278))
- Fix edge case where push could get delayed for a user until a later event was pushed. ([\#8287](#8287))
- Fix fetching malformed events from remote servers. ([\#8324](#8324))
- Fix `UnboundLocalError` from occuring when appservices send a malformed register request. ([\#8329](#8329))
- Don't send push notifications to expired user accounts. ([\#8353](#8353))
- Fix a regression in v1.19.0 with reactivating users through the admin API. ([\#8362](#8362))
- Fix a bug where during device registration the length of the device name wasn't limited. ([\#8364](#8364))
- Include `guest_access` in the fields that are checked for null bytes when updating `room_stats_state`. Broke in v1.7.2. ([\#8373](#8373))
- Fix theoretical race condition where events are not sent down `/sync` if the synchrotron worker is restarted without restarting other workers. ([\#8374](#8374))
- Fix a bug which could cause errors in rooms with malformed membership events, on servers using sqlite. ([\#8385](#8385))
- Fix "Re-starting finished log context" warning when receiving an event we already had over federation. ([\#8398](#8398))
- Fix incorrect handling of timeouts on outgoing HTTP requests. ([\#8400](#8400))
- Fix a regression in v1.20.0 in the `synapse_port_db` script regarding the `ui_auth_sessions_ips` table. ([\#8410](#8410))
- Remove unnecessary 3PID registration check when resetting password via an email address. Bug introduced in v0.34.0rc2. ([\#8414](#8414))

Improved Documentation
----------------------

- Add `/_synapse/client` to the reverse proxy documentation. ([\#8227](#8227))
- Add note to the reverse proxy settings documentation about disabling Apache's mod_security2. Contributed by Julian Fietkau (@jfietkau). ([\#8375](#8375))
- Improve description of `server_name` config option in `homserver.yaml`. ([\#8415](#8415))

Deprecations and Removals
-------------------------

- Drop support for `prometheus_client` older than 0.4.0. ([\#8426](#8426))

Internal Changes
----------------

- Fix tests on distros which disable TLSv1.0. Contributed by @danc86. ([\#8208](#8208))
- Simplify the distributor code to avoid unnecessary work. ([\#8216](#8216))
- Remove the `populate_stats_process_rooms_2` background job and restore functionality to `populate_stats_process_rooms`. ([\#8243](#8243))
- Clean up type hints for `PaginationConfig`. ([\#8250](#8250), [\#8282](#8282))
- Track the latest event for every destination and room for catch-up after federation outage. ([\#8256](#8256))
- Fix non-user visible bug in implementation of `MultiWriterIdGenerator.get_current_token_for_writer`. ([\#8257](#8257))
- Switch to the JSON implementation from the standard library. ([\#8259](#8259))
- Add type hints to `synapse.util.async_helpers`. ([\#8260](#8260))
- Simplify tests that mock asynchronous functions. ([\#8261](#8261))
- Add type hints to `StreamToken` and `RoomStreamToken` classes. ([\#8279](#8279))
- Change `StreamToken.room_key` to be a `RoomStreamToken` instance. ([\#8281](#8281))
- Refactor notifier code to correctly use the max event stream position. ([\#8288](#8288))
- Use slotted classes where possible. ([\#8296](#8296))
- Support testing the local Synapse checkout against the [Complement homeserver test suite](https://github.com/matrix-org/complement/). ([\#8317](#8317))
- Update outdated usages of `metaclass` to python 3 syntax. ([\#8326](#8326))
- Move lint-related dependencies to package-extra field, update CONTRIBUTING.md to utilise this. ([\#8330](#8330), [\#8377](#8377))
- Use the `admin_patterns` helper in additional locations. ([\#8331](#8331))
- Fix test logging to allow braces in log output. ([\#8335](#8335))
- Remove `__future__` imports related to Python 2 compatibility. ([\#8337](#8337))
- Simplify `super()` calls to Python 3 syntax. ([\#8344](#8344))
- Fix bad merge from `release-v1.20.0` branch to `develop`. ([\#8354](#8354))
- Factor out a `_send_dummy_event_for_room` method. ([\#8370](#8370))
- Improve logging of state resolution. ([\#8371](#8371))
- Add type annotations to `SimpleHttpClient`. ([\#8372](#8372))
- Refactor ID generators to use `async with` syntax. ([\#8383](#8383))
- Add `EventStreamPosition` type. ([\#8388](#8388))
- Create a mechanism for marking tests "logcontext clean". ([\#8399](#8399))
- A pair of tiny cleanups in the federation request code. ([\#8401](#8401))
- Add checks on startup that PostgreSQL sequences are consistent with their associated tables. ([\#8402](#8402))
- Do not include appservice users when calculating the total MAU for a server. ([\#8404](#8404))
- Typing fixes for `synapse.handlers.federation`. ([\#8422](#8422))
- Various refactors to simplify stream token handling. ([\#8423](#8423))
- Make stream token serializing/deserializing async. ([\#8427](#8427))
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Oct 17, 2020
Synapse 1.21.2 (2020-10-15)
===========================

Debian packages and Docker images have been rebuilt using the latest versions of dependency libraries, including authlib 0.15.1. Please see bugfixes below.

Security advisory
-----------------

* HTML pages served via Synapse were vulnerable to cross-site scripting (XSS)
  attacks. All server administrators are encouraged to upgrade.
  ([\#8444](matrix-org/synapse#8444))
  ([CVE-2020-26891](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-26891))

  This fix was originally included in v1.21.0 but was missing a security advisory.

  This was reported by [Denis Kasak](https://github.com/dkasak).

Bugfixes
--------

- Fix rare bug where sending an event would fail due to a racey assertion. ([\#8530](matrix-org/synapse#8530))
- An updated version of the authlib dependency is included in the Docker and Debian images to fix an issue using OpenID Connect. See [\#8534](matrix-org/synapse#8534) for details.


Synapse 1.21.1 (2020-10-13)
===========================

This release fixes a regression in v1.21.0 that prevented debian packages from being built.
It is otherwise identical to v1.21.0.

Synapse 1.21.0 (2020-10-12)
===========================

No significant changes since v1.21.0rc3.

As [noted in
v1.20.0](https://github.com/matrix-org/synapse/blob/release-v1.21.0/CHANGES.md#synapse-1200-2020-09-22),
a future release will drop support for accessing Synapse's
[Admin API](https://github.com/matrix-org/synapse/tree/master/docs/admin_api) under the
`/_matrix/client/*` endpoint prefixes. At that point, the Admin API will only
be accessible under `/_synapse/admin`.


Synapse 1.21.0rc3 (2020-10-08)
==============================

Bugfixes
--------

- Fix duplication of events on high traffic servers, caused by PostgreSQL `could not serialize access due to concurrent update` errors. ([\#8456](matrix-org/synapse#8456))


Internal Changes
----------------

- Add Groovy Gorilla to the list of distributions we build `.deb`s for. ([\#8475](matrix-org/synapse#8475))


Synapse 1.21.0rc2 (2020-10-02)
==============================

Features
--------

- Convert additional templates from inline HTML to Jinja2 templates. ([\#8444](matrix-org/synapse#8444))

Bugfixes
--------

- Fix a regression in v1.21.0rc1 which broke thumbnails of remote media. ([\#8438](matrix-org/synapse#8438))
- Do not expose the experimental `uk.half-shot.msc2778.login.application_service` flow in the login API, which caused a compatibility problem with Element iOS. ([\#8440](matrix-org/synapse#8440))
- Fix malformed log line in new federation "catch up" logic. ([\#8442](matrix-org/synapse#8442))
- Fix DB query on startup for negative streams which caused long start up times. Introduced in [\#8374](matrix-org/synapse#8374). ([\#8447](matrix-org/synapse#8447))


Synapse 1.21.0rc1 (2020-10-01)
==============================

Features
--------

- Require the user to confirm that their password should be reset after clicking the email confirmation link. ([\#8004](matrix-org/synapse#8004))
- Add an admin API `GET /_synapse/admin/v1/event_reports` to read entries of table `event_reports`. Contributed by @dklimpel. ([\#8217](matrix-org/synapse#8217))
- Consolidate the SSO error template across all configuration. ([\#8248](matrix-org/synapse#8248), [\#8405](matrix-org/synapse#8405))
- Add a configuration option to specify a whitelist of domains that a user can be redirected to after validating their email or phone number. ([\#8275](matrix-org/synapse#8275), [\#8417](matrix-org/synapse#8417))
- Add experimental support for sharding event persister. ([\#8294](matrix-org/synapse#8294), [\#8387](matrix-org/synapse#8387), [\#8396](matrix-org/synapse#8396), [\#8419](matrix-org/synapse#8419))
- Add the room topic and avatar to the room details admin API. ([\#8305](matrix-org/synapse#8305))
- Add an admin API for querying rooms where a user is a member. Contributed by @dklimpel. ([\#8306](matrix-org/synapse#8306))
- Add `uk.half-shot.msc2778.login.application_service` login type to allow appservices to login. ([\#8320](matrix-org/synapse#8320))
- Add a configuration option that allows existing users to log in with OpenID Connect. Contributed by @BBBSnowball and @OmmyZhang. ([\#8345](matrix-org/synapse#8345))
- Add prometheus metrics for replication requests. ([\#8406](matrix-org/synapse#8406))
- Support passing additional single sign-on parameters to the client. ([\#8413](matrix-org/synapse#8413))
- Add experimental reporting of metrics on expensive rooms for state-resolution. ([\#8420](matrix-org/synapse#8420))
- Add experimental prometheus metric to track numbers of "large" rooms for state resolutiom. ([\#8425](matrix-org/synapse#8425))
- Add prometheus metrics to track federation delays. ([\#8430](matrix-org/synapse#8430))


Bugfixes
--------

- Fix a bug in the media repository where remote thumbnails with the same size but different crop methods would overwrite each other. Contributed by @deepbluev7. ([\#7124](matrix-org/synapse#7124))
- Fix inconsistent handling of non-existent push rules, and stop tracking the `enabled` state of removed push rules. ([\#7796](matrix-org/synapse#7796))
- Fix a longstanding bug when storing a media file with an empty `upload_name`. ([\#7905](matrix-org/synapse#7905))
- Fix messages not being sent over federation until an event is sent into the same room. ([\#8230](matrix-org/synapse#8230), [\#8247](matrix-org/synapse#8247), [\#8258](matrix-org/synapse#8258), [\#8272](matrix-org/synapse#8272), [\#8322](matrix-org/synapse#8322))
- Fix a longstanding bug where files that could not be thumbnailed would result in an Internal Server Error. ([\#8236](matrix-org/synapse#8236), [\#8435](matrix-org/synapse#8435))
- Upgrade minimum version of `canonicaljson` to version 1.4.0, to fix an unicode encoding issue. ([\#8262](matrix-org/synapse#8262))
- Fix longstanding bug which could lead to incomplete database upgrades on SQLite. ([\#8265](matrix-org/synapse#8265))
- Fix stack overflow when stderr is redirected to the logging system, and the logging system encounters an error. ([\#8268](matrix-org/synapse#8268))
- Fix a bug which cause the logging system to report errors, if `DEBUG` was enabled and no `context` filter was applied. ([\#8278](matrix-org/synapse#8278))
- Fix edge case where push could get delayed for a user until a later event was pushed. ([\#8287](matrix-org/synapse#8287))
- Fix fetching malformed events from remote servers. ([\#8324](matrix-org/synapse#8324))
- Fix `UnboundLocalError` from occuring when appservices send a malformed register request. ([\#8329](matrix-org/synapse#8329))
- Don't send push notifications to expired user accounts. ([\#8353](matrix-org/synapse#8353))
- Fix a regression in v1.19.0 with reactivating users through the admin API. ([\#8362](matrix-org/synapse#8362))
- Fix a bug where during device registration the length of the device name wasn't limited. ([\#8364](matrix-org/synapse#8364))
- Include `guest_access` in the fields that are checked for null bytes when updating `room_stats_state`. Broke in v1.7.2. ([\#8373](matrix-org/synapse#8373))
- Fix theoretical race condition where events are not sent down `/sync` if the synchrotron worker is restarted without restarting other workers. ([\#8374](matrix-org/synapse#8374))
- Fix a bug which could cause errors in rooms with malformed membership events, on servers using sqlite. ([\#8385](matrix-org/synapse#8385))
- Fix "Re-starting finished log context" warning when receiving an event we already had over federation. ([\#8398](matrix-org/synapse#8398))
- Fix incorrect handling of timeouts on outgoing HTTP requests. ([\#8400](matrix-org/synapse#8400))
- Fix a regression in v1.20.0 in the `synapse_port_db` script regarding the `ui_auth_sessions_ips` table. ([\#8410](matrix-org/synapse#8410))
- Remove unnecessary 3PID registration check when resetting password via an email address. Bug introduced in v0.34.0rc2. ([\#8414](matrix-org/synapse#8414))


Improved Documentation
----------------------

- Add `/_synapse/client` to the reverse proxy documentation. ([\#8227](matrix-org/synapse#8227))
- Add note to the reverse proxy settings documentation about disabling Apache's mod_security2. Contributed by Julian Fietkau (@jfietkau). ([\#8375](matrix-org/synapse#8375))
- Improve description of `server_name` config option in `homserver.yaml`. ([\#8415](matrix-org/synapse#8415))


Deprecations and Removals
-------------------------

- Drop support for `prometheus_client` older than 0.4.0. ([\#8426](matrix-org/synapse#8426))


Internal Changes
----------------

- Fix tests on distros which disable TLSv1.0. Contributed by @danc86. ([\#8208](matrix-org/synapse#8208))
- Simplify the distributor code to avoid unnecessary work. ([\#8216](matrix-org/synapse#8216))
- Remove the `populate_stats_process_rooms_2` background job and restore functionality to `populate_stats_process_rooms`. ([\#8243](matrix-org/synapse#8243))
- Clean up type hints for `PaginationConfig`. ([\#8250](matrix-org/synapse#8250), [\#8282](matrix-org/synapse#8282))
- Track the latest event for every destination and room for catch-up after federation outage. ([\#8256](matrix-org/synapse#8256))
- Fix non-user visible bug in implementation of `MultiWriterIdGenerator.get_current_token_for_writer`. ([\#8257](matrix-org/synapse#8257))
- Switch to the JSON implementation from the standard library. ([\#8259](matrix-org/synapse#8259))
- Add type hints to `synapse.util.async_helpers`. ([\#8260](matrix-org/synapse#8260))
- Simplify tests that mock asynchronous functions. ([\#8261](matrix-org/synapse#8261))
- Add type hints to `StreamToken` and `RoomStreamToken` classes. ([\#8279](matrix-org/synapse#8279))
- Change `StreamToken.room_key` to be a `RoomStreamToken` instance. ([\#8281](matrix-org/synapse#8281))
- Refactor notifier code to correctly use the max event stream position. ([\#8288](matrix-org/synapse#8288))
- Use slotted classes where possible. ([\#8296](matrix-org/synapse#8296))
- Support testing the local Synapse checkout against the [Complement homeserver test suite](https://github.com/matrix-org/complement/). ([\#8317](matrix-org/synapse#8317))
- Update outdated usages of `metaclass` to python 3 syntax. ([\#8326](matrix-org/synapse#8326))
- Move lint-related dependencies to package-extra field, update CONTRIBUTING.md to utilise this. ([\#8330](matrix-org/synapse#8330), [\#8377](matrix-org/synapse#8377))
- Use the `admin_patterns` helper in additional locations. ([\#8331](matrix-org/synapse#8331))
- Fix test logging to allow braces in log output. ([\#8335](matrix-org/synapse#8335))
- Remove `__future__` imports related to Python 2 compatibility. ([\#8337](matrix-org/synapse#8337))
- Simplify `super()` calls to Python 3 syntax. ([\#8344](matrix-org/synapse#8344))
- Fix bad merge from `release-v1.20.0` branch to `develop`. ([\#8354](matrix-org/synapse#8354))
- Factor out a `_send_dummy_event_for_room` method. ([\#8370](matrix-org/synapse#8370))
- Improve logging of state resolution. ([\#8371](matrix-org/synapse#8371))
- Add type annotations to `SimpleHttpClient`. ([\#8372](matrix-org/synapse#8372))
- Refactor ID generators to use `async with` syntax. ([\#8383](matrix-org/synapse#8383))
- Add `EventStreamPosition` type. ([\#8388](matrix-org/synapse#8388))
- Create a mechanism for marking tests "logcontext clean". ([\#8399](matrix-org/synapse#8399))
- A pair of tiny cleanups in the federation request code. ([\#8401](matrix-org/synapse#8401))
- Add checks on startup that PostgreSQL sequences are consistent with their associated tables. ([\#8402](matrix-org/synapse#8402))
- Do not include appservice users when calculating the total MAU for a server. ([\#8404](matrix-org/synapse#8404))
- Typing fixes for `synapse.handlers.federation`. ([\#8422](matrix-org/synapse#8422))
- Various refactors to simplify stream token handling. ([\#8423](matrix-org/synapse#8423))
- Make stream token serializing/deserializing async. ([\#8427](matrix-org/synapse#8427))
babolivier pushed a commit that referenced this pull request Sep 1, 2021
* commit 'e45b83411':
  Add types to async_helpers (#8260)
  Fix mypy error on develop (#8282)
  Include method in thumbnail media name (#7124)
  Add types to StreamToken and RoomStreamToken (#8279)
  Add a config option for validating 'next_link' parameters against a domain whitelist (#8275)
  Clean up types for PaginationConfig (#8250)
  Use the right constructor for log records (#8278)
  Fix `MultiWriterIdGenerator.current_position`. (#8257)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants