Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix get_last_event_in_room_before_stream_ordering(...) finding the wrong last event #17295

Conversation

MadLittleMods
Copy link
Collaborator

@MadLittleMods MadLittleMods commented Jun 12, 2024

Found while working on Sliding Sync and split out from #17293


Fix get_last_event_in_room_before_stream_ordering(...) finding the wrong last event.

PR where this was introduced: matrix-org/synapse#14817

What does this affect?

get_last_event_in_room_before_stream_ordering(...) is used in Sync v2 in a lot of different state calculations.

get_last_event_in_room_before_stream_ordering(...) is also used in /rooms/{roomId}/members

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct
    (run the linters)

…the last event

Previously, it would use the event with the lowest `stream_ordering` that was
fetched (we want the highest `stream_ordering. `union` does not work how you
might think at first and does not preserve the ordering of the operand queries.

`union all` also doesn't necessarily have that gurantee but it does behave
that way. See https://dba.stackexchange.com/questions/316818/are-results-from-union-all-clauses-always-appended-in-order/316835#316835
We use `union all` because we don't need any of the deduplication logic
(`union` is really a union + distinct). `UNION ALL`` does preserve the
ordering of the operand queries but there is no actual gurantee that it
has this behavior in all scenarios so we need the extra `ORDER BY` at the
bottom.

See https://dba.stackexchange.com/questions/316818/are-results-from-union-all-clauses-always-appended-in-order/316835#316835
@@ -931,7 +944,7 @@ def get_last_event_in_room_before_stream_ordering_txn(
AND rejections.event_id IS NULL
ORDER BY stream_ordering DESC
) AS a
UNION
Copy link
Collaborator Author

@MadLittleMods MadLittleMods Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the query would return rows in stream_ordering ascending order because UNION does not preserve ordering of the operand queries. UNION also behaves as a union + DISTINCT whereas UNION ALL is just the union behavior you're probably expecting.

Before (ascending):

instance_name=worker1, stream_ordering=8 event_id=$MGVB0M_oj6o7FOjpRivsh7UjxQjuLy3ic8Wo5b3cbYU
instance_name=worker1, stream_ordering=9 event_id=$qTr48e2-uDwwKqVH5YJ1r1brPw9jJA4jUnHClGGPyj0

Since we don't need that DISTINCT behavior (because our sub-queries are mutually exclusive anyway and duplicates wouldn't matter anyway), we are now using UNION ALL which preserves the order of the operand queries (works correctly by itself in my testing) but may not preserve the order in all cases, with different indexes or partitioning or a parallel plan. So we add an extra ORDER BY stream_ordering DESC to the end to give us the actual guarantee.

After (descending):

instance_name=worker1, stream_ordering=9 event_id=$xuRTTdLJcUnjMea6JJb-9wDD4eMA9bXaazs0yzNRKxw
instance_name=worker1, stream_ordering=8 event_id=$dBXh-nyRaTq4NTvDAKQZMKnIev3blRmtJV4BqWzRYB4

See https://dba.stackexchange.com/questions/316818/are-results-from-union-all-clauses-always-appended-in-order/316835#316835 for more info on the UNION ALL order guarantee.

@MadLittleMods MadLittleMods marked this pull request as ready for review June 12, 2024 05:46
@MadLittleMods MadLittleMods requested a review from a team as a code owner June 12, 2024 05:46
@MadLittleMods MadLittleMods merged commit ebdce69 into develop Jun 13, 2024
38 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/fix-and-tests-for-get_last_event_in_room_before_stream_ordering branch June 13, 2024 16:00
@MadLittleMods
Copy link
Collaborator Author

Thanks for the review @erikjohnston 🦒

yingziwu added a commit to yingziwu/synapse that referenced this pull request Jul 4, 2024
No significant changes since 1.110.0rc3.

- Fix bug where `/sync` requests could get blocked indefinitely after an upgrade from Synapse versions before v1.109.0. ([\#17386](element-hq/synapse#17386), [\#17391](element-hq/synapse#17391))

- Limit size of presence EDUs to 50 entries. ([\#17371](element-hq/synapse#17371))
- Fix building debian package for debian sid. ([\#17389](element-hq/synapse#17389))

- Fix uploading packages to PyPi. ([\#17363](element-hq/synapse#17363))

- Add initial implementation of an experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17187](element-hq/synapse#17187))
- Add experimental support for [MSC3823](matrix-org/matrix-spec-proposals#3823) - Account suspension. ([\#17255](element-hq/synapse#17255))
- Improve ratelimiting in Synapse. ([\#17256](element-hq/synapse#17256))
- Add support for the unstable [MSC4151](matrix-org/matrix-spec-proposals#4151) report room API. ([\#17270](element-hq/synapse#17270), [\#17296](element-hq/synapse#17296))
- Filter for public and empty rooms added to Admin-API [List Room API](https://element-hq.github.io/synapse/latest/admin_api/rooms.html#list-room-api). ([\#17276](element-hq/synapse#17276))
- Add `is_dm` filtering to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17277](element-hq/synapse#17277))
- Add `is_encrypted` filtering to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17281](element-hq/synapse#17281))
- Include user membership in events served to clients, per [MSC4115](matrix-org/matrix-spec-proposals#4115). ([\#17282](element-hq/synapse#17282))
- Do not require user-interactive authentication for uploading cross-signing keys for the first time, per [MSC3967](matrix-org/matrix-spec-proposals#3967). ([\#17284](element-hq/synapse#17284))
- Add `stream_ordering` sort to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17293](element-hq/synapse#17293))
- `register_new_matrix_user` now supports a --password-file flag, which
  is useful for scripting. ([\#17294](element-hq/synapse#17294))
- `register_new_matrix_user` now supports a --exists-ok flag to allow registration of users that already exist in the database.
  This is useful for scripts that bootstrap user accounts with initial passwords. ([\#17304](element-hq/synapse#17304))
- Add support for via query parameter from [MSC4156](matrix-org/matrix-spec-proposals#4156). ([\#17322](element-hq/synapse#17322))
- Add `is_invite` filtering to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17335](element-hq/synapse#17335))
- Support [MSC3916](https://github.com/matrix-org/matrix-spec-proposals/blob/rav/authentication-for-media/proposals/3916-authentication-for-media.md) by adding a federation /download endpoint. ([\#17350](element-hq/synapse#17350))

- Fix searching for users with their exact localpart whose ID includes a hyphen. ([\#17254](element-hq/synapse#17254))
- Fix wrong retention policy being used when filtering events. ([\#17272](element-hq/synapse#17272))
- Fix bug where OTKs were not always included in `/sync` response when using workers. ([\#17275](element-hq/synapse#17275))
- Fix a long-standing bug where an invalid 'from' parameter to [`/notifications`](https://spec.matrix.org/v1.10/client-server-api/#get_matrixclientv3notifications) would result in an Internal Server Error. ([\#17283](element-hq/synapse#17283))
- Fix edge case in `/sync` returning the wrong the state when using sharded event persisters. ([\#17295](element-hq/synapse#17295))
- Add initial implementation of an experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17301](element-hq/synapse#17301))
- Fix email notification subject when invited to a space. ([\#17336](element-hq/synapse#17336))

- Add missing quotes for example for `exclude_rooms_from_sync`. ([\#17308](element-hq/synapse#17308))
- Update header in the README to visually fix the the auto-generated table of contents. ([\#17329](element-hq/synapse#17329))
- Fix stale references to the Foundation's Security Disclosure Policy. ([\#17341](element-hq/synapse#17341))
- Add default values for `rc_invites.per_issuer` to docs. ([\#17347](element-hq/synapse#17347))
- Fix an error in the docs for `search_all_users` parameter under `user_directory`. ([\#17348](element-hq/synapse#17348))

- Remove unused `expire_access_token` option in the Synapse Docker config file. Contributed by @AaronDewes. ([\#17198](element-hq/synapse#17198))
- Use fully-qualified `PersistedEventPosition` when returning `RoomsForUser` to facilitate proper comparisons and `RoomStreamToken` generation. ([\#17265](element-hq/synapse#17265))
- Add debug logging for when room keys are uploaded, including whether they are replacing other room keys. ([\#17266](element-hq/synapse#17266))
- Handle OTK uploads off master. ([\#17271](element-hq/synapse#17271))
- Don't try and resync devices for remote users whose servers are marked as down. ([\#17273](element-hq/synapse#17273))
- Re-organize Pydantic models and types used in handlers. ([\#17279](element-hq/synapse#17279))
- Expose the worker instance that persisted the event on `event.internal_metadata.instance_name`. ([\#17300](element-hq/synapse#17300))
- Update the README with Element branding, improve headers and fix the #synapse:matrix.org support room link rendering. ([\#17324](element-hq/synapse#17324))
- Change path of the experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync implementation to `/org.matrix.simplified_msc3575/sync` since our simplified API is slightly incompatible with what's in the current MSC. ([\#17331](element-hq/synapse#17331))
- Handle device lists notifications for large accounts more efficiently in worker mode. ([\#17333](element-hq/synapse#17333), [\#17358](element-hq/synapse#17358))
- Do not block event sending/receiving while calculating large event auth chains. ([\#17338](element-hq/synapse#17338))
- Tidy up `parse_integer` docs and call sites to reflect the fact that they require non-negative integers by default, and bring `parse_integer_from_args` default in alignment. Contributed by Denis Kasak (@dkasak). ([\#17339](element-hq/synapse#17339))

* Bump authlib from 1.3.0 to 1.3.1. ([\#17343](element-hq/synapse#17343))
* Bump dawidd6/action-download-artifact from 3.1.4 to 5. ([\#17289](element-hq/synapse#17289))
* Bump dawidd6/action-download-artifact from 5 to 6. ([\#17313](element-hq/synapse#17313))
* Bump docker/build-push-action from 5 to 6. ([\#17312](element-hq/synapse#17312))
* Bump jinja2 from 3.1.3 to 3.1.4. ([\#17287](element-hq/synapse#17287))
* Bump lazy_static from 1.4.0 to 1.5.0. ([\#17355](element-hq/synapse#17355))
* Bump msgpack from 1.0.7 to 1.0.8. ([\#17317](element-hq/synapse#17317))
* Bump netaddr from 1.2.1 to 1.3.0. ([\#17353](element-hq/synapse#17353))
* Bump packaging from 24.0 to 24.1. ([\#17352](element-hq/synapse#17352))
* Bump phonenumbers from 8.13.37 to 8.13.39. ([\#17315](element-hq/synapse#17315))
* Bump regex from 1.10.4 to 1.10.5. ([\#17290](element-hq/synapse#17290))
* Bump requests from 2.31.0 to 2.32.2. ([\#17345](element-hq/synapse#17345))
* Bump sentry-sdk from 2.1.1 to 2.3.1. ([\#17263](element-hq/synapse#17263))
* Bump sentry-sdk from 2.3.1 to 2.6.0. ([\#17351](element-hq/synapse#17351))
* Bump tornado from 6.4 to 6.4.1. ([\#17344](element-hq/synapse#17344))
* Bump mypy from 1.8.0 to 1.9.0. ([\#17297](element-hq/synapse#17297))
* Bump types-jsonschema from 4.21.0.20240311 to 4.22.0.20240610. ([\#17288](element-hq/synapse#17288))
* Bump types-netaddr from 1.2.0.20240219 to 1.3.0.20240530. ([\#17314](element-hq/synapse#17314))
* Bump types-pillow from 10.2.0.20240423 to 10.2.0.20240520. ([\#17285](element-hq/synapse#17285))
* Bump types-pyyaml from 6.0.12.12 to 6.0.12.20240311. ([\#17316](element-hq/synapse#17316))
* Bump typing-extensions from 4.11.0 to 4.12.2. ([\#17354](element-hq/synapse#17354))
* Bump urllib3 from 2.0.7 to 2.2.2. ([\#17346](element-hq/synapse#17346))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants