Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Track presence state per-device and combine to a user state. #16066

Merged
merged 8 commits into from Sep 5, 2023

Conversation

clokep
Copy link
Contributor

@clokep clokep commented Aug 4, 2023

Tracks presence on an individual per-device basis and combines the per-device state into a per-user state. This should help in situations where a user has two devices with conflicting status (e.g. one is syncing with unavailable and one is syncing with online).

The tie-breaking is done by priority:

BUSY > ONLINE > UNAVAILABLE > OFFLINE

Builds on #16170, #16171, #16172.

@clokep clokep force-pushed the clokep/presence branch 3 times, most recently from cc66e18 to 1f2a2a0 Compare August 4, 2023 16:04
synapse/handlers/presence.py Outdated Show resolved Hide resolved
@clokep
Copy link
Contributor Author

clokep commented Aug 4, 2023

@erikjohnston Can you take a look at this and see if I'm wildly missing something / breaking the world?

changelog.d/16066.bugfix Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
tests/server.py Outdated Show resolved Hide resolved
synapse/handlers/presence.py Outdated Show resolved Hide resolved
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to mostly make sense, though it is a huge PR. Left some comments, but I think we might need to sit down with it? I'm assuming there is now chance of splitting this PR up?

if prev_state.state == PresenceState.UNAVAILABLE:
new_fields["state"] = PresenceState.ONLINE
# Update the device information & mark the device as online if it was
# unavailable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is a "what" comment rather than a "why" comment? Why can't we go from offline -> online? Presumably because we need to have a sync stream to transition out of offline, which we clearly don't have?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, I think the code is just a bit complicated and was attempting to put a succinct description of what's going on.

# Mark the device as online since it is not syncing, this
# might mark the user as online, so send an update.
device_state.state = PresenceState.ONLINE
presence = _combine_device_states(devices.values())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a lot of this logic is a bit scattered around? I'd expect the state machine conditions to be in one place where we update the device info?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did not love how this is scattered around. Will give it a think! 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is pretty simplified now, but we still do call _combine_presence_state in a few spots.

synapse/handlers/presence.py Show resolved Hide resolved
@clokep clokep changed the base branch from develop to clokep/presence-prep-2 August 23, 2023 15:45
@clokep clokep force-pushed the clokep/presence-prep-2 branch 2 times, most recently from caf6119 to 7a8ef9f Compare August 28, 2023 17:10
@clokep clokep changed the title Track presence state per-device and amalgamate to a user state. Track presence state per-device and combine to a user state. Aug 28, 2023
Comment on lines +2212 to +2217
if (PRESENCE_BY_PRIORITY[device_state.state], device_state.last_active_ts) > (
PRESENCE_BY_PRIORITY[presence],
last_active_ts,
):
presence = device_state.state
last_active_ts = device_state.last_active_ts
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably all be replaced with max(device_states) if we define the correct comparison method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even just max with an explicit key function.

Base automatically changed from clokep/presence-prep-2 to develop August 29, 2023 15:44
@clokep clokep marked this pull request as ready for review August 29, 2023 16:39
@clokep clokep requested a review from a team as a code owner August 29, 2023 16:39
@clokep clokep merged commit ea75346 into develop Sep 5, 2023
37 checks passed
@clokep clokep deleted the clokep/presence branch September 5, 2023 13:58
clokep added a commit that referenced this pull request Sep 5, 2023
Tracks presence on an individual per-device basis and combine
the per-device state into a per-user state. This should help in
situations where a user has multiple devices with conflicting status
(e.g. one is syncing with unavailable and one is syncing with online).

The tie-breaking is done by priority:

    BUSY > ONLINE > UNAVAILABLE > OFFLINE
yingziwu added a commit to yingziwu/synapse that referenced this pull request Oct 3, 2023
No significant changes since 1.93.0rc1.

The following issues are fixed in 1.93.0 (and RCs).

- [GHSA-4f74-84v3-j9q5](GHSA-4f74-84v3-j9q5) / [CVE-2023-41335](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-41335) — Low Severity

  Temporary storage of plaintext passwords during password changes.

- [GHSA-7565-cq32-vx2x](GHSA-7565-cq32-vx2x) / [CVE-2023-42453](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-42453) — Low Severity

  Improper validation of receipts allows forged read receipts.

See the advisories for more details. If you have any questions, email security@matrix.org.

- Add automatic purge after all users have forgotten a room. ([\matrix-org#15488](matrix-org#15488))
- Restore room purge/shutdown after a Synapse restart. ([\matrix-org#15488](matrix-org#15488))
- Support resolving homeservers using `matrix-fed` DNS SRV records from [MSC4040](matrix-org/matrix-spec-proposals#4040). ([\matrix-org#16137](matrix-org#16137))
- Add the ability to use `G` (GiB) and `T` (TiB) suffixes in configuration options that refer to numbers of bytes. ([\matrix-org#16219](matrix-org#16219))
- Add span information to requests sent to appservices. Contributed by MTRNord. ([\matrix-org#16227](matrix-org#16227))
- Add the ability to enable/disable registrations when using CAS. Contributed by Aurélien Grimpard. ([\matrix-org#16262](matrix-org#16262))
- Allow the `/notifications` endpoint to be routed to workers. ([\matrix-org#16265](matrix-org#16265))
- Enable users to easily unsubscribe to notifications emails via the `List-Unsubscribe` header. ([\matrix-org#16274](matrix-org#16274))
- Report whether a user is `locked` in the [List Accounts admin API](https://matrix-org.github.io/synapse/latest/admin_api/user_admin_api.html#list-accounts), and exclude locked users by default. ([\matrix-org#16328](matrix-org#16328))

- Fix a long-standing bug where multi-device accounts could cause high load due to presence. ([\matrix-org#16066](matrix-org#16066), [\matrix-org#16170](matrix-org#16170), [\matrix-org#16171](matrix-org#16171), [\matrix-org#16172](matrix-org#16172), [\matrix-org#16174](matrix-org#16174))
- Fix a long-standing bug where appservices using [MSC2409](matrix-org/matrix-spec-proposals#2409) to receive `to_device` messages would only get messages for one user. ([\matrix-org#16251](matrix-org#16251))
- Fix bug when using workers where Synapse could end up re-requesting the same remote device repeatedly. ([\matrix-org#16252](matrix-org#16252))
- Fix long-standing bug where we kept re-requesting a remote server's key repeatedly, potentially causing delays in receiving events over federation. ([\matrix-org#16257](matrix-org#16257))
- Avoid temporary storage of sensitive information. ([\matrix-org#16272](matrix-org#16272))
- Fix bug introduced in Synapse 1.49.0 when using dehydrated devices ([MSC2697](matrix-org/matrix-spec-proposals#2697)) and refresh tokens. Contributed by Hanadi. ([\matrix-org#16288](matrix-org#16288))
- Fix a long-standing bug where invalid receipts would be accepted. ([\matrix-org#16327](matrix-org#16327))
- Use standard name for UTF-8 charset in emails. ([\matrix-org#16329](matrix-org#16329))
- Don't try refetching device lists for users on remote hosts that are marked as "down". ([\matrix-org#16298](matrix-org#16298))

- Fix typos in the documentation. ([\matrix-org#16282](matrix-org#16282))
- Link to the Alpine Linux community package for Synapse. ([\matrix-org#16304](matrix-org#16304))
- Use string for `federation_client_minimum_tls_version` documentation examples. Contributed by @jcgruenhage. ([\matrix-org#16353](matrix-org#16353))

- Allow modules to delete rooms. ([\matrix-org#15997](matrix-org#15997))
- Add GCC and GNU Make to the Nix flake development environment so that `ruff` can be compiled. ([\matrix-org#16090](matrix-org#16090), [\matrix-org#16263](matrix-org#16263))
- Fix type checking when using the new version of Twisted. ([\matrix-org#16235](matrix-org#16235))
- Delete device messages asynchronously and in staged batches using the task scheduler. ([\matrix-org#16240](matrix-org#16240), [\matrix-org#16311](matrix-org#16311), [\matrix-org#16312](matrix-org#16312), [\matrix-org#16313](matrix-org#16313))
- Bump minimum supported Rust version to 1.61.0. ([\matrix-org#16248](matrix-org#16248))
- Update rust to version 1.71.1 in the nix development environment. ([\matrix-org#16260](matrix-org#16260))
- Simplify server key storage. ([\matrix-org#16261](matrix-org#16261))
- Reduce CPU overhead of change password endpoint. ([\matrix-org#16264](matrix-org#16264))
- Stop purging from tables slated for removal. ([\matrix-org#16273](matrix-org#16273))
- Improve type hints. ([\matrix-org#16276](matrix-org#16276), [\matrix-org#16301](matrix-org#16301), [\matrix-org#16325](matrix-org#16325), [\matrix-org#16326](matrix-org#16326))
- Raise `setuptools_rust` version cap to 1.7.0. ([\matrix-org#16277](matrix-org#16277))
- Fix using the new task scheduler causing lots of CPU to be used. ([\matrix-org#16278](matrix-org#16278))
- Upgrade CI run of Python 3.12 from rc1 to rc2. ([\matrix-org#16280](matrix-org#16280))
- Include values in SQL debug when using `execute_values` with Postgres. ([\matrix-org#16281](matrix-org#16281))
- Enable additional linting checks. ([\matrix-org#16283](matrix-org#16283))
- Refactor `receipts_graph` Postgres transactions to stop error messages. ([\matrix-org#16299](matrix-org#16299))
- Small improvements to logging in replication code. ([\matrix-org#16309](matrix-org#16309))
- Remove a reference cycle in background processes. ([\matrix-org#16314](matrix-org#16314))
- Only use literal strings for background process names. ([\matrix-org#16315](matrix-org#16315))
- Refactor `get_user_by_id`. ([\matrix-org#16316](matrix-org#16316))
- Speed up task to delete to-device messages. ([\matrix-org#16318](matrix-org#16318))
- Avoid patching code in tests. ([\matrix-org#16349](matrix-org#16349))
- Test against PostgreSQL 16. ([\matrix-org#16351](matrix-org#16351))

* Bump mypy from 1.4.1 to 1.5.1. ([\matrix-org#16300](matrix-org#16300))
* Bump black from 23.7.0 to 23.9.1. ([\matrix-org#16295](matrix-org#16295))
* Bump docker/build-push-action from 4 to 5. ([\matrix-org#16336](matrix-org#16336))
* Bump docker/login-action from 2 to 3. ([\matrix-org#16339](matrix-org#16339))
* Bump docker/metadata-action from 4 to 5. ([\matrix-org#16337](matrix-org#16337))
* Bump docker/setup-qemu-action from 2 to 3. ([\matrix-org#16338](matrix-org#16338))
* Bump furo from 2023.8.19 to 2023.9.10. ([\matrix-org#16340](matrix-org#16340))
* Bump gitpython from 3.1.32 to 3.1.35. ([\matrix-org#16267](matrix-org#16267), [\matrix-org#16279](matrix-org#16279))
* Bump mypy-zope from 1.0.0 to 1.0.1. ([\matrix-org#16291](matrix-org#16291))
* Bump pillow from 10.0.0 to 10.0.1. ([\matrix-org#16344](matrix-org#16344))
* Bump regex from 1.9.4 to 1.9.5. ([\matrix-org#16233](matrix-org#16233))
* Bump ruff from 0.0.286 to 0.0.290. ([\matrix-org#16342](matrix-org#16342))
* Bump serde_json from 1.0.105 to 1.0.107. ([\matrix-org#16296](matrix-org#16296), [\matrix-org#16345](matrix-org#16345))
* Bump twisted from 22.10.0 to 23.8.0. ([\matrix-org#16235](matrix-org#16235))
* Bump types-pillow from 10.0.0.2 to 10.0.0.3. ([\matrix-org#16293](matrix-org#16293))
* Bump types-setuptools from 68.0.0.3 to 68.2.0.0. ([\matrix-org#16292](matrix-org#16292))
* Bump typing-extensions from 4.7.1 to 4.8.0. ([\matrix-org#16341](matrix-org#16341))
Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Oct 27, 2023
No significant changes since 1.93.0rc1.

The following issues are fixed in 1.93.0 (and RCs).

- [GHSA-4f74-84v3-j9q5](GHSA-4f74-84v3-j9q5) / [CVE-2023-41335](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-41335) — Low Severity

  Temporary storage of plaintext passwords during password changes.

- [GHSA-7565-cq32-vx2x](GHSA-7565-cq32-vx2x) / [CVE-2023-42453](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-42453) — Low Severity

  Improper validation of receipts allows forged read receipts.

See the advisories for more details. If you have any questions, email security@matrix.org.

- Add automatic purge after all users have forgotten a room. ([\matrix-org#15488](matrix-org#15488))
- Restore room purge/shutdown after a Synapse restart. ([\matrix-org#15488](matrix-org#15488))
- Support resolving homeservers using `matrix-fed` DNS SRV records from [MSC4040](matrix-org/matrix-spec-proposals#4040). ([\matrix-org#16137](matrix-org#16137))
- Add the ability to use `G` (GiB) and `T` (TiB) suffixes in configuration options that refer to numbers of bytes. ([\matrix-org#16219](matrix-org#16219))
- Add span information to requests sent to appservices. Contributed by MTRNord. ([\matrix-org#16227](matrix-org#16227))
- Add the ability to enable/disable registrations when using CAS. Contributed by Aurélien Grimpard. ([\matrix-org#16262](matrix-org#16262))
- Allow the `/notifications` endpoint to be routed to workers. ([\matrix-org#16265](matrix-org#16265))
- Enable users to easily unsubscribe to notifications emails via the `List-Unsubscribe` header. ([\matrix-org#16274](matrix-org#16274))
- Report whether a user is `locked` in the [List Accounts admin API](https://matrix-org.github.io/synapse/latest/admin_api/user_admin_api.html#list-accounts), and exclude locked users by default. ([\matrix-org#16328](matrix-org#16328))

- Fix a long-standing bug where multi-device accounts could cause high load due to presence. ([\matrix-org#16066](matrix-org#16066), [\matrix-org#16170](matrix-org#16170), [\matrix-org#16171](matrix-org#16171), [\matrix-org#16172](matrix-org#16172), [\matrix-org#16174](matrix-org#16174))
- Fix a long-standing bug where appservices using [MSC2409](matrix-org/matrix-spec-proposals#2409) to receive `to_device` messages would only get messages for one user. ([\matrix-org#16251](matrix-org#16251))
- Fix bug when using workers where Synapse could end up re-requesting the same remote device repeatedly. ([\matrix-org#16252](matrix-org#16252))
- Fix long-standing bug where we kept re-requesting a remote server's key repeatedly, potentially causing delays in receiving events over federation. ([\matrix-org#16257](matrix-org#16257))
- Avoid temporary storage of sensitive information. ([\matrix-org#16272](matrix-org#16272))
- Fix bug introduced in Synapse 1.49.0 when using dehydrated devices ([MSC2697](matrix-org/matrix-spec-proposals#2697)) and refresh tokens. Contributed by Hanadi. ([\matrix-org#16288](matrix-org#16288))
- Fix a long-standing bug where invalid receipts would be accepted. ([\matrix-org#16327](matrix-org#16327))
- Use standard name for UTF-8 charset in emails. ([\matrix-org#16329](matrix-org#16329))
- Don't try refetching device lists for users on remote hosts that are marked as "down". ([\matrix-org#16298](matrix-org#16298))

- Fix typos in the documentation. ([\matrix-org#16282](matrix-org#16282))
- Link to the Alpine Linux community package for Synapse. ([\matrix-org#16304](matrix-org#16304))
- Use string for `federation_client_minimum_tls_version` documentation examples. Contributed by @jcgruenhage. ([\matrix-org#16353](matrix-org#16353))

- Allow modules to delete rooms. ([\matrix-org#15997](matrix-org#15997))
- Add GCC and GNU Make to the Nix flake development environment so that `ruff` can be compiled. ([\matrix-org#16090](matrix-org#16090), [\matrix-org#16263](matrix-org#16263))
- Fix type checking when using the new version of Twisted. ([\matrix-org#16235](matrix-org#16235))
- Delete device messages asynchronously and in staged batches using the task scheduler. ([\matrix-org#16240](matrix-org#16240), [\matrix-org#16311](matrix-org#16311), [\matrix-org#16312](matrix-org#16312), [\matrix-org#16313](matrix-org#16313))
- Bump minimum supported Rust version to 1.61.0. ([\matrix-org#16248](matrix-org#16248))
- Update rust to version 1.71.1 in the nix development environment. ([\matrix-org#16260](matrix-org#16260))
- Simplify server key storage. ([\matrix-org#16261](matrix-org#16261))
- Reduce CPU overhead of change password endpoint. ([\matrix-org#16264](matrix-org#16264))
- Stop purging from tables slated for removal. ([\matrix-org#16273](matrix-org#16273))
- Improve type hints. ([\matrix-org#16276](matrix-org#16276), [\matrix-org#16301](matrix-org#16301), [\matrix-org#16325](matrix-org#16325), [\matrix-org#16326](matrix-org#16326))
- Raise `setuptools_rust` version cap to 1.7.0. ([\matrix-org#16277](matrix-org#16277))
- Fix using the new task scheduler causing lots of CPU to be used. ([\matrix-org#16278](matrix-org#16278))
- Upgrade CI run of Python 3.12 from rc1 to rc2. ([\matrix-org#16280](matrix-org#16280))
- Include values in SQL debug when using `execute_values` with Postgres. ([\matrix-org#16281](matrix-org#16281))
- Enable additional linting checks. ([\matrix-org#16283](matrix-org#16283))
- Refactor `receipts_graph` Postgres transactions to stop error messages. ([\matrix-org#16299](matrix-org#16299))
- Small improvements to logging in replication code. ([\matrix-org#16309](matrix-org#16309))
- Remove a reference cycle in background processes. ([\matrix-org#16314](matrix-org#16314))
- Only use literal strings for background process names. ([\matrix-org#16315](matrix-org#16315))
- Refactor `get_user_by_id`. ([\matrix-org#16316](matrix-org#16316))
- Speed up task to delete to-device messages. ([\matrix-org#16318](matrix-org#16318))
- Avoid patching code in tests. ([\matrix-org#16349](matrix-org#16349))
- Test against PostgreSQL 16. ([\matrix-org#16351](matrix-org#16351))

* Bump mypy from 1.4.1 to 1.5.1. ([\matrix-org#16300](matrix-org#16300))
* Bump black from 23.7.0 to 23.9.1. ([\matrix-org#16295](matrix-org#16295))
* Bump docker/build-push-action from 4 to 5. ([\matrix-org#16336](matrix-org#16336))
* Bump docker/login-action from 2 to 3. ([\matrix-org#16339](matrix-org#16339))
* Bump docker/metadata-action from 4 to 5. ([\matrix-org#16337](matrix-org#16337))
* Bump docker/setup-qemu-action from 2 to 3. ([\matrix-org#16338](matrix-org#16338))
* Bump furo from 2023.8.19 to 2023.9.10. ([\matrix-org#16340](matrix-org#16340))
* Bump gitpython from 3.1.32 to 3.1.35. ([\matrix-org#16267](matrix-org#16267), [\matrix-org#16279](matrix-org#16279))
* Bump mypy-zope from 1.0.0 to 1.0.1. ([\matrix-org#16291](matrix-org#16291))
* Bump pillow from 10.0.0 to 10.0.1. ([\matrix-org#16344](matrix-org#16344))
* Bump regex from 1.9.4 to 1.9.5. ([\matrix-org#16233](matrix-org#16233))
* Bump ruff from 0.0.286 to 0.0.290. ([\matrix-org#16342](matrix-org#16342))
* Bump serde_json from 1.0.105 to 1.0.107. ([\matrix-org#16296](matrix-org#16296), [\matrix-org#16345](matrix-org#16345))
* Bump twisted from 22.10.0 to 23.8.0. ([\matrix-org#16235](matrix-org#16235))
* Bump types-pillow from 10.0.0.2 to 10.0.0.3. ([\matrix-org#16293](matrix-org#16293))
* Bump types-setuptools from 68.0.0.3 to 68.2.0.0. ([\matrix-org#16292](matrix-org#16292))
* Bump typing-extensions from 4.7.1 to 4.8.0. ([\matrix-org#16341](matrix-org#16341))

# -----BEGIN PGP SIGNATURE-----
#
# iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmUS8iEQHGVyaWtAbWF0
# cml4Lm9yZwAKCRClQuTtGw+sCXFgB/912+T+BydS290UECCXp9kpRB5xo3aWe8mX
# NCx9Oor1TRLBpLhlQWk786gP1Q9JAQpmA4z6kovjKaLG1b4oLbZNjbPG4hEYc8ow
# /rVzGor52pfyS7uS5GW+rRmapcw4AYND6hA9XGELupf2joC8LXioSCEVG4cxwD8E
# IgIbLc87C7KpaUkNbDEz3jzZ3/BVRGcIYyhF3zTK2ZApvH2qsegq8wKYx4EYJnfh
# 87DXtTCNwA+bW6XZYPtUwPKjZ+TGB11IizxmQySGLbAxvH+GUan8X8TizGyxaqaA
# FDk3yMBbUo0R7ljDgL5YsZXT6qsZz+IBz/bsMzSbZ39f/yEUqHak
# =1/pL
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue Sep 26 16:00:49 2023 BST
# gpg:                using RSA key 053191DFF4670330465227F7A542E4ED1B0FAC09
# gpg:                issuer "erik@matrix.org"
# gpg: Can't check signature: No public key

# Conflicts:
#	.github/workflows/docker.yml
#	.github/workflows/push_complement_image.yml
#	.github/workflows/release-artifacts.yml
#	.github/workflows/tests.yml
#	poetry.lock
#	synapse/appservice/scheduler.py
#	synapse/handlers/pagination.py
#	synapse/handlers/room.py
#	synapse/rest/client/account_data.py
#	tests/rest/client/test_receipts.py
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handling of differing set_presence values between devices causes sync tightlooping
5 participants