Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Implement MSC3391: Removing account data #14244

Open
10 of 17 tasks
anoadragon453 opened this issue Oct 20, 2022 · 1 comment
Open
10 of 17 tasks

Implement MSC3391: Removing account data #14244

anoadragon453 opened this issue Oct 20, 2022 · 1 comment
Assignees
Labels
O-Occasional Affects or can be seen by some users regularly or most users rarely T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. Z-PS-Backend The Professional Services team at Element plan to see this through. Z-Time-Tracked Element employees should track their time spent on this issue/PR.

Comments

@anoadragon453
Copy link
Member

anoadragon453 commented Oct 20, 2022

Add an experimental implementation for MSC3391, which allows users to remove keys from their user and room account data.

TODO

Base behaviour

  • Add an off-by-default experimental config option msc3391_enabled to guard the feature behind.
  • Add two unstable endpoints for deleting account data:
    • DELETE /_matrix/client/unstable/org.matrix.msc3391/user/{userId}/account_data/{type}
    • DELETE /_matrix/client/unstable/org.matrix.msc3391/user/{userId}/rooms/{roomId}/account_data/{type}
      • Upon calling the above endpoints, the provided type is removed from the user's user/room account data. This should be recorded in the account_data and room_account_data tables respectively, by setting the content of each to {}.
      • Ensure these entries are not included in initial syncs.
  • Setting an account data type's content to {} when calling PUT /_matrix/client/v3/user/{userId}/account_data/{type} and PUT /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type} will be equivalent to calling the DELETE endpoint for that type. This is for backwards-compatibility with older clients which have used this to effectively "delete" keys in the past.
  • Ensure that event entries with a content of {} appear down /sync after an account data entry is deleted.

Permanently deleting items from the account data stream

The account_data and room_account_data tables are simply streams that track all changes to account data items. It would be nice to remove deleted rows from these tables when they're no longer needed (i.e. when every user's device has seen that the delete occurred). We can do so by tracking which devices have seen the change.

Note that account data changes through GET /_matrix/client/{r0,v3}/events are ignored for now. The endpoint has long been deprecated and supporting this requires a non-trivial amount of changes (namely passing a device ID into AccountDataEventSource.get_new_events).

  • Add a new table account_data_undelivered_deletes, which tracks whether a device has seen the deletion. Upon a delete one row for each known user device is added to this table. Upon a device sync'ing the change or hitting the GET endpoint for the data (axed, as it's weird that calling this may prevent the delete from coming down /sync - expanded upon in this comment), the relevant row will be removed. Note that "sync'ing a change" here is defined as a device using a since token with an account data stream ID that is after the account data entry in question, otherwise we can't be sure that the client has really seen the change (credit to Olivier for this point). Once all devices of a user have seen a deletion, we can delete the key from the account_data or room_account_data table. The device that caused the deletion should not have a row inserted.
    The account_data_undelivered_deletes table has the following schema:

    CREATE TABLE IF NOT EXISTS account_data_undelivered_deletes (
        -- The stream_id of the delete in `account_data` or `room_account_data`. Note that this value is
        -- unique across both `account_data` and `room_account_data` tables.
        stream_id BIGINT NOT NULL,
        -- foreign key: users(name)
        user_id TEXT NOT NULL,
        type TEXT NOT NULL,
        -- The room ID if this is referring to `room_account_data`.
        -- foreign key: rooms(room_id)
        room_id TEXT,
        -- A device ID that has not yet seen this delete.
        -- foreign key: devices(device_id)
        device_id TEXT NOT NULL
    );
    
    -- This is used to quickly look up a given (room) account_data entry for a given (user_id, device_id) pair
    CREATE UNIQUE INDEX IF NOT EXISTS account_data_undelivered_deletes_stream_id_user_id_device_id ON account_data_undelivered_deletes(stream_id, user_id, device_id);
    -- This is used to delete any rows for a given
    -- (account_data_type, room_id, user_id, device_id) tuple when an account data entry
    -- is added again.
    CREATE INDEX IF NOT EXISTS account_data_undelivered_deletes_type_room_id_user_id_device_id ON account_data_undelivered_deletes(type, room_id, user_id, device_id);
    -- This is used to delete all rows for a given (user_id, device_id) pair when a device is deleted.
    CREATE INDEX IF NOT EXISTS account_data_undelivered_deletes_user_id_device_id ON account_data_undelivered_deletes(user_id, device_id);
  • Add entries to account_data_undelivered_deletes for each device of a user when an account data entry is deleted.

  • Adding content to an account data type should clear all relevant rows in account_data_undelivered_deletes as that type is no longer in a deleted state.

  • Delete entries in account_data_undelivered_deletes when a given device knows that delete has occurred (through /sync).

  • Permanently delete all data from the account_data or room_account_data tables related to a given (account_data_type, user_id) pair when all user devices have seen the delete (the last relevant row has been removed from account_data_undelivered_deletes).
    This will be done by a background job - it's relatively easy to compute all {user,room} account data items with an empty content that do not have a corresponding entry in account_data_undelivered_deletes and there are no theoretical race conditions with this cleanup occurring on a periodic timer.

  • Deleting a device should remove all relevant rows from the account_data_undelivered_deletes table. (hint: use purge_account_data_for_user)

  • Possibly update the relevant account data Synapse module callback documentation.

Tests

  • Write complement tests for endpoint functionality.
  • Write Synapse unit tests for garbage collection functionality.

cc @ShadowJonathan

@anoadragon453 anoadragon453 added Z-Time-Tracked Element employees should track their time spent on this issue/PR. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. O-Occasional Affects or can be seen by some users regularly or most users rarely Z-PS-Backend The Professional Services team at Element plan to see this through. labels Oct 20, 2022
@anoadragon453 anoadragon453 self-assigned this Oct 20, 2022
@anoadragon453
Copy link
Member Author

The "Base Behaviour" portion of this feature has been implemented in #14714.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
O-Occasional Affects or can be seen by some users regularly or most users rarely T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. Z-PS-Backend The Professional Services team at Element plan to see this through. Z-Time-Tracked Element employees should track their time spent on this issue/PR.
Projects
None yet
Development

No branches or pull requests

1 participant