Ability to migrate events between different formats. #932

Open
ara4n opened this issue Nov 21, 2021 · 2 comments
Labels
enhancement A suggestion for a relatively simple improvement to the protocol

Comments

@ara4n
Member

ara4n commented Nov 21, 2021

One of the biggest problems in rapid Matrix development turns out to be confusion over the extent to which it's acceptable to ship experimental features in clients which generate events with prefixed event types/fields. Currently there is a concern that if one ships features with prefixed event types too widely, the immutable nature of events means that the experimental feature will become a de facto part of the Matrix standard, and clients will have to implement it for the rest of time in order to render older room history - much as org.matrix.custom.html has done.

However: perhaps we are being too constrained by the idea that matrix events are immutable. What if we provided a mechanism to migrate events from an old format to a new one during a room version upgrade? So that room version N+1 re-imports all the events from prior versions of the room, having reexpressed them (e.g. JSON->CBOR, or unprefixing a prefix, etc)? For instance, the server triggering the upgrade could go through re-importing all the old events (obviously at the expense of manipulating history, but that comes with the territory - plus folks would always be able to compare against the old room history to verify that the migration was not malicious). For E2EE, you could rely on the user triggering the upgrade to have all the keys, and have them similarly re-submit all the messages (again at the expense of transport consistency, and the ugliness of the client having to be online during the migration). The migration could also be done incrementally in the background via MSC2716.
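As a rough illustration of the "unprefixing a prefix" case, the sketch below re-expresses events while copying a room's history into its successor. The event types in `UNPREFIX_MAP` and the function names are hypothetical, chosen only for this example; signing, auth rules, and E2EE are not modelled.

```python
from copy import deepcopy

# Hypothetical mapping from an experimental (prefixed) event type to its
# stable form. Neither name comes from a real MSC; both are placeholders.
UNPREFIX_MAP = {
    "org.example.widget": "m.example.widget",
}


def migrate_event(event: dict, type_map: dict = UNPREFIX_MAP) -> dict:
    """Return a copy of `event` with a prefixed type replaced by its stable type.

    Events whose type is not in `type_map` are copied unchanged.
    """
    migrated = deepcopy(event)
    migrated["type"] = type_map.get(event["type"], event["type"])
    return migrated


def migrate_room(events: list, type_map: dict = UNPREFIX_MAP) -> list:
    """Re-express every event of the old room, preserving the original order."""
    return [migrate_event(event, type_map) for event in events]
```

The old events are left untouched, so the pre-upgrade history stays available for comparison.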

This feels like a pretty useful thing to have in the protocol, especially as we gear up to more invasive changes such as changing event shapes in order to support account portability, P2P, or more efficient event encodings.

Thoughts welcome :)

(This would also be useful for migrating historical data between encryption formats - see also https://github.com/matrix-org/matrix-doc/issues/3520)

EDIT: Thinking further, this is almost a hard requirement whenever a crypto vulnerability (e.g. post-quantum) arises which requires everyone to shift to a more secure record of their conversation history...

@ara4n ara4n added the enhancement A suggestion for a relatively simple improvement to the protocol label Nov 21, 2021
@ShadowJonathan
Contributor

ShadowJonathan commented Nov 21, 2021

This is, assuming that we have solved and compromised on the idea of who-imports-what in a historical room scenario, a good idea.

This would also require a robust and exhaustive history preservation mechanism, one which (imo) should try to query every server it knows about, to scrape every event in a room's history, to then have a comprehensive history of said room. (I'll comment here that I think adding a "waiver" to retrieve hidden room history on servers which have a user with upgrade permission could be worth considering)

However, I think it'll help everyone involved if this process were deterministic, so that (so to speak) every server can come to the same conversion of events, given the same room history.
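One way to picture "every server comes to the same conversion" is to have each server compute a digest over its converted events and compare the results. The sketch below is an assumption-laden illustration: `history_digest` is a hypothetical helper, the serialisation only approximates Matrix canonical JSON (sorted keys, no insignificant whitespace, UTF-8), and the servers are assumed to agree on event order already.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialise with sorted keys and compact separators so that key order
    in the input dict does not affect the output bytes."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def history_digest(events: list) -> str:
    """SHA-256 over the canonical form of each event, in the given order."""
    digest = hashlib.sha256()
    for event in events:
        digest.update(canonical_bytes(event))
        digest.update(b"\n")  # separator between consecutive events
    return digest.hexdigest()
```

Two servers that ran the same conversion over the same history would then report the same digest, and a mismatch would flag a divergence worth investigating.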

This could get tricky, however, once custom events come into the mix, and a third party (with respect to Matrix and the user) may have the same needs. In this case, their server implementation may emit different events for their custom format, which is undesirable, as it complicates the upgrade process with the question of "which server do we upgrade from?". It also makes dictating which events get upgraded to what a privilege that Matrix has and third parties don't, which is not ideal, and imo not in line with Matrix's ideology.

So, my concerns are the following:

  • This process should be deterministic, and encoded in the spec, to not make the room history upgrade process different from server implementation to implementation, as otherwise it may yield widely differing and inconsistent results.
  • Third parties may not have the same opportunities to rewrite history to a more friendly format; this should be dealt with on a separate basis
    • comment: maybe with an appservice scanning the room, and adding events with relations to to-be-replaced events, with "requests" with what event content that event should be replaced with, could be a good compromise, at the risk of allowing easier rewriting of arbitrary history.

The whole process should be deterministic and "auditable". The previous room state could stay up for a while, and any server or concerned individual should be able to check the events of the previous room, compare them with the current one (re-applying that deterministic upgrade process, and the aforementioned possible "event upgrade requests"), and determine whether any piece of history has been wrongly altered. This tool should be available for anyone to easily check and audit that room's history; the outcome and consequences of a deliberately altered room history should be handled in a social sense afterwards.
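A minimal sketch of such an audit, under simplifying assumptions: `audit_migration` is a hypothetical helper, the `migrate` argument stands in for whatever deterministic upgrade function the spec would define, old and new events are assumed to line up by position, and "event upgrade requests" are not handled.

```python
def audit_migration(old_events: list, new_events: list, migrate) -> list:
    """Re-run `migrate` on the old history and return the positions where
    the upgraded room disagrees with the expected result."""
    mismatches = []
    for index, old_event in enumerate(old_events):
        expected = migrate(old_event)
        # A missing counterpart in the new room counts as a mismatch.
        actual = new_events[index] if index < len(new_events) else None
        if expected != actual:
            mismatches.append(index)
    # Events in the new room with no counterpart in the old room are suspect too.
    mismatches.extend(range(len(old_events), len(new_events)))
    return mismatches
```

An empty result means the new room matches a faithful re-run of the migration; any returned position points at an event worth inspecting by hand.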

@Stvad

Stvad commented Nov 21, 2022

You may find this write-up on schema evolutions interesting when thinking about how to actually do migrations: https://www.inkandswitch.com/cambria/


3 participants