From 38ca4159e273c2c71e10bf7d56569d7150ac5c3b Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Fri, 9 Sep 2022 14:50:03 -0500
Subject: [PATCH 1/4] Document another benefit of using `(room_id, event_id)`

Discussed in the backend chapter sync, https://docs.google.com/document/d/1kmGRzPFfg_gRY6l0sxjYkSLW6UpMFn9ELQX5CtTLWlA/edit#bookmark=id.ciuq6xs2t47
---
 docs/development/database_schema.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/development/database_schema.md b/docs/development/database_schema.md
index e9b925ddd835..7a79bef5a38e 100644
--- a/docs/development/database_schema.md
+++ b/docs/development/database_schema.md
@@ -208,10 +208,11 @@ But hash collisions are still possible, and by treating event IDs as room
 scoped, we can reduce the possibility of a hash collision. When scoping
 `event_id` in the database schema, it should be also accompanied by `room_id`
 (`PRIMARY KEY (room_id, event_id)`) and lookups should be done through the pair
-`(room_id, event_id)`.
+`(room_id, event_id)`. Another benefit of scoping `event_ids` to the room is
+that it makes it very easy to find and clean up everything in a room when it
+needs to be purged.
 
-There has been a lot of debate on this in places like
+`event_id` global uniqueness has had a lot debate in places like
 https://github.com/matrix-org/matrix-spec-proposals/issues/2779 and
 [MSC2848](https://github.com/matrix-org/matrix-spec-proposals/pull/2848) which
 has no resolution yet (as of 2022-09-01).
-

From 913eab1b9057fc64dd9c8be8b7d3395cea44e71e Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Fri, 9 Sep 2022 14:53:36 -0500
Subject: [PATCH 2/4] Add changelog

---
 changelog.d/13771.doc | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 changelog.d/13771.doc

diff --git a/changelog.d/13771.doc b/changelog.d/13771.doc
new file mode 100644
index 000000000000..f45b5dc1bbd8
--- /dev/null
+++ b/changelog.d/13771.doc
@@ -0,0 +1 @@
+Document easy room purge benefit of using `(room_id, event_id)` in our database schemas.

From 8871e3d6359c995ddd3422d4e45299076e8e3a6e Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Mon, 12 Sep 2022 18:22:14 -0500
Subject: [PATCH 3/4] Clarify why easier

See https://github.com/matrix-org/synapse/pull/13771#discussion_r968250300
---
 docs/development/database_schema.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/development/database_schema.md b/docs/development/database_schema.md
index 7a79bef5a38e..1d29ee5a8e03 100644
--- a/docs/development/database_schema.md
+++ b/docs/development/database_schema.md
@@ -210,7 +210,8 @@ scoped, we can reduce the possibility of a hash collision. When scoping
 (`PRIMARY KEY (room_id, event_id)`) and lookups should be done through the pair
 `(room_id, event_id)`. Another benefit of scoping `event_ids` to the room is
 that it makes it very easy to find and clean up everything in a room when it
-needs to be purged.
+needs to be purged (no need to sub-`select` query or join from the `events`
+table).
 
 `event_id` global uniqueness has had a lot debate in places like
 https://github.com/matrix-org/matrix-spec-proposals/issues/2779 and

From 2cd28131790cd65374484901f33d66ce0f6d4dcf Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Mon, 12 Sep 2022 18:23:13 -0500
Subject: [PATCH 4/4] Grammar

---
 docs/development/database_schema.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/development/database_schema.md b/docs/development/database_schema.md
index 1d29ee5a8e03..9648a939b76f 100644
--- a/docs/development/database_schema.md
+++ b/docs/development/database_schema.md
@@ -210,7 +210,7 @@ scoped, we can reduce the possibility of a hash collision. When scoping
 (`PRIMARY KEY (room_id, event_id)`) and lookups should be done through the pair
 `(room_id, event_id)`. Another benefit of scoping `event_ids` to the room is
 that it makes it very easy to find and clean up everything in a room when it
-needs to be purged (no need to sub-`select` query or join from the `events`
+needs to be purged (no need to use sub-`select` query or join from the `events`
 table).
 
 `event_id` global uniqueness has had a lot debate in places like
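
Note on the documented benefit: the difference between the two purge styles can be sketched roughly as follows. This is only an illustration using Python's built-in `sqlite3` module, not Synapse's actual purge code. The tables `example_by_event` and `example_scoped`, their `payload` column, and the room and event IDs are hypothetical names made up for the example; only the `events` table name and the `PRIMARY KEY (room_id, event_id)` shape come from the documentation being patched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-in for the `events` table (assumed columns).
    CREATE TABLE events (event_id TEXT PRIMARY KEY, room_id TEXT NOT NULL);

    -- Hypothetical table keyed by event_id alone: no room_id to filter on.
    CREATE TABLE example_by_event (event_id TEXT PRIMARY KEY, payload TEXT);

    -- Hypothetical table scoped as the docs recommend.
    CREATE TABLE example_scoped (
        room_id TEXT NOT NULL,
        event_id TEXT NOT NULL,
        payload TEXT,
        PRIMARY KEY (room_id, event_id)
    );
    """
)

# Two events in room A, one in room B (made-up identifiers).
rows = [("!a:example.org", "$1"), ("!a:example.org", "$2"), ("!b:example.org", "$3")]
for room_id, event_id in rows:
    conn.execute("INSERT INTO events VALUES (?, ?)", (event_id, room_id))
    conn.execute("INSERT INTO example_by_event VALUES (?, ?)", (event_id, "x"))
    conn.execute("INSERT INTO example_scoped VALUES (?, ?, ?)", (room_id, event_id, "x"))


def purge_unscoped(db: sqlite3.Connection, room_id: str) -> int:
    """Without room_id on the table, the room's events must be found via `events`."""
    cur = db.execute(
        "DELETE FROM example_by_event WHERE event_id IN "
        "(SELECT event_id FROM events WHERE room_id = ?)",
        (room_id,),
    )
    return cur.rowcount


def purge_scoped(db: sqlite3.Connection, room_id: str) -> int:
    """With (room_id, event_id) as the key, a plain filter on room_id is enough."""
    cur = db.execute("DELETE FROM example_scoped WHERE room_id = ?", (room_id,))
    return cur.rowcount


removed_unscoped = purge_unscoped(conn, "!a:example.org")
removed_scoped = purge_scoped(conn, "!a:example.org")
remaining_unscoped = conn.execute("SELECT COUNT(*) FROM example_by_event").fetchone()[0]
remaining_scoped = conn.execute("SELECT COUNT(*) FROM example_scoped").fetchone()[0]
```

Both functions remove the same rows here, but the unscoped version depends on the `events` rows still existing at the time it runs, which constrains the order in which tables can be purged; the scoped version has no such dependency.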