This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Reintroduce DB changes from #14979, using SCHEMA_COMPAT_VERSION mechanism #15014

Closed
DMRobertson opened this issue Feb 7, 2023 · 9 comments · Fixed by #15128
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Performance Performance, both client-facing and admin-facing O-Occasional Affects or can be seen by some users regularly or most users rarely S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@DMRobertson
Contributor

DMRobertson commented Feb 7, 2023

The changes in #14979

ALTER TABLE current_state_events ADD COLUMN event_stream_ordering BIGINT;
ALTER TABLE local_current_membership ADD COLUMN event_stream_ordering BIGINT;
ALTER TABLE room_memberships ADD COLUMN event_stream_ordering BIGINT;

INSERT INTO background_updates (update_name, progress_json) VALUES
  ('populate_membership_event_stream_ordering', '{}');

Problems:

Questions:

  • does the new query assume the populate_membership_event_stream_ordering background update has completed? How long do we expect it will take?

For 1.77 I

For the future we should try to land the schema changes gradually (see guidance) over the next few releases.

@DMRobertson DMRobertson added S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Release-Blocker Must be resolved before making a release O-Occasional Affects or can be seen by some users regularly or most users rarely labels Feb 7, 2023
@DMRobertson
Contributor Author

(cc @Fizzadar)

@DMRobertson DMRobertson added the A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db label Feb 7, 2023
@DMRobertson DMRobertson self-assigned this Feb 7, 2023
@DMRobertson DMRobertson changed the title DB changes in #14979 are not backwards compatible (and do not use the SCHEMA_COMPAT_VERSION mechanism) Reintroduce DB changes in #14979, using SCHEMA_COMPAT_VERSION mechanism Feb 7, 2023
@DMRobertson DMRobertson removed their assignment Feb 7, 2023
@DMRobertson DMRobertson added T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. A-Performance Performance, both client-facing and admin-facing S-Tolerable Minor significance, cosmetic issues, low or no impact to users. and removed S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Release-Blocker Must be resolved before making a release labels Feb 7, 2023
@DMRobertson
Contributor Author

I've updated the description and labels to explain what I did in 1.77.

@DMRobertson DMRobertson changed the title Reintroduce DB changes in #14979, using SCHEMA_COMPAT_VERSION mechanism Reintroduce DB changes from #14979, using SCHEMA_COMPAT_VERSION mechanism Feb 7, 2023
@Fizzadar
Contributor

Fizzadar commented Feb 8, 2023

Thank you for catching the compat problem & handling this @DMRobertson, I'll resubmit using the SCHEMA_COMPAT_VERSION mechanism next week(ish).

@DMRobertson
Contributor Author

No worries @Fizzadar. I think we'd normally hope to land it in stages e.g.:

  • disambiguate the query (1.77)
  • add the columns, start writing to them, and kick off the background job. (e.g. 1.78)
  • require the background job to have completed, and read from the new columns. (e.g. 1.79)

I'm not fully sure when SCHEMA_VERSION and SCHEMA_COMPAT_VERSION should get bumped. The example in the docs is for removing a column, not adding one. There might be some clues in the big comment describing SCHEMA_VERSION bumps, see

Changes in SCHEMA_VERSION = 73;

In fact, judging from

Changes in SCHEMA_VERSION = 66:
- Queries on state_key columns are now disambiguated (ie, the codebase can handle
the `events` table having a `state_key` column).

maybe I should have bumped the version when disambiguating the query?

(cc @richvdh, does the above sound sensible?)
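To illustrate why a query may need disambiguating before a column is added, here is a minimal sketch using an in-memory SQLite database. The tables `events` and `memberships` and the column `stream_ordering` are toy stand-ins chosen for the example, not Synapse's real schema or the actual query changed in 1.77. It shows that an unqualified column reference in a join works until a second table gains a column of the same name, while the qualified form keeps working.

```python
import sqlite3

# Toy schema: hypothetical tables, not Synapse's real ones.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id TEXT PRIMARY KEY, stream_ordering BIGINT);
    CREATE TABLE memberships (event_id TEXT, user_id TEXT);
    INSERT INTO events VALUES ('$e1', 7);
    INSERT INTO memberships VALUES ('$e1', '@alice:example.org');
""")

# Unqualified: relies on `stream_ordering` existing in only one joined table.
UNQUALIFIED = """
    SELECT user_id, stream_ordering
    FROM memberships JOIN events USING (event_id)
"""

# Qualified: names the table for every column, so a new column cannot clash.
QUALIFIED = """
    SELECT m.user_id, e.stream_ordering
    FROM memberships AS m JOIN events AS e USING (event_id)
"""


def run(sql):
    """Run a query, returning its rows or the error message it raised."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


before = (run(UNQUALIFIED), run(QUALIFIED))

# The schema change: the second table gains a column with the same name.
conn.execute("ALTER TABLE memberships ADD COLUMN stream_ordering BIGINT")

after = (run(UNQUALIFIED), run(QUALIFIED))
print(before)
print(after)
```

In this toy setup, code that still issues the unqualified query fails once the column exists, which is the kind of breakage a rollback to an older release could hit.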

@richvdh
Member

richvdh commented Feb 9, 2023

> maybe I should have bumped the version when disambiguating the query?

Oh bah, yes you should. Otherwise we have no way to distinguish between a Synapse that is compatible with the new columns, and one which is not.

Can you do it now, before 1.77final?
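The point about distinguishing compatible from incompatible releases can be sketched as follows. This is a simplified model, not Synapse's actual startup check: the function names, the release labels, and the version numbers 10 and 11 are all made up for illustration. It assumes the rule that code may use a database only if the code's SCHEMA_VERSION is at least the compat version recorded in the database.

```python
def may_start(code_schema_version: int, db_compat_version: int) -> bool:
    """Assumed rule: code must be at least as new as the DB's compat version."""
    return code_schema_version >= db_compat_version


def rollback_targets(releases: dict, db_compat_version: int) -> list:
    """Names of the releases that would be allowed to run against the database."""
    return [
        name
        for name, schema_version in releases.items()
        if may_start(schema_version, db_compat_version)
    ]


# Hypothetical version numbers. Release A still has the ambiguous query;
# release B has the disambiguated query.
without_bump = {"A (ambiguous query)": 10, "B (disambiguated)": 10}
with_bump = {"A (ambiguous query)": 10, "B (disambiguated)": 11}

# A later release adds the columns and must pick a compat version.
# Without the bump, no value admits B while excluding A:
print(rollback_targets(without_bump, 10))  # both A and B are admitted
print(rollback_targets(without_bump, 11))  # neither is admitted

# With the bump, a compat version of 11 admits only B:
print(rollback_targets(with_bump, 11))
```

Under these assumptions, bumping the version in the release that disambiguates the query is what lets a later release say "rolling back to B is safe, rolling back to A is not".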

@DMRobertson
Contributor Author

> Can you do it now, before 1.77final?

Will do.

@DMRobertson DMRobertson added the X-Release-Blocker Must be resolved before making a release label Feb 9, 2023
@richvdh
Member

richvdh commented Feb 9, 2023

Re the other comments on SCHEMA_VERSION, in particular

- thread_id column is added to event_push_actions, event_push_actions_staging
event_push_summary, receipts_linearized, and receipts_graph.
- Add table `event_failed_pull_attempts` to keep track when we fail to pull
events over federation.
- Add indexes to various tables (`event_failed_pull_attempts`, `insertion_events`,
`batch_events`) to make it easy to delete all associated rows when purging a room.
- `inserted_ts` column is added to `event_push_actions_staging` table.

What we really care about for SCHEMA_VERSION is the expectations made by the Python code. Those three points don't really give us any clues about those expectations. Compare with most of the earlier comments, which are more explicit about how the Python is changing, rather than the structure of the database.

Not necessarily saying they need fixing retrospectively; my point is more that they aren't great examples to follow.

@DMRobertson
Contributor Author

> Can you do it now, before 1.77final?

Will do.

#15036

@DMRobertson DMRobertson removed the X-Release-Blocker Must be resolved before making a release label Feb 9, 2023
Fizzadar added a commit to Fizzadar/synapse that referenced this issue Feb 20, 2023
Synapse 1.77.0 (2023-02-14)
===========================

No significant changes since 1.77.0rc2.

Synapse 1.77.0rc2 (2023-02-10)
==============================

Bugfixes
--------

- Fix bug where retried replication requests would return a failure. Introduced in v1.76.0. ([\matrix-org#15024](matrix-org#15024))

Internal Changes
----------------

- Prepare for future database schema changes. ([\matrix-org#15036](matrix-org#15036))

Synapse 1.77.0rc1 (2023-02-07)
==============================

Features
--------

- Experimental support for [MSC3952](matrix-org/matrix-spec-proposals#3952): intentional mentions. ([\matrix-org#14823](matrix-org#14823), [\matrix-org#14943](matrix-org#14943), [\matrix-org#14957](matrix-org#14957), [\matrix-org#14958](matrix-org#14958))
- Experimental support to suppress notifications from message edits ([MSC3958](matrix-org/matrix-spec-proposals#3958)). ([\matrix-org#14960](matrix-org#14960), [\matrix-org#15016](matrix-org#15016))
- Add profile information, devices and connections to the command line [user data export tool](https://matrix-org.github.io/synapse/v1.77/usage/administration/admin_faq.html#how-can-i-export-user-data). ([\matrix-org#14894](matrix-org#14894))
- Improve performance when joining or sending an event in large rooms. ([\matrix-org#14962](matrix-org#14962))
- Improve performance of joining and leaving large rooms with many local users. ([\matrix-org#14971](matrix-org#14971))

Bugfixes
--------

- Fix a bug introduced in Synapse 1.53.0 where `next_batch` tokens from `/sync` could not be used with the `/relations` endpoint. ([\matrix-org#14866](matrix-org#14866))
- Fix a bug introduced in Synapse 1.35.0 where the module API's `send_local_online_presence_to` would fail to send presence updates over federation. ([\matrix-org#14880](matrix-org#14880))
- Fix a bug introduced in Synapse 1.70.0 where the background updates to add non-thread unique indexes on receipts could fail when upgrading from 1.67.0 or earlier. ([\matrix-org#14915](matrix-org#14915))
- Fix a regression introduced in Synapse 1.69.0 which can result in database corruption when database migrations are interrupted on sqlite. ([\matrix-org#14926](matrix-org#14926))
- Fix a bug introduced in Synapse 1.68.0 where we were unable to service remote joins in rooms with `@room` notification levels set to `null` in their (malformed) power levels. ([\matrix-org#14942](matrix-org#14942))
- Fix a bug introduced in Synapse 1.64.0 where boolean power levels were erroneously permitted in [v10 rooms](https://spec.matrix.org/v1.5/rooms/v10/). ([\matrix-org#14944](matrix-org#14944))
- Fix a long-standing bug where sending messages on servers with presence enabled would spam "Re-starting finished log context" log lines. ([\matrix-org#14947](matrix-org#14947))
- Fix a bug introduced in Synapse 1.68.0 where logging from the Rust module was not properly logged. ([\matrix-org#14976](matrix-org#14976))
- Fix various long-standing bugs in Synapse's config, event and request handling where booleans were unintentionally accepted where an integer was expected. ([\matrix-org#14945](matrix-org#14945))

Internal Changes
----------------

- Add missing type hints. ([\matrix-org#14879](matrix-org#14879), [\matrix-org#14886](matrix-org#14886), [\matrix-org#14887](matrix-org#14887), [\matrix-org#14904](matrix-org#14904), [\matrix-org#14927](matrix-org#14927), [\matrix-org#14956](matrix-org#14956), [\matrix-org#14983](matrix-org#14983), [\matrix-org#14984](matrix-org#14984), [\matrix-org#14985](matrix-org#14985), [\matrix-org#14987](matrix-org#14987), [\matrix-org#14988](matrix-org#14988), [\matrix-org#14990](matrix-org#14990), [\matrix-org#14991](matrix-org#14991), [\matrix-org#14992](matrix-org#14992), [\matrix-org#15007](matrix-org#15007))
- Use `StrCollection` to avoid potential bugs with `Collection[str]`. ([\matrix-org#14922](matrix-org#14922))
- Allow running the complement tests suites with the asyncio reactor enabled. ([\matrix-org#14858](matrix-org#14858))
- Improve performance of `/sync` in a few situations. ([\matrix-org#14908](matrix-org#14908), [\matrix-org#14970](matrix-org#14970))
- Document how to handle Dependabot pull requests. ([\matrix-org#14916](matrix-org#14916))
- Fix typo in release script. ([\matrix-org#14920](matrix-org#14920))
- Update build system requirements to allow building with poetry-core 1.5.0. ([\matrix-org#14949](matrix-org#14949), [\matrix-org#15019](matrix-org#15019))
- Add an [lnav](https://lnav.org) config file for Synapse logs to `/contrib/lnav`. ([\matrix-org#14953](matrix-org#14953))
- Faster joins: Refactor internal handling of servers in room to never store an empty list. ([\matrix-org#14954](matrix-org#14954))
- Faster joins: tag `v2/send_join/` requests to indicate if they served a partial join response. ([\matrix-org#14950](matrix-org#14950))
- Allow running `cargo` without the `extension-module` option. ([\matrix-org#14965](matrix-org#14965))
- Preparatory work for adding a denormalised event stream ordering column in the future. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#14979](matrix-org#14979), [9cd7610](matrix-org@9cd7610), [f10caa7](matrix-org@f10caa7); see [\matrix-org#15014](matrix-org#15014))
- Add tests for `_flatten_dict`. ([\matrix-org#14981](matrix-org#14981), [\matrix-org#15002](matrix-org#15002))

<details><summary>Dependabot updates</summary>

- Bump dtolnay/rust-toolchain from e645b0cf01249a964ec099494d38d2da0f0b349f to 9cd00a88a73addc8617065438eff914dd08d0955. ([\matrix-org#14968](matrix-org#14968))
- Bump docker/build-push-action from 3 to 4. ([\matrix-org#14952](matrix-org#14952))
- Bump ijson from 3.1.4 to 3.2.0.post0. ([\matrix-org#14935](matrix-org#14935))
- Bump types-pyyaml from 6.0.12.2 to 6.0.12.3. ([\matrix-org#14936](matrix-org#14936))
- Bump types-jsonschema from 4.17.0.2 to 4.17.0.3. ([\matrix-org#14937](matrix-org#14937))
- Bump types-pillow from 9.4.0.3 to 9.4.0.5. ([\matrix-org#14938](matrix-org#14938))
- Bump hiredis from 2.0.0 to 2.1.1. ([\matrix-org#14939](matrix-org#14939))
- Bump hiredis from 2.1.1 to 2.2.1. ([\matrix-org#14993](matrix-org#14993))
- Bump types-setuptools from 65.6.0.3 to 67.1.0.0. ([\matrix-org#14994](matrix-org#14994))
- Bump prometheus-client from 0.15.0 to 0.16.0. ([\matrix-org#14995](matrix-org#14995))
- Bump anyhow from 1.0.68 to 1.0.69. ([\matrix-org#14996](matrix-org#14996))
- Bump serde_json from 1.0.91 to 1.0.92. ([\matrix-org#14997](matrix-org#14997))
- Bump isort from 5.11.4 to 5.11.5. ([\matrix-org#14998](matrix-org#14998))
- Bump phonenumbers from 8.13.4 to 8.13.5. ([\matrix-org#14999](matrix-org#14999))
</details>


# Conflicts:
#	docker/Dockerfile
#	poetry.lock
#	rust/src/push/base_rules.rs
#	rust/src/push/evaluator.rs
#	rust/src/push/mod.rs
#	synapse/config/experimental.py
#	synapse/event_auth.py
#	synapse/handlers/message.py
#	synapse/handlers/pagination.py
#	synapse/push/bulk_push_rule_evaluator.py
#	synapse/rest/admin/rooms.py
#	synapse/storage/databases/main/devices.py
#	synapse/storage/databases/main/roommember.py
#	tests/push/test_push_rule_evaluator.py
@Fizzadar
Contributor

I have now reintroduced this in #15128 with proper schema bump & ordering on the background job. To answer Q's above:

> does the new query assume the populate_membership_event_stream_ordering background update has completed?

No, the query was only changed to reference the existing columns explicitly.

> How long do we expect it will take?

Probably a while :) Requires iterating over all membership events so a pretty hefty job overall.
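As a rough picture of why the job is hefty, here is a minimal sketch of a batched, resumable backfill using SQLite. It is not Synapse's background update machinery: the helper names, the `last_event_id` progress key, the tiny batch size, and the simplified table definitions are assumptions made for the example. Only the table and column names `room_memberships`, `event_stream_ordering`, `background_updates`, `progress_json` and the update name are taken from the SQL in the issue description.

```python
import json
import sqlite3

UPDATE_NAME = "populate_membership_event_stream_ordering"
BATCH_SIZE = 2  # deliberately tiny so the batching is visible


def setup() -> sqlite3.Connection:
    """Create simplified tables with five membership rows awaiting backfill."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (event_id TEXT PRIMARY KEY, stream_ordering BIGINT NOT NULL);
        CREATE TABLE room_memberships (event_id TEXT PRIMARY KEY, event_stream_ordering BIGINT);
        CREATE TABLE background_updates (update_name TEXT PRIMARY KEY, progress_json TEXT NOT NULL);
    """)
    conn.execute("INSERT INTO background_updates VALUES (?, '{}')", (UPDATE_NAME,))
    rows = [(f"$ev{i}", i * 10) for i in range(1, 6)]
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.executemany(
        "INSERT INTO room_memberships (event_id) VALUES (?)",
        [(event_id,) for event_id, _ in rows],
    )
    conn.commit()
    return conn


def run_one_batch(conn: sqlite3.Connection, update_name: str) -> int:
    """Backfill one batch, persist progress, and return the number of rows done."""
    (progress_json,) = conn.execute(
        "SELECT progress_json FROM background_updates WHERE update_name = ?",
        (update_name,),
    ).fetchone()
    last_event_id = json.loads(progress_json).get("last_event_id", "")

    batch = conn.execute(
        """
        SELECT m.event_id, e.stream_ordering
        FROM room_memberships AS m
        JOIN events AS e ON e.event_id = m.event_id
        WHERE m.event_id > ?
        ORDER BY m.event_id
        LIMIT ?
        """,
        (last_event_id, BATCH_SIZE),
    ).fetchall()

    if not batch:
        # Nothing left: remove the update's row to mark it finished.
        conn.execute(
            "DELETE FROM background_updates WHERE update_name = ?", (update_name,)
        )
        conn.commit()
        return 0

    conn.executemany(
        "UPDATE room_memberships SET event_stream_ordering = ? WHERE event_id = ?",
        [(ordering, event_id) for event_id, ordering in batch],
    )
    # Store where we got to, so a restart resumes instead of starting over.
    conn.execute(
        "UPDATE background_updates SET progress_json = ? WHERE update_name = ?",
        (json.dumps({"last_event_id": batch[-1][0]}), update_name),
    )
    conn.commit()
    return len(batch)


conn = setup()
batches = []
while (done := run_one_batch(conn, UPDATE_NAME)) > 0:
    batches.append(done)
print(batches)
```

The work grows with the number of membership rows, and until the last batch finishes some rows still hold NULL in the new column, which is why readers of the column would need to wait for the update to complete.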

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Mar 1, 2023