This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Fix PostgreSQL sometimes using table scans for event_search #14409
Merged
squahtx merged 2 commits into develop from squah/fix_event_search_distinct_room_id_estimate on Nov 10, 2022
Conversation
PostgreSQL may underestimate the number of distinct `room_id`s in `event_search`, which can cause it to use table scans for queries for multiple rooms. Fix this by setting `n_distinct` on the column.

Resolves #14402.

Signed-off-by: Sean Quah <seanq@matrix.org>
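To illustrate why an underestimated distinct count can push the planner toward a table scan, here is a toy sketch. It assumes rows are spread evenly across rooms, so each room matches about `1 / n_distinct` of the table. The function name and all numbers are hypothetical, and this is not PostgreSQL's actual cost model.

```python
def estimated_fraction_matched(n_distinct_estimate: float, rooms_in_query: int) -> float:
    """Toy estimate of the fraction of rows matching `room_id IN (...)`.

    Assumes an even spread of rows over distinct values, so each room
    matches roughly 1 / n_distinct of the table. This is a simplification
    for illustration, not the planner's real selectivity calculation.
    """
    if n_distinct_estimate <= 0:
        raise ValueError("expected a positive distinct-value estimate")
    if rooms_in_query < 0:
        raise ValueError("rooms_in_query must not be negative")
    # The estimate can never exceed the whole table.
    return min(1.0, rooms_in_query / n_distinct_estimate)


# Hypothetical numbers: the same 10-room query under two different estimates.
underestimated = estimated_fraction_matched(n_distinct_estimate=50, rooms_in_query=10)
realistic = estimated_fraction_matched(n_distinct_estimate=100_000, rooms_in_query=10)

print(f"underestimated distinct count: {underestimated:.2%} of rows expected")
print(f"realistic distinct count:      {realistic:.4%} of rows expected")
```

When the expected fraction of matching rows is large, reading the whole table sequentially can look cheaper than many index lookups, which is consistent with the behaviour described in this PR.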
squahtx commented Nov 10, 2022
Comment on lines +17 to +33
-- By default the postgres statistics collector massively underestimates the
-- number of distinct rooms in `event_search`, which can cause postgres to use
-- table scans for queries for multiple rooms.
--
-- To work around this we can manually tell postgres the number of distinct rooms
-- by setting `n_distinct` (a negative value here is the number of distinct values
-- divided by the number of rows, so -0.01 means on average there are 100 rows per
-- distinct value). We don't need a particularly accurate number here, as a) we just
-- want it to always use index scans and b) our estimate is going to be better than the
-- one made by the statistics collector.

ALTER TABLE event_search ALTER COLUMN room_id SET (n_distinct = -0.01);

-- Ideally we'd do an `ANALYZE event_search (room_id)` here so that
-- the above gets picked up immediately, but that can take a bit of time so we
-- rely on the autovacuum eventually getting run and doing that in the
-- background for us.
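The arithmetic behind the `-0.01` value in the migration comment can be sketched as follows. The helper names and the row count are hypothetical and only illustrate how a positive or negative `n_distinct` is read: a positive value is an absolute count, and a negative value is a fraction of the row count.

```python
def distinct_values(n_distinct: float, total_rows: int) -> float:
    """Number of distinct values implied by an `n_distinct` setting.

    Positive: an absolute number of distinct values.
    Negative (down to -1): minus the ratio of distinct values to rows,
    so the implied count scales with the size of the table.
    """
    if n_distinct > 0:
        return n_distinct
    if -1 <= n_distinct < 0:
        return -n_distinct * total_rows
    raise ValueError("n_distinct must be positive or in the range [-1, 0)")


def rows_per_distinct_value(n_distinct: float, total_rows: int) -> float:
    """Average number of rows per distinct value under that setting."""
    return total_rows / distinct_values(n_distinct, total_rows)


# Hypothetical table size, for illustration only.
total_rows = 5_000_000

print(distinct_values(-0.01, total_rows))          # implied number of distinct room_ids
print(rows_per_distinct_value(-0.01, total_rows))  # average rows per room_id
```

Because the negative form is a ratio, the implied number of rooms grows with the table, which fits the comment's point that the value does not need to be particularly accurate.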
Comments shamelessly borrowed from #10359.
DMRobertson approved these changes Nov 10, 2022
changelog.d/14409.bugfix
Outdated
@@ -0,0 +1 @@
Fix PostgreSQL sometimes using table scans for queries against `event_search` table, taking a long time and a large amount of IO.
`against` -> `against the`?
oops, fixed.
Thanks for the fast review!
bradtgmurray added a commit to beeper/synapse-legacy-fork that referenced this pull request Nov 22, 2022
Synapse 1.72.0 (2022-11-22)
===========================

Please note that Synapse now only supports PostgreSQL 11+, because PostgreSQL 10 has reached end-of-life, c.f. our [Deprecation Policy](https://github.com/matrix-org/synapse/blob/develop/docs/deprecation_policy.md).

Bugfixes
--------

- Update forgotten references to legacy metrics in the included Grafana dashboard. (matrix-org#14477)

Synapse 1.72.0rc1 (2022-11-16)
==============================

Features
--------

- Add experimental support for MSC3912 (matrix-org/matrix-spec-proposals#3912): Relation-based redactions. (matrix-org#14260)
- Build Debian packages for Ubuntu 22.10 (Kinetic Kudu). (matrix-org#14396)
- Add an [Admin API](https://matrix-org.github.io/synapse/latest/usage/administration/admin_api/index.html) endpoint for user lookup based on third-party ID (3PID). Contributed by @ashfame. (matrix-org#14405)
- Faster joins: include heroes' membership events in the partial join response, for rooms without a name or canonical alias. (matrix-org#14442)

Bugfixes
--------

- Faster joins: do not block creation of or queries for room aliases during the resync. (matrix-org#14292)
- Fix a bug introduced in Synapse 1.64.0rc1 which could cause log spam when fetching events from other homeservers. (matrix-org#14347)
- Fix a bug introduced in 1.66 which would not send certain pushrules to clients. Contributed by Nico. (matrix-org#14356)
- Fix a bug introduced in v1.71.0rc1 where the power level event was incorrectly created during initial room creation. (matrix-org#14361)
- Fix the refresh token endpoint to be under /r0 and /v3 instead of /v1. Contributed by Tulir @ Beeper. (matrix-org#14364)
- Fix a long-standing bug where Synapse would raise an error when encountering an unrecognised field in a `/sync` filter, instead of ignoring it for forward compatibility. (matrix-org#14369)
- Fix a background database update, introduced in Synapse 1.64.0, which could cause poor database performance. (matrix-org#14374)
- Fix PostgreSQL sometimes using table scans for queries against the `event_search` table, taking a long time and a large amount of IO. (matrix-org#14409)
- Fix rendering of some HTML templates (including emails). Introduced in v1.71.0. (matrix-org#14448)
- Fix a bug introduced in Synapse 1.70.0 where the background updates to add non-thread unique indexes on receipts could fail when upgrading from 1.67.0 or earlier. (matrix-org#14453)

Updates to the Docker image
---------------------------

- Add all Stream Writer worker types to `configure_workers_and_start.py`. (matrix-org#14197)
- Remove references to legacy worker types in the multi-worker Dockerfile. (matrix-org#14294)

Improved Documentation
----------------------

- Upload documentation PRs to Netlify. (matrix-org#12947, matrix-org#14370)
- Add additional TURN server configuration example based on [eturnal](https://github.com/processone/eturnal) and adjust general TURN server doc structure. (matrix-org#14293)
- Add example on how to load balance /sync requests. Contributed by [aceArt](https://aceart.de). (matrix-org#14297)
- Edit sample Nginx reverse proxy configuration to use HTTP/1.1. Contributed by Brad Jones. (matrix-org#14414)

Deprecations and Removals
-------------------------

- Remove support for PostgreSQL 10. (matrix-org#14392, matrix-org#14397)

Internal Changes
----------------

- Run unit tests against Python 3.11. (matrix-org#13812)
- Add TLS support for generic worker endpoints. (matrix-org#14128, matrix-org#14455)
- Switch to a maintained action for installing Rust in CI. (matrix-org#14313)
- Add override ability to `complement.sh` command line script to request certain types of workers. (matrix-org#14324)
- Enabling testing of MSC3874 (matrix-org/matrix-spec-proposals#3874) (filtering of `/messages` by relation type) in complement. (matrix-org#14339)
- Concisely log a failure to resolve state due to missing `prev_events`. (matrix-org#14346)
- Use a maintained Github action to install Rust. (matrix-org#14351)
- Cleanup old worker datastore classes. Contributed by Nick @ Beeper (@Fizzadar). (matrix-org#14375)
- Test against PostgreSQL 15 in CI. (matrix-org#14394)
- Remove unreachable code. (matrix-org#14410)
- Clean-up event persistence code. (matrix-org#14411)
- Update docstring to clarify that `get_partial_state_events_batch` does not just give you completely arbitrary partial-state events. (matrix-org#14417)
- Fix mypy errors introduced by bumping the locked version of `attrs` and `gitpython`. (matrix-org#14433)
- Make Dependabot only bump Rust deps in the lock file. (matrix-org#14434)
- Fix an incorrect stub return type for `PushRuleEvaluator.run`. (matrix-org#14451)
- Improve performance of `/context` in large rooms. (matrix-org#14461)