Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further address performance regression in search #1749

Merged
merged 3 commits into from May 21, 2019

Conversation

Projects
None yet
3 participants
@sgrif
Copy link
Contributor

commented May 20, 2019

Even after #1746, we're still seeing a performance issue with search in
production. Now it's limited to searches that are a single letter, or
short 2 letter words like 'do'. It's caused by any search that would
cause PG to warn that the query contains only stopwords. It appears the
code path taken when plainto_tsquery returns an empty query is
substantially slower than it would be otherwise, even if the query
contains stopwords.

The reason this has started causing problems now is that #1560 caused
the query to change from performing a nested loop join to a hash join.
Due to what appears to be a bug in PG, plainto_tsquery is getting
called once per row when a hash join is performed. When the query is
passed as the only argument, the function is declared as STABLE,
meaning that within a single statement it will always return the same
result for the same arguments, so PG should only be calling it once (or
at least only a handful of times).

There's a second form available where you explicitly pass the language
as an argument. This form is marked as IMMUTABLE, so the query planner
will just replace the call to the function with its results.

Unfortunately, PG is picky about how we pass the language. It doesn't
consider a cast from text to regconfig to be IMMUTABLE, only
STABLE (which is valid, since it's based on a pg_catalog lookup --
The fact that it accepts a string literal as IMMUTABLE actually seems
wrong). The actual value is the OID of the row in pg_ts_config, which
is not stable. Since regconfig('english'::text) is not considered
IMMUTABLE, we just need to embed it as a string literal instead.

sgrif added some commits May 20, 2019

Further address performance regression in search
Even after #1746, we're still seeing a performance issue with search in
production. Now it's limited to searches that are a single letter, or
short 2 letter words like 'do'. It's caused by any search that would
cause PG to warn that the query contains only stopwords. It appears the
code path taken when `plainto_tsquery` returns an empty query is
substantially slower than it would be otherwise, even if the query
contains stopwords.

The reason this has started causing problems now is that #1560 caused
the query to change from performing a nested loop join to a hash join.
Due to what appears to be a bug in PG, `plainto_tsquery` is getting
called once per row when a hash join is performed. When the query is
passed as the only argument, the function is declared as `STABLE`,
meaning that within a single statement it will always return the same
result for the same arguments, so PG should only be calling it once (or
at least only a handful of times).

There's a second form available where you explicitly pass the language
as an argument. This form is marked as `IMMUTABLE`, so the query planner
will just replace the call to the function with its results.

Unfortunately, PG is picky about how we pass the language. It doesn't
consider a cast from `text` to `regconfig` to be `IMMUTABLE`, only
`STABLE` (which is valid, since it's based on a `pg_catalog` lookup --
The fact that it accepts a string literal as `IMMUTABLE` actually seems
wrong). The actual value is the OID of the row in `pg_ts_config`, which
is *not* stable. Since `regconfig('english'::text)` is not considered
`IMMUTABLE`, we just need to embed it as a string literal instead.
@pietroalbini

This comment has been minimized.

Copy link
Member

commented May 21, 2019

Seems fine to me. Not really familiar with that part of PostgreSQL but the code makes sense.

@sgrif

This comment has been minimized.

Copy link
Contributor Author

commented May 21, 2019

@bors r=pietroalbini

@bors

This comment has been minimized.

Copy link
Contributor

commented May 21, 2019

📌 Commit 8f14232 has been approved by pietroalbini

bors added a commit that referenced this pull request May 21, 2019

Auto merge of #1749 - sgrif:sg-fix-slow-query-more, r=pietroalbini
Further address performance regression in search

Even after #1746, we're still seeing a performance issue with search in
production. Now it's limited to searches that are a single letter, or
short 2 letter words like 'do'. It's caused by any search that would
cause PG to warn that the query contains only stopwords. It appears the
code path taken when `plainto_tsquery` returns an empty query is
substantially slower than it would be otherwise, even if the query
contains stopwords.

The reason this has started causing problems now is that #1560 caused
the query to change from performing a nested loop join to a hash join.
Due to what appears to be a bug in PG, `plainto_tsquery` is getting
called once per row when a hash join is performed. When the query is
passed as the only argument, the function is declared as `STABLE`,
meaning that within a single statement it will always return the same
result for the same arguments, so PG should only be calling it once (or
at least only a handful of times).

There's a second form available where you explicitly pass the language
as an argument. This form is marked as `IMMUTABLE`, so the query planner
will just replace the call to the function with its results.

Unfortunately, PG is picky about how we pass the language. It doesn't
consider a cast from `text` to `regconfig` to be `IMMUTABLE`, only
`STABLE` (which is valid, since it's based on a `pg_catalog` lookup --
The fact that it accepts a string literal as `IMMUTABLE` actually seems
wrong). The actual value is the OID of the row in `pg_ts_config`, which
is *not* stable. Since `regconfig('english'::text)` is not considered
`IMMUTABLE`, we just need to embed it as a string literal instead.
@bors

This comment has been minimized.

Copy link
Contributor

commented May 21, 2019

⌛️ Testing commit 8f14232 with merge 16e8d71...

@bors

This comment has been minimized.

Copy link
Contributor

commented May 21, 2019

☀️ Test successful - checks-travis
Approved by: pietroalbini
Pushing 16e8d71 to master...

@bors bors merged commit 8f14232 into rust-lang:master May 21, 2019

3 checks passed

Travis CI - Pull Request Build Passed
Details
homu Test successful
Details
percy/crates.io Visual review automatically approved, no visual changes found.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.