release-24.1: opt: add GenerateTrigramSimilarityInvertedIndexScans rule #122838

Merged
mgartner merged 4 commits into cockroachdb:release-24.1 from backport24.1-121973 on Apr 22, 2024

mgartner
Collaborator

Backport 4/5 commits from #121973.

/cc @cockroachdb/release


sql: add optimizer_use_trigram_similarity_optimization setting

This commit adds a new session setting called
optimizer_use_trigram_similarity_optimization. It currently has no
effect, but in future commits it will allow control over a new
optimization for queries with trigram similarity filters.

Release note: None

sql: disable inverted scans when similarity optimization is enabled

The GenerateInvertedIndexScans rule is now disabled for similarity
filters on trigram indexes when the
optimizer_use_trigram_similarity_optimization session setting is enabled.
In a future commit this setting will enable a new rule that makes
GenerateInvertedIndexScans obsolete. There is no reason to trigger
both rules.

Release note: None

opt: add GenerateTrigramSimilarityInvertedIndexScans rule

The GenerateTrigramSimilarityInvertedIndexScans exploration rule has
been added, which index-accelerates trigram similarity filters. See the
comment above the newly added custom function for more details. In
future commits, this rule will be optimized further.

The tests for this rule break from convention: they are added to a new
file rather than to the select test file corresponding to the location
of the rule in select.opt. This proposed new method of organization is
motivated by the huge growth of the select file over the years. If
there is agreement on this, we can incrementally reorganize all existing
tests to match.

Release note (performance improvement): More efficient query plans are
now generated for queries with text similarity filters, e.g.,
text_col % 'foobar'. These plans are generated if the
optimizer_use_trigram_similarity_optimization session setting is
enabled. It is disabled by default.

opt: reduce scanned trigrams for similarity filters

The GenerateTrigramSimilarityInvertedIndexScans exploration rule now
generates plans that scan fewer trigrams. See the added code comments
for more details.

Release note: None

Fixes #112675

Release note: None


Release justification: Performance improvements gated behind a
session setting.

@mgartner mgartner requested a review from a team April 22, 2024 19:34
@mgartner mgartner requested review from a team as code owners April 22, 2024 19:34
@mgartner mgartner requested review from michae2 and removed request for a team April 22, 2024 19:34

blathers-crl bot commented Apr 22, 2024

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Backports should only be created for serious issues or test-only changes.
  • Backports should not break backwards-compatibility.
  • Backports should change as little code as possible.
  • Backports should not change on-disk formats or node communication protocols.
  • Backports should not add new functionality (except as defined here).
  • Backports must not add, edit, or otherwise modify cluster versions; or add version gates.
  • All backports must be reviewed by the owning area's TL and one additional TL. For more information as to how that review should be conducted, please consult the backport policy.
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters. State changes must be further protected such that nodes running old binaries will not be negatively impacted by the new state (with a mixed version test added).
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.
  • Your backport must be accompanied by a post to the appropriate Slack channel (#db-backports-point-releases or #db-backports-XX-X-release) for awareness and discussion.

Also, please add a brief release justification to the body of your PR to justify this backport.

@blathers-crl blathers-crl bot added the backport label Apr 22, 2024

blathers-crl bot commented Apr 22, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@rytaft rytaft left a comment

:lgtm:

Reviewed 8 of 8 files at r1, 2 of 2 files at r2, 14 of 14 files at r3, 9 of 9 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2)

@mgartner mgartner merged commit 2acaa36 into cockroachdb:release-24.1 Apr 22, 2024
19 of 20 checks passed
@mgartner mgartner deleted the backport24.1-121973 branch April 22, 2024 20:22