@Mpdreamz Mpdreamz commented Dec 3, 2025

Renames `config/synonyms.yml` to `config/search.yml` and expands it to be the central place for all search-related configuration.

This consolidates all search configuration into a single `search.yml` file - synonyms, negative boosting terms, and a new query rules system. Having everything in one place makes it easier to reason about and tune search behavior.

The big addition here is **query rules** support. Sometimes our relevance scoring is *almost* right but not quite. For example, a search for "logstash" ranks the Logstash extensions page above the main Logstash reference (both are equally valid from a scoring perspective, but users expect the reference page first). Rather than constantly tweaking boost values and hoping for the best, we can now explicitly pin specific results for specific queries.

We're also moving the `diminish_terms` config (the terms that get a negative boost, such as "plugin", "client", and "glossary") out of the code and into the config file where it belongs.
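
As a rough illustration of what a negative boost for diminish terms can look like, the sketch below builds an Elasticsearch `boosting` query body as a plain Python dict. The helper name `build_lexical_query`, the field names (`title`, `body`), and the `negative_boost` value are assumptions for illustration only, not the query this PR ships.

```python
import json

# Terms quoted in the PR description; in the PR they live in config/search.yml.
DIMINISH_TERMS = ["plugin", "client", "glossary"]


def build_lexical_query(user_query, diminish_terms, negative_boost=0.5):
    """Return a `boosting` query body that demotes diminish-term matches.

    Hypothetical helper: field names and the boost value are illustrative.
    """
    return {
        "boosting": {
            # Documents must match the user's query to be returned at all.
            "positive": {
                "multi_match": {"query": user_query, "fields": ["title", "body"]}
            },
            # Documents that also match here stay in the results, but their
            # score is multiplied by negative_boost (a value between 0 and 1).
            "negative": {"match": {"title": " ".join(diminish_terms)}},
            "negative_boost": negative_boost,
        }
    }


query = build_lexical_query("logstash", DIMINISH_TERMS)
print(json.dumps(query, indent=2))
```

Demoted pages are not filtered out. They only rank lower than comparable pages that do not mention a diminish term.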

- **Synonyms**: Same as before, just in a new home
- **Rules**: Published to Elasticsearch via the Query Rules API (`PUT /_query_rules/docs-ruleset-{env}`) during indexing, then applied at search time via `RuleQuery`
- **Diminish terms**: Applied as a negative boost in the lexical query
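
To make the publish-then-apply flow for rules more concrete, here is a minimal sketch of the two JSON bodies involved, built as Python dicts: a ruleset body for `PUT /_query_rules/docs-ruleset-{env}` and a `rule` query that wraps the organic query at search time. The input rule shape, the document id, the `dev` environment name, and both helper names are hypothetical. The `query_string` metadata key is an assumption based on the description saying rules match on the query string, and the `rule` / `ruleset_ids` spelling follows recent Elasticsearch versions.

```python
import json


def build_ruleset(rules):
    """Build a request body for PUT /_query_rules/<ruleset_id>."""
    body = {"rules": []}
    for rule in rules:
        body["rules"].append(
            {
                "rule_id": rule["id"],
                "type": rule["action"],  # "pinned" or "exclude"
                "criteria": [
                    {
                        "type": rule["match"],  # "exact", "fuzzy", or "prefix"
                        "metadata": "query_string",  # assumed metadata key
                        "values": rule["queries"],
                    }
                ],
                "actions": {"ids": rule["ids"]},
            }
        )
    return body


def build_rule_query(organic_query, ruleset_id, user_query):
    """Wrap the organic query so matching rules are applied at search time."""
    return {
        "rule": {
            "organic": organic_query,
            "ruleset_ids": [ruleset_id],
            "match_criteria": {"query_string": user_query},
        }
    }


# Hypothetical rule and document id, for illustration only.
rules = [
    {
        "id": "pin-logstash-reference",
        "match": "exact",
        "action": "pinned",
        "queries": ["logstash"],
        "ids": ["example-logstash-reference-doc-id"],
    }
]

ruleset = build_ruleset(rules)
rule_query = build_rule_query(
    {"match": {"title": "logstash"}}, "docs-ruleset-dev", "logstash"
)
print(json.dumps(ruleset, indent=2))
print(json.dumps(rule_query, indent=2))
```

Keeping the ruleset in Elasticsearch means the search path only has to pass the user's query string as match criteria. The rules themselves are refreshed whenever indexing republishes them.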

The rules support `exact`, `fuzzy`, and `prefix` matching on the query string, with `pinned` or `exclude` actions. We're starting conservatively with just two pinned rules for data-streams and logstash - the goal is to use these sparingly when relevance tuning alone can't solve the problem.
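
The intended effect of the match types and actions can be sketched locally without Elasticsearch. The snippet below is a simplified model, not the actual implementation: the `Rule` shape and document ids are hypothetical, and fuzzy matching is approximated with `difflib` similarity rather than whatever Elasticsearch uses internally.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Rule:
    match: str     # "exact", "fuzzy", or "prefix"
    queries: list  # query strings the rule applies to
    action: str    # "pinned" or "exclude"
    ids: list      # document ids the action targets


def rule_matches(rule, user_query, fuzzy_threshold=0.8):
    """Return True if the user's query triggers the rule."""
    q = user_query.strip().lower()
    for candidate in rule.queries:
        c = candidate.lower()
        if rule.match == "exact" and q == c:
            return True
        # Prefix: the user's query begins with the configured value.
        if rule.match == "prefix" and q.startswith(c):
            return True
        # Fuzzy: approximated here with a similarity ratio (an assumption).
        if rule.match == "fuzzy":
            if SequenceMatcher(None, q, c).ratio() >= fuzzy_threshold:
                return True
    return False


def apply_rules(rules, user_query, organic_ids):
    """Move pinned ids to the front and drop excluded ids."""
    pinned, excluded = [], set()
    for rule in rules:
        if not rule_matches(rule, user_query):
            continue
        if rule.action == "pinned":
            pinned.extend(i for i in rule.ids if i not in pinned)
        elif rule.action == "exclude":
            excluded.update(rule.ids)
    rest = [i for i in organic_ids if i not in pinned]
    return [i for i in pinned + rest if i not in excluded]


# Hypothetical ids standing in for real documents.
rules = [Rule("exact", ["logstash"], "pinned", ["logstash-reference"])]
organic = ["logstash-extensions", "logstash-reference", "beats"]

print(apply_rules(rules, "logstash", organic))
# ['logstash-reference', 'logstash-extensions', 'beats']
print(apply_rules(rules, "kibana", organic))
# ['logstash-extensions', 'logstash-reference', 'beats']
```

Queries that trigger no rule pass through with their organic ranking untouched, which is why a small number of targeted rules can coexist with ordinary relevance tuning.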
@Mpdreamz Mpdreamz requested review from a team as code owners December 3, 2025 11:02
@Mpdreamz Mpdreamz requested a review from cotti December 3, 2025 11:02
@Mpdreamz Mpdreamz added the fix label Dec 3, 2025
@Mpdreamz Mpdreamz self-assigned this Dec 3, 2025
@Mpdreamz Mpdreamz changed the title from "feature/search config" to "Introduce config/search.yml and ensure all relevance tests pass." Dec 3, 2025
@Mpdreamz Mpdreamz merged commit f0d95c1 into main Dec 3, 2025
28 checks passed
@Mpdreamz Mpdreamz deleted the feature/search-config branch December 3, 2025 11:29