feat(es-compat): support regexp shorthand, expose concatenate fields, map text to keyword in _mapping #6208
Merged
congx4 merged 4 commits into quickwit-oss:main on Mar 19, 2026
…fields, and map text to keyword in _mapping
Elasticsearch's `regexp` query accepts two formats:
- Shorthand: `{"regexp": {"field": "pattern"}}`
- Full: `{"regexp": {"field": {"value": "pattern", "case_insensitive": true}}}`
Quickwit only supported the full form, causing queries from ES-compatible
connectors (e.g. Trino ES connector) to fail with a deserialization error.
This adds support for the shorthand format via `#[serde(untagged)]` enum
deserialization.
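To illustrate the normalization that an untagged enum performs, here is a dependency-free sketch. It is not the code from this PR: the `Json` type is a hand-rolled stand-in for a parsed JSON value, and `RegexParams` and `parse_regex_params` are hypothetical names. The sketch assumes `case_insensitive` defaults to `false` when the shorthand form is used.

```rust
/// Hand-rolled stand-in for a parsed JSON value (hypothetical; the PR relies on serde).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Bool(bool),
    Obj(Vec<(String, Json)>),
}

/// Normalized parameters, regardless of which form the client sent.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RegexParams {
    value: String,
    case_insensitive: bool,
}

/// Accepts either the shorthand body (a bare string) or the full body
/// (an object carrying `value` and, optionally, `case_insensitive`).
fn parse_regex_params(body: &Json) -> Result<RegexParams, String> {
    match body {
        // Shorthand: {"regexp": {"field": "pattern"}}
        Json::Str(pattern) => Ok(RegexParams {
            value: pattern.clone(),
            // Assumption: same default as when the key is omitted in the full form.
            case_insensitive: false,
        }),
        // Full: {"regexp": {"field": {"value": "pattern", "case_insensitive": true}}}
        Json::Obj(entries) => {
            let mut value = None;
            let mut case_insensitive = false;
            for (key, val) in entries {
                match (key.as_str(), val) {
                    ("value", Json::Str(p)) => value = Some(p.clone()),
                    ("case_insensitive", Json::Bool(b)) => case_insensitive = *b,
                    _ => return Err(format!("unexpected regexp parameter `{}`", key)),
                }
            }
            value
                .map(|value| RegexParams { value, case_insensitive })
                .ok_or_else(|| "missing `value` in regexp query".to_string())
        }
        Json::Bool(_) => Err("regexp parameters must be a string or an object".to_string()),
    }
}
```

With this shape, `parse_regex_params(&Json::Str("log.*".to_string()))` and the equivalent full-form object yield the same `RegexParams`, which is the property the untagged enum gives the real deserializer.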
Additionally, in the `_mapping` endpoint:
- `Text` fields are now reported as `keyword` type. This enables filter
pushdown (e.g. `LIKE` predicates) from connectors that only push down
filters for `keyword`-typed fields.
- `Concatenate` fields are now exposed as `keyword` type instead of being
hidden. This allows connectors to discover and query these fields.
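The two `_mapping` changes above amount to a small type-translation table. A minimal sketch, assuming a simplified `FieldKind` enum and an `es_mapping_type` helper (both hypothetical, not Quickwit's actual types); the `I64` and `Bool` arms are included only for contrast and are assumptions about the rest of the table:

```rust
/// Hypothetical, simplified subset of field kinds for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FieldKind {
    Text,
    Concatenate,
    I64,
    Bool,
}

/// Returns the Elasticsearch type name reported for a field in `_mapping`.
fn es_mapping_type(kind: FieldKind) -> &'static str {
    match kind {
        // Reported as `keyword` (not `text`) so that connectors which only
        // push down filters on keyword-typed fields can push down `LIKE`.
        FieldKind::Text => "keyword",
        // Previously hidden; now exposed so connectors can discover the field.
        FieldKind::Concatenate => "keyword",
        // Assumed arms, shown for contrast.
        FieldKind::I64 => "long",
        FieldKind::Bool => "boolean",
    }
}

/// Builds `(field name, ES type)` pairs for a list of fields.
fn mapping_entries(fields: &[(&str, FieldKind)]) -> Vec<(String, &'static str)> {
    fields
        .iter()
        .map(|(name, kind)| (name.to_string(), es_mapping_type(*kind)))
        .collect()
}
```

Reporting `keyword` for a tokenized field is a deliberate trade-off here: the mapping describes the field less precisely in exchange for connectors being willing to push filters down.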
Made-with: Cursor
… lint
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
guilload
reviewed
Mar 19, 2026
/// - Full: `{"regexp": {"field": {"value": "pattern", "case_insensitive": true}}}`
#[derive(Deserialize, Debug, Eq, PartialEq, Clone)]
#[serde(untagged)]
enum RegexQueryParamsInner {
Member
Can't we implement this directly on RegexQueryParams?
guilload
approved these changes
Mar 19, 2026
…d keyword comment
- Replace inner enum + serde(from) with a custom Deserialize impl directly on RegexQueryParams, as suggested by reviewer
- Add comment explaining why text fields are mapped to keyword in the ES-compat _mapping response
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the custom Deserialize visitor with a simple #[serde(untagged)] enum that handles both shorthand and full regexp query formats directly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
This PR improves ES compatibility in three areas:
- Support shorthand `regexp` query format: Elasticsearch accepts both `{"regexp": {"field": "pattern"}}` (shorthand) and `{"regexp": {"field": {"value": "pattern"}}}` (full). Quickwit only supported the full form, causing connectors like the Trino ES connector to fail with a deserialization error when pushing down `LIKE` predicates (which get translated to `regexp` queries).
- Map `Text` fields to `keyword` in `_mapping` response: The Trino ES connector (and potentially other connectors) only pushes down `LIKE` and filter predicates for `keyword`-typed fields. By reporting Quickwit `text` fields as ES `keyword` in the mapping response, we enable filter pushdown for string fields.
- Expose `Concatenate` fields in `_mapping` endpoint: Concatenate fields (e.g. an `all` field that combines multiple source fields) were previously hidden from the `_mapping` response. This change exposes them as `keyword` type so downstream connectors can discover and query them.

Test plan
- `SELECT ... WHERE service LIKE '%logs%'` now works

Made with Cursor
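For readers unfamiliar with how a `LIKE` predicate can end up as a `regexp` query, here is a rough sketch of such a translation. It is an illustration only, not the Trino connector's actual code: `like_to_regexp` is a hypothetical helper, it ignores the SQL `ESCAPE` clause, and the set of characters that need escaping is an assumption that depends on the regex engine on the receiving side.

```rust
/// Translates a SQL `LIKE` pattern into an anchored-style regex pattern.
/// `%` matches any run of characters and `_` matches exactly one character.
/// Hypothetical helper: no support for the `ESCAPE` clause.
fn like_to_regexp(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len() * 2);
    for ch in pattern.chars() {
        match ch {
            '%' => out.push_str(".*"),
            '_' => out.push('.'),
            // Assumed set of regex metacharacters that must be matched literally.
            '.' | '?' | '+' | '*' | '|' | '{' | '}' | '[' | ']' | '(' | ')' | '"' | '\\'
            | '#' | '@' | '&' | '<' | '>' | '~' => {
                out.push('\\');
                out.push(ch);
            }
            _ => out.push(ch),
        }
    }
    out
}
```

Under these assumptions, `LIKE '%logs%'` from the test plan would become the shorthand query `{"regexp": {"service": ".*logs.*"}}`.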