The default_search analyzer overrides the field level analyzer for search #73333

Closed
rSinha-Sfmc opened this issue May 24, 2021 · 4 comments · Fixed by #73359
Labels
>bug :Search/Analysis How text is split into tokens Team:Search Meta label for search team

Comments

@rSinha-Sfmc

rSinha-Sfmc commented May 24, 2021

Elasticsearch version (bin/elasticsearch --version): ES 7.12.1

Plugins installed: []

JVM version (java -version):

OS version (uname -a if on a Unix-like system):

Updated description of the problem including expected versus actual behavior:

The analyzer specified on a field has historically been used as both the index analyzer and the search analyzer. Something changed recently: if a default_search analyzer is configured in the index settings, it now takes over as the field's search analyzer, even when the field defines its own analyzer. Steps to reproduce:

DELETE test

PUT test
{
  "mappings": {
    "properties": {
      "test": {
        "type": "text",
        "analyzer": "simple"
      }
    }
  },
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default_search": {
            "filter": [
              "lowercase"
            ],
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      }
    }
  }
}

GET test/_analyze
{
  "text": ["123abc"],
  "field": "test"
}

POST test/_validate/query?explain
{
  "query": {
    "match": {
      "test": "123abc"
    }
  }
}

Original description of the problem including expected versus actual behavior:
Ideally, the analyzer used at index time should also be used at search time unless it is explicitly overridden. ES 7.12.1 does not follow this behavior: at search time, the standard analyzer is selected by default.

Here is the discussion link for this - https://discuss.elastic.co/t/es-7-12-different-analyzers-are-getting-used-for-indexing-and-searching/272186

Steps to reproduce: This discussion has details on how to reproduce this bug https://discuss.elastic.co/t/es-7-12-different-analyzers-are-getting-used-for-indexing-and-searching/272186

The bug only presents itself on versions 7.9 and above.


This discussion link has complete steps to reproduce the bug - https://discuss.elastic.co/t/es-7-12-different-analyzers-are-getting-used-for-indexing-and-searching/272186

@rSinha-Sfmc rSinha-Sfmc added >bug needs:triage Requires assignment of a team area label labels May 24, 2021
@imotov imotov changed the title Different analyzers are getting used for indexing and searching The default_search analyzer overrides the field level analyzer for search May 24, 2021
@imotov imotov added the :Search/Analysis How text is split into tokens label May 24, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label May 24, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@romseygeek
Contributor

Thanks for opening this @rSinha-Sfmc - it is indeed a bug. I've opened #73359 to fix it.

@romseygeek
Contributor

I should add: you can work around the bug by specifying the search_analyzer directly in the mapping configuration.
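
A minimal sketch of that workaround, reusing the reproduction from the issue description. The only change is an explicit search_analyzer on the field; setting it to simple (the same value as the field's index analyzer) is an assumption about what the field should use at search time, not something stated in the thread.

```
DELETE test

PUT test
{
  "mappings": {
    "properties": {
      "test": {
        "type": "text",
        "analyzer": "simple",
        "search_analyzer": "simple"
      }
    }
  },
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default_search": {
            "filter": [
              "lowercase"
            ],
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      }
    }
  }
}
```

Re-running the POST test/_validate/query?explain request from the reproduction against this index is one way to check which analyzer the match query ends up using.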

@rSinha-Sfmc
Author

Our schema is not well defined, and some of these fields are indexed dynamically, where we have no control over the mapping.

@markharwood markharwood removed the needs:triage Requires assignment of a team area label label May 26, 2021
romseygeek added a commit that referenced this issue May 26, 2021
…ult (#73359)

When a search or search_quote analyzer on a text mapper is not defined,
we fall back to a configured default search/search_quote analyzer if it
exists. However, if an index analyzer has been configured on the mapper
then we should first fall back to that.

Fixes #73333