_all field's analyzer is not used #11851

Closed
nharraud opened this issue Jun 24, 2015 · 4 comments

Comments

@nharraud

I just tried setting an analyzer for the _all field but it does not work. The field is still analyzed with the standard analyzer.

Here is the scenario:

DELETE myindex

POST myindex
{
  "settings": {
    "analysis": {
        "filter": {
          "title-synonym": {
            "type" : "synonym",
            "synonyms" : [
                "universe, cosmos"
            ]
          }
        },
        "analyzer": {
          "all-field-analyzer": {
            "tokenizer" : "whitespace",
            "filter" : ["title-synonym"]
          }
        }
    }
  },
  "mappings": {
    "record": {
        "properties": {
          "_all": {
            "type": "string",
            "analyzer": "all-field-analyzer"
          },
          "title": {
            "type": "string",
            "analyzer": "all-field-analyzer"
          }
        }
    }
  }
}

PUT myindex/record/1
{
  "title": "my big universe"
}

GET myindex/_search
{
  "query": {
    "match": {
      "title": "cosmos"
    }
  }
}

=> OK returns doc 1 thanks to synonyms.

GET myindex/_search
{
  "query": {
    "match": {
      "title": "Universe"
    }
  }
}

=> OK returns no doc because of the whitespace tokenizer

GET myindex/_search
{
  "query": {
    "match": {
      "_all": "cosmos"
    }
  }
}

=> KO returns no doc. The synonyms are not used.

GET myindex/_search
{
  "query": {
    "match": {
      "_all": "Universe"
    }
  }
}

=> KO returns doc 1 even though it should not match because of the character case.

Maybe I am doing something wrong. The documentation says that an analyzer can be specified for the _all field: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html

The _all fields allows for store, term_vector and analyzer (with specific index_analyzer and search_analyzer) to be set.

I am using the elasticsearch 1.6 container from docker.io

@s1monw
Contributor

s1monw commented Jun 24, 2015

The whitespace tokenizer doesn't lowercase, so I think you also want a lowercase filter before the synonym filter to fix that. Please use the mailing list instead for questions like this.
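
For illustration, the suggestion above could look roughly like the analysis settings below. This is only a sketch that reuses the filter and analyzer names from the original report and adds the built-in `lowercase` token filter ahead of the synonym filter. It has not been run against Elasticsearch 1.6.

```
# Sketch only: same analysis settings as in the report, with "lowercase" added
# before the synonym filter so that "Universe" and "universe" yield the same token.
POST myindex
{
  "settings": {
    "analysis": {
      "filter": {
        "title-synonym": {
          "type": "synonym",
          "synonyms": [
            "universe, cosmos"
          ]
        }
      },
      "analyzer": {
        "all-field-analyzer": {
          "tokenizer": "whitespace",
          "filter": ["lowercase", "title-synonym"]
        }
      }
    }
  }
}
```

Filter order matters here: tokens are lowercased first, so the synonym rule written in lowercase can still apply to mixed-case input.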

@s1monw s1monw closed this as completed Jun 24, 2015
@johtani
Contributor

johtani commented Jun 24, 2015

Hi @nharraud ,

please ask questions like these on the mailing list instead.

https://discuss.elastic.co

Also, the _all section should go directly under record in the mappings section, before properties rather than inside it. Your mapping puts it in the wrong place.
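
As a sketch of the placement described above, the index creation request from the report might be rewritten as follows, with `_all` moved out of `properties` and up to the type level. Everything else is copied from the original report. The `type` entry is dropped from `_all` on the assumption that the metadata field does not need it. This has not been verified against 1.6.

```
# Sketch only: "_all" is configured as a type-level metadata field under "record",
# not as a regular field inside "properties".
POST myindex
{
  "settings": {
    "analysis": {
      "filter": {
        "title-synonym": {
          "type": "synonym",
          "synonyms": [
            "universe, cosmos"
          ]
        }
      },
      "analyzer": {
        "all-field-analyzer": {
          "tokenizer": "whitespace",
          "filter": ["title-synonym"]
        }
      }
    }
  },
  "mappings": {
    "record": {
      "_all": {
        "analyzer": "all-field-analyzer"
      },
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "all-field-analyzer"
        }
      }
    }
  }
}
```

With this placement, the two `_all` queries from the report would be expected to behave like the `title` queries: `cosmos` should match through the synonym and `Universe` should not match because nothing lowercases it.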

@nharraud
Author

Thank you @johtani. I know about the mailing list, but I just didn't see my mistake and, after trying for some time, assumed it was a bug. My bad.
@s1monw The idea was not to lowercase with whitespace tokenizer but to create a scenario reproducing the invalid "bug". But thank you.

However it seems to me that this shows another issue. Maybe creating a custom _all field without explicitly disabling the default _all field should return an error? Especially if the new field cannot be accessed.
Once I disable the default _all field with "_all": { "enabled": false }, I get the expected behavior, i.e. no result for the last two queries.
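
For reference, a minimal sketch of where that setting would sit in the mapping, assuming it goes at the type level under `record`. The `title` field is reduced to a plain string here so the request stands on its own. How a field named `_all` inside `properties` behaves alongside this setting is as reported above and is not verified here.

```
# Sketch only: disables the default _all metadata field for the "record" type.
POST myindex
{
  "mappings": {
    "record": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "title": {
          "type": "string"
        }
      }
    }
  }
}
```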

@clintongormley

Relates to #10456
