Postings highlighter doesn't return snippets when using "path":"just_name" #4116

Closed
javanna opened this issue Nov 7, 2013 · 4 comments

Comments

@javanna
Member

javanna commented Nov 7, 2013

The postings highlighter doesn't return snippets when the field name in the index differs from the one specified in the search request. This happens either when using "path":"just_name" or when using a custom index_name in the mapping.

@ghost ghost assigned javanna Nov 7, 2013
@javanna javanna closed this as completed in cdbd791 Nov 7, 2013
javanna added a commit that referenced this issue Nov 7, 2013
…s highlighter

Previously the field name specified in the search request was used, which isn't correct in case a custom index_name has been used for a field or the "path":"just_name" has been used in the mapping.

 Closes #4116
@roytmana

roytmana commented Nov 7, 2013

Hi @javanna, I just checked it out and made a build. More records are highlighted now, but not all of them.
Also, when I search on some words, a bunch of fields that have nothing to do with the search word get highlighted from the start (the word does occur at the start of the legitimate field). The same happens with another word that occurs in the middle of a field: a bunch of fields get highlighted at that same position.

I think the problem is that there are very many fields named "name" in various mappings (for example award.type.name), and they all get lumped together somehow!

"award": {
  ....
  "type": {
    "dynamic": "true",
    "properties": {
      "id": {
        "type": "integer",
        "include_in_all": false
      },
      "name": {
        "type": "multi_field",
        "path": "just_name",
        "fields": {
          "name": {
            "type": "string",
            "index_options": "offsets",
            "boost": 0.3
          },
          "all": {
            "type": "string",
            "index_options": "offsets",
            "boost": 0.01
          },
          "all_stem": {
            "type": "string",
            "index_options": "offsets",
            "boost": 0.01
          }
        }
      }
    }
  }

@javanna
Member Author

javanna commented Nov 7, 2013

I see what you mean. I think it happens when "path":"just_name" is used for different fields that have the same name but different paths. That merges different JSON fields into the same Lucene field, but the content that is loaded from the _source is only part of it, so the offsets don't match the field content.

I think the same would happen with the fast vector highlighter too, while with the plain highlighter it doesn't happen because the text is being re-analyzed on-the-fly.
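
To illustrate the mismatch, here is a minimal Python sketch. It is a simplified model, not Lucene or Elasticsearch code: the whitespace-style tokenizer, the one-character gap between values, the helper names, and the second path `award.status.name` are all assumptions made up for the example.

```python
import re


def tokenize_with_offsets(text, base=0):
    """Return (term, start, end) tuples, with offsets shifted by `base`."""
    return [(m.group().lower(), base + m.start(), base + m.end())
            for m in re.finditer(r"\w+", text)]


def index_merged(values):
    """Model one index field receiving several values.

    Offsets keep growing across values (assumed gap of one character),
    the way a merged "name" field would see content from several paths.
    """
    postings = []
    base = 0
    for value in values:
        postings.extend(tokenize_with_offsets(value, base))
        base += len(value) + 1
    return postings


def highlight_from_offsets(postings, term, content):
    """Offset-based highlighting: trust the stored offsets blindly."""
    return [content[start:end] for t, start, end in postings if t == term]


def highlight_by_reanalysis(term, content):
    """Plain-highlighter style: re-tokenize the content that was loaded."""
    return [content[start:end]
            for t, start, end in tokenize_with_offsets(content)
            if t == term]


# Two JSON fields that "just_name" would merge into one index field "name".
type_name = "Research grant"    # award.type.name
status_name = "Active grant"    # award.status.name (hypothetical)
postings = index_merged([type_name, status_name])

# Only one path's value is loaded from _source, but the offsets describe
# the merged field, so the start of an unrelated value gets highlighted.
wrong = highlight_from_offsets(postings, "research", status_name)
right = highlight_by_reanalysis("research", status_name)
```

In this model `wrong` picks up the first characters of `status_name` even though the term never occurs there, while `right` is empty because re-analysis only sees the content actually loaded.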

Can you open another issue with a small recreation for it?

@roytmana

roytmana commented Nov 7, 2013

Yes, I can, but I thought that the primary field in a multi_field mapping gets resolved to its full name regardless of the "just_name" option. If so, why would there be any name clash?

BTW I posted another issue #4108 which seems to be looking into. With all the traffic it is easy to miss, but it is rather critical for us. Could you possibly tell me whether it is a fixable bug, a feasible enhancement, or something that can't be achieved?

@javanna
Member Author

javanna commented Nov 7, 2013

@roytmana the fact that a field gets resolved doesn't say much about how it is actually indexed. When you use "just_name" for different fields with the same name but different paths, they get merged into the same Lucene field. E.g. obj1.name and obj2.name both get indexed into name. If you can open the issue and send a full recreation, we'll take it from there.
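
As a rough sketch of that merging (a simplified model of the naming rule only, not Elasticsearch's actual resolution code; the function names and the `obj1.id` path are hypothetical):

```python
from collections import defaultdict


def index_field_name(full_path, just_name):
    """Name the value is indexed under: last path segment with just_name,
    the full dotted path otherwise (simplified assumption)."""
    return full_path.rsplit(".", 1)[-1] if just_name else full_path


def find_collisions(paths, just_name=True):
    """Group full paths by index name and keep the names shared by several."""
    by_index_name = defaultdict(list)
    for path in paths:
        by_index_name[index_field_name(path, just_name)].append(path)
    return {name: ps for name, ps in by_index_name.items() if len(ps) > 1}


paths = ["obj1.name", "obj2.name", "obj1.id"]
merged = find_collisions(paths)                      # with "just_name"
separate = find_collisions(paths, just_name=False)   # with full paths
```

With "just_name", `merged` reports that obj1.name and obj2.name share the index name `name`; with full paths, `separate` is empty.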

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
…s highlighter

Previously the field name specified in the search request was used, which isn't correct in case a custom index_name has been used for a field or the "path":"just_name" has been used in the mapping.

 Closes elastic#4116