Nested multi_field type wrapped by a custom type passing 'external values' doesn't get values passed #5402
Comments
dadoonet added a commit to dadoonet/elasticsearch that referenced this issue Jun 14, 2014
In the context of mapper attachment and other mapper plugins, when dealing with multi fields, sub fields never get the `externalValue` although it was set. This patch considers an external value to be set when `externalValue` is not `null`.

Here is a full script that reproduces the issue when used with the mapper attachment plugin:

```
DELETE /test

PUT /test
{
  "mappings": {
    "test": {
      "properties": {
        "f": {
          "type": "attachment",
          "fields": {
            "f": {
              "analyzer": "english",
              "fields": {
                "no_stemming": {
                  "type": "string",
                  "store": "yes",
                  "analyzer": "standard"
                }
              }
            }
          }
        }
      }
    }
  }
}

PUT /test/test/1
{
  "f": "VGhlIHF1aWNrIGJyb3duIGZveGVz"
}

GET /test/_search
{
  "query": { "match": { "f": "quick" } }
}

GET /test/_search
{
  "query": { "match": { "f.no_stemming": "quick" } }
}

GET /test/test/1?fields=f.no_stemming
```

Related to elastic/elasticsearch-mapper-attachments#57

Closes elastic#5402.
@dadoonet What's the status of this?
@clintongormley At this exact moment, we are talking about it with @jpountz :) We would like to have it in 1.3!
dadoonet added a commit to dadoonet/elasticsearch that referenced this issue Jul 25, 2014
In context of mapper attachment and other mapper plugins, when dealing with multi fields, sub fields never get the `externalValue` although it was set. Related to elastic/elasticsearch-mapper-attachments#57 Closes elastic#5402.
dadoonet added a commit that referenced this issue Jul 25, 2014
In context of mapper attachment and other mapper plugins, when dealing with multi fields, sub fields never get the `externalValue` although it was set. Related to elastic/elasticsearch-mapper-attachments#57 Closes #5402. (cherry picked from commit 11eced0)
This issue came up when I tried to combine the Attachment type (elasticsearch-mapper-attachments plugin) with a nested multi_field type. The intention behind that was to use the fulltext content extracted by the Attachment type for creating shingles and nGrams as well.
Prerequisites
Install the Attachment type plugin:
Index configuration
I've prepared a sample configuration for an elasticsearch index 'attachment' as listed below:
Adding document to the index
After adding a document, we see that the multi_field fields defined for shingles and nGrams do NOT receive any data.
Problem in elasticsearch code
I started investigating this issue because I found online discussions of similar problems, but no solution. Digging into the elasticsearch code, I found the cause and built a workaround inside the Attachment plugin's code for my local needs, which produced the desired result. From my point of view this should be fixed inside the elasticsearch code itself, and I will try to explain how.
If I understand it correctly, there are two ways the org.elasticsearch.index.mapper.ParseContext may provide values to any subclass of org.elasticsearch.index.mapper.core.AbstractFieldMapper for parsing:
This can be seen inside the org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateFieldForString(ParseContext, String, float) method, which tries to consume (emphasis intended) the 'external value' by calling org.elasticsearch.index.mapper.ParseContext.externalValue(). I use the term 'consume' because that is literally what happens: the call to the externalValue() method sets a boolean flag to 'false', and that same flag is what is used to check whether such an 'external value' exists.
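The consume-on-read behaviour described here can be modelled in a few lines. This is only a minimal sketch and not the actual elasticsearch code: `SimpleContext` is a hypothetical stand-in for `ParseContext`, `indexSubFields` is a made-up helper that plays the role of the multi_field mapper, and the sub-field names are chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ExternalValueDemo {

    /** Simplified stand-in for ParseContext (assumption, not the real class). */
    static class SimpleContext {
        private Object externalValue;
        private boolean externalValueSet;

        void externalValue(Object value) {
            this.externalValueSet = true;
            this.externalValue = value;
        }

        boolean externalValueSet() {
            return externalValueSet;
        }

        Object externalValue() {
            externalValueSet = false; // reading the value clears the flag
            return externalValue;
        }
    }

    /** Each sub-field takes the external value only if the flag still says it is set. */
    static List<String> indexSubFields(SimpleContext context, String... subFields) {
        List<String> indexed = new ArrayList<>();
        for (String name : subFields) {
            if (context.externalValueSet()) {
                indexed.add(name + "=" + context.externalValue());
            }
        }
        return indexed;
    }

    public static void main(String[] args) {
        SimpleContext context = new SimpleContext();
        context.externalValue("The quick brown foxes");
        // Only the first sub-field sees the value; the others find the flag already cleared.
        System.out.println(indexSubFields(context, "content", "content.shingles", "content.ngrams"));
    }
}
```

In this model the first sub-field clears the flag when it reads the value, so every later sub-field is skipped, which matches the empty shingle and nGram fields reported above.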
So, this is where the problem resides. I overcame it by wrapping the ParseContext passed to the Attachment mapper plugin in my own implementation, which simply delegates every call to the original context except for the org.elasticsearch.index.mapper.ParseContext.externalValueSet() method. I've overridden that method to not only check the boolean flag but also check that the 'external value' is not null. This ensures that all of the multi_field fields get the content while they are being processed.
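A rough sketch of that wrapper idea follows. It is not the code from the plugin workaround: `ExternalValueSource` is a hypothetical two-method view of the context, `ConsumingContext` imitates the flag-clearing behaviour, and the check is read here as "flag set, or value not null", which is an assumption about how the two checks are combined.

```java
public class NullCheckingContextDemo {

    /** Hypothetical minimal view of the context; the real ParseContext has many more methods. */
    interface ExternalValueSource {
        boolean externalValueSet();
        Object externalValue();
    }

    /** Imitates the original behaviour: reading the value clears the flag. */
    static class ConsumingContext implements ExternalValueSource {
        private final Object value;
        private boolean set = true;

        ConsumingContext(Object value) {
            this.value = value;
        }

        public boolean externalValueSet() {
            return set;
        }

        public Object externalValue() {
            set = false;
            return value;
        }
    }

    /** Delegating wrapper: a non-null value also counts as "set". */
    static class NullCheckingContext implements ExternalValueSource {
        private final ExternalValueSource delegate;

        NullCheckingContext(ExternalValueSource delegate) {
            this.delegate = delegate;
        }

        public boolean externalValueSet() {
            // Reading the delegate's value clears its flag but leaves the value itself in place.
            return delegate.externalValueSet() || delegate.externalValue() != null;
        }

        public Object externalValue() {
            return delegate.externalValue();
        }
    }

    public static void main(String[] args) {
        ExternalValueSource context = new NullCheckingContext(new ConsumingContext("The quick brown foxes"));
        for (int i = 0; i < 3; i++) {
            // Every iteration still sees the value, as each multi_field sub-field would.
            System.out.println(context.externalValueSet() + " " + context.externalValue());
        }
    }
}
```

Note that in this sketch the value stays visible for as long as it is non-null, so fields processed later with the same context would see it too. That is one possible side effect to weigh when choosing the real fix.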
Desired output after fixing the code
Just to make clear what the expected result should look like:
Hopefully my explanations are clear to you guys. As I am not totally sure how this should best be solved without causing side effects, I am relying on you to get this fixed.
Thanks in advance
Tom