Percolation does not seem to work fully on dynamically templated fields #5750

Closed
egueidan opened this issue Apr 9, 2014 · 4 comments

egueidan commented Apr 9, 2014

Hi all,

it looks like when using dynamic_templates with percolation, new fields introduced in the percolated document do not get properly picked up. Here is the test scenario:

Create an index with a type with all custom.* fields not analysed:

PUT /myindex
PUT /myindex/mytype/_mapping
{
    "mytype": {
        "dynamic" : false,
        "properties" : {
            "custom": {
                "dynamic": true, "type": "object", "include_in_all": false
            }
        },
        "dynamic_templates": [ {
            "custom_fields": {
                "path_match": "custom.*",
                "mapping": {
                    "index": "not_analyzed"
                }
            }
        }]
    }
}

Then register two queries, one matching docs with color:red and the other docs with color:blue:

PUT /myindex/.percolator/redperco
{
    "query": {
        "query_string": {
           "query": "color:red"
        }
    }
}

PUT /myindex/.percolator/blueperco
{
    "query": {
        "query_string": {
           "query": "color:blue"
        }
    }
}

Then percolate. Note that we did not index any document, so this is the first time the index sees the custom.color field.

POST /myindex/mytype/_percolate
{
    "doc" : {
        "custom": {
            "color": "blue"
        }
    }
}

We get no match, although the blueperco query should match.

Interesting notes:

  • if you use the fully qualified field name custom.color in the queries, it works
  • if you re-put the mapping and re-register the queries, it works
  • if you index the same document before registering the queries, it works

Tests run on OSX 10.9.2.
Elasticsearch Info:

{
   "status": 200,
   "name": "Silver Fox",
   "version": {
      "number": "1.1.0",
      "build_hash": "2181e113dea80b4a9e31e58e9686658a2d46e363",
      "build_timestamp": "2014-03-25T15:59:51Z",
      "build_snapshot": false,
      "lucene_version": "4.7"
   },
   "tagline": "You Know, for Search"
}

Thanks,
Emmanuel

@martijnvg martijnvg self-assigned this Apr 11, 2014
martijnvg (Member) commented

I think this doesn't have to do with dynamic templates, but with how queries resolve field names that are not an exact match.

At query parse time the field is resolved against the mapping of the index. If the field can't be found in the mapping under its exact name, smart name resolution kicks in: it tries to find the best field with the same suffix and uses that concrete field. Query parsing also happens when the percolator query is added, but at that point there is no field configured yet, so it falls back to using the name color. The document being percolated doesn't have this field and therefore doesn't match. However, percolating it does update the mapping by adding the custom.color field, and that is why the queries match that document once you update the mapping and reindex the queries.

I think this is expected behaviour, since that is just how the percolator works: the fields of the queries are resolved at query registration time. So one of the workarounds you mentioned should be used for fields to get resolved correctly: use full names, reindex the queries, or have the mapping configured before indexing the queries (either via indexing a percolator doc or adding a mapping).
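
To make the timing concrete, here is a toy Python model of the behaviour described above. It is only a sketch and not Elasticsearch code: flatten, resolve_field and ToyPercolator are hypothetical names invented for this illustration, and "same suffix" is simplified to "same last path segment".

```python
def flatten(doc, prefix=""):
    """Flatten a nested dict into {"a.b": value} form."""
    flat = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def resolve_field(name, mapped_fields):
    """Resolve a query field name against the fields known right now."""
    if name in mapped_fields:
        return name  # exact match in the mapping
    # Simplified "smart" resolution: a mapped field whose last segment matches.
    candidates = sorted(f for f in mapped_fields if f.split(".")[-1] == name)
    # With nothing mapped yet, fall back to the name as written.
    return candidates[0] if candidates else name


class ToyPercolator:
    def __init__(self):
        self.mapped_fields = set()  # stands in for the index mapping
        self.queries = {}           # query id -> (resolved field, value)

    def register(self, query_id, field, value):
        # The field is resolved once, at registration time.
        resolved = resolve_field(field, self.mapped_fields)
        self.queries[query_id] = (resolved, value)

    def percolate(self, doc):
        flat = flatten(doc)
        matches = sorted(
            qid for qid, (field, value) in self.queries.items()
            if flat.get(field) == value
        )
        # Percolating introduces the new fields into the mapping,
        # but already registered queries are not re-resolved.
        self.mapped_fields.update(flat)
        return matches


doc = {"custom": {"color": "blue"}}
perco = ToyPercolator()
perco.register("redperco", "color", "red")
perco.register("blueperco", "color", "blue")
first = perco.percolate(doc)                  # queries still point at "color"
perco.register("blueperco", "color", "blue")  # re-register after the mapping update
second = perco.percolate(doc)
print(first, second)
```

In this model the first percolation matches nothing and the second one matches blueperco, which mirrors the "re-register the queries" workaround. Registering with the full name custom.color would match on the first try, because no resolution is needed.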

Actually just registering the percolator queries should have been enough in order for the queries to match with the document being percolated. The mapping does get modified, but not properly propagated into the cluster state. This is an issue that needs to be addressed.

egueidan (Author) commented

Hi Martijn, thanks for taking the time!
Your analysis makes a lot of sense. I wasn't aware the percolator queries would not react to mapping changes. Two points:

  1. Do you want me to create a separate (more precise) issue concerning the mapping propagation issue?
  2. Don't you think that making the percolator queries react to mapping changes would be useful? It seems to me that it would be well aligned with the dynamic nature of ES. It would basically mean that you can register your interest in something you haven't seen yet and pick it up as soon as it appears while using dynamic mappings. In some cases, as a developer, you don't know exactly what'll be in the data (hence you use dynamic mappings/dynamic templates) and you don't know exactly what'll be in the query (user input). To support those cases, we would need to re-register the queries, but we are left with the question: when? How can I detect that the mapping has changed and that the queries should be re-registered? If that makes sense, shall I create an enhancement request?

Cheers,
Emmanuel

martijnvg (Member) commented

Hi Emmanuel,

  1. The mapping propagation issue should be fixed soon: Percolation does not seem to work fully on dynamically templated fields #5750
  2. That idea absolutely makes sense. However, I see no other way than reindexing the percolator query (or at least reloading it). What happens when a query gets added is that the query part of the document gets parsed into a Lucene query, and there isn't an easy way to just change a field in the Lucene query because of a mapping update. Also, if there were such a mechanism, it could be confusing too. Maybe there are mapping updates that don't make sense for your percolator queries?

I think it is better to approach this problem from a different perspective. You can configure copy_to fields (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#copy-to) in your mapping, which copy the values of fields matching a specific expression into a field you know about. You can then use this field in your percolator queries, since you can rely on the fact that it exists. The _all field can also help here, but copy_to gives you more control.
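
As a rough illustration of the copy_to idea, the Python sketch below builds request bodies that extend the mapping from the original report. Several things here are assumptions and not something confirmed in this thread: the target field name custom_values is hypothetical, placing copy_to inside the dynamic template's mapping is untested against 1.1.0, and the target is mapped explicitly because the type has "dynamic": false at the root.

```python
import json


def mapping_with_copy_to(target="custom_values"):
    """Body for PUT /myindex/mytype/_mapping (hypothetical variant of the report's mapping)."""
    return {
        "mytype": {
            "dynamic": False,
            "properties": {
                "custom": {
                    "dynamic": True, "type": "object", "include_in_all": False
                },
                # Known field that percolator queries can rely on (assumed name).
                target: {"type": "string", "index": "not_analyzed"},
            },
            "dynamic_templates": [{
                "custom_fields": {
                    "path_match": "custom.*",
                    "mapping": {
                        "index": "not_analyzed",
                        # Assumption: copy every custom.* value into the known field.
                        "copy_to": target,
                    },
                }
            }],
        }
    }


def percolator_query(value, target="custom_values"):
    """Body for PUT /myindex/.percolator/<id>, querying the known field only."""
    return {"query": {"term": {target: value}}}


print(json.dumps(mapping_with_copy_to(), indent=2))
print(json.dumps(percolator_query("blue"), indent=2))
```

The trade-off in this sketch is that all custom.* values land in one field, so a query can no longer tell custom.color apart from another custom field holding the same value.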

egueidan (Author) commented

Martijn, thanks a lot for the quick fix on #5776. As for the automatic reloading of the percolator query on mapping updates, I understand that it's not trivial to do. As far as we are concerned, we can live with using the fully qualified names. Using the copy_to field to dynamically alias unknown fields to known ones is also an interesting idea.
Thanks a lot for your help!

mikebaumann pushed a commit to HumanCellAtlas/data-store that referenced this issue Nov 29, 2017
Enable subscription queries for percolation to be added to (schema specific) document
indices even when the index does not contain mappings for some of the fields
referenced by the query. This is useful in the case where the query includes
fields that have changed between metadata data schema versions, such as when the
'species' field changed from type text to type object representing an ontology.

This also helps resolve internal constraints of needing to have the mappings defined
in an index before a subscription query for percolation can be added to the index.

This is accomplished by setting index.percolator.map_unmapped_fields_as_string to true,
which also works in the case of other field types (e.g. numeric, date).
Note that when dynamic templates are used and queries for percolation have been added
to an index before the index contains mappings of fields referenced by those queries,
the queries must be reloaded when the mappings are present for the queries to match.
For more information, see: elastic/elasticsearch#5750
mikebaumann pushed a commit to HumanCellAtlas/data-store that referenced this issue Dec 1, 2017
Ensure subscription queries are added to indices, and do so in a way
that allows queries including fields not defined in the index mappings.
This enables subscription of queries that work across metadata schema versions,
including fields that are not present in a schema version or have
changed between versions, such as a field changing from type text to type object.
This also resolves internal constraints of needing to have the mappings defined
in an index before a subscription query for percolation can be added to the index.

This is accomplished by setting index.percolator.map_unmapped_fields_as_string to true,
which also works in the case of other field types (e.g. numeric, date).
Note when dynamic templates are used and queries for percolation have been added
to an index before the index contains mappings of fields referenced by those queries,
the queries must be reloaded when the mappings change for the queries to match.
For more information, see: elastic/elasticsearch#5750