Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not all meta fields in _source print a nice error #24422

Closed
narph opened this issue May 1, 2017 · 27 comments · Fixed by #57378
Closed

Not all meta fields in _source print a nice error #24422

narph opened this issue May 1, 2017 · 27 comments · Fixed by #57378
Assignees
Labels
>non-issue :Search/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team

Comments

@narph
Copy link
Contributor

narph commented May 1, 2017

Given this example:

POST /_bulk
{"index":{"_index":"releaseindex","_type":"rootobject","_id":"1"}}
{"product": [{"_version": 14}],"seo": {"section": [{"_version": 1}]},"_version": 1}
{"index":{"_index":"releaseindex","_type":"rootobject","_id":"2"}}
{"seo": {"section": [{"_version": 1}]},"_version": 1}
{"index":{"_index":"releaseindex","_type":"rootobject","_id":"3"}}
{"_version": 1}
{"index":{"_index":"releaseindex","_type":"rootobject","_id":"4"}}
{"product": [{"_version": 14}],"seo": {"section": [{"_version": 1}]}}
{"index":{"_index":"releaseindex","_type":"rootobject","_id":"5"}}
{"_field_names": 1}
{"index":{"_index":"releaseindex","_type":"rootobject","_id":"5"}}
{"_index": 1}

We get different messages for adding meta fields in the root of _source e.g look at _index vs _field_names

{
      "index": {
        "_index": "releaseindex",
        "_type": "rootobject",
        "_id": "5",
        "status": 400,
        "error": {
          "type": "illegal_argument_exception",
          "reason": "Field [_field_names] is defined twice in [rootobject]"
        }
      }
    },
    {
      "index": {
        "_index": "releaseindex",
        "_type": "rootobject",
        "_id": "5",
        "status": 400,
        "error": {
          "type": "mapper_parsing_exception",
          "reason": "Field [_index] is a metadata field and cannot be added inside a document. Use the index API request parameters."
        }
      }
@jimczi jimczi added :Search/Mapping Index mappings, including merging and defining field types help wanted adoptme good first issue low hanging fruit >non-issue labels May 2, 2017
@jimczi
Copy link
Contributor

jimczi commented May 2, 2017

I agree that the message is a bit confusing since we usually fail any document that contain a metadata field. Though _field_names is not listed as a metadata field in MapperService.META_FIELDS and this is why the message is different.
Thanks @narph, are you interested in submitting a PR for this ?

@asettouf
Copy link

asettouf commented May 2, 2017

Hello,

if @narph is not available to submit a PR, I am interested in handling this issue.

@jimczi
Copy link
Contributor

jimczi commented May 2, 2017

@asettouf sure go for it, I'd be happy to review !

@rjernst
Copy link
Member

rjernst commented May 2, 2017

Note that MapperService.META_FIELDS is just wrong now and needs to be removed. Since adding MapperPlugin.getMetadataFields(), that is what should be used as the definition of which meta fields exist. So MapperService should use the MapperRegistry that is passed in the ctor, instead of a static set.

@asettouf
Copy link

asettouf commented May 2, 2017

So for beginning I have just added "_field_names" in the MapperService.META_FIELDS. Especially given that some refactoring is needed, as right now the parsing and checking is made by static methods.

Am I right to understand that to use the MapperRegistry, I would need to refactor the static methods inside the DocumentParser to get a reference to it and use it?

I'd love to do the refactoring, now as I am new, I would just like some guidance on what would be the best way (perhaps I've missed a much smarter way of refactoring) to proceed before I start breaking everything!

@rjernst
Copy link
Member

rjernst commented May 2, 2017

@asettouf The META_FIELDS set is only used within MapperService. I think you only need to use the keyset of the mapperRegistry to reimplement isMetadataField(String) and getAllMetaFields().

@javanna
Copy link
Member

javanna commented Mar 16, 2018

@elastic/es-search-aggs

@Rijul1204
Copy link

I guess the existing PR for this issue is closed now. Can I work on this issue ?

@rjernst
Copy link
Member

rjernst commented Sep 19, 2018

@Rijul1204 Please feel free to work on this, or any other issue. Those labeled with help wanted in particular indicate nobody else plans to work on them.

@sandmannn
Copy link
Contributor

Hi, as there has been no activity for some time, can I pick this issue?

@Rijul1204
Copy link

@sandmannn, Sure please proceed. I am not currently working on it.

@sandmannn
Copy link
Contributor

sandmannn commented Dec 11, 2018

Hi @rjernst , you mentioned in your reply before, So MapperService should use the MapperRegistry that is passed in the ctor,. Since we cannot use exactly the same instance that has been passed in ctor due to isMetadataField being a static method, we need a way to grab such an instance of MapperRegistry inside this static method.

I looked at several places (constructor of MapperRegistry, IndicesModule) but did not find a nice way of getting a MapperRegistry instance (or metadataMapperParsers instance which is what we are really looking for). I am still looking for relevant usages in the codebase, but maybe there is a very straightforward way of getting a MapperRegistry instance that I am missing?

@rjernst
Copy link
Member

rjernst commented Dec 11, 2018

The isMetadataField and getAllMetaFields methods would need to become instance methods, and any callers would need to have an instance of MapperService. This can likely be done in increments by converting each use of the method independently. It will also likely require reworking several chains of construction. For example, DocumentField calls the method statically. Instead, it would need to take in whether the given field is a meta field as a ctor arg, and that would need to be passed through a few layers from there is access to the MapperService.

@sandmannn
Copy link
Contributor

sandmannn commented Dec 12, 2018

Thanks for your suggestion, @rjernst ! I think it is the only possible way to proceed here.

As number of classes affected seems to be slightly bigger than originally expected, lets maybe address the necessary class changes one by one. I looked for usages of DocumentField constructor. In half of the cases it is either obvious that we are looking at metadata field, or we have access to mapper service instance that can provide such information. Hovever, there are 2 groups of calls, where it is quite ambiguous: first, several static methods within DocumentField itself, which are creating DocumentField either from InputStream or from XContent. Second, the usage in PercolatorMatchedSlotSubFetchPhase. These are real blockers for adding the metadata information to the DocumentField constructor. As a next step i will check where are those methods used, but currently I don't see a straightforward way to proceed.

@sandmannn
Copy link
Contributor

I was looking up various chains of funciton calling of DocumentField constructor, among others found this chain

static DocumentField.readDocumentField(StreamInput in)
Called from: GetResult.readFrom(StreamInput) 
Called from static static DocumentField.readGetResult (StreamInput in) 
Called from UpdateResponse.readFrom(StreamInput) 
Called from BulkItemResponse.readFrom(StreamInput) 
Called from static BulkItemResponse.readBulkItem(StreamInput)
Called from BulkItemRequest.readFrom(StreamInput)
Called from BulkItemRequest.readBulkItem(StreamInput)
Called from BulkShardRequest.readFrom(StreamInput)

BulkShardRequest readFrom() implements ReplicatedWriteRequest, thus if we want to extend the function signature of the methods above by handing over additional context, then we also need to adjust all calls to ReplicatedWriteRequest.readFrom(), affected classes DeleteRequest, IndexRequest,ResyncReplicationRequest. (further call chain must be researched separately for all affected classes)

is going up the call chain up to creation of a Node instance and passing the contexts from there really the only possible way of dealing with this issue? Maybe there is some easy way to get access to the instances of Node or IndicesService directly and reading the MapperRegistry from there?

@rjernst
Copy link
Member

rjernst commented Dec 14, 2018

For DocumentField, I would make this a member variable that is set on construction. The readFrom case is then simpl, because the new flag (whether a field is a meta field) would be serialized/deserialized on the StreamOutput/Input methods. So nothing else would need to be passed into those StreamInput methods.

@rjernst
Copy link
Member

rjernst commented Dec 14, 2018

FTIW, passing the boolean to the DocumentField constructor should be relatively simple: see FetchPhase.getSearchFields. This method has SearchContext, which has a ref to the MapperService.

@sandmannn
Copy link
Contributor

Assuming it was already written during serialization indeed makes it much simpler. 2 questions though:

  1. Do we have to care about deserializing DocumentField instances serialized in previous version (without the flag)?
  2. What is the policy on splitting PRs? Is it possible to first create a PR concentrating on change in DocumentField and whoever calls its constructor and afterwards a separate PR for changing the static methods of the MapperService?

@rjernst
Copy link
Member

rjernst commented Dec 15, 2018

  1. Yes, backcompat needs to be addressed. In this case, the deserialization can fall back to calling the static method (it would need to stick around in 6.x then).
  2. Splitting PRs is highly encourage, assuming the split makes logical sense (each PR should leave the system in a stable state, still fully tested, etc). If I understand your suggested split, you would change the construction of DocumentField (and the deserialization) to call the static method in a first PR, and then change to instance methods in a followup? That would be fine, but I would personally prefer adding the instance methods first (which can internally call the static method) in order to begin getting rid of uses of the static method. After the uses are all changed, the impl can change. But in this case I think the order doesn't matter much (at least at a glance).

@sandmannn
Copy link
Contributor

sandmannn commented Dec 26, 2018

sure, lets add instance methods first. Does it mean that in the scope of the first PR we just adding the instance equivalents of isMetadataField and getAllMetaFields, which in turn internally call the static methods? We have 2 options then:

  1. accept that we have both instance and static methods available in the codeline for some time, meaning we have to give the instance methods different names for the time of refactoring, as they have otherwise same signature. Is there anything less ugly than isMetadataFieldInstance and getAllMetaFieldsInstance?
  2. refactor all uses of the static methods by instantiating empty MapperService, calling instance method which in turn internally calls static method. In this case we can rename static method (which is then called only inside MapperService body), plus get lots of calls to empty MapperService constructor, which can look puzzling.

Both options look ugly, but in different ways. What do you prefer?

@rjernst
Copy link
Member

rjernst commented Jan 8, 2019

I don't think we need to have both static and instance methods at the same time. My suggestion is to convert the uses of the meta fields checking to first call the existing static methods directly, instead of indirectly through wrapped layers.

For example, a first PR can change the constructor of DocumentField to take a boolean indicating whether or not the field is a metadata field. All the calls to the constructor would be changed to pass this statically (eg with percolator), or by calling the existing static method on MapperService. A second PR would then change those static method callers to check on the MapperService instance (and at the same time change to an instance method).

rjernst pushed a commit that referenced this issue May 23, 2019
This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in #24422.
rjernst pushed a commit to rjernst/elasticsearch that referenced this issue May 23, 2019
This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in elastic#24422.
rjernst added a commit that referenced this issue May 23, 2019
This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in #24422.
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this issue May 27, 2019
This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in elastic#24422.
@shashvat-kedia
Copy link

@rjernst Can I work on this issue?

@sandmannn
Copy link
Contributor

@SD1998 you can review #41656, which was there for some time

mayya-sharipova pushed a commit that referenced this issue Mar 30, 2020
Refactor SearchHit to have separate document and meta fields.
This is a part of bigger refactoring of issue #24422 to remove
dependency on MapperService to check if a field is metafield.

Relates to PR: #38373
Relates to issue #24422
@sandmannn
Copy link
Contributor

Hi, as mentioned in the discussion in the last commit #41656 I will not have capacity to contribute at least in the next 3-6 months starting from the April 2020. so as of now everybody is welcome to take over this broader refactoring.

mayya-sharipova pushed a commit to mayya-sharipova/elasticsearch that referenced this issue Apr 1, 2020
Refactor SearchHit to have separate document and meta fields.
This is a part of bigger refactoring of issue elastic#24422 to remove
dependency on MapperService to check if a field is metafield.

Relates to PR: elastic#38373
Relates to issue elastic#24422
mayya-sharipova added a commit that referenced this issue Apr 1, 2020
Refactor SearchHit to have separate document and meta fields.
This is a part of bigger refactoring of issue #24422 to remove
dependency on MapperService to check if a field is metafield.

Relates to PR: #38373
Relates to issue #24422

Co-authored-by: sandmannn <bohdanpukalskyi@gmail.com>
yyff pushed a commit to yyff/elasticsearch that referenced this issue Apr 17, 2020
Refactor SearchHit to have separate document and meta fields.
This is a part of bigger refactoring of issue elastic#24422 to remove
dependency on MapperService to check if a field is metafield.

Relates to PR: elastic#38373
Relates to issue elastic#24422
@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020
@mayya-sharipova mayya-sharipova removed good first issue low hanging fruit help wanted adoptme labels May 7, 2020
@mayya-sharipova mayya-sharipova self-assigned this May 7, 2020
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue May 14, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

This PR also refactors DocumentField to add a new parameter
to the constructor of DocumentField isMetadataField
which describes whether this is metadata field or not.
This refactoring is necessary as there is no more statiic
method to check if a field is meta-field, but DocumentField
objects are always created from the contexts  where this
information is available, and will be passed to DocumentField
constructor for further usage.

Related elastic#38373, elastic#41656
Closes elastic#24422
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue May 14, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

This PR also refactors DocumentField to add a new parameter
to the constructor of DocumentField isMetadataField
which describes whether this is metadata field or not.
This refactoring is necessary as there is no more statiic
method to check if a field is meta-field, but DocumentField
objects are always created from the contexts  where this
information is available, and will be passed to DocumentField
constructor for further usage.

Related elastic#38373, elastic#41656
Closes elastic#24422
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue May 15, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

This PR also refactors DocumentField to add a new parameter
to the constructor of DocumentField isMetadataField
which describes whether this is metadata field or not.
This refactoring is necessary as there is no more statiic
method to check if a field is meta-field, but DocumentField
objects are always created from the contexts  where this
information is available, and will be passed to DocumentField
constructor for further usage.

Related elastic#38373, elastic#41656
Closes elastic#24422
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue May 29, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

Related elastic#38373, elastic#41656
Closes elastic#24422
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue May 29, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

Related elastic#38373, elastic#41656
Closes elastic#24422
mayya-sharipova added a commit that referenced this issue Jun 5, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

Related #38373, #41656
Closes #24422
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Jun 5, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

Related elastic#38373, elastic#41656
Closes elastic#24422
mayya-sharipova added a commit that referenced this issue Jun 8, 2020
Before to determine if a field is meta-field, a static method of MapperService
isMetadataField was used. This method was using an outdated static list
of meta-fields.

This PR instead changes this method to the instance method that
is also aware of meta-fields in all registered plugins.

Related #38373, #41656
Closes #24422
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :Search/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team
Projects
None yet
9 participants