Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed naming inconsistency for fields/stored_fields in the APIs #20166

Merged
merged 7 commits into from
Sep 13, 2016
Merged

Fixed naming inconsistency for fields/stored_fields in the APIs #20166

merged 7 commits into from
Sep 13, 2016

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Aug 25, 2016

This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use stored_fields instead of fields:

  • get
  • mget
  • explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:

  • delete_by_query
  • get
  • mget
  • explain

The fields parameter has been deprecated for the following APIs:

  • update
  • bulk

These APIs now support _source as a parameter to filter the _source of the updated document to be returned.

Some APIs still have the fields parameter for various reasons:

  • cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
  • indices.clear_cache: used to indicate which fielddata fields should be cleared.
  • indices.get_field_mapping: used to filter fields in the mapping.
  • indices.stats: get stats on fields (stored or not stored).
  • termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
  • mtermvectors:
  • nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155

@jimczi jimczi added >bug >breaking :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v5.0.0-beta1 labels Aug 25, 2016
@jpountz
Copy link
Contributor

jpountz commented Aug 26, 2016

The change looks good and your explanation about what APIs you modified makes sense to me. Should we have nicer backward compatibility and use ParseField to make fields a deprecated alias of stored_fields like we did for docvalue_fields and fielddata_fields? (See also this comment: #18943 (comment))

@jimczi
Copy link
Contributor Author

jimczi commented Aug 26, 2016

Should we have nicer backward compatibility and use ParseField to make fields a deprecated alias of stored_fields like we did for docvalue_fields and fielddata_fields? (See also this comment: #18943 (comment))

Sure or maybe we can just throw a nice exception if fields is used ? It would be consistent with what we have for the Search API:

throw new ParsingException(parser.getTokenLocation(), "The field [" +

@jpountz
Copy link
Contributor

jpountz commented Aug 26, 2016

An exception would work for me even though when it comes to the DSL, I like having things deprecated better: unlike Java there is no compile time error, it only fails at runtime. Moreover, major version upgrades are already challenging on their own so if we can avoid breaking changes in the DSL, I think that's better.

@clintongormley
Copy link

I like having things deprecated better

Given that this is not just a change in parameter name but also a change in behaviour, I'd prefer an exception over a deprecation. It is easier to rename the parameter in your code than to figure out why you're getting bad results.

Perhaps we could add deprecation logging for fields in 2.4.1?

@clintongormley
Copy link

In bulk and update, I'd deprecate the use of fields and add support for _source instead. This makes it more consistent.

The termvectors and mtermvectors APIs are trickier... Perhaps leave them as they are for now.

@jimczi
Copy link
Contributor Author

jimczi commented Aug 29, 2016

I think it's ok to add _source in the update request but it conflicts with the java API where we use setSource and source to create the request from a bytes array . Since it's not used in the rest API and that source in this case has nothing to do with the source of the document, is it ok to rename setSource to something like fromBytes ?

@jpountz
Copy link
Contributor

jpountz commented Aug 29, 2016

Since it's not used in the rest API and that source in this case has nothing to do with the source of the document, is it ok to rename setSource to something like fromBytes ?

Not sure about the name, but definitely +1 on having the Java API as close as possible to the REST API and then renaming internal Java APIs if there is conflict. Back to the name... maybe fromXContent since we expect these bytes to be some form of xcontent and the name would be consistent with what we use in query builders to parse incoming bytes (even though the bytes are provided differently).

@jimczi
Copy link
Contributor Author

jimczi commented Aug 30, 2016

I pushed a change that deprecate the fields parameter in the bulk/update API and adds the support for _source filtering.
@jpountz could you please take another look ?

@jimczi jimczi added the review label Sep 8, 2016
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

Some APIs still have the `fields` parameter for various reasons:

* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
"lang": "painless",
"params" : {
"tag" : "blue"
"tag" : "green"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ tag:green inconsistent with the example description which talks about tag:blue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the description to mention the green tag, I had to change the logic since we use the same doc for all CONSOLE snippet and it will break the next snippet if the document is removed.

@jimczi
Copy link
Contributor Author

jimczi commented Sep 13, 2016

Thanks @markharwood I pushed a commit to address your comment.

@@ -72,7 +72,7 @@ def search = node.client.search {
source {
query {
query_string(
fields: ["test"],
stored_fields: ["test"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is incorrect. query_string still takes fields.

index.addField(field, text.toString(), analyzer);
if (entry.getValue() instanceof List) {
for (Object text : entry.getValue()) {
index.addField(field, text.toString(), analyzer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some mean person is going to write "fields": ["bleh", null, "blah"] and complain about the weird exception at this line. Maybe just throw an IllegalArgumentException here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is the response of a get request so we should never have null values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! All good then.

@nik9000
Copy link
Member

nik9000 commented Sep 13, 2016

I left a bunch of minor stuff. Asked for more docs, more formatting, and modernizing a few tests. I didn't see anything major. I think it is the right thing to do and we should get it in sooner rather than later so we can get it in 5.0's next release. My instinct is that we're going to find some subtle bug that I should have caught on review and I'm going to facepalm but it really does look good to me other than all the minor stuff I left.

@jimczi
Copy link
Contributor Author

jimczi commented Sep 13, 2016

Thank you so much for the review @nik9000 ! I pushed a commit to hopefully address all your feedbacks. Can you take another look ?

--------------------------------------------------
GET twitter/tweet/1?routing=user1,stored_fields=tags,counter
--------------------------------------------------

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is missing // CONSOLE I think.

@nik9000
Copy link
Member

nik9000 commented Sep 13, 2016

Left one minor thing on the last commit. I'm going to take a quick break and reread the whole thing one last time after that but I expect it'll be fine.

@@ -298,8 +304,24 @@ public GetResult extractGetResult(final UpdateRequest request, String concreteIn
}
}

BytesReference sourceFilteredAsBytes = sourceAsBytes;
if (request.fetchSource() != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth optimizing the case where fetchSource is true and includes and excludes are empty and the xcontents line up? Then you can just return the bytes you have like we do for search lookups.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the case when request.fetchSource().fetchSource() is false? Why do we even have a boolean there? Maybe we should just use null instead? It doesn't make sense to have a FetchSourceContext with fetchSource = false and includes = ["something"].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just handle the boolean for this PR, but we should think about removing it....

@nik9000
Copy link
Member

nik9000 commented Sep 13, 2016

LGTM. I left some very minor stuff but you can merge as is if you feel the need and/or fix what I left and merge without another review.

@jimczi jimczi merged commit 1764ec5 into elastic:master Sep 13, 2016
@jimczi jimczi deleted the fields_renaming branch September 13, 2016 18:54
jimczi added a commit that referenced this pull request Sep 13, 2016
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
jimczi added a commit that referenced this pull request Sep 13, 2016
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
@jimczi
Copy link
Contributor Author

jimczi commented Sep 13, 2016

Thanks again @nik9000 ! It is merged in 5.0 and 5.x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking >bug :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v5.0.0-beta1 v6.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants