Fixed naming inconsistency for fields/stored_fields in the APIs #20166

Merged
jimczi merged 7 commits into elastic:master from jimczi:fields_renaming Sep 13, 2016

5 participants
@jimczi
Member

commented Aug 25, 2016

This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following endpoints have been changed to use stored_fields instead of fields:

  • get
  • mget
  • explain

The documentation and the REST API spec have been updated to reflect the changes for the following APIs:

  • delete_by_query
  • get
  • mget
  • explain

The fields parameter has been deprecated for the following APIs:

  • update
  • bulk

These APIs now accept _source as a parameter to filter which parts of the updated document's _source are returned.

Some APIs still have the fields parameter for various reasons:

  • cat.fielddata: the fields parameter relates to the fielddata fields that should be printed.
  • indices.clear_cache: used to indicate which fielddata fields should be cleared.
  • indices.get_field_mapping: used to filter fields in the mapping.
  • indices.stats: get stats on fields (stored or not stored).
  • termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
  • mtermvectors:
  • nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155

@jpountz

Contributor

commented Aug 26, 2016

The change looks good and your explanation about what APIs you modified makes sense to me. Should we have nicer backward compatibility and use ParseField to make fields a deprecated alias of stored_fields like we did for docvalue_fields and fielddata_fields? (See also this comment: #18943 (comment))
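
To make the deprecated-alias option concrete, here is a minimal sketch of a name matcher that accepts stored_fields and still tolerates fields with a warning. AliasedParam and its methods are hypothetical stand-ins written for this illustration; they are not the Elasticsearch ParseField class, and the warning text is invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a ParseField-style matcher; not the Elasticsearch class. */
final class AliasedParam {
    private final String preferredName;
    private final List<String> deprecatedNames;
    private final List<String> warnings = new ArrayList<>();

    AliasedParam(String preferredName, String... deprecatedNames) {
        this.preferredName = preferredName;
        this.deprecatedNames = Arrays.asList(deprecatedNames);
    }

    /** Returns true if the name is accepted; records a warning when a deprecated alias is used. */
    boolean matches(String name) {
        if (preferredName.equals(name)) {
            return true;
        }
        if (deprecatedNames.contains(name)) {
            // Invented warning text, for illustration only.
            warnings.add("Deprecated parameter [" + name + "] used, expected [" + preferredName + "] instead");
            return true;
        }
        return false;
    }

    List<String> warnings() {
        return warnings;
    }
}
```

With `new AliasedParam("stored_fields", "fields")`, both names match, but only `fields` leaves a warning behind. The request keeps working, which is the trade-off discussed in the following comments.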

@jimczi

Member Author

commented Aug 26, 2016

> Should we have nicer backward compatibility and use ParseField to make fields a deprecated alias of stored_fields like we did for docvalue_fields and fielddata_fields? (See also this comment: #18943 (comment))

Sure, or maybe we can just throw a nice exception if fields is used? It would be consistent with what we have for the Search API:

throw new ParsingException(parser.getTokenLocation(), "The field [" +

@jpountz

Contributor

commented Aug 26, 2016

An exception would work for me even though when it comes to the DSL, I like having things deprecated better: unlike Java there is no compile time error, it only fails at runtime. Moreover, major version upgrades are already challenging on their own so if we can avoid breaking changes in the DSL, I think that's better.

@clintongormley

Member

commented Aug 26, 2016

> I like having things deprecated better

Given that this is not just a change in parameter name but also a change in behaviour, I'd prefer an exception over a deprecation. It is easier to rename the parameter in your code than to figure out why you're getting bad results.

Perhaps we could add deprecation logging for fields in 2.4.1?

@clintongormley

Member

commented Aug 26, 2016

In bulk and update, I'd deprecate the use of fields and add support for _source instead. This makes it more consistent.

The termvectors and mtermvectors APIs are trickier... Perhaps leave them as they are for now.

@jimczi

Member Author

commented Aug 29, 2016

I think it's ok to add _source in the update request, but it conflicts with the Java API, where we use setSource and source to create the request from a byte array. Since that is not used in the REST API, and source in this case has nothing to do with the source of the document, is it ok to rename setSource to something like fromBytes?

@jpountz

Contributor

commented Aug 29, 2016

> Since it's not used in the rest API and that source in this case has nothing to do with the source of the document, is it ok to rename setSource to something like fromBytes?

Not sure about the name, but definitely +1 on having the Java API as close as possible to the REST API and then renaming internal Java APIs if there is conflict. Back to the name... maybe fromXContent since we expect these bytes to be some form of xcontent and the name would be consistent with what we use in query builders to parse incoming bytes (even though the bytes are provided differently).

@jimczi

Member Author

commented Aug 30, 2016

I pushed a change that deprecates the fields parameter in the bulk/update API and adds support for _source filtering.
@jpountz could you please take another look?

@jimczi jimczi added the review label Sep 8, 2016

jimczi added some commits Aug 25, 2016

Fixed naming inconsistency for fields/stored_fields in the APIs
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following endpoints have been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the REST API spec have been updated to reflect the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

Some APIs still have the `fields` parameter for various reasons:

* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.
* cat.fielddata: the fields parameter relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155

"lang": "painless",
"params" : {
- "tag" : "blue"
+ "tag" : "green"
}

markharwood Sep 13, 2016

Contributor

^ tag:green inconsistent with the example description which talks about tag:blue

jimczi Sep 13, 2016

Author Member

I changed the description to mention the green tag. I had to change the logic because we use the same doc for all CONSOLE snippets, and removing the document would break the next snippet.

@@ -16,7 +16,7 @@
}
},
"params": {
- "fields": {
+ "stored_fields": {
"type": "list",
"description" : "A comma-separated list of fields to return in the response"
},

markharwood Sep 13, 2016

Contributor

Inconsistent with description in get.json - should be:

A comma-separated list of *stored* fields to return in the response
@@ -40,13 +40,17 @@
"type" : "boolean",
"description" : "Specify whether to return detailed information about score computation as part of a hit"
},
- "fields": {
+ "stored_fields": {
"type" : "list",
"description" : "A comma-separated list of fields to return as part of a hit"
},

markharwood Sep 13, 2016

Contributor

description should say "list of stored fields"

@jimczi

Member Author

commented Sep 13, 2016

Thanks @markharwood I pushed a commit to address your comment.

@@ -72,7 +72,7 @@ def search = node.client.search {
source {
query {
query_string(
- fields: ["test"],
+ stored_fields: ["test"],

nik9000 Sep 13, 2016

Contributor

I think this change is incorrect. query_string still takes fields.

{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_type" : "type1", "_index" : "index1"} }
- { "doc" : {"field" : "value"}, "fields": ["_source"]}
+ { "doc" : {"field" : "value"}, "_source": true}
--------------------------------------------------

nik9000 Sep 13, 2016

Contributor

I'd convert this to // CONSOLE while I was here, maybe even add an example of what is returned. It'd be nice to have the example while reading the docs and it'd really help to make sure that they are up to date.

jimczi Sep 13, 2016

Author Member

I'd prefer doing this in another PR. Since it's a doc thing we can do it whenever we want, right?

nik9000 Sep 13, 2016

Contributor

Fine with me so long as we don't forget.

will, when possible, be fetched as stored fields (fields mapped as
<<mapping-store,stored>> in the mapping).
When getting a document, one can specify `stored_fields` to fetch from it.
They will be simply ignored if the field is not stored (<<mapping-store,stored>> in the mapping)*[]:

nik9000 Sep 13, 2016

Contributor

Maybe just remove the whole paragraph because we already have the === Source filtering section.

nik9000 Sep 13, 2016

Contributor

And the === Stored Fields section.


Field values fetched from the document it self are always returned as an array. Metadata fields like `_routing` and
`_parent` fields are never returned as an array.

- Also only leaf fields can be returned via the `field` option. So object fields can't be returned and such requests
+ Also only leaf fields can be returned via the `stored_field` option. So object fields can't be returned and such requests

nik9000 Sep 13, 2016

Contributor

I have no idea what this sentence means.

jimczi Sep 13, 2016

Author Member

Really? You cannot retrieve object fields, that's all it says.

nik9000 Sep 13, 2016

Contributor

Ah, yeah. I guess it is ok this way then.

- For backward compatibility, if the requested fields are not stored, they will be fetched
- from the `_source` (parsed and extracted). This functionality has been replaced by the
- <<get-source-filtering,source filtering>> parameter.
+ If the requested fields are not stored, they will be ignored.

Field values fetched from the document it self are always returned as an array. Metadata fields like `_routing` and

nik9000 Sep 13, 2016

Contributor

I'd likely rewrite this whole section to make it more clear what stored fields are. Maybe it makes more sense to just make a page about them and reference it here like I did with refresh.asciidoc. Maybe this should come in a separate PR? I dunno.

"doc" : {
"name" : "new_name"
},
- "detect_noop": false
- }'
+ "detect_noop": true

nik9000 Sep 13, 2016

Contributor

s/true/false/ I think. The paragraph talks about setting it to false and how the default is true.

- String sField = request.param("fields");
+ if (request.param("fields") != null) {
+ throw new IllegalArgumentException("The parameter [fields] is no longer supported, " +
+ "please use [stored_fields] to retrieve stored fields or _source filtering if the field is not stored");

nik9000 Sep 13, 2016

Contributor

or [_source] to load the field from _source.?
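
The check in the diff above can be sketched in isolation as follows. This assumes request parameters arrive as a plain `Map<String, String>`; `StoredFieldsParamSketch` and `parseStoredFields` are hypothetical names, and only the exception message is taken from the diff.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class StoredFieldsParamSketch {
    /** Rejects the removed fields parameter, then parses stored_fields as a comma-separated list. */
    static List<String> parseStoredFields(Map<String, String> params) {
        if (params.get("fields") != null) {
            // Fail loudly instead of silently returning different results.
            throw new IllegalArgumentException("The parameter [fields] is no longer supported, "
                + "please use [stored_fields] to retrieve stored fields or _source filtering if the field is not stored");
        }
        String storedFields = params.get("stored_fields");
        if (storedFields == null || storedFields.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(storedFields.split(","));
    }
}
```

A caller that still sends `fields` gets an immediate error naming both replacements, which is the behaviour argued for earlier in the thread.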

@@ -40,7 +40,7 @@
/**
* Context used to fetch the {@code _source}.
*/
- public class FetchSourceContext implements Streamable, ToXContent {
+ public class FetchSourceContext implements Writeable, ToXContent {

nik9000 Sep 13, 2016

Contributor

❤️

out.writeBoolean(fetchSource);
out.writeStringArray(includes);
out.writeStringArray(excludes);
- out.writeBoolean(false); // Used to be transformSource but that was dropped in 2.1

nik9000 Sep 13, 2016

Contributor

Nice to see that gone.

@@ -405,7 +406,7 @@ public void testBulkUpdateLargerVolume() throws Exception {
assertThat(response.getItems()[i].getType(), equalTo("type1"));
assertThat(response.getItems()[i].getOpType(), equalTo("update"));
for (int j = 0; j < 5; j++) {
- GetResponse getResponse = client().prepareGet("test", "type1", Integer.toString(i)).setFields("counter").execute()
+ GetResponse getResponse = client().prepareGet("test", "type1", Integer.toString(i)).execute()

nik9000 Sep 13, 2016

Contributor

Just do .get()?

- request.source(XContentFactory.jsonBuilder().startObject().startObject("script").field("inline", "script1").startObject("params")
- .field("param1", "value1").endObject().endObject().endObject());
+ request.fromXContent(XContentFactory.jsonBuilder().startObject()
+ .startObject("script").field("inline", "script1")

nik9000 Sep 13, 2016

Contributor

Could you write this like:

request.fromXContent(XContentFactory.jsonBuilder().startObject()
  .startObject("script")
    .field("inline", "script1")
    .startObject("params")
      .field("param1", "value1")
    .endObject()
  .endObject().endObject());

so the structure is more obvious when you scan it?

nik9000 Sep 13, 2016

Contributor

Or with "script" on the starting line is fine too.

- .endObject().field("inline", "script1").endObject().startObject("upsert").field("field1", "value1").startObject("compound")
- .field("field2", "value2").endObject().endObject().endObject());
+ request.fromXContent(XContentFactory.jsonBuilder().startObject().startObject("script")
+ .startObject("params")

nik9000 Sep 13, 2016

Contributor

Yeah, like this.

- .field("field2", "value2").endObject().endObject().endObject());
+ request.fromXContent(XContentFactory.jsonBuilder().startObject().startObject("script")
+ .startObject("params")
+ .field("param1", "value1").endObject()

nik9000 Sep 13, 2016

Contributor

Oh, weird, actually, the endObject on the end of this is weird to read.

- .field("field2", "value2").endObject().endObject().startObject("script").startObject("params").field("param1", "value1")
- .endObject().field("inline", "script1").endObject().endObject());
+ request.fromXContent(XContentFactory.jsonBuilder().startObject()
+ .startObject("upsert")

nik9000 Sep 13, 2016

Contributor

This one is perfect I think.

@@ -585,7 +587,7 @@ public void testGetFieldsNonLeafField() throws Exception {
.get();

try {
- client().prepareGet(indexOrAlias(), "my-type1", "1").setFields("field1").get();
+ client().prepareGet(indexOrAlias(), "my-type1", "1").setStoredFields("field1").get();
fail();
} catch (IllegalArgumentException e) {
//all well

nik9000 Sep 13, 2016

Contributor

Can you add an assertion here? Maybe rewrite to expectThrows if you are feeling brave. But I really would like an assertion about the message here regardless.
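
One way such an assertion could look, sketched without the test framework so it runs on its own. The `expectThrows` here is a small local helper written for the example, `getStoredField` is a hypothetical stand-in for the `prepareGet(...).setStoredFields("field1").get()` call, and the exception message is invented for the illustration.

```java
final class ExpectThrowsSketch {
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    /** Local helper: runs the code, returns the expected exception, fails on anything else. */
    static <T extends Throwable> T expectThrows(Class<T> expectedType, ThrowingRunnable runnable) {
        try {
            runnable.run();
        } catch (Throwable t) {
            if (expectedType.isInstance(t)) {
                return expectedType.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected " + expectedType.getSimpleName() + " but nothing was thrown");
    }

    /** Hypothetical stand-in for requesting a non-leaf field; the message is made up. */
    static void getStoredField(String field) {
        throw new IllegalArgumentException("field [" + field + "] isn't a leaf field");
    }
}
```

The returned exception can then be checked, for example `expectThrows(IllegalArgumentException.class, () -> getStoredField("field1")).getMessage()` should mention `field1`. Unlike the try/fail/catch pattern with an empty catch block, a change in the message or exception type makes the test fail.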

@@ -283,6 +283,7 @@ public void testReplicaToPrimaryPromotion() throws Exception {
client().prepareIndex(IDX, "doc", "1").setSource("foo", "bar").get();
client().prepareIndex(IDX, "doc", "2").setSource("foo", "bar").get();


nik9000 Sep 13, 2016

Contributor

Leftover I think.

updateResponse = client().prepareUpdate(indexOrAlias(), "type1", "1")
.setScript(new Script("field1", ScriptService.ScriptType.INLINE, "field_inc", null))
.setFetchSource("field1", "field2")
.execute().actionGet();

nik9000 Sep 13, 2016

Contributor

.get()

@nik9000

Contributor

commented Sep 13, 2016

I left a bunch of minor stuff. Asked for more docs, more formatting, and modernizing a few tests. I didn't see anything major. I think it is the right thing to do and we should get it in sooner rather than later so we can get it in 5.0's next release. My instinct is that we're going to find some subtle bug that I should have caught on review and I'm going to facepalm but it really does look good to me other than all the minor stuff I left.

@jimczi

Member Author

commented Sep 13, 2016

Thank you so much for the review @nik9000! I pushed a commit to hopefully address all your feedback. Can you take another look?

--------------------------------------------------
GET twitter/tweet/1?routing=user1,stored_fields=tags,counter
--------------------------------------------------

nik9000 Sep 13, 2016

Contributor

This one is missing // CONSOLE I think.

@nik9000

Contributor

commented Sep 13, 2016

Left one minor thing on the last commit. I'm going to take a quick break and reread the whole thing one last time after that but I expect it'll be fine.

@@ -298,8 +304,24 @@ public GetResult extractGetResult(final UpdateRequest request, String concreteIn
}
}

BytesReference sourceFilteredAsBytes = sourceAsBytes;
if (request.fetchSource() != null) {

nik9000 Sep 13, 2016

Contributor

Is it worth optimizing the case where fetchSource is true and includes and excludes are empty and the xcontents line up? Then you can just return the bytes you have like we do for search lookups.

nik9000 Sep 13, 2016

Contributor

What about the case when request.fetchSource().fetchSource() is false? Why do we even have a boolean there? Maybe we should just use null instead? It doesn't make sense to have a FetchSourceContext with fetchSource = false and includes = ["something"].

nik9000 Sep 13, 2016

Contributor

Maybe just handle the boolean for this PR, but we should think about removing it....
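
The cases raised in this thread can be laid out in a small sketch. It works on a parsed `Map` rather than bytes, filters top-level field names only, and ignores content types; `FetchSourceDecisionSketch`, its `Context` class, and `sourceToReturn` are hypothetical names for illustration, not the code in this PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FetchSourceDecisionSketch {
    /** Hypothetical, simplified context: a flag plus include/exclude lists of top-level field names. */
    static final class Context {
        final boolean fetchSource;
        final List<String> includes;
        final List<String> excludes;

        Context(boolean fetchSource, List<String> includes, List<String> excludes) {
            this.fetchSource = fetchSource;
            this.includes = includes;
            this.excludes = excludes;
        }
    }

    static Map<String, Object> sourceToReturn(Map<String, Object> source, Context context) {
        if (context == null || !context.fetchSource) {
            return null; // no source requested, or explicitly disabled
        }
        if (context.includes.isEmpty() && context.excludes.isEmpty()) {
            return source; // fast path: nothing to filter, reuse what we already have
        }
        Map<String, Object> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            String field = entry.getKey();
            boolean included = context.includes.isEmpty() || context.includes.contains(field);
            if (included && !context.excludes.contains(field)) {
                filtered.put(field, entry.getValue());
            }
        }
        return filtered;
    }
}
```

Note that a `Context` with `fetchSource = false` and non-empty includes is still constructible here, and the includes are simply ignored. That is the redundancy the comment above questions.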

},
"tags": {
"type": "keyword",
- "store": "yes"
+ "store": "true"

nik9000 Sep 13, 2016

Contributor

I'm fairly sure you don't need the quotes here and it'd be more idiomatic not to have them.

@@ -182,8 +182,10 @@ PUT twitter/tweet/2?routing=user1

[source,js]
--------------------------------------------------
- GET twitter/tweet/1?routing=user1,stored_fields=tags,counter
+ GET twitter/tweet/2?routing=user1&stored_fields=tags,counter

nik9000 Sep 13, 2016

Contributor

Nice. Glad you caught the ,. I certainly didn't see it the first time around.
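
Why the `,` mattered can be shown with a naive query-string parser. This is only a sketch: `QueryStringSketch.parse` is a hypothetical helper that splits on `&` and the first `=` and does no URL decoding, which is an assumption about how parameters are separated rather than the actual REST layer code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringSketch {
    /** Splits a raw query string on '&' and each pair on the first '='; no URL decoding. */
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                params.put(pair, "");
            } else {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }
}
```

Under these assumptions, `routing=user1,stored_fields=tags,counter` yields a single `routing` parameter whose value is `user1,stored_fields=tags,counter`, so no `stored_fields` parameter is seen at all. With `&` the same input yields `routing=user1` and `stored_fields=tags,counter`.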

@nik9000

Contributor

commented Sep 13, 2016

LGTM. I left some very minor stuff but you can merge as is if you feel the need and/or fix what I left and merge without another review.

@jimczi jimczi merged commit 1764ec5 into elastic:master Sep 13, 2016

1 of 2 checks passed

elasticsearch-ci: Build started sha1 is merged.
CLA: Commit author is a member of Elasticsearch

@jimczi jimczi deleted the jimczi:fields_renaming branch Sep 13, 2016

jimczi added a commit that referenced this pull request Sep 13, 2016

Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following endpoints have been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the REST API spec have been updated to reflect the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields parameter relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155

@jimczi

Member Author

commented Sep 13, 2016

Thanks again @nik9000! It is merged in 5.0 and 5.x.
