Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parameter to GET API for checking if generated fields can be retrieved #6973

Closed
wants to merge 7 commits into from

Conversation

Projects
None yet
4 participants
@brwe
Copy link
Contributor

brwe commented Jul 23, 2014

Fields of type token_count, murmur3, _all and _field_names are generated only when indexing.
If a GET requests accesses the transaction log (because no refresh
between indexing and GET request) then these fields cannot be retrieved at all.
Before the behavior was so:

_all, _field_names: The field was siletly ignored
murmur3, token_count: NumberFormatException because GET tried to parse the values from the source.

In addition, if these fields were not stored, the same behavior occured if the fields were
retrieved with GET after a refresh() because here also the source was used to get the fields.

Now, GET accepts a parameter ignore_errors_on_generated_fields which has
the following effect:

  • Throw exception with meaningful error message explaining the problem if set to false (default)
  • Ignore the field if set to true
  • Always ignore the field if it was not set to stored

This changes the behavior for _all and _field_names as now an Exception is thrown if a user
tries to GET them before a refresh().

closes #6676

@s1monw

View changes

src/main/java/org/elasticsearch/action/get/GetRequest.java Outdated
@@ -265,6 +275,7 @@ public void readFrom(StreamInput in) throws IOException {
} else if (realtime == 1) {
this.realtime = true;
}
this.ignoreErrorsOnGeneratedFields = in.readBoolean();

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

this should have a version check no?

This comment has been minimized.

Copy link
@brwe

brwe Jul 24, 2014

Author Contributor

indeed, forgot to run bwc tests which failed. added in commit "add version check for new get parameter"

@s1monw

View changes

src/main/java/org/elasticsearch/action/get/GetRequest.java Outdated
@@ -297,6 +308,7 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeByte((byte) 1);
}

out.writeBoolean(ignoreErrorsOnGeneratedFields);

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

this should also have a version check

@s1monw

View changes

src/main/java/org/elasticsearch/action/get/MultiGetRequest.java Outdated
@@ -481,6 +488,7 @@ public void readFrom(StreamInput in) throws IOException {
} else if (realtime == 1) {
this.realtime = true;
}
ignoreErrorsOnGeneratedFields = in.readBoolean();

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

version checks?

@s1monw

View changes

src/main/java/org/elasticsearch/action/get/MultiGetRequest.java Outdated
@@ -501,6 +509,7 @@ public void writeTo(StreamOutput out) throws IOException {
} else {
out.writeByte((byte) 1);
}
out.writeBoolean(ignoreErrorsOnGeneratedFields);

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

version checks?

@s1monw

View changes

src/main/java/org/elasticsearch/action/get/MultiGetShardRequest.java Outdated
@@ -153,6 +159,7 @@ public void readFrom(StreamInput in) throws IOException {
} else if (realtime == 1) {
this.realtime = true;
}
ignoreErrorsOnGeneratedFields = in.readBoolean();

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

version checks

@s1monw

View changes

src/main/java/org/elasticsearch/action/get/MultiGetShardRequest.java Outdated
@@ -191,7 +198,11 @@ public void writeTo(StreamOutput out) throws IOException {
} else {
out.writeByte((byte) 1);
}
out.writeBoolean(ignoreErrorsOnGeneratedFields);

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

version checks

@s1monw

View changes

src/main/java/org/elasticsearch/index/get/ShardGetService.java Outdated
@@ -96,12 +96,12 @@ public ShardGetService setIndexShard(IndexShard indexShard) {
return this;
}

public GetResult get(String type, String id, String[] gFields, boolean realtime, long version, VersionType versionType, FetchSourceContext fetchSourceContext)
public GetResult get(String type, String id, String[] gFields, boolean realtime, boolean ignoreErrorsOnGeneratedFields, long version, VersionType versionType, FetchSourceContext fetchSourceContext)

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

can it be the last argument on the ctor to be consistent with the other ctor?

This comment has been minimized.

Copy link
@brwe

brwe Jul 24, 2014

Author Contributor

done in commit "renaming and reordering of parameters"

@s1monw

View changes

src/main/java/org/elasticsearch/index/get/ShardGetService.java Outdated
@@ -165,7 +165,7 @@ protected FetchSourceContext normalizeFetchSourceContent(@Nullable FetchSourceCo
return FetchSourceContext.DO_NOT_FETCH_SOURCE;
}

public GetResult innerGet(String type, String id, String[] gFields, boolean realtime, long version, VersionType versionType, FetchSourceContext fetchSourceContext) throws ElasticsearchException {
public GetResult innerGet(String type, String id, String[] gFields, boolean realtime, boolean ignoreErrorsOnGeneratedFields, long version, VersionType versionType, FetchSourceContext fetchSourceContext) throws ElasticsearchException {

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

also here can we just append to the argument list and move it to the back

@s1monw

View changes

src/main/java/org/elasticsearch/index/get/ShardGetService.java Outdated
@@ -312,7 +313,27 @@ public GetResult innerGet(String type, String id, String[] gFields, boolean real
}
}

private GetResult innerGetLoadFromStoredFields(String type, String id, String[] gFields, FetchSourceContext fetchSourceContext, Engine.GetResult get, DocumentMapper docMapper) {
protected boolean shouldGetFromSource(boolean ignoreErrorsOnGeneratedFields, DocumentMapper docMapper, FieldMapper<?> x) {

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

s/x/fieldMapper

@s1monw

View changes

src/main/java/org/elasticsearch/index/get/ShardGetService.java Outdated
@@ -345,7 +367,7 @@ private GetResult innerGetLoadFromStoredFields(String type, String id, String[]
// Only fail if we know it is a object field, missing paths / fields shouldn't fail.
throw new ElasticsearchIllegalArgumentException("field [" + field + "] isn't a leaf field");
}
} else if (!x.mapper().fieldType().stored()) {
} else if (!x.mapper().fieldType().stored() && !x.mapper().isGenerated()) {

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

also please change s/x/fieldMappers/

@s1monw

View changes

src/main/java/org/elasticsearch/index/mapper/FieldMapper.java Outdated
@@ -290,4 +290,6 @@ public static Loading parse(String loading, Loading defaultValue) {

Loading normsLoading(Loading defaultLoading);

public boolean isGenerated();

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

I know we are not very strict about it but maybe we can just put some javadocs on this?

@@ -1115,4 +1117,11 @@ public void parse(String field, ParseContext context) throws IOException {

}

/**

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 23, 2014

Contributor

maybe instead of passing a boolean to the abstract field mapper we can just return false from here and the exceptional mappers that implement that can just override it and return true? that way we don't break compat with others that implement mappers and the change is less intrusive?

This comment has been minimized.

Copy link
@brwe

brwe Jul 24, 2014

Author Contributor

added commit "remove isGenerated flag from constructor..."

@s1monw

This comment has been minimized.

Copy link
Contributor

s1monw commented Jul 23, 2014

I left some comments - looks good

brwe added some commits Jul 17, 2014

Add parameter to GET for checking if generated fields can be retrieved
Fields of type `token_count`, `murmur3`, `_all` and `_field_names` are generated only when indexing.
If a GET requests accesses the transaction log (because no refresh
between indexing and GET request) then these fields cannot be retrieved at all.
Before the behavior was so:

`_all, _field_names`: The field was siletly ignored
`murmur3, token_count`: `NumberFormatException` because GET tried to parse the values from the source.

In addition, if these fields were not stored, the same behavior occured if the fields were
retrieved with GET after a `refresh()` because here also the source was used to get the fields.

Now, GET accepts a parameter `ignore_errors_on_generated_fields` which has
the following effect:
- Throw exception with meaningful error message explaining the problem if set to false (default)
- Ignore the field if set to true
- Always ignore the field if it was not set to stored

This changes the behavior for `_all` and `_field_names` as now an Exception is thrown if a user
tries to GET them before a `refresh()`.

closes #6676
@brwe

This comment has been minimized.

Copy link
Contributor Author

brwe commented Jul 24, 2014

implemented all changes

@brwe brwe added the review label Jul 24, 2014

@brwe brwe changed the title Add parameter to GET for checking if generated fields can be retrieved GET API: Add parameter to GET for checking if generated fields can be retrieved Jul 24, 2014

@jpountz

This comment has been minimized.

Copy link
Contributor

jpountz commented Aug 1, 2014

LGTM

@brwe brwe removed the review label Aug 4, 2014

brwe added a commit that referenced this pull request Aug 4, 2014

Add parameter to GET for checking if generated fields can be retrieved
Fields of type `token_count`, `murmur3`, `_all` and `_field_names` are generated only when indexing.
If a GET requests accesses the transaction log (because no refresh
between indexing and GET request) then these fields cannot be retrieved at all.
Before the behavior was so:

`_all, _field_names`: The field was siletly ignored
`murmur3, token_count`: `NumberFormatException` because GET tried to parse the values from the source.

In addition, if these fields were not stored, the same behavior occured if the fields were
retrieved with GET after a `refresh()` because here also the source was used to get the fields.

Now, GET accepts a parameter `ignore_errors_on_generated_fields` which has
the following effect:
- Throw exception with meaningful error message explaining the problem if set to false (default)
- Ignore the field if set to true
- Always ignore the field if it was not set to stored

This changes the behavior for `_all` and `_field_names` as now an Exception is thrown if a user
tries to GET them before a `refresh()`.

closes #6676
closes #6973

@brwe brwe closed this in 5706858 Aug 4, 2014

brwe added a commit that referenced this pull request Sep 8, 2014

Add parameter to GET for checking if generated fields can be retrieved
Fields of type `token_count`, `murmur3`, `_all` and `_field_names` are generated only when indexing.
If a GET requests accesses the transaction log (because no refresh
between indexing and GET request) then these fields cannot be retrieved at all.
Before the behavior was so:

`_all, _field_names`: The field was siletly ignored
`murmur3, token_count`: `NumberFormatException` because GET tried to parse the values from the source.

In addition, if these fields were not stored, the same behavior occured if the fields were
retrieved with GET after a `refresh()` because here also the source was used to get the fields.

Now, GET accepts a parameter `ignore_errors_on_generated_fields` which has
the following effect:
- Throw exception with meaningful error message explaining the problem if set to false (default)
- Ignore the field if set to true
- Always ignore the field if it was not set to stored

This changes the behavior for `_all` and `_field_names` as now an Exception is thrown if a user
tries to GET them before a `refresh()`.

closes #6676
closes #6973

@clintongormley clintongormley changed the title GET API: Add parameter to GET for checking if generated fields can be retrieved Add parameter to GET API for checking if generated fields can be retrieved Jun 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.