Factor out sort values from InternalSearchHit #22080

cbuescher · 2016-12-09T16:10:27Z

This adds a fromXContent method and a unit test for the sort values that are part of
InternalSearchHit. To centralize the serialization and the xContent parsing and
rendering code, all relevant parts move to a new class that can be unit tested
much better in isolation. This is part of the preparation for parsing search
responses on the client side.

nik9000

Left a few minor things, LGTM though.

nik9000 · 2016-12-09T16:28:10Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+            sortValues = new Object[0];
+        }
+    }
+


Can you move writeTo up here? I like to get reading and writing on the same page if possible.

nik9000 · 2016-12-09T16:28:38Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+    @Override
+    public void writeTo(StreamOutput out) throws IOException {
+        out.writeVInt(sortValues.length);
+        if (sortValues.length > 0) {


I don't think this if is needed.
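
A minimal sketch of why the guard is probably redundant, assuming the guarded block only loops over the values. It uses plain `java.io` streams as a stand-in for `StreamOutput`, and the class and method names are hypothetical, not the PR's code. Looping over an empty array writes nothing, so both variants emit the same bytes:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for the writeTo logic; DataOutputStream replaces StreamOutput.
class GuardDemo {
    // Writes the length prefix, then the values, optionally wrapped in the length guard.
    static byte[] write(int[] values, boolean guarded) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            if (!guarded || values.length > 0) {
                for (int value : values) {
                    out.writeInt(value);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        int[] empty = new int[0];
        // Both variants produce only the length prefix for an empty array.
        System.out.println(Arrays.equals(write(empty, true), write(empty, false)));
    }
}
```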

nik9000 · 2016-12-09T16:31:16Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+        }
+    }
+
+    public boolean xContentEquals(ToXContentToBytes other) {


I think I'd prefer to leave this out and do assertEquals(this.toString(), other.toString()) in the test.

javanna

left a few comments but I like it

javanna · 2016-12-12T16:08:29Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+            sortValues = new Object[0];
+        }
+    }
+


javanna · 2016-12-12T16:09:11Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+        assert parser.currentToken() == XContentParser.Token.FIELD_NAME;
+        assert parser.currentName().equals(Fields.SORT);
+        XContentParser.Token token = parser.nextToken();
+        assert token == XContentParser.Token.START_ARRAY;


shall these be exceptions? I don't have a strong opinion but I got this same comment from Tanguy in another PR.

Either way is fine. I will change it to what you did in #22082.

javanna · 2016-12-12T16:10:41Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+    }
+
+    public SearchSortValues(Object[] sortValues, DocValueFormat[] sortValueFormats) {
+        this.sortValues = Arrays.copyOf(sortValues, sortValues.length);


shall we check that sortValueFormats is not null? also sortValues?
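
One way such checks could look, as a sketch only: `SortValuesHolder` is a hypothetical simplified class, and a `String[]` stands in for `DocValueFormat[]` so the example compiles without Elasticsearch on the classpath.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical simplified holder; the real constructor takes DocValueFormat[].
class SortValuesHolder {
    private final Object[] sortValues;

    SortValuesHolder(Object[] sortValues, String[] sortValueFormats) {
        // Fail fast with a descriptive message instead of a late NullPointerException.
        Objects.requireNonNull(sortValues, "sort values must not be null");
        Objects.requireNonNull(sortValueFormats, "sort value formats must not be null");
        // Defensive copy, as in the constructor under review.
        this.sortValues = Arrays.copyOf(sortValues, sortValues.length);
    }

    Object[] sortValues() {
        return Arrays.copyOf(sortValues, sortValues.length);
    }
}
```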

javanna · 2016-12-12T16:12:21Z

core/src/main/java/org/elasticsearch/search/internal/InternalSearchHit.java

@@ -86,7 +82,7 @@

    private Map<String, HighlightField> highlightFields = null;

-    private Object[] sortValues = EMPTY_SORT_VALUES;


seems like this empty array could be useful, shall we rather have a SearchSortValues.EMPTY default ?
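
A rough sketch of what a shared empty default might look like. Both classes are hypothetical simplifications, not the PR's `SearchSortValues` or `InternalSearchHit`:

```java
// Hypothetical simplified value class with a shared, immutable empty instance.
class SortValuesSketch {
    static final SortValuesSketch EMPTY = new SortValuesSketch(new Object[0]);

    private final Object[] values;

    SortValuesSketch(Object[] values) {
        this.values = values.clone();
    }

    Object[] values() {
        // Return a copy so callers cannot mutate the shared EMPTY instance.
        return values.clone();
    }
}

// Hypothetical hit class that defaults to the shared empty instance instead of null.
class HitSketch {
    private SortValuesSketch sortValues = SortValuesSketch.EMPTY;

    SortValuesSketch sortValues() {
        return sortValues;
    }

    void sortValues(SortValuesSketch sortValues) {
        this.sortValues = sortValues;
    }
}
```

With this shape, callers never need a null check before reading the sort values of a hit.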

javanna · 2016-12-12T16:13:21Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+    @Override
+    public void writeTo(StreamOutput out) throws IOException {
+        out.writeVInt(sortValues.length);
+        if (sortValues.length > 0) {


javanna · 2016-12-12T16:13:53Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+        }
+    }
+
+    public boolean xContentEquals(ToXContentToBytes other) {


javanna · 2016-12-12T16:14:42Z

core/src/test/java/org/elasticsearch/search/internal/SearchSortValuesTests.java

+        valueSuppliers.add(() -> randomByte());
+        valueSuppliers.add(() -> randomShort());
+        valueSuppliers.add(() -> randomBoolean());
+        valueSuppliers.add(() -> frequently() ? randomAsciiOfLengthBetween(1, 30) : randomRealisticUnicodeOfCodepointLength(30));


interesting, more elegant than a switch. I may steal this :)
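
The pattern can be sketched without the test framework. This version assumes `java.util.Random` in place of the randomized-testing helpers, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical sketch: a list of suppliers replaces a switch over a random index.
class RandomValues {
    static List<Supplier<Object>> suppliers(Random random) {
        List<Supplier<Object>> valueSuppliers = new ArrayList<>();
        valueSuppliers.add(() -> (byte) random.nextInt(256));
        valueSuppliers.add(() -> (short) random.nextInt(65536));
        valueSuppliers.add(() -> random.nextBoolean());
        valueSuppliers.add(() -> Long.toHexString(random.nextLong()));
        return valueSuppliers;
    }

    // Picks one supplier at random and asks it for a value.
    static Object randomValue(Random random, List<Supplier<Object>> valueSuppliers) {
        return valueSuppliers.get(random.nextInt(valueSuppliers.size())).get();
    }
}
```

Adding a new value type is then a single `add` call rather than a new `case` plus an adjusted upper bound.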

cbuescher · 2016-12-12T18:27:37Z

@nik9000 @javanna I added a commit that addresses your last comments. For now I exclude SMILE from the xContent types being tested because it makes round-trip testing very hard, even if we only compare the JSON string representation of the object before and after parsing xContent. The problem is that when SMILE sends float values and we don't explicitly cast them back to float, they are parsed as double and end up with a slightly different value. I wrote a small test showing this in isolation here: https://github.com/cbuescher/elasticsearch/blob/smileFloatIssue/core/src/test/java/org/elasticsearch/common/xcontent/SmileFloatTests.java
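
The effect can be reproduced in plain Java without any SMILE parser. The sketch below assumes the parser simply widens the float to a double; the class name is hypothetical:

```java
// Hypothetical demo: widening a float to a double changes its printed form.
class FloatWidening {
    static double widen(float value) {
        return value; // implicit widening primitive conversion
    }

    public static void main(String[] args) {
        float original = 0.1f;
        double widened = widen(original);
        System.out.println(Float.toString(original));     // prints 0.1
        System.out.println(Double.toString(widened));     // prints 0.10000000149011612
        System.out.println((float) widened == original);  // prints true
    }
}
```

The widened value is exact, so casting back to float recovers the original, but its string form no longer matches what was written.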

javanna · 2016-12-14T21:31:32Z

core/src/test/java/org/elasticsearch/search/internal/SearchSortValuesTests.java

+        SearchSortValues sortValues = createTestItem();
+        // exclude SMILE: its difficult to compare exact float values after parsing
+        // they get returned as doubles with a slightly different value if not cast back to float (which we cannot do due to
+        // losing type information on the rest layer)


I did quite some digging around this and found that the parsers for SMILE, CBOR, JSON and YAML each parse floats differently: either as float, as double, or as double but with double precision. This makes comparisons a bit cumbersome. The issue you were seeing with SMILE is that once you parse a float, it is parsed into a double variable, which has double precision. When you print that out it becomes a double, which, when reparsed, is no longer a float. The solution isn't to disable SMILE testing though. I resorted to comparing the map representations, essentially by parsing both sides back into maps. That also solves potential ordering issues, as JSON key ordering doesn't matter. See https://github.com/elastic/elasticsearch/pull/22082/files#diff-3e73f4760da40a9d55345c94edf833c2R108. I think that we have to move that assertion to some common place, not sure yet where.

I changed this using the common helpers you pointed me to, left TODOs where we need to remove those once they are moved to a common class.
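
The ordering half of that argument can be sketched with plain collections. This is not the helper from #22082, only a hypothetical illustration that map equality ignores key order while string comparison does not:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: two maps with the same entries in different insertion order.
class MapComparison {
    static Map<String, Object> idFirst() {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put("_id", "1");
        map.put("sort", Arrays.asList(42L, "abc"));
        return map;
    }

    static Map<String, Object> sortFirst() {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put("sort", Arrays.asList(42L, "abc"));
        map.put("_id", "1");
        return map;
    }

    public static void main(String[] args) {
        // String forms differ because insertion order differs.
        System.out.println(idFirst().toString().equals(sortFirst().toString())); // prints false
        // Map equality only compares the entries.
        System.out.println(idFirst().equals(sortFirst())); // prints true
    }
}
```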

javanna · 2016-12-14T21:33:34Z

core/src/test/java/org/elasticsearch/search/internal/SearchSortValuesTests.java

+        builder.startObject(); // we need to wrap xContent output in proper object to create a parser for it
+        builder = sortValues.toXContent(builder, ToXContent.EMPTY_PARAMS);
+        builder.endObject();
+        return builder.string();


can we use Strings.toString instead? is pretty printing important here?

javanna · 2016-12-14T21:34:03Z

core/src/test/java/org/elasticsearch/search/internal/SearchSortValuesTests.java

+        builder = sortValues.toXContent(builder, ToXContent.EMPTY_PARAMS);
+        builder.endObject();
+
+        XContentParser parser = xcontentType.xContent().createParser(builder.bytes());


in general, we should rather use createParser from ESTestCase (it was added yesterday I think)

javanna · 2016-12-14T21:34:41Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+        }
+        parser.nextToken();
+        if (parser.currentToken() != XContentParser.Token.START_ARRAY) {
+            throw new ParsingException(parser.getTokenLocation(), "expected [" + XContentParser.Token.START_ARRAY + "] token");


use the static methods in XContentParserUtils instead

cbuescher · 2016-12-19T11:52:15Z

@javanna I hope I addressed your last comments, using two (slightly modified) helper methods from your own PR now but leaving a note to remove these once they are available elsewhere. Not sure if we should hold this PR until those are merged.

javanna · 2016-12-19T14:34:58Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+    public static SearchSortValues fromXContent(XContentParser parser) throws IOException {
+        XContentParserUtils.ensureFieldName(parser, parser.currentToken(), Fields.SORT);
+        parser.nextToken();
+        XContentParserUtils.ensureType(XContentParser.Token.START_ARRAY, parser.currentToken(), () -> parser.getTokenLocation());


you can just do parser.nextToken() as part of this line.

I thought about it but decided against it for better readability, I found advancing of the token gets hidden too much otherwise. Matter of taste I guess.

that's ok with me

maybe you could then save the returned token rather than retrieve it back from the parser? super minor nit though, not even sure what it helps

javanna · 2016-12-19T14:37:29Z

core/src/main/java/org/elasticsearch/search/internal/SearchSortValues.java

+        XContentParserUtils.ensureFieldName(parser, parser.currentToken(), Fields.SORT);
+        parser.nextToken();
+        XContentParserUtils.ensureType(XContentParser.Token.START_ARRAY, parser.currentToken(), () -> parser.getTokenLocation());
+        return new SearchSortValues(parser.list().toArray());


parser.list also supports nested arrays and objects. can they appear as part of sort values?

I think we may have the same "problem" that we have in get with stored fields, where we print out potentially anything, and when parsing it back we have to figure out what formats we actually support. Note also that parser list supports binary objects, but it returns byte[] for them, which makes it cumbersome to test them, hence I am assuming that part does not get tested at the moment.

This is somewhat limited here by what can go through writeTo/readFrom. These are essentially single values, and they should all be covered by the tests.

I see, but with serialization we can always read whatever we wrote in the same format. With parsing on the other hand, we can write many different types but we can parse only a few, e.g. strings, numbers and not much more. That is why the generic methods sometimes hide complexity, especially because we end up supporting stuff that may not be needed and is not tested. Maybe it's ok in this case though. Can you double check if byte[] can be a potential value here?

I had missed that we know exactly which formats are supported thanks to the serialization methods that don't use a generic write/read method. So no byte[], no objects etc. I do wonder if it's cleaner to read the list manually and call parser.objectText() to read each value.
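
A sketch of the stricter alternative: accept only scalar values and reject everything a generic list parser might also return. The accepted set here (strings, numbers, booleans) is an assumption for illustration, and the class is hypothetical; the authoritative list is whatever the serialization methods support.

```java
import java.util.List;

// Hypothetical validation step applied to the output of a generic list parser.
class SortValueCheck {
    // Assumed set of supported value types; adjust to match writeTo/readFrom.
    static boolean isSupported(Object value) {
        return value instanceof String || value instanceof Number || value instanceof Boolean;
    }

    static Object[] toSortValues(List<?> parsed) {
        for (Object value : parsed) {
            if (!isSupported(value)) {
                String type = value == null ? "null" : value.getClass().getSimpleName();
                throw new IllegalArgumentException("unsupported sort value type [" + type + "]");
            }
        }
        return parsed.toArray();
    }
}
```

Nested lists, maps and `byte[]` values then fail loudly at parse time instead of silently flowing into the sort values.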

javanna

I left some comments, the main point is about formats supported when parsing stuff back, I think we have to figure out exactly what we support and test it rather than support anything which seems to be what we do at the moment.

cbuescher · 2016-12-19T21:17:52Z

@javanna I updated this using the helper methods introduced in ElasticsearchAssertions and XContentHelper. What happened to XContentParserUtils.ensureFieldName()? I had to go back to my previous version of manual checks.

javanna · 2016-12-20T08:04:42Z

What happened to XContentParserUtils.ensureFieldName()? I had to go back to my previous version of manual checks.

it was removed as there were no usages for it. Feel free to add it back if it's going to be useful in other places too.

javanna

left a small comment, LGTM otherwise

javanna · 2016-12-20T12:36:43Z

core/src/main/java/org/elasticsearch/common/xcontent/XContentParserUtils.java

+     */
+    public static void ensureFieldName(XContentParser parser, Token token, String fieldName) throws IOException {
+        ensureExpectedToken(Token.FIELD_NAME, token, parser::getTokenLocation);
+        String current = parser.currentName() != null ? parser.currentName() : "<null>";


does it make sense to compare to something?

What do you mean? current is either the current tokens field name or <null> which I thought serves for printing the error message. I might be missing something.

I think if the currentName is null we have a bigger problem. why do we do the comparison even? have you tried passing in fieldName set to ? if the current token is a field name, currentName should not be null ever. I would treat that corner case differently.

Okay, got it. Makes sense.
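
A sketch of the helper without the null fallback, under the assumption discussed above that a FIELD_NAME token always has a name. `Token` and `TokenSource` are hypothetical minimal stand-ins, not the real `XContentParser` types:

```java
// Hypothetical stand-alone version of the check; not the XContentParserUtils code.
class FieldNameCheck {
    enum Token { FIELD_NAME, START_ARRAY, VALUE_STRING }

    // Hypothetical minimal view of a parser positioned on some token.
    interface TokenSource {
        Token currentToken();
        String currentName();
    }

    static TokenSource at(Token token, String name) {
        return new TokenSource() {
            @Override public Token currentToken() { return token; }
            @Override public String currentName() { return name; }
        };
    }

    static void ensureFieldName(TokenSource parser, String fieldName) {
        if (parser.currentToken() != Token.FIELD_NAME) {
            throw new IllegalStateException("expected [FIELD_NAME] but found [" + parser.currentToken() + "]");
        }
        // No null fallback: a FIELD_NAME token without a name would be a parser bug.
        String current = parser.currentName();
        if (!current.equals(fieldName)) {
            throw new IllegalStateException("expected field [" + fieldName + "] but found [" + current + "]");
        }
    }
}
```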

cbuescher · 2016-12-21T11:32:37Z

On 5.x with 9404411

* master:
  Simplify Unicast Zen Ping (elastic#22277)
  Replace IndicesQueriesRegistry (elastic#22289)
  Fixed document mistake and fit for 5.1.1 API
  [TEST] improve error message in ESTestCase#assertWarnings
  [TEST] remove deleted test classes from checkstyle suppressions
  [TEST] make ESSingleNodeTestCase tests repeatable (elastic#22283)
  Link for setting page in elasticsearch.yml is outdated
  Factor out sort values from InternalSearchHit (elastic#22080)
  Add ID for percolate query to Java API docs
  x_refresh.yaml tests should use unique index names and doc ids to ease debugging
  IndicesStoreIntegrationIT should not use start recovery sending as an indication that the recovery started
  Added base class for testing aggregators and some initial tests for `terms`, `top_hits` and `min` aggregations.
  Add link to foreach processor to ingest-attachment.asciidoc

cbuescher added review v6.0.0-alpha1 >non-issue labels Dec 9, 2016

nik9000 approved these changes Dec 9, 2016

javanna added :Java High Level REST Client >enhancement >non-issue and removed :Internal >non-issue >enhancement labels Dec 9, 2016

javanna approved these changes Dec 12, 2016

javanna reviewed Dec 14, 2016

cbuescher force-pushed the addParsing-sortValues branch from adda43b to a0fede6 Compare December 19, 2016 11:47

javanna reviewed Dec 19, 2016

javanna requested changes Dec 19, 2016

cbuescher added 3 commits December 19, 2016 21:25

review comments

a7ae785

Addressing review comment

2c8f26c

cbuescher force-pushed the addParsing-sortValues branch from d5b9cbf to 668bfcd Compare December 19, 2016 21:04

Using common helper methods

6e9a583

cbuescher force-pushed the addParsing-sortValues branch from 668bfcd to 6e9a583 Compare December 19, 2016 21:11

javanna approved these changes Dec 20, 2016

Re-adding XContentParserUtils.ensureFieldName helper

5ac0054

javanna reviewed Dec 20, 2016

No null checks in ensureFieldName

5b0d3f1

cbuescher merged commit bdecbb5 into elastic:master Dec 21, 2016

cbuescher added the v5.2.0 label Dec 21, 2016
