Deprecate reference to _type in lookup queries #37016

mayya-sharipova · 2018-12-28T13:42:26Z

Relates to #35190

elasticmachine · 2018-12-28T13:43:47Z

Pinging @elastic/es-search

* Add integration test coverage for typeless lookup queries.
* Fix a bug around typeless terms lookup queries.
* Make sure to provide non-deprecated methods in QueryBuilders.
* Make the deprecation messages more accurate.
* Update the query DSL documentation.
* Avoid creating duplicate query builder tests. Don't use 'ids' when generating random queries to avoid types warnings.

cbuescher

@jtibshirani I left a few comments, let me know what you think. Since I haven't done many reviews related to type removal, I think it would be good to also get a second reviewer on this.

cbuescher · 2019-01-07T10:55:16Z

server/src/main/java/org/elasticsearch/indices/TermsLookup.java

+        if (out.getVersion().onOrAfter(Version.V_7_0_0)) {
+            out.writeOptionalString(type);
+        } else {
+            out.writeString(type);


I think this might break if type is "null". We should write some dummy type to the stream then?

My understanding is that before this change, type could never be null in this class (it is marked as final, and the constructor validates that it is not null). I will add a comment here to clarify.

This also reminds me of something I was hoping to get your thoughts on. With this implementation, it is not possible to use a typeless terms lookup query in a mixed 6.x and 7.0 cluster. The error will occur when a 6.x node tries to decode the optional string that the 7.0 node is sending here. Does this seem reasonable in terms of behavior + error messaging (not sure if there is any precedent for such a change)?

The error will occur when a 6.x node tries to decode the optional string that the 7.0 node is sending here.

Exactly what I thought, with the exception that the problem happens when a 7.0 node sends an object with a null type. I think we should make sure this cannot happen, though. If we want to keep the current decision in this PR to allow null references for types in the POJOs, we need to send some dummy type when serializing to a 6.x node. We should also add unit tests for this case.

Got it, I understand your concern now. I had assumed that passing a dummy type would actually cause more confusion than simply erroring out. For example, say a user had an index with document type my_type, and a mixed-node cluster. Then a typeless terms lookup query will always succeed if the lookup document is on a 7.0 node, but show no error and not find the document if it lives on a 6.x node. This is because the type name _doc will be used for the 6.x lookup, and we will fail to find the lookup document without any indication as to why. Using _doc in get requests to mean 'any type' is functionality we only support on 7.0.

I'm now thinking it would be clearest if I explicitly checked for a null type here, and errored with the message "Typeless terms lookup queries are not supported if any node is running a version before 7.0."

This is because the type name _doc will be used for the 6.x lookup, and we will fail to find the lookup document without any indication as to why.

Is there a chance to use the dummy "_doc" type in order to not break serialization, but then infer the "right" type name when on the shard for this case? As far as I understand the types removal process, all 6.x indices should only have one type left, even if it is named differently?

Yes, that would be possible if we backported the part of #35790 that applied to 'get' requests. I would prefer not to do that if possible because it is not so straightforward to backport, especially since 6.x can contain some indices with multiple types, as it supports indices created in 5.x. I also don't think that using typeless terms lookups in a mixed-node cluster will be a common enough occurrence to warrant the extra work + complexity.

Thanks for explaining, that is unfortunate.
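The "dummy type" concern raised in this thread can be illustrated with a toy model. The sketch below is not Elasticsearch code: `NodeStoreSketch` and its `docMeansAnyType` flag are hypothetical names, and the only behaviour it assumes is what the thread states, namely that a 7.0 node treats `_doc` in a get request as "any type" while a 6.x node requires an exact type match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of one node's document store. Hypothetical, not Elasticsearch code. */
final class NodeStoreSketch {
    // Documents keyed by id; the value is the type the document was indexed with.
    private final Map<String, String> typeById = new HashMap<>();
    // true models the 7.0 behaviour where _doc means "any type"; false models 6.x.
    private final boolean docMeansAnyType;

    NodeStoreSketch(boolean docMeansAnyType) {
        this.docMeansAnyType = docMeansAnyType;
    }

    void index(String type, String id) {
        typeById.put(id, type);
    }

    /** Returns the stored type if a get for (type, id) would find the document. */
    Optional<String> get(String type, String id) {
        String storedType = typeById.get(id);
        if (storedType == null) {
            return Optional.empty();
        }
        boolean matches = storedType.equals(type) || (docMeansAnyType && "_doc".equals(type));
        return matches ? Optional.of(storedType) : Optional.empty();
    }
}
```

With a lookup document indexed under `my_type` on both nodes, `get("_doc", id)` finds it on the 7.0-style store and silently returns nothing on the 6.x-style store, which is the confusing outcome the explicit error is meant to avoid.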

server/src/test/java/org/elasticsearch/index/query/RandomQueryBuilder.java

server/src/test/java/org/elasticsearch/index/query/TermsQueryBuilderTests.java

jtibshirani · 2019-01-07T19:25:24Z

Thanks @cbuescher for the helpful comments, I've pushed some commits to address them. I understand your thoughts around a second reviewer, just wanted to note for context that (1) these changes are quite different from the types deprecation work we've done so far, and (2) both Mayya and I have taken a close look at them.

cbuescher

Left a minor nit, but I think the mixed-cluster serialization issue that's left open needs addressing, and we probably also need a unit test that checks we can send/receive query builders from a 6.x node.

server/src/test/java/org/elasticsearch/index/query/RandomQueryBuilder.java

jtibshirani · 2019-01-07T23:18:47Z

I think this is ready for another look.

cbuescher · 2019-01-08T12:52:14Z

server/src/main/java/org/elasticsearch/indices/TermsLookup.java

+        } else {
+            if (type == null) {
+                throw new IllegalArgumentException("Typeless [terms] lookup queries are not supported if any " +
+                    "node is running a version before 7.0.");


I don't really like throwing an IAE in the serialization layer, but I see that we don't have the information about the output stream's version elsewhere. I will try to look into how this error propagates to, e.g., a user making a REST call, and I'd like to discuss again the alternative of using a "dummy" type (with the downside that this won't match lookup items with a non-default 6.x type name). In any case, we should add something to the docs or some "known issues" document to capture this edge case.

Sounds good, I can also take a look at this by trying it out in REST tests. I will ping you offline to further discuss the idea of using a 'dummy' type name.
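For readers following the two diff hunks on `TermsLookup.java`, here is a self-contained sketch of how the version-gated write and the null check appear to fit together. `SketchOutput` is a hypothetical stand-in for the real stream class, and the integer `majorVersion` replaces the `Version.V_7_0_0` comparison; only the branch structure and the error message are taken from the hunks above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a versioned output stream; records what was written. */
final class SketchOutput {
    final int majorVersion;
    final List<String> written = new ArrayList<>();

    SketchOutput(int majorVersion) {
        this.majorVersion = majorVersion;
    }

    /** Writes a mandatory string; null is a programming error here. */
    void writeString(String value) {
        if (value == null) {
            throw new NullPointerException("value must not be null");
        }
        written.add(value);
    }

    /** Writes a presence marker, followed by the value only when it is present. */
    void writeOptionalString(String value) {
        written.add(Boolean.toString(value != null));
        if (value != null) {
            written.add(value);
        }
    }
}

final class TermsLookupWriteSketch {
    /** Mirrors the branch structure shown in the review hunks. */
    static void writeType(SketchOutput out, String type) {
        if (out.majorVersion >= 7) {
            out.writeOptionalString(type);
        } else {
            if (type == null) {
                throw new IllegalArgumentException("Typeless [terms] lookup queries are not supported if any "
                    + "node is running a version before 7.0.");
            }
            out.writeString(type);
        }
    }
}
```

The design trade-off is the one debated above: the pre-7.0 branch fails loudly instead of sending a placeholder type that an older node might interpret literally.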

markharwood · 2019-01-08T17:10:26Z

server/src/main/java/org/elasticsearch/index/query/GeoShapeQueryBuilder.java

-        }
-        if (indexedShapeId != null && indexedShapeType == null) {
-            throw new IllegalArgumentException("indexedShapeType is required if indexedShapeId is specified");
+            throw new IllegalArgumentException("either shapeBytes or indexedShapeId is required");


"shapeBytes" should be "shape"

markharwood · 2019-01-08T17:15:24Z

server/src/main/java/org/elasticsearch/index/query/MoreLikeThisQueryBuilder.java

@@ -912,9 +964,18 @@ public static MoreLikeThisQueryBuilder fromXContent(XContentParser parser) throw
        if (stopWords != null) {
            moreLikeThisQueryBuilder.stopWords(stopWords);
        }
+
+        if (!moreLikeThisQueryBuilder.isTypeless()) {


prefer == false to !

markharwood · 2019-01-08T17:30:13Z

server/src/main/java/org/elasticsearch/indices/TermsLookup.java

+        if (out.getVersion().onOrAfter(Version.V_7_0_0)) {
+            out.writeOptionalString(type);
+        } else {
+            out.writeString(type);


In the 36549 PR for the IndexRequest class I opted to serialize using the result of a call to type() which will return _doc if null. This made 7.x+6.x nodes happy to talk to each other.
It was important that IndexRequest kept null types in the calling client because type==null was the condition I used to know if a blank request should inherit a choice of type from a BulkRequest's choice of default.
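The approach described in this comment might look roughly like the sketch below. It is an illustration only: `TypeDefaultingSketch`, `inheritDefaultType`, and `hasExplicitType` are hypothetical names, not the actual IndexRequest or BulkRequest API. What it takes from the comment is that the field stays null internally, the accessor returns `_doc` when it is null, and the null state is what allows a bulk-level default to be inherited.

```java
/** Hypothetical request object that keeps a nullable type but never exposes null. */
final class TypeDefaultingSketch {
    static final String DEFAULT_TYPE = "_doc";

    // Stays null until a caller sets a type explicitly.
    private String type;

    TypeDefaultingSketch type(String type) {
        this.type = type;
        return this;
    }

    /** Accessor used for serialization: an unset type reads as _doc. */
    String type() {
        return type == null ? DEFAULT_TYPE : type;
    }

    /** True only when a caller chose a type; this is the "type == null" signal. */
    boolean hasExplicitType() {
        return type != null;
    }

    /** A surrounding bulk request can fill in its default for blank requests only. */
    void inheritDefaultType(String bulkDefault) {
        if (type == null && bulkDefault != null) {
            type = bulkDefault;
        }
    }
}
```

Serializing `type()` rather than the raw field means an older node always receives a concrete string, while the client side can still tell a blank request from one that explicitly asked for `_doc`.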

markharwood · 2019-01-08T17:35:22Z

server/src/test/java/org/elasticsearch/index/query/MoreLikeThisQueryBuilderTests.java

+        assertThat(query, instanceOf(MoreLikeThisQueryBuilder.class));
+
+        MoreLikeThisQueryBuilder mltQuery = (MoreLikeThisQueryBuilder) query;
+        if (!mltQuery.isTypeless()) {


markharwood · 2019-01-08T17:38:00Z

server/src/test/java/org/elasticsearch/index/query/TermsQueryBuilderTests.java

+        assertThat(query, CoreMatchers.instanceOf(TermsQueryBuilder.class));
+
+        TermsQueryBuilder termsQuery = (TermsQueryBuilder) query;
+        if (termsQuery.termsLookup() != null && termsQuery.termsLookup().type() != null) {


maybe add an isTypeless() abstraction like you did for MLT?
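One way the suggested abstraction could look is sketched below. `LookupSketch` and `TermsQuerySketch` are hypothetical stand-ins for TermsLookup and TermsQueryBuilder, and placing `isTypeless()` on the query class is an assumption; the condition is simply the negation of the check in the hunk above.

```java
/** Hypothetical stand-in for a terms lookup: a pointer to a document holding the terms. */
final class LookupSketch {
    private final String type; // null when the lookup was built without a type
    private final String id;

    LookupSketch(String type, String id) {
        this.type = type;
        this.id = id;
    }

    String type() {
        return type;
    }

    String id() {
        return id;
    }
}

/** Hypothetical stand-in for a terms query that may or may not use a lookup. */
final class TermsQuerySketch {
    private final LookupSketch lookup; // null for a plain terms query

    TermsQuerySketch(LookupSketch lookup) {
        this.lookup = lookup;
    }

    /** A query is typeless unless it has a lookup that names a type. */
    boolean isTypeless() {
        return lookup == null || lookup.type() == null;
    }
}
```

A test could then read `if (termsQuery.isTypeless() == false) { ... }`, which also follows the `== false` style preferred elsewhere in this review.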

jtibshirani · 2019-01-08T19:03:32Z

Thanks to both of you for the review. I pushed changes to address @markharwood's suggestions. I looked into changing the strategy for handling mixed-cluster serialization, and my intuition is to stick with the current approach:

* There is precedent for throwing an IllegalArgumentException within a writeTo method. Even within query builder classes, there are a couple examples (PercolateQueryBuilder and GeoShapeQueryBuilder).
* I tried a refactor where TermsLookup used _doc instead of a null type to indicate it was typeless. However, this made it hard to tell later whether it was constructed in a truly typeless manner, or just by setting type: _doc. I'm not able to avoid this issue by moving the deprecation check into TermsLookup parsing, because of subtleties of how TermsQueryBuilderTests works. In particular, if it were moved into TermsLookup, then we could emit a deprecation warning but fail parsing in testUnknownFields, resulting in an unexpected warning that is never covered by assertWarnings.
* We could introduce a more complex structure where _doc is used for serialization, but we separately track that the lookup is 'typeless', but I think this results in more complex code.
* I'm not actually able to replicate a situation where a terms lookup query is serialized to a 6.x node, because I think the coordinating node always fetches the document and rewrites it to a normal terms query before sending the search request to other nodes. So I'm not sure this case can even be hit outside of tests, and it seems good to take the option that results in the simplest code.
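The last point rests on the coordinating node resolving the lookup before anything is sent to other nodes. A minimal sketch of that idea follows, assuming the behaviour is as described in the comment; `TermsRewriteSketch`, `TermsQuery`, and the `fetchTerms` function are hypothetical and do not reflect the real rewrite API.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical model of rewriting a terms lookup into a plain terms query. */
final class TermsRewriteSketch {

    /** A terms query carries either literal values or a pointer to a lookup document. */
    static final class TermsQuery {
        final String field;
        final List<String> values; // literal terms, or null while unresolved
        final String lookupId;     // id of the lookup document, or null

        TermsQuery(String field, List<String> values, String lookupId) {
            this.field = field;
            this.values = values;
            this.lookupId = lookupId;
        }
    }

    /**
     * Resolves a lookup on the coordinating node. The returned query has no
     * lookup left, so nothing type-related would need to be serialized.
     */
    static TermsQuery rewrite(TermsQuery query, Function<String, List<String>> fetchTerms) {
        if (query.lookupId == null) {
            return query; // already a plain terms query
        }
        return new TermsQuery(query.field, fetchTerms.apply(query.lookupId), null);
    }
}
```

If the rewrite always happens first, the pre-7.0 branch of the serialization code would only ever see lookups in tests, which is the argument made above for keeping the simplest implementation.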

cbuescher · 2019-01-08T20:35:33Z

My intuition is to stick with the current approach

Then I'm fine with that. As far as I understand, users will still have the option to use types if anything on their side doesn't work without types in a mixed-cluster environment; they'd only have to live with the deprecation warnings. Is that correct?
In any case, I'd at least like to document this somewhere as a "known issue" (as in: terms lookups don't work in mixed clusters without types). I'm not sure if the documentation is the right place, maybe we have some extra documentation around types removal where we can put this? I don't know if we had a "known issues" document somewhere at some point, I cannot seem to find it but maybe you can check?

cbuescher

LGTM

jtibshirani · 2019-01-08T21:27:15Z

As far as I understand, users will still have the option to use types if anything on their side doesn't work without types in a mixed-cluster environment; they'd only have to live with the deprecation warnings. Is that correct?

Yes, that's correct.

I can't find a 'known issues' document, but I will create another section under removal_of_types.asciidoc about 6.7 and the mixed cluster set-up. I've made a note on the meta issue to remember to do this.

jtibshirani · 2019-01-09T00:55:01Z

@elasticmachine run gradle build tests 1

jtibshirani mentioned this pull request Dec 28, 2018

Implementation tracking for 7.0 types deprecation. #35190

Closed

48 tasks

mayya-sharipova added the >deprecation and :Search Foundations/Mapping labels Dec 28, 2018

mayya-sharipova added the v7.0.0 label Dec 28, 2018

jtibshirani changed the title ~~Deprecate reference to _type in lookup queries and aggs~~ Deprecate reference to _type in lookup queries Jan 3, 2019

jtibshirani force-pushed the deprecate-reference-to_type-aggregations-retrieving-fields branch 3 times, most recently from 0d159a3 to ab26db4 Compare January 4, 2019 01:26

jtibshirani requested review from cbuescher and markharwood January 4, 2019 01:37

mayya-sharipova and others added 2 commits January 4, 2019 11:27

Deprecate reference to _type in lookup queries

8e892f5

Relates to elastic#35190

jtibshirani force-pushed the deprecate-reference-to_type-aggregations-retrieving-fields branch from 3b25c64 to d7b0cd8 Compare January 4, 2019 19:27

cbuescher requested changes Jan 7, 2019

jtibshirani added 4 commits January 7, 2019 10:52

Add Javadoc to deprecated methods to suggest alternatives.

986930e

Switch back to using an ids query in RandomQueryBuilder.

96370d7

Avoid introducing a special case in TermsLookup#hashCode.

932e250

Add some clarifying comments.

6678b83

cbuescher reviewed Jan 7, 2019

server/src/test/java/org/elasticsearch/index/query/RandomQueryBuilder.java

jtibshirani added 6 commits January 7, 2019 12:50

Prefer to use Strings.EMPTY_ARRAY.

e176101

Give a clear error message when typeless terms lookup queries are used with pre-7.0 nodes.

7a6100a

Minor spacing issue.

12f5eaf

Test terms lookup serialization to 6.x nodes.

f3031ba

Make sure GeoShapeQueryBuilderTests covers typeless queries.

c9f48a3

Simplify the handling around null types for more_like_this queries.

806e957

Merge remote-tracking branch 'upstream/master' into deprecate-types-in-lookup-queries

8324f5f

cbuescher reviewed Jan 8, 2019

markharwood reviewed Jan 8, 2019

Small refactors in response to review comments.

13c4854

cbuescher approved these changes Jan 8, 2019

jtibshirani added 2 commits January 8, 2019 13:43

Fix an outdated message in GeoShapeQueryBuilderTests.

0bdbd71

Merge remote-tracking branch 'upstream/master' into deprecate-types-in-lookup-queries

61ad78d

jtibshirani merged commit ec32e66 into elastic:master Jan 9, 2019

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
