Transport: allow to de-serialize arbitrary objects given their name #12571

javanna · 2015-07-31T08:40:16Z

This commit makes it possible to serialize arbitrary objects by having them extend Writeable. When reading them though, we need to be able to identify which object we have to create, based on its name. This is useful for queries once we move to parsing on the coordinating node, as well as with aggregations and so on.

This introduces a new abstraction called NamedWriteable, which StreamOutput and StreamInput support through the writeNamedWriteable and readNamedWriteable methods. It also introduces a NamedWriteableRegistry, where named writeable prototypes need to be registered so that we can retrieve the proper prototype for a given name and then de-serialize the object by calling readFrom on it.
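
To illustrate the prototype lookup described above, here is a minimal, self-contained sketch. It is not the code from this PR: NamedThing, TermThing, PrototypeRegistry and RegistrySketch are hypothetical stand-ins, and plain java.io data streams are assumed in place of StreamInput and StreamOutput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for NamedWriteable: a name plus write/read hooks.
interface NamedThing {
    String getName();
    void writeTo(DataOutputStream out) throws IOException;
    // Called on a registered prototype; returns a freshly read instance.
    NamedThing readFrom(DataInputStream in) throws IOException;
}

// Hypothetical concrete type, registered under the name "term".
final class TermThing implements NamedThing {
    static final TermThing PROTOTYPE = new TermThing("");
    final String value;
    TermThing(String value) { this.value = value; }
    @Override public String getName() { return "term"; }
    @Override public void writeTo(DataOutputStream out) throws IOException { out.writeUTF(value); }
    @Override public NamedThing readFrom(DataInputStream in) throws IOException {
        return new TermThing(in.readUTF());
    }
}

// Hypothetical stand-in for NamedWriteableRegistry: maps a name to its prototype.
final class PrototypeRegistry {
    private final Map<String, NamedThing> prototypes = new HashMap<>();

    void register(NamedThing prototype) {
        if (prototypes.putIfAbsent(prototype.getName(), prototype) != null) {
            throw new IllegalArgumentException("already registered [" + prototype.getName() + "]");
        }
    }

    NamedThing prototype(String name) {
        NamedThing prototype = prototypes.get(name);
        if (prototype == null) {
            throw new IllegalArgumentException("unknown named writeable [" + name + "]");
        }
        return prototype;
    }
}

final class RegistrySketch {
    // Writes the name first, then the object's own content.
    static byte[] write(NamedThing thing) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(thing.getName());
            thing.writeTo(out);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the name, looks up the prototype, and lets it read the rest.
    static NamedThing read(PrototypeRegistry registry, byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return registry.prototype(in.readUTF()).readFrom(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The name travels on the wire ahead of the payload, which is what lets the reading side pick the right prototype before it knows the concrete class.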

We decided to streamline the support for NamedWriteables and make the related methods available across the board in StreamInput and StreamOutput. That said, the new write* and read* methods are package private, so they can be tested but won't be made public. The idea is to add specific methods once we have named writeables to stream, e.g.:

public QueryBuilder readQuery() {
    return readNamedWriteable("query");
}

and

public void writeQuery(QueryBuilder queryBuilder) {
    writeNamedWriteable("query", queryBuilder);
}

The above methods cannot be added yet as neither queries nor aggs are streamable yet.

colings86 · 2015-07-31T12:32:52Z

Not sure how it could be done without producing a massive PR, but it's a shame that anything using the NamedWriteableAwareStreamInput needs to cast the StreamInput it receives.

Also, I wonder if we should have a NamedWriteableAwareStreamOutput so we can be sure that classes using NamedWriteable write the object properly. It would have writeNamedWriteable(NamedWriteable), writeOptionalNamedWriteable(NamedWriteable) and writeNamedWriteableArray(NamedWriteable[])?

It seems like this change is going to get complicated due to the mechanism for determining when to wrap the StreamInput, so I agree that maybe we should explore how big the context argument option is?

javanna · 2015-07-31T12:50:57Z

> Also, I wonder if we should have a NamedWriteableAwareStreamOutput so we can be sure that classes using NamedWriteable write the object properly. It would have writeNamedWriteable(NamedWriteable), writeOptionalNamedWriteable(NamedWriteable) and writeNamedWriteableArray(NamedWriteable[])?

That is why we have the new serializer object that exposes the methods to read and write named writeables; you effectively have to go through it to read and write them. That said, I agree with you that the casting is not great, and the current wrapping of the stream is even worse :)

I am all for adding a context argument to all readFrom methods at this point. The only condition is that the context needs to expose final objects only and must not change its state while reading.

Let's see what @jpountz and @s1monw think about this.

javanna · 2015-08-04T15:54:07Z

After talking to @s1monw and @jpountz we decided to go back to something closer to the original implementation (what we have in the query-refactoring branch). We wrap the stream (only in the case of requests) and named writeables are supported across the board. The default registry is empty if the stream is not wrapped with one that has a non-empty registry. Also, we decided to expose specific readQuery and readAggregation methods in the future rather than the generic readNamedWriteable and writeNamedWriteable methods. I updated the description of the PR accordingly and removed the WIP label; this is ready for review now.

jpountz · 2015-08-05T10:11:48Z

core/src/main/java/org/elasticsearch/common/io/stream/FilterStreamInput.java

+
+    @Override
+    public StreamInput setVersion(Version version) {
+        return delegate.setVersion(version);


the builder pattern is trappy here: I think this should call delegate.setVersion() and then return this? I quickly looked and I don't see many call sites relying on the fact that StreamInput.setVersion returns itself, so maybe we can make it void to remove the trap?

indeed good catch...
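
The trap discussed in this thread can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the PR's code: MiniStream, TrappyFilter and FixedFilter are hypothetical names, and an int stands in for Version.

```java
// Hypothetical miniature of StreamInput with a builder-style setter.
class MiniStream {
    private int version;

    MiniStream setVersion(int version) {
        this.version = version;
        return this;
    }

    int getVersion() {
        return version;
    }
}

// Mirrors the reviewed snippet: forwards the call AND returns the delegate,
// so a caller that chains on the result silently loses the wrapper.
class TrappyFilter extends MiniStream {
    protected final MiniStream delegate;

    TrappyFilter(MiniStream delegate) {
        this.delegate = delegate;
    }

    @Override
    MiniStream setVersion(int version) {
        return delegate.setVersion(version); // returns the unwrapped stream
    }

    @Override
    int getVersion() {
        return delegate.getVersion();
    }
}

// The suggested fix: forward to the delegate, then return the wrapper itself.
class FixedFilter extends MiniStream {
    protected final MiniStream delegate;

    FixedFilter(MiniStream delegate) {
        this.delegate = delegate;
    }

    @Override
    MiniStream setVersion(int version) {
        delegate.setVersion(version);
        return this;
    }

    @Override
    int getVersion() {
        return delegate.getVersion();
    }
}
```

Making the setter void, as also suggested in the comment, would remove the trap entirely because there would be no return value to chain on.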

javanna · 2015-08-06T07:19:31Z

core/src/main/java/org/elasticsearch/common/io/stream/FilterStreamInput.java

+/**
+ * Wraps a {@link StreamInput} and associates it with a {@link NamedWriteableRegistry}
+ */
+public class FilterStreamInput extends StreamInput {


I am now wondering if naming is still fine, maybe NamedWriteableAwareStreamInput or something along those lines (and shorter!) would be better?

We could keep this FilterStreamInput that just delegates everything, and add a new NamedWriteableAwareStreamInput (or just WriteableStreamInput) that extends it and adds the logic to look into the registry to deserialize?

good point, will do
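
The split proposed here, a plain delegating filter plus a registry-aware subclass, might look roughly like the sketch below. All names (PlainInput, DelegatingInput, RegistryAwareInput) are hypothetical, the registry is simplified to a map of reader functions, and a queue of strings is assumed in place of a byte stream.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Map;
import java.util.function.Function;

// Hypothetical miniature of StreamInput: it holds no registry,
// so reading a named object is unsupported by default.
class PlainInput {
    private final Deque<String> tokens;

    PlainInput(String... tokens) {
        this.tokens = new ArrayDeque<>(Arrays.asList(tokens));
    }

    String readString() {
        return tokens.remove();
    }

    Object readNamed() {
        throw new UnsupportedOperationException("no registry attached to this stream");
    }
}

// Plays the role of a filter that just delegates everything.
class DelegatingInput extends PlainInput {
    protected final PlainInput delegate;

    DelegatingInput(PlainInput delegate) {
        this.delegate = delegate;
    }

    @Override
    String readString() {
        return delegate.readString();
    }

    @Override
    Object readNamed() {
        return delegate.readNamed();
    }
}

// Adds only the registry lookup on top of the delegating filter.
class RegistryAwareInput extends DelegatingInput {
    private final Map<String, Function<PlainInput, Object>> readers;

    RegistryAwareInput(PlainInput delegate, Map<String, Function<PlainInput, Object>> readers) {
        super(delegate);
        this.readers = readers;
    }

    @Override
    Object readNamed() {
        String name = readString(); // the name precedes the payload
        Function<PlainInput, Object> reader = readers.get(name);
        if (reader == null) {
            throw new IllegalArgumentException("unknown named writeable [" + name + "]");
        }
        return reader.apply(this);
    }
}
```

Keeping the delegation and the lookup in separate classes means the plain filter stays reusable for wrappers that have nothing to do with named writeables.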

javanna · 2015-08-06T07:22:14Z

@jpountz I updated the PR according to your comments, it's ready for another round of review

jpountz · 2015-08-06T08:40:11Z

core/src/main/java/org/elasticsearch/common/io/stream/FilterStreamInput.java

+    }
+
+    @Override
+    <C> C readNamedWriteable(@SuppressWarnings("unused") Class<C> categoryClass) throws IOException {


I think it should be:
<C> C readNamedWriteable(@SuppressWarnings("unused") Class<? extends C> categoryClass) throws IOException {

also why is categoryClass tagged as unused? I see it used to look up the registry?

the annotation is a bug, it is needed only in the base class, will fix. I am not sure about the Class<? extends C>. If we are deserializing an arbitrary query, the base class QueryBuilder implements NamedWriteable, but when we read an arbitrary query we cannot expect a subclass of it, we just do QueryBuilder query = readQuery(); which will call readNamedWriteable(QueryBuilder.class). That is why I kept Class<C> as the type. Does it make sense?
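
For readers weighing the two signatures, here is a small sketch of how each one behaves at a call site like the one described. It is only an illustration under assumptions: QueryLike, TermQueryLike and CategoryTokenSketch are hypothetical names, and the decoded object is passed in directly instead of being read from a stream.

```java
// Hypothetical category base class and one concrete subclass.
class QueryLike {}

class TermQueryLike extends QueryLike {}

final class CategoryTokenSketch {
    // Shape kept in the PR: C is exactly the type of the class token.
    static <C> C readExact(Class<C> categoryClass, Object decoded) {
        return categoryClass.cast(decoded);
    }

    // Shape suggested in review: the token may be any subtype of C.
    static <C> C readBounded(Class<? extends C> categoryClass, Object decoded) {
        return categoryClass.cast(decoded);
    }

    public static void main(String[] args) {
        Object decoded = new TermQueryLike();

        // Both shapes accept the base-class token and yield a QueryLike.
        QueryLike exact = readExact(QueryLike.class, decoded);
        QueryLike bounded = readBounded(QueryLike.class, decoded);

        // Only the wildcard shape allows C to be pinned to a supertype
        // of the token explicitly.
        Object widened = CategoryTokenSketch.<Object>readBounded(QueryLike.class, decoded);

        System.out.println(exact == bounded && bounded == widened);
    }
}
```

When the caller always passes the category base class itself, as with QueryBuilder.class in the comment above, the two shapes appear to be interchangeable, which is consistent with keeping the simpler Class<C>.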

jpountz · 2015-08-06T08:53:23Z

core/src/main/java/org/elasticsearch/common/io/stream/StreamInput.java

+     * Default implementation throws {@link UnsupportedOperationException} as StreamInput doesn't hold a registry.
+     * Use {@link FilterInputStream} instead which wraps a stream and supports a {@link NamedWriteableRegistry} too.
+     */
+    <C> C readNamedWriteable(@SuppressWarnings("unused") Class<C> categoryClass) throws IOException {


s/Class<C>/Class<? extends C>/

I think it makes sense as-is for the reason stated above

javanna · 2015-08-06T09:34:57Z

I pushed another commit that should address the last review

jpountz · 2015-08-06T09:52:26Z

LGTM

This commit makes it possible to serialize arbitrary objects by having them extend Writeable. When reading them though, we need to be able to identify which object we have to create, based on its name. This is useful for queries once we move to parsing on the coordinating node, as well as with aggregations and so on. Introduced a new abstraction called NamedWriteable, which is supported by StreamOutput and StreamInput through writeNamedWriteable and readNamedWriteable methods. A new NamedWriteableRegistry is introduced also where named writeable prototypes need to be registered so that we are able to retrieve the proper instance of the writeable given its name and then de-serialize it calling readFrom against it. Closes elastic#12393

javanna added >enhancement v2.0.0-beta1 review and removed review v2.0.0-beta1 labels Jul 31, 2015

javanna mentioned this pull request Jul 31, 2015

Transport: allow to de-serialize arbitrary objects given their name #12393

Closed

javanna added the WIP label Jul 31, 2015

javanna added v2.0.0 and removed v2.0.0-beta1 labels Aug 4, 2015

javanna force-pushed the enhancement/named_writeable_serializer branch from b8597a8 to f97cba6 Compare August 4, 2015 15:41

javanna removed the WIP label Aug 5, 2015

jpountz reviewed Aug 5, 2015

javanna force-pushed the enhancement/named_writeable_serializer branch from fd6e6b0 to 566d64c Compare August 5, 2015 12:09

javanna reviewed Aug 6, 2015

jpountz reviewed Aug 6, 2015

javanna force-pushed the enhancement/named_writeable_serializer branch from a005b50 to e1e9e1a Compare August 6, 2015 10:26

javanna removed the review label Aug 6, 2015

javanna merged commit e1e9e1a into elastic:master Aug 6, 2015

javanna mentioned this pull request Aug 6, 2015

Merge NamedWriteable changes from master #12694

Merged

clintongormley added v2.0.0-beta1 and removed v2.0.0-beta1 v2.0.0 labels Aug 9, 2015
