New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transport: allow to de-serialize arbitrary objects given their name #12571
Transport: allow to de-serialize arbitrary objects given their name #12571
Conversation
Not sure how it could be done without producing a massive PR but its a shame that anything using the NamedWriteableAwareStreamInput needs to cast the StreamInput it receives. Also I wonder if we should have a NamedWriteableAwareStreamOutput so we are sure that classes using NamedWritable are sure to write the object properly. So it would have writeNamedWritable(NamedWritable), writeOptionalNamedWritable(NamedWritable) and write NamedWritableArray(NamedWritable[])? It seems like this change is going to get complicated due to the mechanism for determining when to wrap the StreamInput so I agree that maybe we should explore how big the context argument option is? |
That is why we have the new serializer object that exposes the methods to read and write named writeables, you have to effectively go through it so you can read and write named writeables. That said I agree with you the casting is not great, and the current wrapping of the stream is even worse :) I am all for adding a context argument to all readFrom methods at this point. The only condition is that the context needs to expose final objects only and must not change its state while reading. |
b8597a8
to
f97cba6
Compare
After talking to @s1monw and @jpountz we decided to go back to something closer to the original implementation (what we have in the query-refactoring branch). We wrap the stream (only in case of request) and named writeables are supported across the board. The default registry is empty if the stream is not wrapped with one that has a non empty registry. Also we went for exposing specific readQuery and readAggregation method in the future rather than the generic readNamedWriteable and writeNamedWriteable methods. I updated the description of the PR accordingly and removed the WIP label, this is ready for review now. |
|
||
@Override | ||
public StreamInput setVersion(Version version) { | ||
return delegate.setVersion(version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the builder pattern is trappy here: I think this should call delegate.setVersion() and then return this? I quickly looked and I don't see many call sites relying on the fact that StreamInput.setVersion returns itself, so maybe we can make it void to remove the trap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed good catch...
fd6e6b0
to
566d64c
Compare
/** | ||
* Wraps a {@link StreamInput} and associates it with a {@link NamedWriteableRegistry} | ||
*/ | ||
public class FilterStreamInput extends StreamInput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am now wondering if naming is still fine, maybe NamedWriteableAwareStreamInput or something along those lines (and shorter!) would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could keep this FilterStreamInput that just delegates everything, and add a new NamedWriteableAwareStreamInput (or just WriteableStreamInput) that extends it and adds the logic to look into the registry to deserialize?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, will do
@jpountz I updated the PR according to your comments, it's ready for another round of review |
} | ||
|
||
@Override | ||
<C> C readNamedWriteable(@SuppressWarnings("unused") Class<C> categoryClass) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be:
<C> C readNamedWriteable(@SuppressWarnings("unused") Class<? extends C> categoryClass) throws IOException {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also what is categoryClass tagged as unused? I see it used to look up the registry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the annotation is a bug, it is needed only in the base class, will fix. I am not sure about the Class<? extends C>. If we are deserializing an arbitrary query, the base class QueryBuilder implements NamedWriteable, but when we read an arbitrary query we cannot expect a subclass of it, we just do QueryBuilder query = readQuery();
which will call readNamedWriteable(QueryBuilder.class)
. That is why I kept Class as a type. Does it make sense?
* Default implementation throws {@link UnsupportedOperationException} as StreamInput doesn't hold a registry. | ||
* Use {@link FilterInputStream} instead which wraps a stream and supports a {@link NamedWriteableRegistry} too. | ||
*/ | ||
<C> C readNamedWriteable(@SuppressWarnings("unused") Class<C> categoryClass) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Class<C>/Class<? extends C>/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense as-is for the reason stated above
I pushed another commit that should address the last review |
LGTM |
This commit makes it possible to serialize arbitrary objects by having them extend Writeable. When reading them though, we need to be able to identify which object we have to create, based on its name. This is useful for queries once we move to parsing on the coordinating node, as well as with aggregations and so on. Introduced a new abstraction called NamedWriteable, which is supported by StreamOutput and StreamInput through writeNamedWriteable and readNamedWriteable methods. A new NamedWriteableRegistry is introduced also where named writeable prototypes need to be registered so that we are able to retrieve the proper instance of the writeable given its name and then de-serialize it calling readFrom against it. Closes elastic#12393
a005b50
to
e1e9e1a
Compare
This commit makes it possible to serialize arbitrary objects by having them extend Writeable. When reading them though, we need to be able to identify which object we have to create, based on its name. This is useful for queries once we move to parsing on the coordinating node, as well as with aggregations and so on.
Introduced a new abstraction called NamedWriteable, which is supported by StreamOutput and StreamInput through writeNamedWriteable and readNamedWriteable methods. A new NamedWriteableRegistry is introduced also where named writeable prototypes need to be registered so that we are able to retrieve the proper instance of the writeable given its name and then de-serialize it calling readFrom against it.
We decided to streamline the support for NamedWriteables and make related methods available across the board in StreamInput and StreamOutput. That said the new write* and read* methods are package private so they can be tested but won't be made public. The idea is to add specific methods once we have named writeable to be streamed, e.g.:
and
The above methods cannot be added yet as neither queries nor aggs are streamable yet.