Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use NamedWritable to enable GeoBoundingBox serialisation #99163

Conversation

craigtaverner
Copy link
Contributor

@craigtaverner craigtaverner commented Sep 4, 2023

BoundingBox allows inter-node serialisation only within aggs results as components of aggregations that themselves use NamedWritable. However, if a GeoBoundingBox is added to search hits, serialisation fails as this type is not explicitly supported by StreamOuput/StreamInput. This was never noticed before because the only way to get a BoundingBox into the hits is through a painless script, something that appears to have not been used by customers, and is only tested internally using a single-node, single-shard yaml test, which does not touch this code path. We only found this issue due to recent effort to run all yaml tests in a serverless environment, which appears to always require serialisation from backend data nodes to a coordinator node.

Having said that, it is a surface API that could be used by users, and so should be fixed. While the missing type could be added explicitly, a much more flexible approach is to allow named classes to encapsulate their own serialization/deserialization logic. Here we use NamedWritable for that purpose.

Tasks:

  • Implement GenericNamedWriteable with version checking capabilities (based on VersionedNamedWriteable)
  • Manual testing with a multi-node cluster to verify this fixes Failed Serialization of Scripted Geo fields  #99089
  • Unit tests for version checking
  • Unit tests for serialization of GeoBoundingBox
  • Support CartesianBoundingBox by registering the named writable in SpatialPlugin

Fixes #99089

@craigtaverner craigtaverner added >bug :Analytics/Geo Indexing, search aggregations of geo points and shapes Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Sep 4, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @craigtaverner, I've created a changelog YAML for you.

@craigtaverner craigtaverner force-pushed the supportNamedWritableSerializationBoundingBox branch from a142cc8 to 9b9c3fc Compare September 5, 2023 09:57
craigtaverner and others added 8 commits September 5, 2023 12:29
BoundingBox allows inter-node serialization only within aggs results as components of aggregations that themselves use NamedWritable.
However, if a GeoBoundingBox is added to search hits serialization fails as this type is not explictly supported by StreamOuput/StreamInput.
This was never noticed before because the only way to get a BoundingBox into the hits is through a painless script, something that appears to have not been used by customers, and is only tested internally using a single-node, single-shard yaml test, which does not touch this code path.
We only found this issue due to recent effort to run all yaml tests in a serverless environment, which appears to always require serialization from backend data nodes to a coordinator node.

Having said that, it is a surface API that could be used by users, and so should be fixed.
While the missing type could be added explicitly, a much more flexible approach is to allow named classes to encapsulate their own serialization/deserialization logic.
Here we use NamedWritable for that purpose.

Fixes elastic#99089
So only classes that implement this specific interface will benefit from this additional capability.
Also we implement VersionedNamedWriteable to ensure that TransportVersion checking is done.
Co-authored-by: David Turner <david.turner@elastic.co>
@craigtaverner craigtaverner force-pushed the supportNamedWritableSerializationBoundingBox branch from d0e026c to e7a534e Compare September 5, 2023 10:37
The parent class AbstractNamedWriteableTestCase was not using the writeGenericValue and readGenericValue that this work changed, so we override that behaviour to ensure we're testing the new code.

We also add a new test method for negative testing of errors thrown when trying to serialize using an older TransportVersion.
We do not throw similar errors on reading, with the assumption that the reader can always read things it knows about, regardless of the age of the sender.
Taking account the TransportVersion in the reader only make sense if reading a known type where the format has changed.
This is one reason why we went for a generic approach, allowing plugins to register additional GenericNamedWriteables that can be serialized/deserialized by the StreamInput/StreamOutput code without the server depending on xpack.
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just a few small comments.


// Test writing fails with older version
try {
copyInstance(testInstance, older, valid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just verify that writing to an older stream fails, no need to make a copyInstance which tries to read it back again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. After simplifying the test I replaced this with just the writing code.

}

// Test reading succeeds with older version (we do not cripple the reader)
deserializedInstance = copyInstance(testInstance, valid, older);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't really make sense, we are always sure that reading and writing always use the same transport version, and the message we're creating isn't actually something an older node would be able to create.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I've simplified this failure test to only test the failure case. The first part with (valid,valid) was already tested in another test testSerialization, and the last part is unnecessary:

Comment on lines 52 to 82
@Override
protected GenericNamedWriteable copyInstance(GenericNamedWriteable original, TransportVersion version) throws IOException {
return copyInstance(original, version, version);
}

/**
* Read and write with different serialization versions
*/
protected GenericNamedWriteable copyInstance(
GenericNamedWriteable original,
TransportVersion writeVersion,
TransportVersion readVersion
) throws IOException {
try (BytesStreamOutput output = new BytesStreamOutput()) {
output.setTransportVersion(writeVersion);
output.writeGenericValue(original);
try (StreamInput in = new NamedWriteableAwareStreamInput(output.bytes().streamInput(), getNamedWriteableRegistry())) {
in.setTransportVersion(readVersion);
return readGenericValue(in);
}
}
}

private GenericNamedWriteable readGenericValue(StreamInput in) throws IOException {
Object obj = in.readGenericValue();
if (obj instanceof GenericNamedWriteable result) {
return result;
}
throw new IllegalStateException("Read result of wrong type: " + obj.getClass().getSimpleName());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer needed, we can just use the copyInstance implementation in the subclass.

Suggested change
@Override
protected GenericNamedWriteable copyInstance(GenericNamedWriteable original, TransportVersion version) throws IOException {
return copyInstance(original, version, version);
}
/**
* Read and write with different serialization versions
*/
protected GenericNamedWriteable copyInstance(
GenericNamedWriteable original,
TransportVersion writeVersion,
TransportVersion readVersion
) throws IOException {
try (BytesStreamOutput output = new BytesStreamOutput()) {
output.setTransportVersion(writeVersion);
output.writeGenericValue(original);
try (StreamInput in = new NamedWriteableAwareStreamInput(output.bytes().streamInput(), getNamedWriteableRegistry())) {
in.setTransportVersion(readVersion);
return readGenericValue(in);
}
}
}
private GenericNamedWriteable readGenericValue(StreamInput in) throws IOException {
Object obj = in.readGenericValue();
if (obj instanceof GenericNamedWriteable result) {
return result;
}
throw new IllegalStateException("Read result of wrong type: " + obj.getClass().getSimpleName());
}

(also needs a spotless run to clean up imports)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the parent version does not call output.writeGenericValue or in.readGenericValue which are the methods we added support for this to. Instead it calls directly the writeNamedWriteable and readNamedWriteable methods, skipping all the new code and version checks. But I see your point that at least I do not need the two-version version of the method, so I removed that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

acked, I think I can see a way to make this a little less special, but let's merge this and I'll follow up with a suggestion.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (left a comment about some now-unnecessary test code tho)

@craigtaverner craigtaverner merged commit 0c3b8ed into elastic:main Sep 6, 2023
12 checks passed
alex-spies pushed a commit to dreamquster/elasticsearch that referenced this pull request Sep 8, 2023
* Use NamedWritable to enable GeoBoundingBox serialization

BoundingBox allows inter-node serialization only within aggs results as components of aggregations that themselves use NamedWritable.
However, if a GeoBoundingBox is added to search hits serialization fails as this type is not explictly supported by StreamOuput/StreamInput.
This was never noticed before because the only way to get a BoundingBox into the hits is through a painless script, something that appears to have not been used by customers, and is only tested internally using a single-node, single-shard yaml test, which does not touch this code path.
We only found this issue due to recent effort to run all yaml tests in a serverless environment, which appears to always require serialization from backend data nodes to a coordinator node.

Having said that, it is a surface API that could be used by users, and so should be fixed.
While the missing type could be added explicitly, a much more flexible approach is to allow named classes to encapsulate their own serialization/deserialization logic.
Here we use NamedWritable for that purpose.

Fixes elastic#99089

* Update docs/changelog/99163.yaml

* Use GenericNamedWriteable to limit this to specific classes

So only classes that implement this specific interface will benefit from this additional capability.
Also we implement VersionedNamedWriteable to ensure that TransportVersion checking is done.

* Code style fix

* Explicit checks of version support for GenericNamedWritable

Co-authored-by: David Turner <david.turner@elastic.co>

* Simplify GenericNamedWriteable checks

* Move NamedWriteable methods into concrete classes

* Added assert on GenericNamedWriteable min supported version

* Unit tests for serialization of GenericNamedWriteable

* Ensure serialization tests use the generic code path

The parent class AbstractNamedWriteableTestCase was not using the writeGenericValue and readGenericValue that this work changed, so we override that behaviour to ensure we're testing the new code.

We also add a new test method for negative testing of errors thrown when trying to serialize using an older TransportVersion.
We do not throw similar errors on reading, with the assumption that the reader can always read things it knows about, regardless of the age of the sender.
Taking account the TransportVersion in the reader only make sense if reading a known type where the format has changed.

* Support CartesianBoundingBox in StreamOutput/Input

This is one reason why we went for a generic approach, allowing plugins to register additional GenericNamedWriteables that can be serialized/deserialized by the StreamInput/StreamOutput code without the server depending on xpack.

* Update server/src/main/java/org/elasticsearch/TransportVersion.java

Co-authored-by: David Turner <david.turner@elastic.co>

* Code review updates - simplify negative test to only test failure case

* Code review - remove unused method after simplifying test

* Simplify GeoBoundsGenericWriteableTests

---------

Co-authored-by: David Turner <david.turner@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failed Serialization of Scripted Geo fields
3 participants