Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EQL: fix async missing events #97718

Merged

Conversation

luigidellaquila
Copy link
Contributor

Fixes #97644

Missing events were represented as null values (before converting them to {missing: true} JSON), but in some circumstances events have to be serialized, eg. when executing an async search, and that caused a NPE.
This PR replaces the null placeholder with a constant Event instance to represent missing events, allowing proper serialization.

@elasticsearchmachine elasticsearchmachine added Team:QL (Deprecated) Meta label for query languages team v8.10.0 labels Jul 17, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Copy link
Collaborator

Hi @luigidellaquila, I've created a changelog YAML for you.

@@ -202,6 +203,8 @@ public String toString() {
// Event
public static class Event implements Writeable, ToXContentObject {

public static Event MISSING = new Event("", "", BytesArray.EMPTY, null);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to distinguish missing events from actual events with a new flag in the serialization, but it would require a version increment, and it's probably not worth it.
An "empty" event (empty index, empty id, empty source) should be enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the null occurrence plus serialisation issue is rare, I guess having MISSING is ok. Otherwise, having a version gated behaviour might be acceptable.

@@ -417,7 +425,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
if (events.isEmpty() == false) {
builder.startArray(Fields.EVENTS);
for (Event event : events) {
if (event == null) {
if (event == null || event == Event.MISSING) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is null still a valid option here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, it can be removed 👍

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but wondering about the object-as-null choice.

@@ -263,6 +266,11 @@ public Event(StreamInput in) throws IOException {
}
}

public static Event readFrom(StreamInput in) throws IOException {
Event result = new Event(in);
return result.equals(MISSING) ? MISSING : result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a boolean as null indicator work as well, similar to the serialisation of other "optionals"? Though having a marker object works too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is actually in serialization: the events are in a collection nested in the EQL response, and StreamOutput.writeCollection() cannot handle nulls, so a local fix (eg. a boolean in the serialization) is not enough

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if the "local" serialisation couldn't deal with a boolean flag, rather than de-/serialise an Event-as-null. But yes, it might require version-dependent behaviour, if this had worked at all before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll need version-dependent behavior anyway, I enhanced a bit the unit test coverage and I saw some failures on bwc and xContent serialization.
I'll push one more fix in short, that adds version checks

@luigidellaquila
Copy link
Contributor Author

I did a few significant changes to the serialization.
Long story short: {missing: true} does not respect the contract of x-content parser for Event instances, so I had to replace
it with a mock event like {_index: "_missing", _id: "_missing", _source: {}, missing: true}.
I also had to introduce a new TransportVersion and do the proper bwc checks.

@luigidellaquila
Copy link
Contributor Author

luigidellaquila commented Jul 18, 2023

But it has a huge impact, now all the events have "missing": false...
Trying to find a better solution

Never mind, fixed

@luigidellaquila
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some questions, but functionally and from testing perspective it LGTM.

@@ -323,9 +362,13 @@ public Map<String, DocumentField> fetchFields() {
return fetchFields;
}

public boolean missing() {
return Boolean.TRUE.equals(missing);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If missing == null, missing() will return false. Is this desired?
Does it have to do with the parser/c'tor the reason why missing is an object (not a primitive)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's just about the c'tor param (it has to be an Object, because it's optional - x-content bwc, you know), but it's probably better to have the attribute to be a boolean, I'll refactor it

@@ -202,23 +204,39 @@ public String toString() {
// Event
public static class Event implements Writeable, ToXContentObject {

public static Event MISSING_EVENT = new Event(
"_missing",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once missing is true the rest of the fields should be irrelevant, but is there any downside to using the empty string, considering they're going to be serialised?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless x-content parsers complain (and I don't think it's the case), it should be fine. Let me do it


public static Event readFrom(StreamInput in) throws IOException {
Event result = new Event(in);
return result.missing() ? MISSING_EVENT : result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to return a singleton here? It probably has at least some memory benefits and conceptually this replaces a "unique" null, but wondering if there are any other reasons, given that we now have a field id'ing these events.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here specifically it's just a small memory optimization; in other places I'm using the singleton in a more meaningful way.

@@ -27,7 +27,7 @@ public EventPayload(SearchResponse response) {
List<SearchHit> hits = RuntimeUtils.searchHits(response);
values = new ArrayList<>(hits.size());
for (SearchHit hit : hits) {
values.add(new Event(qualifiedIndex(hit), hit.getId(), hit.getSourceRef(), hit.getDocumentFields()));
values.add(new Event(qualifiedIndex(hit), hit.getId(), hit.getSourceRef(), hit.getDocumentFields(), false));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have a c'tor that doesn't take missing, given that missing never is actually true (since MISSING_EVENT is used when needed)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 it will save us some changes in the tests code

@luigidellaquila luigidellaquila merged commit 9ab8f71 into elastic:main Jul 19, 2023
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/EQL EQL querying >bug Team:QL (Deprecated) Meta label for query languages team v8.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EQL: async search with missing events fails with NPE
5 participants