EQL: fix async missing events #97718
Pinging @elastic/es-ql (Team:QL)
Hi @luigidellaquila, I've created a changelog YAML for you.
@@ -202,6 +203,8 @@ public String toString() {
// Event
public static class Event implements Writeable, ToXContentObject {

public static Event MISSING = new Event("", "", BytesArray.EMPTY, null);
It would be better to distinguish missing events from actual events with a new flag in the serialization, but it would require a version increment, and it's probably not worth it.
An "empty" event (empty index, empty id, empty source) should be enough.
If the `null` occurrence plus serialisation issue is rare, I guess having `MISSING` is ok. Otherwise, having a version-gated behaviour might be acceptable.
@@ -417,7 +425,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
if (events.isEmpty() == false) {
builder.startArray(Fields.EVENTS);
for (Event event : events) {
- if (event == null) {
+ if (event == null || event == Event.MISSING) {
Why is null still a valid option here?
Good point, it can be removed 👍
LGTM, but wondering about the object-as-null choice.
@@ -263,6 +266,11 @@ public Event(StreamInput in) throws IOException {
}
}

public static Event readFrom(StreamInput in) throws IOException {
Event result = new Event(in);
return result.equals(MISSING) ? MISSING : result;
Would a boolean as null indicator work as well, similar to the serialisation of other "optionals"? Though having a marker object works too.
The problem is actually in serialization: the events are in a collection nested in the EQL response, and `StreamOutput.writeCollection()` cannot handle nulls, so a local fix (e.g. a boolean in the serialization) is not enough.
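To make the constraint above concrete, here is a minimal sketch that uses plain `java.io` streams as stand-ins for `StreamOutput`/`StreamInput`. The class `SentinelCollectionDemo`, its simplified `Event`, and the `writeEvents`/`readEvents` helpers are hypothetical and are not the Elasticsearch API. The collection writer delegates to each element, so a `null` element fails, while a sentinel instance serializes like any other event.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: java.io streams stand in for StreamOutput/StreamInput.
final class SentinelCollectionDemo {

    // Hypothetical, simplified Event with just an index and an id.
    static final class Event {
        // Sentinel that replaces null for a missing event.
        static final Event MISSING = new Event("", "");

        final String index;
        final String id;

        Event(String index, String id) {
            this.index = index;
            this.id = id;
        }

        void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(index);
            out.writeUTF(id);
        }

        // Maps the "empty" wire form back to the shared sentinel instance.
        static Event readFrom(DataInputStream in) throws IOException {
            String index = in.readUTF();
            String id = in.readUTF();
            return index.isEmpty() && id.isEmpty() ? MISSING : new Event(index, id);
        }
    }

    // Size prefix, then each element through its own writeTo().
    // A null element fails here with a NullPointerException.
    static byte[] writeEvents(List<Event> events) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(events.size());
            for (Event event : events) {
                event.writeTo(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static List<Event> readEvents(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            List<Event> events = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                events.add(Event.readFrom(in));
            }
            return events;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because `readFrom` hands back the shared instance, callers in this sketch can keep using an identity check (`event == Event.MISSING`) after a round trip.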
I was wondering if the "local" serialisation couldn't deal with a boolean flag, rather than de-/serialise an Event-as-null. But yes, it might require version-dependent behaviour, if this had worked at all before.
I think we'll need version-dependent behavior anyway: I extended the unit test coverage a bit and saw some failures in bwc and xContent serialization.
I'll push one more fix shortly that adds version checks.
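The thread does not show how the version checks were implemented, so the following is only a sketch of the general idea of version-gated serialization. It again uses `java.io` stand-ins, and `VersionGatedEventDemo`, `MISSING_FLAG_VERSION`, and the integer version scheme are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: not the Elasticsearch transport-version API.
final class VersionGatedEventDemo {

    // Hypothetical wire version that introduced the "missing" flag.
    static final int MISSING_FLAG_VERSION = 2;

    // Writes the id, and the flag only if the peer understands it.
    static byte[] write(String id, boolean missing, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(id);
            if (peerVersion >= MISSING_FLAG_VERSION) {
                out.writeBoolean(missing);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the flag only if the peer wrote it; older peers imply "not missing".
    static boolean readMissing(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF(); // skip the id
            return peerVersion >= MISSING_FLAG_VERSION && in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the gate is that both sides agree on the byte layout for a given version, so an older peer never sees a field it does not expect.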
…vents' into eql/fix_async_missing_events
I made a few significant changes to the serialization.
@elasticmachine update branch
Left some questions, but functionally and from testing perspective it LGTM.
@@ -323,9 +362,13 @@ public Map<String, DocumentField> fetchFields() {
return fetchFields;
}

public boolean missing() {
return Boolean.TRUE.equals(missing);
If `missing == null`, `missing()` will return `false`. Is this desired?
Is the parser/c'tor the reason why `missing` is an object (not a primitive)?
Yes, it's just about the c'tor param (it has to be an Object, because it's optional - x-content bwc, you know), but it's probably better for the attribute to be a `boolean`; I'll refactor it.
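For reference, a small standalone sketch of the null-handling behaviour discussed here. `NullableFlagDemo` and its method names are hypothetical; only `Boolean.TRUE.equals(...)` and auto-unboxing are standard Java behaviour.

```java
// Illustrative only: shows how a nullable Boolean flag behaves.
final class NullableFlagDemo {

    // Null-safe: a null flag is treated as "not missing".
    static boolean missing(Boolean flag) {
        return Boolean.TRUE.equals(flag);
    }

    // Unsafe: auto-unboxing a null Boolean throws NullPointerException.
    static boolean missingUnboxed(Boolean flag) {
        return flag;
    }
}
```

Storing the attribute as a primitive `boolean` and converting the optional constructor argument once, at construction time, keeps the null check in a single place.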
@@ -202,23 +204,39 @@ public String toString() {
// Event
public static class Event implements Writeable, ToXContentObject {

public static Event MISSING_EVENT = new Event(
"_missing",
Once `missing` is `true` the rest of the fields should be irrelevant, but is there any downside to using the empty string, considering they're going to be serialised?
Unless x-content parsers complain (and I don't think it's the case), it should be fine. Let me do it
public static Event readFrom(StreamInput in) throws IOException {
Event result = new Event(in);
return result.missing() ? MISSING_EVENT : result;
Why do we need to return a singleton here? It probably has at least some memory benefits and conceptually this replaces a "unique" `null`, but I'm wondering if there are any other reasons, given that we now have a field identifying these events.
Here specifically it's just a small memory optimization; in other places I'm using the singleton in a more meaningful way.
@@ -27,7 +27,7 @@ public EventPayload(SearchResponse response) {
List<SearchHit> hits = RuntimeUtils.searchHits(response);
values = new ArrayList<>(hits.size());
for (SearchHit hit : hits) {
- values.add(new Event(qualifiedIndex(hit), hit.getId(), hit.getSourceRef(), hit.getDocumentFields()));
+ values.add(new Event(qualifiedIndex(hit), hit.getId(), hit.getSourceRef(), hit.getDocumentFields(), false));
Would it make sense to have a c'tor that doesn't take `missing`, given that `missing` is never actually `true` (since `MISSING_EVENT` is used when needed)?
👍 it will save us some changes in the tests code
…vents' into eql/fix_async_missing_events
Fixes #97644
Missing events were represented as `null` values (before converting them to `{missing: true}` JSON), but in some circumstances events have to be serialized, e.g. when executing an async search, and that caused an NPE.
This PR replaces the `null` placeholder with a constant `Event` instance to represent missing events, allowing proper serialization.