Allow any map key type when serialising hash maps #88686
Pinging @elastic/es-analytics-geo (Team:Analytics) |
Hi @salvatore-campagna, I've created a changelog YAML for you. |
Pinging @elastic/es-core-infra (Team:Core/Infra) |
Is |
I think it'd be kinder to catch this when the script tries to write a non-string key into the map. Painless doesn't support generics and the API we give for these is |
@jdconrad mentioned it could be possible to make a |
Er, well, maybe we shouldn't do this. If you had a stored script that required the old type signature then it won't compile and it'll fail to start the node. That's terrible.... Maybe better to just do it at runtime. It'll be more expensive, but scripted metric isn't fast. |
What is the behavior we see today? We are already doing a hard cast to Technically we could support any key type, it just needs to be (1) hashable and (2) supported by StreamInput/StreamOutput. Today we assume String, but we could not do the cast to |
I suspected this but I was not sure where those values are generated. I also tried a solution where I just Anyway I am not sure I am changing the behaviour of other parts because:
Should we go for a solution using |
As it is now, without changes, this results in a
|
I understand that the problem is here:
The script creates a map where |
I have the feeling my last commit might break a few tests... |
I had a look at |
I think the BWC issue here is because we have a lot of places where maps are declared like
while, according to this change we would need a
I tried to replace some of them but the change propagates to a very large number of classes. |
Sorting maps requires keys to be comparable. As a result, we can't use a generic key.
I have no clue why the BWC tests are failing...it looks correct to me. |
This reverts commit 6ee1489.
This was a tricky issue to find! The mismatch is because you are using |
When version is set on FilterStreamInput, it is passed to the delegate stream. However, any uses of this.version directly on the stream are not correct. While it would be better to force using getVersion() consistently, this commit makes the situation less trappy by also setting the version on the wrapper stream. relates elastic#88686
OK, now I understand. May I ask how you found that? Just step-by-step debugging? I spent quite some time debugging this without realizing that could be the issue.
If getVersion is overridden, version and getVersion can disagree. Because of this mismatch, the version checks on the reading and writing ends may be done incorrectly.
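To illustrate the mismatch described in the commit messages above, here is a minimal sketch. `BaseInput` and `FilterInput` are hypothetical stand-ins, not the Elasticsearch `StreamInput` and `FilterStreamInput` classes, the integer version constants are made up for the example, and the `alsoSetOwnVersion` flag exists only to compare the two behaviours side by side.

```java
// Hypothetical stand-ins; these are NOT the Elasticsearch classes.
final class VersionMismatchSketch {
    static final int V_8_6_0 = 8_06_00; // made-up numeric encoding of versions
    static final int V_8_7_0 = 8_07_00;

    static class BaseInput {
        private int version = V_8_7_0; // defaults to the "current" version

        int getVersion() { return version; }

        void setVersion(int version) { this.version = version; }

        // Trappy: reads the field directly and bypasses any override of getVersion().
        boolean newFormatViaField() { return this.version >= V_8_7_0; }

        // Safe: always goes through the (possibly overridden) getter.
        boolean newFormatViaGetter() { return getVersion() >= V_8_7_0; }
    }

    static class FilterInput extends BaseInput {
        private final BaseInput delegate;
        private final boolean alsoSetOwnVersion; // illustration-only switch

        FilterInput(BaseInput delegate, boolean alsoSetOwnVersion) {
            this.delegate = delegate;
            this.alsoSetOwnVersion = alsoSetOwnVersion;
        }

        @Override
        int getVersion() { return delegate.getVersion(); }

        @Override
        void setVersion(int version) {
            delegate.setVersion(version);
            if (alsoSetOwnVersion) {
                super.setVersion(version); // keep the wrapper's own field in sync
            }
        }
    }

    public static void main(String[] args) {
        FilterInput wrapper = new FilterInput(new BaseInput(), false);
        wrapper.setVersion(V_8_6_0);
        // The field still holds the default, so the two checks disagree.
        System.out.println(wrapper.newFormatViaField() + " vs " + wrapper.newFormatViaGetter());
    }
}
```

With the flag off, only the delegate learns about the older version, so a check against the field picks the new wire format while a check through the getter picks the old one. With the flag on, both checks agree.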
🥳 thanks @rjernst |
It was time consuming. I started by repeating one of the tests that failed. That test was failing on deserializing runtime mappings in a search source builder, so I added printouts to several StreamInput/StreamOutput methods dealing with maps. Eventually I saw the write side was writing a generic string, but the read side was reading a boolean. I realized the boolean type id 5 was the size of the string, and then added a couple more printouts that showed it was using the new code on the write side, but the old code on the read side. After staring at the read code for a while, I wondered what would happen if getVersion were overridden, and that's when I saw FilterStreamInput did just that. Debugging the transport protocol can be quite tedious. I'm sure there are more efficient ways than what I did above, but in any case I think long term we need a better way to debug (I have an idea for this for a future spacetime project).
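The misread described above can be reconstructed with a simplified wire format. This is a sketch under stated assumptions, not the real protocol: string lengths are written as a single byte (Elasticsearch uses variable-length integers), the generic type id for strings is assumed to be 0, the boolean type id 5 is taken from the comment above, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Simplified reconstruction of the write/read mismatch; not the real wire protocol.
final class WireMismatchSketch {
    static final int TYPE_STRING = 0;  // assumption: generic type id for String
    static final int TYPE_BOOLEAN = 5; // type id mentioned in the comment above

    // New write path: the key is written as a generic value (type id, length, bytes).
    static byte[] writeKeyNewFormat(String key) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        out.write(TYPE_STRING);
        out.write(bytes.length); // simplified one-byte length
        out.write(bytes, 0, bytes.length);
        return out.toByteArray();
    }

    // Old read path: expects a plain string key (length, bytes) followed by a generic value.
    // Returns the byte it would interpret as the value's type id.
    static int valueTypeIdSeenByOldReader(byte[] wire) {
        ByteArrayInputStream in = new ByteArrayInputStream(wire);
        int keyLength = in.read(); // actually the type id 0, read as an empty key
        in.skip(keyLength);
        return in.read();          // actually the string length, read as a type id
    }

    public static void main(String[] args) {
        byte[] wire = writeKeyNewFormat("hello");
        System.out.println(valueTypeIdSeenByOldReader(wire) == TYPE_BOOLEAN);
    }
}
```

For a five-character key, the old reader consumes the string type id as a zero-length key and then treats the length byte 5 as a type id, which is how a string on the write side can show up as a boolean on the read side.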
@elasticsearchmachine run elasticsearch-ci/full-bwc please |
I would not worry about backporting unless there is a compelling reason to do so (eg an anticipated PR that would want to use a map with an object key). However, without backporting it might make all future PRs (to 7.17) that have a map more difficult (though I expect since they are just calling the methods, there will not be a conflict). It really comes down to how easy the backport applies, if it's at all difficult, I wouldn't worry about it. |
LGTM
I think fixing this in InternalScriptedMetric would be tedious (given that it just serialises a generic list). The fix in StreamOutput/StreamInput is smaller and shouldn't harm existing usages of maps in the internal protocol.
o.writeGenericValue(entry.getValue());
if (o.getVersion().onOrAfter(Version.V_8_7_0)) {
    @SuppressWarnings("unchecked")
    final Map<Object, Object> map = (Map<Object, Object>) v;
Maybe final Map<?, ?> map = (Map<?, ?>) v; ? Then @SuppressWarnings("unchecked") can be removed.
    o.writeMap(map, StreamOutput::writeGenericValue, StreamOutput::writeGenericValue);
} else {
    @SuppressWarnings("unchecked")
    final Map<String, Object> map = (Map<String, Object>) v;
same?
Here the key needs to be a String since we need to use StreamOutput::writeString below. Using a String requires an unchecked cast here (but not above).
We need to explicitly cast the map to one with String keys because writeMap is called with StreamOutput::writeString. As a result, we need both the unchecked cast and the SuppressWarnings here.
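The difference between the two casts discussed in this review thread can be shown in isolation. The sketch below uses a hypothetical `Writer` interface and a `writeMap` helper that appends to a `StringBuilder`; they only mimic the shape of the calls quoted above and are not the StreamOutput API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helpers that mimic the shape of writeMap(map, keyWriter, valueWriter).
final class MapCastSketch {
    interface Writer<T> {
        void write(StringBuilder out, T value);
    }

    static <K, V> void writeMap(StringBuilder out, Map<K, V> map, Writer<K> keyWriter, Writer<V> valueWriter) {
        out.append(map.size());
        for (Map.Entry<K, V> entry : map.entrySet()) {
            out.append('|');
            keyWriter.write(out, entry.getKey());
            out.append('=');
            valueWriter.write(out, entry.getValue());
        }
    }

    static void writeGeneric(StringBuilder out, Object value) {
        out.append(value.getClass().getSimpleName()).append(':').append(value);
    }

    static void writeString(StringBuilder out, String value) {
        out.append(value);
    }

    // Wildcard cast: checked by the compiler, so no @SuppressWarnings is needed.
    static String writeAnyKeys(Object v) {
        final Map<?, ?> map = (Map<?, ?>) v;
        StringBuilder out = new StringBuilder();
        writeMap(out, map, MapCastSketch::writeGeneric, MapCastSketch::writeGeneric);
        return out.toString();
    }

    // String keys: the key writer needs a String, so the cast must name the key type
    // and is therefore unchecked.
    static String writeStringKeys(Object v) {
        @SuppressWarnings("unchecked")
        final Map<String, Object> map = (Map<String, Object>) v;
        StringBuilder out = new StringBuilder();
        writeMap(out, map, MapCastSketch::writeString, MapCastSketch::writeGeneric);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Object, Object> byId = new LinkedHashMap<>();
        byId.put(1, "a");
        System.out.println(writeAnyKeys(byId));
    }
}
```

The wildcard form works because a writer that accepts any `Object` is compatible with whatever key type the map happens to have, while a writer that accepts only `String` forces the map's key type to be spelled out in the cast.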
Before this change only maps with String keys were supported. There is no reason why we should not support non-String keys too, provided that they are serialisable. With this PR we include support for other map key types.
Resolves #66057