Serialize index boost and phrase suggest collation keys in a consistent order #20081
@@ -250,11 +254,7 @@ public void writeTo(StreamOutput out) throws IOException {
boolean hasIndexBoost = indexBoost != null;
out.writeBoolean(hasIndexBoost);
While we are able to break the wire protocol, can you remove the boolean and just use size=0 on read to avoid building the map? May as well save a boolean over the wire and in the cache key.
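The suggestion can be sketched as follows. This is only an illustration under stated assumptions: it uses plain `java.io` streams and a `Map<String, Float>` as stand-ins for the project's stream classes and the index boost map, and `SizePrefixedBoosts`, `toBytes`, and `fromBytes` are hypothetical names. The entry count doubles as the presence flag, so a count of 0 means "no boosts" and no map is built on read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the real serialization code; not the project's API.
class SizePrefixedBoosts {
    // Writes the entry count followed by the entries. There is no leading boolean.
    static byte[] toBytes(Map<String, Float> boosts) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(boosts == null ? 0 : boosts.size());
            if (boosts != null) {
                for (Map.Entry<String, Float> entry : boosts.entrySet()) {
                    out.writeUTF(entry.getKey());
                    out.writeFloat(entry.getValue());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // A count of 0 is read back as "no map", so nothing is allocated in that case.
    static Map<String, Float> fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            if (size == 0) {
                return null;
            }
            Map<String, Float> boosts = new HashMap<>();
            for (int i = 0; i < size; i++) {
                String index = in.readUTF();
                float boost = in.readFloat();
                boosts.put(index, boost);
            }
            return boosts;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One consequence of this encoding is that an empty map and a null map become indistinguishable after a round trip. The sketch also writes entries in iteration order, so it does not address the key ordering problem by itself.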
@nik9000 , good call 👍
Fix looks right to me. I left a suggestion for a thing that'd be nice to touch while you are there but isn't required.
Oh! Can you update the PR description to make it clear what bug you were fixing? When we cut the release notes we do it with links to PRs rather than issues, so it is convenient to have a useful description in the PR.
private void assertMapsOrder(Iterator<Map.Entry<String, Object>> streamInMap, Iterator<Map.Entry<String, Object>> streamOutMap) {
while(streamInMap.hasNext()) {
assertEquals(streamInMap.next().getKey(), streamOutMap.next().getKey());
Given the method's name I expected it to check the values too.
* write map to stream with consistent order
* to make sure every map generated bytes order are same
*/
public void writeMapWithConsistentOrder(@Nullable Map<String, Object> map) throws IOException {
This method writes a map to the stream with a consistent key order.
I like how this makes reading the read and write side easy. Can you add a javadoc to the method saying it is compatible with `readMap` and `readGenericValue`?
Can you add a dedicated test case for this method in `BytesStreamsTests`?
I think we shouldn't try to handle sorting recursively in this, at least for now. Can you say something about only sorting the keys of the map, not maps contained within the map?
Maybe the signature for the method should be `public <K extends Comparable<K>> void writeMapWithConsistentOrder(@Nullable Map<K, ? extends Object> map) throws IOException`? That way it can function on `Map<String, String>` and `Map<Integer, String>` or any other weirdness we need to throw at it.
Yes, if we don't care about the value's type, we can just use `writeGenericValue`.
As for a generic key type, I don't think we can do it. With a generic key we would need to write the key with `writeGenericValue`, and `writeGenericValue` writes an extra byte for the type (for example, byte `0` for the `String` type). If we did this, we would need to create a new read method, maybe something like `readGenericHashMap`.
That would make `writeMapWithConsistentOrder` incompatible with `readMap` and `readGenericValue`. What do you think?
The other `writeMap` methods also use `String` as the key type. Maybe we can convert `Integer` keys to `String` when we need to write a map with consistent order.
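The idea under discussion, sorting only the top-level `String` keys and writing each key directly without a type byte, can be sketched like this. It is a simplified illustration, not the merged implementation: it uses `java.io.DataOutputStream` in place of the project's stream class, restricts values to `String`, and the class name `ConsistentOrderWriter` and the null marker encoding are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and wire format are illustrative only.
class ConsistentOrderWriter {
    // Serializes the map with its top-level keys in sorted order.
    // Keys are written as plain strings, so no per-key type byte is needed.
    static byte[] writeMapWithConsistentOrder(Map<String, String> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (map == null) {
                out.writeByte(-1); // assumed marker for a null map
            } else {
                out.writeInt(map.size());
                List<String> keys = new ArrayList<>(map.keySet());
                Collections.sort(keys); // only the top-level keys are sorted
                for (String key : keys) {
                    out.writeUTF(key);
                    out.writeUTF(map.get(key));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Because the byte layout is still "size, then key and value pairs", a reader that expects that layout does not need to know the keys were sorted, which is the compatibility property the review asks to document.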
public void testWriteMapWithConsistentOrder() throws IOException {
Map<String, Object> map = new HashMap<>();
map.put("gWrgS", randomAsciiOfLength(5));
Why hardcode these test keys?
A `HashMap` has a default capacity of 16 and a threshold of 12 (`DEFAULT_LOAD_FACTOR` (0.75) * `DEFAULT_INITIAL_CAPACITY` (16)).
For these keys, the slots are:
Key | Slot |
---|---|
gWrgS | 10 |
HLRYi | 4 |
HyKnF | 12 |
`Map<String, Object> reOrderMap = new HashMap<>(map);` recalculates the capacity and threshold (capacity: 8, threshold: 6), and the resulting slots are:
Key | Slot |
---|---|
gWrgS | 2 |
HLRYi | 4 |
HyKnF | 4 (collision) |
On a collision, the key is appended to the node that already occupies the slot.
With these test keys, the collision means the two maps do not iterate in the same order, but the bytes generated from the two maps should still be equal.
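The slot numbers in the two tables can be reproduced with a short sketch. It assumes the hash spreading and index calculation used by the OpenJDK 8 and later `HashMap` (XOR the high 16 bits of `hashCode()` into the low bits, then mask with `capacity - 1`); this is an implementation detail rather than a documented contract, and the class and method names are hypothetical.

```java
// Hypothetical helper that mirrors how OpenJDK 8+ HashMap picks a bucket.
class HashMapSlots {
    // Spread the higher bits of the hash code into the lower bits.
    static int spread(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose capacity is a power of two.
    static int slot(Object key, int capacity) {
        return (capacity - 1) & spread(key);
    }

    public static void main(String[] args) {
        String[] keys = {"gWrgS", "HLRYi", "HyKnF"};
        for (String key : keys) {
            System.out.println(key
                    + " -> capacity 16: slot " + slot(key, 16)
                    + ", capacity 8: slot " + slot(key, 8));
        }
    }
}
```

Under that assumption the computed slots are 10, 4, and 12 at capacity 16, and 2, 4, and 4 at capacity 8, which matches the tables.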
I'm not super comfortable relying on the internals of `HashMap` for this. I wonder if we could get away instead with two `TreeMap`s, one with natural ordering and one with `Comparator.reverseOrder()`.
Yeah, `TreeMap` is a good choice for generating a `Map` with a different order.
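A test along those lines might look like the sketch below. It is framework-free and uses `java.io` streams instead of the project's `BytesStreamOutput`, so the class name `TreeMapOrderDemo` and the `serialize` helper are assumptions for illustration. Two `TreeMap`s hold the same entries in opposite iteration orders; writing them in iteration order gives different bytes, while sorting the keys first gives identical bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo; not the test that was merged.
class TreeMapOrderDemo {
    // Writes the map either in its own iteration order or with keys sorted first.
    static byte[] serialize(Map<String, String> map, boolean sortKeys) {
        List<String> keys = new ArrayList<>(map.keySet());
        if (sortKeys) {
            Collections.sort(keys);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(keys.size());
            for (String key : keys) {
                out.writeUTF(key);
                out.writeUTF(map.get(key));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Builds the same three entries into a map that uses the given key order.
    static Map<String, String> sample(Comparator<String> order) {
        Map<String, String> map = new TreeMap<>(order);
        map.put("alpha", "1");
        map.put("beta", "2");
        map.put("gamma", "3");
        return map;
    }
}
```

This avoids any dependence on `HashMap` bucket layout: the two input orders are guaranteed to differ by construction.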
BytesStreamOutput output1 = new BytesStreamOutput();
Map<String, Object> map1 = new LinkedHashMap<>();
try {
output1.writeMapWithConsistentOrder(map1);
You can use `expectThrows` for this sort of testing. Something like
Throwable e = expectThrows(AssertionError.class, () -> output.writeMapWithConsistentOrder(map));
assertEquals("Use writeMap with LinkedHashMaps", e.getMessage());
Thanks, I didn't know about this usage. :)
Looks good to me. I'll yank it locally now and test. Assuming everything passes I'll merge. Thanks for fixing this!
import static java.util.Collections.emptyList;
import static org.elasticsearch.search.builder.SearchSourceBuilderTests.createSearchSourceBuilder;

public class SearchSourceBuilderTest extends ESTestCase {
The build failed because this test should be named `SearchSourceBuilderTests`, with an `s` on the end. There is actually already a file with that name, so maybe these should just be combined in there?
Build failed locally - I left a comment explaining the failure. You can catch similar failures with
oh, very sorry about this, I only run
Testing it locally and everything passing so far. This doesn't merge cleanly with master but the manual merge is fairly obvious. Once it passes locally I'll manually squash and merge.
Thanks @nik9000
Thanks for fixing this!
Squashed and merged as 22242ec.
writeByte((byte) -1);
return;
}
assert false == (map instanceof LinkedHashMap);
I wonder if this should be an `IllegalStateException` rather than an assertion. @nik9000 what do you think?
I think it is fairly safe as an assertion. Mostly we are in control of the types we pass in here. Passing a `LinkedHashMap` here isn't wrong, it is just super silly because we'll reorder it. If we want something stronger than an assertion then maybe we should do something like
if (map instanceof LinkedHashMap) {
// Already has consistent order
writeMap(map);
return;
}
?
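The alternative floated in this comment, keeping the caller's order for a `LinkedHashMap` and sorting everything else, can be illustrated with a small standalone sketch. It only computes the order in which keys would be written; `WriteOrderChooser` and `keysInWriteOrder` are hypothetical names, not part of the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing the branch suggested in the review comment.
class WriteOrderChooser {
    // Returns the keys in the order they would be written to the stream.
    static List<String> keysInWriteOrder(Map<String, ?> map) {
        List<String> keys = new ArrayList<>(map.keySet());
        if (map instanceof LinkedHashMap) {
            // Already has a consistent (insertion) order, so keep it as-is.
            return keys;
        }
        // Any other map type gets its top-level keys sorted.
        Collections.sort(keys);
        return keys;
    }
}
```

With this branch, two `LinkedHashMap`s with equal content but different insertion orders would still serialize differently, which may or may not be what a cache key wants; the assertion in the diff instead treats that call as a programming mistake.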
Closes #19986
This cache key mismatch happens when the streamed-out `Map` and the streamed-in `Map` have different key orders, so the generated bytes are not equal.
The possibly affected maps include `ObjectFloatHashMap` and `HashMap`. `LinkedHashMap` is not affected, because it has `accessOrder` to keep the order.
`ObjectFloatHashMap`: when it inserts values, it calculates the slot from the key's hash and a randomly generated `keyMixer`, so even the same data can end up in different slots (meaning a different order). Stream out and stream in may therefore generate key-value pairs in different orders.
`HashMap`: although `HashMap` has no random seed for calculating the slot, the constructor `public HashMap(Map<? extends K, ? extends V> m)` pre-sizes the `HashMap` (for example in `PhraseSuggestionBuilder`). If one side constructs the `HashMap` this way and the stream-in side uses the other constructor, `public HashMap(int initialCapacity)`, which does not pre-size, the key order will differ because the slot calculation factor is not the same.
Solution:
When streaming out the `ObjectFloatHashMap`, order its `String` keys to ensure every request streams out the same byte order.
For the `PhraseSuggestionBuilder` `collateParams`, use a custom map writer with consistent order.