While running some stateful property-based tests on the Erlang Distribution Protocol, I encountered an issue with specific Map encodings which can result in a "broken map".
Specifically: if a map with more than 32 key-value pairs (a "heap map", i.e. a hashmap) contains a key which is a flat map with at least 2 key-value pairs whose keys have different types (for example, a float() and an integer(), as in #{0 => [],0.0 => []}), the exact ordering of the encoded pairs in this nested flat map can result in a decoded map with a key that cannot be fetched with maps:get/2,3, cannot be pattern matched against, cannot be displayed properly with erlang:display/1, etc.
Example nested encoding which can result in a broken map:
%% Encoding which can result in a broken map if used as the key of a larger heap map:
<<
    116,                  % MAP_EXT
    0,0,0,2,              % Arity = 2
    70,0,0,0,0,0,0,0,0,   % [0] Key = 0.0
    106,                  % [0] Val = []
    97,0,                 % [1] Key = 0
    106                   % [1] Val = []
>>.
%% Encoding from `erlang:term_to_binary/2` which does not result in a broken map:
<<
    116,                  % MAP_EXT
    0,0,0,2,              % Arity = 2
    97,0,                 % [0] Key = 0
    106,                  % [0] Val = []
    70,0,0,0,0,0,0,0,0,   % [1] Key = 0.0
    106                   % [1] Val = []
>>.
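To try the two encodings above as the key of a larger map, a minimal sketch along the following lines may help. It is not taken from the original report: the module name `broken_map_repro`, the function names, and the choice of 32 filler keys (the integers 1 to 32, each mapped to `[]`) are assumptions made for illustration. It wraps a pre-encoded nested key in a 33-pair `MAP_EXT`, decodes it with `erlang:binary_to_term/1`, and then asks whether the nested map can be found as a key. On affected releases the unsorted variant is expected to report `false`; on releases with a fix it should report `true`.

```erlang
-module(broken_map_repro).
-export([unsorted_key/0, sorted_key/0, decode/1, key_found/1]).

%% Nested flat map #{0 => [], 0.0 => []} with the float key encoded first
%% (the ordering reported to produce a broken map).
unsorted_key() ->
    <<116, 0,0,0,2,
      70, 0,0,0,0,0,0,0,0, 106,
      97, 0, 106>>.

%% The same nested map in the order emitted by term_to_binary.
sorted_key() ->
    <<116, 0,0,0,2,
      97, 0, 106,
      70, 0,0,0,0,0,0,0,0, 106>>.

%% Wrap a pre-encoded key in an outer MAP_EXT with 33 pairs so that the
%% decoded outer map has more than 32 keys.
%% Assumption: the filler keys 1..32 with [] values are arbitrary.
decode(NestedKey) ->
    Fillers = << <<97, I, 106>> || I <- lists:seq(1, 32) >>,
    Outer = <<131, 116, 33:32, Fillers/binary, NestedKey/binary, 106>>,
    erlang:binary_to_term(Outer).

%% True if the nested map can be looked up as a key of the decoded map.
key_found(NestedKey) ->
    Map = decode(NestedKey),
    maps:is_key(#{0 => [], 0.0 => []}, Map).
```

Comparing `broken_map_repro:key_found(broken_map_repro:sorted_key())` with `broken_map_repro:key_found(broken_map_repro:unsorted_key())` in a shell shows whether the running release is affected.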
Depending on the desired behavior here, either:
1. The External Term Format needs more specific documentation about MAP_EXT if the ordering of the key-value pairs is important, and binary_to_term should raise an error if out-of-order pairs are detected when decoding.
2. Decoding a MAP_EXT should allow the key-value pairs to be in any arbitrary order without resulting in a "broken map".
My vote would be for (2).
Here are a couple property definitions in pseudo-quickcheck code:
% Calling `maps:get/2,3` on a `Map` for each `Key` returned by `maps:keys(Map)` should succeed.
?FORALL(
    Map,
    map(),
    begin
        Tag = erlang:make_ref(),
        Pred = fun(Key) -> maps:get(Key, Map, Tag) =/= Tag end,
        lists:all(Pred, maps:keys(Map))
    end
).
% A "roundtrip" `Map` term should be equal to itself after encoding and decoding.
?FORALL(
    Map,
    map(),
    begin
        RoundtripMap = erlang:binary_to_term(erlang:term_to_binary(Map)),
        RoundtripMap == Map
    end
).
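For readers without a QuickCheck or PropEr setup, the two properties can be approximated as ordinary functions and run over a few hand-picked maps. The sketch below is an illustration only: the module name `map_props`, the function names, and the sample maps are assumptions, and a fixed list of samples is far weaker than a generator-driven property. Because `term_to_binary` emits sorted nested keys, these samples are expected to pass even on affected releases; the failure needs a hand-crafted encoding.

```erlang
-module(map_props).
-export([prop_keys_fetchable/1, prop_roundtrip/1, check_samples/0]).

%% Every key returned by maps:keys/1 must be fetchable with maps:get/3.
%% A fresh reference is used as the default so it cannot collide with a value.
prop_keys_fetchable(Map) ->
    Tag = erlang:make_ref(),
    lists:all(fun(Key) -> maps:get(Key, Map, Tag) =/= Tag end,
              maps:keys(Map)).

%% Encoding and then decoding a map must give back an equal map.
prop_roundtrip(Map) ->
    erlang:binary_to_term(erlang:term_to_binary(Map)) == Map.

%% Run both checks over a small, hand-picked set of maps (assumed samples):
%% an empty map, the nested flat map from the report, and a map with more
%% than 32 keys that uses the nested map as one of its keys.
check_samples() ->
    Nested = #{0 => [], 0.0 => []},
    Big = maps:from_list([{I, []} || I <- lists:seq(1, 32)]),
    Samples = [#{}, Nested, Big#{Nested => []}],
    lists:all(fun(M) -> prop_keys_fetchable(M) andalso prop_roundtrip(M) end,
              Samples).
```

Passing a map obtained from a hand-crafted binary (rather than from `term_to_binary`) to `map_props:prop_keys_fetchable/1` is the case of interest for this issue.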
Affected versions
OTP 25.x (maint)
OTP 26.x (master)
Additional context
I think external.c may be erroneously assuming that the key pairs being decoded are in a specific order.
    /* Iterate through all the (flat)maps and check for validity and sort keys
     * - done here for when we know it is complete.
     */
    while (!WSTACK_ISEMPTY(flat_maps)) {
        next = (Eterm *)WSTACK_POP(flat_maps);
        if (!erts_validate_and_sort_flatmap((flatmap_t *)next))
            goto error;
    }
where all (big) hashmaps are generated, followed by sorting of all (small) flatmaps. If an unsorted flatmap is a key in a hashmap, its hash value will be calculated incorrectly. It seems we have to fix up all maps in one go, bottom-up.
Working on it...
To Reproduce
See prop_find_broken_maps and the counterexample.log for more details on how these encodings were produced.