New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial implementation of EEP 50 #2864
Conversation
I went ahead and tackled the pending issues. In particular, I chose for |
@jhogberg I have updated the PR to support a version flag. Docs have been updated too. I have also benchmarked maps:intersect/2 and it is slower then the current cerl_sets implementation. You can see the benchmark here and the benchmark results here. For this reason, I decided to keep the cerl_sets implementation for intersection for now. If you want to give it a try yourself, you can clone the repo, run |
Great :)
The current implementation of -spec intersect(Map1,Map2) -> Map3 when
Map1 :: #{ Key => term() },
Map2 :: #{ term() => Value2 },
Map3 :: #{ Key => Value2 }.
intersect(Map1, Map2) when is_map(Map1), is_map(Map2) ->
case map_size(Map1) < map_size(Map2) of
true ->
Iterator = maps:iterator(Map1),
intersect_1(maps:next(Iterator), Map1, Map2);
false ->
Iterator = maps:iterator(Map2),
intersect_2(maps:next(Iterator), Map2, Map1)
end;
intersect(Map1, Map2) ->
erlang:error(error_type_two_maps(Map1, Map2),
[Map1, Map2]).
intersect_1({Key, LHS_Value, Iterator}, LHS0, RHS) ->
LHS = case RHS of
#{ Key := LHS_Value } ->
LHS0;
#{ Key := RHS_Value } ->
LHS0#{ Key := RHS_Value };
_ ->
maps:remove(Key, LHS0)
end,
intersect_1(maps:next(Iterator), LHS, RHS);
intersect_1(none, Res, _) ->
Res.
intersect_2({Key, _RHS_Value, Iterator}, RHS0, LHS) ->
RHS = case LHS of
#{ Key := _LHS_Value } ->
%% The value in RHS has precedence, so it doesn't need to
%% be updated even if they differ.
RHS0;
_ ->
maps:remove(Key, RHS0)
end,
intersect_2(maps:next(Iterator), RHS, LHS);
intersect_2(none, Res, _) ->
Res.
Adding a fast-path for Either way I'm happy with this PR as-is and can merge it now if you want me to, I can always switch to |
This pull request adds support for maps in the sets module. According to my [benchmarks][bench], using maps is faster in the huge majority of cases, sometimes by multiple orders of magnitude, and in the few cases it is slower, it is by less than 10%. [bench]: https://github.com/josevalim/sets_bench
@jhogberg awesome, i ran your branch too and got 3-4x solid improvement. I changed this PR to use maps:intersect, updated the commit message and pushed. Next I will update the EEP and then send a separate PR to remove cerl_sets. |
Merged, thanks for the PR! |
This pull request adds support for maps in the sets module. According to my benchmarks, using maps is faster in the majority of cases, sometimes by multiple orders of magnitude, and in the few cases it is slower, it is only by 1.1x-1.4x, which can be improved with the steps outlined below. Note that, although we are using maps, the type is still opaque.
Fully adopting EEP 50 will require multiple PRs. In particular, following PRs should:
Remove cerl_sets
Incorporate the intersection implementation in Add maps:merge_with/3, maps:intersect/2, and maps:intersect_with/2 #2797Write intersection, subtract, is_subset, is_disjoint, and from_list as NIFs so
sets
becomes the fastest set implementation in OTP on all possible scenarios