Store projections in a tree of patterns #188

Merged
merged 1 commit into from
Mar 23, 2023

Member

@the-mikedavis the-mikedavis commented Mar 16, 2023

Depends on #187

The main data structure of the Khepri store is a tree in which each edge is a path component; that tree can look up all paths matching a given pattern. This change introduces a new tree structure that instead uses pattern components as edges, so it can look up all patterns matching a given path.
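To illustrate the idea of a tree keyed by pattern components, here is a minimal sketch. It is not the code from `khepri_pattern_tree`: the module name, the map-based node layout, and the `'*'` atom used as a single-component wildcard are all assumptions made for the example, and real Khepri conditions are richer than a literal-or-wildcard match. The sketch also uses a four-argument `fold`, unlike the five-argument function discussed in this PR.

```erlang
-module(pattern_tree_sketch).
-export([new/0, insert/3, fold/4, demo/0]).

%% A node holds the payloads of the patterns that end at this node and
%% a map of child nodes keyed by pattern component.
%% ASSUMPTION: '*' is a hypothetical wildcard matching any single path
%% component; every other component matches only itself.
new() ->
    #{payloads => [], children => #{}}.

%% Store Payload under Pattern (a list of pattern components).
%% Patterns with a common prefix share the same edges.
insert(#{payloads := Payloads} = Node, [], Payload) ->
    Node#{payloads := [Payload | Payloads]};
insert(#{children := Children} = Node, [Component | Rest], Payload) ->
    Child = maps:get(Component, Children, new()),
    Node#{children := Children#{Component => insert(Child, Rest, Payload)}}.

%% Fold Fun over the payloads of every stored pattern matching Path.
%% Each edge is tested once per lookup, however many patterns share it.
fold(#{payloads := Payloads}, [], Fun, Acc) ->
    lists:foldl(Fun, Acc, Payloads);
fold(#{children := Children}, [Component | Rest], Fun, Acc) ->
    maps:fold(
      fun(Edge, Child, Acc1) ->
              case matches(Edge, Component) of
                  true  -> fold(Child, Rest, Fun, Acc1);
                  false -> Acc1
              end
      end, Acc, Children).

matches('*', _Component) -> true;
matches(Edge, Component) -> Edge =:= Component.

%% Three hypothetical projections; two of them share the `stock' edge.
demo() ->
    T0 = new(),
    T1 = insert(T0, [stock, wood, '*'], proj_a),
    T2 = insert(T1, [stock, '*', oak], proj_b),
    T3 = insert(T2, [orders, '*'], proj_c),
    Collect = fun(Payload, Acc) -> [Payload | Acc] end,
    lists:sort(fold(T3, [stock, wood, oak], Collect, [])).
```

In this sketch, `pattern_tree_sketch:demo()` collects `proj_a` and `proj_b`: the shared `stock` edge is compared against the path once, and the `orders` subtree is skipped entirely after a single failed comparison.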

This replaces the data structure used to hold projections in the machine's state. The prior structure was a mapping from projections to path patterns. Every change to the store was checked against all patterns for registered projections using khepri_tree:does_path_match/3. The existing approach is fast in practice but it duplicates some work: pattern components shared between multiple patterns would all be re-compiled and re-checked for each projection and for every change to the store. The new pattern tree eliminates the duplicate work and saves a small but noticeable amount of time when a store uses many projections and sees a large number of changes.
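The duplicated work in the prior approach can be pictured with a small sketch. This is not the code that was in `khepri_machine`: the module, the `path_matches/2` helper, and the `'*'` wildcard atom are assumptions for illustration, and the real check goes through `khepri_tree:does_path_match/3` with compiled conditions.

```erlang
-module(projection_scan_sketch).
-export([matching_projections/2, demo/0]).

%% Projections is a map from a projection name to its path pattern.
%% Every pattern is checked against the changed path from scratch, so
%% a component shared by several patterns is compared once per pattern.
matching_projections(Projections, Path) ->
    maps:fold(
      fun(Projection, Pattern, Acc) ->
              case path_matches(Pattern, Path) of
                  true  -> [Projection | Acc];
                  false -> Acc
              end
      end, [], Projections).

%% ASSUMPTION: '*' is a hypothetical wildcard matching any single path
%% component; every other pattern component matches only itself.
path_matches([], []) -> true;
path_matches(['*' | PatternRest], [_ | PathRest]) ->
    path_matches(PatternRest, PathRest);
path_matches([Component | PatternRest], [Component | PathRest]) ->
    path_matches(PatternRest, PathRest);
path_matches(_, _) -> false.

%% Two of the three hypothetical patterns begin with `stock', so that
%% component is compared twice for a single change.
demo() ->
    Projections = #{proj_a => [stock, wood, '*'],
                    proj_b => [stock, '*', oak],
                    proj_c => [orders, '*']},
    lists:sort(matching_projections(Projections, [stock, wood, oak])).
```

The cost of this scan grows with the number of registered projections for every change to the store, which is the repeated work the pattern tree is meant to remove.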

In particular, I tested this against a rabbitmqctl import_definitions of 1 million topic bindings against the khepri branch in the server's repository. With this change, the import is now faster than with Mnesia:

Store                    Time (seconds, lower is better)
Mnesia                   200.64
Khepri (main)            238.17
Khepri w/ pattern tree   158.32

We can compare the flamegraphs recorded during these tests...

main: [flamegraph image: main]

This branch: [flamegraph image: pattern-tree]

One of the largest spans in the flamegraph on main is around the khepri_machine:create_projection_side_effects/3 function which spends much of its time in khepri_tree:does_path_match/3,4 and khepri_condition:compile/1 under that. This branch eliminates the duplicate khepri_condition:compile/1 calls and spends less time checking whether patterns match.

@the-mikedavis the-mikedavis added the enhancement New feature or request label Mar 16, 2023
@the-mikedavis the-mikedavis added this to the v0.7.0 milestone Mar 16, 2023
@the-mikedavis the-mikedavis self-assigned this Mar 16, 2023
@codecov

codecov bot commented Mar 16, 2023

Codecov Report

Patch coverage: 89.77% and project coverage change: +0.13 🎉

Comparison is base (d23df59) 91.01% compared to head (5f5f619) 91.14%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #188      +/-   ##
==========================================
+ Coverage   91.01%   91.14%   +0.13%     
==========================================
  Files          20       21       +1     
  Lines        3494     3535      +41     
==========================================
+ Hits         3180     3222      +42     
+ Misses        314      313       -1     
Flag                Coverage Δ
erlang-24           89.56% <89.77%> (+0.09%) ⬆️
erlang-25           89.75% <89.77%> (+0.14%) ⬆️
erlang-26.0-rc1     88.65% <89.77%> (+0.21%) ⬆️
os-ubuntu-latest    91.14% <89.77%> (+0.13%) ⬆️
os-windows-latest   89.70% <89.77%> (+0.09%) ⬆️


Impacted Files                Coverage Δ
src/khepri.erl                91.45% <25.00%> (-0.60%) ⬇️
src/khepri_machine.erl        95.50% <91.17%> (+0.63%) ⬆️
src/khepri_pattern_tree.erl   100.00% <100.00%> (ø)


%% @see fold_fun().
%% @see fold_acc().

fold(PatternTree, Tree, Path, FoldFun, Acc) ->
Member Author

I'm split on the naming of this function. I like fold because we have the FoldFun and Acc arguments. It could also be called find_matching_patterns though to mimic khepri_tree:find_matching_nodes/5 which also has a FoldFun and Acc.

Member

Probably khepri_tree:find_matching_nodes/5 should be renamed to fold/5. What it does has changed a lot since its creation, but the name has stayed the same.

Member Author

Do you think it's worth renaming khepri_tree:find_matching_nodes/5 now? I think fold/5 would be a better fit for that too.

Member

Yes, now that it's a separate module, the name you suggest is better.

Member Author

Ok! I will create a separate PR with the rename 👍

@the-mikedavis the-mikedavis marked this pull request as ready for review March 20, 2023 15:56
@dumbbell dumbbell merged commit e827286 into main Mar 23, 2023
@dumbbell dumbbell deleted the pattern-tree branch March 23, 2023 17:14