New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Opimtize key tree stemming by using maps instead of sets #3963
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
sets to due to a compiler bug in 22 consumes too much memory and is slower on Erlang 22+ Reproducer: https://gist.github.com/nickva/ddc499e6e05678faf20d344c6e11daaa With sets: ``` couch_key_tree:gen_and_stem(). {8,6848.812683105469} ``` With maps: ``` couch_key_tree:gen_and_stem(). {0,544.000732421875} ```
I'd also note that this improves performance across the board. Using maps on 20 and 21 also improves against current baseline. +1 |
Trying the optimization with a production revision tree with 6638 live conflicts.
Measured time to run Before this commit
After this commit
In summary, on Erlang 23, we got a nice 460x speedup and 34x memory usage reduction |
Now you're just showing off 😉 Nice work! |
nickva
added a commit
that referenced
this pull request
Mar 24, 2022
`couch_key_tree:stem/2`, as seen in #3963, has a potential to consume quite a bit of memory. Replacing sets with maps helped in that case however, since stemming has a non-tail recursive section, there is a chance in future versions of Erlang to experience the same behavior again. As a safeguard, add a memory limit test by stemming a larger conflicted rev tree while limiting the maximum process heap size. For that, use the nifty `max_heap_size` process flag, which ensures a process is killed if it starts using too much memory. To reduce the flakiness, use a determistic tree shape by using a hard-coded seed value, and leave a decent margin for error for the limit. Ref: #3963
nickva
added a commit
that referenced
this pull request
Mar 24, 2022
`couch_key_tree:stem/2`, as seen in #3963, has a potential to consume quite a bit of memory. Replacing sets with maps helped in that case however, since stemming has a non-tail recursive section, there is a chance in future versions of Erlang to experience the same behavior again. As a safeguard, add a memory limit test by stemming a larger conflicted rev tree while limiting the maximum process heap size. For that, use the nifty `max_heap_size` process flag, which ensures a process is killed if it starts using too much memory. To reduce the flakiness, use a deterministic tree shape by using a hard-coded seed value and leave a decent margin of error for the memory limit. Ref: #3963
nickva
added a commit
that referenced
this pull request
Mar 25, 2022
`couch_key_tree:stem/2`, as seen in #3963, has a potential to consume quite a bit of memory. Replacing sets with maps helped in that case however, since stemming has a non-tail recursive section, there is a chance in future versions of Erlang to experience the same behavior again. As a safeguard, add a memory limit test by stemming a larger conflicted rev tree while limiting the maximum process heap size. For that, use the nifty `max_heap_size` process flag, which ensures a process is killed if it starts using too much memory. To reduce the flakiness, use a deterministic tree shape by using a hard-coded seed value and leave a decent margin of error for the memory limit. Ref: #3963
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
sets to due to a compiler bug in 22 consumes too much memory and is slower on Erlang 22+
Reproducer: https://gist.github.com/nickva/ddc499e6e05678faf20d344c6e11daaa
With sets:
With maps: