Optimize list subtraction (A -- B) and make it yield on large inputs #1998

Merged
jhogberg merged 4 commits into erlang:maint on Nov 2, 2018

@jhogberg
Contributor

jhogberg commented Oct 23, 2018

@kvakvs had to pause his work on #1993 so this PR takes over where he left off. I've refactored it to make state handling easier to follow, and started using a tree instead of an array for the removal set, decreasing run-time complexity from n*n to n*log(n).

Other than that it's pretty much the same, and it should still be relatively easy to backport.
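
For readers unfamiliar with the change, here is a rough Python sketch of the idea (it is not the C code in this PR). `A -- B` removes, for each element of `B`, its first occurrence in `A`. `subtract_naive` rescans the list for every removal, while `subtract_sorted` looks each element up in an ordered removal set by binary search. The sorted list with `bisect` is only a stand-in for the tree, and both function names are made up for illustration.

```python
from bisect import bisect_left


def subtract_naive(a, b):
    """Roughly len(a) * len(b) steps: drop the first match in a for each item in b."""
    result = list(a)
    for item in b:
        if item in result:
            result.remove(item)  # removes the first occurrence only
    return result


def subtract_sorted(a, b):
    """Roughly (len(a) + len(b)) * log(len(b)) steps using an ordered removal set.

    Assumption: a sorted key list plus binary search stands in for the tree.
    """
    keys = sorted(set(b))          # distinct items to remove, in order
    counts = [0] * len(keys)       # how many copies of each key to remove
    for item in b:
        counts[bisect_left(keys, item)] += 1

    result = []
    for item in a:
        i = bisect_left(keys, item)
        if i < len(keys) and keys[i] == item and counts[i] > 0:
            counts[i] -= 1         # consume one pending removal
        else:
            result.append(item)    # not (or no longer) in the removal set
    return result
```

Both functions return the same list for comparable items, for example when subtracting `[2, 1, 2]` from `[1, 2, 3, 2, 1, 2]`; only the amount of work differs.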

@jhogberg jhogberg self-assigned this Oct 23, 2018

product of the length of its operands, which was extremely slow on long
lists.</p>
<p>These days the run-time complexity is "n log n" and the operation will

@bjorng

bjorng Oct 24, 2018

Contributor

"These days". I suggest making it more specific, e.g. from "OTP 22".

lists.</p>
<p>These days the run-time complexity is "n log n" and the operation will
complete quickly even on very long lists. In fact, it's faster and uses

@bjorng

bjorng Oct 24, 2018

Contributor

"it's": OTP's documentation guidelines say that contractions should be avoided.

@jhogberg

jhogberg Oct 24, 2018

Contributor

Thanks, fixed!

kvakvs and others added some commits Oct 23, 2018

Fix trapping in lists:reverse/2

The first stage wasn't bounded by reductions, and it bumped far
more reductions than it should have due to a logic bug.

Inline erts_cmp

This greatly increases the performance of '--'/2 which does a lot
of term comparisons.

Optimize operator '--' and yield on large inputs

The removal set now uses a red-black tree instead of an array on
large inputs, decreasing runtime complexity from `n*n` to
`n*log(n)`. It will also exit early when there are no more items
left in the removal set, drastically improving performance and
memory use when the items to be removed are present near the head
of the list.

This got a lot more complicated than before as the overhead of
always using a red-black tree was unacceptable when either of the
inputs was small, but this compromise has okay-to-decent
performance regardless of input size.

Co-authored-by: Dmytro Lytovchenko <dmytro.lytovchenko@erlang-solutions.com>

@jhogberg jhogberg merged commit eb9ee88 into erlang:maint Nov 2, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
license/cla Contributor License Agreement is signed.