Skip to content

Introduce tournament tree to achieve better k-way sort-merging #4300

@richox

Description

@richox

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

SortPreservingMergeStream currently uses a binary heap for k-way merging, which gets O(NlogS) time complexity (S=num_streams). however, when a top record is taken from the heap, we need to perform a pop/push operation and are likely to take 2*logS comparisons.

an improved way is to use a Tournament Tree (aka Loser Tree) to do the selection. when the top record is taken, the tree structure is not modified, and only the path from bottom to top is visited. the number of comparisons is always logS.

reference: https://en.wikipedia.org/wiki/K-way_merge_algorithm#Tournament_Tree

Describe the solution you'd like
implement the loser tree algorithm in SortPreservingMergeStream.

Describe alternatives you've considered

Additional context
the benchmark shows the merging time is ~50% shorter after applying the tournament tree.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions