Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy sort in Phobos? #9913

Open
dlangBugzillaToGithub opened this issue Oct 7, 2011 · 3 comments
Open

Lazy sort in Phobos? #9913

dlangBugzillaToGithub opened this issue Oct 7, 2011 · 3 comments

Comments

@dlangBugzillaToGithub
Copy link

bearophile_hugs reported this on 2011-10-07T15:53:29Z

Transfered from https://issues.dlang.org/show_bug.cgi?id=6787

CC List

Description

Sometimes in my code I have to find the first few smallest (or biggest) items of an array, I don't know how many I will need of them, but I know in general I will need only few of them, much less than the whole array.

Turning the array into an heap is a slow operation if I will only need few items, and I can't use std.algorithm.partialSort because I don't know the number of items I will need.

So I have created this very simple LazySort range, based on partialSort (works with DMD 2.056head):


import std.stdio, std.algorithm, std.random, std.array, std.range, std.traits;

// Missing still: less and SwapStrategy template arguments.
struct LazySort(Range) if (isRandomAccessRange!Range) {
    Range data;
    private size_t idx, idxSorted;

    bool empty() { return idx >= data.length; }

    ForeachType!Range front() {
        if (idx >= idxSorted) {
            immutable oldIdxSorted = idxSorted;
            idxSorted = min(data.length, idxSorted ? (idxSorted * 2) : 1);
            partialSort(data[oldIdxSorted .. $], idxSorted - oldIdxSorted);
        }
        return data[idx];
    }

    void popFront() { idx++; }
}

void main() {
    auto A = array(iota(25));
    randomShuffle(A);
    writeln(A);

    foreach (x; LazySort!(int[])(A))
        write(x, " ");
}



I have not done benchmarks on it yet. This code seems to work, but it is not efficient, it's just to show the semantics of the idea. A better implementation is welcome.

I think a lazy sort will be useful in Phobos. Timon Gehr seems to appreciate the idea.
@dlangBugzillaToGithub
Copy link
Author

andrei (@andralex) commented on 2011-10-07T15:58:21Z

The canonical solution uses a heap. Creating a heap is cheap and quickly amortized over only a few pops. An input range that creates a heap and then yields one element at a time would be a better idea.

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2011-10-07T16:21:35Z

(In reply to comment #1)
> The canonical solution uses a heap. Creating a heap is cheap and quickly
> amortized over only a few pops. An input range that creates a heap and then
> yields one element at a time would be a better idea.

If benchmarks show that a range that heapifies the input array is about as efficient as a tailored lazy sorting solution for about 4 to 10 requested max/min items, then I am OK with this idea :-)

@dlangBugzillaToGithub
Copy link
Author

andrei (@andralex) commented on 2011-10-07T17:17:20Z

(In reply to comment #2)
> (In reply to comment #1)
> > The canonical solution uses a heap. Creating a heap is cheap and quickly
> > amortized over only a few pops. An input range that creates a heap and then
> > yields one element at a time would be a better idea.
> 
> If benchmarks show that a range that heapifies the input array is about as
> efficient as a tailored lazy sorting solution for about 4 to 10 requested
> max/min items, then I am OK with this idea :-)

This is odd at a few levels. First, you opened this report. So it's not you who's supposed to be OK, it's the rest of us. You have the burden of proof. Second, there's no need for benchmarking anything - simple complexity analysis shows that partialSort does O(n) log(n-k) at each step, which pretty much makes it bankrupt compared to the heap approach.

@LightBender LightBender removed the P4 label Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants