libc++ `std::nth_element` is quadratic, should be linear #52747

orlp · 2021-12-16T15:23:44Z

The reproduction program is almost the same as my 7 year old one for std::sort that got fixed only recently:

#include <algorithm>
#include <iostream>
#include <vector>

void make_killer(int size, std::vector<int>& v) {
    int candidate = 0;
    int num_solid = 0;
    int gas = size - 1;

    std::vector<int> tmp(size);
    v.resize(size);

    for (int i = 0; i < size; ++i) {
        tmp[i] = i;
        v[i] = gas;
    }

    std::nth_element(tmp.begin(), tmp.end() - 2, tmp.end(), [&](int x, int y) {
        if (v[x] == gas && v[y] == gas) {
            if (x == candidate) v[x] = num_solid++;
            else v[y] = num_solid++;
        }

        if (v[x] == gas) candidate = x;
        else if (v[y] == gas) candidate = y;

        return v[x] < v[y];
    });
}

int main(int argc, char** argv) {
    std::vector<int> v;
    int comparison_count;

    auto counter = [&](int x, int y) { ++comparison_count; return x < y; };

    std::cout << "N: comparisons\n";
    for (int i = 100; i <= 6400; i *= 2) {
        // to nullify small constants we multiply by 100
        make_killer(i, v);

        comparison_count = 0;
        std::nth_element(v.begin(), v.end() - 2, v.end(), counter);
        std::cout << i << ": " << comparison_count << "\n";
    }

    return 0;
}

The output:

N: comparisons
100: 2741
200: 10491
400: 40991
800: 161991
1600: 643991
3200: 2567991
6400: 10255991

Evidently quadratic, but the standard requires:

Complexity: For the overloads with no ExecutionPolicy, linear on average. For the overloads
with an ExecutionPolicy, O(N) applications of the predicate, and O(N log N) swaps, where N = last - first.

The problem is that pure quickselect is implemented without a fallback for worst cases, like median of medians.

The text was updated successfully, but these errors were encountered:

ldionne · 2021-12-16T18:07:25Z

@nilayvaish @Morwenn

Any interest in working on this? I'm just trying to nerdsnipe people, don't feel any obligation. :-)

nilayvaish · 2021-12-16T19:23:16Z

I'll take a look at this.

@ldionne, I have been meaning to increase my participation in libc++ development. Feel free to send some part of the work my way.

Morwenn · 2021-12-16T21:10:13Z

I've been stealing the nth_element implementation I use from miniselect, which itself took it from Andrei Alexandrescu's adaptive quicskelect own implementation. Unfortunately I haven't really tried to understand how it works yet, and it's rather massive.

I guess an introselect that falls back to median-of-median or heap selection would be enough for most uses. Median-of-median itself isn't terribly hard to implement in O(1) space, but it's probably better to have it as a doubly-recursive algorithm with quickselect. HeapSelect is probably the easiest solution though since heap operations are already in the standard library.

danlark1 · 2021-12-17T02:41:46Z

Author of miniselect is here. I analyzed quite thoroughly all existing nth_element implementations and the one from Alexandrescu seems the best to be standard compliant, fast and predictable at the same time

I strongly recommend not to have median of medians implementation as it has huge constant underneath, see the benchmarks in the repository

The easiest way will be to implement HeapSelect as a fallback, current nth_element is already quite good in terms of speed. See the benchmark section

danlark1 · 2021-12-17T02:47:16Z

By the way, technically current implementation is standard compliant

For the overloads with no ExecutionPolicy, linear on average.

Quickselect works linear on average

ldionne · 2021-12-17T15:55:43Z

@nilayvaish

I'll take a look at this.

@ldionne, I have been meaning to increase my participation in libc++ development. Feel free to send some part of the work my way.

Thank you so much!

@danlark1

The easiest way will be to implement HeapSelect as a fallback, current nth_element is already quite good in terms of speed.

The link you shared says that libstdc++'s implementation is good, but it says about the libc++ implementation:

This algorithm is used in standard library and is not recommended to use if you are looking for performance.

So I think we definitely do want to improve the libc++ implementation, right?

danlark1 · 2021-12-17T16:07:41Z

So I think we definitely do want to improve the libc++ implementation, right?

Benchmarks showed that Floyd-Rivest is the best algorithm to use but it requires floating point arithmetic and libc++ implementation is much better on average than libstdc++ right now.

The final solution depends on the goals of that. At google we don't see much perf opportunity for nth_element, for example. Not that much to spend even a month working on that.

If we want to "fix" worst case scenarios, then we can add HeapSelect and forget about it by spending several hours.

If we want to be at top of the industry, we can consider Alexandrescu's algorithm or Floyd-Rivest one.

And as I said, we already are standard compliant. Unlike std::sort, std::nth_element is required to be linear on average.

orlp · 2021-12-17T18:04:13Z

By the way, technically current implementation is standard compliant

For the overloads with no ExecutionPolicy, linear on average.

Quickselect works linear on average

I don't agree, even if we go strictly by the standard, if we allow custom comparison functions. Take this example, where I shuffle the input before using nth_element on it. No matter how it's shuffled, we are quadratic.

#include <algorithm>
#include <iostream>
#include <vector>
#include <random>

int quadratic(int size) {
    int num_solid = 0;
    int gas = size + 1;
    int comparison_count = 0;

    std::vector<int> indices(size);
    std::vector<int> values(size);
    for (int i = 0; i < size; ++i) {
        indices[i] = i;
        values[i] = gas;
    }

    std::random_device rd;
    std::mt19937 g(rd());
    std::shuffle(indices.begin(), indices.end(), g); // Enforce uniform input distribution!
    std::nth_element(indices.begin(), indices.end() - 2, indices.end(), [&](int x, int y) {
        // Invariant: gas always compares greater than solid.
        comparison_count += 1;
        if (values[x] == gas && values[y] == gas) {
            // We must solidify one of the two elements.
            // Note: still greater than any previous solids - doesn't violate order.
            values[x] = num_solid++;
            return true;
        } else if (values[x] == gas) {
            return false;
        } else if (values[y] == gas) {
            return true;
        } else {
            return values[x] < values[y];
        }
    });
    return comparison_count;
}

int main(int argc, char** argv) {
    std::cout << "N: comparisons\n";
    for (int i = 100; i <= 6400; i *= 2) {
        std::cout << i << ": " << quadratic(i) << "\n";
    }
    return 0;
}

Output:

N: comparisons
100: 2788
200: 10588
400: 41188
800: 162388
1600: 644788
3200: 2569588
6400: 10259188

Mind you, this comparison function satisfies the requirements of a strict weak ordering and is perfectly valid. I don't think the standard claims to only be linear on average when used with std::less or std::greater.

danlark1 · 2021-12-17T19:19:43Z

Good point and self-referenced comparators definitely make the wording slightly ambiguous

alg.nth.element-3

Effects: After nth_element the element in the position pointed to by nth is the element that would be in that position if the whole range were sorted with respect to comp and proj, unless nth == last

alg.sorting#general-5

A sequence is sorted with respect to a comp and proj for a comparator and projection comp and proj if for every iterator i pointing to the sequence and every non-negative integer n such that i + n is a valid iterator pointing to an element of the sequence,
bool(invoke(comp, invoke(proj, *(i + n)), invoke(proj, *i)))
is false.

I can come up with the sequence of such invocations (from std::sort) where from some point all of them will be false with your comparator. And if removing these invocations, nth_element is going to return the wrong value.

What "if the whole range were sorted" is ambiguous here.

Does that mean if it was sorted with std::sort or just any order? Or does that mean std::is_sorted returns true?

"On average" here does not make sense as well as the definition of the sorted range where the property starts to depend on the order of checking itself.

We can bring this to discussion at c++ mailing list but generally I try not to be a lawyer and "on average" in the standard meant likely the ability to implement quickselect.

Morwenn · 2021-12-17T21:09:26Z

I'm reading the introselect Wikipedia article again and surprised to see that it includes strategies that, as far as I can tell, haven't been explored in miniselect:

Introselect works by optimistically starting out with quickselect and only switching to a worst-case linear-time selection algorithm (the Blum-Floyd-Pratt-Rivest-Tarjan median of medians algorithm) if it recurses too many times without making sufficient progress. The switching strategy is the main technical content of the algorithm. Simply limiting the recursion to constant depth is not good enough, since this would make the algorithm switch on all sufficiently large lists. Musser discusses a couple of simple approaches:

Keep track of the list of sizes of the subpartitions processed so far. If at any point k recursive calls have been made without halving the list size, for some small positive k, switch to the worst-case linear algorithm.
Sum the size of all partitions generated so far. If this exceeds the list size times some small positive constant k, switch to the worst-case linear algorithm. This sum is easy to track in a single scalar variable.

Both approaches limit the recursion depth to k ⌈log n⌉ = O(log n) and the total running time to O(n).

Also if I understand your implementation in miniselect, you really implemented a raw median-of-medians which doesn't attempt to pick the pivot any other way than with median-of-medians itself. The techniques proposed in the quote above might be worth exploring as a simple alternative to Alexandrecu's adaptive quickselect.

github-actions bot added the new issue label Dec 16, 2021

Morwenn mentioned this issue Dec 16, 2021

The random-access version of quick_merge_sort runs in O(n²) Morwenn/cpp-sort#179

Closed

ldionne added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Dec 16, 2021

mkurdej removed the new issue label Dec 28, 2021

Quuxplusone added the confirmed Verified by a second party label Mar 8, 2022

orlp mentioned this issue Dec 1, 2022

select_nth_unstable has quadratic worst-case time complexity; docs claim it should be linear rust-lang/rust#102451

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

libc++ `std::nth_element` is quadratic, should be linear #52747

libc++ `std::nth_element` is quadratic, should be linear #52747

orlp commented Dec 16, 2021

ldionne commented Dec 16, 2021

nilayvaish commented Dec 16, 2021

Morwenn commented Dec 16, 2021

danlark1 commented Dec 17, 2021 •

edited

Loading

danlark1 commented Dec 17, 2021

ldionne commented Dec 17, 2021

danlark1 commented Dec 17, 2021 •

edited

Loading

orlp commented Dec 17, 2021

danlark1 commented Dec 17, 2021

Morwenn commented Dec 17, 2021 •

edited

Loading

libc++ std::nth_element is quadratic, should be linear #52747

libc++ std::nth_element is quadratic, should be linear #52747

Comments

orlp commented Dec 16, 2021

ldionne commented Dec 16, 2021

nilayvaish commented Dec 16, 2021

Morwenn commented Dec 16, 2021

danlark1 commented Dec 17, 2021 • edited Loading

danlark1 commented Dec 17, 2021

ldionne commented Dec 17, 2021

danlark1 commented Dec 17, 2021 • edited Loading

orlp commented Dec 17, 2021

danlark1 commented Dec 17, 2021

Morwenn commented Dec 17, 2021 • edited Loading

libc++ `std::nth_element` is quadratic, should be linear #52747

libc++ `std::nth_element` is quadratic, should be linear #52747

danlark1 commented Dec 17, 2021 •

edited

Loading

danlark1 commented Dec 17, 2021 •

edited

Loading

Morwenn commented Dec 17, 2021 •

edited

Loading