
select_nth_unstable has quadratic worst-case time complexity; docs claim it should be linear #102451

Closed
qtow opened this issue Sep 29, 2022 · 2 comments · Fixed by #107522
Labels
C-bug Category: This is a bug.

Comments

qtow commented Sep 29, 2022

The docs claim that select_nth_unstable is "O(n) worst-case", but the implementation exhibits O(n²) time complexity for some inputs. I'm not sure whether this is a documentation error or an implementation error, but it's definitely one or the other.

The current implementation is a fairly standard quickselect. Notably, it only examines a constant number of elements to choose a pivot, which cannot guarantee linear time complexity for quickselect. Guaranteeing a linear worst case requires the Median of Medians algorithm, which is different from the "median of medians" pivot heuristic referenced in the code (that one only takes the median of a constant number of sampled medians).

For concrete proof that pathological cases exist, here's a playground link. It generates an input close to the worst possible for a fixed length, and compares its runtime against that of a random input. At a length of ~128k (the highest I can get the playground to go before it times out), the pathological input is around 1000 times slower than the random input.

Here are some graphs of runtime vs. input length:

[Figure: time vs. input length, linear scale]

[Figure: time vs. input length, log-log scale]

Both show runtime in seconds vs. slice length, with blue being random inputs and red being pathological inputs. The second is a log-log graph with lines fit to the data; the slopes are roughly 1.3 for the blue (random) line and 2.3 for the red (pathological) line.

To uphold the linear worst-case guarantee, select_nth_unstable would need to use Median of Medians. Introselect is the hybrid of quickselect and a worst-case-proof fallback, and Fast Deterministic Selection is the most recent work I'm aware of on optimizing Median of Medians.
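For reference, here is a hedged sketch of the classic Median of Medians selection (groups of five), written with allocations for clarity rather than in-place as a stdlib implementation would be. Taking the pivot as the median of the group medians guarantees that roughly 30% of the elements land on each side of it, which is what yields the O(n) worst-case bound.

```rust
/// Returns the k-th smallest element of `v` (0-indexed) in O(n) worst case.
/// Sketch only: copies freely instead of partitioning in place.
fn select_mom(v: &[u32], k: usize) -> u32 {
    assert!(k < v.len());
    if v.len() <= 5 {
        let mut s = v.to_vec();
        s.sort_unstable();
        return s[k];
    }
    // Median of each group of five elements.
    let medians: Vec<u32> = v
        .chunks(5)
        .map(|c| {
            let mut g = c.to_vec();
            g.sort_unstable();
            g[g.len() / 2]
        })
        .collect();
    // Recursively pick the median of those medians as the pivot.
    let pivot = select_mom(&medians, medians.len() / 2);
    // Three-way split by rank relative to the pivot.
    let less: Vec<u32> = v.iter().copied().filter(|&x| x < pivot).collect();
    let equal = v.iter().filter(|&&x| x == pivot).count();
    let greater: Vec<u32> = v.iter().copied().filter(|&x| x > pivot).collect();
    if k < less.len() {
        select_mom(&less, k)
    } else if k < less.len() + equal {
        pivot
    } else {
        select_mom(&greater, k - less.len() - equal)
    }
}

fn main() {
    let v: Vec<u32> = (0..101).rev().collect();
    println!("{}", select_mom(&v, 50)); // median of 0..=100
}
```

Both recursive calls shrink geometrically (the medians list is n/5 long, and each partition side holds at most ~70% of the input), so the recurrence T(n) ≤ T(n/5) + T(7n/10) + O(n) solves to O(n).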

If the worst-case guarantee is relaxed to an average-case guarantee (which is all that C++ gives for nth_element), I think the docs should make it clear that the worst case is quadratic.

qtow added the C-bug (Category: This is a bug.) label Sep 29, 2022
orlp (Contributor) commented Dec 1, 2022

I have a much simpler proof of concept that shows the quadratic behavior (playground):

use std::cmp::Ordering;
use std::cell::Cell;
use std::sync::atomic::{self, AtomicUsize};

static NUM_SOLIDS: AtomicUsize = AtomicUsize::new(0);

// Gas is always greater than Solid.
#[derive(Copy, Clone, PartialOrd, Ord, PartialEq, Eq, Debug)]
enum Value {
    Solid(usize),
    Gas,
}

// When comparing two EvilValue's it will always give a consistent answer, but
// for some 'mysterious' reason new elements always appear to be greater than
// previously seen ones. This is done by starting all values as Gas and only
// solidifying when necessary: Gas <-> Gas comparisons. Solid <-> Gas
// comparisons can always return that the Gas is greater without any commitment
// to any particular value being made.
#[derive(Clone, Debug)]
struct EvilValue {
    value: Cell<Value>,
}

impl EvilValue {
    pub fn new() -> Self {
        Self { value: Cell::new(Value::Gas) }
    }

    // Note that this doesn't violate any previous returned ordering,
    // as we create a new solid that is larger than any existing solids
    // which is consistent with any earlier Solid <-> Gas comparison we've made.
    pub fn solidify(&self) -> usize {
        if let Value::Solid(id) = self.value.get() {
            id
        } else {
            let id = NUM_SOLIDS.fetch_add(1, atomic::Ordering::Relaxed);
            assert!(id < usize::MAX);
            self.value.set(Value::Solid(id));
            id
        }
    }
}

impl PartialOrd for EvilValue {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        if self.value.get() == Value::Gas { other.solidify(); }
        self.value.partial_cmp(&other.value)
    }
}

impl PartialEq for EvilValue {
    fn eq(&self, other: &Self) -> bool {
        if self.value.get() == Value::Gas { other.solidify(); }
        self.value == other.value
    }
}

impl Eq for EvilValue { }
impl Ord for EvilValue {
    fn cmp(&self, other: &Self) -> Ordering {
        self.partial_cmp(other).unwrap()
    }
}

fn main() {
    for n in [10, 100, 1000, 10000, 100_000] {
        // Construct worst case.
        let mut values: Vec<_> = (0..n).map(|i| (EvilValue::new(), i)).collect();
        values.select_nth_unstable(n/2);
        let mut worst_case: Vec<usize> = vec![0; n];
        for (ev, i) in values {
            worst_case[i] = ev.solidify();
        }
        
        // Benchmark (ordinary usize comparisons - NOT EvilValue).
        let mut comparisons = 0;
    worst_case.select_nth_unstable_by(n / 2, |a, b| {
        comparisons += 1;
        a.cmp(b)
    });
        println!("{n}: {comparisons}");
    }
}

This is similar to my proof of concept showing the same issue for libc++'s nth_element: llvm/llvm-project#52747.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 18, 2023
Add heapsort fallback in `select_nth_unstable`

Addresses rust-lang#102451 and rust-lang#106933.

`slice::select_nth_unstable` uses a quickselect implementation based on the same pattern-defeating quicksort algorithm that `slice::sort_unstable` uses. `slice::sort_unstable` uses a recursion limit and falls back to heapsort if there were too many bad pivot choices, to ensure O(n log n) worst-case running time (known as introsort). However, `slice::select_nth_unstable` has no such fallback strategy, which leaves it with a worst-case running time of O(n²). rust-lang#102451 links to a playground which generates pathological inputs that show this quadratic behavior. On my machine, a randomly generated slice of length `1 << 19` takes ~200µs to find its median, whereas a pathological input of the same length takes over 2.5s. This PR adds an iteration limit to `select_nth_unstable` with a heapsort fallback, which ensures an O(n log n) worst-case running time (introselect). With this change, there was no noticeable slowdown for the random input, but the same pathological input now takes only ~1.2ms. In the future it might be worth implementing something like Median of Medians or Fast Deterministic Selection instead, which guarantee O(n) running time for all possible inputs. I've left this as a `FIXME` for now and only implemented the heapsort fallback to minimize the needed code changes.

I still think we should clarify in the `select_nth_unstable` docs that the worst case running time isn't currently O(n) (the original reason that rust-lang#102451 was opened), but I think it's a lot better to be able to guarantee O(n log n) instead of O(n²) for the worst case.
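The introselect pattern the PR describes can be sketched roughly like this (an illustrative reimplementation, not the actual PR code — the real limit heuristic differs, and `sort_unstable` stands in here for the heapsort fallback):

```rust
// Sketch of introselect: quickselect with a budget of pivot attempts.
// When the budget runs out, sort the remaining range instead, which
// caps the worst case at O(n log n) regardless of pivot quality.
fn introselect(v: &mut [u32], k: usize, limit: u32) {
    debug_assert!(k < v.len());
    if v.len() <= 1 {
        return;
    }
    if limit == 0 {
        // Budget exhausted: sorting the remaining range also places
        // the element at index k correctly (the PR uses heapsort).
        v.sort_unstable();
        return;
    }
    // Middle-element pivot with a Lomuto partition.
    let len = v.len();
    v.swap(0, len / 2);
    let pivot = v[0];
    let mut store = 1;
    for i in 1..len {
        if v[i] < pivot {
            v.swap(i, store);
            store += 1;
        }
    }
    v.swap(0, store - 1);
    let p = store - 1; // pivot's final position
    if k < p {
        introselect(&mut v[..p], k, limit - 1);
    } else if k > p {
        introselect(&mut v[p + 1..], k - p - 1, limit - 1);
    }
}

fn main() {
    let mut v: Vec<u32> = (0..10_000).rev().collect();
    // Hypothetical budget: about 2 * log2(n) pivot attempts.
    let limit = 2 * (usize::BITS - v.len().leading_zeros());
    introselect(&mut v, 5_000, limit);
    println!("{}", v[5_000]); // the median
}
```

The key property is that correctness never depends on the pivots: whether the recursion finishes normally or the budget-exhausted branch sorts the leftover range, index k ends up holding the k-th smallest element.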
Sp00ph (Member) commented Jan 27, 2023

(Copy pasting this from #106933. Adding this to stdlib would guarantee O(n) worst case for selection. After #106997 it's already guaranteed O(n log n).)

I also now implemented the fast deterministic selection algorithm here. The implementation is entirely based on this paper (and this repo), except that I couldn't be bothered to implement expand_partition and instead just used the existing stdlib partition. It completely blows heapsort out of the water: even on very short slices it isn't much slower than heapsort, and on longer slices it becomes way faster. Random inputs of length 1 << 16 take ~4ms to sort using heapsort on my machine, whereas my fast deterministic selection implementation only takes ~70µs to select the median. Also, there is no noticeable runtime degradation when choosing an index further from the middle of the slice, as there was with median of medians. Note also that I paid absolutely no attention to optimization yet, so it may well become even faster if we were to eliminate unnecessary bounds checks and such. The implementation also doesn't introduce that much new code (~200 lines, though with comments and optimizations that number will obviously grow).

In its current form it is a bit slower than select_nth_unstable on random input (though that might of course change with added optimizations), so I wouldn't (yet) consider always using it for selection, but for a fallback for introselect it looks very promising imo.

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Feb 19, 2023

Update documentation of select_nth_unstable and select_nth_unstable_by to state O(n^2) complexity

See rust-lang#102451
@bors bors closed this as completed in fb45513 May 26, 2023