New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial implementation of 'sort_at_index' for slices. #55448

Open
wants to merge 10 commits into
base: master
from

Conversation

Projects
None yet
9 participants
@Mokosha

Mokosha commented Oct 28, 2018

This is an analog to C++'s std::nth_element (a.k.a. quickselect).

Corresponds to tracking bug #55300.

Add initial implementation of 'sort_at_index' for slices -- analog to…
… C++'s std::nth_element (a.k.a. quickselect)
@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Oct 28, 2018

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @bluss (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@TimNN

This comment has been minimized.

Contributor

TimNN commented Nov 6, 2018

Ping from triage @bluss / @rust-lang/libs: This PR requires your review.

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Nov 6, 2018

@Mokosha, can you say more about why this is useful? What’s an example of a situation where you’d use this?

@rust-lang/libs, thoughts on having this feature in the standard library? It’s apparently already available on crates.io rust-lang/rfcs#1470 (comment)

@abonander

This comment has been minimized.

Contributor

abonander commented Nov 6, 2018

@SimonSapin I have two separate use-cases in my own projects, both involving selecting the median of a dataset.

In the former I implemented quickselect myself because I couldn't find any crate implementing quickselect or any other kth-element algorithm when I wrote the code back in 2016. I didn't discover order-stat until much later because it didn't have the keywords I was searching for.

That's a major issue with saying "just use something from crates.io", sometimes people don't know what they're actually looking for or what keywords to search to find it. This is a fundamental enough operation that it should reside in the stdlib, where people are most likely to look first for solutions.

In the latter because the dataset is a small, constant size and I needed to copy it to preserve the original ordering (potentially not necessary, I already see a possible optimization), I decided to just use the stdlib's sort and index the set directly.

In both cases I would prefer a stdlib-provided implementation if it was available. In both cases it would probably be more optimal than the solutions I ended up going with, and importing an external crate just for one function at one use site in each case seemed like overkill (though in img_hash's case I wanted to try implementing it myself anyway).

Show resolved Hide resolved src/libcore/slice/mod.rs Outdated
@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 7, 2018

This seems reasonable to me to add, pending bikeshedding the name/returns/etc

Add some more notes to the documentation:
- Mention that the median can be found if we used `len() / 2`.
- Mention that this function is usually called "kth element" in other libraries.
Show resolved Hide resolved src/libcore/slice/mod.rs Outdated
@Mark-Simulacrum

This comment has been minimized.

Member

Mark-Simulacrum commented Nov 15, 2018

@Mokosha It looks like there's been a few review comments, can you take a look at those?

@Mokosha

This comment has been minimized.

Mokosha commented Nov 16, 2018

@Mokosha It looks like there's been a few review comments, can you take a look at those?

Yes. My apologies. I will take a look at these over the weekend.

Show resolved Hide resolved src/libcore/slice/mod.rs Outdated
Show resolved Hide resolved src/libcore/slice/mod.rs Outdated
Show resolved Hide resolved src/libcore/slice/sort.rs Outdated

Mokosha added some commits Nov 27, 2018

Address some comments in PR:
- Change wording on some of the documentation
- Change recursive function into a loop
@Mokosha

This comment has been minimized.

Mokosha commented Nov 27, 2018

I haven't tested the changes, as I'm running into an issue building rust_llvm. Based on the fact that there isn't a bug filed for it yet, I'm willing to bet it has to do with my setup. Making sure first before feeling confident about this PR, but I felt like I should address the comments here in the meantime.

@Mokosha

This comment has been minimized.

Mokosha commented Dec 2, 2018

All tests pass. :) Waiting for additional feedback.

@abonander

This comment has been minimized.

Contributor

abonander commented Dec 4, 2018

@Mokosha they'll want the commits to be squashed before they merge. I don't have review permissions.

@Mokosha

This comment has been minimized.

Mokosha commented Dec 6, 2018

Is this something I'm in charge of? (Do I create a new PR that squashes the commits?) I thought that was an option when doing the merge...

@abonander

This comment has been minimized.

Contributor

abonander commented Dec 6, 2018

Merges aren't done directly through Github, I don't think bors will squash.

You can do an interactive rebase against the parent commit of your first one, mark all subsequent commits as "fixup" and then force-push to your branch when the rebase is complete. The change in commits will be immediately reflected on the PR.

@bluss

This comment has been minimized.

Contributor

bluss commented Dec 9, 2018

This seems good! The name doesn't feel intuitive to me (but not the operation either, so maybe I haven't used it anywhere). Name was discussed in rust-lang/rfcs/issues/1470. I think I understand this operation better as partition_at_nth or similar than as a sort. Any drawback I missed with that name?

@Mokosha

This comment has been minimized.

Mokosha commented Dec 9, 2018

I don't know which name is the most intuitive, honestly. The reason I didn't choose anything with partition was because it doesn't convey the fact that the list is somewhat sorted around the point that's being partitioned. I would expect partition_at_nth to do the same thing that split_at does. If people prefer a different name, though, I don't have any strong feelings about it (within reason 😉).

@Mark-Simulacrum

This comment has been minimized.

Member

Mark-Simulacrum commented Dec 9, 2018

Perhaps sort_until? If I understand this correctly, it sorts the first N elements -- so either sort_until or perhaps sort_first would make sense?

@Mokosha

This comment has been minimized.

Mokosha commented Dec 9, 2018

@Mark-Simulacrum It doesn't quite sort the first N elements. Rather, it chooses an index i of the slice, and makes sure that:

  • The element at index i is the same as if the slice was sorted.
  • All elements at positions [0, i) are "less than" the element at index i, in no particular order.
  • All elements at positions [i + 1, n) are "greater than" the element at index i, in no particular order.
@Mark-Simulacrum

This comment has been minimized.

Member

Mark-Simulacrum commented Dec 9, 2018

Ah, okay. Then I think the partition naming is probably the best that I can think of -- and seems to jive okay with the other partition method on slices (https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.partition_dedup).

If we do rename to the partition naming though I think changing the return type to (&mut [T], &mut T, &mut [T]) might be a good idea to provide immediate access to those three "parts" of the array. If the user doesn't want that, they can just not use the return type -- it should be essentially free to give it to them. It could also provide a good way of explaining the API.

@Mokosha

This comment has been minimized.

Mokosha commented Dec 11, 2018

OK -- good point about consistency. How about partition_at_index ?

Show resolved Hide resolved src/libcore/slice/sort.rs Outdated
Show resolved Hide resolved src/libcore/slice/sort.rs Outdated
@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Dec 15, 2018

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:014fe288:start=1544864775536698405,finish=1544864832701652752,duration=57164954347
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:28] 
[00:51:28] running 121 tests
[00:51:30] i..ii...iii..iiii.....i...i..........i..iii.............i.....i......ii...i..i.ii..............i...i 100/121
[00:51:31] i..ii.i.....iiii.....
[00:51:31] 
[00:51:31]  finished in 3.278
[00:51:31] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:45] 
[00:51:45] running 119 tests
[00:52:07] .iiiii...i.....i..i...i..i.i..i.ii..i.....i..i....i..........iiii.........i.i...ii...i.......ii.i.i. 100/119
[00:52:11] i......iii.i.....ii
[00:52:11] 
[00:52:11]  finished in 26.281
[00:52:11] travis_fold:end:test_debuginfo

---
[01:02:11] travis_fold:start:test_stage1-core
travis_time:start:test_stage1-core
Testing core stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:02:11]    Compiling core v0.0.0 (/checkout/src/libcore)
[01:02:16] error: use of deprecated item 'rand::XorShiftRng': import from rand_xorshift crate instead
[01:02:16]      |
[01:02:16]      |
[01:02:16] 1097 |     let mut rng = XorShiftRng::from_entropy();
[01:02:16]      |
[01:02:16]      = note: `-D deprecated` implied by `-D warnings`
[01:02:16] 
[01:02:16] 
[01:02:16] error: use of deprecated item 'rand::XorShiftRng': import from rand_xorshift crate instead
[01:02:16]      |
[01:02:16]      |
[01:02:16] 1095 |     use rand::{FromEntropy, Rng, XorShiftRng};
[01:02:16] 

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Address reviewer comments:
- Don't swap on each iteration when searching for min/max element.
- Add some docs about when we panic.
- Test that the sum of the lengths of the output matches the length of the input.
- Style fix for for-loop.
Show resolved Hide resolved src/libcore/slice/mod.rs Outdated
if is_less(&v[i], &v[index]) {
v.swap(index, i);
let mut min_index = 0;
for i in 1..(v.len()) {

This comment has been minimized.

@bluss

bluss Dec 15, 2018

Contributor

We can write this with an iterator over the vec, so we don't have to use indexing. iter + enumerate + min_by should work I think

This comment has been minimized.

@Mokosha

Mokosha Dec 15, 2018

I did this, but the code seems a bit uglier since I had to convert the is_less into something that returned Orderings. PTAL.

This comment has been minimized.

@bluss

bluss Dec 16, 2018

Contributor

Looking, sure, I agree with your taste, especially that we have the unwrap (because the sequence is known to be nonempty, if you have time, put in a comment above the unwrap to say this). Any iterator solution would be better than the old index based one, though.

.max_by_key would be a viable replacement to get rid of the Ordering, because you can just return the element reference as the key.

This comment has been minimized.

@Mokosha

Mokosha Dec 16, 2018

I don't follow about max_by_key. We need to use is_less to evaluate ordering, but it's a binary function, so we can't pass it to max_by_key. Could you elaborate?

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Dec 15, 2018

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:00847854:start=1544869930868984920,finish=1544869988702588898,duration=57833603978
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:46] 
[00:51:46] running 121 tests
[00:51:49] i..ii...iii..iiii.....i...i..........i..iii.............i.....i......ii...i..i.ii..............i...i 100/121
[00:51:49] i..ii.i.....iiii.....
[00:51:49] 
[00:51:49]  finished in 3.405
[00:51:49] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:52:04] 
[00:52:04] running 119 tests
[00:52:26] .iiiii...i.....i..i...i..i.i..i.ii..i.....i..i....i..........iiii.........i.i...ii...i.......ii.i.i. 100/119
[00:52:30] i......iii.i.....ii
[00:52:30] 
[00:52:30]  finished in 26.472
[00:52:30] travis_fold:end:test_debuginfo

---
[01:02:36] travis_fold:start:test_stage1-core
travis_time:start:test_stage1-core
Testing core stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:02:36]    Compiling core v0.0.0 (/checkout/src/libcore)
[01:02:40] error: use of deprecated item 'rand::XorShiftRng': import from rand_xorshift crate instead
[01:02:40]      |
[01:02:40]      |
[01:02:40] 1097 |     let mut rng = XorShiftRng::from_entropy();
[01:02:40]      |
[01:02:40]      = note: `-D deprecated` implied by `-D warnings`
[01:02:40] 
[01:02:40] 
[01:02:40] error: use of deprecated item 'rand::XorShiftRng': import from rand_xorshift crate instead
[01:02:40]      |
[01:02:40]      |
[01:02:40] 1095 |     use rand::{FromEntropy, Rng, XorShiftRng};
[01:02:40] 
4645780 .
3017408 ./obj
3002100 ./obj/build
---
182612 ./obj/build/x86_64-unknown-linux-gnu/test/ui
159932 ./obj/build/bootstrap/debug/incremental
153272 ./src/tools/clang
143832 ./obj/build/bootstrap/debug/incremental/bootstrap-2x7szixskz2uj
143828 ./obj/build/bootstrap/debug/incremental/bootstrap-2x7szixskz2uj/s-f7lzj03z7i-s93mg4-2fhvkcimqfl5d
136964 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc
128504 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu
128500 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gn/lib
68396 ./src/test

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Dec 15, 2018

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:0013ac16:start=1544898895283853520,finish=1544898948700520062,duration=53416666542
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:37] 
[00:51:37] running 121 tests
[00:51:40] i..ii...iii..iiii.....i...i..........i..iii.............i.....i......ii...i..i.ii..............i...i 100/121
[00:51:40] i..ii.i.....iiii.....
[00:51:40] 
[00:51:40]  finished in 3.364
[00:51:40] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:54] 
[00:51:54] running 119 tests
[00:52:17] .iiiii...i.....i..i...i..i.i..i.ii..i.....i..i....i..........iiii.........i.i...ii...i.......ii.i.i. 100/119
[00:52:21] i......iii.i.....ii
[00:52:21] 
[00:52:21]  finished in 26.476
[00:52:21] travis_fold:end:test_debuginfo

---
[01:02:30] travis_fold:start:test_stage1-core
travis_time:start:test_stage1-core
Testing core stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:02:30]    Compiling core v0.0.0 (/checkout/src/libcore)
[01:02:34] error: use of deprecated item 'rand::XorShiftRng': import from rand_xorshift crate instead
[01:02:34]      |
[01:02:34]      |
[01:02:34] 1097 |     let mut rng = XorShiftRng::from_entropy();
[01:02:34]      |
[01:02:34]      = note: `-D deprecated` implied by `-D warnings`
[01:02:34] 
[01:02:34] 
[01:02:34] error: use of deprecated item 'rand::XorShiftRng': import from rand_xorshift crate instead
[01:02:34]      |
[01:02:34]      |
[01:02:34] 1095 |     use rand::{FromEntropy, Rng, XorShiftRng};
[01:02:34] 
4647684 .
3019384 ./obj
3004056 ./obj/build
---
182732 ./obj/build/x86_64-unknown-linux-gnu/test/ui
160528 ./obj/build/bootstrap/debug/incremental
153280 ./src/tools/clang
144428 ./obj/build/bootstrap/debug/incremental/bootstrap-2x7szixskz2uj
144424 ./obj/build/bootstrap/debug/incremental/bootstrap-2x7szixskz2uj/s-f7mctrj43q-11zvdmy-4zt1yysir9st
137068 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc
128608 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu
128604 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gn

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bluss

This comment has been minimized.

Contributor

bluss commented Dec 16, 2018

The travis error indicates we have to update the test code to use a different Rng, for example SmallRng.

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Dec 17, 2018

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:08a0fe58:start=1545000783692634956,finish=1545000838410597843,duration=54717962887
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:start:test_codegen
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:51:47] 
[00:51:47] running 121 tests
[00:51:50] i..ii...iii..iiii.....i...i..........i..iii.............i.....i......ii...i..i.ii..............i...i 100/121
[00:51:50] i..ii.i.....iiii.....
[00:51:50] 
[00:51:50]  finished in 3.364
[00:51:50] travis_fold:end:test_codegen

---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:52:05] 
[00:52:05] running 119 tests
[00:52:27] .iiiii...i.....i..i...i..i.i..i.ii..i.....i..i....i..........iiii.........i.i...ii...i.......ii.i.i. 100/119
[00:52:31] i......iii.i.....ii
[00:52:31] 
[00:52:31]  finished in 26.212
[00:52:31] travis_fold:end:test_debuginfo

---
[01:06:53] .................................................................................................... 1700/2213
[01:07:04] .................................................................................................... 1800/2213
[01:07:16] .................................................................................................... 1900/2213
[01:07:28] .................................................................................................... 2000/2213
[01:07:43] ........................F........................................................................... 2100/2213
[01:07:58] .............
[01:07:58] failures:
[01:07:58] 
[01:07:58] ---- slice/mod.rs - slice::[T]::partition_at_index_by_key (line 1715) stdout ----
[01:07:58] ---- slice/mod.rs - slice::[T]::partition_at_index_by_key (line 1715) stdout ----
[01:07:58] error[E0658]: use of unstable library feature 'partition_at_index' (see issue #55300)
[01:07:58]  --> slice/mod.rs:1721:3
[01:07:58]   |
[01:07:58] 9 | v.partition_at_index_by_key(2, |a| a.abs());
[01:07:58]   |
[01:07:58]   |
[01:07:58]   = help: add #![feature(partition_at_index)] to the crate attributes to enable
[01:07:58] 
[01:07:58] thread 'slice/mod.rs - slice::[T]::partition_at_index_by_key (line 1715)' panicked at 'couldn't compile the test', src/librustdoc/test.rs:326:13
[01:07:58] 
[01:07:58] 
[01:07:58] failures:
[01:07:58]     slice/mod.rs - slice::[T]::partition_at_index_by_key (line 1715)
[01:07:58]     slice/mod.rs - slice::[T]::partition_at_index_by_key (line 1715)
[01:07:58] 
[01:07:58] test result: FAILED. 2209 passed; 1 failed; 3 ignored; 0 measured; 0 filtered out
[01:07:58] 
[01:07:58] error: test failed, to rerun pass '--doc'
[01:07:58] 
[01:07:58] 
[01:07:58] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:07:58] 
[01:07:58] 
[01:07:58] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:07:58] Build completed unsuccessfully 

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Mokosha

This comment has been minimized.

Mokosha commented Dec 17, 2018

Sorry about the build failure spam. I hadn't merged master in a while, and I got impatient waiting for doctests to run before adding changes. I'll make sure I have a clean build before uploading more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment