Implement sorting in place for OrderMap, OrderSet #57

bluss · 2018-01-03T21:22:04Z

Implement an in place sorting method for OrderMap (and OrderSet). We use a very neat trick and temporarily save away the hash for each entry somewhere else and store the old index in the hash field. The benefit is that we can use self.entries.sort_by directly after that, which is very fast.
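A minimal sketch of that trick, under stated assumptions: the `Bucket` layout, the function name `sort_entries_by`, and the reuse of the side vector for the old-to-new mapping are illustrative and not taken from the PR's code.

```rust
use std::cmp::Ordering;

/// Toy stand-in for an entry: the hash is stored next to the key and value.
struct Bucket<K, V> {
    hash: usize,
    key: K,
    value: V,
}

/// Sorts `entries` in place and returns `new_pos`, where `new_pos[old]` is
/// the position that the entry formerly at index `old` ends up at.
fn sort_entries_by<K, V, F>(entries: &mut [Bucket<K, V>], mut compare: F) -> Vec<usize>
where
    F: FnMut(&K, &V, &K, &V) -> Ordering,
{
    // 1. Save every hash on the side (the one extra allocation) and
    //    overwrite the hash field with the entry's current index.
    let mut side: Vec<usize> = Vec::with_capacity(entries.len());
    for (i, e) in entries.iter_mut().enumerate() {
        side.push(e.hash);
        e.hash = i;
    }

    // 2. Sort the entries directly; each one carries its old index along.
    entries.sort_by(|a, b| compare(&a.key, &a.value, &b.key, &b.value));

    // 3. Read the old index back, restore the hash, and reuse the slot in
    //    `side` to record where the old index went.
    for (new, e) in entries.iter_mut().enumerate() {
        let old = e.hash;
        e.hash = side[old];
        side[old] = new;
    }
    side
}
```

The returned mapping is what a caller would need in order to point each hash table slot at the new position of its entry.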

I don't think the extra allocation is avoidable (but we could have unstable sort too, avoiding an allocation).

Benchmarks from the current version; ordermap_sort is the sorting algorithm in this PR, and ordermap_simple_sort is the simple implementation.

test ordermap_simple_sort_s                ... bench:   3,169,240 ns/iter (+/- 180,707)
test ordermap_simple_sort_u32              ... bench:     845,610 ns/iter (+/- 54,359)
test ordermap_sort_s                       ... bench:   2,550,010 ns/iter (+/- 279,423)
test ordermap_sort_u32                     ... bench:     678,828 ns/iter (+/- 10,313)

The maps have 10k key-value pairs. s is for OrderMap<String, String> and u32 is for OrderMap<u32, u32>. ~~As you can see, the simple implementation wins when the key/value types are simple.~~ (Now fixed!)

bluss · 2018-01-03T21:24:25Z

Please read the last commits in the PR, the ones that implement sort_by and sort_keys

vitiral · 2018-01-03T22:42:41Z

I wish I could review this but (after looking at the code) I don't know enough about the inner workings of the data structure to make a comment. Thanks for doing this!

bluss · 2018-01-03T22:45:01Z

The improvements we could do with unsafe code are (I think) mostly about:

- Removing bounds checks for the indexing used across sort_by
- Removing the second writes when applying the permutation. By that I mean the swaps; we move values with swap in safe Rust, instead of pushing one hole around. Pushing a hole halves the number of writes needed.

bluss · 2018-01-03T22:46:04Z

@vitiral Any review would be appreciated, of course! This implementation is actually rather simple -- it makes an identity permutation of the indices, then sorts that permutation with the user's comparison function (mapping each index to its key-value pair), and then we apply the permutation to the two parts of the hash table.
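The three steps can be sketched roughly as follows. This is an illustration, not the PR's code: entries are modeled as a plain slice of `(K, V)` tuples, the names `sorted_permutation` and `apply_permutation` are made up for the example, and the permutation is applied out of place by cloning to keep it short.

```rust
use std::cmp::Ordering;

/// Returns `perm`, where `perm[new]` is the index in `entries` of the pair
/// that belongs at position `new` after sorting.
fn sorted_permutation<K, V, F>(entries: &[(K, V)], mut compare: F) -> Vec<usize>
where
    F: FnMut(&K, &V, &K, &V) -> Ordering,
{
    // Start from the identity permutation 0, 1, 2, ...
    let mut perm: Vec<usize> = (0..entries.len()).collect();
    // Sort the indices, looking up the pair behind each index to compare.
    perm.sort_by(|&i, &j| {
        let (ki, vi) = &entries[i];
        let (kj, vj) = &entries[j];
        compare(ki, vi, kj, vj)
    });
    perm
}

/// Builds the reordered entries from `perm` (out of place, for clarity).
fn apply_permutation<T: Clone>(entries: &[T], perm: &[usize]) -> Vec<T> {
    perm.iter().map(|&old| entries[old].clone()).collect()
}
```

Every comparison goes through two indexed lookups into `entries`, which is the indirection that later comments in this thread identify as a cost.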

vitiral · 2018-01-03T22:47:04Z

the docstrings look good to me though. Some minor suggested tweaks:

+    /// Sort the map’s key-value pairs by the default ordering of the keys

Consider adding a period at the end.

Sort the map’s key-value pairs in place using the comparison
function `compare`; the comparison function receives two key and
value pairs to compare (so you can sort by keys or values).

Consider changing to:

    /// Sort the map's key-value pairs in place using the comparison
    /// function.
    ///
    /// The comparison function receives two key and value pairs to compare, so you can sort by
    /// keys or values or both.

I prefer docs to have a single sentence "basic summary" and then "further description" below. I also added the "or both" description and removed the parentheses.

(In addition I aligned by 100 columns which is the default in rust, feel free to do whatever alignment you want).

(Final edit: I realized that I aligned without the correct comment inlines. fixed)

vitiral · 2018-01-03T22:51:15Z

It's amazing to me that you can simply mutate the Pos objects to a different index value and "voilà!" it is updated. I'm definitely not up to speed on how you can then quickly iterate in the correct order. That's pretty incredible!

bluss · 2018-01-03T22:56:43Z

Simplified for the common case where capacity fits in 32 bits:

- self.indices: Box<[Pos]> is like the actual hash table. Each Pos there is one u32 with the index (into self.entries) and one u32 with the hash.
- self.entries: Vec<Bucket<K, V>> is the Key-Value-(Hash) pairs (triplets) in their order.

To change the order in sort_by we first go through the "actual hash table" and update each Pos there and switch the old index to the new index.

Then we permute self.entries so that it uses the new indexing.

When we iterate the ordermap in its order, we completely ignore self.indices and just iterate self.entries in order. That's why iteration is so fast: it's just a mapped version of the contiguous slice iterator that Vec also uses.
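To make the two-part layout concrete, here is a toy model. It is an assumption-laden sketch: `ToyMap`, `remap_indices`, and the use of `Option<Pos>` for empty slots are invented for illustration, and probing and insertion are left out entirely.

```rust
/// One slot of the hash table part: an index into `entries` plus the hash.
#[derive(Clone, Copy)]
struct Pos {
    index: u32,
    hash: u32,
}

struct ToyMap<K, V> {
    /// The "actual hash table"; `None` marks an empty slot (an assumption).
    indices: Box<[Option<Pos>]>,
    /// The pairs in their order.
    entries: Vec<(K, V)>,
}

impl<K, V> ToyMap<K, V> {
    /// Iteration ignores `indices` and walks the contiguous entries.
    fn iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.entries.iter().map(|(k, v)| (k, v))
    }

    /// After `entries` has been reordered, point every `Pos` at the new
    /// position of its entry; `new_pos[old]` gives that position.
    fn remap_indices(&mut self, new_pos: &[u32]) {
        for pos in self.indices.iter_mut().flatten() {
            pos.index = new_pos[pos.index as usize];
        }
    }
}
```

Note that `remap_indices` never moves a slot within the table and never touches a hash; only the stored index changes, which is why no rehashing is needed.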

vitiral

Gotta love quickcheck -- doesn't take a lot of tests and it is easy to feel confident that the library does a pretty decent job :)

vitiral · 2018-01-03T22:56:16Z

tests/quick.rs

+        let mut answer = keyvals.0;
+        answer.sort_by_key(|t| t.0);
+
+        // reverse dedup: Because OrderMap::from_iter keeps the last value for


good comment!

vitiral · 2018-01-03T23:00:23Z

I just realized, is removal pretty slow for this library since you have to remove an entry from self.entries which is a Vec?

The README states:

Removal is fast since it moves memory areas only in the first vector, and uses a single swap in the second vector.

But for a large number of elements, removing a lower index should be pretty slow right?

Honestly, fast removal isn't that useful IMO. I rarely have to remove an item.

vitiral · 2018-01-03T23:03:06Z

I get it now! You can apply the new indexes in only O(N) time since you just need to properly swap things in a single pass. Cool!

bluss · 2018-01-03T23:06:08Z

The slow order preserving removal is not implemented -- it's just a swap_remove removal.

OrderMap is not a perfect structure -- we can't have it all. Slow order preserving removal is also not what we want. So my general thought is: we want another variant crate of ordermap, that is not indexable, uses tombstones and has O(1) order preserving removals. Current crate will stay as it is in general because usize-indexable hash maps are also very useful.
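For reference, this kind of O(1) removal can be illustrated with `Vec::swap_remove` from the standard library. The helper name `swap_remove_entry` and its return shape are hypothetical; a real map would also have to patch the hash table slot of the entry that was moved.

```rust
/// Removes the entry at `index` in O(1) by moving the last entry into its
/// place. Returns the removed entry and, if another entry was moved, the
/// pair (old position, new position) so that a lookup table can be patched.
///
/// Panics if `index` is out of bounds.
fn swap_remove_entry<T>(entries: &mut Vec<T>, index: usize) -> (T, Option<(usize, usize)>) {
    assert!(index < entries.len(), "index out of bounds");
    let last = entries.len() - 1;
    let removed = entries.swap_remove(index);
    // Only a removal from the middle relocates another entry.
    let moved = if index != last { Some((last, index)) } else { None };
    (removed, moved)
}
```

By contrast, `Vec::remove` keeps the order but shifts every later element down by one, which is the slow order preserving removal mentioned above.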

bluss · 2018-01-03T23:06:58Z

Yes the permutation algo is pretty wild. And it destroys the permutation vector in the process, for its own bookkeeping :-)
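One way to apply a permutation in place while consuming it is to follow its cycles and overwrite each visited slot with its own index as a "done" marker. The sketch below does that with safe swaps; it illustrates the idea and is not necessarily the routine in this PR, and the name `apply_permutation_in_place` is made up.

```rust
/// Reorders `items` so that position `i` ends up holding the element that
/// was originally at `perm[i]`. `perm` must be a permutation of
/// `0..items.len()`; it is overwritten and ends up as the identity.
fn apply_permutation_in_place<T>(items: &mut [T], perm: &mut [usize]) {
    assert_eq!(items.len(), perm.len());
    for start in 0..items.len() {
        let mut current = start;
        loop {
            let source = perm[current];
            // Mark `current` as finished by making it a fixed point.
            perm[current] = current;
            if source == start {
                // The cycle is closed; the element that began at `start`
                // has been carried along and now sits at `current`.
                break;
            }
            items.swap(current, source);
            current = source;
        }
    }
}
```

Each position is visited a constant number of times, so the pass is O(N); every move is a swap, which is the source of the extra writes discussed earlier in the thread.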

vitiral · 2018-01-03T23:08:40Z

ah, swap_remove -- that's right! I forgot that removal screws up the order.

I think you made a very reasonable choice in terms of the different technical options. This is an AMAZING library. In particular it makes it SO much easier to work with Maps and Sets while testing since I can control the order (and I don't have the real world performance penalty of a BTree). This feature will make that even easier!

clarfonthey · 2018-01-03T23:10:37Z

src/lib.rs

@@ -1110,6 +1110,65 @@ impl<K, V, S> OrderMap<K, V, S>
        });
    }

+    /// Sort the map’s key-value pairs by the default ordering of the keys.
+    pub fn sort_keys(&mut self)


Perhaps a sort_keys_by helper would be nice as well?

Additionally, would sort_values and sort_values_by be good for completion?

personal preference: I don't want API explosion and I doubt that would be used very often. Even when it is used, it would not have significant savings.

Benefit: if keys and values are comparable it might be possible to do an accidental footgun like:

h.sort_by(|k1, k2, _, _| k1.cmp(k2)) // oops: the second argument is v1, not k2

Needless to say, this would be annoying (IMO still not enough of a benefit to justify it though).

I don't want API explosion either. And we need to keep some headroom for the (inevitable?) sort_unstable family. Given the crate in question here, preferring stable sort seems obvious.

Be a good sport about it, but you said "would be nice"! We should not fall into the "would be nice" trap. It gets in the way of the "this is my actual usecase" stuff 😄

bluss · 2018-01-03T23:23:55Z

src/lib.rs

+        new_index.sort_by(|&i, &j| {
+            let ei = &self.entries[i];
+            let ej = &self.entries[j];
+            compare(&ei.key, &ei.value, &ej.key, &ej.value)


I measured it: using unchecked indexing to get ei and ej decreases the sort_by benchmark time by up to 10% here. Just FYI.

bluss · 2018-01-03T23:43:33Z

Well this is boring. In the benchmark vs the simple version that @vitiral wrote (using drain, sort, extend); the simple version wins.

100K key-value pairs key type: u32, value type: u32

test ordermap_simple_sort                  ... bench:  11,565,368 ns/iter (+/- 1,280,747)
test ordermap_sort                         ... bench:  15,780,304 ns/iter (+/- 1,037,502)

We can decide what to do about that tomorrow. Now you see, sorting Vecs in place is pretty nice. (I wonder what we need in order to do this faster in place?)

Can we change the rules to win? Key type String, Value type String; with 10K key-value pairs gives a slight preference to this PR instead

test ordermap_simple_sort                  ... bench: 167,297,667 ns/iter (+/- 17,713,737)
test ordermap_sort                         ... bench: 151,640,343 ns/iter (+/- 9,195,779)

bluss · 2018-01-04T01:27:01Z

Ping @stjepang! Maybe you can see & superficially understand the application of slice::sort_by in this PR and we can meditate a bit over this, how to make the in place sort of OrderMap efficient.

The indirection in the current state of this PR is not good for performance, I think that's my conclusion. I'd like to sort the self.entries in place, and still somehow also produce a record that shows which start index moved to which index in the result. (Which is then used to update the self.indices.)

My mind is still on something I think we have talked about before -- sorting two slices in lock step. Imagine taking Vec::from_iter(0..v.len()) and v itself and sorting them both in lock step with some comparison function over the elements of v. This wouldn't be hard to implement I guess, it would just be weird and some unfortunate code duplication over the existing slice::sort_by.
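A rough sketch of the lock-step idea, assuming a deliberately simple insertion sort instead of a duplicate of `slice::sort_by`, so it is O(n²) and only meant to show the interface. The name `sort_lockstep_by` is hypothetical.

```rust
use std::cmp::Ordering;

/// Stable insertion sort of `v` that applies every swap to `tags` as well,
/// so the two slices stay paired up element by element.
fn sort_lockstep_by<T, P, F>(v: &mut [T], tags: &mut [P], mut compare: F)
where
    F: FnMut(&T, &T) -> Ordering,
{
    assert_eq!(v.len(), tags.len());
    for i in 1..v.len() {
        let mut j = i;
        // Move v[j] left while it is strictly smaller than its neighbor;
        // stopping on equality keeps the sort stable.
        while j > 0 && compare(&v[j], &v[j - 1]) == Ordering::Less {
            v.swap(j, j - 1);
            tags.swap(j, j - 1);
            j -= 1;
        }
    }
}
```

If `tags` starts out as `0..v.len()`, then after the call `tags[new]` holds the index each element started at, which is the record needed to update the index table.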

vitiral · 2018-01-04T01:43:59Z

That is boring! I certainly wasn't trying to implement the fastest method, haha

bluss · 2018-01-04T19:34:42Z

I appreciate the review! There are up to date benchmarks in the first post. They show the same thing as before just a bit more cleanly; the simple key/value type of u32 makes the simple sort implementation win.

For complex types it should be important that this PR's sort saves us any rehashing or hash comparisons.

The remaining performance challenge I can see here is not the lack of unchecked indexing or any simple fixes that we could do with unsafe code (but won't, now), but the actual sort_by call itself. To be fast, we need a regular sort by over self.entries, not via this indirection.

We still merge this PR because this functionality, while not the fastest, is very useful, and we can improve upon it later.

The comparison with the simple sort as of this post (now superseded with faster sort)

test ordermap_simple_sort_s                ... bench:   3,153,184 ns/iter (+/- 100,313)
test ordermap_simple_sort_u32              ... bench:     854,225 ns/iter (+/- 39,856)
test ordermap_sort_s                       ... bench:   2,955,574 ns/iter (+/- 187,388)
test ordermap_sort_u32                     ... bench:   1,024,853 ns/iter (+/- 45,365)

bluss · 2018-01-04T20:09:04Z

Whaaaat whaat wut wut there's a simple way to do the faster and in place-er sort_by.

The improvement of that new commit / new version of the sort by algo: (I may squash it later):

 name               63 ns/iter  62 ns/iter  diff ns/iter   diff % 
 ordermap_sort_s    2,962,953   2,500,124       -462,829  -15.62% 
 ordermap_sort_u32  1,022,244   671,821         -350,423  -34.28%

Now it's faster than the "simple" version too. 😄

bluss · 2018-01-04T20:55:48Z

Too awesome to sit around unreleased.

vitiral · 2018-01-04T21:20:07Z

ya, additional performance improvements can be done later. Glad you finally beat the simple version though 😄

ghost · 2018-01-04T21:47:37Z

@bluss Your solution is very nice - I think it doesn't get better than that. :)

By the way, I was thinking... do we need a crate similar to itertools but focused on slices, perhaps named slicetools? Here are some quick ideas what kind of methods it might provide:

// These are inspired by the STL in C++.
fn nth_element(&mut self, index: usize);
fn lower_bound(&self, &T) -> usize;
fn upper_bound(&self, &T) -> usize;
fn equal_range(&self, &T) -> (usize, usize);
fn next_permutation(&mut self) -> bool;
fn prev_permutation(&mut self) -> bool;
fn is_sorted(&self) -> bool;

// Similar to, but faster than `itertools::partition` (optimized for slices).
fn partition(&mut self, f: impl FnMut(&T) -> bool) -> usize;

// Merging.
fn merge(&mut self, mid: usize);
fn merge_in_place(&mut self, mid: usize);

// Lazy sorting.
fn sorted(self) -> impl Iterator<Item = T>;
fn sorted_unstable(self) -> impl Iterator<Item = T>;

// In-place version of stable sort (and compatible with `no_std`).
fn sort_in_place(&mut self);

// Sorting in lockstep.
fn sort_and_permute(&mut self, &mut [P]);
fn sort_unstable_and_permute(&mut self, &mut [P]);

clarfonthey · 2018-01-04T22:28:11Z

@stjepang I think that'd be lovely and I'd love to help out if you made a repo for it. Although I think that a better name might be sword or knife, as they both help with slices.

bluss · 2018-01-05T06:37:11Z

Slice tools would be lovely. Crate odds contains some dusty gizmos for it as well, including BlockedIter.

bluss · 2018-01-05T17:43:19Z

Can we use sort_unstable transparently in OrderSet.sort() and OrderMap.sort_keys()? The keys are after all guaranteed to be unique by their default ordering.
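A small check of that reasoning, using plain vectors of pairs rather than OrderMap itself: when no two keys compare equal, a stable and an unstable sort have no ties to order differently, so they produce the same result. The helper name `sort_pairs_both_ways` is made up for the example.

```rust
/// Sorts two copies of `pairs` by key, once with the stable `sort_by` and
/// once with `sort_unstable_by`, and returns both results.
fn sort_pairs_both_ways<K, V>(pairs: &[(K, V)]) -> (Vec<(K, V)>, Vec<(K, V)>)
where
    K: Ord + Clone,
    V: Clone,
{
    let mut stable = pairs.to_vec();
    let mut unstable = pairs.to_vec();
    // Stable: entries with equal keys would keep their relative order.
    stable.sort_by(|a, b| a.0.cmp(&b.0));
    // Unstable: entries with equal keys may be reordered, but with unique
    // keys there are no such entries.
    unstable.sort_unstable_by(|a, b| a.0.cmp(&b.0));
    (stable, unstable)
}
```

This only covers sorting by key; with a user-supplied comparison, as in sort_by, ties are possible and the two sorts can differ.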

vitiral · 2018-01-05T17:44:24Z

I would certainly hope so. I hadn't considered that before.

bluss · 2018-01-05T17:54:55Z

Implementation is simple but it's not a slam dunk since it's not faster in both of the implemented benchmarks. Not that we have very impressive benchmarks.

 name               63 ns/iter  62 ns/iter  diff ns/iter   diff % 
 ordermap_sort_s    2,523,155   2,638,570        115,415    4.57% 
 ordermap_sort_u32  659,370     356,677         -302,693  -45.91%

bluss · 2018-01-05T19:44:00Z

Benchmarks here: rust-lang/rust#40601

Sorting strings is basically the slice::sort_large_random_expensive case, because the benchmarks use uniform random order.

bluss · 2018-01-09T17:47:35Z

This will seem pretty random, but here are benchmarks compensated for clone speed: current stable vs. unstable sort, for an element count of 24. Posting them so that I have the numbers on file somewhere.

 ordermap_sort_s              857         988                  131   15.29%
 ordermap_sort_u32            343         248                  -95  -27.70%

bluss mentioned this pull request Jan 3, 2018

Implement in place sorting of OrderMap and OrderSet #55

Closed

bluss force-pushed the sort-in-place branch 2 times, most recently from 766c6aa to 434f3fc Compare January 3, 2018 22:25

bluss force-pushed the sort-in-place branch from 434f3fc to 2c722bc Compare January 3, 2018 22:52

vitiral approved these changes Jan 3, 2018

clarfonthey reviewed Jan 3, 2018

bluss force-pushed the sort-in-place branch from 2c722bc to 042cc85 Compare January 3, 2018 23:19

bluss commented Jan 3, 2018

bluss force-pushed the sort-in-place branch from 042cc85 to 5866900 Compare January 4, 2018 18:45

bluss added 4 commits January 4, 2018 20:29

FEAT: Implement in-place sorting with .sort_by()

3df49ba

FEAT: Add .sort_keys()

b1b71d6

TEST: Add benchmarks for .sort_by()

a274f61

FIX: Remove unused imports in benches/faststring.rs

c047493

bluss force-pushed the sort-in-place branch from 5866900 to c047493 Compare January 4, 2018 19:29

FEAT: Add OrderSet::sort,sort_by()

6dec9e4

bluss force-pushed the sort-in-place branch from f32a318 to c0c9800 Compare January 4, 2018 20:12

bluss added 2 commits January 4, 2018 21:15

TEST: In sort test, check that we can look up each key in the sorted …

beb7882

…map.

FEAT: Faster, in place-er, .sort_by for OrderMap (and OrderSet)

21ba2a8

- See implementation comments

bluss force-pushed the sort-in-place branch from c0c9800 to 21ba2a8 Compare January 4, 2018 20:15

DOC: Add time/space bound in doc for sort_by

7048e17

bluss changed the title ~~Implement sorting in place~~ Implement sorting in place for OrderMap, OrderSet Jan 4, 2018

bluss merged commit b43fa13 into master Jan 4, 2018

bluss deleted the sort-in-place branch January 4, 2018 20:55

ghost mentioned this pull request Mar 8, 2018

Add slice::sort_by_cached_key as a memoised sort_by_key rust-lang/rust#48639

Merged
