Release 1.12.0 #196

Morwenn · 2021-12-02T21:06:23Z

No description provided.

The old algorithm was an unsightly O(n² log n) algorithm, and a really cunning on at that since it returned in the blink of an eye (generally even faster than that) for most inputs, and only hit its worst case for a few inputs. The new algorithm is a O(n) algorithm described by T. Altman and Y. Igarashi in *Roughly Sorting: Sequential And Parallel Approach*. It runs in O(n) space and doesn't run as fast for most inputs, but it at least doesn't hide as many surprises. The old algorithm is still used when not enough memory is available, to ensure that the next 1.x.y release won't be a breaking change, but it will be removed in cpp-sort 2.0.0.

Only the O(n) algorithm works with bidirectional iterators, which means that the fallback to the O(n² log n) algorithm only works with random-access iterators.

Instead of computing all the values of DM and extracting the biggest one, we compute them on the fly and compare them to the current biggest element to avoid having to store O(n) values for DM.

This in turns makes the probe::par fallback algorithm run in O(n log n) time and O(1) space instead of O(n² log n) time. That change makes the fallback algorithm worth keeping no matter what since O(n log n) is reasonable enough a suprise - the deprecation warning has been removed accordingly. Thanks a lot to Control from The Studio for showing me that it was actually easy enough to make is_p_sorted O(n) and for giving me the necessary insights to make it so.

is_p_sorted and the O(n log n) algorithm relying on it were changed to work with forward iterators. probe::par now works with forward iterators, but can only use the O(n) algorithm for bidirectional iterators and their refinements.

In *Right invariant metrics and measures of presortedness*, Estivill-Castro, Mannila and Wood mention that Par(X)=Dis(X) for all X. After checking, it appears that probe::par and probe::dis do always return the same results despite having been implemented differently to match different definitions. Estivill-Castro and Wood then stop using Par(X) and use Dis(X) consistently in all their subsequent works, so it makes sense to drop probe::par and only keep probe::dis in cpp-sort. This commit deprecates probe::par - which will be removed in a future breaking release -, and changes probe::dis to use the current implementation of probe::par, which has a better time complexity. Thanks a lot to Control from The Studio, who first suggested that Dis and Par were the same measure of presortedness, and pushed me to look into it.

The array wasn't needed because the cumulative min can be computed oon the fly during the computation of DM. This new version computes LR then goes straight to a fused RL/DM algorithm without extra space. The new algorithm was more than twice as fast than the old one in benchmarks.

The value of Osc(X) is obtained by computing j - i, and can only grow through the course of the algorithm. Since i always decreases, if it becomes smaller or equal to the current value of Osc(X), then it means that we're done because we can't find a bigger Osc(X) than the current value.

The implementation called std::min and std::max on projected values without passing the comparators to them, leading compile-time errors or runtime errors depending on whether the results of the projected values were comparable, and whether it compared them correctly.

Thanks to Control from The Studio for the algorithm. The old one is kept for now to avoid breaking the case where not enough memory is available, but it is deprecated and will be removed in version 2.0.0.

The new implementation reduces the number of class template instantiations to 2 and reduces a bit the size of the binary in debug mode.

This would produce different template instantiations for every variation of fixed_buffer and dynamic_buffer, while they all provide a pointer and a size, which is all Wiki::sort cares about. Hoisting the buffer instantiation to the top-level wiki_sort function cuts the number of instantiations, which notably shrinks the size of the test suite by 2%.

The test suite used to instantiate a thread_local PRNG for every instantiation of dist::shuffled::operator() and shuffled_16_values. This commit introduces a single thread_local PRNG common to all distrubtions, which makes the whole TLS segment of the test suite executable disappear.

Replace it with a constexpr variable inside the function: in cpp-sort this parameter is always fully dependent on the Compare and Projection parameters.

When 'param' is a by-value function parameter, 'return param;' automatically moves, but not something like 'return param += 1;'. This commit avoids the inhibition of the automatic move by only using the name of the parameter in the return expression.

Only the declaration of scoped_memory_exhaustion and its member functions are needed in the header, put everything in the .cpp. This also seems to fix what is probably another thread_local issue with MinGW-w64 10.3.

Reorganize, simplify, add a graph to sum up the relations between the different components (and the DOT source code to draw it).

That type trait replaces the is_std_forward_list and is_std_list traits that were previously used. It is a much more generic and scalable solution to the problem.

stable_t<T> used T directly when it was a stable_adapter specialization instead of trying to use T::type when such a member type exists, which could inhibit some specializations. This change improves the stable_t in the presence of accidental stable_adapter nesting, which is more and more common as stable_t becomes the main consumer interface. This commit also updates the documentation accordingly, and splits the graph explaining stable adapters into two: one explaining how the stable_adapter works, and another one explaining what stable_t aliases.

This is mostly to harden the library and future dependencies against forgetting to add is_always_stable to every specialization. We keep the typedef in every specialization we ship to make sure that it's impossible to use incorrectly, but this specialization of sorter_traits makes it harder to get things accidentally wrong.

Instead of having a table, just recompute the numbers as needed, which is an inexpensive as two shifts and a subtraction per number. This makes no noticeable speed difference, scales more than the table solution, and reduces the size of the produce binaries.

merge_insertion_sort and slab_sort failed when sorting an empty collection for the same reason: they tried to create an empty fixed_size_list, which actually causes all knids of issues. The two sorters don't attempt to creat such a list anymore, and an assertion was added to the fixed_size_list constructor to ensure that it never happens.

I still don't understand what's wrong with the namespace-scope extern thread_local variables, but even though I wiggled through issues to make it work on other platforms, it doesn't work at all on MSVC. This commit uses a function instead to create a thread_local variable in a way similar to a Meyers singleton.

It was a remnant of old versions of Catch where passing an expression with commas outside of inner parentheses was not possible, which typically happened with explicit template parameters. The only ones left are due to other reason, such as Catch2 not wanting to decompose some expressions, the typical suspect being ||.

The library itself has compiled without warnings with /W2 for a long time, but the test suite was litered with conversion warnings that weren't errors but required some design tweaks in distributions. This commit solves those issues and finally makes the warnings disappear.

- Use xoshiro256** instead of mt19937_64 - Move random.h from the library to the test suite - Use the same two thread_local engines everywhere - Put thread_local engines in random.h (not in distributions) - Move thread_local engine definitions to random.cpp All those changes should hopefully simplify the handling of random numbers in the test suite, and possibly fix some of the -Winline warnings I constantly got with GCC, which I suspect were due either to the thread_local variables or to the size of the mt19937_64 engines.

Morwenn added 30 commits July 27, 2021 16:20

Add warning about future removal of O(n² log n) probe::par

5fcc6d1

Fix graph of partial ordering of MOPs in doc

c5f10ac

Make probe::par work with bidirectional iterators

952bc9c

Only the O(n) algorithm works with bidirectional iterators, which means that the fallback to the O(n² log n) algorithm only works with random-access iterators.

Reduce memory allocations in probe::par

c21d50b

Reduce memory use of probe::par

141efb2

Instead of computing all the values of DM and extracting the biggest one, we compute them on the fly and compare them to the current biggest element to avoid having to store O(n) values for DM.

Make probe::par work with forward iterators

729e319

is_p_sorted and the O(n log n) algorithm relying on it were changed to work with forward iterators. probe::par now works with forward iterators, but can only use the O(n) algorithm for bidirectional iterators and their refinements.

New ascending_duplicates distribution to simplify tests

d2b91a8

More/better documentation for MOPs

593d6a0

Add new MOP relation test

f9e440e

New O(n log n) implementation for probe::osc

77c4c75

Thanks to Control from The Studio for the algorithm. The old one is kept for now to avoid breaking the case where not enough memory is available, but it is deprecated and will be removed in version 2.0.0.

probe::dis: cosmetic change

b4f531e

Add another MOPs relation to the test suite

46ffc07

New measure of presortedness: block

2cf80ba

Replace std::enable_if_t with a custom implementation

6e8ebac

The new implementation reduces the number of class template instantiations to 2 and reduces a bit the size of the binary in debug mode.

Remove remnants of Valgrind support on MacOS

8c51c1b

Fix Windows Builds dabge in README

01e40e1

CI: bump Ubuntu version from 16.04 to 18.04

f867dab

Bump downloaded Catch2 version to v2.13.7

e40fb0a

Remove template parameter Branchless in pdqsort_loop

6ec0e86

Replace it with a constexpr variable inside the function: in cpp-sort this parameter is always fully dependent on the Compare and Projection parameters.

Fix links in documentation

ed3354d

Conditionally make kLeonardoNumbers inline

9e93f08

Slightly more consistent test names

b48fc6d

Morwenn added 29 commits November 1, 2021 13:17

Cleanup heap exhaustion testing code a bit

6f219f6

Only the declaration of scoped_memory_exhaustion and its member functions are needed in the header, put everything in the .cpp. This also seems to fix what is probably another thread_local issue with MinGW-w64 10.3.

Improve the documentation about stable adapters

ee0a894

Reorganize, simplify, add a graph to sum up the relations between the different components (and the DOT source code to draw it).

Remove redundant stable_adapter<T>::type

1cd671a

Add README example to examples directory

7f4711a

Introduce is_specialization_of

902dc43

That type trait replaces the is_std_forward_list and is_std_list traits that were previously used. It is a much more generic and scalable solution to the problem.

Deeper unwrapping of nested stable_adapter

7bbd843

Clarify stable_adapter documentation a bit

0dc7ab4

Let stable_t unwrap stable_adapter<StableSorter>

5ec423b

Improve sorter_traits documentation

30e0d3b

Fix links in documentation

8d54416

Test every sorter with empty/small collections

9b92259

CI: fix displaying Valgrind logs on failure

9c907d7

CI: fail when ctest finds no tests

c3ce2bb

Get rid of simple recursion in introselect

1385003

Compute bad_allowed directly in introselect

b8eee3a

Credit sources of random number algorithms

dae3c74

Give POSIX the middle finger

ecbfc67

Update measures of presortedness benchmark

5b19494

Fix benchmark includes

b5a159f

Preparing release 1.12.0

266707b

Morwenn merged commit f081ed4 into master Dec 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release 1.12.0 #196

Release 1.12.0 #196

Morwenn commented Dec 2, 2021

Release 1.12.0 #196

Release 1.12.0 #196

Conversation

Morwenn commented Dec 2, 2021