Release 1.12.0 #196
Merged
The old algorithm was an unsightly O(n² log n) algorithm, and a really cunning one at that, since it returned in the blink of an eye (generally even faster than that) for most inputs and only hit its worst case for a few of them. The new algorithm is an O(n) algorithm described by T. Altman and Y. Igarashi in *Roughly Sorting: Sequential and Parallel Approach*. It runs in O(n) space and doesn't run as fast for most inputs, but at least it doesn't hide as many surprises. The old algorithm is still used when not enough memory is available, to ensure that the next 1.x.y release won't be a breaking change, but it will be removed in cpp-sort 2.0.0.
Only the O(n) algorithm works with bidirectional iterators; the O(n² log n) fallback algorithm requires random-access iterators.
Instead of computing all the values of DM and extracting the biggest one, we compute them on the fly and compare them to the current biggest element to avoid having to store O(n) values for DM.
This in turn makes the probe::par fallback algorithm run in O(n log n) time and O(1) space instead of O(n² log n) time. That change makes the fallback algorithm worth keeping no matter what, since O(n log n) is a reasonable enough surprise; the deprecation warning has been removed accordingly. Thanks a lot to Control from The Studio for showing me that it was actually easy enough to make is_p_sorted run in O(n), and for giving me the necessary insights to make it so.
is_p_sorted and the O(n log n) algorithm relying on it were changed to work with forward iterators. probe::par now works with forward iterators, but can only use the O(n) algorithm for bidirectional iterators and their refinements.
In *Right invariant metrics and measures of presortedness*, Estivill-Castro, Mannila and Wood mention that Par(X) = Dis(X) for all X. After checking, it appears that probe::par and probe::dis do always return the same results despite having been implemented differently to match different definitions. Estivill-Castro and Wood then stop using Par(X) and use Dis(X) consistently in all their subsequent works, so it makes sense to drop probe::par and only keep probe::dis in cpp-sort. This commit deprecates probe::par (which will be removed in a future breaking release) and changes probe::dis to use the current implementation of probe::par, which has a better time complexity. Thanks a lot to Control from The Studio, who first suggested that Dis and Par were the same measure of presortedness and pushed me to look into it.
The array wasn't needed because the cumulative min can be computed on the fly during the computation of DM. This new version computes LR, then goes straight to a fused RL/DM algorithm without extra space. The new algorithm was more than twice as fast as the old one in benchmarks.
The value of Osc(X) is obtained by computing j - i, and can only grow through the course of the algorithm. Since i always decreases, if it becomes smaller than or equal to the current value of Osc(X), then we're done: no bigger Osc(X) than the current value can be found.
The implementation called std::min and std::max on projected values without passing the comparator to them, leading to compile-time or runtime errors depending on whether the projected values were comparable with operator< at all, and whether that comparison ordered them correctly.
Thanks to Control from The Studio for the algorithm. The old one is kept for now to avoid breaking the case where not enough memory is available, but it is deprecated and will be removed in version 2.0.0.
The new implementation reduces the number of class template instantiations to 2 and slightly reduces the size of the binary in debug mode.
This would produce different template instantiations for every variation of fixed_buffer and dynamic_buffer, while they all provide a pointer and a size, which is all Wiki::sort cares about. Hoisting the buffer instantiation to the top-level wiki_sort function cuts the number of instantiations, which notably shrinks the size of the test suite by 2%.
The test suite used to instantiate a thread_local PRNG for every instantiation of dist::shuffled::operator() and shuffled_16_values. This commit introduces a single thread_local PRNG common to all distributions, which makes the whole TLS segment of the test suite executable disappear.
Replace it with a constexpr variable inside the function: in cpp-sort this parameter is always fully dependent on the Compare and Projection parameters.
When 'param' is a by-value function parameter, 'return param;' automatically moves, but an expression such as 'return param += 1;' does not. This commit avoids inhibiting the automatic move by using only the name of the parameter in the return expression.
Only the declarations of scoped_memory_exhaustion and its member functions are needed in the header, so everything else was moved to the .cpp file. This also seems to fix what is probably another thread_local issue with MinGW-w64 10.3.
Reorganize, simplify, add a graph to sum up the relations between the different components (and the DOT source code to draw it).
That type trait replaces the is_std_forward_list and is_std_list traits that were previously used. It is a much more generic and scalable solution to the problem.
stable_t<T> used T directly when it was a stable_adapter specialization instead of trying to use T::type when such a member type exists, which could inhibit some specializations. This change improves the behavior of stable_t in the presence of accidental stable_adapter nesting, which is more and more common as stable_t becomes the main consumer interface. This commit also updates the documentation accordingly, and splits the graph explaining stable adapters into two: one explaining how stable_adapter works, and another one explaining what stable_t aliases.
This is mostly to harden the library and future dependencies against forgetting to add is_always_stable to every specialization. We keep the typedef in every specialization we ship to make sure that it's impossible to use incorrectly, but this specialization of sorter_traits makes it harder to get things accidentally wrong.
Instead of having a table, just recompute the numbers as needed, which is as inexpensive as two shifts and a subtraction per number. This makes no noticeable speed difference, scales better than the table solution, and reduces the size of the produced binaries.
merge_insertion_sort and slab_sort failed when sorting an empty collection for the same reason: they tried to create an empty fixed_size_list, which actually causes all kinds of issues. The two sorters don't attempt to create such a list anymore, and an assertion was added to the fixed_size_list constructor to ensure that it never happens.
I still don't understand what's wrong with the namespace-scope extern thread_local variables, but even though I wiggled through issues to make it work on other platforms, it doesn't work at all on MSVC. This commit uses a function instead to create a thread_local variable in a way similar to a Meyers singleton.
It was a remnant of old versions of Catch where passing an expression with commas outside of inner parentheses was not possible, which typically happened with explicit template parameters. The only ones left are due to other reasons, such as Catch2 not wanting to decompose some expressions, the typical suspect being ||.
The library itself has compiled without warnings with /W2 for a long time, but the test suite was littered with conversion warnings that weren't errors but required some design tweaks in distributions. This commit solves those issues and finally makes the warnings disappear.
- Use xoshiro256** instead of mt19937_64
- Move random.h from the library to the test suite
- Use the same two thread_local engines everywhere
- Put thread_local engines in random.h (not in distributions)
- Move thread_local engine definitions to random.cpp

All those changes should hopefully simplify the handling of random numbers in the test suite, and possibly fix some of the -Winline warnings I constantly got with GCC, which I suspect were due either to the thread_local variables or to the size of the mt19937_64 engines.