[WIP] Improve GAP's sort functions #609

ChrisJefferson · 2016-02-10T14:57:18Z

Before looking at this patch, look at these soothing benchmarks. These each show a list, and how long it takes to show to sort it in master, and in this branch.

List([1..10000000], x -> Random([1..10000000])) : 4467 -> 1324
List([1..10000000], x -> Random([1..2]));; : 736 -> 200
List([1..10000000], x -> -x) : 895 -> 175
Concatenation([1..10000000], [50]) : 639 -> 383
l := [1..100000000] * 0;; : 6245 -> 1412

This patch aims to do 2 things:

Using some reasonably horrid C macros, remove the fact we have 8 cut+pasted copy of our sort routine.

The 8 sorts are all combinations of:

Sorting a plist and any list
Sorting with a predicate, or without
Sort and SortParallel

Replace the shell sort (which has served us well) with a more modern search, in this case 'pdqsort' which is a variant of introsort (which is a variant of quicksort), designed to make use of sorted sub-sequences fairly cheaply.

The way this works is we define a small language of macro functions which must be implemented to instantiate our sort.

The main question is, do people consider the macros worth the reduction in code duplication, or does anyone have a suggestion on how to handle this differently, before I apply final spit+polish? (The answer "turn the GAP kernel in C++" seems like too huge a step right now).

markuspf · 2016-02-10T15:36:11Z

This looks really good.

For some reason I thought that the previous sort was stable, but having a quick look around I can't seem to confirm this (ShellSort isn't stable).

I do actually think that we should be looking at least at templates from C++. Sure we can hack up most things using macros, but templates would make some things more clean (permutations, transformations, I am looking at you).

ChrisJefferson · 2016-02-10T15:38:03Z

Now I've written this, this would be an ideal time to add StableSort and StableSortParallel, if users wantwant them.

hulpke · 2016-02-10T16:50:42Z

I seem to remember vaguely that in some code in the past I did slightly dodgy things with `Sort' (using an order that is not really total), relying on the fact that the sort was stable. Thus I want a StableSort.

ChrisJefferson · 2016-02-10T17:00:34Z

@hulpke : Do you mean you want a stable sort adding, or you want GAP's sort to remain stable? Because currently, Sort is not stable. For example (gap 4.7.8) (I made this array by random search. Notice how the second members of the pairs are in order, then after sorting [1,10] comes before[1,3].

 ┌───────┐   GAP, Version 4.7.8 of 09-Jun-2015 (free software, GPL)
 │  GAP  │   http://www.gap-system.org
 └───────┘   Architecture: x86_64-apple-darwin15.0.0-gcc-default64
gap> l := [ [ 1, 1 ], [ 2, 2 ], [ 1, 3 ], [ 3, 4 ], [ 1, 5 ], [ 4, 6 ], [ 4, 7 ],
>   [ 3, 8 ], [ 5, 9 ], [ 1, 10 ], [ 4, 11 ], [ 5, 12 ], [ 3, 13 ], [ 1, 14 ],
>   [ 4, 15 ], [ 2, 16 ], [ 1, 17 ], [ 4, 18 ], [ 5, 19 ], [ 2, 20 ] ];;
gap> Sort(l, function(x,y) return x[1] < y[1]; end);
gap> l;
[ [ 1, 1 ], [ 1, 10 ], [ 1, 3 ], [ 1, 5 ], [ 1, 14 ], [ 1, 17 ], [ 2, 16 ], [ 2, 20 ], [ 2, 2 ],
  [ 3, 4 ], [ 3, 13 ], [ 3, 8 ], [ 4, 7 ],  [ 4, 11 ], [ 4, 6 ], [ 4, 15 ], [ 4, 18 ], [ 5, 9 ],
  [ 5, 19 ], [ 5, 12 ] ]

hulpke · 2016-02-10T17:08:59Z

@ChrisJefferson Apparently then I've been wrong before -- I erroneously thought shell sort was stable and that had been the reason for using it.

So my request would be to also have a stable sort.

fingolfin · 2016-02-10T17:55:09Z

Having a StableSort (and StableSortBy, and StableSortParallel) would indeed be useful.

As to the implementation: I can live with the macro tricks (I did something similar, though much simpler, for the pc collector code, after all). I'd clearly prefer to rewrite this using C++ templates, but as you say, this might be a bit too much right now.

Hmm, though, is it really? Couldn't we just compile individual files with C++, making sure to only invoke extern "C" functions from the them? Perhaps we should ask on the list if there are serious objections to this? I think the template code would be much easier to understand, even for people who don't know much or any C++, at least compared to the macro trickery employed here.

But don't get me wrong: since you already wrote the code, and it seems to work, I don't mind it being added for now, and rewritten (if desired) later on.

fingolfin · 2016-02-10T18:00:03Z

BTW, I think this is a super cool change you are making -- all my review comments are meant as constructive criticism, as always, I hope they get across that way. If anything sounds unreasonable, please let me know

fingolfin · 2016-02-10T18:01:25Z

After looking at some of the commits, I think you might want to consider squashing several of them.

fingolfin · 2016-02-10T18:10:48Z

src/sortbase.h

+                SORT_ASS_TEMP_TO_LIST( k, w );
+                k -= h;
+                if ( h+(start-1) < k ) {
+									SORT_ASS_LIST_TO_TEMP( w, k-h );


Bad indention (tabs?)

ChrisJefferson · 2016-02-10T20:09:00Z

As a general thing, I'll squish commits, and probably just run through something like clang-format (as it's a new file).

The C++, I think it's worth asking. I expect it to be rare nowadays for people to have a C compiler and not a C++ one (at least, it will be trivial for them to install one), and we can stick with C++03, which everyone with a C++ compiler will have.

hulpke · 2016-02-10T20:13:24Z

More a question than a remark: Can one compile some files with C and some with C++ and have a guarantee that everything still fits together?

ChrisJefferson · 2016-02-10T20:20:47Z

@hulpke : The short answer is yes. There is already C++ in kernel code in the json, profiling and semigroups packages (maybe just dev versions of semigroups), distributed with GAP. The only practical issue is teaching the configure script to go find a C++ compiler.

The (only slightly) longer answer is that you have to wrap the prototypes of C++ functions you want to call from C with extern "C" { ... }, which ensures that the name mangling is set to the C standard (and you can't use any C++ types in the arguments/etc).

@ChrisJefferson : there could be issues with libGAP properly wrapping up C++, although not impossible to resolve, of course. As long as ./configure is generated by autoconf, finding C++ compiler is trivial...

frankluebeck · 2016-02-11T11:14:17Z

@ChrisJefferson: This is nice!

I have made some more tests than those you mention in the first post, in particular with much shorter lists. There were only a few cases where the Sort from this change is not faster or even slightly slower than the old one, but many cases where it is much faster.

I remember that years ago we have made some comparisons of various sorting algorithms. The experience was that some algorithms were faster than the shell sort for certain types of input but (sometimes much) slower on other types (long vs. short lists, almost sorted vs reversely sorted vs pretty random, many different vs few different entries, ...). So, at that time our conclusion was a confirmation of the comment in src/listfunc.c that shell sort is a very good general purpose choice.

The pdqsort strategy is more complicated than the previous shell sort but seems to be an even better choice for general purpose sorting.

frankluebeck · 2016-02-11T11:22:01Z

I have needed a stable sort on a few occasions and used the following generic workaround:
Instead of

Sort(l);

I used

tmp := List([1..Length(l)], i-> [l[i], i]);;
Sort(tmp);
l := List(tmp, a-> a[1]);;

dimpase · 2016-02-11T12:47:46Z

instead of replying to a comment, I edited it, sorry @ChrisJefferson :-)

ChrisJefferson · 2016-02-11T13:33:47Z

Now featuring StableSort, StableSortBy and StableSortParallel, just missing documentation.

These methods are (not unexpectedly) slower than Sort by varying amounts, but still (usually) faster than @frankluebeck 's mapping to an array of pairs.

To get the speed higher, one could implement one of the modern stable sorts which is similar in design to pdqsort, such as timsort (used in Python, among other places). I leave such an implementation to anyone who wants to volunteer :)

fingolfin · 2016-02-14T14:17:59Z

Such an implementation can be done later, no need to delay things for it ;-)

There are test failures now -- interstingly, in the new small groups code I added. Perhaps this is a case of an unstable sort result differing between old and new sort implementation, leading to different choice being made in the algorithm? Hmmm

ChrisJefferson · 2016-02-14T15:25:45Z

In some quick experiments, the old sort is "more stable" in some cases, but nothing anyone could reliably count on.

rbehrends · 2016-02-14T15:36:30Z

Once concern that I have is that the choice of pdqsort may not be the best for GAP (on the other hand, mergesort is now available through StableSort()). Quicksort is fast when it comes to sorting integers (due its fast inner loop), but when comparisons are expensive and/or refer to data with poor locality (something we can expect in the general case, especially when a comparison function is written in GAP), then it (and worse, heapsort, which pdqsort/introsort fall back on to avoid the worst case) use a high number of comparisons compared to mergesort (though, obviously, mergesort can't do in-place sorting).

Some quick benchmarking seems to indicate that the fallback to insertion sort is counterproductive for mergesort; mergesort is faster without that, and can be further optimized by not merging already sorted ranges (I did not test a further optimization with handcoding the cases for ranges of length 1 or 2). For (short) integers, pdqsort performs better, while mergesort beats out pdqsort for strings, especially when those are partially sorted already.

ChrisJefferson · 2016-02-14T20:57:02Z

My mergesort should certainly be improved (python's timsort could be a good basis). If we can make a stable sort that's good enough for the general sort, that would avoid the need for two sorts.

ChrisJefferson · 2016-02-18T13:42:39Z

@fingolfin (or others) I would be interested to know if the change in StructureDescription appears to be a bug (in which case I will try to track it down), or it is just an alternative, valid, output (I don't understand StructureDescription well enough myself to be sure).

hulpke · 2016-02-18T14:38:47Z

@ChrisJefferson As far as I understand StructrureDescription some decisions (on which of multiple decompositions to take) are based on selecting from a sorted list. Also the newly obtained structure descriptions seem to be as valid as the expected ones, so I'm inclined to consider this as a harmless side-effect.
(I don't think it is a good idea to have StructureDescription in tests and check for explicit strings returned.)

hungaborhorvath · 2016-02-18T18:47:19Z

@ChrisJefferson I have checked and all three problematic StructureDescription results are completely valid, do not worry about them. One could argue which output of StructureDescription is the best one, but that is for another place and time and should not affect (and certainly should not delay) the merge of this PR.

(Not to mention that StructureDescription is being rewritten, and a lot of outputs will change in exchange for faster output. BTW, the code that uses the sort and caused the difference is itself a bit outdated at the moment and will in fact be replaced in a future PR.)

I suggest to circumvent the test by replacing the old values in lib/grpnames.g by the new values for the three groups for which the difference occurs. I do not mind to do it in another PR after this PR is merged, if that is the preference.

I think it is a good idea to have StructureDescription used in tests, it already helped to reveal one of my mistakes in my previous code. And when the change in the output is harmless (as in this case), we can simply adapt accordingly.

markuspf · 2016-02-18T18:56:50Z

(Not to mention that StructureDescription is being rewritten, and a lot of

Is this documented/discussed somewhere?

hungaborhorvath · 2016-02-18T20:13:34Z

Offtopic:
@markuspf Actually, no. I discussed my ideas with @Stefan-Kohl as he is the maintainer of grpnames.g*, and some discussion has been started at e.g. #160. There are already some PRs along making it more efficient, e.g. #379, #561, #563.

What would be the proper forum to open a discussion about this? To open an issue?

markuspf · 2016-02-18T20:52:18Z

On Thu, Feb 18, 2016 at 12:13:35PM -0800, hungaborhorvath wrote:

Offtopic:
@markuspf Actually, no. I discussed my ideas with @Stefan-Kohl as he is the
maintainer of grpnames.g*, and some discussion has been started at e.g. #160.
There are already some PRs along making it more efficient, e.g. #379, #561, #
563.

What would be the proper forum to open a discussion about this? To open an
issue?

If you are planning to make major changes to the GAP system the best way to go
about probably is to send an email to gap@gap-system.org (the open development
mailing list) and make people aware of your plans.

We hope that this way you will get early feedback on your project and (hopefully)
don't end up in a situation where you invested a lot of time into a project
and you're then turned down because people don't agree with what you have
been doing.

If you have code, the best thing to do is open a pull-request.

I don't think that discussions about design decisions are well-placed in the
issue tracker.

markuspf · 2016-02-25T14:54:24Z

Shall we update the tests then and go ahead with a merge?

stevelinton · 2016-02-25T14:58:41Z

@markuspf yes

markuspf · 2016-02-25T15:38:40Z

@ChrisJefferson this PR produces a lot of warnings about variables being set but not used. I am investigating, but assme this is due to the generecity of the macros?.

markuspf · 2016-02-25T16:24:10Z

Another thing that this turns up is of course the slight horror of reading the code to be able to understand/debug it. But since it took me only 15 minutes I hope that's not too much of a factor for most.

ChrisJefferson · 2016-02-26T08:29:25Z

I'm not seeing any warnings, did you fix them, or is my computer just not producing them?

markuspf · 2016-02-26T09:34:48Z

On Fri, Feb 26, 2016 at 12:29:26AM -0800, Christopher Jefferson wrote:

I'm not seeing any warnings, did you fix them, or is my computer just not
producing them?

You can see them in the travis test. I didn't fix them yet.

I get the follwoing warnings (compiling with gcc 5.3.1 [DragonFly]
Release/2015-12-04)

../../src/sortbase.h: In function 'SORT_PARA_LISTCompIndices':
../../src/sortbase.h:113:21: warning: variable 'us' set but not used
[-Wunused-but-set-variable]
   SORT_CREATE_LOCAL(u);
                     ^
../../src/listfunc.c:807:48: note: in definition of macro 'SORT_CREATE_LOCAL'
 #define SORT_CREATE_LOCAL(name) Obj name ; Obj name##s ;
                                                ^
../../src/sortbase.h:112:21: warning: variable 'ts' set but not used
[-Wunused-but-set-variable]
   SORT_CREATE_LOCAL(t);
                     ^
../../src/listfunc.c:807:48: note: in definition of macro 'SORT_CREATE_LOCAL'
 #define SORT_CREATE_LOCAL(name) Obj name ; Obj name##s ;
                                                ^
../../src/sortbase.h: In function 'SORT_PARA_LISTPartition':
../../src/sortbase.h:161:25: warning: variable 'listcpys' set but not used
[-Wunused-but-set-variable]
       SORT_CREATE_LOCAL(listcpy);
                         ^
../../src/listfunc.c:807:48: note: in definition of macro 'SORT_CREATE_LOCAL'
 #define SORT_CREATE_LOCAL(name) Obj name ; Obj name##s ;
                                                ^
../../src/sortbase.h:170:25: warning: variable 'listcpys' set but not used
[-Wunused-but-set-variable]
       SORT_CREATE_LOCAL(listcpy);
                         ^
../../src/listfunc.c:807:48: note: in definition of macro 'SORT_CREATE_LOCAL'
 #define SORT_CREATE_LOCAL(name) Obj name ; Obj name##s ;
                                                ^
../../src/sortbase.h:152:21: warning: variable 'pivots' set but not used
[-Wunused-but-set-variable]
   SORT_CREATE_LOCAL(pivot);
                     ^
../../src/listfunc.c:807:48: note: in definition of macro 'SORT_CREATE_LOCAL'
 #define SORT_CREATE_LOCAL(name) Obj name ; Obj name##s ;
                                                ^
../../src/sortbase.h: In function 'SortParaDensePlistCompIndices':
../../src/sortbase.h:113:21: warning: variable 'us' set but not used
[-Wunused-but-set-variable]
   SORT_CREATE_LOCAL(u);
                     ^
../../src/listfunc.c:828:48: note: in definition of macro 'SORT_CREATE_LOCAL'
 #define SORT_CREATE_LOCAL(name) Obj name ; Obj name##s ;

(There are more of them, but I think you see the point).

ChrisJefferson · 2016-02-27T23:01:55Z

Just for confirmation, these have been fixed.

olexandr-konovalov · 2016-02-27T23:03:10Z

@ChrisJefferson - thanks, just to confirm that nightly tests with -Werror option are now back to normal.

fingolfin reviewed Feb 10, 2016
View reviewed changes

ChrisJefferson force-pushed the sortimprove branch 2 times, most recently from 3760dbb to 1cdf7ed Compare February 11, 2016 13:31

ChrisJefferson added 3 commits February 18, 2016 13:20

Separate out sorting

6578b93

Add pdqsort

25cfbbe

Code reformat

1d34563

ChrisJefferson force-pushed the sortimprove branch from 1cdf7ed to 1d34563 Compare February 18, 2016 13:26

markuspf merged commit 1d34563 into gap-system:master Feb 25, 2016

ChrisJefferson deleted the sortimprove branch February 26, 2016 08:29

fingolfin mentioned this pull request Sep 7, 2017

Release notes for GAP 4.9 #1699

Closed

fingolfin mentioned this pull request Dec 7, 2017

Broken example in GAP master branch gap-packages/unitlib#6

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Improve GAP's sort functions #609

[WIP] Improve GAP's sort functions #609

ChrisJefferson commented Feb 10, 2016

markuspf commented Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

hulpke commented Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

hulpke commented Feb 10, 2016

fingolfin commented Feb 10, 2016

fingolfin commented Feb 10, 2016

fingolfin commented Feb 10, 2016

fingolfin Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

hulpke commented Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

frankluebeck commented Feb 11, 2016

frankluebeck commented Feb 11, 2016

dimpase commented Feb 11, 2016

ChrisJefferson commented Feb 11, 2016

fingolfin commented Feb 14, 2016

ChrisJefferson commented Feb 14, 2016

rbehrends commented Feb 14, 2016

ChrisJefferson commented Feb 14, 2016

ChrisJefferson commented Feb 18, 2016

hulpke commented Feb 18, 2016

hungaborhorvath commented Feb 18, 2016

markuspf commented Feb 18, 2016

hungaborhorvath commented Feb 18, 2016

markuspf commented Feb 18, 2016

markuspf commented Feb 25, 2016

stevelinton commented Feb 25, 2016

markuspf commented Feb 25, 2016

markuspf commented Feb 25, 2016

ChrisJefferson commented Feb 26, 2016

markuspf commented Feb 26, 2016

ChrisJefferson commented Feb 27, 2016

olexandr-konovalov commented Feb 27, 2016

[WIP] Improve GAP's sort functions #609

[WIP] Improve GAP's sort functions #609

Conversation

ChrisJefferson commented Feb 10, 2016

markuspf commented Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

hulpke commented Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

hulpke commented Feb 10, 2016

fingolfin commented Feb 10, 2016

fingolfin commented Feb 10, 2016

fingolfin commented Feb 10, 2016

fingolfin Feb 10, 2016

Choose a reason for hiding this comment

ChrisJefferson commented Feb 10, 2016

hulpke commented Feb 10, 2016

ChrisJefferson commented Feb 10, 2016

frankluebeck commented Feb 11, 2016

frankluebeck commented Feb 11, 2016

dimpase commented Feb 11, 2016

ChrisJefferson commented Feb 11, 2016

fingolfin commented Feb 14, 2016

ChrisJefferson commented Feb 14, 2016

rbehrends commented Feb 14, 2016

ChrisJefferson commented Feb 14, 2016

ChrisJefferson commented Feb 18, 2016

hulpke commented Feb 18, 2016

hungaborhorvath commented Feb 18, 2016

markuspf commented Feb 18, 2016

hungaborhorvath commented Feb 18, 2016

markuspf commented Feb 18, 2016

markuspf commented Feb 25, 2016

stevelinton commented Feb 25, 2016

markuspf commented Feb 25, 2016

markuspf commented Feb 25, 2016

ChrisJefferson commented Feb 26, 2016

markuspf commented Feb 26, 2016

ChrisJefferson commented Feb 27, 2016

olexandr-konovalov commented Feb 27, 2016