fix stable sort (4584) #787

Merged
merged 3 commits into from Dec 12, 2012

Conversation

Projects
None yet
6 participants
@DmitryOlshansky
Member

DmitryOlshansky commented Sep 11, 2012

Use Xinok's public domain Tim Sort for stable sort.

We had broken stable sort for 2 years now but I really need a working one for upcoming std.uni ;)

@andralex

View changes

std/algorithm.d
@@ -7295,7 +7298,7 @@ sort(alias less = "a < b", SwapStrategy ss = SwapStrategy.unstable,
{
static assert(false, "Invalid predicate passed to sort: "~less);
}
- return assumeSorted!less(r);
+ return assumeSorted!(less)(r);

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

why change from good style to bad style?

@andralex

andralex Sep 17, 2012

Member

why change from good style to bad style?

@andralex

View changes

std/algorithm.d
+ }
+
+ swapAt(r, r.length - 1, lessI);
+ auto right = r[lessI + 1..r.length];

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

spacing around operators please

@andralex

andralex Sep 17, 2012

Member

spacing around operators please

@andralex

View changes

std/algorithm.d
- if(lessI < greaterI)
+ auto left = r[0..min(lessI, greaterI + 1)];
+ if (right.length > left.length)

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

alert alert alert - mixed style in the same module and by same author "if(" and "if (". Please use "if (" throughout, as is the case everywhere else in std.algorithm.

@andralex

andralex Sep 17, 2012

Member

alert alert alert - mixed style in the same module and by same author "if(" and "if (". Please use "if (" throughout, as is the case everywhere else in std.algorithm.

@andralex

View changes

std/algorithm.d
- if(lessI < greaterI)
+ auto left = r[0..min(lessI, greaterI + 1)];

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

spacing around operators (throughout)

@andralex

andralex Sep 17, 2012

Member

spacing around operators (throughout)

@andralex

View changes

std/algorithm.d
++/
+
+// Tim Sort implementation
+template TimSortImpl(alias pred, R)

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

make private

@andralex

andralex Sep 17, 2012

Member

make private

@andralex

View changes

std/algorithm.d
+ immutable minRun = minRunLength(range.length);
+ immutable minTemp = range.length / 2 < MIN_STORAGE ? range.length / 2 : MIN_STORAGE;
+ size_t minGallop = MIN_GALLOP;
+ Slice[40] stack = void;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

should 40 be a formal constant?

@andralex

andralex Sep 17, 2012

Member

should 40 be a formal constant?

@andralex

View changes

std/algorithm.d
+ if(stack[stackLen - 3].length <= stack[stackLen - 1].length)
+ {
+ mergeAt(range, stack[0 .. stackLen], stackLen - 3, minGallop, temp);
+ --stackLen;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

move to after if/else

@andralex

andralex Sep 17, 2012

Member

move to after if/else

@andralex

View changes

std/algorithm.d
+ else
+ {
+ mergeAt(range, stack[0 .. stackLen], stackLen - 2, minGallop, temp);
+ --stackLen;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

remove

@andralex

andralex Sep 17, 2012

Member

remove

@andralex

View changes

std/algorithm.d
+ if(stackLen >= 3 && stack[stackLen - 3].length <= stack[stackLen - 1].length)
+ {
+ mergeAt(range, stack[0 .. stackLen], stackLen - 3, minGallop, temp);
+ --stackLen;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

ditto

@andralex

View changes

std/algorithm.d
+ else
+ {
+ mergeAt(range, stack[0 .. stackLen], stackLen - 2, minGallop, temp);
+ --stackLen;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

and ditto

@andralex

andralex Sep 17, 2012

Member

and ditto

@andralex

View changes

std/algorithm.d
+ }
+ body
+ {
+ size_t lower, upper, center;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

move definitions down to initialization

@andralex

andralex Sep 17, 2012

Member

move definitions down to initialization

@andralex

View changes

std/algorithm.d
+ body
+ {
+ size_t lower, upper, center;
+ T o;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

no symbols should be called o, O, l, or I

@andralex

andralex Sep 17, 2012

Member

no symbols should be called o, O, l, or I

@andralex

View changes

std/algorithm.d
+ if(less(o, range[center])) upper = center;
+ else lower = center + 1;
+ }
+ for(upper = i; upper > lower; --upper) range[upper] = range[upper-1];

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

copy() should work here

@andralex

andralex Sep 17, 2012

Member

copy() should work here

@andralex

View changes

std/algorithm.d
+ }
+ body
+ {
+ // Just some values ...

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

well that comment doesn't quite help much

@andralex

andralex Sep 17, 2012

Member

well that comment doesn't quite help much

@andralex

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 17, 2012

Member

There are quite a few stylistic issues and unrealized opportunities for adding more useful algorithms. For now, however, let's focus on fixing the more obvious ones and getting this in. Thanks!

Member

andralex commented Sep 17, 2012

There are quite a few stylistic issues and unrealized opportunities for adding more useful algorithms. For now, however, let's focus on fixing the more obvious ones and getting this in. Thanks!

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 17, 2012

Member

Okay, I knew I'm up for some work :)
I'll do it step by step starting with style.

Member

DmitryOlshansky commented Sep 17, 2012

Okay, I knew I'm up for some work :)
I'll do it step by step starting with style.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 17, 2012

Member

Fixed all of raised issues.

Member

DmitryOlshansky commented Sep 17, 2012

Fixed all of raised issues.

@Xinok

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 17, 2012

Contributor

I'm happy to see this coming into fruition. It looks bad for D when simple things like a stable sort are broken.

I suggest removing the "Adopted From" comment and replace it with something more proper. If you really want to give attribution, you could simply provide a link to the original code.

You could replace the "optimisticInsertionSort" with binary insertion sort, whether mine or your own, and call that instead. While it's a bit slower when sorting primitive types, it does far fewer comparisons which would benefit the unstable sort as well.

As for moveAll, I would avoid that altogether. The optimizer isn't really to blame; The implementation uses input range primitives which are just slower. Also, destructively moving each element for insertion will add much unneeded overhead, especially for larger types. Insertion sort is where the majority of time is spent and is integral for performance.

My email is in my profile if you need help with anything. I'm not sure how difficult it is to understand as I don't thoroughly comment my code.

Contributor

Xinok commented Sep 17, 2012

I'm happy to see this coming into fruition. It looks bad for D when simple things like a stable sort are broken.

I suggest removing the "Adopted From" comment and replace it with something more proper. If you really want to give attribution, you could simply provide a link to the original code.

You could replace the "optimisticInsertionSort" with binary insertion sort, whether mine or your own, and call that instead. While it's a bit slower when sorting primitive types, it does far fewer comparisons which would benefit the unstable sort as well.

As for moveAll, I would avoid that altogether. The optimizer isn't really to blame; The implementation uses input range primitives which are just slower. Also, destructively moving each element for insertion will add much unneeded overhead, especially for larger types. Insertion sort is where the majority of time is spent and is integral for performance.

My email is in my profile if you need help with anything. I'm not sure how difficult it is to understand as I don't thoroughly comment my code.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 18, 2012

Member

@Xinok About "Adopted From..." I conjured this note in a couple of minutes and not particularly fond of it myself.

Still IRC public domain license requires to explicilty distinguish altered versions and this one is getting some changes. Anyway I'm more then willing to put in any note that you as an author feel would be more appropriate.

optimisticInsertionSort vs binaryInsertionSort - looks like I'd need to play some more with them to have an idea of their performance characteristics on small inputs.

Now speaking of input range primitives and moveAll I completely disagree. If range primitives are to have any success they must be (almost) as fast as naked arrays. If not then we are doing something wrong and that leaves us with 2 options: ether redesign ranges or hack compiler to optimize them better.

And I'd argue that ranges as defined are good enough ground for optimal (or near optimal) performance as they are just compile-time hardwired thin wrappers for arrays unlike say GoF Iterator that has dynamic dispatch.

See this sample that includes 3 versions of shift right array to the right:
https://gist.github.com/3745627

And the disassembly of 3 relevant pieces here:
https://gist.github.com/3745686
(cmd line: dmd -g -O -inline -release -noboundscheck test.d)
Note: inclusion of symbolic info does affect timings, meaning that '-g' doesn't disable any optimizations.

Obviously compiler doesn't do a prticularly good job and happily reloads plenty of stuff from stack even in simpliest case. 2nd version doesn't have move inlined, poor fellow.
3rd version is unrolled but compiler again uses stack a lot (and~2x as much as in 1st version), including some obviously awful cases like swapping 2 registers through the stack.

Having looked at its source code of DMD inliner I can confidently say: it's not good enough. For instance it won't ever optimize functions with multiple returns and in fact move currently has an early return and thus is not inlined.
All in all, it tries to rewrite everything into a kind of long comma-expressions and a lot of stuff can't be converted that easily.

Member

DmitryOlshansky commented Sep 18, 2012

@Xinok About "Adopted From..." I conjured this note in a couple of minutes and not particularly fond of it myself.

Still IRC public domain license requires to explicilty distinguish altered versions and this one is getting some changes. Anyway I'm more then willing to put in any note that you as an author feel would be more appropriate.

optimisticInsertionSort vs binaryInsertionSort - looks like I'd need to play some more with them to have an idea of their performance characteristics on small inputs.

Now speaking of input range primitives and moveAll I completely disagree. If range primitives are to have any success they must be (almost) as fast as naked arrays. If not then we are doing something wrong and that leaves us with 2 options: ether redesign ranges or hack compiler to optimize them better.

And I'd argue that ranges as defined are good enough ground for optimal (or near optimal) performance as they are just compile-time hardwired thin wrappers for arrays unlike say GoF Iterator that has dynamic dispatch.

See this sample that includes 3 versions of shift right array to the right:
https://gist.github.com/3745627

And the disassembly of 3 relevant pieces here:
https://gist.github.com/3745686
(cmd line: dmd -g -O -inline -release -noboundscheck test.d)
Note: inclusion of symbolic info does affect timings, meaning that '-g' doesn't disable any optimizations.

Obviously compiler doesn't do a prticularly good job and happily reloads plenty of stuff from stack even in simpliest case. 2nd version doesn't have move inlined, poor fellow.
3rd version is unrolled but compiler again uses stack a lot (and~2x as much as in 1st version), including some obviously awful cases like swapping 2 registers through the stack.

Having looked at its source code of DMD inliner I can confidently say: it's not good enough. For instance it won't ever optimize functions with multiple returns and in fact move currently has an early return and thus is not inlined.
All in all, it tries to rewrite everything into a kind of long comma-expressions and a lot of stuff can't be converted that easily.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 18, 2012

Member

Bottom line:
In general I believe it IS a compiler issue but there could be a few minor faults in Phobos:
For instance after fixing move for builtins (removing branch) I finally see identical code for fn1 vs fn2 in test sample.
The good thing about move is that it incapsulates proper logic for postblit/destructors.

Member

DmitryOlshansky commented Sep 18, 2012

Bottom line:
In general I believe it IS a compiler issue but there could be a few minor faults in Phobos:
For instance after fixing move for builtins (removing branch) I finally see identical code for fn1 vs fn2 in test sample.
The good thing about move is that it incapsulates proper logic for postblit/destructors.

@Xinok

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 20, 2012

Contributor

Considering that arrays are a pointer and length pair, input ranges have to increment the pointer and decrement the length for each iteration. Even when iterating in reverse (bidirectional ranges), although it only has to decrement the length, getting the back element is arr[$-1]. Optimizers aren't magic that can whisk away any inefficiency. The programmer must still put in effort to manually optimize their code.

The best thing to do is use foreach loops when possible. In this way, the compiler can choose the optimal method for iteration without the programmer writing specialized code for different types. In this case, the most efficient way to iterate an array is using pointers.

Personally, I optimize my code empirically. DMD is an odd compiler, things that should be faster in theory actually turn out to be slower. So in cases where a foreach loop should be optimal, I use a for loop instead because it proved to be faster in benchmarks.

I pushed out an update for Timsort today. Read the comment to understand what I changed and why.

I also wrote this. I tried to optimize insertion sort using Duff's Device, but again, it failed to be any faster in benchmarks.

As for attribution, just include a link to the original code. This is sort of what is done already; std.algorithm:sort links to STL's sort and stable sort.

Contributor

Xinok commented Sep 20, 2012

Considering that arrays are a pointer and length pair, input ranges have to increment the pointer and decrement the length for each iteration. Even when iterating in reverse (bidirectional ranges), although it only has to decrement the length, getting the back element is arr[$-1]. Optimizers aren't magic that can whisk away any inefficiency. The programmer must still put in effort to manually optimize their code.

The best thing to do is use foreach loops when possible. In this way, the compiler can choose the optimal method for iteration without the programmer writing specialized code for different types. In this case, the most efficient way to iterate an array is using pointers.

Personally, I optimize my code empirically. DMD is an odd compiler, things that should be faster in theory actually turn out to be slower. So in cases where a foreach loop should be optimal, I use a for loop instead because it proved to be faster in benchmarks.

I pushed out an update for Timsort today. Read the comment to understand what I changed and why.

I also wrote this. I tried to optimize insertion sort using Duff's Device, but again, it failed to be any faster in benchmarks.

As for attribution, just include a link to the original code. This is sort of what is done already; std.algorithm:sort links to STL's sort and stable sort.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 20, 2012

Member

Considering that arrays are a pointer and length pair, input ranges have to increment the pointer and decrement the length for each iteration. Even when iterating in reverse (bidirectional ranges), although it only has to decrement the length, getting the back element is arr[$-1].

Okay, I got it. What could be improved is rewriting copy/moveAll for random access ranges so that it can use single index loop instead of doing 2 popFronts. Experimenting this way I'm able to get near identical performance. I suspect it could be applied to other similar algorithms benefiting even more code.

To make it clear - I'm not opposed to optimizing by hand, I'm just in favor of reducing that amount and allow greater exposure of optimization. It's all about reuse and impact - if we are to make moveAll as fast as technically possible for random access ranges it would benefit a much broader audience then if we just hack stable sort to be faster.

Either way I see that even straight loop could use more efficient code. There has to be an optimized copy backwards primitive like memcpy but typesafe (a-la array ops).

Optimizers aren't magic that can whisk away any inefficiency. The programmer must still put in effort to manually optimize their code.

I'm mostly aware of what they can and can't do. I recall being surprised by both how much they can do and how much they can't. The good read is Agner's Fog optimization tutorials http://www.agner.org/optimize/.

Personally, I optimize my code empirically. DMD is an odd compiler, things that should be faster in theory actually turn out to be slower.

If there is anything stable it is "look at disassembly" method. I found it to be quite revealing.

So in cases where a foreach loop should be optimal, I use a for loop instead because it proved to be faster in benchmarks.

BTW I've found it myself on numerous occasions. It sounds like a bug.

As for attribution, just include a link to the original code. This is sort of what is done already; std.algorithm:sort links to STL's sort and stable sort.

Will do. Though in this case it was a pointer to how analogous C++ function works not that implementation was taken from STL.

Member

DmitryOlshansky commented Sep 20, 2012

Considering that arrays are a pointer and length pair, input ranges have to increment the pointer and decrement the length for each iteration. Even when iterating in reverse (bidirectional ranges), although it only has to decrement the length, getting the back element is arr[$-1].

Okay, I got it. What could be improved is rewriting copy/moveAll for random access ranges so that it can use single index loop instead of doing 2 popFronts. Experimenting this way I'm able to get near identical performance. I suspect it could be applied to other similar algorithms benefiting even more code.

To make it clear - I'm not opposed to optimizing by hand, I'm just in favor of reducing that amount and allow greater exposure of optimization. It's all about reuse and impact - if we are to make moveAll as fast as technically possible for random access ranges it would benefit a much broader audience then if we just hack stable sort to be faster.

Either way I see that even straight loop could use more efficient code. There has to be an optimized copy backwards primitive like memcpy but typesafe (a-la array ops).

Optimizers aren't magic that can whisk away any inefficiency. The programmer must still put in effort to manually optimize their code.

I'm mostly aware of what they can and can't do. I recall being surprised by both how much they can do and how much they can't. The good read is Agner's Fog optimization tutorials http://www.agner.org/optimize/.

Personally, I optimize my code empirically. DMD is an odd compiler, things that should be faster in theory actually turn out to be slower.

If there is anything stable it is "look at disassembly" method. I found it to be quite revealing.

So in cases where a foreach loop should be optimal, I use a for loop instead because it proved to be faster in benchmarks.

BTW I've found it myself on numerous occasions. It sounds like a bug.

As for attribution, just include a link to the original code. This is sort of what is done already; std.algorithm:sort links to STL's sort and stable sort.

Will do. Though in this case it was a pointer to how analogous C++ function works not that implementation was taken from STL.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 21, 2012

Member

I also wrote this. I tried to optimize insertion sort using Duff's Device, but again, it failed to be any faster in benchmarks.

I suspect it's because every step is dependent on the next one (i.e. all of these upper--). What helps a bit is partially unrolling the loop by factor of 2-4 and using Duff's Device to handle first length % factor.

Okay, I think I'd better leave fine-tuning to some time later e.g. when we have std.benchmark ;)
As it is now it works within couple of % of original that should be good enough.

UPDATE: looking over disassembly shows the move+retro code should be indeed slower, as it has more complex inner loop. So I revert to stright for-loop again :)

I added source link, see if it looks okay. The only thing left should be your latest enhancements to TimSort. I'll pull these in soon.

Member

DmitryOlshansky commented Sep 21, 2012

I also wrote this. I tried to optimize insertion sort using Duff's Device, but again, it failed to be any faster in benchmarks.

I suspect it's because every step is dependent on the next one (i.e. all of these upper--). What helps a bit is partially unrolling the loop by factor of 2-4 and using Duff's Device to handle first length % factor.

Okay, I think I'd better leave fine-tuning to some time later e.g. when we have std.benchmark ;)
As it is now it works within couple of % of original that should be good enough.

UPDATE: looking over disassembly shows the move+retro code should be indeed slower, as it has more complex inner loop. So I revert to stright for-loop again :)

I added source link, see if it looks okay. The only thing left should be your latest enhancements to TimSort. I'll pull these in soon.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 22, 2012

Member

Folded in. Interestingly I do observe slight speed up compared to a version of Tim Sort that I checked out from your repo previously.

Member

DmitryOlshansky commented Sep 22, 2012

Folded in. Interestingly I do observe slight speed up compared to a version of Tim Sort that I checked out from your repo previously.

@Xinok

View changes

std/algorithm.d
+ immutable minTemp = max(range.length / 2, MIN_STORAGE);
+ size_t minGallop = MIN_GALLOP;
+ Slice[STACK_SIZE] stack = void;
+ size_t stackLen = 0;

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 24, 2012

Contributor

Line up these declarations or remove the extra spaces.

@Xinok

Xinok Sep 24, 2012

Contributor

Line up these declarations or remove the extra spaces.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 26, 2012

Member

@Xinok Done.

@andralex Ready for another round of destruction.

Member

DmitryOlshansky commented Sep 26, 2012

@Xinok Done.

@andralex Ready for another round of destruction.

@andralex

View changes

std/algorithm.d
{
- put(target, source.front);
+ auto len = source.length;
+ for (size_t idx = 0; idx < len; idx++)

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

foreach

@andralex

andralex Sep 29, 2012

Member

foreach

@andralex

View changes

std/algorithm.d
+ }
+ else
+ {
+ break;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

redundant control flow. Use

if (lessI >= greaterI)
{
    break;
}
swapAt(r, lessI, greaterI);
@andralex

andralex Sep 29, 2012

Member

redundant control flow. Use

if (lessI >= greaterI)
{
    break;
}
swapAt(r, lessI, greaterI);
@andralex

View changes

std/algorithm.d
+ enum MIN_GALLOP = 7;
+ enum MIN_STORAGE = 256;
+ enum STACK_SIZE = 40;
+

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

these all caps enums are so C. Use regular symbol notation?

@andralex

andralex Sep 29, 2012

Member

these all caps enums are so C. Use regular symbol notation?

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 5, 2012

Member

Actually I find them convinient to distinguish true variables and manifest constants.
Esp as there is a variable named minGallop that starts with MIN_GALLOP value and changes over time.

@DmitryOlshansky

DmitryOlshansky Oct 5, 2012

Member

Actually I find them convinient to distinguish true variables and manifest constants.
Esp as there is a variable named minGallop that starts with MIN_GALLOP value and changes over time.

+ {
+ if (__ctfe) temp.length = minTemp;
+ else temp = uninitializedArray!(T[])(minTemp);
+ }

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

Since the user can't know what minTemp is, would it make sense to pass tmp by reference? That way the changes in length will persist after the call to sort.

@andralex

andralex Sep 29, 2012

Member

Since the user can't know what minTemp is, would it make sense to pass tmp by reference? That way the changes in length will persist after the call to sort.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 30, 2012

Contributor

If an extra parameter is added to std.algorithm:sort to allow such a thing, and as long as benchmarks are unaffected. An alternative is to return temp, since this function currently doesn't return a value.

@Xinok

Xinok Sep 30, 2012

Contributor

If an extra parameter is added to std.algorithm:sort to allow such a thing, and as long as benchmarks are unaffected. An alternative is to return temp, since this function currently doesn't return a value.

@andralex

View changes

std/algorithm.d
- while(pred(r[++lessI], pivot)) {}
- while(greaterI > 0 && pred(pivot, r[--greaterI])) {}
+ if (stackLen >= 3
+ && stack[stackLen - 3].length <= stack[stackLen - 2].length + stack[stackLen - 1].length)

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

unusual alignment, use two indents instead

@andralex

andralex Sep 29, 2012

Member

unusual alignment, use two indents instead

@andralex

View changes

std/algorithm.d
+ if (stackLen >= 3
+ && stack[stackLen - 3].length <= stack[stackLen - 2].length + stack[stackLen - 1].length)
+ {
+ size_t at;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

use ?: to initialize

@andralex

andralex Sep 29, 2012

Member

use ?: to initialize

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

to clarify:

immutable at = stack[stackLen - 3].length <= stack[stackLen - 1].length
    ? stackLen - 3
    : stackLen - 2;
@andralex

andralex Sep 29, 2012

Member

to clarify:

immutable at = stack[stackLen - 3].length <= stack[stackLen - 1].length
    ? stackLen - 3
    : stackLen - 2;
@andralex

View changes

std/algorithm.d
+ mergeAt(range, stack[0 .. stackLen], stackLen - 2, minGallop, temp);
+ --stackLen;
+ }
+ else break;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

There's a fair amount of uses for stack[stackLen - 3], stack[stackLen - 2], and stack[stackLen - 1]. How about defining locals third, second, and top?

@andralex

andralex Sep 29, 2012

Member

There's a fair amount of uses for stack[stackLen - 3], stack[stackLen - 2], and stack[stackLen - 1]. How about defining locals third, second, and top?

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

Sounds good. Though after call to mergeAt they need to be reloaded as it changes stack.

I increasingly suspect that Tim Sort is best described as a stateful process.
A state consists of stack, temporary space and initial array to sort. And most of helper functions inevitably access this state in various ways. The swath of ref parameters seem to underline this point.

Then top, second, third are then methods of said intermediate object. I'll try to re-write it this way and see

  • if it helps readability/removes redundancy
  • if it has any negative effects on performance
@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

Sounds good. Though after call to mergeAt they need to be reloaded as it changes stack.

I increasingly suspect that Tim Sort is best described as a stateful process.
A state consists of stack, temporary space and initial array to sort. And most of helper functions inevitably access this state in various ways. The swath of ref parameters seem to underline this point.

Then top, second, third are then methods of said intermediate object. I'll try to re-write it this way and see

  • if it helps readability/removes redundancy
  • if it has any negative effects on performance

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 1, 2012

Contributor

Defining locals sounds like a good idea, but be cautious of the naming. When merging, Timsort deals with three runs at a time at the end of the stack. So when naming these locals, be sure it reflects:
Run1 = stackLen - 3
Run2 = stackLen - 2
Run3 = stackLen - 1

As for stateful process, I'm not sure exactly what you're proposing, but it doesn't sound good. I wrote my code for maximum efficiency. That meant exclusively using static functions with references, but I used references as minimally as possible. This can be seen in mergeLo and mergeHi, which return the new value of minGallop, since accessing this value through a reference would be detrimental to performance.

I don't know what effect it would have if you wrote it as a class, or stored the "state" in a struct. But I originally tried using nested functions, and the performance was awful.

@Xinok

Xinok Oct 1, 2012

Contributor

Defining locals sounds like a good idea, but be cautious of the naming. When merging, Timsort deals with three runs at a time at the end of the stack. So when naming these locals, be sure it reflects:
Run1 = stackLen - 3
Run2 = stackLen - 2
Run3 = stackLen - 1

As for stateful process, I'm not sure exactly what you're proposing, but it doesn't sound good. I wrote my code for maximum efficiency. That meant exclusively using static functions with references, but I used references as minimally as possible. This can be seen in mergeLo and mergeHi, which return the new value of minGallop, since accessing this value through a reference would be detrimental to performance.

I don't know what effect it would have if you wrote it as a class, or stored the "state" in a struct. But I originally tried using nested functions, and the performance was awful.

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

I don't know what effect it would have if you wrote it as a class, or stored the "state" in a struct. But I originally tried using nested functions, and the performance was awful.

I'd try it as a struct and see what's the net effect is.

@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

I don't know what effect it would have if you wrote it as a class, or stored the "state" in a struct. But I originally tried using nested functions, and the performance was awful.

I'd try it as a struct and see what's the net effect is.

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 5, 2012

Member

And the net effect is about -10% performance on your tests using DMD.
Haven't had time to analyze the generated asm code.
I think it's enough of experiments already I'm doing another pass on style.

@DmitryOlshansky

DmitryOlshansky Oct 5, 2012

Member

And the net effect is about -10% performance on your tests using DMD.
Haven't had time to analyze the generated asm code.
I think it's enough of experiments already I'm doing another pass on style.

@andralex

View changes

std/algorithm.d
+ // Force collapse stack until there is only one run left
+ while (stackLen > 1)
+ {
+ size_t at;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

ditto

@andralex

View changes

std/algorithm.d
+ r |= n & 1;
+ n >>= 1;
+ }
+ return n + r;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

So what's the semantics of r here? As far as I can tell it's 0 or 1. It will be 1 if n has at least one bit set or something.

@andralex

andralex Sep 29, 2012

Member

So what's the semantics of r here? As far as I can tell it's 0 or 1. It will be 1 if n has at least one bit set or something.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 30, 2012

Contributor

This is a direct copy from TimSort for Android. You can check the original code for detailed documentation:
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java

@Xinok

Xinok Sep 30, 2012

Contributor

This is a direct copy from TimSort for Android. You can check the original code for detailed documentation:
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Sep 30, 2012

Member

Now that's something of a showstopper. I've meant to ask you if you copied any other implementation and what's the license it has...
Looking there : GNU General Public License version 2. I'm afraid there is no way this can be put into a Boost licensed Phobos then unless we can convince them to re-license it.

@DmitryOlshansky

DmitryOlshansky Sep 30, 2012

Member

Now that's something of a showstopper. I've meant to ask you if you copied any other implementation and what's the license it has...
Looking there : GNU General Public License version 2. I'm afraid there is no way this can be put into a Boost licensed Phobos then unless we can convince them to re-license it.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 30, 2012

Contributor

It's exactly the same code as is used in the C implementation for Python. The license is very short and pretty lenient. There should be little issue with including this in Phobos.

EDIT: I updated the link to the license page. It's virtually the same as the BeOpen license, but for PSF (Python Software Foundation) instead.

Code - http://svn.python.org/projects/python/trunk/Objects/listobject.c (merge_compute_minrun)
License - http://docs.python.org/license.html

P.S. - To be safe, you could rename the function to something else, e.g. calcMinRun.

@Xinok

Xinok Sep 30, 2012

Contributor

It's exactly the same code as is used in the C implementation for Python. The license is very short and pretty lenient. There should be little issue with including this in Phobos.

EDIT: I updated the link to the license page. It's virtually the same as the BeOpen license, but for PSF (Python Software Foundation) instead.

Code - http://svn.python.org/projects/python/trunk/Objects/listobject.c (merge_compute_minrun)
License - http://docs.python.org/license.html

P.S. - To be safe, you could rename the function to something else, e.g. calcMinRun.

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

i'm far from being a lawyer but I don't think it's that easy. While it looks fine for an app. But not for a standard lib - it should be included alongside with boost in all stuff based on Phobos:
Copyright © 2001-2012 Python Software Foundation; All Rights Reserved
I'm not sure everybody is okay with that. Also note that there is no distinction between source form and binary both should include notice.

@andralex We gotta enlist somebody fluent with copyrights and licenses to look into this.

P.S. - To be safe, you could rename the function to something else, e.g. calcMinRun.

What that supposed to mean? Of course it's not the question of names, it's a question of being a derivative work that you yourself admitted.

To be honest I personally don't give a damn about license issues (and hate 'em with passion) but it is important to clear this up for the commercial use of D.

@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

i'm far from being a lawyer but I don't think it's that easy. While it looks fine for an app. But not for a standard lib - it should be included alongside with boost in all stuff based on Phobos:
Copyright © 2001-2012 Python Software Foundation; All Rights Reserved
I'm not sure everybody is okay with that. Also note that there is no distinction between source form and binary both should include notice.

@andralex We gotta enlist somebody fluent with copyrights and licenses to look into this.

P.S. - To be safe, you could rename the function to something else, e.g. calcMinRun.

What that supposed to mean? Of course it's not the question of names, it's a question of being a derivative work that you yourself admitted.

To be honest I personally don't give a damn about license issues (and hate 'em with passion) but it is important to clear this up for the commercial use of D.

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Oct 1, 2012

Member

Let me make sure I understand - are we discussing the entire algorithm, or just this particular function? In the latter case, no problem because I thought it could be improved significantly.

@andralex

andralex Oct 1, 2012

Member

Let me make sure I understand - are we discussing the entire algorithm, or just this particular function? In the latter case, no problem because I thought it could be improved significantly.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 1, 2012

Contributor

The only two functions which may be an issue are minRunLength and ensureCapacity. The rest of the code is my own. Descriptions of these functions can be found in the Timsort paper, if you wish to rewrite them:
http://svn.python.org/projects/python/trunk/Objects/listsort.txt

I proposed renaming the function to distance it from the Android source code (since the rest is identical to the Python C code).

This is why I released my code to the public domain. To hell with receiving credit!

@Xinok

Xinok Oct 1, 2012

Contributor

The only two functions which may be an issue are minRunLength and ensureCapacity. The rest of the code is my own. Descriptions of these functions can be found in the Timsort paper, if you wish to rewrite them:
http://svn.python.org/projects/python/trunk/Objects/listsort.txt

I proposed renaming the function to distance it from the Android source code (since the rest is identical to the Python C code).

This is why I released my code to the public domain. To hell with receiving credit!

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

The only two functions which may be an issue are minRunLength and ensureCapacity.

Thanks for clarifying that.

Then this should the description of minRunLength:

Instead we pick a minrun in range(32, 65) such that N/minrun is exactly a
power of 2, or if that isn't possible, is close to, but strictly less than,
a power of 2. This is easier to do than it may sound: take the first 6
bits of N, and add 1 if any of the remaining bits are set. In fact, that
rule covers every case in this section, including small N and exact powers
of 2; merge_compute_minrun() is a deceptively simple function.

It's not quite the same as it isn't tied to minimal merge length. Which is good if we are to rewrite it.

@DmitryOlshansky

DmitryOlshansky Oct 1, 2012

Member

The only two functions which may be an issue are minRunLength and ensureCapacity.

Thanks for clarifying that.

Then this should the description of minRunLength:

Instead we pick a minrun in range(32, 65) such that N/minrun is exactly a
power of 2, or if that isn't possible, is close to, but strictly less than,
a power of 2. This is easier to do than it may sound: take the first 6
bits of N, and add 1 if any of the remaining bits are set. In fact, that
rule covers every case in this section, including small N and exact powers
of 2; merge_compute_minrun() is a deceptively simple function.

It's not quite the same as it isn't tied to minimal merge length. Which is good if we are to rewrite it.

@andralex

View changes

std/algorithm.d
+ if (temp.length < minCapacity)
+ {
+ size_t newSize = minCapacity;
+ foreach (n; TypeTuple!(1, 2, 4, 8, 16)) newSize |= newSize >> n;

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Sep 29, 2012

Member

An explanatory comment would be of good help here.

@andralex

andralex Sep 29, 2012

Member

An explanatory comment would be of good help here.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 30, 2012

Contributor

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

From what little I understand here. Code goes as follows:
take next power of 2 of minCapacity call it a newSize.
(yeah this hacky loop does this.. but only on 32bit platform 64 bit needs also a 32 bits shift)

Obviously the next nextSize < minCapacity branch is never taken (unless overflow is hit and that's kind of unlikely).
So newSize is truncated to never exceed half of the range.

To be honest in both cases it's used I'm not sure why it has this check for the half of the full range.
Same goes for the next power of 2 here. I could have understood the desire for page size granular allocation but just power of 2, dunno.

@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

From what little I understand here. Code goes as follows:
take next power of 2 of minCapacity call it a newSize.
(yeah this hacky loop does this.. but only on 32bit platform 64 bit needs also a 32 bits shift)

Obviously the next nextSize < minCapacity branch is never taken (unless overflow is hit and that's kind of unlikely).
So newSize is truncated to never exceed half of the range.

To be honest in both cases it's used I'm not sure why it has this check for the half of the full range.
Same goes for the next power of 2 here. I could have understood the desire for page size granular allocation but just power of 2, dunno.

@Xinok

View changes

std/algorithm.d
+ void sort(R range, T[] temp)
+ {
+ // Do insertion sort on small range
+ if (range.length <= MIN_MERGE)

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Sep 30, 2012

Contributor

Consider replacing MIN_MERGE here with a value between 128 to 256. My own benchmarks found insertion sort to be faster up to (about) 256 elements. There's a lot of overhead to initializing Timsort, so simple insertion sort is faster on arrays up to a certain length.

UPDATE: Now that this is the only place that MIN_MERGE is used, consider renaming the constant or removing it altogether.

@Xinok

Xinok Sep 30, 2012

Contributor

Consider replacing MIN_MERGE here with a value between 128 to 256. My own benchmarks found insertion sort to be faster up to (about) 256 elements. There's a lot of overhead to initializing Timsort, so simple insertion sort is faster on arrays up to a certain length.

UPDATE: Now that this is the only place that MIN_MERGE is used, consider renaming the constant or removing it altogether.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 6, 2012

Member

Awaiting futher review.

Member

DmitryOlshansky commented Oct 6, 2012

Awaiting futher review.

@Xinok

View changes

std/algorithm.d
+ if (stackLen >= 3 && stack[run1].length <= stack[run2].length + stack[run3].length)
+ {
+ size_t at = stack[run1].length <= stack[run3].length
+ ? run1 : run2;

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 6, 2012

Contributor

For all four variables (at, run1, run2, run3), replace the declarator with immutable.

@Xinok

Xinok Oct 6, 2012

Contributor

For all four variables (at, run1, run2, run3), replace the declarator with immutable.

@Xinok

View changes

std/algorithm.d
+ auto run2 = stackLen - 2;
+ auto run1 = stackLen - 3;
+ size_t at = stackLen >= 3 && stack[run1].length <= stack[run3].length
+ ? run1 : run2;

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 6, 2012

Contributor

Ditto.

@Xinok

Xinok Oct 6, 2012

Contributor

Ditto.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 6, 2012

Member

Okay, I rewrote auto --> immutable, and made MIN_MERGE = 128.

Member

DmitryOlshansky commented Oct 6, 2012

Okay, I rewrote auto --> immutable, and made MIN_MERGE = 128.

@andralex

View changes

std/algorithm.d
+ enum MIN_GALLOP = 7;
+ enum MIN_STORAGE = 256;
+ enum STACK_SIZE = 40;
+

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Oct 7, 2012

Member

replace these with camelCase symbols?

@andralex

andralex Oct 7, 2012

Member

replace these with camelCase symbols?

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

Well If you insist..

@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

Well If you insist..

@andralex

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Oct 7, 2012

Member

OK so what's the status with the licensing and all?

Member

andralex commented Oct 7, 2012

OK so what's the status with the licensing and all?

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

OK so what's the status with the licensing and all?

AFAICT we are down to 1 function that is tied to some 3rd party source.
That is ensureCapacity and again it could be rewritten in a straightforward way with intrinsics.

Member

DmitryOlshansky commented Oct 7, 2012

OK so what's the status with the licensing and all?

AFAICT we are down to 1 function that is tied to some 3rd party source.
That is ensureCapacity and again it could be rewritten in a straightforward way with intrinsics.

+
+ // Copy run into temporary memory
+ temp = temp[0 .. mid];
+ copy(range[0 .. mid], temp);

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

so here temp needs to hold only mid elements..

@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

so here temp needs to hold only mid elements..

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 7, 2012

Contributor

Pretty much. Truncating temp to make it the same length just makes the code simpler. Same as in mergeHi below.

@Xinok

Xinok Oct 7, 2012

Contributor

Pretty much. Truncating temp to make it the same length just makes the code simpler. Same as in mergeHi below.

+ assert(temp.length >= range.length - mid);
+
+ // Copy run into temporary memory
+ temp = temp[0 .. range.length - mid];

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

and range.length - mid here...

@DmitryOlshansky

DmitryOlshansky Oct 7, 2012

Member

and range.length - mid here...

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 10, 2012

Member

Okay. Current state: I've re-wrote ensureCapacity, fixed accidental bug, and made another stylistic pass.
Waiting for comments.

@Xinok
I have the felling that it may be better to just preallocate half of the range size. Tim sort tries to merge runs that are close to equal of one another and thus at the very end one of runs is slightly greater then half while the other is slightly smaller then the half. So the minimum amount of temporary ram to perform merge is slightly smaller then a half.
Extending temporary array on demand to me looks like it will only add more calls to GC and waste more ram in the end.

Member

DmitryOlshansky commented Oct 10, 2012

Okay. Current state: I've re-wrote ensureCapacity, fixed accidental bug, and made another stylistic pass.
Waiting for comments.

@Xinok
I have the felling that it may be better to just preallocate half of the range size. Tim sort tries to merge runs that are close to equal of one another and thus at the very end one of runs is slightly greater then half while the other is slightly smaller then the half. So the minimum amount of temporary ram to perform merge is slightly smaller then a half.
Extending temporary array on demand to me looks like it will only add more calls to GC and waste more ram in the end.

@Xinok

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 10, 2012

Contributor

@blackwhale
That's not quite accurate. This is the all important step here:
EDIT: I posted a link that didn't work for whatever reason, but it's in the merge function, under "Reduce range of elements" on line 7811.

I really should have included a more descriptive comment, so I'll try that now:
// Take the last element in the first range and use a gallop search to find its position in the second range
// Likewise, take the first element in the second range and use a gallop search to find its position in the first range
// Outside of this range, the elements are already in place, so there is no need to merge them.
// Slice the range to exclude those elements.

By doing this, you're shrinking the range of elements that need to be merged. To quote Wikipedia, "Timsort was designed to take advantage of partial orderings that already exist in most real-world data." So on lists with low entropy, Timsort is likely to use less than O(n/2) space.

What about random lists with high entropy? Well, it starts by allocating enough space for 256 elements and increases this space by a power of two (512, 1024, 2048, 4096 ....). If sorting ~16mil elements, it will do at most 15 allocations.

If that's unsatisfactory, rather than removing it, you could simply increase the value of minimalStorage. For example, increasing it from 256 to 65536 would reduce the above from 15 to 7 allocations. Of course, if we knew what to expect from D for manual memory management...

Contributor

Xinok commented Oct 10, 2012

@blackwhale
That's not quite accurate. This is the all important step here:
EDIT: I posted a link that didn't work for whatever reason, but it's in the merge function, under "Reduce range of elements" on line 7811.

I really should have included a more descriptive comment, so I'll try that now:
// Take the last element in the first range and use a gallop search to find its position in the second range
// Likewise, take the first element in the second range and use a gallop search to find its position in the first range
// Outside of this range, the elements are already in place, so there is no need to merge them.
// Slice the range to exclude those elements.

By doing this, you're shrinking the range of elements that need to be merged. To quote Wikipedia, "Timsort was designed to take advantage of partial orderings that already exist in most real-world data." So on lists with low entropy, Timsort is likely to use less than O(n/2) space.

What about random lists with high entropy? Well, it starts by allocating enough space for 256 elements and increases this space by a power of two (512, 1024, 2048, 4096 ....). If sorting ~16mil elements, it will do at most 15 allocations.

If that's unsatisfactory, rather than removing it, you could simply increase the value of minimalStorage. For example, increasing it from 256 to 65536 would reduce the above from 15 to 7 allocations. Of course, if we knew what to expect from D for manual memory management...

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 10, 2012

Member

That's not really accurate. This is the all important step here:
D-Programming-Language/phobos#787

Sorry that link doesn't bring me anywhere aside for this pull request discussion.

.... So on lists with low entropy, Timsort is likely to use less than O(n/2) space.

Thanks, I entierely forgot about this technique.

What about random lists with high entropy? Well, it starts by allocating enough space for 256 elements and increases this space by a power of two (512, 1024, 2048, 4096 ....). If sorting ~16mil elements, it will do at most 15 allocations.

It does go like this now. But when I printed out values inside your original code it was almost always not a power of two.

Of course, if we knew what to expect from D for manual memory management...

Yeah, we are still waiting on this.

Member

DmitryOlshansky commented Oct 10, 2012

That's not really accurate. This is the all important step here:
D-Programming-Language/phobos#787

Sorry that link doesn't bring me anywhere aside for this pull request discussion.

.... So on lists with low entropy, Timsort is likely to use less than O(n/2) space.

Thanks, I entierely forgot about this technique.

What about random lists with high entropy? Well, it starts by allocating enough space for 256 elements and increases this space by a power of two (512, 1024, 2048, 4096 ....). If sorting ~16mil elements, it will do at most 15 allocations.

It does go like this now. But when I printed out values inside your original code it was almost always not a power of two.

Of course, if we knew what to expect from D for manual memory management...

Yeah, we are still waiting on this.

+ alias gallopSearch!(true, true) gallopReverseUpper;
+}
+
+unittest

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 10, 2012

Contributor

We could use a better unit test as this one isn't adequate for Tim Sort. It never touches gallop mode when merging, and the stability test doesn't go beyond insertion sort. I'll work on a better unit test and leave this comment here as a reminder for the time being.

@Xinok

Xinok Oct 10, 2012

Contributor

We could use a better unit test as this one isn't adequate for Tim Sort. It never touches gallop mode when merging, and the stability test doesn't go beyond insertion sort. I'll work on a better unit test and leave this comment here as a reminder for the time being.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 11, 2012

Contributor

I updated my Timsort module with a much better unit test.

https://github.com/Xinok/XSort/blob/master/timsort.d#L564

@Xinok

Xinok Oct 11, 2012

Contributor

I updated my Timsort module with a much better unit test.

https://github.com/Xinok/XSort/blob/master/timsort.d#L564

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 11, 2012

Member

Great, I'll fold it in tomorrow.

@DmitryOlshansky

DmitryOlshansky Oct 11, 2012

Member

Great, I'll fold it in tomorrow.

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 29, 2012

Contributor
  • Remove the old unittest. The new unittest is more than adequate.
  • I rearranged the new unittest into a couple of nested functions which will allow a CTFE unittest to be added in the future. See here.
  • Somebody merge this already!!!
@Xinok

Xinok Oct 29, 2012

Contributor
  • Remove the old unittest. The new unittest is more than adequate.
  • I rearranged the new unittest into a couple of nested functions which will allow a CTFE unittest to be added in the future. See here.
  • Somebody merge this already!!!
@Xinok

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 10, 2012

Contributor

@blackwhale I edited my original post. You can include my more descriptive comment on that line. In fact, please do. :P

I don't thoroughly comment my code, so if there's anywhere that needs a more descriptive comment, leave a note on that line and I'll write a better explanation like I did above.

Contributor

Xinok commented Oct 10, 2012

@blackwhale I edited my original post. You can include my more descriptive comment on that line. In fact, please do. :P

I don't thoroughly comment my code, so if there's anywhere that needs a more descriptive comment, leave a note on that line and I'll write a better explanation like I did above.

@Xinok

View changes

std/algorithm.d
{
- swapAt(r, lessI, greaterI);
+ size_t at = stack[run1].length <= stack[run3].length
+ ? run1 : run2;

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 10, 2012

Contributor

size_t at -> immutable at

@Xinok

Xinok Oct 10, 2012

Contributor

size_t at -> immutable at

@Xinok

View changes

std/algorithm.d
+ immutable run2 = stackLen - 2;
+ immutable run1 = stackLen - 3;
+ size_t at = stackLen >= 3 && stack[run1].length <= stack[run3].length
+ ? run1 : run2;

This comment has been minimized.

Show comment Hide comment
@Xinok

Xinok Oct 10, 2012

Contributor

size_t at -> immutable at

@Xinok

Xinok Oct 10, 2012

Contributor

size_t at -> immutable at

@andralex

View changes

std/algorithm.d
@@ -926,7 +926,7 @@ unittest
enum filler = uint.max;
InputRange range;
fill(range,filler);
- foreach(value;range.arr)
+ foreach (value;range.arr)

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Oct 14, 2012

Member

space after semicolon

@andralex

andralex Oct 14, 2012

Member

space after semicolon

@andralex

View changes

std/algorithm.d
&& is(typeof(range.length > filler.length)))
{
//Case we have access to length
auto len = filler.length;
//Start by bulk copies
- for( ; range.length > len ; )
+ for ( ; range.length > len ; )

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Oct 14, 2012

Member

while?

@andralex

andralex Oct 14, 2012

Member

while?

@andralex

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Oct 14, 2012

Member

Ping on this? Any work left prompted by Xinok's comments?

Member

andralex commented Oct 14, 2012

Ping on this? Any work left prompted by Xinok's comments?

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 14, 2012

Member

Ping on this? Any work left prompted by Xinok's comments?

Yup. I need to add an exteneded unittest and do a few extra style changes.

Member

DmitryOlshansky commented Oct 14, 2012

Ping on this? Any work left prompted by Xinok's comments?

Yup. I need to add an exteneded unittest and do a few extra style changes.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 15, 2012

Member

Ready to go.

Member

DmitryOlshansky commented Oct 15, 2012

Ready to go.

@monarchdodra

View changes

std/algorithm.d
+ for (; !source.empty; source.popFront())
+ {
+ put(target, source.front);
+ }

This comment has been minimized.

Show comment Hide comment
@monarchdodra

monarchdodra Oct 19, 2012

Collaborator

Put natively supports a range as input, so you can replace this with put(target, source);.

I know the for loop wasn't yours, but if you touch it, might as well improve it.

@monarchdodra

monarchdodra Oct 19, 2012

Collaborator

Put natively supports a range as input, so you can replace this with put(target, source);.

I know the for loop wasn't yours, but if you touch it, might as well improve it.

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 20, 2012

Member

Will do

@monarchdodra

View changes

std/algorithm.d
+ // Typically 2 random access ranges are faster iterated by common
+ // index then by x.popFront(), y.popFront() pair
+ static if (isRandomAccessRange!Range1 && hasLength!Range1
+ && hasSlicing!Range2 && isRandomAccessRange!Range2)

This comment has been minimized.

Show comment Hide comment
@monarchdodra

monarchdodra Oct 19, 2012

Collaborator

Technically, you also need to check that hasLength!Range2, in the odd case of isInfinite!Range2. Infinite ranges can be random access, and slicing and infinite range (while a strange notion) is legal.

FYI, I am writing a isDroppable interface which would allow a less restrictive "slicing to end" functionality, but it is not ready yet.

EDIT: The common convention (as I have seen), is that operators be at line end, before the line feed, and not at the beginning of the line: eg:

    static if (isRandomAccessRange!Range1 && hasLength!Range1 &&
        hasSlicing!Range2 && isRandomAccessRange!Range2)

Also, 4x indent please (nitpick :p).

@monarchdodra

monarchdodra Oct 19, 2012

Collaborator

Technically, you also need to check that hasLength!Range2, in the odd case of isInfinite!Range2. Infinite ranges can be random access, and slicing and infinite range (while a strange notion) is legal.

FYI, I am writing a isDroppable interface which would allow a less restrictive "slicing to end" functionality, but it is not ready yet.

EDIT: The common convention (as I have seen), is that operators be at line end, before the line feed, and not at the beginning of the line: eg:

    static if (isRandomAccessRange!Range1 && hasLength!Range1 &&
        hasSlicing!Range2 && isRandomAccessRange!Range2)

Also, 4x indent please (nitpick :p).

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 20, 2012

Member

Actually, I swear that puttting operators at the begining of the line is the common convention.
For one thing it's less bug prone - no need to look back and forth across the line to get how things are composed.

"As I have seen" is hardly an argument cause Phobos is a mixture of conventions. I do not know a single module that has coherent style.
In this pull while doing global substitutions like "for(" --> "for (" I've found that 'space before paren' is not followed in std.algorithm at large. (it's supposed to)

I suspect in the long run we just have to throw in the damn STANDARD PHOBOS code formatter tool, especially w.r.t. spaces, commas, braces and parans. Otherwise the process of reviewing a pull goes painful ping-pong about every glitch with the spacing. And yeah, without the tool we'd still let through code with different style.

@DmitryOlshansky

DmitryOlshansky Oct 20, 2012

Member

Actually, I swear that puttting operators at the begining of the line is the common convention.
For one thing it's less bug prone - no need to look back and forth across the line to get how things are composed.

"As I have seen" is hardly an argument cause Phobos is a mixture of conventions. I do not know a single module that has coherent style.
In this pull while doing global substitutions like "for(" --> "for (" I've found that 'space before paren' is not followed in std.algorithm at large. (it's supposed to)

I suspect in the long run we just have to throw in the damn STANDARD PHOBOS code formatter tool, especially w.r.t. spaces, commas, braces and parans. Otherwise the process of reviewing a pull goes painful ping-pong about every glitch with the spacing. And yeah, without the tool we'd still let through code with different style.

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 20, 2012

Member

Infinite ranges can be random access, and slicing and infinite range (while a strange notion) is legal.

Okay...

@DmitryOlshansky

DmitryOlshansky Oct 20, 2012

Member

Infinite ranges can be random access, and slicing and infinite range (while a strange notion) is legal.

Okay...

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 27, 2012

Member

Sorry, I've been somewhat busy. @monarchdodra nits are addressed, waiting for more comments if any.

Member

DmitryOlshansky commented Oct 27, 2012

Sorry, I've been somewhat busy. @monarchdodra nits are addressed, waiting for more comments if any.

@monarchdodra

This comment has been minimized.

Show comment Hide comment
@monarchdodra

monarchdodra Oct 29, 2012

Collaborator

I had nothing but nitpicks.

Collaborator

monarchdodra commented Oct 29, 2012

I had nothing but nitpicks.

@alexrp

This comment has been minimized.

Show comment Hide comment
@alexrp

alexrp Oct 31, 2012

Member

Are we good to go?

Member

alexrp commented Oct 31, 2012

Are we good to go?

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 31, 2012

Member

Let me fold revamped unit test and squash it then we are good to go.

Member

DmitryOlshansky commented Oct 31, 2012

Let me fold revamped unit test and squash it then we are good to go.

@monarchdodra monarchdodra referenced this pull request Oct 31, 2012

Closed

Copy re-impl. #839

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Oct 31, 2012

Member

I'm done.

Member

DmitryOlshansky commented Oct 31, 2012

I'm done.

@DmitryOlshansky DmitryOlshansky referenced this pull request Nov 3, 2012

Closed

[RFC] Fix `move` #923

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Nov 13, 2012

Member

Squashed and waiting.

Member

DmitryOlshansky commented Nov 13, 2012

Squashed and waiting.

@klickverbot

This comment has been minimized.

Show comment Hide comment
@klickverbot

klickverbot Nov 23, 2012

Member

Could you separate out the random whitespace/style changes all over the modules to a separate commit? Other than that, looks good to me. Let's get this merged.

Member

klickverbot commented Nov 23, 2012

Could you separate out the random whitespace/style changes all over the modules to a separate commit? Other than that, looks good to me. Let's get this merged.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Nov 23, 2012

Member

The whitespace changes are not random it's a deliberate pass over std.algo to bring existing code to the prevalent standard.

Anyway will do.

Member

DmitryOlshansky commented Nov 23, 2012

The whitespace changes are not random it's a deliberate pass over std.algo to bring existing code to the prevalent standard.

Anyway will do.

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Nov 23, 2012

Member

Here you go. Hopefully we can finally get this in.

Member

DmitryOlshansky commented Nov 23, 2012

Here you go. Hopefully we can finally get this in.

@klickverbot

This comment has been minimized.

Show comment Hide comment
@klickverbot

klickverbot Nov 23, 2012

Member

Thanks – and sorry for being such a stickler, but it is helpful to be able to quickly skip such commits when browsing the history (git blame, …).

Member

klickverbot commented Nov 23, 2012

Thanks – and sorry for being such a stickler, but it is helpful to be able to quickly skip such commits when browsing the history (git blame, …).

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Dec 1, 2012

Member

Ehm! Cool, now the win32 build runs out of memory...

Member

DmitryOlshansky commented Dec 1, 2012

Ehm! Cool, now the win32 build runs out of memory...

@DmitryOlshansky

This comment has been minimized.

Show comment Hide comment
@DmitryOlshansky

DmitryOlshansky Dec 12, 2012

Member

Ping.
Is it out of reach of 2.061 or is there any other problem?

Member

DmitryOlshansky commented Dec 12, 2012

Ping.
Is it out of reach of 2.061 or is there any other problem?

@andralex

This comment has been minimized.

Show comment Hide comment
@andralex

andralex Dec 12, 2012

Member

I assume "cannot read file b6395.d" is unrelated. Merging now.

Member

andralex commented Dec 12, 2012

I assume "cannot read file b6395.d" is unrelated. Merging now.

andralex added a commit that referenced this pull request Dec 12, 2012

@andralex andralex merged commit 0fcca44 into dlang:master Dec 12, 2012

1 check was pending

default Pending: 10
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment