Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.algorithm.sorted / schwartzSorted #9890

Open
dlangBugzillaToGithub opened this issue Oct 18, 2010 · 13 comments
Open

std.algorithm.sorted / schwartzSorted #9890

dlangBugzillaToGithub opened this issue Oct 18, 2010 · 13 comments

Comments

@dlangBugzillaToGithub
Copy link

bearophile_hugs reported this on 2010-10-18T19:05:59Z

Transfered from https://issues.dlang.org/show_bug.cgi?id=5076

CC List

Description

I propose to add two new little functions to Phobos, named sorted()/schwartzSorted(). Their purpose is to copy the input items, sort them and return them.

The -ed versions are more functional, they don't modify the input data, so they may work with a immutable input sequence too. They may be used as expressions instead of as statements, so in theory they allow code like (currently this doesn't work, see bug 5074 ):


auto foo(immutable(int)[] data) {
    return map!q{ -a }(sorted(data));
}


Instead of:

auto foo(immutable(int)[] data) {
    int[] arr = data.dup;
    sort(arr);
    return map!q{ -a }(arr);
}


sorted()/schwartzSorted() may be seen as less efficient than sort()/schwartzSort() because they copy the input data, but there are many situations (like script-like D code) where the programmer wants short and quick code, and knows the number of input items will be low enough to not cause memory shortage. Python too has both list.sort() and sorted() built-ins.


A possible simple implementation, for DMD 2.049 (Code not tested much):


import std.algorithm: SwapStrategy, schwartzSort, sort;
import std.range: isRandomAccessRange, hasLength;
import std.array: array;

auto sorted(alias less = "a < b", SwapStrategy ss = SwapStrategy.unstable,
        Range)(Range r)
{
    auto auxr = array(r);
    sort!(less, ss)(auxr);
    return auxr;
}

auto schwartzSorted(alias transform, alias less = "a < b",
        SwapStrategy ss = SwapStrategy.unstable, Range)(Range r)
    if (isRandomAccessRange!(Range) && hasLength!(Range))
{
    auto auxr = array(r);
    schwartzSort!(transform, less, ss)(auxr);
    return auxr;
}

import std.typecons: Tuple;
import std.stdio: writeln;

alias Tuple!(int, "x", int, "y") P;

void main() {
    P[] data = [P(1,4), P(2,3), P(3,1), P(4,0)];
    writeln(data);
    writeln(schwartzSorted!((x) { return x.y; })(data));
    writeln(data);
    writeln(sorted!q{ a.y < b.y }(data));
    writeln(data);
}
@dlangBugzillaToGithub
Copy link
Author

peter.alexander.au (@Poita) commented on 2010-10-19T00:20:20Z

This actually seems to be a common pattern.

By "this", I mean:

auto foo(T value)
{
  T copy = value.dup;
  modify(copy);
  return copy;
}

"modify" here could be sort, reverse, schwartzSort, partition etc. which would give you sorted, reversed, schwartzSorted and partitioned. It's also the same as defining op+ in terms of op+=.

I have no idea what you would call foo though :-(

arr2 = transformed!(sort)(arr1);
arr2 = mutated!(sort)(arr1);
arr2 = modified!(sort)(arr1);
arr2 = copyModify!(sort)(arr1);

???

@dlangBugzillaToGithub
Copy link
Author

andrei (@andralex) commented on 2010-10-19T06:22:08Z

One issue is that there's no standardized "create a copy" function for ranges.

@dlangBugzillaToGithub
Copy link
Author

peter.alexander.au (@Poita) commented on 2010-10-19T09:15:12Z

(In reply to comment #2)
> One issue is that there's no standardized "create a copy" function for ranges.

auto copy = array(input);

?

Ideally you'd be able to specify what container you want to copy into, but that should do as a default.

@dlangBugzillaToGithub
Copy link
Author

andrei (@andralex) commented on 2010-10-19T09:42:37Z

(In reply to comment #3)
> (In reply to comment #2)
> > One issue is that there's no standardized "create a copy" function for ranges.
> 
> auto copy = array(input);
> 
> ?
> 
> Ideally you'd be able to specify what container you want to copy into, but that
> should do as a default.

array creates an array from anything. We should have a way to say "duplicate and preserve type".

Probably the best idea is to define an optional property ".dup" for ranges. Then arrays implement it automatically, and other ranges may define it as they find fit.

@dlangBugzillaToGithub
Copy link
Author

peter.alexander.au (@Poita) commented on 2010-10-19T11:15:30Z

(In reply to comment #4)
> array creates an array from anything. We should have a way to say "duplicate
> and preserve type".
> 
> Probably the best idea is to define an optional property ".dup" for ranges.
> Then arrays implement it automatically, and other ranges may define it as they
> find fit.

What if you don't want to preserve type?

e.g. you have a list, but you want to get a sorted array of the elements in that list?

Why not take the best of both worlds, allowing the user to specify the new container type, but have it default to the original type, something like:

auto transformed(alias xform, OutputRange = InputRange, InputRange)(InputRange range)
{
  ...
}

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2010-10-19T13:32:04Z

(In reply to comment #4)

> array creates an array from anything. We should have a way to say "duplicate
> and preserve type".

After thinking about your words for some time I have understood your point.

So after your change, for the semantics I am looking for, I'll need to write a bit longer code:
sorted(array(some_linked_list))

What if the input collection is not sortable? Like:
sorted(some_hash_set)

In that case I presume the compilation will fail, and I'll have to use something like:
sorted(array(some_hash_set))

Or even:
sorted(toList(some_hash_set))

(Where toList() is similar to array() but produces some kind of list out of an iterable).

A problem in using this:
sorted(array(some_linked_list))
is that array() is supposed to create an array duplicate of the collection, and then sorted() is supposed to create a second useless copy of it.

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2010-10-19T13:33:57Z

(In reply to comment #6)

Another problem is that after your change this code will fail to compile even if some_linked_list is immutable:
sorted(some_linked_list)

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2010-10-20T04:04:21Z

See another case:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22381

This is supposed to not work:
sort(map(...))

This is supposed to work:
sorted(map(...))

@dlangBugzillaToGithub
Copy link
Author

peter.alexander.au (@Poita) commented on 2010-10-20T08:49:43Z

(In reply to comment #8)
> See another case:
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22381
> This is supposed to not work:
> sort(map(...))
> This is supposed to work:
> sorted(map(...))

Just implement sorted et al. something like this:

auto sorted(Output = ElementType!InputRange[], InputRange)(InputRange range)
{
  Output output = Output(range); // copy range into new container
  sort(output);
  return output;
}

I don't know if you can construct built-in arrays like that, but you should, and you can always specialise for it if necessary.

This allows the input range to be whatever it likes (including Map), and also gives you the choice of the output range.

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2011-01-24T03:04:20Z

In Python sort() is in-place. To help programmers remember this, sort() returns None (like void in D):

>>> a = [10, 30, 5]
>>> sorted(a)
[5, 10, 30]
>>> a
[10, 30, 5]
>>> a.sort()
>>> a
[5, 10, 30]

But in DMD 2.051 std.algorithm.sort() returns the sequence sorted in-place. Once a sorted/schwartzSorted are present, I suggest to let std.algorithm.sort() return void, as in Python.

sorted/schwartzSorted may be tagged with @nodiscard from bug 5464 to further help programmers remember their semantics.

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2011-05-18T18:21:42Z

See also issue 6035

@dlangBugzillaToGithub
Copy link
Author

bearophile_hugs commented on 2011-10-28T17:33:16Z

An use case for sorted(). I have to create a function foo() with a int[] argument. Unless foo() is performance-critical the usual API requirements ask for its arguments to be constant (in), to make the program less bug-prone. Inside foo() I need to sort the a copy of items, and then I don't need to modify this array, so I'd like this array copy too to be const. This is an implementation that currently works:


import std.algorithm, std.exception;
void foo(in int[] unsortedData) {
    int[] tmpData_ = unsortedData.dup;
    tmpData_.sort();
    const(int[]) data = assumeUnique(tmpData_);
    // Use array 'data' here.
}
void main() {}


assumeUnique is not safe, and the tmpData_ name is present in the scope still (despite assumeUnique has turned its length to zero, this improves the situation a little).

With a pure sorted(), the code becomes more clean and safe:


import std.algorithm;
void foo(in int[] unsortedData) pure {
    const(int[]) data = sorted(unsortedData);
    // Use array 'data' here.
}
void main() {}

@dlangBugzillaToGithub
Copy link
Author

andrej.mitrovich (@AndrejMitrovic) commented on 2012-02-07T12:20:02Z

I run into this issue all the time, in particular when doing script-based programming. E.g.:

string[string] classes;
foreach (name; classes.keys.sorted) { } // ng

Having to do this is a chore:
auto keys = classes.keys;
sort(keys);
foreach (name; keys) { } 

This works ok for my purposes (yada yada about constraints):
T sorted(T)(T t)
{
    T result = t;
    sort(result);
    return result;
}

Why do we have replace and replaceInPlace, whereas we have sort which sorts in place implicitly? "findSkip" is another function that I sometimes use but I hate how it hides the fact that it modifies its arguments. "find" returns a range, but "findSkip" returns a bool and *modifies* your argument. It's not at all obvious from the call site.

@LightBender LightBender removed the P4 label Dec 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants