cartesianProduct improvement #2276

quickfur · 2014-06-28T04:03:41Z

Partial fix for issues:
https://issues.dlang.org/show_bug.cgi?id=9878
https://issues.dlang.org/show_bug.cgi?id=10693
https://issues.dlang.org/show_bug.cgi?id=10779
https://issues.dlang.org/show_bug.cgi?id=12957

The full fix is a little complex, so I thought at least the partial fix should go in first. Basically, this PR includes two fixes: standardize on lexicographic output order (for now), and for finite forward ranges, use a much less template-heavy implementation that doesn't involve an exponential number of template instantiations. For infinite ranges and non-forward ranges, we fall back to the old implementation, but I think the finite forward range case is probably the most common, and benefits the most from this fix, so it should be merged now rather than later.

bearophile · 2014-06-29T18:57:06Z

"For infinite ranges and non-forward ranges, we fall back to the old implementation"

So the output order changes according to the kind of range you pass to this function? If this is true, it looks a little dangerous.

I'd also like some way to get the optional "repeat" argument of the Python version of this function (named just product). Is this possible, even just as a template argument? (If it is possible, should it be asked for in a new enhancement request?)
https://docs.python.org/2/library/itertools.html#itertools.product

quickfur · 2014-06-30T01:05:49Z

Once you have a non-forward input range or an infinite range as one of the arguments, you can no longer dictate what order the elements of the product should be returned in, because only a very limited number of orders are actually implementable. For example, demanding an order in which the product's tuples visit the elements of an input range in permuted order is impossible to satisfy, since you can't move backwards in an input range. Similarly, if one or more argument ranges are infinite, then certain element orderings are not possible, because covering all elements of the infinite product requires traversing it in one of a limited number of ways. Requiring only forward ranges restricts the possible return orderings even further, so generally, allowing the user to choose the return order only needlessly complicates the implementation, and most of the time you have to reject the requested order anyway since it's impossible to satisfy.
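To illustrate the point about infinite ranges, here is a small sketch, assuming a Phobos version where `sequence` is available from `std.range`. The helper name `reachable` is made up for this example. A plain lexicographic walk over two infinite ranges would never get past the first element of the first range, so the product has to be traversed in some other order for every pair to turn up after finitely many steps.

```d
import std.algorithm : canFind, cartesianProduct;
import std.range : sequence;
import std.typecons : tuple;

/// Hypothetical helper: true if the pair (i, j) shows up in the product
/// of the natural numbers with themselves.
bool reachable(size_t i, size_t j)
{
    auto naturals = sequence!"n"(0);               // 0, 1, 2, ... (infinite)
    auto pairs = cartesianProduct(naturals, naturals);
    // canFind only terminates because the traversal order reaches every
    // pair after a finite number of steps.
    return canFind(pairs, tuple(i, j));
}

unittest
{
    assert(reachable(0, 0));
    assert(reachable(3, 5));
}
```

The example deliberately says nothing about which order the pairs come out in, only that each one is eventually reached.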

The most common use case of the cartesian product, however, involves finite forward ranges. This is the case targeted by this PR, giving them an efficient implementation that (for now) returns elements in the expected lexicographic ordering. For this reason, I also folded in a previous PR that changes the 2-argument case, so that for the finite forward range case, it will return the elements in the same lexicographic ordering. Allowing the user to specify other orderings in this case is possible, since this is the most flexible case, but that will be more complex to implement, and can be done later.
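As a rough illustration of the lexicographic ordering for finite forward ranges, the sketch below takes the product of three small arrays. The input values are arbitrary and chosen only to make the order easy to read: the rightmost range varies fastest.

```d
import std.algorithm : cartesianProduct, equal;
import std.typecons : tuple;

unittest
{
    // Three finite forward ranges (plain arrays).
    auto abc = cartesianProduct([1, 2], [10, 20], [100, 200]);

    // Lexicographic order: the last argument cycles fastest,
    // the first argument slowest.
    assert(abc.equal([
        tuple(1, 10, 100), tuple(1, 10, 200),
        tuple(1, 20, 100), tuple(1, 20, 200),
        tuple(2, 10, 100), tuple(2, 10, 200),
        tuple(2, 20, 100), tuple(2, 20, 200),
    ]));
}
```

Compiling with `dmd -main -unittest` runs the check.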

For now, I think the most pressing issue is that the current implementation works poorly in the common, finite forward range case. The exponential number of templates used by the current variadic cartesianProduct makes it impossible to use more than a small number of arguments before the compiler runs out of memory at compile-time. This PR at least addresses this most common use case, so that most people will find cartesianProduct usable, then we can focus on improving the more obscure cases later (infinite ranges and non-forward input ranges).

quickfur · 2014-06-30T01:08:08Z

As for the "repeat" argument in Python, I believe you have an open enhancement request for it. I think it should be relatively easy to add once this PR is checked in. I'd prefer to implement it separately.

bearophile · 2014-06-30T07:04:44Z

Thank you for the answers, and for the work.

bearophile · 2014-07-02T18:36:33Z

Other related issues:
https://issues.dlang.org/show_bug.cgi?id=12007
https://issues.dlang.org/show_bug.cgi?id=11825
https://issues.dlang.org/show_bug.cgi?id=7128

bearophile · 2014-07-02T18:41:01Z

In theory cartesianProduct should be pure nothrow @safe @nogc if the inputs are the same.

quickfur · 2014-07-02T18:43:04Z

Attributes for template functions (including methods of templated types) are inferred by the compiler, so generally I wouldn't bother annotating them. Is something broken here that makes it non-pure, throwing, etc., when it shouldn't be?
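A minimal sketch of what attribute inference means in practice. The names `twice` and `callsTwice` are invented for this example and are not part of Phobos. The template carries no annotations, yet it can be called from a fully annotated function, which compiles only if the compiler inferred all four attributes for that instantiation.

```d
// Template function: no attributes are written here, the compiler
// infers them separately for each instantiation.
T twice(T)(T x)
{
    return x + x;
}

// Non-template function with explicit attributes. The call below is
// accepted only because twice!int was inferred pure nothrow @safe @nogc.
int callsTwice() pure nothrow @safe @nogc
{
    return twice(21);
}

unittest
{
    assert(callsTwice() == 42);
}
```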

bearophile · 2014-07-02T20:15:29Z

It's right to not annotate template functions, because the attributes are inferred. But I think it's also a very good idea to add to Phobos a unit test like this, which makes sure all attributes are inferred correctly in a simple use case:

unittest {
    void foo() pure nothrow @safe @nogc {
        immutable int[3] arr1 = [1, 2, 3];
        immutable int[2] arr2 = [1, 2];
        foreach (immutable t; cartesianProduct(arr1[], arr2[], arr1[])) {}
    }
}

quickfur · 2014-07-02T23:05:19Z

Rebased to prevent compile failure with latest changes in dmd.

Added pure @safe nothrow @nogc to unittest to ensure cartesianProduct does not inadvertently introduce impurity, unsafety, hidden GC allocations, etc.

Also melded some commits together to make cleaner history.

quickfur · 2014-07-07T05:15:15Z

Anybody else reviewing this PR? @andralex ? @monarchdodra ? @jmdavis ?

WalterBright · 2014-07-10T18:54:41Z

How are we looking on unittest coverage for the new code? I.e.:

dmd -main -unittest -cov std/algorithm

quickfur · 2014-07-10T19:06:01Z

Fully covered, from what I can tell (there are no 00000 lines in the .lst file inside the cartesianProduct functions).

WalterBright · 2014-07-10T19:48:01Z

awesome!

WalterBright · 2014-07-10T19:48:12Z

Auto-merge toggled on

bearophile · 2014-07-10T22:10:58Z

Wasn't this covered?
https://issues.dlang.org/show_bug.cgi?id=13091

quickfur · 2014-07-10T22:31:51Z

Unfortunately, no, because the 2-argument case still uses the original deeply-templated implementation. I'm wondering if I should do a followup PR that replaces that with the new implementation as well, when no input ranges or infinite ranges are involved.

bearophile · 2014-07-10T22:34:56Z

I have found a case that I don't understand:

void main() {
    import std.typecons: Tuple, Nullable;
    import std.algorithm: cartesianProduct;
    string[] a;
    const Nullable!(Tuple!(string[])) b;
    //foreach (ab; cartesianProduct(a, b[0])) {} // fails
    //foreach (ab; cartesianProduct(a, b.get[0])) {} // fails
    const string[] c = b.get[0]; // OK
    foreach (ab; cartesianProduct(a, c)) {} // OK
}

quickfur · 2014-07-10T23:01:03Z

pragma(msg, typeof(b[0])) reveals that its type is const(immutable(char)[][]), which is not a forward range because const arrays are not ranges (you can't modify them, so popFront doesn't work).
Ditto with b.get[0].
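The range-trait claim can be checked directly. The sketch below uses `int` instead of `string` to keep it short, and `pairCount` is a hypothetical helper, not something from the PR. It also shows one possible workaround, explicit slicing, under the assumption that a tail-const slice is acceptable to the caller.

```d
import std.algorithm : cartesianProduct;
import std.range : isForwardRange, isInputRange, walkLength;

// popFront has to mutate the slice itself, which a fully const array forbids.
static assert(!isInputRange!(const(int[])));

// A mutable slice of const elements can still be advanced.
static assert(isForwardRange!(const(int)[]));

/// Hypothetical helper: counts the pairs after slicing the const array.
size_t pairCount(int[] a, const int[] c)
{
    // c[] has type const(int)[], which is a forward range.
    return cartesianProduct(a, c[]).walkLength;
}

unittest
{
    assert(pairCount([1, 2], [10, 20, 30]) == 6);
}
```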

quickfur · 2014-07-10T23:03:27Z

P.S. Hmph, apparently c also has that same type. So looks like the compiler has a hack to make iterating const arrays possible, but for whatever reason, when returned from wrapper types like Nullable, the hack didn't work. Probably file a compiler bug?

bearophile · 2014-07-11T06:50:47Z

Once I have understood the situation I'll open another bug report. But does cartesianProduct have internal logic to convert a const(int[]) to a const(int)[]?

bearophile · 2014-07-11T08:01:03Z

OK, filed as: https://issues.dlang.org/show_bug.cgi?id=13092

quickfur mentioned this pull request Jun 28, 2014

Fix cartesianProduct: default order should be lexicographically sorted (issue 9878) #1314

Closed

H. S. Teoh added 3 commits July 2, 2014 15:59

cartesianProduct: default order should be lexicographically sorted.

a8f80e0

Update unittest and example code to reflect new ordering.

66287d0

Get rid of exponential template bloat for most common use case of cartesianProduct.

95ea9c8

WalterBright added a commit that referenced this pull request Jul 10, 2014

Merge pull request #2276 from quickfur/cprod_improve1

bcceb54

cartesianProduct improvement

WalterBright merged commit bcceb54 into dlang:master Jul 10, 2014

quickfur deleted the cprod_improve1 branch July 16, 2014 05:05
