Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested comprehensions, bug or a feature? #1191

Closed
ocanbascil opened this issue Mar 9, 2011 · 27 comments
Closed

Nested comprehensions, bug or a feature? #1191

ocanbascil opened this issue Mar 9, 2011 · 27 comments

Comments

@ocanbascil
Copy link

I'm having problem with nested array comprehensions.

In python: [ [i,j] for i in range(3) for j in range(3) ]
gives you a 2D array:
[ [0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2] ]

When I try the same in coffescript : ( [i,j] for i in [0..2] for j in [0..2] )

I end up with a 3D array:
[ [ [ 0, 0 ], [ 1, 0 ], [ 2, 0 ] ],
[ [ 0, 1 ], [ 1, 1 ], [ 2, 1 ] ],
[ [ 0, 2 ], [ 1, 2 ], [ 2, 2 ] ] ]

When I increase the nest depth, python keeps giving flattened results whereas CS increases the array dimensions.

So is this a bug or a feature?

@michaelficarra
Copy link
Collaborator

The coffeescript behaviour is what I would expect. It is easy enough to flatten and emulate Python's behaviour from CS, but it seems difficult to emulate the CS behaviour from Python.

edit: after seeing TrevorBurnham's comment below, I guess that you could put square brackets around each comprehension (I don't know python very well). Still, I like the consistency of CS. I'd be open to hearing ways to make the CS comprehensions more versatile, though.

edit again: since this post, my opinion has been changed. I am now strongly in favour of switching to the python style comprehensions.

@TrevorBurnham
Copy link
Collaborator

Feature. This is actually totally consistent, though believe me, I understand your surprise.

In Python, you have to explicitly declare array comprehensions by placing square brackets around each for. But in CoffeeScript, every for returns a list (for that matter, so does every while and until). It's only an optimization that the list isn't generated if it's unused.

So

arr = ( [i,j] for i in [0..2] for j in [0..2] )

assigns the list generated by for j in [0..2] to arr, and each entry in that list is generated by the expression [i,j] for i in [0..2]. Makes sense?

The easiest way to get the flat array you want is to generate the list without the use of comprehensions:

arr = []
arr.push [i, j] for i in [0..2] for j in [0..2]

@satyr
Copy link
Collaborator

satyr commented Mar 11, 2011

In other words, Coffee doesn't support nested comprenhensions in generic sense.

([i,j] for i in [0..2] for j in [0..2])

is equivalent to

(([i,j] for i in [0..2]) for j in [0..2])

and thus to Python's

[[[i,j] for i in range(3)] for j in range(3)]

@ocanbascil
Copy link
Author

Thanks a lot, that was really what I need as I'm trying to implement
solutions to the project euler problems.

On Thu, Mar 10, 2011 at 8:37 PM, TrevorBurnham <
reply@reply.github.com>wrote:

Feature. This is actually totally consistent, though believe me, I
understand your surprise.

In Python, you have to explicitly declare array comprehensions by placing
square brackets around each for. But in CoffeeScript, every for returns
a list
(for that matter, so does every while and until). It's only an
optimization that the list isn't generated if it's unused.

So

arr = ( [i,j] for i in [0..2] for j in [0..2] )

assigns the list generated by for j in [0..2] to arr, and each entry in
that list is generated by the expression [i,j] for i in [0..2]. Makes
sense?

To get the flat array you want, you could either use a utility function
like underscore.js'
_.flatten, or generate the list without the use of comprehensions:

arr = []
arr.push [i, j] for i in [0..2] for j in [0..2]

#1191 (comment)

@jonelf
Copy link

jonelf commented Apr 16, 2011

I found the same thing trying to convert this Python by Peter Norvig

def cross(A, B):`  
    "Cross product of elements in A and elements in B."`  
    return [a+b for a in A for b in B]`

to CoffeScript

cross = (A, B) -> (a+b for a in A for b in B)

Squeegy posted a working implementation at http://stackoverflow.com/questions/5685449/nested-array-comprehensions-in-coffeescript/5685926#5685926

@TrevorBurnham
Copy link
Collaborator

Can this issue be closed? It seems like the original question has been answered, and I don't see any proposal for a change in the syntax.

@jashkenas
Copy link
Owner

Reopening this for conversation, after: http://brehaut.net/blog/2011/coffeescript_comprehensions

... but I think it should be driven by use cases + consistency. Both patterns can accomplish both goals, but which is more expected and convenient?

@jashkenas jashkenas reopened this Sep 13, 2011
@pmuellr
Copy link

pmuellr commented Sep 13, 2011

Don't have strong feelings either way, but Brehaut's post also seems to indicate a Harmony incompatibility: "The Harmony project will introduce both generators and comprehensions to Javascript, and provide different semantics for both." That's the big issue to me.

@TrevorBurnham
Copy link
Collaborator

What we need is a separate syntax that passes through to Harmony's list comprehension syntax. Maybe it wouldn't be so bad for

[x for x of obj]

to pass through directly to [x for (x in obj)]. The current behavior of that syntax is nicely consistent, but pretty uncommon in practice (and can be achieved with [].concat (x for x of obj)).

Right now, this is pretty academic—it's not like developers are rushing to writing apps that can only run in Firefox. But when Node can handle Harmony-style list comprehensions, people will expect to be able to write them in CoffeeScript. The same goes for let and certain other Harmony-isms.

I'd suggest that we add such Harmony pass-throughs in CoffeeScript 2.0, along with a compiler flag to emit warnings about compatibility issues. That's the only way I can see for people to happily use CoffeeScript for development in both cutting-edge environments and older browsers.

@jashkenas
Copy link
Owner

That's a whole 'nother kettle of fish -- and doesn't need to be discussed in this issue. Personally, I'd rather have CoffeeScript never support Harmony features until they're available across all common JS runtimes.

Despite a decade of promises, JavaScript is still pegged to what's possible to accomplish in IE6 and IE7, and it's not realistic to expect that to change any time soon. IE9 will never run on XP. (All of this has been said before.)

One of the core principles of CoffeeScript is that you're not writing code that breaks on older engines, and we shouldn't give that up under any circumstances.

@raganwald
Copy link
Contributor

I have no experience with the Harmony situation, so I’ll limit my remarks to what I think about Coffeescript as it is today. I prefer flat comprehensions:

  1. I think it's easier and cleaner to make nested arrays with flat comprehensions than to make flat arrays with nested comprehensions.
  2. Compatibility with Haskell and Python’s mental model.

I think the latter reason is important. Being similar to the two most popular (or notorious) implementations of comprehensions but differing in a significant way throws some extra mental friction into Coffeescript. Most people will use the comprehensions for simple cases and be happy, but then run into an issue when they try to ape a Python or Haskell solution they find elsewhere.

I personally think Coffeescript will ”delight” programmers when they discover that implementing a Python comprehension “just works.”

JM2C.

@michaelficarra
Copy link
Collaborator

Yes, I also don't feel a pass-through is appropriate for CoffeeScript. But I'm very happy that this issue was re-opened, as the comprehension syntax/behaviour is one of only two coffeescript features with which I have gripes. I've listed both changes in the "manifesto" of my still-vaporware rewrite (it's coming, I promise!).

I think it's appropriate to accept the harmony-style comprehensions and compile them to their ES3 equivalents. People should be allowed to use these JS features in CS. When we were adding something that didn't exist, it was okay, but now that the same feature exists in JS, choosing a different syntax and different semantics just serves to confuse. I think Andrew Brehaut put it really well himself.

I refer to the CoffeeScript behaviour as ‘map’ oriented comprehensions, and the Python behaviour as ‘mapcat’ oriented comprehensions. The mapcat model is more compositional than the map model. This means that the result of a function that is defined as mapcat oriented comprehension can be the input to itself recursively.

def flatten(l):
  try:
    return [y for x in l for y in flatten(x)]
  except TypeError, e: # horrible test for iterability
    return [l]

But even more importantly for me, it will allow us to explicitly specify when you want to use the "value" of a loop and finally shut everyone up about #899 (no offense, @TrevorBurnham and others). The downside is that you can't claim that "everything has a value" any longer, but I'm not sure that's even the case today.

@jashkenas
Copy link
Owner

If we change this behavior, everything will still have a value -- the question is just if the value of a nested loop is a flat array, or an array of arrays for each inner loop. I think that the current behavior is, almost unarguably, more consistent ... but the proposed behavior may well be more useful.

It's not out of the realm of possibility for us to do both:

results = x + y for x in list for y in otherList

results is [ flat array ]

results = for y in otherList
  for x in list
    x + y

results is [[ array of arrays ]]

... but that would probably be an unacceptable distinction to draw. At the same time, the rationale for doing a flat map gets weaker and weaker the more in-depth and complex your inner loops get. If there's lots of arbitrary code between the loop bodies, I think producing a flat map back out at the top level would feel truly strange, especially because saving the inner loop to a local variable would dramatically change the behavior.

results = for y in otherList
  tempResult = for x in list
    x + y

... always must produce an array of arrays.

@avernet
Copy link

avernet commented Nov 8, 2011

My preference, inspired from Scala, would be to keep the current behavior when using multiple for, each for behaving like a map:

s1 = [1, 2]; s2 = [3, 4]
r = (i for i in s for s in [s1, s2])
# r contains [[1, 2], [3, 4]]

This is similar to what happens in Scala when using multiple for:

val s1 = List(1, 2); val s2 = List(3, 4)
val r = for (s <- Seq(s1, s2)) yield for (i <- s) yield i
# r contains List(List(1, 2), List(3, 4))

Now however in Scala, you get a flatMap when using just one for:

val r = for (s <- List(s1, s2); i <- s) yield i
# r contains List(1, 2, 3, 4)

By analogy, I'd like to get a flatMap in CoffeeScript when using one for, with a syntax TBD, for instance:

r = (i for s in [s1, s2], i in s)
# r would contain [1, 2, 3, 4]

@veryeasily
Copy link

Hi, beginner here. The wikipedia for list comprehensions implies the goal of comprehensions is to mimic set-builder notation. As a mathematician I can say what I'd expect using set-builder notation:

If A = ['a','b'], and B = ['1','2'] then
a+b for a in A for b in B := {a+b : a in A, b in B} = [a1, a2, b1, b2]
[a+b for a in A] for b in B := {{a+b: a in A}: b in B} = [[a1, b1], [a2,b2]]

I'm not sure if this is possible to implement with a parser, but this makes intuitive sense to me.

@showell
Copy link

showell commented Dec 29, 2011

FWIW, I find CoffeeScript's list comprehensions basically unusable for multiple loops, but I'm biased by my Python experience. It's not a huge issue for me, since the workaround is obvious (just use the imperative style when you have multiple loops).

It's going to be a gotcha for newbies coming from Python. If nothing else, I think the documentation should clearly call out the nested behavior.

@showell
Copy link

showell commented Dec 29, 2011

Here's an interesting litmus test for list comprehensions:

http://rosettacode.org/wiki/List_comprehensions

I tried it in CS, and I basically gave up.

Here's the Python solution (not mine, but Rosetta is GFDL, see their license):

[(x,y,z) for x in xrange(1,n+1) for y in xrange(x,n+1) for z in xrange(y,n+1) if x**2 + y**2 == z**2]

@showell
Copy link

showell commented Jan 14, 2012

See also [https://github.com//issues/2030](issue 2030).

@showell
Copy link

showell commented Jan 14, 2012

I hope that the docs eventually call out that CS differs from Python in two important ways. First, there's flattening. Second, there's the order in which loops are nested. Python documents its strategy in the PEP: "The form [... for x... for y...] nests, with the last index varying fastest, just like nested for loops." (http://www.python.org/dev/peps/pep-0202/)

CS arranges the loops in the opposite direction, with the first index varying fastest, the opposite of how nested for loops work:

(x for x in a for y in b)
var x, y, _i, _j, _len, _len2;

for (_i = 0, _len = b.length; _i < _len; _i++) {
  y = b[_i];
  for (_j = 0, _len2 = a.length; _j < _len2; _j++) {
    x = a[_j];
    x;
  }
}

@michaelficarra
Copy link
Collaborator

@showell: I agree, that would be a good way to document our comprehensions. Unlike python,

  • Nested comprehensions are not automatically flattened (map instead of concatMap/flatMap semantics)
  • The index of the comprehension nested deepest varies fastest

That seems like a concise way to describe both the nested comprehension semantics and the way it differs from Python.

@showell
Copy link

showell commented Jan 14, 2012

@michaelficarra I think those two lines would go a long way to avoiding confusion. Does it make sense to open a new issue that specifically addresses documenting the feature as currently implemented?

@michaelficarra
Copy link
Collaborator

It would make even more sense to just open a pull request (mentioning and summarising this issue for those that weren't paying attention to our discussion) where we can talk about changes if what was submitted is insufficient/erroneous.

@satyr
Copy link
Collaborator

satyr commented Jan 14, 2012

Calling it comprehension was a mistake. Just saying "our loop returns an array of each result" or something
would've been much less confusing.

@showell
Copy link

showell commented Jan 14, 2012

See issue 2039. @michaelficarra @satyr I am holding off on a pull request, because I think the main problem here is agreeing on some terminology. Once the terminology gets nailed down, I think the doc patches will be pretty trivial (and very helpful).

@gkz
Copy link

gkz commented Jan 15, 2012

I really like this idea:
[x, y] for x in [1..3], y in [7..9] when x + y != 10
producing
[[1, 7], [1, 8], [2, 7], [2, 9], [3, 8], [3, 9]]

This way it would not break or change any current code - you could still use multiple fors and get the current result. Eg you could do
[x, y, z] for x in [1..3], y in [7..9] when x + y != 10 for z in ['a', 'b', 'c']
if you wanted to.

This way we could have the current list producing fors, and also real list comprehensions that act like people expect.

I tried taking a stab at getting something like this working, but when I got to the compileNode method of For in nodes.coffee, I got in over my head.

@lucasb-eyer
Copy link
Contributor

I know this is old, but it's still open. I just want to point out that Julia also uses for a in [1,2], b in [3,4], as suggested by @avernet and @gkz. This has the advantages of 1. not breaking any existing code and 2. the amount of for words corresponds to the depth of the result. The disadvantage is that it's unknown to Pythonistas.

@GeoffreyBooth
Copy link
Collaborator

Closing this in the interest of cleaning up old issues. If someone would like to implement a new comprehension feature that uses a new syntax that isn’t a breaking change, please feel free to submit a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests