Independent splats in array pattern matching #870

Closed
TrevorBurnham opened this Issue · 35 comments

7 participants

@TrevorBurnham
Collaborator

Let's say that I want to get the first and last values of an array of arbitrary length, in a nice one-line statement. I could write

[first, middle..., last] = arr

But this is less than ideal from both a code efficiency standpoint and a readability standpoint, since I never use middle. What I'd prefer to write is simply

[first, ..., last] = arr

Do others agree that this should be allowed?

@satyr
Collaborator

Not as nice, but you can write:

{0: first, (arr.length - 1): last} = arr
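
For readers more familiar with JavaScript, the same idea can be sketched with an ES2015 object pattern and a computed key. This is an illustration only, not CoffeeScript compiler output, and the function name `firstAndLast` is hypothetical:

```javascript
// Hypothetical helper, for illustration only.
function firstAndLast(arr) {
  // Array indices are ordinary property keys, so an object pattern can
  // pick them out. The computed key [arr.length - 1] plays the role of
  // (arr.length - 1) in the CoffeeScript line above.
  const { 0: first, [arr.length - 1]: last } = arr;
  return [first, last];
}

console.log(firstAndLast([1, 2, 3, 4]));
```

No intermediate array is created for the skipped middle values; only the two named indices are read.
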
@michaelficarra
Collaborator

I don't see why not. Technically, it doesn't actually make sense to introduce a middle variable if it is never intended to be used. It hurts the readability of the code. I'd be in favor of this as long as nobody can spot any glaring issues.

@satyr
Collaborator

You'd want a placeholder syntax that takes one space as well.

$ coffee -bpe '[first, _, third, _, fifth] = a'
var fifth, first, third, _;
first = a[0], _ = a[1], third = a[2], _ = a[3], fifth = a[4];

In SpiderMonkey, empty entries serve well.

js> [first, , third] = [1, 2, 3]
1,2,3
js> [first, third]
1,3
@michaelficarra
Collaborator

@satyr: good idea. I'd be in favor of the empty entries. The first syntax you mentioned would work fine right now if you didn't care about the value of the _ variable, but definitely shouldn't be introduced as a syntactic construct.

@satyr
Collaborator

Of course though,

{0: first, 2: third, 4: fifth} = a

works just as well.

@StanAngeloff

Ah this comes back to life. I need to dig up the old discussion we had on the topic.

EDIT: Start with #86, move on to #277.

EDIT 2: Thought I'd share: http://githubissues.heroku.com/#jashkenas/coffee-script/ is great for searching.

@TrevorBurnham
Collaborator

Yes, I like weepy's proposal at issue 277 of using ? as a single-value placeholder. It looks very clear to me:

[first, ?, third, ?, fifth, others...] = arr
@TrevorBurnham
Collaborator

In fact, it'd be cool to see ? (or some other character) as a no-I-don't-need-this-value placeholder in other contexts in CoffeeScript as well; for instance, sometimes I just want the values of a hash, not the keys:

foo(val) for ?, val of hash

I understand the "named values contribute to more self-documenting code" argument Jeremy gave in issue 277, but I feel like that benefit is usually outweighed by the clarity that comes with avoiding the declaration of a variable you never use. For instance,

[first, ..., last] = arr

makes it much more clear that you only care about the first and last values of the array than any possible name for the middle value would. And, of course, there are the side benefits of brevity and slightly more efficient JavaScript output.

Would someone like to submit a patch?

@satyr
Collaborator

What would arr = [first, ..., last] produce?

@TrevorBurnham
Collaborator

A syntax error. It's not necessary for the pattern-matching syntax to be completely parallel to the array creation syntax.

@jashkenas
Owner

I don't think we need this for 1.0 ... Even if you're not going to use a variable that's serving as a placeholder, at least you can name it something descriptive. The only place that placeholder syntax would be used is in pattern matches... So, closing as a wontfix.

@satyr
Collaborator

[first, ..., last] = arr

[first, ?, third, ?, fifth, others...] = arr

Coco now supports both of those without needing this proposal, in the form of

{0: first, (*-1): last} = arr
[first, [], third, [], fifth, ...others] = arr
@jashkenas
Owner

satyr: want to link to your patch?

@TrevorBurnham
Collaborator

Could we reopen this issue? It was closed with the comment "I don't think we need this for 1.0"; now that 1.0 has successfully been released, I believe it merits further discussion.

Specifically, I'd like to propose allowing

[first, ..., last] = arr

and

[first, ?, third, ?, fifth] = arr

Here's a common use case for the latter: I run a regex, and I only want the group matches (or perhaps a subset of group matches), not the full match. For instance, let's say that I have a coordinates string in the format x,y, where x and y are integers. So I'd like to be able to write

[?, x, y] = coordinates.match /(\d+),(\d+)/

The closest I can come with the current syntax is to either 1) put in an unnecessary variable name instead of ?, or 2) write

[x, y] = coordinates.match(/(\d+),(\d+)/)[1..]

The ? syntax is, I think, both more readable and more writable, and would generate more efficient code.
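
To make the trade-off concrete, here is a minimal JavaScript sketch of the three approaches: naming the unused full match, slicing it off, and skipping it with an empty entry (the elision form that ES2015 destructuring allows, similar to the SpiderMonkey example earlier in the thread). The function names are hypothetical, and the sketch assumes the input always matches, since `match` returns `null` otherwise:

```javascript
// All three helpers are hypothetical and assume the string matches;
// String.prototype.match returns null on failure, which would throw here.
const COORDS = /(\d+),(\d+)/;

function parseByNaming(coordinates) {
  // Option 1: bind the full match to a name that is never read.
  const [fullMatch, x, y] = coordinates.match(COORDS);
  return [x, y];
}

function parseBySlicing(coordinates) {
  // Option 2: drop index 0 with slice, which allocates a second array.
  const [x, y] = coordinates.match(COORDS).slice(1);
  return [x, y];
}

function parseByElision(coordinates) {
  // Option 3: leave the first slot empty so nothing is bound for index 0.
  const [, x, y] = coordinates.match(COORDS);
  return [x, y];
}

console.log(parseByElision("12,34"));
```

All three return the captured groups as strings; they differ only in whether an extra binding or an extra array is created along the way.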

Similarly, if I have a function call foo(bar) that returns a list of values, of which I only want the last three, I think

[..., x, y, z] = foo bar

is clearer than any existing syntax, and would generate more efficient code than any existing one-liner.

@michaelficarra
Collaborator

I am in support of the ... syntax, not so much the ?. Though I don't really have a better suggestion, so I'd be okay with it. With regard to your last example, though, I think [x, y, z] = (foo bar)[-3..] is pretty clean. Though I guess the newly proposed syntax is more readable and more naturally understandable to people unfamiliar with coffee.

@TrevorBurnham
Collaborator

What don't you like about the single-value skipping syntax? Is it just that the ? feels like an arbitrary choice of symbol? I agree that it's arbitrary, but surely folks will get used to it? The only viable alternative I see is allowing [ , x, y], which is clearly less readable. I'd be happy to hear other suggestions, though.

The regex use case is a strong one. I frequently run matches just for the groups; adding [1..] to the end of the match, or sticking in an unused variable, feel like awfully kludgy ways of skipping the first array item. Plus, I sometimes do such regex matches in a performance-intensive loop, where superfluous assignments/slices potentially matter. And I may want to skip a group or two from the middle, not just the beginning.

@michaelficarra
Collaborator

@TrevorBurnham: It's not the syntax. I think a single value can always just be given a name. Skipping more than one value without naming it is more useful, though, because those skipped values may have nothing in common and thus no valid identifier. I like the hanging commas, but I remember jashkenas thought that it wasn't very readable, which I can pretty much agree with.

@sethaurus

I suggest null as a placeholder symbol instead of ?. It's pretty intuitive for assignment to null to be a no-op. null... is also an option for a discarded splat. Upon compilation, a null in assignment position could simply become a local variable __null, which would be reusable within a function since it would never be read.

@satyr
Collaborator

I'd suggest using some other symbol than ? at least--its semantics are consistent, and the syntax rules around it are quite complex as is.

Note that [] is half-working already:

$ coffee -bpe '[[], x, y] = match'
var x, y;
match[0], x = match[1], y = match[2];

Just remove the extra match[0] and we get the desired behavior.

@michaelficarra
Collaborator

@sethaurus: I like that idea a lot.

@TrevorBurnham
Collaborator

I'd be OK with the syntax

[null, second, third] = arr

Note, however, that the ideal compilation would be second = arr[1]; third = arr[2];. I don't see any reason for the __null variable, except that it may be easier to implement. (See @satyr's post above—he's halfway there in his implementation already.)

@michaelficarra
Collaborator

@TrevorBurnham: That's coffee. Go ahead and try it out.

@TrevorBurnham
Collaborator

Cool. So it's just a matter of getting rid of the extra symbol and using null instead of (or in addition to) []. I just find [null, second] = arr much more readable/writeable than [[], second] = arr.

Oh, and as to splats, is []... supposed to be working? It should be possible to implement non-assignment splats by doing a calculation on arr.length rather than making a slice call, e.g.

    [a, null..., b, c] = arr

would compile to something like

    __len = arr.length;
    a = arr[0], b = arr[__len - 2], c = arr[__len - 1];
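
As a rough illustration of why this matters, the following JavaScript sketch contrasts a slice-based version, which builds a throwaway middle array, with the length-based indexing proposed above. Both function names are hypothetical, and neither is actual compiler output:

```javascript
// Hypothetical helpers contrasting the two compilation strategies.
function pickWithSlice(arr) {
  const a = arr[0];
  // The skipped middle is materialized as a new array and never read.
  const middle = arr.slice(1, arr.length - 2);
  const b = arr[arr.length - 2];
  const c = arr[arr.length - 1];
  return [a, b, c];
}

function pickWithLength(arr) {
  // Same three reads, but the middle is skipped by index arithmetic alone.
  const len = arr.length;
  return [arr[0], arr[len - 2], arr[len - 1]];
}

console.log(pickWithLength([1, 2, 3, 4, 5]));
```

The two functions return the same values for any input; the second simply avoids the allocation and copying done by `slice`.
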
@michaelficarra
Collaborator

It's more a case of allowing null and special-casing it so that it skips an index and doesn't get assigned. The [] syntax will probably always work as long as we have destructuring assignment. Though we can optimize it so that the reference to the value it's skipping is not output.

@odf

I think I'd probably prefer using the [] over the null, since it stands out more. But assignment to null as a no-op makes sense as well.

@michaelficarra
Collaborator

Yeah, they both work. It kinda makes me uneasy because it does look like we are attempting to assign a value to null, which should cause an error, but once one understands that it's supposed to be a no-op, it seems alright. [] is perfectly okay, though. And it already fits with current semantics. I'd be okay with either one.

@TrevorBurnham
Collaborator

I still think ? is a bit clearer than null or []. As a newcomer to CoffeeScript, if I see

[null, second] = arr

then I'm wondering why there's an attempted assignment to null; and if I see

[[], second] = arr

then I'm wondering if there's some kind of fancy nested pattern-matching going on. Of course, ? is hardly self-explanatory, and perhaps it is overused...

Maybe void would be better than null, since it's an invalid keyword everywhere else in CoffeeScript:

[void, second] = arr

After all, it seems a little odd to allow [null, second] = arr but not [undefined, second] = arr or [false, second] = arr or [0, second] = arr; what makes null so special?

@ghost

Consider my one vote for using a period instead of a ? or null. A splat ... symbol does a great job indicating a bunch of "unknown" things and a single period indicates one "unknown" thing. The period is visually very small which makes it somewhat like the use of nothing. So I recommend

[a, ., b, c] = array

We already use periods for this function so why not stay with the period for the particular case of one item.

@sethaurus

The destructuring assignment syntax is already one of the more punctuation-heavy parts of the language ([, ,, ], =). If we use special characters to indicate a placeholder, we run the risk of making the whole construct visually confusing. I think a keyword is clearer, and I like TrevorBurnham's suggestion of using void.

@satyr
Collaborator

void as a placeholder makes sense. I'll probably make it work.

@jashkenas
Owner

I don't think that having a value-skipping syntax is a good idea, unless implemented consistently across the language ... i.e., in function signatures, and arrays as well as destructuring assignment, and if it can be used in destructuring assignment, it should be usable in regular assignment as well.

So, let's stick to the destructuring syntax we already have (as of 4ce374b):

[first, [], third] = list

Into:

var first, third;
first = list[0], third = list[2];

... an empty array or object will do. Personally, I'm going to continue to name the variables.

@michaelficarra
Collaborator

I have reverted jashkenas' commit 4ce374b above, as its maloptimization was the cause of issues #1103 and #1274. I am opening up this issue again so someone can make a proper optimization (or maybe even implement the [a, ..., b] = c or [null, a] = b syntaxes).

@michaelficarra michaelficarra reopened this
@jashkenas
Owner

Thanks for the revert.

@geraldalewis geraldalewis pushed a commit to geraldalewis/coffee-script that referenced this issue
@jashkenas Issue #870 ... placeholders in destructuring assignment. 4ce374b
@jashkenas
Owner

This all seems to be sorted out now on master.

list = [1..100]

[first, [], third] = list

console.log first, third

[first, []..., last] = list

console.log first, last

Produces...

1 3
1 100
@jashkenas jashkenas closed this