Ambiguity in slicing with Ranges / WhateverCodes #50

Closed
lizmat opened this issue Jun 25, 2019 · 5 comments
Labels
language Changes to the Raku Programming Language

Comments

@lizmat
Collaborator

lizmat commented Jun 25, 2019

On the surface, these two pieces of code do exactly the same thing:

$ perl6 -e 'my @a = ^10; dd @a[^Inf]'
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

$ perl6 -e 'my @a = ^10; dd @a[^*]'
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

But under the hood, they do very different things.

In the first case, the slice is produced from a Range 0..^Inf, so it will just produce values for the slice until the source is exhausted.

In the second case, the slice is produced from a WhateverCode:

$ perl6 -e 'my $a = ^*; dd $a.^name; dd $a(10).list'
"WhateverCode"
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Specifically, ^* codegens as { ^$_ }, which is effectively the same as ^(*-0).

So, what does one need to do if one wants to have a slice of all but the last value of an Iterable? Well, this does not do the right thing:

$ perl6 -e 'my @a = ^10; dd @a[^*]'
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

because that's the equivalent of doing:

$ perl6 -e 'my @a = ^10; dd @a[^10]'
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

The next thing one could try is the *-1 syntax:

$ perl6 -e 'my @a = ^10; say @a[^*-1]'
Effective index out of range. Is: -1, should be in 0..^Inf

This does not work, because:

$ perl6 -e 'dd (^*-1)(10)'
-1..^9

To make this work, one needs to use parentheses:

$ perl6 -e 'my @a = ^10; dd @a[^(*-1)]'
(0, 1, 2, 3, 4, 5, 6, 7, 8)
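
To spell out the precedence point: prefix ^ binds tighter than infix -, so ^*-1 parses as (^*) - 1 and shifts both endpoints of the generated Range. A minimal sketch, using only the constructs already shown in this thread (the variable @a is just the example array from above):

```raku
my @a = ^10;

# Prefix ^ binds tighter than infix -, so this is (^*) - 1:
# the WhateverCode builds 0..^10 and then shifts both endpoints down by one.
say (^*-1)(10);            # -1..^9

# With parentheses, the subtraction happens first and ^ is applied to the result.
say (^(*-1))(10).list;     # (0 1 2 3 4 5 6 7 8)

# Three spellings that slice off only the last element:
say @a[^(*-1)];            # (0 1 2 3 4 5 6 7 8)
say @a[0..*-2];            # (0 1 2 3 4 5 6 7 8)
say @a[0..^*-1];           # (0 1 2 3 4 5 6 7 8)
```

The infix forms work without parentheses because .. and ..^ are looser than -, so the arithmetic on * is grouped first.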

Issue rakudo/rakudo#3010 indicates that a warning would need to be in place. I'm not sure that that is the correct solution to this situation.

I think part of the underlying issue is the difference in codegen for:

$ perl6 -e 'dd 0..^*'
0..^Inf
$ perl6 -e 'dd ^*'
{ ... }

Perhaps we need to change the codegen for ^*. In any case, any changes here are part of potentially very hot code, so any additional checks will slow things down for all. In that vein, I also think that rakudo/rakudo@35b69f0 should probably be reverted.

@AlexDaniel AlexDaniel added the language Changes to the Raku Programming Language label Jun 25, 2019
@AlexDaniel
Member

AlexDaniel commented Jun 25, 2019

> So, what does one need to do if one wants to have a slice of all but the last value of an Iterable?

FWIW I think 0..*-2 is the most readable alternative (out of options that currently work).

@taboege
Member

taboege commented Jun 25, 2019

It's not clear to me if you had a potential solution in mind or what it would be. Changing the meaning of ^* from { ^$_ } to ^Inf would scratch an itch I got from reading your comment, but not change any result of slicing with it.

About the commit

> In that vein, I also think that rakudo/rakudo@35b69f0 should probably be reverted.

Which vein do you mean? Does it play on the ambiguity too much or do the changes slow things down? At least it does not introduce an additional check.

What I tried to do in that commit is remove the remaining observable difference in slicing with

  • the Range ^Inf,
  • the Range-lookalike WhateverCode ^*,
  • the definitely-WhateverCode 0..*-1 and
  • the WhateverCode-lookalike Range 0..^*.

Before, the last one would actually omit the last element in a slice and I still think this was just an off-by-one bug which was triggered in a corner case¹.

If I read * in a subscript, I'm thinking of .elems, and slicing with ^.elems will include the last element, like it did with the other three ways of writing superficially the same thing. If you want, the pun here is that it doesn't matter whether you substitute Inf or .elems for * in the expression ^*. Both values are sufficiently large.
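
For reference, the four spellings from the list above can be put side by side. This is only a sketch: the first three results follow from what is shown in this thread, while the result of the last one is exactly what the commit under discussion changes, so no output is claimed for it here.

```raku
my @a = ^10;

say (^Inf).^name;      # Range
say (^*).^name;        # WhateverCode
say (0..*-1).^name;    # WhateverCode
say (0..^*).^name;     # Range

say @a[^Inf];          # (0 1 2 3 4 5 6 7 8 9)
say @a[^*];            # (0 1 2 3 4 5 6 7 8 9)
say @a[0..*-1];        # (0 1 2 3 4 5 6 7 8 9)

# Whether this includes the last element depends on the Rakudo version,
# i.e. on whether rakudo/rakudo@35b69f0 (or an equivalent) is in effect.
say @a[0..^*];
```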

About the warning

> So, what does one need to do if one wants to have a slice of all but the last value of an Iterable?

I'd write verbosely

@a[-> $elems { 0 .. $elems-2 }]

which becomes

@a[0..*-2]   # up to the second-to-last element
@a[0..^*-1]  # up to and excluding the last element
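
A small sketch of why the verbose and the short forms are equivalent: a Callable in a subscript is invoked with the number of elements, and whatever it returns is then used as the index. The variable $seen is only there for illustration.

```raku
my @a = ^10;

# The block receives the element count; $seen records what was passed in.
my $seen;
say @a[-> $elems { $seen = $elems; 0 .. $elems - 2 }];   # (0 1 2 3 4 5 6 7 8)
say $seen;                                               # 10

# The WhateverCode forms are the same callable written more tersely.
say (0..*-2)(@a.elems);                                  # 0..8
say (0..^*-1)(@a.elems).list;                            # (0 1 2 3 4 5 6 7 8)
```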

In any case, it's going to be a WhateverCode because ^Inf is still an infinite range, so it shouldn't omit the last element; ^* should just behave the same. Based on that, slicing with an excludes-max infinite range is a thinko which is what the warning would scold.

BTW, I would also prefer @a[^*-1] but precedence forbids that without parens or a preceding 0.. as you showed. That's slightly weird but in a way which is irrelevant to slices.

Where is the problem?

As I understand it and as you said, the issue involves &infix:<..^> vs. &prefix:<^>: how one treats * as Inf and the other turns itself into a WhateverCode instead, as well as their precedence against arithmetic. These properties feel too far apart for sibling Range-making operators, but I'm sure that has been brought up and explained before.

On the other hand, I do not think that this is a problem with slicing. The only source of ambiguity I see here is the types being sent to the slicer, but you don't want to lose these possibilities, and after rakudo/rakudo@35b69f0 I haven't seen an example where the result is discontinuous when you pass from one type to another by a little change of syntax (as in the list above).

Another ambiguity

Well, except this one:

say (1..10)[0 .. *];   #= (1 2 3 4 5 6 7 8 9 10)
say (1..10)[0 .. *+0]; #= (1 2 3 4 5 6 7 8 9 10 Nil)

But the solution to that, IMHO, lies in documentation.
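
What happens in that pair, as far as I understand it, is the same type switch again: 0 .. * is an infinite Range that the slicer cuts off at the end of the list, while 0 .. *+0 is a WhateverCode that returns a finite Range with one index too many. A sketch that makes the difference visible:

```raku
say (0 .. *).^name;        # Range
say (0 .. *).infinite;     # True

say (0 .. *+0).^name;      # WhateverCode

# Called with 10 elements, the WhateverCode yields the finite Range 0..10,
# i.e. 11 indices, the last of which is past the end of a 10-element list.
my $r = (0 .. *+0)(10);
say $r;                    # 0..10
say $r.elems;              # 11
say $r.infinite;           # False
```

Writing 0 .. *-1 instead gives a finite Range with exactly as many indices as there are elements.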


¹ Namely

use nqp;
say nqp::eqaddr((0..Inf).max, Inf);  #= 1
say nqp::eqaddr((0..*).max, Inf);    #= 0

@AlexDaniel
Member

I agree with @taboege and I was also confused by the ticket, but didn't have time to write it down as nicely. @taboege++.

@lizmat
Collaborator Author

lizmat commented Jun 25, 2019

> It's not clear to me if you had a potential solution in mind or what it would be

I don't have a clear picture on what would be a solution.

I see two issues: one is that 0..^5 is the same as ^5, and 0..^Inf is the same as ^Inf, but that 0..^* is NOT the same as ^*. And from this all sorts of things start to become blurred. And rakudo/rakudo@35b69f0 is an attempt to fix one context of that discrepancy.
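
To make the first issue concrete, here is a sketch of the discrepancy using only the operators named above; it compares types and values rather than slicing results, since those depend on the Rakudo version.

```raku
# With a literal endpoint, the two spellings give the same Range.
say (0..^5) eqv ^5;        # True

# With *, infix ..^ treats it as Inf and still returns a Range ...
say (0..^*).^name;         # Range
say (0..^*).max;           # Inf

# ... while prefix ^ turns the whole expression into a WhateverCode.
say (^*).^name;            # WhateverCode
say (^*)(5) eqv ^5;        # True
```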

The other one is that:

use nqp;
say nqp::eqaddr((0..Inf).max, Inf);  #= 1
say nqp::eqaddr((0..*).max, Inf);    #= 0

There should not be a difference here. Fixing this would fix what rakudo/rakudo@35b69f0 is trying to fix, would it not?

@taboege
Member

taboege commented Jun 25, 2019

> 0..^* is NOT the same as ^*. And from this all sorts of things start to become blurred. And rakudo/rakudo@35b69f0 is an attempt to fix one context of that discrepancy.

Agree completely.

> There should not be a difference here. Fixing this would fix what rakudo/rakudo@35b69f0 is trying to fix, would it not?

No, it would only make the first of the two changes in the diff unnecessary. If that commit were reverted entirely and the nqp::eqaddr(…, Inf) check were fixed, then

  • 0..^* and 0..^Inf would return consistent results in slicing (thanks to the nqp::eqaddr being as consistent as … === Inf now – and maybe faster, I don't know) but what they would return is everything but the last element. Arguably not what you mean by 0..^Inf.
  • ^* not being an infinite Range in a slice would turn into 0..^.elems and thus return everything. Arguably what you mean, but then something still has to be done about the visible difference between ^* and 0..^* slices, where the current WhateverCode behaviour of ^* feels more DWIMy to me.

I would say the first half of the diff should be reverted as soon as nqp::eqaddr(…, Inf) works and is faster than … === Inf. The second half of the diff says 0..^Inf should return everything and not everything but the last element. Whether that half should go depends on a decision on intended semantics.

@lizmat lizmat closed this as completed May 26, 2020

4 participants