change switch range syntax to be more clear and perhaps also allow exclusive ranges #359

Closed
andrewrk opened this issue May 3, 2017 · 37 comments
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@andrewrk
Member

andrewrk commented May 3, 2017

Right now, .. is the only slice operator available, and it is exclusive. Meanwhile ... (1 extra dot) is the only switch range operator available, and it is inclusive.

I believe that the difference in exclusivity of each kind of expression is appropriate based on typical use cases, however the difference in syntax is subtle. It may be worth choosing more clear syntax to represent status quo, or perhaps adding the exclusive ability to switch statements.

Here's an example of a switch statement where exclusive ranges are better:

switch (rng.getRandomPercent()) {
    0...30 => std.debug.warn("Choice A\n"),
    30...70 => std.debug.warn("Choice B\n"),
    70...100 => std.debug.warn("Choice C\n"),
}

Right now this would give you an error because 30 and 70 are used twice. To fix it, the code would look like this:

switch (rng.getRandomPercent()) {
    0 ... 30 - 1 => std.debug.warn("Choice A\n"),
    30 ... 70 - 1 => std.debug.warn("Choice B\n"),
    70 ... 100 - 1 => std.debug.warn("Choice C\n"),
}

It's not so bad, especially considering the -1 happens at compile-time, but this is an example of where exclusive range is desired. Another example would be enum ranges. There is no reasonable way to do "enum value" minus 1. Another example would be if they were floats instead of integers. In this case -1 doesn't make sense and you absolutely need the exclusivity ability.

Here are some proposals:

  • Allow .. in switch as well as ... (three dots). This matches Perl - two dots is exclusive, three dots is inclusive.
  • Change .. slice syntax to :. This matches Python. Switch statements still have no exclusive range operator.
  • Change .. slice syntax to :, and allow : in switch statements as well, so that they have an exclusive range operator available.

If we have a for range syntax (See #358) then that should be taken into consideration as well.

@andrewrk andrewrk added the breaking Implementing this issue could cause existing code to no longer compile or have different behavior. label May 3, 2017
@andrewrk andrewrk added this to the 0.1.0 milestone May 3, 2017
@raulgrell
Contributor

raulgrell commented May 3, 2017

Since the for over a range is under consideration, I just want to think out loud a bit.... using the two different range operators allowed in both places:

var array: [3]u8 {0, 1, 2 }
array[0..0] == []u8{}
array[0..1] == []u8{0}
array[0...0] == []u8{0}
array[0...1] == []u8{0,1}

// With chars, the range being exclusive can get weird
switch (c) {
    'a'...'b' => {}, // inclusive
    'c'..'f', 'g' => {}, // exclusive (no f)
    'f'..'g' => {}, // OK (no f above, no g here)
}

// Ints are ultimately the same, but easier to reason about, I guess
switch (c) {
    1 .. 10 => {},
    10 .. 100 => {},
    100 .. 1000 => {},
}

// If you could do a switch on floats, inclusive would be weird
switch (f) {
    0.0 ... 1.0 => {}, // inclusive
    1.0 .. 2.0 => {}, // Not OK, 1.0 in two branches
    2.0 .. 3.0=> {}, // OK
}

If we're considering python style array slicing could we do negative indices:

var array: [5]u8 {0, 1, 2, 3, 4 }
array[0...] == []u8 {0, 1, 2, 3, 4 }
array[0..3] == []u8 {0, 1, 2 }
array[0...-1] == []u8 {0, 1, 2, 3, 4 }
array[0...-2] == []u8 {0, 1, 2, 3}
array[0..-1] == []u8 {0, 1, 2, 3 }

Could we iterate backwards?

array[-1...] == []u8 {4, 3, 2, 1, 0 }
array[5...0] == []u8 {4, 3, 2, 1, 0 }
array[5..0] == []u8 { 4, 3, 2, 1 }

And finally, could we specify a stride/step?

array[0... : 2] == []u8 {0, 2, 4}
array[0..3 : 2 ] == []u8 {0, 2 }
array[1..5 : 3 ] == []u8 {1,4}
array[0...5 : -1] == []u8 {4, 3, 2, 1, 0} // Would this be a better way to iterate backwards?

This only really makes sense if we're able to do the same thing for the for over a range:

var values = []u8 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var slice = values[0 ... 10 : 4];

for (0 ... 10 : 4 ) | x, i | {
    assert( x == slice[i] );
    printf("{}, {}, {}\n", x, i, slice[i]);
} 
>> 0, 0, 0
>> 4, 1, 4
>> 8, 2, 8

@andrewrk
Member Author

andrewrk commented May 3, 2017

I think having both .. and ... is reasonable, although I do want to encourage programmers to use the exclusive one for slices.

We couldn't do the backwards or stride, because a slice only creates a pointer and a length; it does not copy data around.

As for negative... it seems simpler to require a usize for the end argument of a slice and leave it to the user to compute indexes as an offset from the length.
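
A rough sketch of what "an offset from the length" could look like with the exclusive slice syntax, written against present-day Zig. The `withoutLast` helper is hypothetical and only illustrates the idea; it is not code from this thread.

```zig
const std = @import("std");

// Hypothetical helper: drop the last element by computing the end index
// from the length, instead of writing a Python-style negative index.
fn withoutLast(items: []const u8) []const u8 {
    // Guard the empty case so `items.len - 1` cannot underflow.
    if (items.len == 0) return items;
    return items[0 .. items.len - 1];
}

test "end index computed from the length" {
    const array = [_]u8{ 0, 1, 2, 3, 4 };
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0, 1, 2, 3 }, withoutLast(&array));
}
```

The end index stays an unsigned value, so no sign check is needed at the slicing site.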

@thejoshwolfe
Sponsor Contributor

Another problem with negative indexes is that if the compiler doesn't know at comptime if an index is positive or negative, it would have to emit a conditional branch, which sounds like a bad idea. If there was going to be a way to index backwards, it would need to be comptime unambiguous.

@raulgrell
Contributor

raulgrell commented May 4, 2017

Yeah, the python style slicing is not appropriate. Doing step/direction in the for wouldn't require the copying around as it would just be a while loop with a counter, but the point here is to make the syntaxes consistent and it makes things less simple not more... So yeah, disregard the above...

The only other thought on this subject I wanted to share is specifying a range not with start and end, but with start and number of elements...

ie:

var array: [5]u8 {0, 1, 2, 3, 4 }

// Exclusive
array[0..2] == []u8{0,1}

// Inclusive
array[0...2] == []u8{0,1, 2}

// Range
array[0 : 0] == []u8{}
array[0 : 2] == []u8{0, 1}
array[2 : 2] == []u8{2, 3}

// Each in the for
for (0 .. 2 ) | x, i | { }  // 0, 1
for (0 ... 2) | x, i | { }  // 0, 1, 2 
for  (2 : 2)  | x, i | { }  // 2, 3 

// Each in the switch
switch (c) {
    'a'...'b' => {}, // inclusive
    'c'..'A' => {}, // exclusive (no A)
    'A':26 => {}, // Range - All capital letters
}

@andrewrk
Member Author

andrewrk commented May 4, 2017

I think it's reasonable to want to have a start and a length rather than start and end. But I think there's value in the language having a single convention.
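
For what it's worth, a start-and-length slice can already be spelled with the single exclusive convention. The sketch below assumes present-day Zig; the `window` and `windowAlt` helpers are hypothetical names used only for illustration.

```zig
const std = @import("std");

// Hypothetical helper: take `len` elements starting at `start`,
// using only the exclusive `..` slice syntax.
fn window(items: []const u8, start: usize, len: usize) []const u8 {
    // Slice from `start` to the end, then keep the first `len` elements.
    return items[start..][0..len];
}

// Equivalent spelling with an explicit end index.
fn windowAlt(items: []const u8, start: usize, len: usize) []const u8 {
    return items[start .. start + len];
}

test "start and length via exclusive slicing" {
    const array = [_]u8{ 0, 1, 2, 3, 4 };
    try std.testing.expectEqualSlices(u8, &[_]u8{ 2, 3 }, window(&array, 2, 2));
    try std.testing.expectEqualSlices(u8, window(&array, 2, 2), windowAlt(&array, 2, 2));
}
```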

@raulgrell
Contributor

Yeah, this is purely syntactic sugar and completely unnecessary.

Consider the following:

fn printRange(a, b) {
    for (a ... b) | x, i | { }  // Fails if a > b
    for  (arr[a ... b])  | x, i | { }  // Fails if a > b OR b > arr.len OR a > arr.len
}

fn printN(a, n) {
    for  (a : n)  | x, i | { }  // Can't fail
    for  (arr[a : n])  | x, i | { }  // Fails if n > arr.len OR a + n > arr.len
}

@andrewrk
Member Author

andrewrk commented May 4, 2017

I see your point here, but I question this assertion:

for (a ... b) | x, i | { }  // Fails if a > b

I think this would simply iterate 0 times, the same way that this would:

var i: usize = 100;
while (i < 10; i += 1) {}

As for the other one:

for  (arr[a ... b])  | x, i | { }  // Fails if a > b OR b > arr.len OR a > arr.len

Because of the transitive property, we only have to compare a <= b and then b <= arr.len.
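
A minimal sketch of that check, assuming the slice has an exclusive upper bound. `sliceInBounds` is a hypothetical helper for illustration, not the compiler's actual safety check.

```zig
const std = @import("std");

// Hypothetical bounds check for an exclusive slice `arr[a..b]`.
// Two comparisons are enough: a <= b and b <= len together imply a <= len.
fn sliceInBounds(len: usize, a: usize, b: usize) bool {
    return a <= b and b <= len;
}

test "two comparisons cover all three failure cases" {
    try std.testing.expect(sliceInBounds(5, 0, 5));
    try std.testing.expect(sliceInBounds(5, 5, 5)); // empty slice at the end
    try std.testing.expect(!sliceInBounds(5, 3, 2)); // a > b
    try std.testing.expect(!sliceInBounds(5, 0, 6)); // b > len
    try std.testing.expect(!sliceInBounds(5, 6, 7)); // a > len, caught via b > len
}
```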

@raulgrell
Contributor

raulgrell commented May 4, 2017

I think this would simply iterate 0 times, the same way that this would:

var i: usize = 100;
while (i < 10; i += 1)

Good point, sounds reasonable. You'd need to check the second case anyway.

andrewrk added a commit that referenced this issue May 19, 2017
@andrewrk
Member Author

Now slicing syntax is .. instead of ... (three dots). So the syntax is at least not misleading.

@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. and removed breaking Implementing this issue could cause existing code to no longer compile or have different behavior. labels May 19, 2017
@andrewrk andrewrk modified the milestones: 0.2.0, 0.1.0 May 19, 2017
@andrewrk andrewrk changed the title consistent slicing/range syntax allow slicing and switch case ranges with both .. and ... syntax May 19, 2017
@raulgrell
Contributor

Looks good!

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018
@skyfex

skyfex commented Jul 30, 2018

It should be said that this syntax is the exact opposite of what Ruby does: https://ruby-doc.org/core-2.1.5/Range.html

Not that Ruby should dictate Zig, but it's very unfortunate.

But the syntax is not very clear, though. I think it's hard to see the difference between the two.

I had a related comment here: #358 (comment)

@bheads

bheads commented Jul 31, 2018

I think having both .. and ... will lead to lots of bugs..

@andrewrk andrewrk added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. and removed enhancement Solving this issue will likely involve adding new logic or components to the codebase. labels Jul 31, 2018
@andrewrk
Member Author

I updated the OP to clear up confusion.

Ruby's syntax is nuts. How could more dots mean less numbers in the range? The mnemonic is completely backwards!

@skyfex

skyfex commented Aug 2, 2018

I like @thejoshwolfe's suggestion. The a .. b syntax could be generalized. It could be syntactic sugar for Range {.from=a, .to=b}, or some kind of special built-in tuple. But this doesn't make sense if switch uses the same syntax. Like he says, switching is doing a comparison operation, not actually iterating over a range. I think it makes a lot of sense to make those operators look more like comparison operators.

This also resolves the question of whether a .. b is inclusive or exclusive. Then a and b are just two numbers really, and it should be considered obvious that arr[a..b] is exclusive on b. If it's used elsewhere, it should be made sure that it's obvious from context as well.

Switch on floats could be nice. I think only a <..< b is safe in that case. Equality on float is tricky. But if x <= b is allowed on floats then a <..<= b should be too.

Maybe it looks a bit ugly to some, and a few more characters to type, but it's easier to read unambiguously

@andrewrk andrewrk modified the milestones: 0.4.0, 0.5.0 Sep 28, 2018
@daurnimator
Contributor

Yesterday a different switch range use case came up: I wanted to switch on type and have a case for i0...i63 and then a different one for i64...i65535

@Rocknest
Contributor

Rocknest commented Apr 12, 2019

Yesterday a different switch range use case came up: I wanted to switch on type and have a case for i0...i63 and then a different one for i64...i65535

@daurnimator you already can switch on size of integers, just like that:

switch (@typeInfo(arg).Int.bits) {
    0...63 => //
    64...65535 => //
}

@Rocknest
Contributor

Rocknest commented Apr 14, 2019

I propose this syntax:

switch (c) {
    5 -> 10 => {}, // exclusive, another variant: a ~~ b
    'a' ->+ 'z' => {}, // inclusive, another variant: a ~~+ b
}

Proposal for range syntax

@Srekel

Srekel commented Jan 6, 2020

A tiny suggestion: If it's decided that both .. and ... are allowed in some context, maybe it's better to have .. and .... (four dots).

I feel like the difference between two and three dots is small enough that there will be hard-to-find typo bugs, similar to the classic if (mybool);

Four dots would stand out clearly.

@momumi
Contributor

momumi commented Jan 9, 2020

if we want to support switching on floats, we need to support exclusive lower bound too.

In python you can write 1 < x <= 20 which translates to (1 < x) and (x <= 20) (except x is only evaluated once). So you can write this:

if (0 <= x <= 10) {
    // ..
} else if (10 < x <= 20) {
    // ..
} else if (x > 20) {
    // ..
}

Which is very intuitive to understand. It's more flexible too because you can use it outside of switch statements as well. Also, in python you can chain more than one expression: ie 0 < x < y < 100 becomes (0 < x) and (x < y) and (y < 100). The expression a == b == c == d becomes (a == b) and (b == c) and (c == d) etc.
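
For comparison, a sketch of the same three ranges in present-day Zig, which as far as I know does not accept chained comparisons, so each bound is written out with `and`. The `Bucket` enum and `bucket` function are hypothetical names for illustration only.

```zig
const std = @import("std");

// Hypothetical classification mirroring the Python-style chain above.
const Bucket = enum { low, mid, high, none };

fn bucket(x: f64) Bucket {
    if (0.0 <= x and x <= 10.0) return .low; // 0 <= x <= 10
    if (10.0 < x and x <= 20.0) return .mid; // 10 < x <= 20
    if (x > 20.0) return .high;
    return .none; // negative values and NaN match no range
}

test "explicit lower and upper bounds on floats" {
    try std.testing.expect(bucket(10.0) == .low);
    try std.testing.expect(bucket(10.5) == .mid);
    try std.testing.expect(bucket(20.5) == .high);
    try std.testing.expect(bucket(-1.0) == .none);
}
```

Writing both bounds explicitly is what makes the exclusive lower bound (`10.0 < x`) expressible, which a range operator with only an upper-bound variant cannot do.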

@adontz

adontz commented Feb 23, 2020

I think it is very important to remember, that range bounds may be constants defined somewhere else, so all this +1/-1 may just confuse and make code less readable.

Adding my 2 cents to @thejoshwolfe proposal.

switch (x) {
     0<= ... <  5 => {}, // 0, 1, 2, 3, 4
     5<= ... <=10 => {}, // 5, 6, 7, 8, 9, 10
    10<  ... < 15 => {}, // 11, 12, 13, 14
    14<  ... <=20 => {}, // 15, 16, 17, 18, 19, 20
}

maybe even

switch (getRandom(0, 20)) {
     0<= |x| <  5 => {}, // 0, 1, 2, 3, 4
     5<= |x| <=10 => {}, // 5, 6, 7, 8, 9, 10
    10<  |x| < 15 => {}, // 11, 12, 13, 14
    14<  |x| <=20 => {}, // 15, 16, 17, 18, 19, 20
}

@gingerBill

In Odin, there were many options to go for if I wanted to unify slicing operations and ranges. However, I decided not to unify them and to keep them as different concepts, because they are fundamentally different. The act of slicing is different from indexing with a range; you can treat them as if they were the same, but they are actually different things conceptually.

array[lo:hi] // slicing syntax, [lo, hi)
case a ..< b:  // range syntax [a, b)
case a ..  b:  // range syntax [a, b]

If you wanted to unify these concepts, these are the possible solutions:

a .. b
a ... b

a .. b   or a ... b
a ..= b

a ..< b
a .. b or a ... b

The first approach is the most confusing for two reasons, the things are not that distinct in their appearance and they can have the opposite meanings in different languages e.g. Ruby vs Rust.

For Odin I settled on the third approach because it's probably the clearest view in my opinion.

@ManDeJan

I also wanted to give my 2 cents, I like how Raku handles this: https://docs.raku.org/type/Range
adding a caret to either side of the .. that indicates that the point marked with it is excluded from the range.

switch (x) {
     0  ..^  5 => {}, // 0, 1, 2, 3, 4
     5  ..  10 => {}, // 5, 6, 7, 8, 9, 10
    10 ^..^ 15 => {}, // 11, 12, 13, 14
    14 ^..  20 => {}, // 15, 16, 17, 18, 19, 20
}

@gingerBill

@ManDeJan In Nim, the caret is used as shorthand for "from the end".

This is the problem with choosing syntax. Every other language chooses it differently.

@jakwings

My proposal?

elements[.[a, b)]
elements[.[a, b]]
elements[.(a, b]]
elements[.(a, b)]

switch (x) {
    .[ 0,  5) => {}, // 0, 1, 2, 3, 4
    .[ 5, 10] => {}, // 5, 6, 7, 8, 9, 10
    .(10, 15) => {}, // 11, 12, 13, 14
    .(14, 20] => {}, // 15, 16, 17, 18, 19, 20
}

my feeling now: (>_<) oh sissy me, why not just use .. and ...? off-by-one errors are not new and will never be old, you can also mistype a < b and get a <= b instead.

so my real proposal:

a .. b   // exclusive
a ... b  // inclusive

never mind those crazy ideas about float range, enum range and enum-indexed array/tuple...

@Mouvedia

I am against using : because it's currently used for sentinel elements.
That would be confusing.

@Mouvedia

Mouvedia commented Jan 21, 2021

Can we have a reason as to why this has been closed?

@thejoshwolfe
Sponsor Contributor

status quo:

  • slicing uses arr[a..b] and has exclusive upper bound. there doesn't seem to be a compelling use case for inclusive upper bound slicing.
  • switch case ranges use a ... b => and have inclusive upper bounds. there doesn't seem to be a compelling use case for exclusive case ranges.

the two symbols are different looking .. vs ..., so there's no problem with inconsistency. confusion will always be a subjective matter, but status quo is not horribly confusing at least.
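
a small sketch of the two conventions side by side, assuming present-day Zig. the `choice` function is a hypothetical stand-in for the percent example from the original post.

```zig
const std = @import("std");

// Hypothetical stand-in for the percent example: switch case ranges
// use `...` and include both endpoints.
fn choice(percent: u8) []const u8 {
    return switch (percent) {
        0...29 => "Choice A",
        30...69 => "Choice B",
        70...99 => "Choice C",
        else => "out of range",
    };
}

test "slices exclude the upper bound, case ranges include it" {
    const array = [_]u8{ 0, 1, 2, 3, 4 };
    // `..` in a slice: index 3 is not part of the result.
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0, 1, 2 }, array[0..3]);
    // `...` in a switch: 29 still belongs to the first case.
    try std.testing.expectEqualStrings("Choice A", choice(29));
    try std.testing.expectEqualStrings("Choice B", choice(30));
}
```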

closing this issue doesn't preclude the possibility of changes, but the wording of this issue's title and the lack of concrete proposal here mean that this discussion is not actionable. if a change is to be considered, it should be a separate issue with a concrete proposal.

@jedisct1
Contributor

I quite agree that the status quo is fine. Having both .. and ... for slices may be more confusing than useful in actual applications. And it is certainly not required.

Using brackets for intervals would look neat, but once again:

.(10, 15) => {}, // 11, 12, 13, 14

I don't see any reason to ever use that over 11...14.

@gonzus
Contributor

gonzus commented Feb 18, 2021

I think it is actually more complicated to have to remember the rules ".. is for slices and excludes the upper bound" and "... is for switches and includes the upper bound" than to just have both of them operate consistently in all cases: .. excludes the upper bound, ... includes the upper bound.

It is unfortunate that there already exist inconsistencies in how other languages assign the semantics, but again, I would personally much rather remember which is which, and use any of them as I see fit, than remember only one is for slices and only one is for switches.

Summary: please take this as my personal opinion that this issue should be reopened. Cheers!

@Rocknest
Contributor

I have been appreciating Zig's preference for keywords over operators (especially control flow keywords). Operators may be heavily overloaded with incompatible meanings in different programming languages, but keywords are less likely to cause confusion. So I have come up with this idea: x upto y for exclusive switch ranges and x uptoand y for inclusive switch ranges. I think the slice syntax is fine as it is.

switch (rng.getRandomPercent()) {
    0 upto 30 => std.debug.warn("Choice A"), // 0-29
    30 upto 70 => std.debug.warn("Choice B"), // 30-69
    70 upto 100 => std.debug.warn("Choice C"), // 70-99
}

switch (c) {
    'a' upto 'c' => {}, // a, b
    'c' uptoand 'e' => {}, // c, d, e
}
