# Discussion: Range operator #198

Open
opened this Issue Feb 27, 2017 · 78 comments

Projects
None yet
Contributor

### HaloFour commented Feb 27, 2017 • edited by gafter

[Editor's note: this is a discussion regarding championed proposal #185. -@gafter]

### Summary

I propose a general purpose range operator which can be used to obtain a range a integral values. This range can then be used in loops, patterns or slices:

```var slice = source[5..10];
foreach (var i in 5..10) {
Console.WriteLine(i);
}```

### Questions

1. Can a range expression be assigned directly to a variable? If so, what is the type of that expression?
`var range = 5..10;  // int[]?  IEnumerable<int>?  IReadOnlyList<int>?  Range<int>?  None of the above?`
2. What exact syntax should be supported? There are numerous examples found in other languages.
```var perl = source[5..10];
var swift = source[5...10];
var basic = source[5 to 10];
var python = source[5:10];```
3. Should the upper boundary be inclusive or exclusive? Should the syntax of the operator support explicit inclusivity?
```var range1 = 5...10;  // 5, 6, 7, 8, 9, 10
var swift = 5..<10;  // 5, 6, 7, 8, 9
var perl = 5^..10; // 6, 7, 8, 9, 10```
4. Should the syntax include the ability to specify a step or a stride?
`var stride = 5:10:2; // 5, 7, 9`
5. Should the syntax include specifying a range relative to an anchor, such as the end of an array/collection?
```var source = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var slice1 = source[-3:-1]; // { 8, 9, 10 }```
6. Should the range syntax be supported in pattern matching against integral (or comparable) types?
```int value = CalculateSomeValue();
if (value is 10..50) {
Console.WriteLine("The value is somewhere between 10 and 50!");
}
Point p = CalculateSomePoint();
switch (p) {
case (x, -25:-25):
Console.WriteLine("The point is near the central axis!");
break;
}```
7. Should variables be permitted for boundaries of the range?
`var slice = source[start...end];`
8. Should the syntax allow for open-ended ranges, e.g. omitting either the start or the end boundary which indicates that the range would extend from the start or end of an enumerable? Can such a range be assigned to a variable or enumerated?
```var beginning = source[:10]; // grabs 0-10
var end = source[10:]; // grabs 10-last```
9. What happens if the start value is greater than the end value? Should that be different if the values are literals v. variables?
```var source = new[] { 1, 2, 3, 4, 5 };
var slice1 = source[4..2];  // compiler error? empty array? { 4, 3, 2 }?
var start = 4;
var end = 2;
var slice2 = source[start..end]; // ?```

1. Fixed `foreach` loop example
2. Added questions 7 and 8.
3. Added question 9 from #198 (comment)

Open

### orthoxerox commented Feb 27, 2017

 In my opinion: Can a range expression be assigned directly to a variable? If so, what is the type of that expression? Yes, `Range : IEnumerable` What exact syntax should be supported? There are numerous examples found in other languages. `e1..e2` Should the upper boundary be inclusive or exclusive? Should the syntax of the operator support explicit inclusivity? Inclusive. `..<` can be the exclusive operator. Should the syntax include the ability to specify a step or a stride? No. Should the syntax include specifying a range relative to an anchor, such as the end of an array/collection? Yes, if negative indexing is added in general. Should the range syntax be supported in pattern matching against integral (or comparable) types? Yes. And my own suggestion: range operator should be applicable to all types that support `IComparable` (or `SComparable`) and `++`/`--` operators (`SSuccessive`?).
Contributor

### svick commented Feb 27, 2017

 Previous discussion: dotnet/roslyn#7632. `for (var i in 5..10)` Should this have been `foreach`?
Contributor

### alrz commented Feb 27, 2017

 Taking a page from Rust, there could be also a syntax for "open" ranges, `..5`, `,5..`, etc.

### gordanr commented Feb 27, 2017

 There is very nice notation in mathematics for open and closed intervals. ``````[3..5) [6, 12] (8, 12) (9..11] `````` Unfortunately square and round brackets are not suitable for C# syntax.
Contributor

### svick commented Feb 27, 2017

 @alrz What happens when you iterate left-open range? If I understand it correctly, in Rust, both kinds of open ranges are separate types and left-open range is not iterable.
Contributor

### alrz commented Feb 27, 2017

 @svick I think so. That's particularly useful in slices and patterns.
Contributor

### HaloFour commented Feb 27, 2017

 @svick Yes, thanks for catching that. Will fix. @orthoxerox I tend to agree. What's more this `Range` struct can implement the enumerable pattern with a struct enumerator so that enumerating it should be more efficient. The compiler could take that a step further and unroll loops, turning an iterator variable into a constant.

### vbcodec commented Feb 27, 2017

 Do not think this is worthy feature. There is low demand for fixed integer ranges, and also for other types. Also, expanding such synetax for even less popular cases, is even more unjustified. Simply, too much resources for too small problems.
Contributor

### HaloFour commented Feb 27, 2017

 @vbcodec A range operator of sorts is likely given the championed proposal to add slicing to the language. Given such it is worthwhile to explore the syntax and capabilities of that operator both inside and outside of the context of slices. Fixed ranges of integers are still pretty common, and using a range operator to describe them makes sense. Apple even completely removed the classic C `for` loop from their Swift language in favor of enumerating over a range.

 For the case of variables (7), what should happen if `begin` is greater than `end`? Should this silently produce an empty enumeration or throw an exception?
Contributor

### HaloFour commented Feb 28, 2017

 @vladd Good question. I added it to the list of questions above in the discussion.

### Thaina commented Feb 28, 2017 • edited

 I would argue that we don't need it to be operator. Just extension method is enough ```public static class Ext { public IEnumerable To(this int start,int end) { var sign = Math.Sign(end - start); while(start != end) { yield return start; start += sign; } yield return start; } } foreach(var i in 3.To(5)) // (3..5) { }``` (ported from previous issue) Also we could easily make similar function for exclusive `3.ExTo(5)` // [3..5) `3.ToEx(5)` // (3..5] `3.Until(5)` // [3..5]
Contributor

### alrz commented Feb 28, 2017

 what should happen if begin is greater than end? Should this silently produce an empty enumeration or throw an exception? An empty range, and a warning if it's known at the compile time i.e. via constant bounds.

### Thaina commented Feb 28, 2017

 @alrz @vladd Why shouldn't we iterate in reverse order?
Contributor

### alrz commented Feb 28, 2017

 Because the default step is 1, and in slices it doesn't make sense.

### Thaina commented Feb 28, 2017

 @alrz I don't feel it weird to step 1 in any direction 5 to 3 mean 5 4 3 is also 1 step

### AdamSpeight2008 commented Feb 28, 2017

 There are other types that are `IComparable` ```var atoz = 'a'..'z' // char var daysinyear = JanFirst..DecThirtyFirst // DateTime```

### Thaina commented Feb 28, 2017 • edited

 @AdamSpeight2008 Actually it should work for all integer type. `char` included. But not for all `IComparable` because float would too ambiguous What would `0f.To(0.5f)` will do?

### AdamSpeight2008 commented Feb 28, 2017 • edited

 What would `0f.To(0.5f)` will do? yield 0f then done.

### Thaina commented Feb 28, 2017

 I'm quite sure that if that is inclusive range we would want both 0 and 0.5f

### AdamSpeight2008 commented Feb 28, 2017 • edited

 See VB LINQ Range for other concerning issues.

### Thaina commented Feb 28, 2017 • edited

 @alrz Also reverse slice is not uncommon I think. Sometimes I made round robin algorithm it involve slice first half match with second half in backward Normally I use LINQ ```var users = new User[14]; var left = users.Take(7); var right = users.Skip(7).Reverse(); var matches = left.Zip(right); ///Actually I do one line var matches = users.Take(7).Zip(users.Skip(7).Reverse());``` If it use slice ```var users = new User[14]; var left = users.Slice(0.To(7)); var right = users.Slice(13.To(8)); // users index 0=13,1=12,2=11,3=10,4=9,5=8,6=7 var matches = left.Zip(right); /// var matches = users[0.To(7)].Zip(users[13.To(8)]);```
Contributor

### alrz commented Feb 28, 2017

 I didn't say reverse range is not useful, but it should not go backwards if left-bound happen to be grather. It should be explicit with a -1 step value.

### Thaina commented Feb 28, 2017 • edited

 @alrz Maybe we would look at what mathematicians do If they write something like 9..3 are they assume negative step? partial sum `sum (i=9->1) i / 3` something like that?
Contributor

### alrz commented Feb 28, 2017

 When your range bounds are not constant, it will cause bugs and you'd have no idea what's going on e.g. `left..right` should this be conditionally go backwards? PS: In programming there is no such thing as "assumption" unless you Assert.

### GeirGrusom commented Feb 28, 2017 • edited

 @eyalsk Should the upper boundary be inclusive or exclusive? Should the syntax of the operator support explicit inclusivity? Should be exclusive and have an inclusive operator. Why? Why not just pick one and be done with it? Exclusive range: YAGNI.
Contributor

### eyalsk commented Feb 28, 2017 • edited

 @GeirGrusom Sure, it can probably be added later iff it's necessary.

 @GeirGrusom Exclusive range: YAGNI. Well, if both ends are included, then the simple iterating on the whole array would need explicit subtracting one from the length: `[0..arr.length-1]`. And given that having `end < start` might throw (we didn't reach the decision on this yet), inclusive range might need explicit checks for the case when `length == 0`. This would make the feature cumbersome to use.

### AdamSpeight2008 commented Jun 7, 2017

 In VB `For i = 0 To 10` is equivalent to `For i = 0 To 10 Step 1'. a step value of one (1) is likely what most want. If they want something other the default they include it `For i = 0 To 10 Step 0.1'.

### quinmars commented Jun 13, 2017 • edited

 I'd prefer if the `..` syntax produce an interval of the value type `Range where T : IComparable`. In general an interval is inumerable. For special types like `int` there could be an extension method `GetEnumerator()`. Other types such as `float` would have explicit extension methods to generate an `IEnumerable`, types like `string` wouldn't have any at all: ```foreach (var i in 5..10) { // do something } foreach (var f in 1.0 .. 3.14 .StepWidth(0.3)) { // do something }``` An interval can be very useful in many cases. Operations such as `IntersectsWith`, `IntersectionWith` could be useful for non numeric datatypes like 2D or 3D Points maybe even for `strings`: `var r = "A" .. "Henry" .IntersectionWith("Anton" .. "Z"); // r == "Anton" .. "Henry";` In my opinion, it would be a mistake to reduce ranges to enumerable sequences.

### Thaina commented Jun 13, 2017

 Really, I think we should satisfied with just extension method ```Span oneToTen = 1.To(10); Span floatStep = 0.1f.To(3.14f,0.3f);```

Open

### Richiban commented Oct 9, 2017

 @Thaina I'm not sure that's going to work too well, since you'd have to create loads of extension methods (or create an extension method on `object` or `T` which is not a good idea for the BCL). Introducing a new struct `Range` to the BCL with special syntax support (`a..b` => `new Range<_>(a, b)`) is very reasonable. Then changes & upgrades can be made to the Range class (such as when Shapes come along) and no one will have to change any code. The new `Range` type makes it easy to write your own type that accepts a range (e.g. in an indexer).

### ufcpp commented Oct 29, 2017

 I want "range pattern" especial in `case` statements. ```switch (x) { case 1..2: // ... }``` I recently implemented a Unicode grapheme breaking algorithm. In this library, a `switch` statement generated from GraphemeBreakProperty table has about 20 thousand lines. A `if` statement version of that is much shorter but 25 times slower than the `switch`. If C# have "range pattern", this `switch` becomes much shorter. ```// without range switch (codePoint) { case 1536: case 1537: case 1538: case 1539: case 1540: case 1541: // ... // with range switch (codePoint) { case 1536..1541: // ...```
Member

### jcouv commented Oct 30, 2017

 @ufcpp This code can already be simplified with `when` filters: ``` switch (x) { case int x1 when x1 >= 0 && x1 < 10: break; case int x2 when x2 >=10 && x2 < 20: break; }```

### orthoxerox commented Oct 30, 2017

 @jcouv but will it be compiled into a switch jump table?
Member

### jcouv commented Oct 30, 2017

 @orthoxerox I don't think that is a language design concern (the language spec does not specify such things), but rather an implementation concern. You can see the current decompiled for using the `when` filters here. It is straightforward (and probably not the most performance for @ufcpp's case). It could be optimized further in the future when the binding for `switch` learns that `x < 10` and `x >= 10` are exclusive. I had missed the benchmark for a linear `if` version from @ufcpp. For that scenario, manually implementing a binary-search `if` version would be faster than a linear `if` and while less verbose than fully expanded `switch`.

### ufcpp commented Oct 31, 2017

@jcouv
I've tried it.

Method Mean Error StdDev
ByBinarySearch 4.642 ms 0.0468 ms 0.0415 ms
BySwitch 3.825 ms 0.0473 ms 0.0442 ms
BySwitchWhen 115.310 ms 2.7970 ms 2.9928 ms
ByIf 142.658 ms 1.3744 ms 1.1477 ms
ByBinaryIf 2.502 ms 0.0147 ms 0.0137 ms

As you said, I should use a binary-search.
Any way, do you have any plant to optimize this kind of `case when` clause? (Is this roslyn's matter rather than cshaplang's?)

### Thaina commented Oct 31, 2017 • edited

 Great experiment! I wish in the future the compiler would convert all `switch` to `if-else-if`

### AdamSpeight2008 commented Oct 31, 2017

 @ufcpp Could you also get the IL size for each of those implementations?

### AdamSpeight2008 commented Oct 31, 2017

 @jcouv The operator `>=` is not the expected compliment to `<` (to the average user, yes) but weirdly on a language implementation details level it's `>`.
Member

### jcouv commented Oct 31, 2017

 @ufcpp Fantastic data points. Thanks! Glad to see the binary search approach does work. The code is harder to read than I expected though. I feel like code generation is probably a good fit in case the switch needs to be updated (I suspect that's what you did to test it out). Yes, such optimizations are more suitable for roslyn repo. Please do file an issue with the experiments you collected. Another data point that would be awesome is the typical size of switches (how often do people need massive switches like the one you describe?). I'll ask around, as I think we can analyze github repos as examples. As far as planning, there is a significant rework for patterns (to support recursive patterns) which is starting. Optimizations based on completeness and exclusivity would probably come after that.

### ufcpp commented Oct 31, 2017

build script

Debug (`/o-`) build:

Length Name
74240 BinarySearchIf.dll
54272 LinearIf.dll
140800 Switch.dll
48128 SwitchWhen.dll

Release (`/o+`) build:

Length Name
31232 BinarySearchIf.dll
24064 LinearIf.dll
140800 Switch.dll
38912 SwitchWhen.dll

### ufcpp commented Oct 31, 2017

 @jcouv I'll post an issue to Roslyn in a few days. Yes, these codes are code-generated. The generator is here: https://github.com/ufcpp/GraphemeSplitter/blob/master/GraphemeBreakPropertyCodeGenerator/Program.cs

### ufcpp commented Oct 31, 2017

 In addition, it requires 50 seconds to compile the 20 thousand-line switch-case. (My PC is Core i7-4790 3.60GHz.)

Open

Open

Open

Open

Open

### Starwer commented Apr 3, 2018 • edited

 My vote... I pretty much agree with @orthoxerox. 1.Can a range expression be assigned directly to a variable? If so, what is the type of that expression? Yes, Range : IEnumerable Agree: Let's not reuse the `LINQ.Range`. This is limiting the range ;) of application of the operator. And I fear performance impact on the length calculation: `a..b --> new Range(a, b - a + 1)` 2.What exact syntax should be supported? There are numerous examples found in other languages. e1..e2 Agree again: Perl-style is clear and doesn't conflict with any existing C# notation. 3.Should the upper boundary be inclusive or exclusive? Should the syntax of the operator support explicit inclusivity? Inclusive. ..< can be the exclusive operator. Partly agree: I'm not too enthousiastic about the exclusive operator. Maybe again something hard to handle in optimization. 4.Should the syntax include the ability to specify a step or a stride? No. Agree: That makes the syntax more fishy...but we can make this Range constructor accept a step or stride: `var even = new Range(0, 10, 2);` 5.Should the syntax include specifying a range relative to an anchor, such as the end of an array/collection? Yes, if negative indexing is added in general. This time: Not for me. Again a performance issue. In case the range is not constant, we must check if the boundaries values are positive, yield perfomance impact. 6.Should the range syntax be supported in pattern matching against integral (or comparable) types? Yes. This syntax would be nice: ```// exit the awkward C-stype for-loop: foreach (int i in 0..10) { ... } // Let's keep it consistent in if: if (myvar in 0..10) { ... } // Let make it in switch/case: switch (myvar) { case 0..10: {...} break; }``` 7.Should variables be permitted for boundaries of the range? Yes. definitly 8.Should the syntax allow for open-ended ranges Would be nice. Although I don't think we could map it to the same Range type then as defined as in 1. (performance, again). 9.What happens if the start value is greater than the end value? This should result in an empty range. Again, this choice is driven for a performance concern. I believe those two expressions shall be equivalent, also performance-wise, no matter what (strange) value the variables `start` and `end` may contain: ```for (int i=start; i <= end; i++) { ... } foreach (int i in start..end) { ... }``` Why do I care so much about performance ? I really want the lazy/readable/sweet code to be as efficient as the verbose ancestral way to do it. Otherwise... It will just be a feature for dummies... IMHO, I think the ultimate goal of this is to be able to write such a code, with the help of `Span` and `this[Range]` overload on `ICollection`, wihtout penalty on performance: ```var myvar = new mytype[10]; ... myvar[2..4] = othervar[1..3];```

 @Starwer I believe those two expressions shall be equivalent, also performance-wise, no matter what (strange) value the variables start and end may contain: ``````for (int i=start; i <= end; i++) { ... } foreach (int i in start..end) { ... } `````` Should this be true even for `end == int.MaxValue`? That is, should ``````foreach (int i in (int.MaxValue-1)..int.MaxValue) { ... } `````` be an infinite loop?

### Starwer commented Apr 4, 2018

 @vladd: I agree it's not ideal, but I still prefer that solution than adding a boundary-safe condition that would grow with a O(n) complexity with the loop, just in case we hit the int.MaxValue. We already accept that the arithmetic around the corners requires user-awareness for the even simpler expressions like: `var i = int.MaxValue; i++;` I consider this is still a scarce corner case, compared to the empty-range which is a frequent case. You seem to have this awareness. So honestly, if you would know that the use of `foreach`/Range is more expensive than a simple for-loop because it is safer, would you ever use it ?

### TheRWe commented Jun 5, 2018

 Should the upper boundary be inclusive or exclusive? Should the syntax of the operator support explicit inclusivity? The best way is to have upper bound exclusive. examples: exclusive ```var a = new int[] {/*1,2,3,4,5*/} var len = a.Length var al = a[ .. len/2] var ah = a[ len/2 .. ] // al.Concat(ah) = a var aWithoutLast = a [ .. len-1 ]``` inclusive (you need to substract 1 all the time) ```var a = new int[] { } var len = a.Length var al = a[ .. len/2-1 ] var ah = a[ len/2 .. len-1 ] // al.Concat(ah) = a var aWithoutLast = a [..len-2]```

Open

### james-uffindell-granta commented Jun 25, 2018

 Regarding using ranges in patterns, I think I'd prefer an explicit 'in' pattern: ``````var input = 9; switch (input) { case in 0..10: // code } `````` All our other patterns (const pattern, type pattern, var pattern) are all things which are the same type as the input, and we're doing an 'is' check on those. Using a bare range makes it look superficially like a const pattern, but it acts like a completely new type of pattern (doing an 'in' check instead of an 'is' check). (I think this would also be clearer if we ever wanted to extend the switch pattern in future, to use any (non-const) collection object, instead of just a range literal: ``````switch (userId) { case in currentlyAssignedIds: // code `````` whereas I think just `case currentlyAssignedIds:` would require a bit extra mental overhead to interpret correctly.) Also, using a bare range could make switching on a range confusing: ``````var range = 0..10; switch (range) { case 0..10: // code } `````` since there presumably the case pattern should be interpreted as a const pattern, rather than a range pattern? Seeing how patterns can be used in if statements too, this would give us `if (input is in 0..10)`, rather than `if (input in 0..10)`. This is slightly more verbose, but only by three characters, and to me it feels a bit more consistent with the fact that all other pattern-based if statements use 'is'.

Closed

Open