Add a range/step/seq function #1039

Closed
alco opened this Issue May 8, 2013 · 30 comments

Projects

None yet

5 participants

@alco
Member
alco commented May 8, 2013

Something akin to Python's range could make a nice addition to the language as a library or builtin function.

>>> range(0, 8, 2)
[0, 2, 4, 6]
>>> range(3, 5, -1)
[]
>>> range(-3, -5, -1)
[-3, -4]

It would produce an iterable, not a list.

Related discussion on the mailing list https://groups.google.com/forum/?fromgroups=#!topic/elixir-lang-core/HpBIV5zMAHw , continued in the issue #1017.

Both step and seq are OK names. Altough, as another alternative, we could rename current ranges to intervals (it makes just as much sense to talk about inclusiveness and non-inclusiveness of intervals, aka closed intervals and open intervals) and call the new function range.

Owner

I don't see yet a huge motivation for renaming our ranges to intervals. :)
However, it would be useful to support sequences. Here is an API I propose:

# seq(first, number, step // 1)

seq(0, 5) #=> [0, 1, 2, 3, 4]
seq(0, 0) #=> [0]
seq(0, 5, 2) #=> [0, 2, 4, 6, 8]

Notice that we can get an "excluding end range" like with:

seq(first, last-first)

The goal is to not have first class syntax for sequences as they won't be supported in guard clauses.

Member
alco commented May 17, 2013

There has also been a proposal on IRC:

02:30:34 antifuchs: I'd like to suggest non-syntax-based things now :)
02:31:33 antifuchs: range(above: 5, upto: 6)
02:31:43 antifuchs: range(from: 5, below: 6)
...
02:36:39 nickmeharry: To be clear, from: and upto: are inclusive, while above: and below: are not?
02:37:07 nickmeharry: So those two examples above are [6] and [5].
02:37:26 antifuchs: I think so, yeah :)
02:37:45 antifuchs: other, clearer, names would work too :)

I like this one because it allows us to have both "number of elements" option and "upper bound" option, i.e.

range(below: 4)                   #=> [0,1,2,3]
range(upto: 4)                    #=> [0,1,2,3,4]
range(from: 3, count: 4, step: 2) #=> [3, 5, 7, 9]
Member
ericmj commented May 17, 2013

The goal is to not have first class syntax for sequences as they won't be supported in guard clauses.

Is there a technical reason for this or why is that?

Owner

@ericmj guard clauses are very limited in which expressions is supported and although we may be able to match sequences in guard clauses, it would probably be too complicated to be worthy it.

Owner

@alco What about this API:

Seq.from(1, up_to: 10)
Seq.from(10, down_to: 1)
Seq.from(5, count: 10)
Seq.from(5, count: 10, step: 2)

It makes the starting point explicit (which is clearer) and I also think it reads better. However, given that the examples above will return different iterators as result, I am still not sure of the benefits over:

1..10
10..1
seq(5, 10)
seq(5, 10, step: 2)

I think seq and range are inherently different things and treating them as different things may be the way to go. For example, steps can only contain numbers while ranges can include everything (since everything is comparable in Elixir) and for this reason ranges also come with their own protocol.

Member
alco commented May 17, 2013

@josevalim Making the starting point explicit is really an arbitrary choice. In Python range is mostly used as range(N) which is equivalent to range(0, N). Likewise, deciding that the second argument should denote the number of elements in the resulting list rather than the upper bound is also not universally useful.

So to please all parties, we will either have to have two functions step(from, count, step) and seq(from, to, step) or have one like range above that encompasses the former two.

Owner

Making the starting point explicit is really an arbitrary choice. In Python range is mostly used as range(N) which is equivalent to range(0, N).

The choice is not arbitrary. The problem with making the first item implicit is that we have an "arguments" dance:

range(end)
range(start, end)
range(start, end, step)

Maybe you are familiar with it due to Python but at first it is confusing. The same would happen for sequences:

seq(count)
seq(start, count)
seq(start, count, step)

And in code like:

range(upto: 4)

It doesn't read nicely: what is up to 4? On the other hand, I may just be spoiled by Ruby:

1.upto(4)

So to please all parties, we will either have to have two functions step(from, count, step) and seq(from, to, step) or have one like range above that encompasses the former two.

Having step and seq would just aid on confusion. I would never remember which one receives what (the names and apis are just too similar). I would rather have something along the lines of what antifuchs proposed (or Seq.from/2) and document what makes it different from ranges (i.e. support to steps).

Member
alco commented May 17, 2013

I don't think range(upto: 4) will be a problem. It's a known fact, that indexing starts from 0. It's something that users will learn. The intended use I see for non-inclusive ranges is range(0, N) where N can be 0. So with Seq.from we'll just have the code Seq.from(0, ...) repeated many times.

And Seq.from also bothers me because of the name. The function name doesn't convey it's intent.

Member
alco commented May 17, 2013

So I think we should try antifuchs' approach. The name seq is OK. It'll be seq(from: 0, upto: 5) or seq(from: 5, downto: 0). Just seq(count: N) should pretty naturally result in [0, 1, ..., N-1] if from is omitted (because of the zero-based indexing in the language). Same goes for upto and downto.

Owner

@alco just to continue exploring our options, what about:

Seq.upto(10)
Seq.upto(10, from: -10)
Seq.upto(10, from: -10, step: 2)
Seq.count(10)
Seq.count(10, from: 25)
Seq.count(10, from: 25, step: 2)
Member
alco commented May 17, 2013

I don't see the benefit over seq from the previous comment. Then we'll also need Seq.below(10) for non-inclusive seqs.

To cut down on typing, from could be an optional first arg.

seq(below: 4)    #=> [0, 1, 2, 3]
seq(1, below: 4) #=> [1, 2, 3]
Owner

This is getting too complex. I don't really think we should support below
for example. People can easily achieve this with just the other options so I
don't see any reason for supporting such a wide range of options.

José Valim
www.plataformatec.com.br
Founder and Lead Developer
*

Member
alco commented May 17, 2013

You are right. I forgot that with step parameter we can easily get away with less options. So we have one varying parameter which can be either count or to.

We have two proposed approaches.

1. Have two functions

count(N) == count(N, from: 0, step: 1) == [0, 1, ..., N-1]

count(3, from: 3, step: 1)
#=> [3, 4, 5]
count(3, from: 3, step: -1)
#=> [3, 2, 1]

to(N) == to(N, from: 0, step: 1) == [0, 1, ..., N-1]
to(3, from: 1, step: 1)
#=> [1, 2]
to(3, from: 3, step: 1)
#=> []
to(3, from: 2, step: -1)
#=> []

We can cut down on typing here and require either 1 or 3 position arguments, no keywords.

2. Have one function

seq(N) == seq(count: N, from: 0, step: 1) == seq(to: N, from: 0, step: 1) == [0, 1, ..., N-1]

seq(count: 3, from: 3, step: 1)
#=> [3, 4, 5]
seq(count: 3, from: 3, step: -1)
#=> [3, 2, 1]

seq(to: 3, from: 1, step: 1)
#=> [1, 2]
seq(to: 3, from: 3, step: 1)
#=> []
seq(to: 3, from: 2, step: -1)
#=> []

Here we need to use keywords, but we can still have a 1-argument seq(N) because it's the same for count and to.

I do think that the upper bound (or count) is more important than the from value because we can have a sane default for from -- 0. For to or count there is no such default value. And there is no problem using from as 1st argument in the 2-argument version of seq. It is not uncommon to have the first argument optional in Elixir.

seq(N) == seq(N, from: 0, step: 1)
seq(from, count: N) or seq(from, to: N)
seq(from, count: N, step: M) or seq(from, to: N, step: M)
Owner

I prefer the first approach. I also like the proposed names (to and count) too. Now, to add an extra challenge, I have been planning to add other kind of iterators for a while. I have mentioned them before, I can think of two:

  1. cycle: cycles through the given items forever
iex> cycle = cycle([1,2,3])
Cycle[collection: [1,2,3]]
iex> Enum.take(cycle, 6)
[1,2,3,1,2,3]

There were a couple situations I wanted to use it in the past, unfortunately I can't recall them now.

  1. with_index: receives a collection and returns a new iterator wrapping each item in a tuple with an index
iex> with_index = with_index([:a, :b, :c])
WithIndex[collection: [:a, :b, :c]]
iex> Enum.map(with_index)
[a: 0, b: 1, c: 2]

It feels (I am not sure) that they could all be part of the same module. I don't feel they belong to Enum though, since Enum is always eager and all of those iterators are lazy. What are your thoughts? Any idea for names?

Member
alco commented May 17, 2013

Do those functions really provide any benefit by being lazy? The Enum.take cycle([1,2,3]) example if fine, but which other functions will you use cycle with? Enum.take is pretty much the only function it can be used with.

Similarly, with_index will be resolved to the full lists after calling any function from Enum except for take.

Owner

@alco You are correct about with_index being resolved after the first call but the point of being lazy here is to avoid the iteration of the iterator twice (if you have a big list for example). And for big sequences, the trade-off is memory.

When I used cycle, I was actually looping infinitely expecting for some messages in a given order. So there are other use cases beyond take. :)

Member
alco commented May 18, 2013

@josevalim I see. Maybe you have clearer picture of this, but to me with_index seems to belong in Enum because it's going to be used mostly with Enum functions. As for cycle, it takes a list and produces an iterator? This is what Enum does :)

Owner

"This is what Enum does" - nothing in Enum returns an iterator, they all produce the result of consuming an iterator (a list) and that's why I am worried about mixing them. People will write code like Enum.with_index([1,2,3]) and expect to have a list back but they would need to call to_list first. :S

Member
alco commented May 18, 2013

Sorry, I meant Enum.Iterator. Well, you were right, cycle and with_index deserve to have their own place somewhere.

Since those two functions take a collection and return an iterator, they work like Enum.Iterator. But Enum.Iterator already has a contract that it returns successive elements of the collection, and those two functions return something transformed. So they could be placed in a module by itself. Seq is short but not too descriptive. Maybe Iterators or Iter? I would actually prefer the module to be namespaced to Enum, like Enum.IterTools, for ideological reasons. But for practical reasons a shorter name will win.

Owner

Final question, assuming we will implement:

count(N) == count(N, from: 0, step: 1) == [0, 1, ..., N-1]
count(3, from: 3, step: 1)
#=> [3, 4, 5]
count(3, from: 3, step: -1)
#=> [3, 2, 1]

to(N) == to(N, from: 0, step: 1) == [0, 1, ..., N-1]
to(3, from: 1, step: 1)
#=> [1, 2]
to(3, from: 3, step: 1)
#=> []
to(3, from: 2, step: -1)
#=> []

Should they return lists or iterators? Why?

Member
ericmj commented May 19, 2013

Lists. I see no reason why we should return an iterator here. If we want an iterator it is easy to use a list as an iterator. Also, returning a list would be consistent with the rest of the API.

EDIT:
One reason I can actually see for iterators here is that we don't have to allocate a list which is good for very long lists/iterators. But I think that should be a separate set of functions if we decide that we need them.

Member
alco commented May 20, 2013

Why not both? Have icount return an iterator and lcount return a list. I would be really sad if we had no way of producing an iterator without the need to construct a list first.

Owner

@alco both is a good compromise but let's put this on hold for now. I want to think more about this API and how we can make everything fit together. :)

Contributor
ToJans commented Jun 23, 2013

This all looks very complicated to me... Why not do it like this (no idea whether it is possible tbh); inspired by LINQ:

10 |> Seq.next(&1+3) |> take (100) |> skip(4) etc...
Member
alco commented Jun 23, 2013

@ToJans Because writing your snippet in the following way would be clearer (once you're familiar with the count function):

count(100, from: 10, step: 3)

While count and friends might not be general enough to generate arbitrary sequences (remember, we also have list comprehensions and possibly will have lazy comprehensions), they're supposed to simplify the common case. So getting rid of the code you showed is actually the goal of this proposal, in my opinion.

Owner

Now that streams are merged into master, we can restart this discussion. @alco I will ping you later to iron out some details.

With #1388, can this one be closed?

Owner

@alindeman I have been thinking about it. Not sure if we still want something specific for counting or if we wait until the need arises.

Contributor
ToJans commented Aug 9, 2013

@josevalim I would opt to wait for it; one can easily simulate this with 1..100, Enum funcs etc...

Owner

I am closing this since we haven't reached an agreement after all this time. :) We can always revisit when a seq function is actually needed.

@josevalim josevalim closed this Aug 11, 2013
@h4cc h4cc referenced this issue in lnikkila/elixir-range-extras Sep 26, 2016
Open

Adding RangeExtras.new(first, last, step) #1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment