Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add range(start => stop, length) #39069

Closed
wants to merge 6 commits into from

Conversation

mkitti
Copy link
Contributor

@mkitti mkitti commented Jan 2, 2021

The pull request creates a new form of range where start and stop are given as a Pair along with an integer length: range(start => stop, length).

The idea for providing start => stop as a Pair originates with @mcabbott and @MasonProtter via Zulip. Providing start => stop as a Pair has several advantages over other proposals:

  1. The arguments start and stop provided as a Pair is distinct and can be easily distinguished from length.
  2. Since the start and stop are provided as a separate type, allowing us to provide separate conventions due to multiple dispatch. This allows us to focus on length and frees us from having to assume step = 1 if length is not provided in a future pull request.
  3. In this form, stop is always included as the last element of the iteration.

Because of the default positional parameter length=2, this proposal also implements range(start => stop) which may differ from start:stop and range(start; stop=stop) since step is not assumed to be 1.

This is an alternate option to pull request #39055 range(start, stop, length). There length as positional argument has no default and a two argument range(start,stop) is not allowed. It is possible for this version to coexist with #39055 range(start, stop, length) if needed.

julia> Base.range(start_stop::Pair, length::Integer =
    Base._range(start_stop.first, nothing, start_stop.second, length)

julia> range(1 => 3.5, 26)
1.0:0.1:3.5

julia> range(1 => 3.5, 2)
1.0:2.5:3.5

julia> range(1; stop=3.5) # For contrast
1.0:1.0:3.0

Edit 2021/01/02: Fix attribution of using Pair

Edit 2021/01/03: Remove default length value of 2. length is now required.

@mkitti mkitti changed the title range(start => stop, length) Add range(start => stop, length) Jan 2, 2021
@timholy
Copy link
Sponsor Member

timholy commented Jan 2, 2021

This is not the first time I've wondered whether IntervalSets should be in Base. Of course the piracy issue for IntervalArithmetic would be a potential concern.

Doing this via a Pair doesn't seem like too much of an abuse; Pairs do sometimes feel like key/value pairs, and this is not such a case, but the idea of a Pair representing an interval is not foreign to the way they print.The one slight oddity might be 5 => 3 as a way of indicating a range with negative step.

@mkitti
Copy link
Contributor Author

mkitti commented Jan 2, 2021

This is not the first time I've wondered whether IntervalSets should be in Base

Using the Interval operator .. was @MasonProtter's original suggestion, actually. Looking back at the conversation, @mcabbott was the first to suggest Pair.

The one slight oddity might be 5 => 3 as a way of indicating a range with negative step.

I do like the Pair syntax a lot because it strongly implies directionality to the point where I would intuit that 5 => 3 might involve a negative step, which it does.

julia> range(5 => 3, 2)
5.0:-2.0:3.0

julia> range(5 => 3, 3)
5.0:-1.0:3.0

julia> 4  range(5 => 3, 3)
true

julia> range(5; stop = 3) # For comparison, this is empty
5:4

julia> isempty( range(5; stop = 3) ) # For comparison, this is empty
true

julia>  range(5, 3; length = 2) # Consistent with this proposal
5.0:-2.0:3.0

In contrast, 5..3 looks like a backwards interval to me implying empty set. That appears to be what is it actually is. It would be confusing and inconsistent if 5..3 actually contained any values. Perhaps this could be used in the future to denote a strictly increasing sequence?

julia> using IntervalSets

julia> 4  5..3
false

julia> 4  3..5
true

julia> isempty(5..3)
true

@antoine-levitt
Copy link
Contributor

I don't like the spelling a=>b, it's a very unusual way of specifying arguments, and I don't see why it should be used here?

@MasonProtter
Copy link
Contributor

To me, a => b reads as a to b so that’s what I like about this spelling. I think if someone wrote range(a, b, n) I wouldn’t know what I was looking at, but range(a => b, n) is (I think) a bit more suggestive.

@DilumAluthge
Copy link
Member

DilumAluthge commented Jan 2, 2021

There is precedent for using pairs to provide arguments to a function in Base.

For example, I can use replace("foobar", "foo" => "FOO").

@antoine-levitt
Copy link
Contributor

Yes and it always seemed confusing for me, but there there is the added utility of being able to pass multiple pairs, which is not present here.

@timholy
Copy link
Sponsor Member

timholy commented Jan 2, 2021

How is the pair syntax confusing? "foo" => "FOO" says very explicitly that you're changing foo to FOO.

For the bounds of the range, it's maybe slightly less intuitive, but I still like it. As @MasonProtter says, a => b does imply "from a to b" and that's pretty good.

@MasonProtter
Copy link
Contributor

MasonProtter commented Jan 2, 2021

Yes and it always seemed confusing for me, but there there is the added utility of being able to pass multiple pairs, which is not present here.

I don't really understand how the presence of multiple pairs or just one pair is at all relevant to this, but replace on ::String only accepts on ::Pair argument.

julia> replace("foo", 'f' => 'b')
"boo"

julia> replace("foo", 'f' => 'b', 'o' => 'a')
ERROR: MethodError: no method matching replace(::String, ::Pair{Char, Char}, ::Pair{Char, Char})
Closest candidates are:
  replace(::AbstractString, ::Pair, ::Pair) at set.jl:613
  replace(::Any, ::Pair...; count) at set.jl:555
  replace(::Union{Function, Type}, ::Pair, ::Pair; count) at set.jl:612
  ...
Stacktrace:
 [1] replace(a::String, b::Pair{Char, Char}, c::Pair{Char, Char})
   @ Base ./set.jl:613
 [2] top-level scope
   @ REPL[1]:1

@cmcaine
Copy link
Contributor

cmcaine commented Jan 3, 2021

Why is length=2 useful as a default? When might I use it?

@mkitti
Copy link
Contributor Author

mkitti commented Jan 3, 2021

Why is length=2 useful as a default? When might I use it?

That's a good question. The short answer is rarely, and the default is not there for utility. It is there mainly to promote understanding of this form of range, and it is the simplest value that produces a consistent result.

First, let's compare this syntax and prior syntax.

julia> collect( range(1 => 5.5) )
2-element Array{Float64,1}:
 1.0
 5.5

julia> collect( range(1; stop=5.5) )
5-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0
 5.0

julia> collect( 1:5.5 )
5-element Array{Float64,1}:
 1.0
 2.0
 3.0
 4.0
 5.0

The syntax start => stop is meant to emphasize that the first element is start and the last element is stop. The focus here is on length rather than step. In prior forms of range the default was step = 1. This makes the use of range redundant with :. As you can see above, if step = 1 then the last element cannot be guaranteed to be stop.

Given that we want the first element to be start and the last element is stop, what would be a a good default for length?

One answer might there is no good default for length and that it should always be specified. No default actually makes quite a bit of sense, and we may eventually go with that.

Another answer is that perhaps we should pick some large number like 50 or 100 and this would be more useful. This was discussed in #28708 where the conclusion was that arbitrary default values was not the direction we wanted to head in.

The reason I chose 2 is because it is the smallest length that satisfies the predicate of having the first element as start and the last element as stop. That is the basic contract of this form of range. It's a feature that makes it distinct from start:stop where stop is not guaranteed to be the last element. In summary, length=2 would help convey the point of this form of range.

@MasonProtter
Copy link
Contributor

You could just make it a method error to not provide the length. That would be a bit more consistent with this:

julia> range(1, 10)
ERROR: ArgumentError: At least one of `length` or `step` must be specified

@mkitti
Copy link
Contributor Author

mkitti commented Jan 3, 2021

An alternative to a default length would be using Base.Fix1 to curry the endpoints and allow the length to be specified later.

julia> Base.range(start_stop::Pair, length::Integer) =
           Base._range(start_stop.first, nothing, start_stop.second, length)

julia> Base.range(start_stop::Pair) = Base.Fix1(Base.range, start_stop)

julia> r = Base.range(1 => 10)
(::Base.Fix1{typeof(range),Pair{Int64,Int64}}) (generic function with 1 method)

julia> print( collect( r(10) ) )
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

julia> print( collect( r(5) ) )
[1.0, 3.25, 5.5, 7.75, 10.0]

julia> print( collect( r(3) ) )
[1.0, 5.5, 10.0]

That does seem really useful actually. I can compute integrals with this.

julia> p( n, r = range(-10 => 10) ) = sum( ( exp( -x^2 )*20/n  ) for x in r(n) )^2
p (generic function with 2 methods)

julia> p(10^2)
3.0790749597833598

julia> p(10^4)
3.1409643664749933

julia> p(10^8)
3.141592590613959

@mkitti
Copy link
Contributor Author

mkitti commented Jan 3, 2021

You could just make it a method error to not provide the length

I'm reusing the same underlying machinery actually, so I would just have to remove the default.

julia> Base.range(start_stop::Pair, length::Integer) =
                  Base._range(start_stop.first, nothing, start_stop.second, length)

julia> range( 1 => 10 )
ERROR: ArgumentError: At least one of `length` or `stop` must be specified
Stacktrace:
 [1] _range(::Pair{Int64,Int64}, ::Nothing, ::Nothing, ::Nothing) at .\range.jl:126
 [2] range(::Pair{Int64,Int64}; length::Nothing, stop::Nothing, step::Nothing) at .\range.jl:91
 [3] range(::Pair{Int64,Int64}) at .\range.jl:91
 [4] top-level scope at REPL[2]:1

@cmcaine
Copy link
Contributor

cmcaine commented Jan 3, 2021 via email

@antoine-levitt
Copy link
Contributor

I don't really understand how the presence of multiple pairs or just one pair is at all relevant to this, but replace on ::String only accepts on ::Pair argument.

Yes but not the replace(A, old_new::Pair...; [count::Integer]) method. This is relevant because it's using pairs to actually perform multiple dispatch and allow multiple pairs (whereas replace(A, old1, new1, old2, new2) would be bad), not just to change syntax for function arguments.

How is the pair syntax confusing? "foo" => "FOO" says very explicitly that you're changing foo to FOO.

It's not confusing when you read it, it's confusing when you write it : I would naively go for replace(a, old, new) because that's how every function works, then have to look at the docs to see it expects me to write replace(a, old=>new). Changing range to not use positional arguments but pairs instead would be breaking, and I don't see the point in having both positional and pairs.

@mkitti
Copy link
Contributor Author

mkitti commented Jan 3, 2021

Changing range to not use positional arguments but pairs instead would be breaking, and I don't see the point in having both positional and pairs.

Breaking means that some syntax that used to work no longer works after this change. This pull request is not breaking. This pull request does not remove or preclude any positional argument syntax consisting of scalar numbers.

The availability of this Pair based range removes my concerns about unclear argument order in range(start, stop, length). This pull request does not preclude range(start, stop, length), which you are advocating for, and could also work along side that positional syntax if needed. If you would like to promote range(start, stop length) please promote the pull request #39055 or submit your own.

All prior syntax, including past and future prospective positional forms of range, continues to work after this pull request.

julia> Base.range(start_stop::Pair, length::Integer) =
           Base._range(start_stop.first, nothing, start_stop.second, length)

julia> Base.range(start, stop, length::Integer) = Base._range(start, nothing, stop, length)

julia> range(1, 5; length = 5)
1.0:1.0:5.0

julia> range(1, 5; step = 1)
1:1:5

julia> range(1, 5, 3) # Fully positional argument syntax works
1.0:2.0:5.0

julia> range(1 => 5, 3) # Pair syntax works along side above positional argument syntax
1.0:2.0:5.0

@mkitti
Copy link
Contributor Author

mkitti commented Jan 3, 2021

I think I prefer length not having a default.

In response to feedback, I have removed the default length and created a custom ArgumentError that mentions only length in 9ba4b4f. I will update the description above. range after this pull request will not accept a Pair alone.

julia> Base.range(start_stop::Pair, length::Integer) =
                  Base._range(start_stop.first, nothing, start_stop.second, length)

julia> Base.range(start_stop::Pair, length::Nothing = nothing) =
           throw(ArgumentError("`length` must be specified after $start_stop"))

julia> range( 5 => 3 )
ERROR: ArgumentError: `length` must be specified after 5 => 3
Stacktrace:
 [1] range(::Pair{Int64,Int64}) at .\REPL[6]:1
 [2] top-level scope at REPL[7]:1

julia> range( 5 => 3, nothing ) # As of 83f5d53
ERROR: ArgumentError: `length` must be specified after 5 => 3
Stacktrace:
 [1] range(::Pair{Int64,Int64}, ::Nothing) at .\REPL[2]:1
 [2] top-level scope at REPL[5]:1

julia> range( 5 => 3, 2 )
5.0:-2.0:3.0

Copy link
Sponsor Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. I was initially hesitant about using 5=>3 for a negative step, but I'm coming around to it — it's a nice feature of using => for this. This PR itself is great — documented, tested, ready to go.

As a purely bikesheddy surfacy opiniony thing, I think it'd be good to just do a quick up/down on triage.

base/range.jl Outdated Show resolved Hide resolved
@mbauman mbauman added the status:triage This should be discussed on a triage call label Jan 4, 2021
@JeffBezanson
Copy link
Sponsor Member

I'm against this. It adds too many different ways to do the same simple thing. As more such things pile up, it becomes quite user-hostile.

Co-authored-by: Matt Bauman <mbauman@gmail.com>
@mkitti
Copy link
Contributor Author

mkitti commented Jan 4, 2021

I'm against this. It adds too many different ways to do the same simple thing. As more such things pile up, it becomes quite user-hostile.

@JeffBezanson, I agree there is a user-hostile situation here with range. My original entry into this discussion was trying to improve the documentation for range in #37875 because it took me 30 minutes to figure out how to use it properly when I first reached for it.

With this PR, I'm responding to two issues:

  1. The need for an an easy way to create a range of a certain length as articulated by @StefanKarpinski:
    range(start, stop, length) #38750 (comment)
  2. The desire for clear and unambiguous syntax and argument order that is not going to trip up users coming from from other languages like Python or Ruby. The Pair, => syntax seems to accomplish that for many without being overly verbose.

Should this PR fail, I hope we can focus on emphasizing the range(start, stop; length = length) syntax, perhaps by deprecating (without removing) other forms of range that are well served by by colon, :.

Thank you for your consideration.

@antoine-levitt
Copy link
Contributor

@JeffBezanson what do you object to? The syntax of the functionality itself?

A number of people have agreed that we need a simpler way to create a range of a given length. The following options have been proposed:

  1. status quo: range(start, stop; length=length)
  2. range(start=>stop, length)
  3. range(length, start, stop)
  4. range(start, stop, length)
  5. linrange(start, stop, length)

This is orthogonal to improvements to the handling of kwargs by range (#38041, which I believe should be merged independently of this)

I'd prefer nothing than 2 (departure from the way arguments are usually used) or 3 (unnatural to me, and against matlab and python's linspace). I used to like 4, but now I prefer 5, for consistency with logspace and for future extensibility (eg endpoints).

@JeffBezanson
Copy link
Sponsor Member

The range(start=>stop, length) syntax by itself might be ok --- e.g. if we had done it that way from the beginning. But adding it on top of what we have gives too many variants. There are already lots of ways to construct ranges. Some code will use one style, other code will use another style, and it will be confusing. I'm imagining myself trying to teach how :, =>, and range are all used to construct ranges, and it's not going well :)

@mkitti
Copy link
Contributor Author

mkitti commented Jan 5, 2021

The range(start=>stop, length) syntax by itself might be ok --- e.g. if we had done it that way from the beginning. But adding it on top of what we have gives too many variants. There are already lots of ways to construct ranges. Some code will use one style, other code will use another style, and it will be confusing. I'm imagining myself trying to teach how :, =>, and range are all used to construct ranges, and it's not going well :)

What @antoine-levitt wants is range(start [, stop, [, length] ]; length, step) as implemented in #39055 which adds range(start, stop, length) as signed off by @StefanKarpinski in #38750 (comment).

As you noted in #38750 (comment) this conflicts with range(start, stop, step) in Python.

This proposal represents a collective brainstorming session on how to avoid the Python conflict while also getting the succinct length via position syntax.

@mkitti
Copy link
Contributor Author

mkitti commented Jan 5, 2021

What I do like about the current range(start, stop; step, length) is that the syntax ends up being rather explicit about what it does. It probably is easier to explain in the classroom as well. The main issue with the status quo is the documentation really needs improvement. If it were more readable, range(start, stop; length) or LinRange(start, stop, length) would seem like a more apparent solution to the need expressed in #38750.

The current documentation is confusing until you get to the examples. Let me try to read this as a very naïve new user:

  range(start[, stop]; length, stop, step=1)

  Given a starting value, construct a range either by length or from start to stop, optionally with a given step
  (defaults to 1, a UnitRange). One of length or stop is required. If length, stop, and step are all specified, they
  must agree.

Let's try to turn that first paragraph into code:

julia> range(11, length = 3) # "construct a range by length". Great. Seems verbose versus 11:13 though.
11:13

julia> range(1, 5) # "or from start to stop"  `step` defaults to 1, and I specified `stop` .... what gives?
ERROR: ArgumentError: At least one of `length` or `step` must be specified

julia> r = range(1, stop = 5.5)  # OK that works. This is the same as `1:5.5`. What is the difference versus colon?
1.0:1.0:5.0

julia> typeof(r) # "optionally with a given step (defaults to 1, a UnitRange)".  OK, so this should be a `UnitRange`....
StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}

julia> range(1, 5; length = 5, step = 1) #  "If length, stop, and step are all specified, they must agree."
ERROR: ArgumentError: Too many arguments specified; try passing only one of `stop` or `length`

Hmm. That first paragraph did not make much too sense. I'll keep reading ...

  If length and stop are provided and step is not, the step size will be computed automatically such that there are
  length linearly spaced elements in the range.

I thought step defaults to 1. If I ask for length = 3 , then this will just give [1, 2, 3] which are "length linearly spaced elements"

julia> range(1, 5.5; length = 3) # 1:3 ?
1.0:2.25:5.5

Strange. Suddenly stop matters now. OK, I guess step doesn't always default to 1. I wonder when step does not default to 1.

  If step and stop are provided and length is not, the overall range length will be computed automatically such that
  the elements are step spaced.
julia> range(;step = 1, stop = 5) # "If step and stop are provided and length is not", 
ERROR: MethodError: no method matching range(; step=1, stop=5)

julia> r = range(4 ;step = 3, stop = 5) # It should probably say `start` is always needed in words.
4:3:4

juila> # "the overall range length will be computed automatically..."

julia> length(r) # " such that  the elements are step spaced... "    well there is only one element. Where's the space?
1
  Special care is taken to ensure intermediate values are computed rationally. To avoid this induced overhead, see the
  LinRange constructor.

Oh, I like my intermediate values computed rationally. Why would I want them computed irrationally? It would be irrational to use this LinRange(start, stop, length) method although it provides the exact positional syntax that I want to use and the error is actually really small (1e-16). Also uppercase letters seem unusual compared to the rest of the API.

  stop may be specified as either a positional or keyword argument.

I thought I tried this already ...

julia> range(1, 3)  # Except sometimes it is required to be a keyword argument??
ERROR: ArgumentError: At least one of `length` or `step` must be specified

If we are stuck with the status quo in terms of syntax, could we revisit the documentation as in #37875 ?

@JeffBezanson
Copy link
Sponsor Member

Yes, those are all good points and I agree. There are some existing PRs to help address them. I'm only commenting on => here.

If we are stuck with the status quo in terms of syntax

Yes, we cannot break existing code in 1.x.

@mkitti
Copy link
Contributor Author

mkitti commented Jan 5, 2021

Yes, we cannot break existing code in 1.x.

To be clear, nothing in this pull request causes any existing code to stop functioning.

It also does not preclude a prospective range(start, stop) or range(start, stop, length) from functioning. Those forms were discussed in #38750 (comment)

@oscardssmith
Copy link
Member

closing based on triage discussion. (which was basically "oh god no. We'll do anything")

@mbauman mbauman removed the status:triage This should be discussed on a triage call label Jan 7, 2021
@mkitti
Copy link
Contributor Author

mkitti commented Apr 1, 2021

This PR is clearly superseded by the superior usage of emojis in #40298

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants