Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What should repeat(::OffsetArray, ...) do #36294

Open
mbauman opened this issue Jun 15, 2020 · 6 comments
Open

What should repeat(::OffsetArray, ...) do #36294

mbauman opened this issue Jun 15, 2020 · 6 comments
Labels
domain:arrays [a, r, r, a, y, s]

Comments

@mbauman
Copy link
Sponsor Member

mbauman commented Jun 15, 2020

In #35944 we noted that repeat(::OffsetArray, ...) is a little funny; it errors for inner repeats and happens to return 1-based arrays for outer repeats. I don't think either of these behaviors were ever explicitly thought through; it'd be good to decide how we want them to behave in the future.

@mbauman mbauman added the domain:arrays [a, r, r, a, y, s] label Jun 15, 2020
@timholy
Copy link
Sponsor Member

timholy commented Jun 16, 2020

repeat, like cat, push!, etc, seems to belong to the "arrays as lists" worldview rather than "arrays as associative containers" worldview. xref https://github.com/JuliaArrays/CatIndices.jl, which has never yet gotten around to deciding on a very complete solution to this question but has taken a few tentative stabs at it.

One option I can't remember noticing in the past (but surely @andyferris and others have): should we use values(A) to indicate that we wish to strip the indices, returning a 1-based indexing view? At one point I decided to call this collect, but that's eager rather than lazy and the docstring for collect is not as clear on this point as I seem to remember it.

Currently it's a bit of a wreck:

julia> a = Base.IdentityUnitRange(-3:3)
Base.IdentityUnitRange(-3:3)

julia> collect(a)
7-element Array{Int64,1}:
 -3
 -2
 -1
  0
  1
  2
  3

julia> Array(a)
7-element Array{Int64,1}:
 -3
 -2
 -1
  0
  1
  2
  3

julia> b = OffsetArray(-3:3, -3:3)
-3:3 with indices -3:3

julia> collect(b)
7-element Array{Int64,1}:
 -3
 -2
 -1
  0
  1
  2
  3

julia> Array(b)
ERROR: BoundsError: attempt to access 7-element Array{Int64,1} at index [-3:3]
Stacktrace:
 [1] throw_boundserror(::Array{Int64,1}, ::Tuple{OffsetArrays.IdOffsetRange{Int64,Base.OneTo{Int64}}}) at ./abstractarray.jl:537
 [2] checkbounds at ./abstractarray.jl:502 [inlined]
 [3] copyto!(::Array{Int64,1}, ::OffsetArray{Int64,1,UnitRange{Int64}}) at ./multidimensional.jl:959
 [4] Array at ./array.jl:541 [inlined]
 [5] Array(::OffsetArray{Int64,1,UnitRange{Int64}}) at ./boot.jl:429
 [6] top-level scope at REPL[9]:1

julia> values(b)
-3:3 with indices -3:3

julia> values(a)
Base.IdentityUnitRange(-3:3)

@andyferris
Copy link
Member

It's a good question. I have to say, I'm even sometimes a little confused about what collect is supposed to do (well, it's eager?) - it seems like such a nice generic english verb one might like to extend, but it's tied to the Array literal syntax.

we wish to strip the indices, returning a 1-based indexing view

If I think beyond arrays, I can use enumerate(iter) to get "indices" for each element in an iterable which might not even have been iterable to begin with. Not all iterables will support fast lookup based on their iteration position, but for those that do I can imagine instead of doing something like:

for (i, x) in enumerate(iter)
    ...
end

I could do something like

for (i, x) in pairs(enumeration(iter))
    ...
end

and enumeration(iter) would be an indexable such that enumeration(iter)[1] === first(iter), etc.

@andyferris
Copy link
Member

values is currently an AbstractDict thing also. E.g. it would seem natural to me if values(::Dict) would still be indexable - and the indices don't change. Whatever we come up with it would be nice to have similar semantics across dictionaries and arrays (and their keys).

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented Jun 17, 2020

My original intention for collect was pretty simple: it should produce the vector of values that you would get if you iterated an entire iterable and put the results in a vector. But that concept was not adhered to terribly well and now it has a hodgepodge of behaviors 🤷‍♂️

@andyferris
Copy link
Member

Thanks Stefan. Out of curiosity, what was wrong with using vector for that - like we have tuple and string as constructors for other built-in collections?

@tkf
Copy link
Member

tkf commented Jun 17, 2020

Related to this, I suggested more "lawful" approach to values etc. in #36175.

I think it's a good idea to make values and keys optionally indexible by "tokens". The tokens can be 1-origin dense indices for arrays.

what was wrong with using vector for that

I think the problem is that it doesn't scale. I don't think it's ideal for each custom container to have lowercase factory function that may or may not do what you expect (although it might be OK for Vector and Array). See #36288

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:arrays [a, r, r, a, y, s]
Projects
None yet
Development

No branches or pull requests

5 participants