Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type-piracy in similar with OffsetAxis #87

Open
dlfivefifty opened this issue Dec 2, 2019 · 20 comments
Open

Type-piracy in similar with OffsetAxis #87

dlfivefifty opened this issue Dec 2, 2019 · 20 comments

Comments

@dlfivefifty
Copy link
Member

This is type-piracy:

function Base.similar(A::AbstractArray, ::Type{T}, inds::Tuple{OffsetAxis,Vararg{OffsetAxis}}) where T

@timholy
Copy link
Member

timholy commented Dec 2, 2019

It's basically "sanctioned type piracy." The issue is that Base is incapable of creating arrays with axes that start at anything other than 1. The dispatch hierarchy for Base ensures that any combination of inputs that can be reduced to axes that start at 1 will get caught by methods in Base. This method is the fallback, in case that doesn't happen. That's OK here because this is the foundational package for arrays with offset axes.

Any other package that wants to define offset axes has to define their own axis types. See https://docs.julialang.org/en/latest/devdocs/offset-arrays/#Custom-AbstractUnitRange-types-1.

Does that cover it? Or is there an issue that I'm not understanding?

@dlfivefifty
Copy link
Member Author

The issue is that as designed now type ambiguities in other packages (in this case BlockArrays.jl) only are triggered if a user loads OffsetArrays.jl.

I think a better approach would be in Base to add something like:

_offset_similar(_) = error("Not implemented in Base. Please use external package.")
Base.similar(A::AbstractArray, ::Type{T}, inds::Tuple{OffsetAxis,Vararg{OffsetAxis}}) where T = _offset_similar(A, T, index)

then OffsetArrays.jl would overload Base._offset_similar. This means ambiguity is consistent.

@timholy
Copy link
Member

timholy commented Dec 2, 2019

If you can sync this across Base, BlockArrays, and OffsetArrays, by all means go for it! Might want to also check CatIndices.jl as an example of a package that defines its own index types.

@dlfivefifty
Copy link
Member Author

I won’t be able to tackle this right now but perhaps in the future. Maybe just leave this open for now?

@yha
Copy link
Contributor

yha commented Feb 3, 2020

This type piracy has become worse with #90: now similar is pirated for general AbstractArray with the new index type IdOffsetRange, which means other packages defining similar(a::OtherArrayType,...) must disambiguate on an index type internal to OffsetArrays. See Vexatos/CircularArrays.jl#2 (comment).
I think what is needed is something like #87 (comment) suggested above but with OffsetAxis being an abstract type that packages can subtype. Then the rule would be that package may specialize offset_similar only for the last (indexes) argument, and similar only for the first (array) argument, like this:

# In Base
const OffsetDims = Tuple{OffsetAxis,Vararg{OffsetAxis}}
similar(A::AbstractArray, ::Type{T}, off::OffsetDims) where T = offset_similar(A, T, off)

# In OffsetArray
struct IdOffsetRange <: OffsetAxis
  ...
end
function Base.offset_similar(A::AbstractArray, ::Type{T}, inds::IdOffsetRange) where T
  ...
  OffsetArray(...)
end

# In another package
Base.similar(A::MyArrayType, ::Type{T}, dims::Dims) = MyArrayType(...)
Base.similar(A::MyArrayType, ::Type{T}, inds::OffsetDims) = MyArrayType(offset_similar(A,T,inds))
# and/or
Base.offset_similar(A::AbstractArray, ::Type{T}, inds::MyOffsetDimsType) = MyArrayType(...)

@timholy
Copy link
Member

timholy commented Feb 3, 2020

Base ranges need an overhaul anyway to address JuliaLang/julia#30950. That PR was starting to turn into a nightmare, I think the best option will be to start fresh and go slowly, thinking very carefully about how to avoid breakage. (There is a chance that those issues simply can't be fixed until Julia 2.0 😦 ). It would be good to address this as part of that effort.

I don't see myself as having time for it in the next few weeks or more, so definitely looking for help from others!

@Tokazama
Copy link
Member

Tokazama commented May 2, 2020

I'm working on something similar to this right now (OffsetAxes.jl) that builds on AxisIndices.jl. If I understand correctly (which it's possible I don't) then similar was originally designed to handle element types and size. This inserts the notion of "axes" which isn't really a part of Array and may not be applicable to many other subtypes of AbstractArray. I'm running into some similar things with reshape too.

Would it make sense to have a more explicit argument for what is happening here? Perhaps something like reaxes because the point is to change the axes of an array.

@yha
Copy link
Contributor

yha commented Nov 15, 2020

I think the abmiguity issues caused by this piracy is nearly resolved now. At least, it's resolved for CircularArrays, where this issue previously caused trouble (see Vexatos/CircularArrays.jl#8)
The reason is that the signature has become

function Base.similar(A::AbstractArray, ::Type{T}, inds::Tuple{OffsetAxisKnownLength,Vararg{OffsetAxisKnownLength}}) where T

and #157 changed OffsetAxisKnownLength to mean Union{Integer, AbstractUnitRange} (which is the same thing as Base.DimOrInd).
So packages can define similar for dims::Tuple{DimOrInd, Vararg{DimOrInd}} without triggering ambiguity with this method and without depending on a type internal to OffsetArrays. This signature is almost the current "official" recommendation in
https://github.com/JuliaLang/julia/blob/110f1256ac88ee0f12885289e807229acf9c44a3/base/abstractarray.jl#L723-L727 (it says UnitRange rather than AbstractUnitRange).
Perhaps the official documented recommendation should now be to use Base.DimOrInd when specializing similar for offset arrays.

@akosuas
Copy link

akosuas commented Dec 9, 2020

I just encountered something similar when trying to reshape an Array where the length was given as Int32 instead of Int64:

julia> using OffsetArrays
julia> which(Base.reshape, (Array{Float64, 1}, Int32))
reshape(A::AbstractArray, inds::Union{Colon, Integer, AbstractUnitRange}...) in OffsetArrays at ~/.julia/packages/OffsetArrays/PXUn7/src/OffsetArrays.jl:225
(MWE) pkg> st
Project MWE v0.1.0
Status `MWE/Project.toml`
  [6fe1bfb0] OffsetArrays v1.4.1

This feels a bit naughty to me.

@mcabbott
Copy link

mcabbott commented Dec 9, 2020

To be fair, that's a MethodError without this package, and the expected answer with it. More surprising is:

julia> reshape(rand(1,2), Int32(2))
2-element Vector{Float64}:
 0.23063257026910478
 0.007514586985956306

julia> reshape(rand(2,3), :, Int32(2))
ERROR: StackOverflowError:
Stacktrace:
 [1] reshape(A::Matrix{Float64}, inds::Tuple{Colon, Int32})
   @ OffsetArrays ~/.julia/packages/OffsetArrays/ExQCD/src/OffsetArrays.jl:227

@akosuas
Copy link

akosuas commented Dec 9, 2020

Ah, my bad - hadn't thought to try it, assumed the Base one would accept any Integer.

Yes - that StackOverflowError is how I found this! My original case that crashed was something like reshape(::Array{Float32,3}, Int32, Int64, Colon).

@mcabbott
Copy link

mcabbott commented Dec 9, 2020

I assumed the same. And while I didn't write this, if I had, it would have been accidental piracy from this assumption -- trying to allow UnitRanges in otherwise the same Union as Base has. Maybe the signature here should just be tightened to match.

(Also, how did you land up with an Int32? I'm not sure I've ever seen one in the wild...)

@jishnub
Copy link
Member

jishnub commented Dec 10, 2020

Many of the reshape methods are defined in reshapedarray.jl, where curiously Int32 arguments are permitted if all the sizes are specified, but disallowed if one of them is specified implicitly using a colon. The method defined in this package appears to try to call one of these, fails, and ends up calling itself.

julia> reshape(rand(2,3), (Int32(3), Int32(2)))
3×2 Array{Float64,2}:
 0.296956   0.113515
 0.0535506  0.564747
 0.90234    0.00424624

julia> reshape(rand(2,3), (Int32(3), :))
ERROR: MethodError: no method matching reshape(::Array{Float64,2}, ::Tuple{Int32,Colon})

Even stranger is the fact that this is only permitted when the sizes are passed as a Tuple, but not allowed when passed as a Vararg.

julia> reshape(rand(2,3), Int32(3), Int32(2))
ERROR: MethodError: no method matching reshape(::Array{Float64,2}, ::Int32, ::Int32)

I wonder if there are a few method signatures that need to be altered here?

@phipsgabler
Copy link

I found this funny behaviour, reproducible in a fresh temp project with only OffsetArrays:

julia> using OffsetArrays

julia> A = rand(10, 10);

julia> A[Base.Slice(1:10)]
10-element OffsetArray(::Vector{Float64}, 1:10) with eltype Float64 with indices 1:10:
 0.31581111033816844
 0.2408986878236269
 0.865276045142753
 0.4158474542637325
 0.5674733494230301
 0.5910918167329304
 0.5583008496658141
 0.8863892215050752
 0.9094492113020204
 0.1511050866350474

Without OffsetArrays, it's a method error:

julia> A[Base.Slice(1:10)]
ERROR: MethodError: no method matching similar(::Matrix{Float64}, ::Type{Float64}, ::Tuple{Base.IdentityUnitRange{UnitRange{Int64}}})

@jishnub
Copy link
Member

jishnub commented Feb 5, 2022

I think such type-piracy, where an error becomes a valid operation, isn't considered too serious. After all, Base cannot return the correct result here, since in general, this will have offset axes. Currently Base chooses to be conservative and simply doesn't permit the operation, while OffsetArrays produces the correct result for all such slices:

julia> A[Base.Slice(9:10)]
2-element OffsetArray(::Vector{Float64}, 9:10) with eltype Float64 with indices 9:10:
 0.13029549937036167
 0.024726720571425886

julia> UnitRange(axes(A[Base.Slice(1:10)], 1))
1:10

julia> UnitRange(axes(A[Base.Slice(9:10)], 1))
9:10

In fact, interestingly, such an operation for ranges does not preserve the axes even with OffsetArrays loaded

julia> r = 1:10; s = Base.Slice(9:10);

julia> r[s] |> axes
(Base.OneTo(2),)

julia> Vector(r)[s] |> axes
(OffsetArrays.IdOffsetRange(values=9:10, indices=9:10),)

julia> r == Vector(r)
true

julia> r[s] == Vector(r)[s] # should hold, but doesn't as the axes don't match
false

The reason for this is perhaps historical. I had proposed a fix for this, but I haven't looked at it in a while.

@rafaqz
Copy link

rafaqz commented Jul 11, 2022

@jishnub I have to disagree that an error not throwing is not serious type piracy.

When OffsetArrays.jl is just a secondary dependency not loaded by the user, it can still break the locality of errors when debugging - there are many typos with base julia methods and types that fail in Base but don't with OffsetArrays.jl loaded.

That is not something you look for, because it shouldn't be possible. Especially when you havent even intentionally loaded OffsetArrays.jl.

If every package did this Julia would be unusable. I'm not sure why OffsetArrays still gets a pass.

@timholy
Copy link
Member

timholy commented Jul 11, 2022

there are many typos with base julia methods and types that fail in Base but don't with OffsetArrays.jl loaded

@rafaqz can you give some concrete examples? I'm in agreement with @jishnub, but happy to be persuaded otherwise. I don't understand what kinds of typos you're talking about, so I'd need some hints from you to understand how what you're describing could happen.

@rafaqz
Copy link

rafaqz commented Jul 11, 2022

The most obvious one from my other issue: I typed something like zeros(10:20) instead of zero(10:20).

Hurrah, OffsetArray.

Then get an error somewhere totally different, with a bounds error.

I didnt even load OffsetArrays. It was only installed as a dependency.

With the examples above, its easy to pass range instead of axes(range, 1) to base methods like Slice, and not get the error you should get. Or to accidentally use Int32 instead of Int64 to index.

These mistakes should be quickly uncovered in predictable ways.

@timholy
Copy link
Member

timholy commented Jul 12, 2022

from my other issue

For posterity: this references #306

OK, but FYI I happened to see this and not that. The more explicit you can be the better: "this bugs me" is not really actionable. Anyway, sorry it annoyed you and I see your point.

@rafaqz
Copy link

rafaqz commented Jul 12, 2022

Absolutely, apologies for not being clearer.

Part of the vagueness is how I experience the problems with type piracy on errors is quite abstract. "zeros should error" sounds like a nitpick compared to my perception of the scope of the problem. I will attempt to clarify.

Essentially I don't consider type piracy on Base methods/types in the list of possible problems when I'm debugging code, because its a large mental overhead to do that.

Generally we have consistent set of methods and types in scope we can only intentionally add to by importing packages. And we know:

  • If we use base types and methods the call stack is totally within Base.
  • If we use a package type and method together that the whole call stack is within that package, its dependencies, and Base.
  • When we are combining methods and types from unrelated packages, and to think: will this work? (especially in light of various OffsetArrays related correctness bugs)

When we hit errors, we can apply our understanding of these scopes to limit the potential causes of problems, and solve them more quickly. We can look at the stack trace, and know the methods and types that could have led to it, and scan for them in our code.

Type piracy on base methods and types breaks these assumptions. The possible scope of the call stack for any method call becomes all of the packages currently loaded and all of their dependency trees (realistically its the packages that are allowed to do type piracy, but I'm not across the history of that).

We also can't go in reverse from an error in an OffsetArrays.jl method to looking for a method or type from a package that may depend on OffsetArrays.jl. Instead we have to inspect every line of our code as a potential source of the error.

To me, getting errors in predictable ways is valuable. OffsetArrays doing type piracy isn't just about occasionally getting these strange errors, but about breaking an efficient approach to programming that ignores the possibility of these problems completely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants