Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pointer fallback for AbstrctArrays broken on master when using old idiom #51962

Open
chriselrod opened this issue Oct 31, 2023 · 9 comments
Open
Labels
domain:arrays [a, r, r, a, y, s]

Comments

@chriselrod
Copy link
Contributor

chriselrod commented Oct 31, 2023

It was common to define

Base.unsafe_convert(::Ptr{T}, A::MyArray) = unsafe_convert(Ptr{T}, parent(A))

e.g.
https://github.com/JuliaArrays/OffsetArrays.jl/blob/a2e68178c5a3233f3d1a7aba39f2a659509574f8/src/OffsetArrays.jl#L464
but the change in behavior of unsafe_convert breaks both old methods of getting a pointer:
pointer and unsafe_convert.

We used to have

pointer(x::AbstractArray{T}) where {T} = unsafe_convert(Ptr{T}, x)

in Base, but it is now implemented as

pointer(x::AbstractArray{T}) where {T} = unsafe_convert(Ptr{T}, cconvert(Ptr{T}, x))

so it seems like unsafe_convert was considered internal and isn't a supported means of getting pointers.

What is the recommended solution?

This causes breakages such as FluxML/NNlib.jl#540 (comment)
I can update packages that called unsafe_convert as an end user, such as LayoutPointers.jl and TriangularSolve.jl, but updating all the packages defining their own pointer methods in terms of unsafe_convert could be a taller order.

@chriselrod chriselrod changed the title pointer fallback for AbstrctArrays broken pointer fallback for AbstrctArrays broken on master when using old idiom Oct 31, 2023
@vtjnash
Copy link
Sponsor Member

vtjnash commented Oct 31, 2023

Calling pointer alone may segfault as it provides no way to insert a GC.@preserve. It is generally best to implement cconvert (since it is not unsafe), and generally required to call both functions as well (mimicking the behavior defined for ccall)

@chriselrod
Copy link
Contributor Author

LayoutPointers.jl uses stridedpointer_preserve that returns a 2-tuple of the pointer and the object we want to preserve.
We in general do not want to pass the object to preserve around as a means of (a) cutting down on excessive monomorphization and (b) avoiding heap allocations caused by escape analysis failure.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Oct 31, 2023

Yeah, I feel like we should have a special type in Base that just exists to have that cconvert / pair / unwrap semantics, if just to avoid the punning on unsafe_converting a generic Tuple{Ptr, Any} to Ptr

I don't know why LayoutPointers.jl seems to invent its own scheme for this. The standard way to define this is that memory_reference[2] is called cconvert and memory_reference[1] is called unsafe_convert. They are separated that way so that there is not a heap allocation to hold both together at once. I see also one call to pointer_from_objref on Ref which is undefined what that returns. It should use unsafe_convert.

@chriselrod
Copy link
Contributor Author

I don't know why LayoutPointers.jl seems to invent its own scheme for this.

I did not know about any pre-existing schemes.

I see also one call to pointer_from_objref on Ref which is undefined what that returns. It should use unsafe_convert.

Thanks, I didn't know that. JuliaSIMD/LayoutPointers.jl@8be20b7

@chriselrod
Copy link
Contributor Author

chriselrod commented Oct 31, 2023

Is this undefined behavior? RefValue is also a mutable struct.
https://github.com/JuliaArrays/StaticArrays.jl/blob/72d2bd3538235c9162f630a5130112b83eaa0af7/src/MArray.jl#L25-L35
I don't know at what level this behavior is undefined, e.g. that Ref's implementation is subject to change, or what pointer_to_objref does may change (in which case the MArray code is also undefined).

@vtjnash
Copy link
Sponsor Member

vtjnash commented Oct 31, 2023

Both Ref's/RefValue's implementation has changed sometimes, and pointer_to_objref may change. But unsafe_convert is defined for Ref to point to the (possibly immutable) data of the contents when T is concrete, otherwise to a boxed value (essentially to a jl_value_t*, which is valid to load from but Undefined behavior to store to that pointer). It is also Undefined behavior what happens if there are any unsafe_store calls to that pointer, except when the element type of Ref{T} is isbitstype(T), in which case it is defined to agree with C.

@jishnub
Copy link
Contributor

jishnub commented Nov 16, 2023

What are the recommended methods to be added here to resolve the issue in OffsetArrays?

@mkitti
Copy link
Contributor

mkitti commented Mar 23, 2024

HDF5.jl is also encountering an issue with nightly.

On Julia 1.10.2, this works:

julia> Base.unsafe_convert(Ptr{UInt8}, reinterpret(UInt8, [true; false; false]))
Ptr{UInt8} @0x000070bfdb9e41f8

This no longer works on master:

julia> Base.unsafe_convert(Ptr{UInt8}, reinterpret(UInt8, [true; false; false]))
ERROR: conversion to pointer not defined for Base.ReinterpretArray{UInt8, 1, Bool, Vector{Bool}, false}

The error occurs when passing a reinterpreted array to ccall.

julia> fn = "test_2024_03_23.h5"
"test_2024_03_23.h5"

julia> h5open(fn, "w") do f
           data = reinterpret(UInt8, [true, false, false])
           write(f, "reinterpret array", data)
       end
julia> fn = "test_2024_03_23.h5"
"test_2024_03_23.h5"

julia> h5open(fn, "w") do f
           data = reinterpret(UInt8, [true, false, false])
           write(f, "reinterpret array", data)
       end
ERROR: conversion to pointer not defined for Base.ReinterpretArray{UInt8, 1, Bool, Vector{Bool}, false}
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] unsafe_convert(::Type{Ptr{UInt8}}, a::Base.ReinterpretArray{UInt8, 1, Bool, Vector{Bool}, false})
    @ Base ./pointer.jl:68
  [3] unsafe_convert(::Type{Ptr{Nothing}}, a::Base.ReinterpretArray{UInt8, 1, Bool, Vector{Bool}, false})
    @ Base ./pointer.jl:66
  [4] h5d_write(dataset_id::HDF5.Dataset, mem_type_id::HDF5.Datatype, mem_space_id::Int64, file_space_id::Int64, xfer_plist_id::HDF5.DatasetTransferProperties, buf::Base.ReinterpretArray{…})
    @ HDF5.API ~/.julia/dev/HDF5/src/api/functions.jl:908
  [5] write_dataset(dataset::HDF5.Dataset, memtype::HDF5.Datatype, buf::Base.ReinterpretArray{…}, xfer::HDF5.DatasetTransferProperties)
    @ HDF5 ~/.julia/dev/HDF5/src/datasets.jl:499

mkitti added a commit to mkitti/HDF5.jl that referenced this issue Mar 23, 2024
mkitti added a commit to JuliaIO/HDF5.jl that referenced this issue Mar 26, 2024
* Add Julia 1.11 nightly to tests

* Fix reinterpret arrays due to JuliaLang/julia#51962

* Mark Windows virtual dataset tests as broken
mkitti added a commit to JuliaIO/HDF5.jl that referenced this issue Apr 5, 2024
* Add Julia 1.11 nightly to tests

* Fix reinterpret arrays due to JuliaLang/julia#51962

* Mark Windows virtual dataset tests as broken
@nsajko nsajko added the domain:arrays [a, r, r, a, y, s] label May 1, 2024
@nsajko
Copy link
Contributor

nsajko commented May 1, 2024

FTR the Interfaces page in the Manual documents unsafe_convert as the function to overload for strided arrays. cconvert is not mentioned:

https://docs.julialang.org/en/v1/manual/interfaces/

nsajko added a commit to JuliaArrays/FixedSizeArrays.jl that referenced this issue May 2, 2024
JuliaLang/julia#51962 (comment)

> It is generally best to implement cconvert (since it is not unsafe),
> and generally required to call both functions as well (mimicking the
> behavior defined for ccall)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:arrays [a, r, r, a, y, s]
Projects
None yet
Development

No branches or pull requests

5 participants