CuDeviceTexture getindex breaks when executed on the CPU #1757

Closed
zsoerenm opened this issue Feb 5, 2023 · 3 comments
Labels: bug, needs information

zsoerenm commented Feb 5, 2023

Describe the bug

I get a StackOverflowError when a custom struct that holds a CuTexture (made adaptable with Adapt) is passed through cudaconvert and the result is printed to the REPL.

I tested CUDA v3.5 and v4.0.1

To reproduce

The Minimal Working Example (MWE) for this bug:

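using CUDA, Adapt

# Minimal wrapper struct holding a texture
struct MyStruct{T}
    code::T
end

# Host-side texture wrapped in the custom struct
mystruct = MyStruct(
    CuTexture(
        CuTextureArray(CuArray(randn(Float32, 4000, 30))),
        address_mode = CUDA.ADDRESS_MODE_WRAP,
        interpolation = CUDA.NearestNeighbour()
    )
)

# Let Adapt recurse into MyStruct so that cudaconvert reaches the texture
Adapt.@adapt_structure MyStruct

A = cudaconvert(mystruct); # works fine with semicolon
show(A); # crashes: displays the CuDeviceTexture on the CPU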

Error:

MyStruct{CuDeviceTexture{Float32, 2, CUDA.ArrayMemory, false, CUDA.NearestNeighbour}}(Float32[Internal error: encountered unexpected error during compilation of LLVMException:
StackOverflowError()

signal (11): Segmentation fault
in expression starting at none:0

signal (6): Aborted
in expression starting at none:0
Aborted (core dumped)

Indexing the converted texture on the CPU throws a StackOverflowError as well:

julia> A.code[1]
ERROR: StackOverflowError:
Stacktrace:
 [1] getindex(t::CuDeviceTexture{Float32, 2, CUDA.ArrayMemory, false, CUDA.NearestNeighbour}, idx::Float32) (repeats 79984 times)
   @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/texture.jl:87

Manifest

[[deps.Adapt]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "0310e08cb19f5da31d08341c6120c047598f5b9c"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "3.5.0"

[[deps.CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "CompilerSupportLibraries_jll", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "Preferences", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "SpecialFunctions"]
git-tree-sha1 = "edff14c60784c8f7191a62a23b15a421185bc8a8"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "4.0.1"

[[deps.CUDA_Driver_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg"]
git-tree-sha1 = "75d7896d1ec079ef10d3aee8f3668c11354c03a1"
uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc"
version = "0.2.0+0"

[[deps.CUDA_Runtime_Discovery]]
deps = ["Libdl"]
git-tree-sha1 = "58dd8ec29f54f08c04b052d2c2fa6760b4f4b3a4"
uuid = "1af6417a-86b4-443c-805f-a4643ffb695f"
version = "0.1.1"

[[deps.CUDA_Runtime_jll]]
deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg", "TOML"]
git-tree-sha1 = "d3e6ccd30f84936c1a3a53d622d85d7d3f9b9486"
uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
version = "0.2.3+2"

[[deps.GPUArrays]]
deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"]
git-tree-sha1 = "4dfaff044eb2ce11a897fecd85538310e60b91e6"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "8.6.2"

[[deps.GPUArraysCore]]
deps = ["Adapt"]
git-tree-sha1 = "57f7cde02d7a53c9d1d28443b9f11ac5fbe7ebc9"
uuid = "46192b85-c4d5-4398-a991-12ede77f4527"
version = "0.1.3"

[[deps.GPUCompiler]]
deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "48832a7cacbe56e591a7bef690c78b9d00bcc692"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.17.1"

[[deps.LLVM]]
deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"]
git-tree-sha1 = "b8ae281340f0d3e973aae7b96fb7502b0119b376"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "4.15.0"

[[deps.LLVMExtra_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg", "TOML"]
git-tree-sha1 = "771bfe376249626d3ca12bcd58ba243d3f961576"
uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab"
version = "0.0.16+0"

Version info

Details on Julia:

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, sandybridge)
  Threads: 1 on 4 virtual cores

Details on CUDA:

julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.0
NVIDIA driver 525.85.5

Libraries: 
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 12.0.0+525.85.5

Toolchain:
- Julia: 1.8.3
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce GTX 950 (sm_52, 1.327 GiB / 2.000 GiB available)
zsoerenm added the bug label Feb 5, 2023

maleadt (Member) commented Feb 6, 2023

MyStruct{CuDeviceTexture{...}}

Device structures are not intended to be used on the CPU. Their functionality is strictly meant for use on the GPU, and calling functions (like getindex here) may just crash your session. We could make each of those definitions a @device_function, but that's a bit of a bother (and breaks tools like Revise.jl).

Why are you attempting to use this object on the CPU? You generally shouldn't ever need to call cudaconvert.
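
For reference, a minimal sketch of the intended usage: the host-side CuTexture is passed directly to a kernel launched with @cuda, which performs the device conversion itself, so cudaconvert never has to be called by hand. The kernel name copy_texture_kernel and the 16×16 launch configuration are made up for illustration, and indexing the texture with Float32 coordinates is an assumption based on the getindex signature in the stack trace above.

```julia
using CUDA

# Hypothetical kernel: each thread reads one texel and writes it to a plain device array.
function copy_texture_kernel(dst, tex)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    j = (blockIdx().y - 1) * blockDim().y + threadIdx().y
    if i <= size(dst, 1) && j <= size(dst, 2)
        # Device-side getindex; only valid inside a kernel.
        @inbounds dst[i, j] = tex[Float32(i), Float32(j)]
    end
    return nothing
end

data = CuArray(randn(Float32, 4000, 30))
tex = CuTexture(
    CuTextureArray(data),
    address_mode = CUDA.ADDRESS_MODE_WRAP,
    interpolation = CUDA.NearestNeighbour()
)
dst = similar(data)

# @cuda converts its arguments for the device; no manual cudaconvert needed.
threads = (16, 16)
blocks = cld.(size(dst), threads)
@cuda threads=threads blocks=blocks copy_texture_kernel(dst, tex)

# Copy the result back and inspect it on the host.
host_result = Array(dst)
```

Anything that needs inspecting is copied back with Array(dst), so the CuDeviceTexture itself is never displayed or indexed on the CPU.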

maleadt added the needs information label Feb 6, 2023
maleadt changed the title from "Stackoverflow when using custom struct with adapt and CuTexture while printing" to "CuDeviceTexture getindex breaks when executed on the CPU" Feb 6, 2023

maleadt (Member) commented Feb 6, 2023

This also reproduces without Adapt:

julia> using CUDA

julia> tex = CuTexture(
           CuTextureArray(CuArray(randn(Float32, 4000, 30))),
           address_mode = CUDA.ADDRESS_MODE_WRAP,
           interpolation = CUDA.NearestNeighbour()
       )
4000×30 1-channel CuTexture(::CuTextureArray) with eltype Float32

julia> cudaconvert(tex)
4000×30 CuDeviceTexture{Float32, 2, CUDA.ArrayMemory, false, CUDA.NearestNeighbour}:
Internal error: encountered unexpected error during compilation of LLVMException:
StackOverflowError()

signal (11): Segmentation fault
in expression starting at none:0

signal (6): Aborted
in expression starting at none:0

zsoerenm (Author) commented Feb 6, 2023

Okay, my bad. I wanted to print it for debugging purposes, but I guess it's not going to work this way. Thanks for clarifying.
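
A possible debugging sketch that avoids the crash: print the host-side struct, and for the converted object look only at its type, which involves no getindex call. This assumes that show on the host-side CuTexture works when nested inside a struct, as it does at the REPL earlier in this thread; MyStruct is the struct from the original report.

```julia
using CUDA, Adapt

struct MyStruct{T}
    code::T
end
Adapt.@adapt_structure MyStruct

mystruct = MyStruct(CuTexture(CuTextureArray(CuArray(randn(Float32, 4000, 30)))))

# Host-side object: printing should be safe.
show(stdout, mystruct)
println()

# Device-side counterpart: inspect the type only; do not display or index the value.
A = cudaconvert(mystruct);
println(typeof(A))
```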

zsoerenm closed this as completed Feb 6, 2023