CUDA Crash with julia 1.9.0 #91

Closed
LaurentPlagne opened this issue May 15, 2023 · 8 comments

@LaurentPlagne

Hi,
I tried to run the acoustic3D.jl miniapp using CUDA (USE_GPU = true).
Everything works with Julia 1.8.5, but it crashes with Julia 1.9:

julia> include("acoustic3D.jl")
┌ Warning: ParallelStencil has already been initialized, with the same arguments. If you are using ParallelStencil interactively in the REPL, then you can ignore this message. If you are using ParallelStencil non-interactively, then you are likely using ParallelStencil in an inconsistent way: @init_parallel_stencil should only be called once, right after 'using ParallelStencil'.
└ @ ParallelStencil ~/.julia/packages/ParallelStencil/fQa5L/src/init_parallel_stencil.jl:73
┌ Warning: Module Data from previous module initialization found in caller module (Main); module Data not created. If you are working interactively in the REPL, then you can ignore this message.
└ @ ParallelStencil.ParallelKernel ~/.julia/packages/ParallelStencil/fQa5L/src/ParallelKernel/init_parallel_kernel.jl:33
ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
  [1] throw_api_error(res::CUDA.cudaError_enum)
    @ CUDA ~/.julia/packages/CUDA/BbliS/lib/cudadrv/error.jl:89
  [2] macro expansion
    @ ~/.julia/packages/CUDA/BbliS/lib/cudadrv/error.jl:97 [inlined]
  [3] cuMemAllocAsync(dptr::Base.RefValue{CUDA.CuPtr{Nothing}}, bytesize::Int64, hStream::CUDA.CuStream)
    @ CUDA ~/.julia/packages/CUDA/BbliS/lib/utils/call.jl:26
  [4] #alloc#1

...

(test_pstencil) pkg> status
Status `~/temp/test_pstencil/Project.toml`
⌅ [052768ef] CUDA v3.13.1
  [94395366] ParallelStencil v0.6.1
  [91a5bcdd] Plots v1.38.11
Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated`

julia> versioninfo()
Julia Version 1.9.0
Commit 8e630552924 (2023-05-07 11:25 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × 13th Gen Intel(R) Core(TM) i9-13900K
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, goldmont)
  Threads: 8 on 32 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8
  JULIA_IMAGE_THREADS = 1

julia> using CUDA

julia> CUDA.versioninfo()
CUDA toolkit 11.7, artifact installation
NVIDIA driver 530.41.3, for CUDA 12.1
CUDA driver 12.1

Libraries: 
- CUBLAS: 11.10.1
- CURAND: 10.2.10
- CUFFT: 10.7.2
- CUSOLVER: 11.3.5
- CUSPARSE: 11.7.3
- CUPTI: 17.0.0
- NVML: 12.0.0+530.41.3
- CUDNN: 8.30.2 (for CUDA 11.5.0)
- CUTENSOR: 1.4.0 (for CUDA 11.5.0)

Toolchain:
- Julia: 1.9.0
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce RTX 4070 (sm_89, 9.942 GiB / 11.994 GiB available)

julia> Unhandled Task ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
@omlins
Owner

omlins commented May 15, 2023

@LaurentPlagne: thanks for reporting this; we will let you know when we have this fixed.

@LaurentPlagne
Author

Thank you for your prompt answer.
I wonder whether the constraint on the CUDA version (imposed by ParallelStencil.jl?) could be causing the problem.
It works fine with Julia 1.8.5, so it is not a blocking issue for me.

@omlins
Owner

omlins commented May 16, 2023

I wonder whether the constraint on the CUDA version (imposed by ParallelStencil.jl?) could be causing the problem.

Hopefully it is just this! We will soon release a version of ParallelStencil that is compatible with the latest CUDA version. Julia 1.9 was released earlier than I expected. In recent years, the release has come around the time of the conference, so it caught us by surprise and we are not ready yet.

@omlins
Owner

omlins commented Jun 8, 2023

@LaurentPlagne: can you test whether you still get the error with the main branch (] add ParallelStencil#main)?
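
For reference, the same step can also be run through the Pkg API instead of the Pkg REPL mode. This is only a minimal sketch, assuming network access and that the active project is the test environment from the report; it is not taken from the ParallelStencil documentation.

```julia
using Pkg

# Track the main branch of ParallelStencil instead of the registered release
# (the Pkg API counterpart of `] add ParallelStencil#main`).
Pkg.add(name="ParallelStencil", rev="main")

# List packages that are held back by compatibility constraints
# (the Pkg API counterpart of `status --outdated`; requires Julia 1.8 or later).
Pkg.status(outdated=true)
```

After this, re-running include("acoustic3D.jl") in a fresh session shows whether the crash persists.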

@LaurentPlagne
Author

It works with Julia 1.9.0, but I had to manually install CellArrays.jl and StaticArrays.jl.
I can't run the acoustic_waves_multixpu miniapp. It seems that ImplicitGlobalGrid.jl prevents the use of recent CUDA.jl versions...

@luraess
Collaborator

luraess commented Jun 15, 2023

We are in the process of updating IGG to run with the latest MPI.jl and to support GPU-aware operation with the AMDGPU.jl backend.

@omlins
Owner

omlins commented Jun 15, 2023

Fixed in #81.

@omlins omlins closed this as completed Jun 15, 2023
@omlins
Owner

omlins commented Jun 15, 2023

@LaurentPlagne: thanks for testing!

It works with Julia 1.9.0, but I had to manually install CellArrays.jl and StaticArrays.jl.

Solved here: Remove need to have any packaged pre installed #95
