Description
cs_sles_solve_ccc_fv allocates its extended solver buffers using CS_ALLOC_DEVICE (device-only memory). Those buffers are subsequently passed to the CPU-side convergence check, which dereferences them on the host and causes a SIGSEGV.
Setting CS_CUDA_ALLOC_DEVICE_UVM=1 before launching the solver works around the crash: it remaps cs_alloc_mode_device to CS_ALLOC_HOST_DEVICE_SHARED at initialisation time (see cs_base_cuda.cu:777), making the buffers accessible from both host and device.
Steps to reproduce
- Build code_saturne v9.1.0 with CUDA support.
- Run any case that exercises the CUDA sparse linear solver (cs_sles_solve_ccc_fv) without setting CS_CUDA_ALLOC_DEVICE_UVM.
- Observe SIGSEGV during the convergence check after the first solver call.
Workaround
export CS_CUDA_ALLOC_DEVICE_UVM=1
Set this before invoking the solver. It prevents the crash by ensuring solver buffers are in unified/shared memory.
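To illustrate why shared memory avoids the crash, here is a minimal standalone sketch. It assumes that CS_ALLOC_HOST_DEVICE_SHARED corresponds to CUDA managed memory (cudaMallocManaged); that mapping is an assumption, not something confirmed in this report. The kernel and the helper managed_sum are hypothetical names and are not part of code_saturne.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: write the value v into every entry of x on the device.
__global__ void fill(double *x, int n, double v)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    x[i] = v;
}

// Hypothetical helper: fill a managed buffer on the device, then sum it on
// the host. Returns -1.0 if any CUDA call fails.
double managed_sum(int n)
{
  double *x = nullptr;

  // Managed (unified) memory is addressable from both host and device.
  if (cudaMallocManaged(&x, n * sizeof(double)) != cudaSuccess)
    return -1.0;

  fill<<<(n + 127) / 128, 128>>>(x, n, 1.0);

  // Wait for the kernel to finish before touching the buffer on the host.
  if (cudaDeviceSynchronize() != cudaSuccess) {
    cudaFree(x);
    return -1.0;
  }

  // Host dereference: valid here, but it would fault on a cudaMalloc buffer.
  double sum = 0.0;
  for (int i = 0; i < n; i++)
    sum += x[i];

  cudaFree(x);
  return sum;
}

int main()
{
  printf("sum = %g\n", managed_sum(256));
  return 0;
}
```

Replacing cudaMallocManaged with cudaMalloc in this sketch would make the host loop read a device-only pointer, which is the same class of fault as the one described above.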
Expected behaviour
Either cs_sles_solve_ccc_fv should allocate the convergence-check buffers in host-accessible memory, or the convergence check should copy the data from device to host instead of dereferencing the device pointers directly on the host.
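The second option could look roughly like the sketch below. It is only an illustration of the device-to-host copy pattern using the CUDA runtime API, not the actual code_saturne convergence check. The names copy_to_host and converged, the buffer size, and the squared-norm test are all hypothetical.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copy a device-only buffer into a host vector so that
// CPU-side code can read it safely. h_buf must already have the right size.
static bool copy_to_host(const double *d_buf, std::vector<double> &h_buf)
{
  return cudaMemcpy(h_buf.data(), d_buf, h_buf.size() * sizeof(double),
                    cudaMemcpyDeviceToHost) == cudaSuccess;
}

// Hypothetical CPU-side test: the residual 2-norm must not exceed tol.
static bool converged(const std::vector<double> &r, double tol)
{
  double s = 0.0;
  for (double v : r)
    s += v * v;
  return s <= tol * tol;
}

int main()
{
  const size_t n = 4;
  double *d_res = nullptr;

  // Device-only allocation, analogous to a CS_ALLOC_DEVICE buffer.
  if (cudaMalloc(&d_res, n * sizeof(double)) != cudaSuccess)
    return 1;

  // All-zero bytes represent 0.0, so this sets a zero residual.
  cudaMemset(d_res, 0, n * sizeof(double));

  // Stage the data on the host instead of dereferencing d_res directly.
  std::vector<double> h_res(n);
  if (!copy_to_host(d_res, h_res)) {
    cudaFree(d_res);
    return 1;
  }

  printf("converged: %d\n", converged(h_res, 1e-8) ? 1 : 0);

  cudaFree(d_res);
  return 0;
}
```

In a real solver, reducing the residual on the device and copying back a single scalar would likely be cheaper than copying the whole buffer; the full copy is used here only to keep the sketch short.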
Environment
- code_saturne version: 9.1.0
- CUDA version: 13.1
- GPU architecture: sm_75
- OS: Ubuntu (x86_64)