SIGSEGV in cs_sles_solve_ccc_fv when CS_CUDA_ALLOC_DEVICE_UVM is not set (v9.1.0) #164

@diskdog

Description

cs_sles_solve_ccc_fv allocates its extended solver buffers using CS_ALLOC_DEVICE (device-only memory). Those buffers are subsequently passed to the CPU-side convergence check, which dereferences them on the host and causes a SIGSEGV.

Setting CS_CUDA_ALLOC_DEVICE_UVM=1 before launching the solver works around the crash: it remaps cs_alloc_mode_device to CS_ALLOC_HOST_DEVICE_SHARED at initialisation time (see cs_base_cuda.cu:777), making the buffers accessible from both host and device.
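The remapping described above can be pictured with a small host-only sketch. This is not the code at cs_base_cuda.cu:777: the enum `alloc_mode_t`, the function `select_device_alloc_mode`, and the rule that any positive integer value enables the remap are all assumptions made for illustration.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the allocation modes named in this report.
enum alloc_mode_t {
  ALLOC_HOST,                // host-only memory
  ALLOC_HOST_DEVICE_SHARED,  // unified memory, valid on host and device
  ALLOC_DEVICE               // device-only, must not be dereferenced on the host
};

// Choose the mode used for "device" allocations from the raw value of the
// environment variable (nullptr when the variable is unset).
// Assumption: a value that parses to a positive integer enables the remap.
alloc_mode_t select_device_alloc_mode(const char *uvm_env)
{
  if (uvm_env != nullptr && std::atoi(uvm_env) > 0)
    return ALLOC_HOST_DEVICE_SHARED;
  return ALLOC_DEVICE;
}
```

Calling `select_device_alloc_mode(std::getenv("CS_CUDA_ALLOC_DEVICE_UVM"))` once at start-up would model the behaviour reported here: with the variable unset the buffers stay device-only, and with it set to 1 they become host-accessible.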

Steps to reproduce

  1. Build code_saturne v9.1.0 with CUDA support.
  2. Run any case that exercises the CUDA sparse linear solver (cs_sles_solve_ccc_fv) without setting CS_CUDA_ALLOC_DEVICE_UVM.
  3. Observe a SIGSEGV during the convergence check after the first solver call.

Workaround

export CS_CUDA_ALLOC_DEVICE_UVM=1

Set this before invoking the solver. It prevents the crash by ensuring solver buffers are in unified/shared memory.

Expected behaviour

Either cs_sles_solve_ccc_fv should allocate the buffers read by the convergence check in host-accessible memory, or the convergence check should copy the values it needs from device to host instead of dereferencing the device pointers directly.

Environment

  • code_saturne version: 9.1.0
  • CUDA version: 13.1
  • GPU architecture: sm_75
  • OS: Ubuntu (x86_64)
