
FA bug causes Memory access fault by GPU #12238

@cb88

Description

Name and Version

./bin/llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 ROCm devices:
Device 0: AMD Radeon Graphics, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
Device 1: AMD Instinct MI60 / MI50, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
version: 4819 (becade5)
built with cc (GCC) 14.2.1 20250207 for x86_64-pc-linux-gnu

Operating systems

Linux

GGML backends

HIP

Hardware

AMD EPYC 7352 (48) @ 2.300GHz
GPU: AMD Radeon Instinct MI60 32GB
GPU: AMD Radeon Instinct MI50 32GB
Memory: 128669MiB

Models

llama-2-7b.Q4_0.gguf

Problem description & steps to reproduce

Run llama-bench with -fa 1 (flash attention enabled) on a gfx906 device; the benchmark aborts with a GPU memory access fault before producing any results (see log below).

First Bad Commit

Git bisected to becade5

Relevant log output

./bin/llama-bench  -m ~/Downloads/llama-2-7b.Q4_0.gguf  -fa 1 -sm none  -mg 0
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 ROCm devices:
  Device 0: AMD Radeon Graphics, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
  Device 1: AMD Instinct MI60 / MI50, gfx906:sramecc+:xnack- (0x906), VMM: no, Wave Size: 64
| model                          |       size |     params | backend    | ngl |    sm | fa |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ----: | -: | ------------: | -------------------: |
Memory access fault by GPU node-1 (Agent handle: 0x5cf36b0ef290) on address 0x7ba863e00000. Reason: Page not present or supervisor privilege.
