@Corsair-cxs

Overview

This PR introduces two changes:

  1. A function call to check_invalid_values() within ggml_graph_compute_thread() to detect inf or NaN in tensor data.
  2. A modification in ggml_compute_forward_soft_max_f32() where, if inf is detected, it is forcibly converted to FLT_MAX.

These changes serve as a temporary workaround on the RK3588 (ARM64) platform to prevent garbled text output caused by inf values in the tensors. However, they only address the symptoms and may increase CPU usage.
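As a rough illustration of the detection step, a scan over a float buffer could look like the sketch below. The name has_invalid_values() and its signature are hypothetical and chosen for this example only; the actual check_invalid_values() in this PR operates on ggml tensors and may differ.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the PR's check_invalid_values()):
 * returns true if any of the n floats in data is +inf, -inf, or NaN. */
static bool has_invalid_values(const float *data, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (isinf(data[i]) || isnan(data[i])) {
            return true; /* a convenient line for a debugger breakpoint */
        }
    }
    return false;
}
```

A scan like this touches every element once, which is where the extra CPU cost of running it on each tensor comes from.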

Changes

  • ggml-cpu.c:

    • Added calls to check_invalid_values() to detect problematic data.
    • Modified ggml_compute_forward_soft_max_f32() to clamp inf to FLT_MAX.
  • debug_check.gdb:

    • Provides a GDB script that sets breakpoints in check_invalid_values().
    • Automatically prints the src0 structure and its first 128 floats whenever an inf is detected.

How to Reproduce and Debug

  1. Compile the project in Debug mode (on RK3588 or any ARM64 environment):
     cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Debug -DBUILD_SHARED_LIBS=OFF -DGGML_OPENCL=ON -DGGML_VULKAN_CHECK_RESULTS=ON
     ninja
  2. Edit debug_check.gdb to point to your DeepSeek R1-1.5B model file path.
  3. Run GDB:
     cd llama-cpp
     gdb -x debug_check.gdb ./build/bin/llama-cli
  4. Once the breakpoint is hit at return true;, inspect the data:
     p *src0
     p (*src0).data
     x/128f (*src0).data
  5. You’ll see inf values in the tensor.

Notes

  • This patch works around the immediate issue but doesn’t tackle the underlying reason why inf values appear in the first place.
  • CPU overhead is increased by the additional checks/clamping.

Related Issue

@github-actions (bot) added the ggml label (changes relating to the ggml tensor library for machine learning) on Mar 19, 2025
@Corsair-cxs changed the title from "fix rk3588 inf issue" to "fix rk3588 inf issue #12458" on Mar 19, 2025
@Corsair-cxs changed the title from "fix rk3588 inf issue #12458" to "[Issue #12458] Temporarily Clamp inf Values in ggml-cpu.c to Prevent Garbled Output(or coredump) on RK3588" on Mar 19, 2025