Inference fails on Windows with non-AVX CPU (Intel N6000) #305

@goney3

Hello,

I am trying to run the BitNet model on a Windows 11 machine with an Intel N6000 CPU, which does not support AVX or AVX2. The installation completes, but inference outputs a single repeated character (e.g., "GGGGGG...").
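For anyone trying to reproduce this, the symptom can be checked automatically rather than by eye. The following is only a minimal sketch: the helper names and the 0.9 threshold are hypothetical and not part of BitNet or llama.cpp.

```python
from collections import Counter


def dominant_char_ratio(text: str) -> float:
    """Share of non-whitespace characters taken up by the most common one."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    most_common_count = Counter(chars).most_common(1)[0][1]
    return most_common_count / len(chars)


def looks_degenerate(text: str, threshold: float = 0.9) -> bool:
    """Flag output that is almost entirely one character, like "GGGGGG...".

    The threshold is an arbitrary illustrative cutoff (an assumption).
    """
    return dominant_char_ratio(text) >= threshold
```

Passing the generated text to looks_degenerate() should return True for the broken output and False for ordinary prose.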

Key Findings:

  • This behavior is reproducible on my Intel N6000 machine.
  • I can successfully compile and run the same model on a Raspberry Pi 4 B, which shows that AVX is not a fundamental requirement of the model's logic. This suggests the bug is specific to the Windows x86 non-AVX build.

Steps to Reproduce:

  1. On a Windows machine with a non-AVX CPU (e.g., Intel N6000), follow the standard installation instructions.
  2. During the build process, a compilation error occurs in 3rdparty/llama.cpp/common/common.cpp due to a missing header. Adding #include <chrono> fixes this initial error.
  3. The project then compiles successfully.
  4. Running inference with a command like python run_inference.py -m models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf -p "Once upon a time" prints the same character over and over instead of a continuation of the prompt.
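The header workaround from step 2 can also be applied with a small script instead of editing the file by hand. This is a rough sketch under assumptions: ensure_include is a hypothetical helper, and placing the new directive right after the first existing #include line is a simple choice of mine, not something the build requires.

```python
def ensure_include(source: str, header: str = "chrono") -> str:
    """Return the C++ source text with `#include <header>` added if it is missing."""
    directive = f"#include <{header}>"
    lines = source.splitlines()

    # Nothing to do if the directive is already present.
    if any(line.strip() == directive for line in lines):
        return source

    # Find the first existing #include line; fall back to the top of the file.
    insert_at = 0
    for i, line in enumerate(lines):
        if line.lstrip().startswith("#include"):
            insert_at = i + 1
            break

    lines.insert(insert_at, directive)
    return "\n".join(lines) + "\n"
```

To use it, read 3rdparty/llama.cpp/common/common.cpp as text, pass the contents through ensure_include(), and write the result back. Running it a second time leaves the file unchanged.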

What I've Tried:

  • Compiling with the default settings.
  • Forcing a build with -DLLAMA_SSE4_2=ON.
  • Forcing a generic build with no flags.

All of these configurations compile successfully but produce the same incorrect inference output. The system_info log confirms that AVX is disabled.
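To compare the reported CPU features across the three builds, the system_info line can be parsed into a dictionary. This sketch assumes the line lists features as upper-case NAME = 0 or NAME = 1 pairs separated by "|"; parse_feature_flags is a hypothetical helper, not a llama.cpp or BitNet API.

```python
import re


def parse_feature_flags(system_info_line: str) -> dict[str, bool]:
    """Extract upper-case `NAME = 0/1` pairs from a system_info log line.

    Assumes feature names are upper-case (e.g. AVX, AVX2); lower-case
    fields such as thread counts are ignored.
    """
    pattern = r"\b([A-Z][A-Z0-9_]*)\s*=\s*([01])\b"
    return {name: value == "1" for name, value in re.findall(pattern, system_info_line)}
```

For the builds described here, the parsed result should show AVX and AVX2 as False. Diffing the dictionaries from each configuration would show whether the build flags changed anything at all.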

This seems to be a bug in the x86 fallback code path when compiled with the Windows toolchain.
