Fix flash_attention_test link error on multi-target SIMD builds #917

Open
plawanrath wants to merge 1 commit into google:dev from plawanrath:fix/flash-attention-test-link-error
@plawanrath

Summary

  • Build libgemma with HWY_IS_TEST=1 when GEMMA_ENABLE_TESTS=ON so its foreach_target.h translation units (flash_attention.cc, attention.cc, gemma.cc, vit.cc) emit symbols for every attainable SIMD target — including EMU128, which flash_attention_test references.
  • Add find_package(GTest REQUIRED) inside the test block. The CMakeLists already uses GTest::Main, but the target was only being provided transitively by Highway's vendored gtest. Disabling HWY_ENABLE_TESTS (required to avoid test-target name collisions with Highway's own dot_test/image_test) makes that transitive target disappear; an explicit find_package keeps tests buildable in that configuration.
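
The two changes above could look roughly like the following in CMakeLists.txt. This is a sketch, not the actual diff: the target name libgemma and the option GEMMA_ENABLE_TESTS come from the description, while the PRIVATE scope and the placement inside a single test block are assumptions.

```cmake
# Sketch only. Assumes the library target is named `libgemma` and that the
# existing test block is guarded by GEMMA_ENABLE_TESTS.
if (GEMMA_ENABLE_TESTS)
  # Compile the library's foreach_target.h translation units for every
  # attainable SIMD target (including EMU128), matching what the tests
  # reference. PRIVATE is an assumption; the PR may use a different scope.
  target_compile_definitions(libgemma PRIVATE HWY_IS_TEST=1)

  # Provide GTest::Main explicitly rather than relying on the target that
  # Highway's vendored gtest supplies only when HWY_ENABLE_TESTS is ON.
  find_package(GTest REQUIRED)
endif()
```

One consequence of this approach is that a tests-enabled build compiles more per-target code into libgemma than a default build does, so build time and library size may grow when GEMMA_ENABLE_TESTS=ON.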

Root cause

flash_attention_test.cc includes hwy/foreach_target.h and is compiled with -DHWY_IS_TEST=1, so it expands once per attainable SIMD target and references per-target symbols (gcpp::N_EMU128::FlashAttention, gcpp::N_EMU128::DotSoftmaxWeightedSum, gcpp::N_EMU128::GetVTileSize). Those symbols are defined in libgemma's flash_attention.cc and attention.cc, which were built without HWY_IS_TEST=1, so Highway emitted only the baseline and best targets for them (no EMU128). The test therefore references symbols the library never defines, and the link fails with "Undefined symbols for architecture arm64: gcpp::N_EMU128::FlashAttention ...".

Test plan

  • cmake -B build -DGEMMA_ENABLE_TESTS=ON -DHWY_ENABLE_TESTS=OFF configures cleanly
  • cmake --build build --target flash_attention_test links
  • ./build/flash_attention_test passes 3/3 cases across NEON_BF16, NEON_WITHOUT_AES, and EMU128
  • No regressions in tensor_info_test, blob_store_test, fields_test, compress_test, sfp_test, nuq_test, distortion_test, ops_test, matmul_test, image_test, basics_test, gemma_batch_bench
  • ./build/gemma still builds and runs

🤖 Generated with Claude Code

flash_attention_test.cc uses hwy/foreach_target.h and is compiled with
HWY_IS_TEST=1, expanding the test TU for every attainable SIMD target
(including EMU128). It references per-target symbols like
gcpp::N_EMU128::FlashAttention, gcpp::N_EMU128::DotSoftmaxWeightedSum,
and gcpp::N_EMU128::GetVTileSize, defined in libgemma's flash_attention.cc
and attention.cc.

libgemma is built without HWY_IS_TEST=1, so its foreach_target sources
emit only the baseline+best targets, not EMU128. Linking fails with
"Undefined symbols for architecture arm64: gcpp::N_EMU128::FlashAttention".

When GEMMA_ENABLE_TESTS is enabled, build libgemma with HWY_IS_TEST=1 so
its foreach_target sources emit all attainable targets, matching what the
tests reference. Also add an explicit find_package(GTest REQUIRED): the
CMakeLists references GTest::Main but the target only existed transitively
via Highway's vendored gtest, which is unavailable when HWY_ENABLE_TESTS
is disabled (required to avoid test-target name collisions with Highway's
dot_test/image_test).

Verified flash_attention_test now passes 3/3 cases across NEON_BF16,
NEON_WITHOUT_AES, and EMU128 targets on arm64. No regressions in
tensor_info_test, blob_store_test, fields_test, compress_test, sfp_test,
nuq_test, distortion_test, ops_test, matmul_test, image_test, or
basics_test.