Fix flash_attention_test link error on multi-target SIMD builds #917
Open
plawanrath wants to merge 1 commit into
flash_attention_test.cc uses hwy/foreach_target.h and is compiled with HWY_IS_TEST=1, expanding the test TU for every attainable SIMD target (including EMU128). It references per-target symbols like gcpp::N_EMU128::FlashAttention, gcpp::N_EMU128::DotSoftmaxWeightedSum, and gcpp::N_EMU128::GetVTileSize, defined in libgemma's flash_attention.cc and attention.cc. libgemma is built without HWY_IS_TEST=1, so its foreach_target sources emit only the baseline+best targets, not EMU128. Linking fails with "Undefined symbols for architecture arm64: gcpp::N_EMU128::FlashAttention".

When GEMMA_ENABLE_TESTS is enabled, build libgemma with HWY_IS_TEST=1 so its foreach_target sources emit all attainable targets, matching what the tests reference.

Also add an explicit find_package(GTest REQUIRED): the CMakeLists references GTest::Main, but the target only existed transitively via Highway's vendored gtest, which is unavailable when HWY_ENABLE_TESTS is disabled (required to avoid test-target name collisions with Highway's dot_test/image_test).

Verified flash_attention_test now passes 3/3 cases across NEON_BF16, NEON_WITHOUT_AES, and EMU128 targets on arm64. No regressions in tensor_info_test, blob_store_test, fields_test, compress_test, sfp_test, nuq_test, distortion_test, ops_test, matmul_test, image_test, or basics_test.
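The two changes described above could look roughly like the following CMake fragment. This is a minimal sketch, not the actual diff: it assumes the library target is literally named `libgemma`, that the definition is added with `PRIVATE` visibility, and that both statements sit inside an existing `if(GEMMA_ENABLE_TESTS)` block. The real patch may place or scope them differently.

```cmake
if(GEMMA_ENABLE_TESTS)
  # Assumption: the library target is named `libgemma`.
  # Defining HWY_IS_TEST=1 makes the library's foreach_target.h sources
  # compile for every attainable SIMD target (including EMU128), so the
  # per-target symbols the tests reference exist at link time.
  target_compile_definitions(libgemma PRIVATE HWY_IS_TEST=1)

  # Look up GoogleTest explicitly rather than relying on Highway's vendored
  # copy, which is not available when HWY_ENABLE_TESTS is OFF.
  find_package(GTest REQUIRED)
endif()
```

Scoping the definition to test-enabled builds keeps non-test builds of the library limited to the baseline+best targets, as before.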
Summary
- Build `libgemma` with `HWY_IS_TEST=1` when `GEMMA_ENABLE_TESTS=ON` so its `foreach_target.h` translation units (`flash_attention.cc`, `attention.cc`, `gemma.cc`, `vit.cc`) emit symbols for every attainable SIMD target, including `EMU128`, which `flash_attention_test` references.
- Add `find_package(GTest REQUIRED)` inside the test block. The CMakeLists already uses `GTest::Main`, but the target was only being provided transitively by Highway's vendored gtest. Disabling `HWY_ENABLE_TESTS` (required to avoid test-target name collisions with Highway's own `dot_test`/`image_test`) makes that transitive target disappear; an explicit `find_package` keeps tests buildable in that configuration.

Root cause

`flash_attention_test.cc` includes `hwy/foreach_target.h` and is compiled with `-DHWY_IS_TEST=1`, so it expands once per attainable SIMD target and references per-target symbols (`gcpp::N_EMU128::FlashAttention`, `gcpp::N_EMU128::DotSoftmaxWeightedSum`, `gcpp::N_EMU128::GetVTileSize`). Those symbols live in `libgemma`'s `flash_attention.cc`/`attention.cc`, which were built without `HWY_IS_TEST=1`, so Highway only emitted baseline+best targets (no `EMU128`). Result: `Undefined symbols for architecture arm64: gcpp::N_EMU128::FlashAttention ...`.

Test plan

- `cmake -B build -DGEMMA_ENABLE_TESTS=ON -DHWY_ENABLE_TESTS=OFF` configures cleanly
- `cmake --build build --target flash_attention_test` links
- `./build/flash_attention_test` passes 3/3 cases across `NEON_BF16`, `NEON_WITHOUT_AES`, and `EMU128`
- No regressions in `tensor_info_test`, `blob_store_test`, `fields_test`, `compress_test`, `sfp_test`, `nuq_test`, `distortion_test`, `ops_test`, `matmul_test`, `image_test`, `basics_test`, `gemma_batch_bench`
- `./build/gemma` still builds and runs

🤖 Generated with Claude Code