build: enable hardware-accelerated BLAKE3 in cas_core #2
Merged
These five files are copied verbatim from the upstream BLAKE3 1.8.4 distribution. They are not yet compiled or wired into CMake; that will happen in a subsequent commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the AGENTVFS_BLAKE3_SIMD CMake option (default ON) and wires the vendored SIMD C sources into cas_core with per-file -msse/-mavx flags scoped to those translation units only. Drops the BLAKE3_NO_AVX512 / BLAKE3_NO_AVX2 / BLAKE3_NO_SSE41 / BLAKE3_NO_SSE2 defines and the forced BLAKE3_USE_NEON=0.

Adds cas_test_blake3_simd, which verifies (a) that BLAKE3 of the empty input matches the canonical spec vector and (b) that the per-arch SIMD entry point is linked into cas_core. Gates the test target on AGENTVFS_BLAKE3_SIMD=ON so the rollback build is clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
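The per-file flag scoping described in this commit could look roughly like the CMake sketch below. It is an illustration under stated assumptions, not the contents of this PR's CMakeLists.txt: the source file names follow the upstream BLAKE3 C distribution, while the `BLAKE3_DIR` variable, the processor checks, and the exact flag values are hypothetical. It also assumes the `cas_core` target has already been defined in the same directory, since source file properties are by default only visible to targets created in the directory where they are set.

```cmake
# Sketch only: BLAKE3_DIR, the processor regexes, and the flag values are assumptions.
option(AGENTVFS_BLAKE3_SIMD "Build BLAKE3 with hardware-accelerated back ends" ON)

set(BLAKE3_DIR ${CMAKE_CURRENT_SOURCE_DIR}/include/blake3)

if(NOT AGENTVFS_BLAKE3_SIMD)
  # Rollback path: compile out every SIMD back end so only portable code remains.
  target_compile_definitions(cas_core PRIVATE
    BLAKE3_NO_SSE2 BLAKE3_NO_SSE41 BLAKE3_NO_AVX2 BLAKE3_NO_AVX512
    BLAKE3_USE_NEON=0)
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "x86_64|AMD64")
  target_sources(cas_core PRIVATE
    ${BLAKE3_DIR}/blake3_sse2.c
    ${BLAKE3_DIR}/blake3_sse41.c
    ${BLAKE3_DIR}/blake3_avx2.c
    ${BLAKE3_DIR}/blake3_avx512.c)
  if(MSVC)
    # Only the AVX translation units get an /arch: flag in this sketch.
    set_source_files_properties(${BLAKE3_DIR}/blake3_avx2.c
      PROPERTIES COMPILE_OPTIONS "/arch:AVX2")
    set_source_files_properties(${BLAKE3_DIR}/blake3_avx512.c
      PROPERTIES COMPILE_OPTIONS "/arch:AVX512")
  else()
    # Each flag applies to one translation unit; the rest of cas_core
    # keeps the baseline ISA.
    set_source_files_properties(${BLAKE3_DIR}/blake3_sse2.c
      PROPERTIES COMPILE_OPTIONS "-msse2")
    set_source_files_properties(${BLAKE3_DIR}/blake3_sse41.c
      PROPERTIES COMPILE_OPTIONS "-msse4.1")
    set_source_files_properties(${BLAKE3_DIR}/blake3_avx2.c
      PROPERTIES COMPILE_OPTIONS "-mavx2")
    set_source_files_properties(${BLAKE3_DIR}/blake3_avx512.c
      PROPERTIES COMPILE_OPTIONS "-mavx512f;-mavx512vl")
  endif()
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64|arm64|ARM64")
  # NEON is part of the aarch64 baseline, so no extra flag is needed here.
  target_sources(cas_core PRIVATE ${BLAKE3_DIR}/blake3_neon.c)
endif()
```

Scoping the flags with `set_source_files_properties` rather than `target_compile_options` is what keeps AVX instructions out of unrelated `cas_core` code, so the binary still runs on CPUs that lack those extensions and only the runtime-dispatched BLAKE3 paths use them.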
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two review fixes against the BLAKE3 SIMD enablement:

- Add `_M_ARM64EC` to the test's aarch64 branch in tests/cas/test_blake3_simd.cpp. blake3_impl.h treats ARM64EC as aarch64 and dispatches NEON at runtime, so the test should verify the NEON entry point on that target too.
- Expand the MSVC comment in CMakeLists.txt to note that /arch:SSE2 is x86-only and is silently ignored on x64 (where SSE2 is the implicit baseline). Documents that the flag is harmless rather than load-bearing.

Also fills in the canonical empty-input hash hex in the test source's KAT comment (was a stray "hash =" left over from an earlier edit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Vendors the upstream BLAKE3 SIMD `.c` files (SSE2 / SSE4.1 / AVX2 / AVX512 / NEON) into `include/blake3/` and wires them into `cas_core` with per-file `-msse*` / `-mavx*` flags (and MSVC `/arch:` equivalents) scoped via `set_source_files_properties`, so the rest of `cas_core` keeps its baseline ISA. `blake3_dispatch.c` now picks the best path at runtime via CPUID / `getauxval`.
- Adds the `AGENTVFS_BLAKE3_SIMD` CMake option (default ON). When OFF, `BLAKE3_NO_AVX512` / `BLAKE3_NO_AVX2` / `BLAKE3_NO_SSE41` / `BLAKE3_NO_SSE2` / `BLAKE3_USE_NEON=0` come back and the test target is omitted, giving a clean rollback.
- `cas_test_blake3_simd` verifies (a) that BLAKE3 of the empty input equals the canonical spec value and (b) that the per-arch SIMD entry point (`blake3_hash_many_avx2` on x86_64, `blake3_hash_many_neon` on aarch64 / ARM64EC) is linked into `cas_core`. Wired into the linux / macos / windows CI jobs.
- `/arch:AVX512` codegen in the Windows build-from-source block.

Plan: `docs/superpowers/plans/2026-05-23-blake3-hwaccel.md`
Spec: `docs/superpowers/specs/2026-05-23-blake3-hwaccel-design.md`

Test plan
🤖 Generated with Claude Code