Add ARM64 NEON support to x86 SSE2 compute benchmarks #8
Migrates compute benchmark suite from x86-only to multi-architecture support (x86-64 + ARM64).
Changes
Architecture detection
- __aarch64__, __ARM_NEON

SIMD intrinsics
- _mm_loadu_pd → vld1q_f64, _mm_mul_pd → vmulq_f64, multiply-accumulate with vmlaq_f64
- _mm_loadu_si128 → vld1q_u8, array-based byte extraction (NEON lane indices must be compile-time constants)
- _mm_cmpeq_epi8 → vceqq_u8, _mm_movemask_epi8 → array storage + iteration
- _mm_loadu_si128 / _mm_storeu_si128 → vld1q_u8 / vst1q_u8
- _mm_mul_pd / _mm_add_pd → vmulq_f64 / vaddq_f64, horizontal add via lane extraction

Build system
- -march=native for automatic SIMD enablement
- .gitignore: Exclude build artifacts

Example transformation in matrix multiplication:
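The snippet from the original description did not survive on this page, so what follows is only a minimal sketch of how the matrix multiplication mapping might look, not the code from the PR. The function name matmul_bt, the HAVE_NEON / HAVE_SSE2 macros, and the choice to pass B pre-transposed are all assumptions made for illustration. Note that vmlaq_f64 is a multiply-accumulate; the single-rounding fused variant in NEON is vfmaq_f64.

```c
#include <assert.h>
#include <stddef.h>

/* Pick a SIMD path at compile time (macro names are hypothetical). */
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#elif defined(__SSE2__)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Computes C = A * B for n x n row-major matrices of doubles.
 * bt holds B transposed, so both operands are read contiguously.
 * Hypothetical helper, not taken from the PR. */
void matmul_bt(const double *a, const double *bt, double *c, size_t n)
{
    assert(a != NULL && bt != NULL && c != NULL);

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            const double *row = a + i * n;
            const double *col = bt + j * n;
            double sum = 0.0;
            size_t k = 0;

#if defined(HAVE_NEON)
            /* ARM64: two doubles per 128-bit register. */
            float64x2_t acc = vdupq_n_f64(0.0);
            for (; k + 2 <= n; k += 2) {
                float64x2_t va = vld1q_f64(row + k);
                float64x2_t vb = vld1q_f64(col + k);
                acc = vmlaq_f64(acc, va, vb); /* acc += va * vb */
            }
            /* Horizontal add via lane extraction (constant lane indices). */
            sum = vgetq_lane_f64(acc, 0) + vgetq_lane_f64(acc, 1);
#elif defined(HAVE_SSE2)
            /* x86-64: the SSE2 code the NEON path mirrors. */
            __m128d acc = _mm_setzero_pd();
            for (; k + 2 <= n; k += 2) {
                __m128d va = _mm_loadu_pd(row + k);
                __m128d vb = _mm_loadu_pd(col + k);
                acc = _mm_add_pd(acc, _mm_mul_pd(va, vb));
            }
            double lanes[2];
            _mm_storeu_pd(lanes, acc);
            sum = lanes[0] + lanes[1];
#endif
            /* Scalar tail; also the whole loop when no SIMD path is compiled. */
            for (; k < n; k++)
                sum += row[k] * col[k];

            c[i * n + j] = sum;
        }
    }
}
```

Keeping the scalar tail outside the #if blocks means an odd n is handled once for all three build configurations.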
All modules maintain scalar fallback for non-SIMD platforms.
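To illustrate the three-way layout (NEON, SSE2, scalar fallback), here is a hedged sketch of a byte search using the _mm_cmpeq_epi8 → vceqq_u8 mapping. NEON has no direct counterpart to _mm_movemask_epi8, so the comparison result is stored to an array and scanned, as the description states. The function name find_byte and the macro names are assumptions for illustration, not identifiers from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#elif defined(__SSE2__)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Returns the index of the first occurrence of needle in buf[0..len),
 * or len if it is absent. Hypothetical helper, not taken from the PR. */
size_t find_byte(const uint8_t *buf, size_t len, uint8_t needle)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;

#if defined(HAVE_NEON)
    const uint8x16_t pattern = vdupq_n_u8(needle);
    for (; i + 16 <= len; i += 16) {
        /* Each matching lane becomes 0xFF, every other lane 0x00. */
        uint8x16_t eq = vceqq_u8(vld1q_u8(buf + i), pattern);
        uint8_t lanes[16];
        vst1q_u8(lanes, eq); /* array storage replaces movemask */
        for (size_t j = 0; j < 16; j++)
            if (lanes[j])
                return i + j;
    }
#elif defined(HAVE_SSE2)
    const __m128i pattern = _mm_set1_epi8((char)needle);
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        /* One bit per byte lane; bit j set means buf[i + j] == needle. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, pattern));
        if (mask != 0) {
            for (size_t j = 0; j < 16; j++)
                if (mask & (1 << j))
                    return i + j;
        }
    }
#endif

    /* Scalar fallback: handles the tail, or everything on non-SIMD builds. */
    for (; i < len; i++)
        if (buf[i] == needle)
            return i;
    return len;
}
```

Because all three paths return the same index for the same input, the scalar loop doubles as a reference when checking the SIMD variants.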
Original prompt
Note
Custom agent used: arm-migration-agent
Scan a project and migrate to ARM architecture