Add ARM64 NEON support to x86 SSE2 compute benchmarks #8

Copilot · 2025-10-28T20:21:13Z

Migrates the compute benchmark suite from x86-only to multi-architecture support (x86-64 + ARM64).

Changes

Architecture detection

Added compile-time architecture detection (__aarch64__, __ARM_NEON)
Main now displays "ARM64 with NEON" or "x86-64 with SSE2" based on build target

SIMD intrinsics

Matrix operations: _mm_loadu_pd → vld1q_f64, _mm_mul_pd → vmulq_f64, multiply-accumulate with vmlaq_f64 (not guaranteed to be fused; vfmaq_f64 is the explicitly fused form)
Hash operations: _mm_loadu_si128 → vld1q_u8, array-based byte extraction (NEON lanes require compile-time constants)
String search: _mm_cmpeq_epi8 → vceqq_u8, _mm_movemask_epi8 → array storage + iteration
Memory copy: _mm_loadu/storeu_si128 → vld1q_u8/vst1q_u8
Polynomial eval: _mm_mul_pd/_mm_add_pd → vmulq_f64/vaddq_f64, horizontal add via lane extraction
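
The string-search mapping is the least mechanical one in the list above, because NEON has no direct counterpart to `_mm_movemask_epi8`. The sketch below shows one way the "array storage + iteration" approach could work for finding the first matching byte. The `find_byte` helper, the `SKETCH_*` macros, and the `vmaxvq_u8` early-out are assumptions for illustration and are not taken from this PR.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__aarch64__) && defined(__ARM_NEON)
  #include <arm_neon.h>
  #define SKETCH_NEON 1
#elif defined(__SSE2__)
  #include <emmintrin.h>
  #define SKETCH_SSE2 1
#endif

// Hypothetical helper: index of the first byte equal to `needle`, or -1.
inline std::ptrdiff_t find_byte(const std::uint8_t* data, std::size_t len,
                                std::uint8_t needle) {
    std::size_t i = 0;
#if defined(SKETCH_SSE2)
    const __m128i n = _mm_set1_epi8(static_cast<char>(needle));
    for (; i + 16 <= len; i += 16) {
        __m128i chunk =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(data + i));
        // One bit per byte lane: bit b is set when lane b matched.
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, n));
        for (int b = 0; mask != 0 && b < 16; ++b) {
            if (mask & (1 << b)) return static_cast<std::ptrdiff_t>(i + b);
        }
    }
#elif defined(SKETCH_NEON)
    const uint8x16_t n = vdupq_n_u8(needle);
    for (; i + 16 <= len; i += 16) {
        uint8x16_t eq = vceqq_u8(vld1q_u8(data + i), n);  // 0xFF where equal
        if (vmaxvq_u8(eq) == 0) continue;  // no lane matched in this chunk
        // No movemask on NEON: spill the compare result and scan it.
        std::uint8_t lanes[16];
        vst1q_u8(lanes, eq);
        for (int b = 0; b < 16; ++b) {
            if (lanes[b] != 0) return static_cast<std::ptrdiff_t>(i + b);
        }
    }
#endif
    // Scalar tail; also the whole loop when no SIMD path is compiled in.
    for (; i < len; ++i) {
        if (data[i] == needle) return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}
```

Spilling to an array keeps the lane index a runtime value, which sidesteps the requirement that `vgetq_lane_u8` take a compile-time constant lane.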

Build system

Dockerfile: Added -march=native for automatic SIMD enablement
.gitignore: Exclude build artifacts

Example transformation in matrix multiplication:

#if USE_X86_SIMD
    __m128d sum_vec = _mm_setzero_pd();
    __m128d a_vec = _mm_loadu_pd(&data[i][k]);
    __m128d b_vec = _mm_set_pd(other.data[k+1][j], other.data[k][j]);
    sum_vec = _mm_add_pd(sum_vec, _mm_mul_pd(a_vec, b_vec));
#elif USE_ARM_NEON
    float64x2_t sum_vec = vdupq_n_f64(0.0);
    float64x2_t a_vec = vld1q_f64(&data[i][k]);
    double b_arr[2] = {other.data[k][j], other.data[k+1][j]};
    float64x2_t b_vec = vld1q_f64(b_arr);
    sum_vec = vmlaq_f64(sum_vec, a_vec, b_vec);  // multiply-accumulate: sum_vec + a_vec * b_vec (vfmaq_f64 is the explicitly fused form)
#endif

All modules maintain scalar fallback for non-SIMD platforms.
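
To illustrate how the three paths can coexist in one function, here is a self-contained sketch of a dot product with an SSE2 path, a NEON path, and a scalar fallback. It is not code from this PR: the `dot` helper and the `DOT_*` macros are hypothetical. It also shows the horizontal add via lane extraction mentioned for polynomial evaluation.

```cpp
#include <cstddef>

#if defined(__aarch64__) && defined(__ARM_NEON)
  #include <arm_neon.h>
  #define DOT_NEON 1
#elif defined(__SSE2__)
  #include <emmintrin.h>
  #define DOT_SSE2 1
#endif

// Hypothetical helper: sum of a[i] * b[i] for i in [0, n).
inline double dot(const double* a, const double* b, std::size_t n) {
    std::size_t i = 0;
    double sum = 0.0;
#if defined(DOT_SSE2)
    __m128d acc = _mm_setzero_pd();
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        acc = _mm_add_pd(acc, _mm_mul_pd(va, vb));
    }
    // Horizontal add: store both lanes and sum them.
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    sum = lanes[0] + lanes[1];
#elif defined(DOT_NEON)
    float64x2_t acc = vdupq_n_f64(0.0);
    for (; i + 2 <= n; i += 2) {
        // Multiply-accumulate as in the PR example: acc + a * b per lane.
        acc = vmlaq_f64(acc, vld1q_f64(a + i), vld1q_f64(b + i));
    }
    // Horizontal add via lane extraction (lane indices are constants).
    sum = vgetq_lane_f64(acc, 0) + vgetq_lane_f64(acc, 1);
#endif
    // Scalar tail; this is the entire computation on non-SIMD builds.
    for (; i < n; ++i) sum += a[i] * b[i];
    return sum;
}
```

The SIMD paths accumulate even and odd elements separately, so for general floating-point inputs the result may differ from a strictly sequential scalar sum in the last bits; a benchmark that validates outputs across architectures would likely need a tolerance.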

Original prompt

This section details the original issue you should resolve

<issue_title>Migrate this repo to Arm</issue_title>
<issue_description>Please use any MCP tools that you need.</issue_description>

Comments on the Issue (you are @copilot in this section)

Note

Custom agent used: arm-migration-agent
Scan a project and migrate to ARM architecture

Fixes Migrate this repo to Arm #7


Co-authored-by: JoeStech <4088382+JoeStech@users.noreply.github.com>

Initial plan

d33de29

Copilot AI assigned Copilot and JoeStech Oct 28, 2025

Copilot started work on behalf of JoeStech October 28, 2025 20:21

Copilot AI and others added 5 commits October 28, 2025 20:30

Initial commit: Plan for ARM migration

ba6d082

Co-authored-by: JoeStech <4088382+JoeStech@users.noreply.github.com>

Add ARM NEON support to all benchmark modules

f76d5dd

Co-authored-by: JoeStech <4088382+JoeStech@users.noreply.github.com>

Add .gitignore and remove build artifacts

dff3da3

Co-authored-by: JoeStech <4088382+JoeStech@users.noreply.github.com>

Finalize ARM migration with comprehensive SIMD support

22d4e05

Co-authored-by: JoeStech <4088382+JoeStech@users.noreply.github.com>

Remove temporary symlink

e3d5b9b

Co-authored-by: JoeStech <4088382+JoeStech@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Migrate this repository to Arm architecture~~ Add ARM64 NEON support to x86 SSE2 compute benchmarks Oct 28, 2025

Copilot AI requested a review from JoeStech October 28, 2025 20:43

Copilot finished work on behalf of JoeStech October 28, 2025 20:43
