Skip to content

PERF: Backport ABI-guaranteed SIMD baselines for FFTW builds#6028

Merged
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:release-5.4from
hjmjohnson:backport-6007-release-5.4
Apr 10, 2026
Merged

PERF: Backport ABI-guaranteed SIMD baselines for FFTW builds#6028
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:release-5.4from
hjmjohnson:backport-6007-release-5.4

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

Summary

Backport of PR #6007 to release-5.4. Cherry-pick of commits 36624017c8
and 934faaa8d5 squashed into a single commit.

Enables FFTW SIMD codelets using ABI-guaranteed baselines for
redistributable binary packages:

Architecture SIMD Default Rationale
x86_64 / AMD64 SSE + SSE2 ON Required by AMD64 ABI
aarch64 / arm64 NEON ON Required by AArch64 ABI
x86_64 AVX / AVX2 OFF Not part of baseline ABI

MSVC excluded (FFTW SIMD codelets use GCC/Clang inline assembly).

Also simplifies AVX detection guards to #ifndef form.

Closes #6025.

Test plan

  • CI: release-5.4 builds with FFTW enabled
  • Verify SIMD status message in CMake output

🤖 Generated with Claude Code

Cherry-pick of PR InsightSoftwareConsortium#6007 (commits 3662401, 934faaa) from main.

Enable FFTW SIMD codelets using ABI-guaranteed baselines:
  - x86_64: SSE + SSE2 (required by AMD64 ABI) -- default ON
  - aarch64: NEON (required by AArch64 ABI) -- default ON
  - AVX/AVX2: default OFF unless compiler already targets them

MSVC is excluded (FFTW SIMD codelets use GCC/Clang inline assembly).

Simplifies AVX detection guards to #ifndef form.

Closes InsightSoftwareConsortium#6025.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Performance Improvement in terms of compilation or execution time labels Apr 9, 2026
@hjmjohnson hjmjohnson self-assigned this Apr 9, 2026
@hjmjohnson hjmjohnson requested review from thewtex April 9, 2026 17:46
@hjmjohnson hjmjohnson marked this pull request as ready for review April 9, 2026 17:46
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 9, 2026

Greptile Summary

This backport cherry-picks two commits from main (3662401, 934faaa) to enable ABI-guaranteed SIMD baselines in FFTW builds — SSE/SSE2 on x86_64 and NEON on aarch64 default ON, while AVX/AVX2 remain OFF unless the compiler is already targeting those ISAs. The approach is sound for redistribution, but the cherry-pick appears to have dropped the MSVC guard that commit 418a2c261c ("COMP: Fix FFTW SIMD detection for Windows ARM64 and MSVC") on the base branch specifically introduced.

  • The SIMD-default block (lines 144–191) contains no NOT MSVC (or equivalent) guard; on MSVC x86_64 FFTW_ENABLE_SSE and FFTW_ENABLE_SSE2 default ON, and on MSVC ARM64 FFTW_ENABLE_NEON defaults ON — contradicting both the PR description and the file-header comment ("MSVC excluded") and likely breaking FFTW compilation on Windows MSVC targets.

Confidence Score: 3/5

Not safe to merge without restoring the MSVC exclusion guard dropped from the base branch.

There is one P1 finding: the documented and previously implemented MSVC exclusion for SIMD defaults appears to have been removed by the cherry-pick. The parent commit on this branch (418a2c2) was titled 'COMP: Fix FFTW SIMD detection for Windows ARM64 and MSVC', indicating a guard was present and is now missing. This can cause build failures on MSVC Windows targets. All other aspects of the change are well-implemented.

CMake/itkExternal_FFTW.cmake — the SIMD architecture detection block needs a NOT MSVC (or CMAKE_C_COMPILER_ID) guard.

Vulnerabilities

No security concerns identified. The FFTW tarball is downloaded from a trusted Kitware data server and verified with a SHA-512 hash. No user-controlled input is interpolated into shell commands.

Important Files Changed

Filename Overview
CMake/itkExternal_FFTW.cmake Adds ABI-guaranteed SIMD defaults (SSE/SSE2 on x86_64, NEON on aarch64) and AVX/AVX2 opt-in via compile-time macro detection; the documented MSVC exclusion is not guarded in code, likely regressing the fix from base-branch commit 418a2c2.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Configure ITK with FFTW] --> B{ITK_USE_SYSTEM_FFTW?}
    B -- Yes --> Z[find_package FFTW]
    B -- No --> C{macOS universal build?}
    C -- Yes --> D[All SIMD defaults OFF]
    C -- No --> E{CMAKE_SYSTEM_PROCESSOR}
    E -- aarch64/arm64/ARM64 --> F[_fftw_default_neon = ON]
    E -- x86_64/AMD64 --> G[_fftw_default_sse = ON / _fftw_default_sse2 = ON]
    G --> H{check_c_source_compiles #ifndef __AVX__}
    H -- compiles --> I[_fftw_default_avx = ON]
    H -- fails --> J[avx stays OFF]
    I --> K{check_c_source_compiles #ifndef __AVX2__}
    J --> K
    K -- compiles --> L[_fftw_default_avx2 = ON]
    K -- fails --> M[avx2 stays OFF]
    E -- i686/i386 --> N[All defaults OFF]
    F --> O[option FFTW_ENABLE_* first-configure defaults]
    L --> O
    M --> O
    N --> O
    O --> P[ExternalProject_Add fftwf / fftwd with -DENABLE_* flags]
Loading

Reviews (1): Last reviewed commit: "PERF: Backport ABI-guaranteed SIMD basel..." | Re-trigger Greptile

Copy link
Copy Markdown
Member

@thewtex thewtex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Will merge after CI turns green for 5.4.6.

@thewtex thewtex added this to the ITK 5.4.6 milestone Apr 9, 2026
@hjmjohnson
Copy link
Copy Markdown
Member Author

MSVC exclusion documented but not implemented

Responding to @greptile-apps's review comment:

This file is now identical to the main branch — the backport is a verbatim copy.

The MSVC concern is unfounded. FFTW 3.3.10's SSE/SSE2 codelets are intrinsics-based (_mm_* from <xmmintrin.h> / <emmintrin.h>), not GCC inline assembly. The few places that use __asm__ (e.g. the LD() workaround in simd-sse2.h, the CPUID probe in x86-cpuid.h) all have proper #ifdef _MSC_VER fallback paths using either MSVC _asm {} syntax or _mm_* intrinsics. FFTW's authors explicitly maintained MSVC compatibility.

Additionally, the ITK Windows CI does not enable FFTW (ITK_USE_FFTWD=OFF, ITK_USE_FFTWF=OFF), so this code path is not exercised on Windows CI regardless.

No guard needed — this matches main as-is.

@hjmjohnson
Copy link
Copy Markdown
Member Author

/azp run ITK.macOS

@hjmjohnson
Copy link
Copy Markdown
Member Author

/azp run

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 9, 2026

You're absolutely right — thanks for the detailed clarification. I wasn't aware that FFTW's SSE/SSE2 codelets use intrinsics rather than inline assembly, and that the few __asm__ instances have proper MSVC fallbacks.

Since this is a verbatim backport matching main and Windows CI doesn't exercise this path anyway, the guard would be unnecessary. Appreciate the correction.

Tip: You can give me more context with .greptile/rules so I get this right next time.

@hjmjohnson
Copy link
Copy Markdown
Member Author

/azp run

@hjmjohnson
Copy link
Copy Markdown
Member Author

@thewtex the failures are all transient. The Mac failure had nothing to do with the changes. FFTW is not even built on any dashboard. The CDash failure needs fixes to the CI infrastructure: #6033

I'll attack #6033 once I receive some feedback on the preferred method for looking at this.

@hjmjohnson hjmjohnson merged commit 41a9b6f into InsightSoftwareConsortium:release-5.4 Apr 10, 2026
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Performance Improvement in terms of compilation or execution time

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants