perf(scan): AVX-512 + vpclmulqdq scanner backend #9

@membphis

Context

The current scanner uses AVX2 + PCLMUL (128-bit). On CPUs that support avx512bw + vpclmulqdq (Ice Lake, Sapphire Rapids, Zen 4 and later), a 128-byte chunk path could halve the loop iteration count.

Prerequisite: CPU support audit

This issue is gated on confirming that the project's actual build/CI hosts support vpclmulqdq. If not, ROI is 0 and the issue should be deferred indefinitely.

  • Local dev host: confirmed missing vpclmulqdq (Skylake-X / Skylake-SP — has avx512bw but not vpclmulqdq). Cannot test locally.

  • CI runners: ubuntu-latest runner CPUs vary by allocation. Add a one-line diagnostic to the workflow:

    - name: CPU features
      run: grep -oE '\b(avx2|avx512bw|vpclmulqdq|pclmulqdq)\b' /proc/cpuinfo | sort -u

    Collect output over several CI runs; only proceed if vpclmulqdq is reliably present.

If CI runners do not reliably provide vpclmulqdq, the only ways to validate this are paid larger runners or self-hosted runners.

Proposal (pending CPU confirmation)

  • New src/scan/avx512.rs mirroring avx2.rs with 128-byte chunks
  • Dispatcher (src/scan/mod.rs): AVX-512 → AVX2 → scalar fallback chain
  • New avx512 feature flag (default off) so release builds stay portable
  • Use _mm512_clmulepi64_epi128 for the inside-string prefix-XOR
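The fallback chain in the dispatcher bullet can be sketched roughly as follows. This is a minimal illustration, not the project's code: `Backend`, `CpuCaps`, `detect_caps`, and `select_backend` are hypothetical names, and the assumption that the AVX2 path needs both avx2 and pclmulqdq is taken from the "AVX2 + PCLMUL" wording in the Context section. Runtime detection uses the standard library's `is_x86_feature_detected!` macro.

```rust
/// Scanner backends in descending order of preference (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Avx512,
    Avx2,
    Scalar,
}

/// CPU capabilities relevant to backend selection (hypothetical struct).
#[derive(Debug, Clone, Copy, Default)]
pub struct CpuCaps {
    pub avx512bw: bool,
    pub vpclmulqdq: bool,
    pub avx2: bool,
    pub pclmulqdq: bool,
}

/// Query the running CPU on x86_64.
#[cfg(target_arch = "x86_64")]
pub fn detect_caps() -> CpuCaps {
    CpuCaps {
        avx512bw: std::arch::is_x86_feature_detected!("avx512bw"),
        vpclmulqdq: std::arch::is_x86_feature_detected!("vpclmulqdq"),
        avx2: std::arch::is_x86_feature_detected!("avx2"),
        pclmulqdq: std::arch::is_x86_feature_detected!("pclmulqdq"),
    }
}

/// On other architectures every flag stays false, so the scalar path is chosen.
#[cfg(not(target_arch = "x86_64"))]
pub fn detect_caps() -> CpuCaps {
    CpuCaps::default()
}

/// AVX-512 -> AVX2 -> scalar. `avx512_enabled` stands in for the proposed
/// default-off `avx512` feature flag.
pub fn select_backend(caps: CpuCaps, avx512_enabled: bool) -> Backend {
    if avx512_enabled && caps.avx512bw && caps.vpclmulqdq {
        Backend::Avx512
    } else if caps.avx2 && caps.pclmulqdq {
        Backend::Avx2
    } else {
        Backend::Scalar
    }
}
```

A caller would typically evaluate `select_backend(detect_caps(), cfg!(feature = "avx512"))` once and cache the result. Keeping the selection logic separate from detection means the chain can be unit-tested on hosts that lack vpclmulqdq, such as the Skylake-X dev machine mentioned above.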

Estimated impact

CPU                                Est. speedup
CPUs with avx512bw + vpclmulqdq    ~1.5–2× scan throughput
Other CPUs                         0 (dispatcher falls back)

Validation plan

  • scanner_crosscheck proptest extended to compare AVX-512 vs scalar
  • CI matrix on a runner confirmed to have vpclmulqdq
  • make bench 3-run median on supported hardware
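The cross-check idea can be illustrated on a single 64-bit word. The sketch below assumes that the "inside-string prefix-XOR" is the common quote-bitmask technique, where bit i of the input marks a quote character at byte i and the prefix XOR marks the bytes inside a string. All function names are hypothetical. A software carry-less multiply stands in for the hardware instruction so the example runs on any host, and a small xorshift generator replaces proptest to avoid external crates. Escaped quotes and the carry between consecutive words are deliberately ignored.

```rust
/// Scalar reference: bit i is set when an odd number of quote bits occur at
/// positions 0..=i (opening quote included, closing quote excluded).
fn inside_string_scalar(quotes: u64) -> u64 {
    let mut mask = 0u64;
    let mut inside = false;
    for i in 0..64 {
        if (quotes >> i) & 1 == 1 {
            inside = !inside;
        }
        if inside {
            mask |= 1u64 << i;
        }
    }
    mask
}

/// Software model of the low 64 bits of a carry-less multiply.
fn clmul_lo(a: u64, b: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..64 {
        if (b >> i) & 1 == 1 {
            acc ^= a << i;
        }
    }
    acc
}

/// Prefix XOR expressed as a carry-less multiply by all-ones.
fn inside_string_clmul(quotes: u64) -> u64 {
    clmul_lo(quotes, u64::MAX)
}

/// Small xorshift64 generator; the seed must be non-zero.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Returns the first input on which the two implementations disagree.
fn crosscheck(rounds: usize, seed: u64) -> Option<u64> {
    let mut state = seed;
    for _ in 0..rounds {
        let quotes = xorshift64(&mut state);
        if inside_string_scalar(quotes) != inside_string_clmul(quotes) {
            return Some(quotes);
        }
    }
    None
}
```

For example, quotes at bytes 1 and 4 (`0b10010`) yield the mask `0b01110` from both functions. In a real AVX-512 backend the vectorized routine would take the place of `inside_string_clmul`, with the scalar function kept as the oracle.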

Recommendation

Schedule this last in the perf follow-up queue. CPU support is uncertain; if it turns out CI runners don't have vpclmulqdq, this is dead code we maintain forever. Do the cheap wins (#5 memchr, #6 pooling, #7 PGO, #8 micro-opts) first.