Improve runtime feature detection (and performance) #21

onethumb · 2025-10-30T19:55:15Z

The Problem

By relying heavily on compile-time feature detection, rather than runtime feature detection, the library was more fragile (leading to bugs like #14) and unable to gracefully degrade across CPU architecture variants with a single build.

The Solution

Rely on runtime feature detection (out of the hot path) to determine which hardware acceleration target to use, enabling graceful degradation across CPU types with a single build, and minimizing the risk of a SIGILL or similar bug sneaking in.
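
A minimal sketch of this approach, assuming a one-time probe cached in a `OnceLock`. The `Target` enum, `detect`, `target`, `checksum`, and `crc32_bitwise` names are hypothetical and are not taken from this PR; only the `std::arch` detection macros and `OnceLock` are real standard-library APIs. The bitwise CRC-32 stands in for every kernel so the example stays self-contained.

```rust
use std::sync::OnceLock;

/// Hypothetical acceleration targets (illustrative names, not the library's API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    X86Vpclmulqdq,
    X86Pclmulqdq,
    Aarch64PmullSha3,
    Aarch64Pmull,
    Software,
}

/// Probes the CPU at runtime and falls through to the next-best target.
fn detect() -> Target {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx512f")
            && std::arch::is_x86_feature_detected!("vpclmulqdq")
        {
            return Target::X86Vpclmulqdq;
        }
        if std::arch::is_x86_feature_detected!("pclmulqdq")
            && std::arch::is_x86_feature_detected!("sse4.1")
        {
            return Target::X86Pclmulqdq;
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        // On aarch64, the "aes" feature string also covers PMULL.
        if std::arch::is_aarch64_feature_detected!("aes")
            && std::arch::is_aarch64_feature_detected!("sha3")
        {
            return Target::Aarch64PmullSha3;
        }
        if std::arch::is_aarch64_feature_detected!("aes") {
            return Target::Aarch64Pmull;
        }
    }
    Target::Software
}

/// Cached result: detection runs on the first call only, outside the hot path.
static TARGET: OnceLock<Target> = OnceLock::new();

pub fn target() -> Target {
    *TARGET.get_or_init(detect)
}

/// Portable bitwise CRC-32 (ISO-HDLC), used here as the stand-in kernel.
fn crc32_bitwise(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

/// Dispatches on the cached target. Every arm uses the software kernel in
/// this sketch; a real build would call a SIMD kernel per accelerated target.
pub fn checksum(data: &[u8]) -> u32 {
    match target() {
        _ => crc32_bitwise(data),
    }
}
```

Because the probe result is cached, a single binary can run on CPUs with and without the wider instructions and never executes an instruction the CPU did not report.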

As a side benefit, AWS Graviton targets are ~36% faster and peak at ~53GiB/s (Graviton4).

Changes

* Adds a feature detection mechanism at instantiation time to determine which acceleration target is ideal.
* Removes much of the compile-time detection.
* Enables a single binary build to gracefully degrade for older CPUs in the same family.
* Improves Graviton4 performance by ~36% to ~53GiB/s.
* Improves file structure and directory layout to isolate each acceleration target further.
* Adds a benchmarking flag to the checksum utility to enable easier benchmarking using a single binary, rather than a source checkout, across platforms.
* Updates Cargo dependencies to the latest supported versions.

Planned version bump

Which: MINOR
Why: non-breaking new functionality

Links

SIGILL on 1.4 with machine that doesn't support avx512 #14


Copilot

Pull Request Overview

This PR adds comprehensive x86 architecture support (32-bit) and reorganizes CRC-32 fusion implementations to improve code maintainability across multiple architectures (x86, x86_64, and aarch64). The main changes involve:

* Adding 32-bit x86 support for CRC calculations with native CRC32C and PCLMULQDQ instructions
* Restructuring the codebase to separate architecture-specific implementations into dedicated modules
* Adding extensive integration tests for benchmark functionality
* Implementing a feature detection system for optimal hardware acceleration selection
* Adding future-proof tests for CRC key storage functionality

Reviewed Changes

Copilot reviewed 38 out of 44 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| `tests/checksum_integration_tests.rs` | New integration tests for benchmark flag parsing and various input scenarios |
| `src/test/mod.rs` | Renamed module reference from `future_proof` to `future_proof_tests` |
| `src/test/future_proof_tests.rs` | Extensive new tests for CRC key storage bounds checking and backwards compatibility |
| `src/lib.rs` | Added x86 support to conditional compilation, new feature_detection module, enhanced documentation |
| `src/feature_detection.rs` | New comprehensive feature detection system with performance tier selection |
| `src/crc32/mod.rs` | Extended fusion support to include x86 (32-bit) architecture |
| `src/crc32/fusion/mod.rs` | Simplified architecture dispatching with cleaner conditional compilation |
| `src/crc32/fusion/x86/mod.rs` | New x86-specific CRC implementation with AVX512 and SSE support |
| `src/crc32/fusion/x86/iscsi/*.rs` | Multiple new files implementing SSE, AVX512 PCLMULQDQ, and VPCLMULQDQ variants |
| `src/crc32/fusion/aarch64/mod.rs` | Reorganized aarch64 implementation into separate sub-modules |
| `src/crc32/fusion/aarch64/iscsi/*.rs` | Split iSCSI implementations into PMULL and PMULL+SHA3 variants |
| `src/crc32/fusion/aarch64/iso_hdlc/*.rs` | Split ISO-HDLC implementations into PMULL and PMULL+SHA3 variants |
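
The "performance tier selection" mentioned for `src/feature_detection.rs` could look roughly like the sketch below for the three x86 variants named in the summary. The `Caps` and `Tier` types, the `select_tier` function, and the assumed ordering (VPCLMULQDQ over AVX512 PCLMULQDQ over SSE over software) are all hypothetical and are not taken from the PR's source. Keeping the selection a pure function of a capability struct makes the fallback order testable without the matching hardware.

```rust
/// Hypothetical snapshot of detected CPU capabilities (not the library's type).
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Caps {
    pub sse: bool,
    pub pclmulqdq: bool,
    pub avx512: bool,
    pub vpclmulqdq: bool,
}

/// Hypothetical tiers, declared slowest to fastest so the derived `Ord`
/// matches the assumed performance order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Software,
    SsePclmulqdq,
    Avx512Pclmulqdq,
    Avx512Vpclmulqdq,
}

/// Picks the highest tier whose required features are all present,
/// degrading one step at a time down to the software fallback.
pub fn select_tier(caps: Caps) -> Tier {
    if caps.avx512 && caps.vpclmulqdq {
        Tier::Avx512Vpclmulqdq
    } else if caps.avx512 && caps.pclmulqdq {
        Tier::Avx512Pclmulqdq
    } else if caps.sse && caps.pclmulqdq {
        Tier::SsePclmulqdq
    } else {
        Tier::Software
    }
}
```

For example, a CPU that reports PCLMULQDQ and SSE but no AVX512 would land on `Tier::SsePclmulqdq` rather than attempting an AVX512 path, which is the kind of degradation the PR aims for after #14.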


onethumb added 2 commits October 30, 2025 12:17

Improve runtime feature detection

dd3c796

Uses more runtime feature detection, rather than compile-time feature detection, for improved reliability, maintainability, graceful degradation across CPU families, and performance. Should help minimize bugs such as awesomized#14 in the future.

Merge branch 'main' into improve-runtime-checks

85e7a09

Conflicts: `Cargo.lock`, `Cargo.toml`

onethumb requested a review from Copilot October 30, 2025 19:55

Copilot AI reviewed Oct 30, 2025


src/crc32/fusion/x86/mod.rs (outdated, resolved)

src/crc32/fusion/x86/iscsi/avx512_vpclmulqdq.rs (resolved)

src/feature_detection.rs (outdated, resolved)

Improve docs (from GitHub Copilot’s suggestions)

d9b8291

onethumb merged commit d9b8291 into awesomized:main Oct 30, 2025
76 checks passed

onethumb added a commit that referenced this pull request Oct 30, 2025

Release 1.6.0

45394b1

* [Improve runtime feature detection (and performance)](#21)
* [remove libc](#20)
* [Enable generating and publishing binary packages](#22)
