u32 saturating_mul with small constant is slower than a multiply+compare #34948

@eefriedman

#![feature(test)]
#![feature(core_intrinsics)]

extern crate test;

static mut XXX: u32 = 10;

use test::Bencher;

#[bench]
fn bench_sat_mul(b: &mut Bencher) {
    b.iter(|| unsafe {
        for _ in 1..1000 {
            let mut r = std::intrinsics::volatile_load(&XXX);
            r = r.saturating_mul(3);
            std::intrinsics::volatile_store(&mut XXX, r);
        }
    });
}

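// Widen both operands to u64 so the product cannot overflow,
// then clamp anything above u32::MAX (0xFFFFFFFF) back down to it.
#[inline(always)]
fn fast_saturating_mul(a: u32, b: u32) -> u32 {
    let r = a as u64 * b as u64;
    if r > 0xFFFFFFFF { 0xFFFFFFFF } else { r as u32 }
}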

#[bench]
fn bench_sat_mul_2(b: &mut Bencher) {
    b.iter(|| unsafe {
        for _ in 1..1000 {
            let mut r = std::intrinsics::volatile_load(&XXX);
            r = fast_saturating_mul(r, 3);
            std::intrinsics::volatile_store(&mut XXX, r);
        }
    });
}

Resulting timings (x86-64 Linux, Ivy Bridge processor):

test bench_sat_mul   ... bench:       4,354 ns/iter (+/- 231)
test bench_sat_mul_2 ... bench:       3,710 ns/iter (+/- 108)

Maybe not a perfect benchmark, but the built-in saturating_mul comes out roughly 17% slower than the multiply+compare version here, so there's probably something worth looking at. Originally reported at: https://users.rust-lang.org/t/unexpected-performance-from-array-bound-tests-and-more/6376/5
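For anyone who wants to sanity-check the comparison without nightly features, here is a minimal sketch on stable Rust showing that the widening-multiply approach returns the same results as `u32::saturating_mul`. The names `widening_saturating_mul`, `flag_saturating_mul`, and `agree_on_edge_cases` are made up for this example. The `checked_mul` variant only illustrates an overflow-flag style of implementation; it is an assumption, not a statement of how the standard library or LLVM actually lowers `saturating_mul`.

```rust
/// Hypothetical helper: same idea as `fast_saturating_mul` in the report.
/// The product of two u32 values always fits in a u64, so the multiply
/// itself cannot overflow; only the final clamp is needed.
fn widening_saturating_mul(a: u32, b: u32) -> u32 {
    let wide = u64::from(a) * u64::from(b);
    if wide > u64::from(u32::MAX) {
        u32::MAX
    } else {
        wide as u32
    }
}

/// Hypothetical helper: an overflow-flag style version built on `checked_mul`.
/// Assumption: shown only for comparison, not as the real std implementation.
fn flag_saturating_mul(a: u32, b: u32) -> u32 {
    match a.checked_mul(b) {
        Some(v) => v,
        None => u32::MAX,
    }
}

/// Returns true if both helpers agree with `u32::saturating_mul`
/// for every pair drawn from a small set of boundary values.
fn agree_on_edge_cases() -> bool {
    let samples = [
        0u32,
        1,
        2,
        3,
        10,
        u32::MAX / 3,     // largest value that does not overflow when tripled
        u32::MAX / 3 + 1, // smallest value that overflows when tripled
        u32::MAX - 1,
        u32::MAX,
    ];
    samples.iter().all(|&a| {
        samples.iter().all(|&b| {
            let expected = a.saturating_mul(b);
            widening_saturating_mul(a, b) == expected
                && flag_saturating_mul(a, b) == expected
        })
    })
}
```

Calling `agree_on_edge_cases()` from a `main` or a unit test only checks that the results match. It says nothing about speed, which still needs a benchmark like the one above.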

Labels: C-enhancement (Category: An issue proposing an enhancement or a PR with one.); I-slow (Issue: Problems and improvements with respect to performance of generated code.)