🍵 use slice internally + implement for various types #13

Merged: 23 commits, Feb 3, 2023

@jvdd (Owner) commented Jan 9, 2023


  • final rerun of benchmarks
    • no regressions (i.e., same performance or even better 🚀) for scalar implementation
    • no regressions (i.e., same performance or even better 🚀) for SIMD implementation
      => I noticed a few minor regressions for certain dtype / SIMD combinations and highlighted those in the benches/results file. However, since the CodSpeed continuous monitoring only shows roughly 2% regressions for AVX2 u8 & i8 (which are most likely not the most common data types for end users), I am in favor of merging this into main, given the significantly enhanced flexibility that this PR adds :)
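
To illustrate what "use slice internally + implement for various types" can look like, here is a minimal scalar sketch. It is not the code from this PR: the trait name `ArgMinMaxSketch` and the method `argminmax_sketch` are hypothetical, and the blanket implementation over `[T]` is just one possible way to cover many element types at once.

```rust
/// Hypothetical trait for illustration only; not the crate's actual API.
pub trait ArgMinMaxSketch {
    /// Returns `(index_of_min, index_of_max)`, taking the first occurrence
    /// of each extreme. Panics on an empty slice.
    fn argminmax_sketch(&self) -> (usize, usize);
}

// Implementing on the slice type `[T]` means arrays and `Vec<T>` can call
// the method too, and a single impl covers every `Copy + PartialOrd` type.
impl<T: Copy + PartialOrd> ArgMinMaxSketch for [T] {
    fn argminmax_sketch(&self) -> (usize, usize) {
        assert!(!self.is_empty(), "slice must not be empty");
        let (mut min_idx, mut max_idx) = (0usize, 0usize);
        let (mut min_val, mut max_val) = (self[0], self[0]);
        for (i, &v) in self.iter().enumerate().skip(1) {
            // Comparisons involving NaN are false, so NaN values are skipped
            // (unless the first element is NaN, which then never gets replaced).
            if v < min_val {
                min_val = v;
                min_idx = i;
            } else if v > max_val {
                max_val = v;
                max_idx = i;
            }
        }
        (min_idx, max_idx)
    }
}
```

With the trait in scope, the same call works for floats and integers alike, e.g. `[3.0f32, 1.0, 4.0].argminmax_sketch()` or `vec![5u8, 9, 2].argminmax_sketch()`. A SIMD variant would replace the loop body but could keep the same slice-based signature.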

Review threads: Cargo.toml (resolved), src/lib.rs (outdated, resolved)

@jvdd (Owner, Author) left a comment:

LGTM 🙃

codspeed-hq bot commented Feb 2, 2023

CodSpeed Performance Report

Merging #13 other_types (bad1543) will improve performance by 5.5%.

Summary

🔥 21 improvements
❌ 2 regressions
✅ 21 untouched benchmarks

🆕 0 new benchmarks
⁉️ 0 dropped benchmarks

Benchmarks breakdown

| | Benchmark | main | other_types | Change |
|---|---|---|---|---|
| 🔥 | scalar_random_long_f32 | 1.9 ms | 1.7 ms | 9.08% |
| 🔥 | sse_random_long_f32 | 883.6 µs | 712.1 µs | 19.41% |
| 🔥 | scalar_random_long_i16 | 2.1 ms | 1.6 ms | 23.97% |
| 🔥 | sse_random_long_i16 | 411.3 µs | 389.3 µs | 5.35% |
| 🔥 | scalar_random_long_i32 | 2.1 ms | 1.7 ms | 16.62% |
| 🔥 | sse_random_long_i32 | 819.4 µs | 797.3 µs | 2.70% |
| 🔥 | scalar_random_long_f64 | 2.1 ms | 1.9 ms | 8.30% |
| 🔥 | sse_random_long_f64 | 1.8 ms | 1.4 ms | 19.38% |
| 🔥 | scalar_random_long_i64 | 2.2 ms | 1.9 ms | 15.30% |
| 🔥 | sse_random_long_i64 | 2 ms | 1.8 ms | 12.71% |
| 🔥 | scalar_random_long_u32 | 2.1 ms | 1.7 ms | 16.62% |
| 🔥 | scalar_random_long_u64 | 2.2 ms | 1.9 ms | 15.30% |
| 🔥 | sse_random_long_u64 | 2.1 ms | 1.8 ms | 12.19% |
| 🔥 | scalar_random_long_f16 | 2.6 ms | 2.3 ms | 12.89% |
| 🔥 | sse_random_long_f16 | 497.6 µs | 475.6 µs | 4.42% |
| 🔥 | scalar_random_long_u16 | 2.1 ms | 1.6 ms | 23.97% |
| 🔥 | sse_random_long_u16 | 433.1 µs | 421.3 µs | 2.73% |
| 🔥 | scalar_random_long_i8 | 2 ms | 1.9 ms | 4.28% |
| ❌ | avx2_random_long_i8 | 200.8 µs | 205.3 µs | -2.26% |
| 🔥 | scalar_random_long_u8 | 2 ms | 1.9 ms | 4.28% |
| 🔥 | sse_random_long_u8 | 287 µs | 275.6 µs | 3.98% |
| ❌ | avx2_random_long_u8 | 197.3 µs | 201.8 µs | -2.28% |
| 🔥 | impl_random_long_u8 | 287.2 µs | 275.8 µs | 3.97% |
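
The Change column appears consistent with a relative difference of the form (main - other_types) / main, where positive values mean the branch is faster. This is an assumption inferred from the numbers, not a documented CodSpeed formula, and the helper name `relative_change_percent` is hypothetical. A minimal sketch:

```rust
/// Relative improvement in percent: positive when `new` is faster (smaller)
/// than `base`, negative when it is slower. Assumed formula, see lead-in.
pub fn relative_change_percent(base: f64, new: f64) -> f64 {
    (base - new) / base * 100.0
}
```

For example, `relative_change_percent(883.6, 712.1)` gives about 19.41, matching the sse_random_long_f32 row. Because the report displays rounded timings, recomputing from the shown values can differ slightly from the reported figure for some rows (about -2.24 versus the reported -2.26% for avx2_random_long_i8).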

@jvdd mentioned this pull request Feb 2, 2023 (6 tasks)
@jvdd deleted the other_types branch February 28, 2023 21:16
Successfully merging this pull request may close these issues.

✨ implement ArgMinMax trait for other types