AVX2 version of Luhn algorithm
Updated Jul 12, 2023 - C++
A SIMD library that provides an intuitive and readable interface to 256-bit AVX and AVX2 SIMD instructions using low-cost abstractions.
Implementations in plain (scalar) CPU code, SSE, and AVX. I created this project to test whether the AVX and SSE code for my neural network works correctly and to compare its performance with regular CPU code. The focus is on operations such as dot products, the Adam optimizer, and gradient updates.
Mandelbrot set SIMD optimization
Tiny, optimized, game-oriented math framework
A hash table optimization project that uses assembly and several other techniques to improve performance.
SIMD-accelerated xoshiro128 that generates a 256-bit 'number' at once
A template-based C++ short vector library with vectorized, faithfully rounded elementary functions
This C++ program demonstrates array vectorization techniques from the AVX2 SIMD Assembly library, applied to C++ arrays through the vector class library created by Agner Fog. An ASM version of the same process is implemented for comparison.
Compile-time blend masks that unify _mm256_blend_epi8, _mm256_blend_epi16, _mm256_blend_epi32
Calculate Sum of Absolute Differences (SAD) using AVX-512
FastXor - SIMD-based XOR Encryption
2-norm guided FP32 truncation for heterogeneous deep learning training
Small fixed size image correlation filter implemented with AVX
Some loose performance experiments with Agner Fog's VCL