Performance-portable, length-agnostic SIMD with runtime dispatch
Updated Jun 3, 2024 - C++
Fast inference engine for Transformer models
Achieve peak performance on x86 CPUs and NVIDIA GPUs
Pintool library for running Quantum Break on pre-SSE4.2 CPUs
What is camera calibration, why is it necessary, and how do we compute it?
OpenCV4 C++ camera calibration in a few lines
My C/C++/Intrinsic, OpenGL/OpenGLES2 experiments for desktop computers.
Example implementations of spinlocks
A C++ header-only library for vector, matrix, and quaternion math.
A GUI for viewing Intel intrinsic information combined with uops.info measurement data.
IQ-TREE ported to work for systems with ARM NEON ISA
A discoverable fractal world.
Some assignments from my Performance Analysis and Optimization course.
A project that uses Google Benchmark to compare a few intrinsics implementations (AVX and AVX2) against MSVC's maximum optimization settings
Mandelbrot set SIMD optimization