SIMD (SSE) implementation of the infamous Fast Inverse Square Root algorithm from Quake III Arena.

Xerbo/simd_fastinvsqrt

simd_fastinvsqrt


Why

Why not.

How

This video explains it well.
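For readers who prefer code to video, here is a minimal sketch of the scalar algorithm as it is commonly described. The name `q_rsqrt_sketch` is made up for this example and is not part of this repository, and `memcpy` is used for the bit reinterpretation instead of the pointer cast found in the original Quake III source, so this is an illustration rather than the code shipped in the header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name, for illustration only.
 * Approximates 1/sqrt(x) for positive, finite x. */
static float q_rsqrt_sketch(float x)
{
    const float half_x = 0.5f * x;
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);     /* view the float's bits as an integer */
    bits = 0x5f3759dfu - (bits >> 1);   /* magic constant gives a first guess  */
    memcpy(&x, &bits, sizeof x);        /* view the integer as a float again   */

    /* One Newton-Raphson step refines the guess. */
    return x * (1.5f - half_x * x * x);
}
```

Calling `q_rsqrt_sketch(4.0f)` should give a value close to, but not exactly, 0.5, since the result is only an approximation.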

Speed test

Here are the results of running benchmark.c, compiled with -O2, on my hardware:

Q_rsqrt(x) took 110.617000ms
1.0f/sqrtf(x) took 343.441000ms
Q_rsqrt_sse(x) took 31.145000ms
_mm_div_ps(_mm_set1_ps(1.0f), _mm_sqrt_ps(x)) took 85.942000ms
_mm_rsqrt_ps(x) took 13.855000ms

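We can clearly see that Q_rsqrt_sse is significantly faster than the scalar Q_rsqrt (about 3.5x, the theoretical maximum being 4x since one SSE register holds four floats). The fastest of all is SSE's native approximate inverse square root, _mm_rsqrt_ps, at a bit over 2x faster again than Q_rsqrt_sse.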
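To make the comparison concrete, this is a rough sketch of how the same trick can be applied to four floats at once. It assumes SSE2 is available (the integer shift and subtract are SSE2 intrinsics), and `rsqrt_sse_sketch` is a hypothetical name for this example; the actual Q_rsqrt_sse in the header may be written differently.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE float ones */

/* Hypothetical name, for illustration only.
 * Approximates 1/sqrt(x) for each of the four positive floats in x. */
static __m128 rsqrt_sse_sketch(__m128 x)
{
    const __m128 half_x = _mm_mul_ps(_mm_set1_ps(0.5f), x);

    /* Reinterpret the float lanes as 32-bit integers (no conversion). */
    __m128i bits = _mm_castps_si128(x);

    /* Magic constant minus (bits >> 1), per lane. */
    bits = _mm_sub_epi32(_mm_set1_epi32(0x5f3759df), _mm_srli_epi32(bits, 1));

    /* Reinterpret back to floats: this is the first guess. */
    __m128 y = _mm_castsi128_ps(bits);

    /* One Newton-Raphson step per lane: y * (1.5 - half_x * y * y). */
    const __m128 yy = _mm_mul_ps(y, y);
    return _mm_mul_ps(y, _mm_sub_ps(_mm_set1_ps(1.5f), _mm_mul_ps(half_x, yy)));
}
```

Inputs can be packed with `_mm_setr_ps` and the four results read back with `_mm_storeu_ps`. Each lane goes through the same steps as the scalar version, which is why the speedup can approach, but not exceed, 4x.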

Using it

If for some godforsaken reason you want to use this, just include the simd_fastinvsqrt.h header in your program. Define INCLUDE_ORIGINAL before including it to bring in the original Q_rsqrt as well.
