qFib


This is a small proof of concept of a very curious matrix I derived. It allows multiple Fibonacci terms to be calculated in parallel using only a subset of the previous terms:

Important relations
F( n - 0 )	= F( n - 1 ) + F( n - 2 )
F( n - 1 )	= F( n - 2 ) + F( n - 3 )
F( n - 2 )	= F( n - 3 ) + F( n - 4 )
F( n - 3 )	= F( n - 4 ) + F( n - 5 )
F( n - 4 )	= F( n - 5 ) + F( n - 6 )

Guided substitutions, favoring power-of-two coefficients (marked with *)
and limiting the dependencies to only the previous four terms.

F( n )		= F( n - 1 ) + F( n - 2 )
			= ( F( n - 2 ) + F( n - 3 ) ) + F( n - 2 )
	*		= 2 * F( n - 2 ) + F( n - 3 )
			= 2 * (F( n - 3 ) + F( n - 4 )) + F( n - 3  )
			= 3 * F( n - 3 ) + 2 * F( n - 4 )

F( n + 1 )  = F( n ) + F( n - 1 )
			= ( F( n - 1 ) + F( n - 2 ) ) + F( n - 1 )
	*		= 2 * F( n - 1 ) + F( n - 2 )
			= 2 * (F( n - 2 ) + F( n - 3 )) + F( n - 2  )
			= 3 * F( n - 2 ) + 2 * F( n - 3 )

F( n + 2 )  = F( n + 1 ) + F( n )
			= (F( n ) + F( n - 1 )) + F( n )
			= 2 * F( n ) + F( n - 1 )
			= 2 * (F( n - 1 ) + F( n - 2 )) + F( n - 1 )
			= 2 * F( n - 1 ) + 2 * F( n - 2 ) + F( n - 1 )
			= 3 * F( n - 1 ) + 2 * F( n - 2 )

			= 2 * F( n ) + F( n - 1 )
			= 2 * ( 2 * F( n - 2 ) + F( n - 3 ) ) + F( n - 1 )
	*		= 4 * F( n - 2 ) + 2 * F( n - 3 ) + F( n - 1 )


F( n + 3 )  = F( n + 2 ) + F( n + 1 )
			= 3 * F( n - 1 ) + 2 * F( n - 2 ) + 3 * F( n - 2 ) + 2 * F( n - 3 )
			= 3 * F( n - 1 ) + 5 * F( n - 2 ) + 2 * F( n - 3 )

			= 3 * F( n - 1 ) + 2 * F( n - 2 ) + F( n ) + F( n - 1 )
			= 4 * F( n - 1 ) + 2 * F( n - 2 ) + F( n )
	*		= 4 * F( n - 1 ) + 4 * F( n - 2 ) + F( n - 3 )
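
The final line of each derivation above can be checked numerically. The following is a minimal Python sketch, not code from this repository; the helper names `fib_list` and `check_identities` are made up for illustration, and F(0) = 0, F(1) = 1 is assumed.

```python
def fib_list(count):
    """Return [F(0), F(1), ..., F(count - 1)] using the plain recurrence."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def check_identities(limit=40):
    """Assert the derived identities for every n in [4, limit)."""
    F = fib_list(limit + 4)  # indices up to n + 3 are needed
    for n in range(4, limit):
        assert F[n] == 2 * F[n - 2] + F[n - 3]
        assert F[n] == 3 * F[n - 3] + 2 * F[n - 4]
        assert F[n + 1] == 2 * F[n - 1] + F[n - 2]
        assert F[n + 2] == F[n - 1] + 4 * F[n - 2] + 2 * F[n - 3]
        assert F[n + 3] == 4 * F[n - 1] + 4 * F[n - 2] + F[n - 3]
    return True
```

Calling `check_identities()` returns `True` when every identity holds over the tested range; a failed identity would raise `AssertionError`.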

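Collecting these coefficients gives a matrix that calculates the next four Fibonacci terms using only the previous four terms. The rows for F( n + 3 ), F( n + 2 ) and F( n + 1 ) are the starred lines above, and the row for F( n ) is the plain recurrence. The coefficients curiously follow a pattern similar to Pascal's Triangle.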

F( n + 4 * k - 1 ) = [ 4 4 1 0 ]^k  ( F( n - 1 ) )
F( n + 4 * k - 2 ) = [ 1 4 2 0 ]    ( F( n - 2 ) )
F( n + 4 * k - 3 ) = [ 2 1 0 0 ]    ( F( n - 3 ) )
F( n + 4 * k - 4 ) = [ 1 1 0 0 ]    ( F( n - 4 ) )
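
To make the matrix concrete, here is a small Python sketch (not the repository's SSE code) that applies it once per batch with ordinary integer arithmetic. The names `M`, `step` and `fib_batches` are hypothetical, and the state is assumed to be ordered newest term first, matching the column vector above.

```python
# Rows produce F(n + 3), F(n + 2), F(n + 1), F(n) from
# the column vector (F(n - 1), F(n - 2), F(n - 3), F(n - 4)).
M = [
    [4, 4, 1, 0],
    [1, 4, 2, 0],
    [2, 1, 0, 0],
    [1, 1, 0, 0],
]


def step(state):
    """Advance a batch of four terms (newest first) by four positions."""
    return tuple(sum(c * x for c, x in zip(row, state)) for row in M)


def fib_batches(batches):
    """Return F(0), F(1), ... computed four terms at a time."""
    state = (2, 1, 1, 0)  # seed batch: (F(3), F(2), F(1), F(0))
    out = []
    for _ in range(batches):
        out.extend(reversed(state))  # emit the batch oldest term first
        state = step(state)
    return out
```

For example, `fib_batches(2)` returns `[0, 1, 1, 2, 3, 5, 8, 13]`: the seed batch followed by one application of the matrix.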

Raising the matrix to the k-th power advances the whole batch by a stride of 4*k Fibonacci terms in a single multiplication:

exp 1, skip 4
F( n + 3 ) = [ 4 4 1 0 ]^1  ( F( n - 1 ) )
F( n + 2 ) = [ 1 4 2 0 ]    ( F( n - 2 ) )
F( n + 1 ) = [ 2 1 0 0 ]    ( F( n - 3 ) )
F( n + 0 ) = [ 1 1 0 0 ]    ( F( n - 4 ) )

exp 2, skip 8
F( n + 7 ) = [ 4 4 1 0 ]^2  ( F( n - 1 ) )
F( n + 6 ) = [ 1 4 2 0 ]    ( F( n - 2 ) )
F( n + 5 ) = [ 2 1 0 0 ]    ( F( n - 3 ) )
F( n + 4 ) = [ 1 1 0 0 ]    ( F( n - 4 ) )

F( n + 7 ) = [ 22 33 12 0 ]  ( F( n - 1 ) )
F( n + 6 ) = [ 12 22  9 0 ]  ( F( n - 2 ) )
F( n + 5 ) = [  9 12  4 0 ]  ( F( n - 3 ) )
F( n + 4 ) = [  5  8  3 0 ]  ( F( n - 4 ) )

exp 3, skip 12
F( n + 11 ) = [ 4 4 1 0 ]^3  ( F( n - 1 ) )
F( n + 10 ) = [ 1 4 2 0 ]    ( F( n - 2 ) )
F( n +  9 ) = [ 2 1 0 0 ]    ( F( n - 3 ) )
F( n +  8 ) = [ 1 1 0 0 ]    ( F( n - 4 ) )

F( n + 11 ) = [ 145 232 88 0 ]  ( F( n - 1 ) )
F( n + 10 ) = [  88 145 56 0 ]  ( F( n - 2 ) )
F( n +  9 ) = [  56  88 33 0 ]  ( F( n - 3 ) )
F( n +  8 ) = [  34  55 21 0 ]  ( F( n - 4 ) )
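
The squared and cubed matrices listed above can be reproduced with a few lines of Python. This is only an illustrative sketch: `mat_mul`, `mat_pow` and `jump` are hypothetical helper names, and exponentiation by squaring is one possible way to form the power, not necessarily what this project does. It assumes that the k-th power maps (F(n-1), F(n-2), F(n-3), F(n-4)) to (F(n+4k-1), ..., F(n+4k-4)).

```python
M = [
    [4, 4, 1, 0],
    [1, 4, 2, 0],
    [2, 1, 0, 0],
    [1, 1, 0, 0],
]


def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, k):
    """Raise a square matrix to the k-th power by repeated squaring."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    base = [row[:] for row in m]
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result


def jump(state, k):
    """Advance a batch of four terms (newest first) by 4 * k positions."""
    p = mat_pow(M, k)
    return tuple(sum(c * x for c, x in zip(row, state)) for row in p)
```

Here `mat_pow(M, 2)` has first row `[22, 33, 12, 0]`, and `jump((2, 1, 1, 0), 3)` starts from (F(3), F(2), F(1), F(0)) and returns `(610, 377, 233, 144)`, that is F(15) down to F(12).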

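The coefficients of the base matrix all happen to be powers of two or zero, which reduces the matrix multiplication to very fast and simple left bit shifts (_mm_sllv_epi32) and horizontal adds (transpose the matrix, and use _mm_add_epi32). This holds for the first power only: the higher powers shown above contain coefficients such as 22 and 33 that are not powers of two.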
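
The shift-and-add idea can be sketched in scalar Python, emulating 32-bit lanes with a mask. This is an illustration under stated assumptions rather than the project's SIMD code: `MASK`, `SHIFTS` and `shift_step` are made-up names, and a shift count of 32 is used here to stand in for a zero coefficient, since it pushes every bit out of a 32-bit lane (`_mm_sllv_epi32`, an AVX2 intrinsic, likewise produces zero for counts above 31).

```python
MASK = 0xFFFFFFFF  # keep results in 32 bits, i.e. arithmetic mod 2**32

# Left-shift counts equivalent to the coefficients:
# 4 -> shift by 2, 2 -> shift by 1, 1 -> shift by 0, 0 -> shift by 32.
SHIFTS = [
    [2, 2, 0, 32],   # [ 4 4 1 0 ]
    [0, 2, 1, 32],   # [ 1 4 2 0 ]
    [1, 0, 32, 32],  # [ 2 1 0 0 ]
    [0, 0, 32, 32],  # [ 1 1 0 0 ]
]


def shift_step(state):
    """Advance (F(n-1), F(n-2), F(n-3), F(n-4)) by four terms, mod 2**32."""
    out = []
    for counts in SHIFTS:
        total = 0
        for x, s in zip(state, counts):
            # Shift one lane, drop overflowed bits, then accumulate.
            total = (total + ((x << s) & MASK)) & MASK
        out.append(total)
    return tuple(out)
```

Starting from `(2, 1, 1, 0)`, one call to `shift_step` yields `(13, 8, 5, 3)`. Because every intermediate value is masked, the results stay correct modulo 2^32 once the true Fibonacci values exceed 32 bits.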

On an Intel(R) Core(TM) i3-6100 CPU @ 3.70GHz, a batch of four Fibonacci terms (mod 2^32) can be calculated in parallel in only about 21ns on average using SSE instructions.


       0:                               0
       1:                               1
       2:                               1
       3:                               2
24ns | 
       0:                               3
       1:                               5
       2:                               8
       3:                              13
27ns | 
       4:                              21
       5:                              34
       6:                              55
       7:                              89
20ns | 
       8:                             144
       9:                             233
      10:                             377
      11:                             610
20ns | 
      12:                             987
      13:                            1597
      14:                            2584
      15:                            4181
19ns | 
      16:                            6765
      17:                           10946
      18:                           17711
      19:                           28657
19ns | 
      20:                           46368
      21:                           75025
      22:                          121393
      23:                          196418
19ns | 
      24:                          317811
      25:                          514229
      26:                          832040
      27:                         1346269
19ns | 
      28:                         2178309
      29:                         3524578
      30:                         5702887
      31:                         9227465
...

By comparison, other methods typically take a hundred nanoseconds or more to serially calculate just one Fibonacci term:

n  | Recursive|   2-Stack|2-Stack-Register|Matrix-Exponent|Chun-Min Chang|
0  |       219|       103|              95|            100|           285|
1  |        67|        67|              50|             54|           101|
2  |        84|        53|              52|             54|            95|
3  |       121|       118|             117|            139|           128|
4  |       142|       117|             133|            113|           108|
5  |       156|       108|             130|            421|           111|
6  |       266|       445|             105|            138|           114|
7  |       379|       145|             131|            106|           131|
8  |       402|       132|             129|            140|           134|
9  |       535|       161|              98|            135|           110|
10 |       728|       137|              98|             94|           124|
11 |      1403|       449|             127|            112|           149|
12 |      1855|       130|             137|            134|           149|
13 |      2168|       152|             137|            122|           118|
14 |      3064|       111|              96|             89|           102|
15 |      3266|        83|             105|             74|            93|
16 |      5846|       106|             136|             93|           123|
17 |     11604|       141|             153|            131|           138|
18 |     26330|       128|             138|            137|           163|
19 |     30329|       107|             117|            119|            98|
20 |     46648|       136|             123|             99|            92|
21 |     67866|       112|             108|             86|           124|
22 |    102843|       121|             115|            102|           102|