### Timing of inverse square root

This algorithm's cost is dominated by multiplications, and it does 3 fixed-point multiplications per iteration.
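To illustrate where three multiplications per iteration could come from, here is a classical sketch that assumes the Newton-Raphson update $y \leftarrow y(3 - a y^2)/2$. The notebook does not say which iteration `InverseSquareRoot` actually uses, so this is an assumption. The helper names, the initial guess `y0`, and the truncating rounding are also made up for illustration and are not part of `qmath`.

```python
# Classical sketch only, NOT the qmath implementation.
# Assumption: Newton-Raphson for 1/sqrt(a) on integers scaled by 2**RADIX.
RADIX = 30          # fractional bits, matching radix=30 in the notebook
ONE = 1 << RADIX    # fixed-point representation of 1.0


def fx_mul(x, y):
    """Multiply two fixed-point integers and rescale (truncating)."""
    return (x * y) >> RADIX


def inv_sqrt_fixed(a, y0, num_iterations):
    """Return (approximation of a**-0.5, number of multiplications used)."""
    a_fx = int(round(a * ONE))
    y = int(round(y0 * ONE))   # y0 is a hypothetical initial guess
    mults = 0
    for _ in range(num_iterations):
        y2 = fx_mul(y, y)                    # multiplication 1: y^2
        ay2 = fx_mul(a_fx, y2)               # multiplication 2: a * y^2
        y = fx_mul(y, 3 * ONE - ay2) >> 1    # multiplication 3; halving is a shift
        mults += 3
    return y / ONE, mults


approx, mults = inv_sqrt_fixed(0.1, y0=2.0, num_iterations=5)
print("multiplications:", mults)
print("error:", abs(approx - 0.1 ** -0.5))
```

With 5 iterations this performs 15 multiplications; subtraction and halving are cheap by comparison, which is consistent with the cost being dominated by multiplications.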

I would like to test it for correctness at high precision, requiring accuracy of at least $10^{-9}$, which means ~30 bits after the binary point. However, it runs so slowly that testing at this precision is impractical.
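The "~30 bits" figure can be checked with a short calculation. This is only the arithmetic behind the estimate (variable names are illustrative) and it ignores rounding error accumulated inside the algorithm.

```python
import math

target_accuracy = 1e-9

# Smallest number of fractional bits whose resolution 2**-bits is below the target.
fractional_bits = math.ceil(-math.log2(target_accuracy))
resolution = 2.0 ** -fractional_bits

print("fractional bits:", fractional_bits)
print("resolution:", resolution)
```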

In [1]:
import time
from psiqworkbench import QPU, QFixed
from psiqworkbench.filter_presets import BIT_DEFAULT
from qmath.func.inv_sqrt import InverseSquareRoot

qpu = QPU(filters=BIT_DEFAULT)
qpu.reset(2000)
# 40-bit fixed-point register with 30 bits after the binary point
q_a = QFixed(40, name="a", radix=30, qpu=qpu)
a = 0.1
q_a.write(a)
func = InverseSquareRoot(num_iterations=5)
t0 = time.time()
func.compute(q_a)
print(f"Time: {time.time()-t0}s")
q_result = func.get_result_qreg()
expected = a**-0.5
print("Error:", abs(q_result.read() - expected))

Time: 16.151357173919678s
Error: 6.683499087500877e-09


It took 16 seconds to do 5*3=15 multiplications, i.e. about one second per multiplication of two 40-bit numbers. This makes it impractical to run good tests over a variety of inputs for different algorithms at reasonable precision.
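To put the measurement in perspective, the snippet below derives the per-multiplication time from the timing above and extrapolates it to a test sweep. The sweep size of 100 inputs is a hypothetical number chosen for illustration, not something measured.

```python
measured_seconds = 16.151357173919678  # timing reported by the run above
num_iterations = 5
mults_per_iteration = 3

per_mult_seconds = measured_seconds / (num_iterations * mults_per_iteration)

num_test_inputs = 100  # hypothetical sweep size
total_minutes = num_test_inputs * measured_seconds / 60

print(f"per multiplication: {per_mult_seconds:.2f}s")
print(f"{num_test_inputs} inputs: {total_minutes:.1f} min")
```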

Below is an even simpler repro: a single multiplication of two 40-bit numbers takes ~1 second.

In [2]:
import time
from psiqworkbench import QPU, QFixed
import psiqworkbench.qubricks as qbk
from psiqworkbench.filter_presets import BIT_DEFAULT

qpu = QPU(filters=BIT_DEFAULT)
qpu.reset(300)
q_a = QFixed(40, name="a", radix=30, qpu=qpu)
q_b = QFixed(40, name="b", radix=30, qpu=qpu)
q_c = QFixed(40, name="c", radix=30, qpu=qpu)
q_a.write(2.2)
q_b.write(3.3)
t0=time.time()
qbk.GidneyMultiplyAdd().compute(q_c, q_a, q_b)
print(f"Time: {time.time()-t0}s")
print(q_c.read())

Time: 0.9535529613494873s
7.259999999776483
