# 8bit Power Iterator

We have an 8-bit covariance estimator.

Now we can look at an 8-bit power iterator. This follows on from the Speeding up with Cython Part I - http://localhost:8888/notebooks/2020-04-25%20-%20Speeding%20Up%20with%20Cython.ipynb.

For power iterator we have the following data:
* ev - this is normally a float numpy 1D array with fractional values
* cov - this is normally a float numpy 2D array 
* scaler - this will have a fractional value greater than 1 (normally greater than 1.4).

We can keep ev and cov in signed 8 bit space and take the 1/127 out of the calculations (putting them at the front).

We have:
* cov^power \* ev - we can take out (1/127)^2 if power = 1.
* ev^T.input_data - we can take out (1/127) - input_data is -1, 0, 1 so so we have the same possible scale at ev but flipped - why we need symmetric scales - clip at -127 to 127.

We can either operate in the next power of 2 space - e.g. 16 bit vs 8 bit - the result is stored in 16 bit space to represent a max of 128 \* 128 (if signed - 255 if unsigned). OR. We can divide both the cov and the ev by 16 - so that the we have max 16\*16 = 256 (right shift by 3). 

In [2]:
import numpy as np

---
### Short Aside on Shifting

In [5]:
temp_ev = np.ones(shape=(4, 1)).astype(np.int16)*127; print(temp_ev)
shifted = np.right_shift(temp_ev, 7).astype(np.int8)
print(shifted, temp_ev//127, temp_ev/127, sep="\n")

[[127]
 [127]
 [127]
 [127]]
[[0]
 [0]
 [0]
 [0]]
[[1]
 [1]
 [1]
 [1]]
[[1.]
 [1.]
 [1.]
 [1.]]


In [6]:
%%timeit
np.right_shift(temp_ev, 7).astype(np.int8)

2.89 µs ± 66.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [7]:
%%timeit
(temp_ev//127).astype(np.int8)

3.18 µs ± 43.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [8]:
2**7, 2**6

(128, 64)

In [15]:
np.right_shift(128, 7).astype(np.int8)

1

In [17]:
np.right_shift(25, 2).astype(np.int8), 25//4

(6, 6)

In [19]:
for i in range(0, 127):
    print(np.right_shift(i, 2).astype(np.int8), i//4)

0 0
0 0
0 0
0 0
1 1
1 1
1 1
1 1
2 2
2 2
2 2
2 2
3 3
3 3
3 3
3 3
4 4
4 4
4 4
4 4
5 5
5 5
5 5
5 5
6 6
6 6
6 6
6 6
7 7
7 7
7 7
7 7
8 8
8 8
8 8
8 8
9 9
9 9
9 9
9 9
10 10
10 10
10 10
10 10
11 11
11 11
11 11
11 11
12 12
12 12
12 12
12 12
13 13
13 13
13 13
13 13
14 14
14 14
14 14
14 14
15 15
15 15
15 15
15 15
16 16
16 16
16 16
16 16
17 17
17 17
17 17
17 17
18 18
18 18
18 18
18 18
19 19
19 19
19 19
19 19
20 20
20 20
20 20
20 20
21 21
21 21
21 21
21 21
22 22
22 22
22 22
22 22
23 23
23 23
23 23
23 23
24 24
24 24
24 24
24 24
25 25
25 25
25 25
25 25
26 26
26 26
26 26
26 26
27 27
27 27
27 27
27 27
28 28
28 28
28 28
28 28
29 29
29 29
29 29
29 29
30 30
30 30
30 30
30 30
31 31
31 31
31 31


Not much slower just to use //. But we might move to right shift later. Only 128 will be shifted to 1. We would just add one to positive values and -1 to negative values then shift.

**WE CAN'T APPLY BIT SHIFTS TO NEGATIVE VALUES - https://stackoverflow.com/questions/4009885/arithmetic-bit-shift-on-a-signed-integer**

---
## Power Iteration Definition and Test

We have two areas where we need to normalise - one we need to multiply by 127 the other we don't. But if we don't divide by 127 when we square we'll got beyond 8-bits - so we might need int32s for some. If we used int32's we could avoid the need to divide by array.shape[0].

In [107]:
def normalise(array, scale_by_max=True):
    """Scale 8-bit array by L2 norm.
    
    Args:
        array - int16 numpy 1D array holding 8-bit values.
    """
    # Square 
    squared = temp_array**2
    # Scale by L
    squared = squared // array.shape[0]
    # Sum
    array_sum = squared.sum()
    # Square root 
    sq_root = np.sqrt(array_sum) # We can keep this is 16-bit space to avoid unnecessary casting
    
    if scale_by_max:
        # Scale by max_value / sqrt(length) and divide by norm
        scaled_array = (temp_array*127)//(sq_root*np.sqrt(array.shape[0]))
    else:
        # Avoid the scaling by max
        scaled_array = temp_array//(sq_root*np.sqrt(array.shape[0]))
    return scaled_array.astype(np.int16)

As it takes time to cast backwards and forwards it makes sense to assume our ev and cov are 8-bit values but define those variables as signed 16 bit ints.

In [345]:
class PowerIterator:
    """Module to determine an eigenvector using power iteration.
    
    Operates on 8-bit values. The right shift can be replaced for "// length" to allow all lengths
    """

    def __init__(self, length=4):
        """Initialise.

        Args:
            length: integer setting the 1D size of the eigenvector 
            - needs to be a power of 2.
        """
        assert isinstance(length, int)
        self.length = length
        # Initialise eigenvector as random vector
        # Set range to -127 to 127 (to be symmetrical)
        # But generate as 16 bit value as we normalise to 8-bit - saves a casting
        self.ev = np.random.randint(low=-127, high=128, size=(length, 1), dtype=np.int16)
        # Loop if we get all zeros
        while not self.ev.any():
            self.ev = np.random.randint(low=-127, high=128, size=(length, 1), dtype=np.int16)
        # Normalise the eigenvector using the L2 norm
        self.ev = normalise(self.ev)
        # Define placeholder for covariance matrix
        self.cov = np.zeros(shape=(length, length), dtype=np.int16)
        # Define eigenvalue
        self.rayleigh = np.zeros(1, dtype=np.int8)

    def iterate(self):
        """One pass of iteration.
        
        Applies power iteration with power = 1.
        Casts cov and ev to 16-bit, matmuls then casts
        back to 8-bit after scaling.
        
        We could even do this in 32-bit space then apply the scaling afterwards.
        I.e. in this case we won't need to divide by the length or by 127 and
        could use the same normalise routine.
        """
        # Check cov is not all zero - if all 0 you get nan
        if self.cov.any():
            # Divide by L - this is needed to avoid overflow in the matmul
            temp_ev = self.ev//self.length
            # Just do one multiplication per round
            temp_ev = np.matmul(temp_cov, temp_ev)
            # Divide by 127 (the max value)
            temp_ev = (temp_ev // 127).astype(np.int8)
            self.ev = temp_ev
        return self.ev

    @property
    def eigenvector(self):
        """Return the top eigenvector."""
        return self.ev.copy()

    @property
    def eigenvalue(self):
        """Return associated eigenvalue."""
        # Convert cov and ev to 16-bit
        temp_cov = self.cov.astype(np.int16)
        # Scale ev by L
        temp_ev = self.ev.astype(np.int16) // self.length
        # Compute in 16-bit space 
        top_1 = np.matmul(temp_ev.T, temp_cov)
        bottom = np.matmul(temp_ev.T, temp_ev)
        # Scale both to divide by bit_depth_max (i.e. 128 for 8bit)
        top_1 = top_1 // 127
        bottom = bottom // 127
        # Compute eigenvalue as Raleigh Quotient
        rayleigh = np.matmul(top_1, temp_ev) / bottom
        rayleigh = rayleigh.astype(np.int8)
        self.rayleigh = rayleigh
        return rayleigh

    def load_covariance(self, cov):
        """Update the covariance matrix."""
        self.cov = cov
        
    def __repr__(self):
        """Generate printable representation of state."""
        string = (
            f"Power Iterator - length {self.length}\n"
            f"Eigenvector:\n{self.eigenvector}\n"
            f"Eigenvalue:\n{self.eigenvalue}\n"
            f"Covariance:\n{self.cov}\n"
        )
        return string

In [346]:
# Test initialise
power = PowerIterator(2)
print(power)

ev1 = power.ev
assert ev1.any()
assert not power.cov.any()
# Check logic to avoid ev = nan
ev1_a = power.iterate()
print(power)
assert np.array_equal(ev1, ev1_a)
# Check update with non-zero cov
random_cov = np.random.randint(255, size=(2, 2))
# Don't scale by max here
power.load_covariance(random_cov)
print(power)
assert np.array_equal(power.cov, random_cov)
ev2 = power.iterate()
assert not np.array_equal(ev1, ev2)
assert np.array_equal(ev2, power.eigenvector)
print(power)

Power Iterator - length 2
Eigenvector:
[[ -13.]
 [-127.]]
Eigenvalue:
[[0]]
Covariance:
[[0 0]
 [0 0]]

Power Iterator - length 2
Eigenvector:
[[ -13.]
 [-127.]]
Eigenvalue:
[[0]]
Covariance:
[[0 0]
 [0 0]]

Power Iterator - length 2
Eigenvector:
[[ -13.]
 [-127.]]
Eigenvalue:
[[93]]
Covariance:
[[224 156]
 [148  55]]



ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)

Ah the unit length scaling also helps us to increase the values within the eigenvector as well as stopping explosion. Because if the norm is less than one, we multiple each value by a value > 1 when dividing.

We need to add back in the scaling.

In [103]:
def norm(array):
    """Calculate an L2 norm of an 8-bit array.
    Args:
        array - int8 numpy 1D array.
    """
    # Convert to 16-bit space
    temp_array = array.astype(np.int16)
    # Square 
    temp_array = temp_array**2
    # Scale by L
    temp_array = temp_array // array.shape[0]
    # Sum
    array_sum = temp_array.sum()
    # Square root
    sq_root = np.sqrt(array_sum)
    # Convert back to 8-bit space
    return sq_root.astype(np.int8)

In [78]:
length = 4
rand_vals = np.random.randint(low=-127, high=128, size=(length, 1)); print(rand_vals)

[[-125]
 [ -23]
 [ -91]
 [ -73]]


In [82]:
norm(rand_vals)

86.2554346113913

In [80]:
np.linalg.norm(rand_vals)

172.52246230563716

In [75]:
rand_vals.dtype

dtype('int64')

In [83]:
np.sqrt((rand_vals**2).sum())

172.52246230563716

In [84]:
norm(rand_vals)*2

172.5108692227826

Ah yes - it's half because we've divided by 4 to avoid overflow on 16bits - sqrt(4) = 2. Likewise 177 will overflow 8bits.

In [85]:
rand_vals/norm(rand_vals), rand_vals/np.linalg.norm(rand_vals)

(array([[-1.44918405],
        [-0.26664987],
        [-1.05500599],
        [-0.84632348]]), array([[-0.72454333],
        [-0.13331597],
        [-0.52746755],
        [-0.42313331]]))

In [88]:
rand_vals/norm(rand_vals)*63, rand_vals/np.linalg.norm(rand_vals)*127

(array([[-91.2985951 ],
        [-16.7989415 ],
        [-66.46537723],
        [-53.31837954]]), array([[-92.0170034 ],
        [-16.93112863],
        [-66.98837847],
        [-53.73792998]]))

In [89]:
rand_vals*63/norm(rand_vals), rand_vals*127/np.linalg.norm(rand_vals)

(array([[-91.2985951 ],
        [-16.7989415 ],
        [-66.46537723],
        [-53.31837954]]), array([[-92.0170034 ],
        [-16.93112863],
        [-66.98837847],
        [-53.73792998]]))

In [90]:
%%timeit
norm(rand_vals)

15.7 µs ± 198 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [91]:
%%timeit
np.linalg.norm(rand_vals)

7.21 µs ± 78.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [95]:
%%timeit
np.sqrt(((rand_vals**2)).sum())

9 µs ± 332 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [93]:
%%timeit
np.sqrt(((rand_vals**2)//4).sum())

10.8 µs ± 36.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [94]:
%%timeit
np.sqrt(((rand_vals.astype(np.int16)**2)//4).sum()).astype(np.int8)

15.3 µs ± 170 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [96]:
int8_vals = rand_vals.astype(np.int8)

In [98]:
%%timeit
np.sqrt(((int8_vals**2)).sum())

8.83 µs ± 67.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [99]:
%%timeit
np.linalg.norm(int8_vals)

7.24 µs ± 59.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [100]:
np.linalg.norm(int8_vals).dtype

dtype('float64')

Ah casting takes the extra time. There's not much extra difference in using a function. Maybe we can keep the eigenvector and cov as 16 bit values but always scale to keep within 8-bit range?

The divide is what slows us down. There's no real change in doing the calculations in 8-bit or 64-bit.

I wonder what happens if we use linalg and just convert back to 8-bit.

In [101]:
%%timeit
((rand_vals/np.linalg.norm(rand_vals))*127).astype(np.int8)

17.3 µs ± 153 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [102]:
%%timeit
((rand_vals/norm(rand_vals))*63).astype(np.int8)

23.7 µs ± 315 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


Ah but not much difference for the whole process.

In [None]:
np.sqrt((rand_vals**2).sum())

So we could bring in the multiplication by 2 later > I.e. we divide by the norm then only need to multiply by (127/2)

In [108]:
normalise(rand_vals)

array([[-93],
       [-17],
       [-67],
       [-54]], dtype=int8)

In [109]:
def test_normalise():
    """Test normalising an array using the L2 norm."""
    rand_vals = np.random.randint(low=-127, high=128, size=(4, 1))
    # Norm using function
    scaled = normalise(rand_vals)
    # Norm using linalg function
    linalg = ((rand_vals / np.linalg.norm(rand_vals))*127).astype(np.int8)
    assert np.allclose(scaled, linalg, atol=5)

In [112]:
test_normalise()

In [113]:
%%timeit
normalise(rand_vals)

26.7 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [114]:
%%timeit
((rand_vals / np.linalg.norm(rand_vals))*127).astype(np.int8)

17 µs ± 117 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


### 32 bit overkill

What if we go up into 32-bits yet keep the 8-bit values?

We can define temp_ev as a 32bit int and the covariance matrix as a 32-bit int. We can then just matmul and normalise to get an 8-bit output. 

We then don't have to worry about length scaling?

**BELOW IS THE CURRENT WORKING CODE.**

In [366]:
def normalise(array):
    """Scale array by L2 norm.
    
    Args:
        array - int32 numpy 1D array holding 8-bit values.
    """
    sq_root_sum = np.sqrt((array**2).sum())
    # Watch the below - we need to bring the 127 onto the top to keep everything in integer space
    scaled_array = (array*127//sq_root_sum) 
    return scaled_array.astype(np.int32)

In [367]:
class PowerIterator:
    """Module to determine an eigenvector using power iteration.
    
    Operates on 8-bit values. The right shift can be replaced for "// length" to allow all lengths
    """

    def __init__(self, length=4):
        """Initialise.

        Args:
            length: integer setting the 1D size of the eigenvector 
            - needs to be a power of 2.
        """
        assert isinstance(length, int)
        self.length = length
        # Initialise eigenvector as random vector
        # Set range to -127 to 127 (to be symmetrical)
        # But generate as 16 bit value as we normalise to 8-bit - saves a casting
        self.ev = np.random.randint(low=-127, high=128, size=(length, 1), dtype=np.int32)
        # Loop if we get all zeros
        while not self.ev.any():
            self.ev = np.random.randint(low=-127, high=128, size=(length, 1), dtype=np.int32)
        # Normalise the eigenvector using the L2 norm
        self.ev = normalise(self.ev)
        # Define placeholder for covariance matrix - values will be 8-bit but we need 32 for future calculations
        self.cov = np.zeros(shape=(length, length), dtype=np.int32)
        # Define eigenvalue
        self.rayleigh = np.zeros(1, dtype=np.uint16)

    def iterate(self):
        """One pass of iteration.
        
        Applies power iteration with power = 1.
        Casts cov and ev to 16-bit, matmuls then casts
        back to 8-bit after scaling.
        
        We could even do this in 32-bit space then apply the scaling afterwards.
        I.e. in this case we won't need to divide by the length or by 127 and
        could use the same normalise routine.
        """
        # Check cov is not all zero - if all 0 you get nan
        if self.cov.any():
            # Just do one multiplication per round
            temp_ev = np.matmul(self.cov, self.ev)
            # Divide by 127 (the max value)
            self.ev = normalise(temp_ev)
        return self.ev

    @property
    def eigenvector(self):
        """Return the top eigenvector."""
        return self.ev.copy().astype(np.int8)

    @property
    def eigenvalue(self):
        """Return associated eigenvalue."""
        if self.cov.any():
            # Compute in 32-bit space 
            top_1 = np.matmul(self.ev.T, self.cov)
            bottom = np.matmul(self.ev.T, self.ev)
            rayleigh = np.matmul(top_1, self.ev) / bottom
            rayleigh = rayleigh.astype(np.uint16).ravel()
            self.rayleigh = rayleigh
        return self.rayleigh

    def load_covariance(self, cov):
        """Update the covariance matrix."""
        # Remember to convert the input cov to 32-bit variable
        self.cov = cov.astype(np.int32)
        return None
        
    def __repr__(self):
        """Generate printable representation of state."""
        string = (
            f"Power Iterator - length {self.length}\n"
            f"Eigenvector:\n{self.eigenvector}\n"
            f"Eigenvalue:\n{self.eigenvalue}\n"
            f"Covariance:\n{self.cov}\n"
        )
        return string

In [459]:
ev = np.random.randint(
            low=-127, high=128, size=(length, 1), dtype=np.int32)
print(ev, not ev.any())

[[  78]
 [  24]
 [-115]
 [-123]] False


In [368]:
array = np.random.randint(low=-127, high=128, size=(length, 1), dtype=np.int32)
print(array.dtype)
sq_root_sum = np.sqrt((array**2).sum())
print(sq_root_sum, np.ceil(sq_root_sum))
scaled_array = (array*127//sq_root_sum)
print(scaled_array)
ceil_scale = (array*127//np.ceil(sq_root_sum))
print(ceil_scale)
print(np.linalg.norm(scaled_array), np.linalg.norm(ceil_scale))
print(np.linalg.norm((array*127//(sq_root_sum+1))))

int32
156.26259949200897 157.0
[[ 95.]
 [-51.]
 [ 14.]
 [-66.]]
[[ 94.]
 [-51.]
 [ 14.]
 [-66.]]
127.1927670899568 126.44761761298629
126.44761761298629


Sometimes the linalg provides a result above 127 but adding one to the sum or using ceil doesn't make a difference.

In [369]:
np.linalg.norm(scaled_array)

127.1927670899568

In [370]:
# Test initialise
power = PowerIterator(2)
print(power)

ev1 = power.ev
assert ev1.any()
assert not power.cov.any()
# Check logic to avoid ev = nan
ev1_a = power.iterate()
print(power)
assert np.array_equal(ev1, ev1_a)
# Check update with non-zero cov
# random_cov = np.random.randint(low=-127, high=127, size=(2, 2))
cov = np.random.uniform(low=-127, high=128, size=(2, 2))
cov = np.dot(cov.T, cov)//127
random_cov = cov.astype(np.int8)
# Don't scale by max here
power.load_covariance(random_cov)
print(power)
assert np.array_equal(power.cov, random_cov)
ev2 = power.iterate()
assert not np.array_equal(ev1, ev2)
assert np.array_equal(ev2, power.eigenvector)
print(power)

Power Iterator - length 2
Eigenvector:
[[-92]
 [ 88]]
Eigenvalue:
[0]
Covariance:
[[0 0]
 [0 0]]

Power Iterator - length 2
Eigenvector:
[[-92]
 [ 88]]
Eigenvalue:
[0]
Covariance:
[[0 0]
 [0 0]]

Power Iterator - length 2
Eigenvector:
[[-92]
 [ 88]]
Eigenvalue:
[82]
Covariance:
[[104 -23]
 [-23  10]]

Power Iterator - length 2
Eigenvector:
[[-123]
 [  31]]
Eigenvalue:
[109]
Covariance:
[[104 -23]
 [-23  10]]



Our eigenvector values are 127, -127 or 0.

---

## Back to Testing

In [371]:
power.eigenvalue

array([109], dtype=uint16)

We are getting negative eigenvalues - our covariance matrix is not positive definite. It was because we were dividing by cov.max() - if we do the above where we scale by 127 this seems to work.

In [372]:
assert power.eigenvalue > 0
# Check passing a cov
ev3 = power.iterate()
assert not np.array_equal(ev2, ev3)

In [373]:
print(power)

Power Iterator - length 2
Eigenvector:
[[-124]
 [  28]]
Eigenvalue:
[109]
Covariance:
[[104 -23]
 [-23  10]]



In [374]:
power.iterate()

array([[-124],
       [  28]], dtype=int32)

In [375]:
from src.tests.test_vpu import rand_same, rand_diff, rand_opposite
from src.var_processor.covariance8bit import CovarianceUnit

In [444]:
def init_power(size):
    """Helper function."""
    # Test with length = size
    cov = np.random.uniform(low=-127, high=127, size=(size, size))
    cov = np.dot(cov.T, cov)//(127*np.sqrt(size))
    # Clip to avoid under/overflow
    cov = np.clip(cov, -127, 127)
    print(cov)
    random_cov = cov.astype(np.int8)
    # Generate test power iterator
    power = PowerIterator(size)
    power.load_covariance(random_cov)
    for _ in range(0, 1000):
        power.iterate()
    return power, random_cov

Ah there are always positive values - we just additionally need to scale by L.

Interesting though that only a few values are out of this scale - clipping rather than scaling by L seems better.

In [445]:
for _ in range(0, 100):
    print(init_power(3))

[[ 83. -22.  35.]
 [-22.  29. -29.]
 [ 35. -29.  39.]]
(Power Iterator - length 3
Eigenvector:
[[-100]
 [  46]
 [ -64]]
Eigenvalue:
[115]
Covariance:
[[ 83 -22  35]
 [-22  29 -29]
 [ 35 -29  39]]
, array([[ 83, -22,  35],
       [-22,  29, -29],
       [ 35, -29,  39]], dtype=int8))
[[103.   4.  56.]
 [  4.  21. -10.]
 [ 56. -10.  82.]]
(Power Iterator - length 3
Eigenvector:
[[-98]
 [  3]
 [-82]]
Eigenvalue:
[149]
Covariance:
[[103   4  56]
 [  4  21 -10]
 [ 56 -10  82]]
, array([[103,   4,  56],
       [  4,  21, -10],
       [ 56, -10,  82]], dtype=int8))
[[100.  16.  13.]
 [ 16.  22.  13.]
 [ 13.  13.  76.]]
(Power Iterator - length 3
Eigenvector:
[[112]
 [ 27]
 [ 51]]
Eigenvalue:
[110]
Covariance:
[[100  16  13]
 [ 16  22  13]
 [ 13  13  76]]
, array([[100,  16,  13],
       [ 16,  22,  13],
       [ 13,  13,  76]], dtype=int8))
[[111.  47. -22.]
 [ 47.  52. -59.]
 [-22. -59.  79.]]
(Power Iterator - length 3
Eigenvector:
[[-84]
 [-69]
 [ 66]]
Eigenvalue:
[167]
Covariance:
[[111  

(Power Iterator - length 3
Eigenvector:
[[   0]
 [-103]
 [ -75]]
Eigenvalue:
[170]
Covariance:
[[ 37 -25  33]
 [-25 127  60]
 [ 33  60  87]]
, array([[ 37, -25,  33],
       [-25, 127,  60],
       [ 33,  60,  87]], dtype=int8))
[[ 85.  20. 102.]
 [ 20.  95.  21.]
 [102.  21. 127.]]
(Power Iterator - length 3
Eigenvector:
[[-79]
 [-30]
 [-96]]
Eigenvalue:
[216]
Covariance:
[[ 85  20 102]
 [ 20  95  21]
 [102  21 127]]
, array([[ 85,  20, 102],
       [ 20,  95,  21],
       [102,  21, 127]], dtype=int8))
[[ 99. -14. 116.]
 [-14. 127. -59.]
 [116. -59. 127.]]
(Power Iterator - length 3
Eigenvector:
[[ 72]
 [-51]
 [ 90]]
Eigenvalue:
[253]
Covariance:
[[ 99 -14 116]
 [-14 127 -59]
 [116 -59 127]]
, array([[ 99, -14, 116],
       [-14, 127, -59],
       [116, -59, 127]], dtype=int8))
[[127. 110.  63.]
 [110.  88.  34.]
 [ 63.  34.  71.]]
(Power Iterator - length 3
Eigenvector:
[[92]
 [73]
 [47]]
Eigenvalue:
[247]
Covariance:
[[127 110  63]
 [110  88  34]
 [ 63  34  71]]
, array([[127, 110,

(Power Iterator - length 3
Eigenvector:
[[-64]
 [-68]
 [-87]]
Eigenvalue:
[181]
Covariance:
[[59 31 66]
 [31 93 47]
 [66 47 96]]
, array([[59, 31, 66],
       [31, 93, 47],
       [66, 47, 96]], dtype=int8))
[[ 49. -17. -36.]
 [-17.  67.  59.]
 [-36.  59.  65.]]
(Power Iterator - length 3
Eigenvector:
[[ 48]
 [-81]
 [-86]]
Eigenvalue:
[140]
Covariance:
[[ 49 -17 -36]
 [-17  67  59]
 [-36  59  65]]
, array([[ 49, -17, -36],
       [-17,  67,  59],
       [-36,  59,  65]], dtype=int8))
[[22. -6. 13.]
 [-6. 70. 19.]
 [13. 19. 18.]]
(Power Iterator - length 3
Eigenvector:
[[ -5]
 [121]
 [ 38]]
Eigenvalue:
[76]
Covariance:
[[22 -6 13]
 [-6 70 19]
 [13 19 18]]
, array([[22, -6, 13],
       [-6, 70, 19],
       [13, 19, 18]], dtype=int8))
[[ 67.  49.  46.]
 [ 49.  76. -10.]
 [ 46. -10.  80.]]
(Power Iterator - length 3
Eigenvector:
[[-90]
 [-65]
 [-64]]
Eigenvalue:
[134]
Covariance:
[[ 67  49  46]
 [ 49  76 -10]
 [ 46 -10  80]]
, array([[ 67,  49,  46],
       [ 49,  76, -10],
       [ 46, -1

In [446]:
cov = np.random.uniform(low=-127, high=128, size=(size, size))
cov = np.dot(cov.T, cov)//127
random_cov = cov.astype(np.int8)
# Fill diagonals with positive values
np.fill_diagonal(random_cov, np.abs(np.diag(random_cov)))
print(random_cov, np.diag(random_cov), pos_def_cov, sep="\n")

[[ 78 -48  -2 -43]
 [-48  83 -14 -54]
 [ -2 -14  57  25]
 [-43 -54  25 107]]
[ 78  83  57 107]
None


In [447]:
"""Test power iterator is finding the eigenvector and value."""
power, cov = init_power(3)
evec = power.eigenvector
evalue = power.eigenvalue
# Use numpy linear algebra to determine eigenvectors and values
w, v = np.linalg.eig(cov)
# Check eigenvectors are close (abs removes difference in sign)
print(abs(evec.T), abs(v[:, np.argmax(w)])*127)
print(evalue, w[np.argmax(w)])
print(power)

[[ 74. -40.  67.]
 [-40.  86. -80.]
 [ 67. -80. 127.]]
[[57 68 91]] [57.14566723 67.4210801  91.20181289]
[228] 228.12130252417754
Power Iterator - length 3
Eigenvector:
[[ 57]
 [-68]
 [ 91]]
Eigenvalue:
[228]
Covariance:
[[ 74 -40  67]
 [-40  86 -80]
 [ 67 -80 127]]



These negative eignvectors were due to having negatives on the diagonals of our randomly generated matrix.
```
[[26 89 87]] [124.43607834  20.24203505  15.32717928]
[65317] 35.04401114087452
Power Iterator - length 3
Eigenvector:
[[-26]
 [ 89]
 [-87]]
Eigenvalue:
[65317]
Covariance:
[[  25   36  -34]
 [  36 -106  106]
 [ -34  106 -101]]
```

In [448]:
def test_power():
    """Test power iterator is finding the eigenvector and value."""
    power, cov = init_power(3)
    evec = power.eigenvector
    evalue = power.eigenvalue
    # Use numpy linear algebra to determine eigenvectors and values
    w, v = np.linalg.eig(cov)
    # Check eigenvectors are close (abs removes difference in sign)
    print(power)
    print(abs(evec.T), abs(v[:, np.argmax(w)])*127)
    print(evalue, w[np.argmax(w)])
    assert np.allclose(
        abs(evec.T), abs(v[:, np.argmax(w)])*127, atol=10)
    # Check eigenvalues are close
    assert np.allclose(evalue, w[np.argmax(w)], atol=5)

We've changed power_init to only have positive diagonals and I've tweaked high to 127 as we were getting some negative values still which appear to be overflow.

In [449]:
for i in range(0, 100):
    test_power()

[[ 65. -36.   1.]
 [-36. 101.  34.]
 [  1.  34.  49.]]
Power Iterator - length 3
Eigenvector:
[[  54]
 [-107]
 [ -43]]
Eigenvalue:
[133]
Covariance:
[[ 65 -36   1]
 [-36 101  34]
 [  1  34  49]]

[[ 54 107  43]] [ 55.3840631  106.21395964  42.19242032]
[133] 133.27794702576477
[[ 36. -14. -14.]
 [-14. 106.  46.]
 [-14.  46.  56.]]
Power Iterator - length 3
Eigenvector:
[[  23]
 [-107]
 [ -65]]
Eigenvalue:
[137]
Covariance:
[[ 36 -14 -14]
 [-14 106  46]
 [-14  46  56]]

[[ 23 107  65]] [ 23.75168468 106.68135793  64.68342404]
[137] 137.00786450069702
[[ 17. -39. -42.]
 [-39.  99. 107.]
 [-42. 107. 126.]]
Power Iterator - length 3
Eigenvector:
[[ 32]
 [-82]
 [-92]]
Eigenvalue:
[235]
Covariance:
[[ 17 -39 -42]
 [-39  99 107]
 [-42 107 126]]

[[32 82 92]] [32.23096209 81.39860095 92.00235239]
[235] 235.3814500203886
[[ 28.   5. -15.]
 [  5.  55. -32.]
 [-15. -32.  29.]]
Power Iterator - length 3
Eigenvector:
[[ -32]
 [-100]
 [  72]]
Eigenvalue:
[79]
Covariance:
[[ 28   5 -15]
 [  5  55 -32

 [-37. -29. 104.]]
Power Iterator - length 3
Eigenvector:
[[-41]
 [-35]
 [115]]
Eigenvalue:
[125]
Covariance:
[[ 16   7 -37]
 [  7  20 -29]
 [-37 -29 104]]

[[ 41  35 115]] [ 41.00604273  34.28984466 115.20291235]
[125] 125.80178456300483
[[ 61. -26. -44.]
 [-26.  12.  13.]
 [-44.  13.  53.]]
Power Iterator - length 3
Eigenvector:
[[-93]
 [ 35]
 [ 80]]
Eigenvalue:
[109]
Covariance:
[[ 61 -26 -44]
 [-26  12  13]
 [-44  13  53]]

[[93 35 80]] [92.01671952 35.29901327 80.09933203]
[109] 109.27541100690054
[[ 90. -31. -28.]
 [-31.  46. -22.]
 [-28. -22.  50.]]
Power Iterator - length 3
Eigenvector:
[[-114]
 [  41]
 [  37]]
Eigenvalue:
[110]
Covariance:
[[ 90 -31 -28]
 [-31  46 -22]
 [-28 -22  50]]

[[114  41  37]] [113.89832539  41.91521518  37.40703424]
[110] 110.60406613769959
[[ 40.  -2. -14.]
 [ -2.   5. -17.]
 [-14. -17.  90.]]
Power Iterator - length 3
Eigenvector:
[[  29]
 [  21]
 [-122]]
Eigenvalue:
[96]
Covariance:
[[ 40  -2 -14]
 [ -2   5 -17]
 [-14 -17  90]]

[[ 29  21 122]] [ 2

Power Iterator - length 3
Eigenvector:
[[ 19]
 [ 83]
 [-94]]
Eigenvalue:
[185]
Covariance:
[[120 -18 -31]
 [-18 102 -78]
 [-31 -78 110]]

[[19 83 94]] [21.59787901 82.72718535 93.90817018]
[185] 185.84275886040518
[[ 55.  35.   9.]
 [ 35. 115.  39.]
 [  9.  39.  15.]]
Power Iterator - length 3
Eigenvector:
[[ -49]
 [-112]
 [ -38]]
Eigenvalue:
[143]
Covariance:
[[ 55  35   9]
 [ 35 115  39]
 [  9  39  15]]

[[ 49 112  38]] [ 48.0726484  111.47427604  37.30289879]
[143] 143.14421280301104
[[ 58. -14.  -1.]
 [-14.  36. -23.]
 [ -1. -23.  93.]]
Power Iterator - length 3
Eigenvector:
[[ 11]
 [-45]
 [118]]
Eigenvalue:
[101]
Covariance:
[[ 58 -14  -1]
 [-14  36 -23]
 [ -1 -23  93]]

[[ 11  45 118]] [ 11.48008156  44.10524452 118.5408585 ]
[101] 101.46071603610801
[[ 73.  27.   3.]
 [ 27.  57.  73.]
 [  3.  73. 113.]]
Power Iterator - length 3
Eigenvector:
[[ -25]
 [ -74]
 [-101]]
Eigenvalue:
[166]
Covariance:
[[ 73  27   3]
 [ 27  57  73]
 [  3  73 113]]

[[ 25  74 101]] [ 24.32699828  73.170

In [383]:
power.eigenvalue

array([65395], dtype=uint16)

In [384]:
self = power
# Compute in 32-bit space 
print(self.ev.dtype, self.cov.dtype)

int32 int32


In [385]:
print(power.ev.dtype, power.cov.dtype)

int32 int32


The variable bit size doesn't change automatically on assignement, rather it takes on the assigned bit depth.

So we need to cast out cov to 32bit when we load it. Also it is likely that the normalising is being performed for some reason in float64 bit space.

In [381]:
array = np.random.randint(low=-127, high=128, size=(length, 1), dtype=np.int32)
sq_root_sum = np.sqrt((array**2).sum())
print(sq_root_sum, sq_root_sum.dtype)
# Watch the below - we need to bring the 127 onto the top to keep everything in integer space
scaled_array = (array*127//sq_root_sum) 
print(scaled_array, scaled_array.dtype)

144.97586005952854 float64
[[ 67.]
 [-18.]
 [ 48.]
 [-95.]] float64


np.sqrt returns a float64 - we need to cast. The squared array sum is an int64 (regardless of the bit depth of array).

Later we can convert to int32 space but we just use the numpy functions for now. We can use Newton's method to determine an integer square root - http://www.codecodex.com/wiki/Calculate_an_integer_square_root#Python_3.1.

**But still giving us wrap around on the eigenvalue.**

In [365]:
(array**2).sum().dtype

dtype('int64')

In [388]:
# These are the max values for bottom and top
127**2, 127**3, (127**3)*4**2, (127**2)*4

(16129, 2048383, 32774128, 64516)

In [391]:
print(self.ev, self.cov)
top_1 = np.matmul(self.ev.T, self.cov)
bottom = np.matmul(self.ev.T, self.ev)
print(top_1, bottom, "\n")
print(top_1.dtype, bottom.dtype)
# Compute eigenvalue as Raleigh Quotient
rayleigh = np.matmul(top_1, self.ev) / bottom
print(rayleigh.dtype)
# rayleigh = rayleigh.astype(np.int8)
self.rayleigh = rayleigh
print(rayleigh, "\n")

[[-119]
 [  39]
 [ -25]] [[-102   91  -26]
 [  91  101    4]
 [ -26    4  -12]]
[[16337 -6990  3550]] [[16307]] 

int32 int32
float64
[[-141.37873306]] 



Ah this has a negative eigenvector but it's because the diagonals of the covariance matrix are negative.

In [341]:
print(top_1.dtype, bottom.dtype, rayleigh.dtype)

float64 float64 float64


In [296]:
assert np.allclose(
    abs(evec.T), abs(v[:, np.argmax(w)])*127, rtol=0.05, atol=0.05)
# Check eigenvalues are close

assert np.allclose(evalue, w[np.argmax(w)]*127, rtol=0.05, atol=0.05)

AssertionError: 

My eigenvalue code is out. The eigenvector appears close.

We can adapt the code because our variables are now 32-bit.

It is better now - we're getting the eigenvalues approximately 50% of the time. But sometimes we have huge eigenvalues (e.g. 65457) which is also governed by no eigenvector matches.

In [245]:
self = power
# Compute in 32-bit space 
top_1 = np.matmul(self.ev.T, self.cov)
bottom = np.matmul(self.ev.T, self.ev)
print(top_1, bottom, "\n")
# Scale both to divide by bit_depth_max (i.e. 128 for 8bit)
top_1 = top_1
bottom = bottom
print(top_1, bottom, "\n")
# Compute eigenvalue as Raleigh Quotient
rayleigh = np.matmul(top_1, temp_ev) / bottom
# rayleigh = rayleigh.astype(np.int8)
self.rayleigh = rayleigh
print(rayleigh, "\n")


[[  7899. -15377.  19659.]] [[16094.]] 

[[  7899. -15377.  19659.]] [[16094.]] 

[[67.642724]] 



We are still getting negative eigenvalues. 

In [49]:
# Build cov
size = 4
buf_length = 1000
data_buffer = np.zeros(shape=(size, buf_length))
cov_unit = CovarianceUnit(size)
for i in range(0, buf_length):
    coin_flip = np.random.randint(2)
    if coin_flip:
        # Make all entries the same
        data = rand_same(size=4, negative=True)
    else:
         # Make entries random
        data = np.random.randint(low=-1, high=2, size=(size, 1))
    cov_unit.update_cov(data)

In [50]:
print(cov_unit)

There are 8 stages to process 1D arrays of length 4.
Data is assumed to have a maximum absolute value of 127.
-------
Counter: [111   7   0   0   0   0   0   0]
Running sum of squares:
[[73 43 47 39]
 [43 72 46 42]
 [47 46 63 40]
 [39 42 40 69]]
[[1 3 3 3]
 [3 4 2 1]
 [2 0 3 2]
 [1 3 3 5]]
Complete covariance estimates:
[[86 23 26 35]
 [23 84 39 31]
 [26 39 82 44]
 [35 31 44 89]]

---------
Current covariance estimate (index: 0):
[[86 23 26 35]
 [23 84 39 31]
 [26 39 82 44]
 [35 31 44 89]]



In [51]:
cov_unit.covariance

array([[86, 23, 26, 35],
       [23, 84, 39, 31],
       [26, 39, 82, 44],
       [35, 31, 44, 89]], dtype=int8)

In [53]:
cov = np.asarray([[86, 23, 26, 35],
       [23, 84, 39, 31],
       [26, 39, 82, 44],
       [35, 31, 44, 89]], dtype=np.int8)

In [54]:
cov

array([[86, 23, 26, 35],
       [23, 84, 39, 31],
       [26, 39, 82, 44],
       [35, 31, 44, 89]], dtype=int8)

In [55]:
cov = np.random.randn(size, size)
cov = np.dot(cov, cov.T)
cov = cov / cov.max()
# Convert to 8bit
cov = (cov*127).astype(np.int8)

In [56]:
cov

array([[ 77,  48,   9,  14],
       [ 48, 127,   4,  20],
       [  9,   4,  83, -42],
       [ 14,  20, -42,  31]], dtype=int8)

In [58]:
power, cov = init_power(4)

In [60]:
print(power, cov)

Power Iterator - length 4
Eigenvector:
[[-1]
 [ 0]
 [-1]
 [-1]]
Eigenvalue:
[[0]]
Covariance:
[[ 62 -71  -6  20]
 [-71 127 -24 -19]
 [ -6 -24  43  -6]
 [ 20 -19  -6  37]]
 [[ 62 -71  -6  20]
 [-71 127 -24 -19]
 [ -6 -24  43  -6]
 [ 20 -19  -6  37]]




In [61]:
power.eigenvalue



array([[0]], dtype=int8)

The small size of the eigenvector appears an error.

In [67]:
# Convert cov and ev to 16-bit
temp_cov = self.cov.astype(np.int16)
print(self.ev)
# Divide by L
temp_ev = self.ev.astype(np.int16)//self.length
# Just do one multiplication per round
temp_ev = np.matmul(temp_cov, temp_ev)
# Divide by 127 (the max value)
temp_ev = (temp_ev // 127).astype(np.int8)
self.ev = temp_ev

[[-1]
 [ 0]
 [-1]
 [-1]]


In [None]:
evec = power.eigenvector
evalue = power.eigenvalue
# Use numpy linear algebra to determine eigenvectors and values
w, v = np.linalg.eig(cov)

In [None]:
# Check eigenvectors are close (abs removes difference in sign)
assert np.allclose(
        abs(evec.T), abs(v[:, np.argmax(w)]), rtol=0.05, atol=0.05)
# Check eigenvalues are close
assert np.allclose(evalue, w[np.argmax(w)], rtol=0.05, atol=0.05)

In [29]:
# Test out on tests from src.tests.test_power_iteration

def test_power_iterator():
    """Test power iterator."""
    # Test initialise
    power = PowerIterator(2)
    ev1 = power.ev
    assert ev1.any()
    assert not power.cov.any()
    # Check logic to avoid ev = nan
    ev1_a = power.iterate()
    assert np.array_equal(ev1, ev1_a)
    # Check update with non-zero cov
    random_cov = np.random.randint(255, size=(2, 2))
    random_cov = random_cov / random_cov.max()
    power.load_covariance(random_cov)
    assert np.array_equal(power.cov, random_cov)
    ev2 = power.iterate()
    assert not np.array_equal(ev1, ev2)
    assert np.array_equal(ev2, power.eigenvector)
    assert power.eigenvalue > 0
    # Check passing a cov
    ev3 = power.iterate(cov=random_cov)
    assert not np.array_equal(ev2, ev3)


def init_power(size):
    """Helper function."""
    # Test with length = size
    cov = np.random.randn(size, size)
    cov = np.dot(cov, cov.T)
    cov = cov / cov.max()
    # Generate test power iterator
    power = PowerIterator(size)
    power.load_covariance(cov)
    for _ in range(0, 1000):
        power.iterate()
    return power, cov


def test_power_computation():
    """Test power iterator is finding the eigenvector and value."""
    power, cov = init_power(3)
    evec = power.eigenvector
    evalue = power.eigenvalue
    # Use numpy linear algebra to determine eigenvectors and values
    w, v = np.linalg.eig(cov)
    # Check eigenvectors are close (abs removes difference in sign)
    assert np.allclose(
        abs(evec.T), abs(v[:, np.argmax(w)]), rtol=0.05, atol=0.05)
    # Check eigenvalues are close
    assert np.allclose(evalue, w[np.argmax(w)], rtol=0.05, atol=0.05)


def test_feature_scaling_2():
    """Test that the features are scaled to have max of 1."""
    # Test with length = 2
    p_2, _ = init_power(2)
    e_2 = p_2.eigenvector
    assert np.array_equal(p_2.feature, e_2*(np.sqrt(2)/2))
    assert np.max(np.abs(p_2.feature)) <= 1


def test_feature_scaling_3():
    """Test that the features are scaled to have max of 1."""
    # Test with length = 3
    p_3, _ = init_power(3)
    # Not a timing thing - added time.sleep still had error
    # Why does 2 work but not 3? Works when in different functions!
    assert np.array_equal(p_3.feature, p_3.eigenvector*(np.sqrt(3)/3))
    assert np.max(np.abs(p_3.feature)) <= 1