## Can we translate any 'Markov chain with biases' into a 'Markov chain without biases'?
**Yes.** 
Idea: biases are just a fixed, input-independent allocation of probability mass. A bias of mass $b_{1j} \in [0,1]$ on neuron $j$, with total bias mass $B = \sum_j b_{1j} \le 1$,
can be achieved by (1) scaling down the original weight matrix by the factor $(1 - B)$,
and then (2) adding $b_{1j}$ to every weight in column $j$: $w_{ij} = P(j \mid i) \leftarrow (1 - B)\, w_{ij} + b_{1j}$.
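
A short derivation of why this works, as a sketch rather than a formal proof. It assumes the input $h$ is a probability distribution ($\sum_i h_i = 1$) and that every row of the weight matrix sums to 1; $B = \sum_j b_{1j}$ denotes the total bias mass and $h'_j$ the resulting mass on neuron $j$ (both symbols are notation introduced here):

$$
h'_j \;=\; \sum_i h_i \left[(1 - B)\, w_{ij} + b_{1j}\right]
\;=\; (1 - B) \sum_i h_i\, w_{ij} \;+\; b_{1j} \sum_i h_i
\;=\; (1 - B) \sum_i h_i\, w_{ij} \;+\; b_{1j}
$$

The right-hand side is exactly "scale the input by $(1 - B)$, propagate it through $W$, then add the bias". The modified matrix is still a valid transition matrix, because each row sums to $\sum_j \left[(1 - B)\, w_{ij} + b_{1j}\right] = (1 - B) + B = 1$.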

In [34]:
import numpy as np
from util.util_lrp_transformation_visualization import norm_arr as a, h0_W_h1_R1_C_R0

In [30]:
inp = a(1,2)

W = np.array([[.2, .8], 
              [.9, .1]])
W /= W.sum(axis=1, keepdims=True) # normalize (if not already)

biases_l1 = np.array([.2, .4])
biases_mass = biases_l1.sum()       # -> we want 60% of our entire prob mass to come only from the biases

In [37]:
states_t0 = inp * (1 - biases_mass) # -> thus, our input can only contribute 40% of the prob mass

states_t1 = W.T @ states_t0
print(states_t1.sum())

states_t1 += biases_l1
print(states_t1.sum())

inp, states_t0, states_t1

0.3999999999999999
1.0


(array([0.33333333, 0.66666667]),
 array([0.13333333, 0.26666667]),
 array([0.46666667, 0.53333333]))

In [47]:
W_explicit_biases = np.zeros((4,4))
W_explicit_biases[:2, :2] = W
W_explicit_biases[2, 0] = 1
W_explicit_biases[3, 1] = 1

states_t0 = np.hstack((
    inp * (1 - biases_mass),
    biases_l1
))

states_t1 = W_explicit_biases.T @ states_t0
print(states_t1.sum())

inp, states_t0, states_t1

1.0


(array([0.33333333, 0.66666667]),
 array([0.13333333, 0.26666667, 0.2       , 0.4       ]),
 array([0.46666667, 0.53333333, 0.        , 0.        ]))

In [36]:
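# shape (1, 2): broadcasts over the rows of W, so every row receives the same bias vector
W_to_account_for_biases = biases_l1[None, :] * np.ones(2)

# now include the above in our matrix W:
W_implicit_biases = W * (1 - biases_mass) + W_to_account_for_biases # -> the original transitions contribute 40% of each row, the biases the remaining 60%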

states_t0 = inp

states_t1 = W_implicit_biases.T @ states_t0
print(states_t1.sum())

inp, states_t0, states_t1

1.0


(array([0.33333333, 0.66666667]),
 array([0.33333333, 0.66666667]),
 array([0.46666667, 0.53333333]))
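
The cell above checks the equivalence for a single input. A minimal sketch of the same construction as a reusable function, checked on a few random input distributions, might look as follows. The name `fold_biases` and the random test inputs are illustrative assumptions and are not part of the notebook's `util` module.

```python
import numpy as np

def fold_biases(W, b):
    """Fold a bias vector b into a row-stochastic matrix W (hypothetical helper).

    The result is row-stochastic and satisfies, for any distribution h0:
        W_folded.T @ h0 == W.T @ (h0 * (1 - b.sum())) + b
    """
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    mass = b.sum()
    if np.any(b < 0) or mass > 1:
        raise ValueError("biases must be non-negative and sum to at most 1")
    W = W / W.sum(axis=1, keepdims=True)   # normalize rows (if not already)
    return (1.0 - mass) * W + b[None, :]   # the same bias vector is added to every row

W = np.array([[.2, .8],
              [.9, .1]])
b = np.array([.2, .4])
W_folded = fold_biases(W, b)

# compare "explicit bias" and "folded bias" on random input distributions
rng = np.random.default_rng(0)
for _ in range(5):
    h0 = rng.dirichlet(np.ones(2))                 # random point on the simplex
    with_bias = W.T @ (h0 * (1 - b.sum())) + b     # scale input, propagate, add bias
    without_bias = W_folded.T @ h0                 # single matrix, no bias term
    assert np.allclose(with_bias, without_bias)
```

The equivalence relies on the input summing to 1; for unnormalized inputs the bias term would be scaled by the input's total mass.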

In [35]:
## How does this change the relevancy scores?
h0_W_h1_R1_C_R0()

TypeError: h0_W_h1_R1_C_R0() missing 3 required positional arguments: 'w11', 'w12', and 'A1'

In [108]:

W = np.array([[0.6, 0.2],
              [0.4, 0.8]])

for p1 in np.linspace(0., 1., 11):
    print("\n"+"="*20+"\n")
    h0 = a(p1,1-p1)

    # explicit biases
    W_eb = np.block([[(W - W.min(axis=1, keepdims=True)), np.eye(2)], 
                     [np.zeros((2,4))]])
    W_eb /= W_eb.sum(axis=0, keepdims=True)
    # print("W_eb \n", W_eb, '\n')

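    # the row-wise minimum of W is the mass each output receives regardless of the input: treat it as the bias
    biases = W.min(axis=1)
    biases_mass = biases.sum()
    assert biases_mass <= 1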

    h0_eb = np.hstack((h0 * (1 - biases_mass), biases))
    print("h0_eb", h0_eb, '\n')

    h1_eb = W_eb @ h0_eb
    print("h1_eb", h1_eb, '\n')

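    # contribution of each input state to each output state, normalized per output
    C_eb = W_eb * h0_eb[None, :]
    # rows 2 and 3 (the bias pseudo-states) receive no mass, so this divides 0 by 0 (NaN plus a RuntimeWarning)
    C_eb /= C_eb.sum(axis=1, keepdims=True)
    C_eb[np.isnan(C_eb)] = 0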
    print("C_eb.T \n", C_eb.T, '\n the distance:', np.sqrt(np.sum((C_eb.T[:2, 0] - C_eb.T[:2, 1])**2)), '\n')

    R1_eb = h1_eb * (h1_eb == h1_eb.max())
    # print("R1_eb", R1_eb, '\n')

    R0_eb = C_eb.T @ R1_eb
    print("R0_eb", R0_eb, '\n')



h0_eb [0.  0.4 0.2 0.4] 

h1_eb [0.2 0.8 0.  0. ] 

C_eb.T 
 [[0.  0.  0.  0. ]
 [0.  0.5 0.  0. ]
 [1.  0.  0.  0. ]
 [0.  0.5 0.  0. ]] 
 the distance: 0.49999999999999994 

R0_eb [0.  0.4 0.  0.4] 



h0_eb [0.04 0.36 0.2  0.4 ] 

h1_eb [0.24 0.76 0.   0.  ] 

C_eb.T 
 [[0.16666667 0.         0.         0.        ]
 [0.         0.47368421 0.         0.        ]
 [0.83333333 0.         0.         0.        ]
 [0.         0.52631579 0.         0.        ]] 
 the distance: 0.5021498870653232 

R0_eb [0.   0.36 0.   0.4 ] 



h0_eb [0.08 0.32 0.2  0.4 ] 

h1_eb [0.28 0.72 0.   0.  ] 

C_eb.T 
 [[0.28571429 0.         0.         0.        ]
 [0.         0.44444444 0.         0.        ]
 [0.71428571 0.         0.         0.        ]
 [0.         0.55555556 0.         0.        ]] 
 the distance: 0.528359269114071 

R0_eb [0.   0.32 0.   0.4 ] 



h0_eb [0.12 0.28 0.2  0.4 ] 

h1_eb [0.32 0.68 0.   0.  ] 

C_eb.T 
 [[0.375      0.         0.         0.        ]
 [0.         0.41176471 0

  C_eb /= C_eb.sum(axis=1, keepdims=True)
