# RNN construction from scratch

A walkthrough of the functionality in `rnnops.ops.construction`.

## The pseudoinverse rule for constructing RNNs with specific fixed points 

Suppose we are interested in constructing an $N$-neuron recurrent network which has specific fixed points under various input conditions.

The dynamics of our network, with rate vector $x$, connectivity matrix $J$, element-wise nonlinearity $\phi$, and inputs $u$, are given by
$$
\frac{dx}{dt} = f(x, u) = -x +J \phi(x) + u.
$$
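
As a minimal sketch of these dynamics (not the `rnnops` implementation), the vector field and a forward-Euler integrator can be written in NumPy. The helper names `relu`, `vector_field`, and `euler_simulate`, as well as the step size and step count, are illustrative assumptions. With $J = 0$ the dynamics reduce to $dx/dt = -x + u$, so the state should relax to $x = u$.

```python
import numpy as np

def relu(x):
    """Element-wise nonlinearity phi (ReLU chosen for illustration)."""
    return np.maximum(x, 0.0)

def vector_field(x, u, J, phi=relu):
    """f(x, u) = -x + J phi(x) + u."""
    return -x + J @ phi(x) + u

def euler_simulate(x0, u, J, dt=0.05, n_steps=400, phi=relu):
    """Integrate dx/dt = f(x, u) with forward Euler (hypothetical helper)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + dt * vector_field(x, u, J, phi)
    return x

rng = np.random.default_rng(0)
N = 5
J_zero = np.zeros((N, N))      # no recurrence, so dx/dt = -x + u
u = rng.standard_normal(N)

x_final = euler_simulate(np.zeros(N), u, J_zero)
print(np.abs(x_final - u).max())  # distance to the expected fixed point x = u
```
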
In general, $x^\mu$ is a fixed point under input $u^\mu$ if:
$$
0 =  f(x^\mu, u^\mu) = -x^\mu +J \phi(x^\mu) + u^\mu,
$$
i.e.,
$$
\begin{aligned}
J \phi(x^\mu) &= x^\mu - u^\mu, \\
J a^\mu &= b^\mu.
\end{aligned}
$$

In the last line we made the assignments $a^\mu = \phi(x^\mu)$ and $b^\mu = x^\mu - u^\mu$. By the so-called pseudoinverse learning rule [1], we may collect the vectors $a^\mu$ and $b^\mu$ as the columns of two matrices $A$ and $B$, and solve the least squares problem
$$
(*) \quad J A = B
$$
to obtain a connectivity matrix $\hat J$ which enforces these fixed points.
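
One way to solve $(*)$ is with the Moore-Penrose pseudoinverse, $\hat J = B A^+$, which satisfies $J A = B$ exactly when the columns of $A$ are linearly independent. The NumPy sketch below illustrates this under that assumption. The function name `pseudoinverse_rule_sketch` is hypothetical, and the code is not claimed to match what `pseudoinverse_rule` does internally.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def pseudoinverse_rule_sketch(x_fixed, u_fixed, phi=relu):
    """Least-norm J solving J A = B, with A = phi(X) and B = X - U.

    Columns of x_fixed and u_fixed are the fixed points x^mu and inputs u^mu.
    """
    A = phi(x_fixed)              # shape (N, P)
    B = x_fixed - u_fixed         # shape (N, P)
    return B @ np.linalg.pinv(A)  # shape (N, N)

rng = np.random.default_rng(0)
N, P = 100, 10
X = rng.standard_normal((N, P))
U = rng.standard_normal((N, P))

J = pseudoinverse_rule_sketch(X, U)

# f(x^mu, u^mu) for every column; should vanish up to round-off.
residual = -X + J @ relu(X) + U
max_residual = np.abs(residual).max()
rank_J = np.linalg.matrix_rank(J, tol=1e-8)  # numerical rank, absolute tolerance
print(J.shape, max_residual, rank_J)
```

Because $\hat J = B A^+$ is a product involving an $N \times P$ matrix, its rank cannot exceed the number of fixed points $P$.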

In [31]:
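import numpy as np
from rnnops.ops.construction import pseudoinverse_rule

n_rec = 100    # number of recurrent neurons, N
n_fixed = 10   # number of fixed points to enforce
x_fixed = np.random.randn(n_rec, n_fixed)  # desired fixed points
u_fixed = np.random.randn(n_rec, n_fixed)  # inputs paired with the fixed points
J = pseudoinverse_rule(x_fixed, u_fixed, nonlinearity='relu')
print(J.shape)
print(np.linalg.norm(J))  # Frobenius norm of J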

(100, 100)
7.961633521956537


## Using the pseudoinverse learning rule to construct RNNs that compute Boolean functions

Suppose a network with $N$ recurrent neurons is to compute the Boolean function $y = F(z)$, where $y$ and $z$ are Boolean-valued vectors. Under input
$$
u^\mu = W_{in} z^\mu \in \mathbb{R}^N,
$$
we want to construct a fixed point $x^\mu$ such that $y^\mu = W_{out} x^{\mu}$. Assume there is negligible overlap between all of the input and output weights (i.e. they are mutually orthogonal), so that in particular $W_{out} u^\mu \approx 0$.

How should we choose the $x^\mu$? Because the input and output weights are orthogonal, $W_{out} u^\mu \approx 0$, and the readout condition $y^\mu = W_{out} x^\mu$ reduces to $W_{out} b^\mu = y^\mu$ with $b^\mu = x^\mu - u^\mu$. We therefore want to enforce
$$
(**)\quad  W_{out} B = Y,
$$
where the columns of $Y$ are the $y^\mu$. So $B$ is itself the solution to a least squares problem. One solution, the least-norm one, is $b^\mu = W_{out}^+ y^\mu$, which gives

$$
x^\mu = W_{out}^+ y^\mu + u^\mu.
$$

Then the optimal value $\hat J$ is the least-norm solution to the equation $(*)$, which can be given in closed form via the pseudoinverse: $\hat J = B A^+ = W_{out}^+ Y A^+$. Note that the rank of $\hat J$ is therefore upper bounded by both the number of readout vectors and the number of fixed points, as well as the rank of the matrix of output values $Y$.
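
The role of the orthogonality assumption can be checked numerically. The sketch below is illustrative only: it builds mutually orthogonal input and output weights from a QR decomposition (an assumption made here, not necessarily how `rnnops` chooses its weights), draws random Boolean inputs and targets, and confirms that $W_{out} U$ vanishes so that $W_{out} X = W_{out} B = Y$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_in, n_out, P = 50, 4, 2, 6   # illustrative sizes

# Orthonormal, mutually orthogonal weight vectors from a QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((N, n_in + n_out)))
W_in = Q[:, :n_in]       # (N, n_in), columns are input weight vectors
W_out = Q[:, n_in:].T    # (n_out, N), rows are readout vectors

# Random Boolean inputs z^mu and targets y^mu, one condition per column.
Z = rng.integers(0, 2, size=(n_in, P)).astype(float)
Y = rng.integers(0, 2, size=(n_out, P)).astype(float)

U = W_in @ Z
B = np.linalg.pinv(W_out) @ Y   # least-norm solution of W_out B = Y
X = B + U                       # x^mu = W_out^+ y^mu + u^mu

overlap = np.abs(W_out @ U).max()            # ~0 since W_out W_in = 0
readout_error = np.abs(W_out @ X - Y).max()  # ~0, so W_out x^mu = y^mu
print(overlap, readout_error)
```
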

Also note that the least-norm choice is not required: any $J$ constructed to satisfy $(*)$ and $(**)$ will work. More succinctly, $J$ must satisfy
$$
W_{out} J A = Y.
$$

To recap, the recipe we use here is:

1) As an optional preparatory step, expand the input dimensionality such that each dimension of the original input is paired with a dimension that is one minus the original value. This is equivalent to inserting a linear layer (with biases) between the inputs and the recurrent layer.

2) Given $W_{out}$, $W_{in}$ and the input-output pairs $(z^\mu ,y^\mu )$, let $u^\mu = W_{in} z^\mu$ and solve $(**)$ for $b^\mu$ and let $x^\mu = b^\mu + u^\mu$. 

3) Let $a^\mu = \phi(x^\mu)$ and solve $(*)$ for $J$ via the pseudoinverse rule.
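
The recipe above can be sketched end to end in NumPy for XOR with expanded inputs. This is a hedged illustration, not the code behind `construct_boolean_integration_rnn`: the function `construct_J_sketch` is hypothetical, and the orthonormal weights drawn via QR are an assumption made to satisfy the orthogonality condition.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def construct_J_sketch(Z, Y, W_in, W_out, phi=relu):
    """Steps 2 and 3 of the recipe; columns of Z and Y are conditions."""
    U = W_in @ Z                    # u^mu = W_in z^mu
    B = np.linalg.pinv(W_out) @ Y   # solve (**) for B
    X = B + U                       # x^mu = b^mu + u^mu
    A = phi(X)                      # a^mu = phi(x^mu)
    J = B @ np.linalg.pinv(A)       # solve (*) for J
    return J, X, U

# Step 1: XOR inputs expanded to (z1, 1 - z1, z2, 1 - z2), one condition per column.
Z = np.array([[0, 1, 0, 1],
              [0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float).T
Y = np.array([[0, 1, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
N = 600
n_in, n_out = Z.shape[0], Y.shape[0]
Q, _ = np.linalg.qr(rng.standard_normal((N, n_in + n_out)))
W_in, W_out = Q[:, :n_in], Q[:, n_in:].T   # mutually orthogonal weights

J, X, U = construct_J_sketch(Z, Y, W_in, W_out)

fixed_point_residual = np.abs(-X + J @ relu(X) + U).max()  # f(x^mu, u^mu)
readout = W_out @ X                    # should reproduce Y
readout_via_J = W_out @ J @ relu(X)    # W_out J A, should also reproduce Y
rank_J = np.linalg.matrix_rank(J, tol=1e-8)
print(fixed_point_residual, np.round(readout, 6), rank_J)
```

With a single readout vector, the rank bound discussed earlier implies that this $\hat J$ has rank at most one.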



In [20]:
# Look at the inputs and outputs of the Boolean XOR function.
from rnnops.ops.construction import XOR_conditions, expand_condition_inputs

# Each condition is an (inputs, outputs) pair.
print('XOR conditions:')
print('\n'.join(str(condition) for condition in XOR_conditions))

# Pair every input with its complement, one minus the original value.
print('\nXOR conditions with expanded inputs:')
expanded_XOR_conditions = expand_condition_inputs(XOR_conditions)
print('\n'.join(str(condition) for condition in expanded_XOR_conditions))

XOR conditions:
((0, 0), (0,))
((0, 1), (1,))
((1, 0), (1,))
((1, 1), (0,))

XOR conditions with expanded inputs:
((0, 1, 0, 1), (0,))
((0, 1, 1, 0), (1,))
((1, 0, 0, 1), (1,))
((1, 0, 1, 0), (0,))
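
The expansion shown in this output can be reproduced with a few lines of plain Python. The function `expand_inputs_sketch` below is a hypothetical stand-in written for illustration; it is not the library's `expand_condition_inputs`, and it only assumes the interleaved layout visible in the printed output, where each input value is followed by one minus that value.

```python
def expand_inputs_sketch(conditions):
    """Pair every input value z with its complement 1 - z (illustrative)."""
    expanded = []
    for inputs, outputs in conditions:
        paired = tuple(v for z in inputs for v in (z, 1 - z))
        expanded.append((paired, tuple(outputs)))
    return expanded

xor_conditions = [
    ((0, 0), (0,)),
    ((0, 1), (1,)),
    ((1, 0), (1,)),
    ((1, 1), (0,)),
]

expanded = expand_inputs_sketch(xor_conditions)
print('\n'.join(str(condition) for condition in expanded))
```
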


In [24]:
from rnnops.ops.construction import construct_boolean_integration_rnn
construction_args = {
    'conditions': XOR_conditions,
    'n_rec': 600,
    'nonlinearity': 'relu',
    'expand_inputs': True,
}
rnn = construct_boolean_integration_rnn(**construction_args)
print(rnn)

RNN object 
 signature: (4, 1)
 n_rec: 600
 nonlinearity: relu



### References

[1] Personnaz, L., Guyon, I., & Dreyfus, G. (1986). Collective computational properties of neural networks: New learning mechanisms. *Physical Review A*, 34(5), 4217.