# Visualizing a Hopfield RNN with Fovea

In this tutorial, we'll dig into the source of Fovea's machine learning backend and demonstrate how to use a Hopfield network for optical character recognition. Our example is deliberately rudimentary so that we don't get bogged down in minutiae, but the same network can be applied to a wide variety of real-world scientific problems.

## The Structure of Fovea's Machine Learning Utilities

Fovea comes packaged with a full-fledged implementation of a generic Hopfield network, a type of *Recurrent Neural Network*. RNNs are distinguishable from vanilla neural networks in that they allow for feedback between the component computational units, also called neurons. This gives the network an internal representation of *associative memory*, as we'll demonstrate in this example.

The network implementation code is contained in `retina/mlearn/hopfield_network.py`. The file defines a single class, `HopfieldNetwork`, which is instantiated by a single call of the form `myNet = HopfieldNetwork(num_neurons)`. The `num_neurons` argument specifies the number of neurons to be used internally by the network. Optionally, an activation function can also be supplied (more on this later).

The file `visuals.py` contains the front-end code that defines a `VisualHopfield` network built on top of the `HopfieldNetwork` backend. The `VisualHopfield` network also relies on an internal `VisualNeuron` class which creates drawings of neurons and the connections between them. In its default form, `VisualHopfield` runs a detailed visual simulation of the Hopfield network training and learning on whatever data it is supplied with. That said, each component of the visualization can easily be separated from the others should your needs require a customized approach.

## Understanding the Hopfield Network

In order to understand the Hopfield network's action, we must first understand its implementation. The network has three defining characteristics that separate it from other RNNs.

1. As in all neural networks, the connection between any two neurons is assigned some weight. In the case of the Hopfield network all connections are symmetric, i.e. for any two neurons $N_i$, $N_j$, we have $w_{ij} = w_{ji}$ where $w_{ij}$ and $w_{ji}$ are the weights of the connections between neurons $i$ and $j$ and neurons $j$ and $i$, respectively. What's more, each neuron is connected to every other neuron in the network, although not to itself.

2. The neurons in a Hopfield network are bipolar. That means that they have two possible output states: 1 if the sum of the input values to the neuron exceeds the threshold given by the network's activation function and -1 otherwise.

3. Because every Hopfield network is fully connected, we can associate with it a weight matrix whose $ij^{\text{th}}$ entry is the weight of the connection between $N_i$ and $N_j$. As a result of the properties enumerated above, this weight matrix is symmetric with 0's along the diagonal (since $w_{ii} = 0 \quad \forall i$). Under the normalized Hebbian rule used below, its off-diagonal entries lie between -1 and 1, and are exactly -1 or 1 when only a single pattern is stored.
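The three characteristics above can be made concrete with a small NumPy sketch. This is an illustration only, not Fovea's implementation: the helper names `bipolar_activation` and `is_valid_hopfield_weights`, the threshold of 0, and the 3-neuron weight matrix are all assumptions chosen for the example.

```python
import numpy as np

def bipolar_activation(total_input, threshold=0.0):
    """Return 1 where the summed input exceeds the threshold, -1 otherwise.

    The default threshold of 0 is an assumption made for this sketch.
    """
    return np.where(total_input > threshold, 1, -1)

def is_valid_hopfield_weights(weights):
    """Check the structural properties: symmetry and no self-connections."""
    weights = np.asarray(weights, dtype=float)
    symmetric = np.allclose(weights, weights.T)           # w_ij == w_ji
    no_self_connections = np.allclose(np.diag(weights), 0.0)  # w_ii == 0
    return bool(symmetric and no_self_connections)

# Hypothetical weight matrix for a fully connected 3-neuron network.
weights = np.array([
    [ 0.0,  1.0, -1.0],
    [ 1.0,  0.0, -1.0],
    [-1.0, -1.0,  0.0],
])

# Each neuron sums its weighted inputs and emits a bipolar output.
state = np.array([1, 1, -1])
next_state = bipolar_activation(weights @ state)

print(is_valid_hopfield_weights(weights))
print(next_state)
```

In this hand-picked example the state `[1, 1, -1]` is left unchanged by one update, which is what a stored memory looks like from the network's point of view.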

The natural question arising from this definition of the network is the following: How are the weights of the connections between individual neurons assigned?

### Training the Hopfield Network

Assigning weights to the connections of a neural network takes place during a process called "training" in which the network is fed input data having some well-defined set of features which the network would ideally learn. There are a number of different statistical models of learning that have been proposed and applied over the course of the neural network's development. Our Hopfield network implements the *Hebbian* learning rule and will support *Storkey* learning in the near future.

The Hebbian learning rule is one of the oldest learning models in existence, having been developed by Donald Hebb in 1949. It is based on a simple premise: that "neurons that fire together, wire together, [and] neurons that fire out of sync, fail to link." Formally speaking, the rule calculates each weight as

$$w_{ij} = \frac{1}{n} \sum_{k = 1}^{n} \left( \mu_k \otimes \mu_k \right)_{ij}$$
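As a rough illustration of this rule, the following NumPy sketch averages the outer product of each training vector with itself. The function name `hebbian_weights` and the two 4-element patterns are hypothetical and are not taken from Fovea's source. Zeroing the diagonal is added here to match the earlier requirement that no neuron connects to itself.

```python
import numpy as np

def hebbian_weights(patterns):
    """Build a weight matrix from bipolar patterns with the Hebbian rule.

    `patterns` has shape (n_patterns, n_neurons) with entries in {-1, 1}.
    """
    patterns = np.asarray(patterns, dtype=float)
    n_patterns, n_neurons = patterns.shape

    weights = np.zeros((n_neurons, n_neurons))
    for mu in patterns:
        # Outer product: entry (i, j) is mu[i] * mu[j].
        weights += np.outer(mu, mu)

    weights /= n_patterns            # the 1/n normalization
    np.fill_diagonal(weights, 0.0)   # assumption: enforce w_ii = 0
    return weights

# Two hypothetical 4-neuron patterns.
patterns = np.array([
    [1, -1,  1, -1],
    [1,  1, -1, -1],
])
W = hebbian_weights(patterns)
print(W)
```

Neurons that agree in every pattern end up with a weight of 1, neurons that always disagree get -1, and mixed cases average toward 0.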

Thus the network is presented with a set of $n$ vectors $\{\mu_1, \ldots, \mu_n\}$. Here $\otimes$ represents 