# L9a: Introduction to Hopfield Networks
In this lecture, we will introduce Hopfield Networks, a type of recurrent neural network that is used for associative memory. We will discuss the architecture of Hopfield Networks, their dynamics, and their applications in various fields. The key concepts in this lecture are:
* __A Hopfield network__ is a type of recurrent neural network designed to function as a content-addressable memory system, capable of storing and retrieving patterns through associative recall. It was introduced in [1982 by John Hopfield](https://pmc.ncbi.nlm.nih.gov/articles/PMC346238/), who was awarded [the Nobel Prize in Physics in 2024](https://news.cornell.edu/stories/2024/10/john-hopfield-phd-58-wins-nobel-prize-physics). The network consists of a single layer of interconnected neurons with bidirectional, symmetric connections (excluding self-connections).
* Each neuron in a Hopfield network operates in a binary $s\in\left\{0,1\right\}$ or bipolar $s\in\left\{-1,1\right\}$ state and updates its state based on weighted inputs from the other neurons until the network stabilizes at an energy minimum, which represents a stored memory.
* __Hopfield Networks__ have a few interesting features. First, they use Hebbian learning: connection weights are set during training to encode patterns via correlation-based rules, not a search. Second, they are content-addressable: the network can retrieve stored patterns even if the input is noisy or incomplete. Third, they are energy-based: the network's dynamics are governed by an energy function, which decreases as the network converges to a stable state.

## What is a Hopfield network?
A [Hopfield network](https://en.wikipedia.org/wiki/Hopfield_network) is a fully connected undirected graph 
consisting of $N$ nodes, where each node has a state $s = \pm{1}$. Each node is connected to every other node, but not to itself, i.e., the network has no self-loops. The weight of the connection between nodes $i$ and $j$, denoted as $w_{ij}\in\mathbf{W}$, is set using [a Hebbian learning rule](https://en.wikipedia.org/wiki/Hebbian_theory).
* _What is Hebbian learning?_ The [Hebbian learning rule](https://en.wikipedia.org/wiki/Hebbian_theory), proposed by [Donald Hebb in 1949](https://en.wikipedia.org/wiki/Donald_O._Hebb), says that synaptic connections between neurons are strengthened when they activate (fire) simultaneously, forming the biological basis for __associative learning__. This "fire together, wire together" principle underpins unsupervised learning in neural networks, linking co-active nodes to enable pattern storage and adaptation.
* _How is this different from other learning approaches?_ Unlike the learning approaches we looked at previously, e.g., logistic regression or any of the online learning methods, the parameters (weights) in a [Hopfield network](https://en.wikipedia.org/wiki/Hopfield_network) are completely specified by the memories we want to encode. Thus, we do not need to search for the weights or learn them by experimentation with the world. Instead, we can directly compute the weights from the memories we want to encode.

### Encoding memories into a Hopfield network
Suppose we wish our network to memorize $m$ images, where each image is an $n\times{n}$ collection of black and white pixels represented as a vector $\mathbf{s}_{i}\in\left\{-1,1\right\}^{n^2}$, so that the network has $N = n^{2}$ nodes. We encode the image using the following rule: if the pixel is white, we set the value to $1$, and if the pixel is black, we set the value to $-1$. Then the weights that encode these $m$ images are given by:
$$
\begin{equation*}
\mathbf{W} = \frac{1}{m}\cdot\sum_{i=1}^{m}\mathbf{s}_{i}\otimes\mathbf{s}_{i}
\end{equation*}
$$
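where $\mathbf{s}_{i}$ denotes the state (pixels) of the $i$-th image we want to memorize, and $\otimes$ denotes the outer product. Thus, the weights are the average of the outer products of our memories! Because the network has no self-connections, the diagonal entries are then set to zero, i.e., $w_{ii} = 0$.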
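
The weight rule above can be sketched in a few lines of Julia. This is a minimal illustration, not the course implementation: the function name `hebbian_weights` and the two four-pixel memories are hypothetical, and zeroing the diagonal is an assumption that follows from the "no self-loops" description of the network.

```julia
using LinearAlgebra

# Hypothetical helper (not from Include.jl): Hebbian weights for a set of memories.
# `memories` holds bipolar (±1) state vectors, all of the same length N.
function hebbian_weights(memories::AbstractVector{<:AbstractVector{<:Real}})
    m = length(memories)
    N = length(first(memories))
    W = zeros(Float64, N, N)
    for s in memories
        W .+= s * s'          # outer product of s with itself, an N×N matrix
    end
    W ./= m                   # average over the m memories
    W[diagind(W)] .= 0.0      # assumption: no self-connections, so w_ii = 0
    return W
end

# two illustrative four-pixel memories
memories = [[1, -1, 1, -1], [1, 1, -1, -1]]
W = hebbian_weights(memories)
```

Note that the resulting matrix is symmetric, $w_{ij} = w_{ji}$, which matches the bidirectional connections of the network.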

* __How big is $m$?__: The maximum theoretical storage limit $K_{\text{max}}$ (the maximum number of images that can be stored) of a Hopfield network, using the standard Hebbian learning rule, is approximately $K_{\text{max}}\sim{0.138}{N}$, where $N$ is the number of neurons in the network. Thus, the network can reliably store a number of patterns equal to about 14\% of its neuron count before retrieval errors become significant due to interference between stored patterns.
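
To get a feel for this limit, here is a small arithmetic sketch in Julia. The helper name `capacity` and the $28\times{28}$ image size are illustrative assumptions; the only input from the text is the estimate $K_{\text{max}}\sim{0.138}{N}$ with one neuron per pixel.

```julia
# Hypothetical helper: rough storage estimate K_max ≈ 0.138 N, rounded down.
# `n` is the side length of an n×n image, so the network has N = n^2 neurons.
capacity(n::Int) = floor(Int, 0.138 * n^2)

K = capacity(28)   # a 28×28 image gives N = 784 neurons, so K = 108
```

So a network sized for $28\times{28}$ images could be expected to hold roughly one hundred patterns under this estimate.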

Suppose we've encoded $m$ images, and we want to retrieve one of them. This seems a little magical. How does that work?

### Memory retrieval
The basic idea of [a Hopfield network](https://en.wikipedia.org/wiki/Hopfield_network) is that each memory is encoded as a _local minimum_ of a global energy function. Thus, during memory retrieval, when we supply a random state vector $\hat{\mathbf{s}}$, we will recover the _closest_ memory encoded in the network to where we start, assuming a greedy descent, where we only take downhill energy steps.
The overall energy of the network is given by:
$$
\begin{equation*}
E(\mathbf{s}) = -\frac{1}{2}\cdot\sum_{ij}w_{ij}s_{i}s_{j} - \sum_{i}b_{i}s_{i}
\end{equation*}
$$
where $w_{ij}\in\mathbf{W}$ are the weights of the network, and $b_{i}$ is a bias term. The bias term is usually set to zero, but it can be used to control the activation threshold of the neurons in the network.
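
As a quick check of the idea that memories sit at low energy, the energy function can be evaluated directly. The sketch below is illustrative only: the function name `energy` is hypothetical, and the hard-coded $\mathbf{W}$ is the Hebbian weight matrix (with zero diagonal) for the two example memories $(1,-1,1,-1)$ and $(1,1,-1,-1)$, with all biases set to zero.

```julia
using LinearAlgebra

# Hypothetical helper: E(s) = -(1/2) Σ_ij w_ij s_i s_j - Σ_i b_i s_i
energy(W::AbstractMatrix, b::AbstractVector, s::AbstractVector) =
    -0.5 * dot(s, W * s) - dot(b, s)

# Hebbian weights for the memories [1,-1,1,-1] and [1,1,-1,-1], zero diagonal
W = [ 0.0  0.0  0.0 -1.0;
      0.0  0.0 -1.0  0.0;
      0.0 -1.0  0.0  0.0;
     -1.0  0.0  0.0  0.0]
b = zeros(4)                              # biases set to zero

E_memory = energy(W, b, [1, -1, 1, -1])   # a stored memory: E = -2.0
E_other  = energy(W, b, [1, 1, 1, 1])     # a state that is not a memory: E = 2.0
```

In this small example the stored memory has lower energy than the unrelated state, which is what the descent procedure exploits.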

#### Algorithm
__Initialize__: Compute the weights $w_{ij}\in\mathbf{W}$ using the Hebbian learning rule, as described above.  Initialize the network with a random state $\mathbf{s}$. Then, use the following algorithm to retrieve a memory:

While __not__ converged:
1. Compute the energy of the network in the current state $\mathbf{s}$ before the update: $E(\mathbf{s})$.
2. Choose a random node $i$ and compute a new (potential) state $s_{i}^{\prime}$ using the update rule: $s_{i}^{\prime} \leftarrow \sigma\left(\sum_{j}w_{ij}s_{j}+b_{i}\right)$, where $\sigma$ is [the `sign(...)` function](https://docs.julialang.org/en/v1/base/math/#Base.sign) and $b_{i}$ is the bias parameter, with the sign chosen to match the energy function above. Let $\mathbf{s}^{\prime}$ denote the state $\mathbf{s}$ with its $i$-th entry replaced by $s_{i}^{\prime}$.
3. Compute the energy of the network after the (potential) update: $E(\mathbf{s}^{\prime})$. If $E(\mathbf{s}^{\prime}) < E(\mathbf{s})$, then update the state of the network: $\mathbf{s} \leftarrow \mathbf{s}^{\prime}$.
4. Check for convergence: If the state of the network no longer changes between updates (for example, no node changes over a full pass through all the nodes), i.e., $\lVert \mathbf{s}^{\prime} - \mathbf{s} \rVert \leq\epsilon$, then the network has converged to a memory. Alternatively, check if the energy of the network has converged, i.e., $\lvert E(\mathbf{s}^{\prime}) - E(\mathbf{s}) \rvert \leq\epsilon$.
5. Continue until convergence.
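
The retrieval loop can be sketched as follows. This is a minimal sketch under stated assumptions, not the course implementation: the name `retrieve` and its keyword arguments are hypothetical, nodes are visited in a random order once per sweep (a variant of picking one random node at a time), ties in the sign function are broken toward $+1$, the bias enters with a plus sign to match the energy function, and convergence means a full sweep with no accepted flips.

```julia
using LinearAlgebra, Random

# Hypothetical sketch of greedy, asynchronous memory retrieval.
function retrieve(W::AbstractMatrix, b::AbstractVector, s0::AbstractVector{<:Integer};
                  maxiter::Int = 1_000, rng = Random.default_rng())
    s = copy(s0)
    N = length(s)
    energy(x) = -0.5 * dot(x, W * x) - dot(b, x)
    for _ in 1:maxiter
        changed = false
        for i in randperm(rng, N)            # visit every node once, in random order
            h = dot(W[i, :], s) + b[i]       # weighted input to node i
            snew = h >= 0 ? 1 : -1           # sign(h); tie at h = 0 goes to +1 (assumption)
            snew == s[i] && continue         # proposed state equals current state
            E_old = energy(s)
            s[i] = snew                      # tentatively apply the flip
            if energy(s) < E_old
                changed = true               # downhill step: keep it
            else
                s[i] = -snew                 # not downhill: undo the flip
            end
        end
        changed || break                     # no accepted flips in a full sweep
    end
    return s
end

# Illustrative usage: store one 8-pixel pattern, corrupt two pixels, retrieve.
p = [1, -1, 1, -1, 1, 1, -1, -1]
W = Float64.(p * p')
W[diagind(W)] .= 0.0                         # no self-connections
b = zeros(length(p))
noisy = copy(p)
noisy[[2, 5]] .*= -1                         # flip pixels 2 and 5
recovered = retrieve(W, b, noisy)
```

With a single stored pattern and only two corrupted pixels, every corrupted node sees a weighted input that points back toward the stored value, so `recovered` equals `p` regardless of the visiting order.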

## Example: Learning a single image pattern
First, let's set up the computational environment by including the `Include.jl` file, which loads the necessary libraries and code.

```julia
include("Include.jl");
```

## Lab
In `L9d`, we will encode multiple memories, see how many images we can encode into a Hopfield network, and think about how to encode continuous values (not just binary values) into a Hopfield network.

# Today?
That's a wrap! What are some of the interesting things we discussed today?