# Hopfield Network for Binary Pattern Restoration

This notebook introduces the **energy-based model (EBM)** perspective on the Hopfield Network, explains its connection to the **Ising model**, and walks through a simple binary pattern restoration in Python. Physics intuition about energy functions and system dynamics carries over directly, because the network behaves much like a physical spin system.

## 1. Overview of Energy-based Models (EBM)
Energy-based models define an energy function over all possible system states and perform learning and inference by **minimizing** that energy. Given an input, the model settles into a low-energy state, which is in general a local minimum rather than the global one. Examples include Hopfield Networks and Boltzmann Machines.

- **Lyapunov Function**: In a Hopfield Network with symmetric weights and asynchronous (one-neuron-at-a-time) updates, the energy serves as a Lyapunov function: each update either lowers the energy or leaves it unchanged, so the system converges to a stable attractor.
- **Inference**: The network operates by descending to a (local) minimum-energy state, which enables applications such as pattern restoration and solving optimization problems.

## 2. Introduction and Theoretical Background of Hopfield Networks

A Hopfield Network is a recurrent neural network that has a strong analogy with the **Ising model** in physics. Its main properties are:

1. **Neurons (Spins) and States**
   - Each neuron $s_i$ takes a binary state $+1$ or $-1$.
   - This corresponds to the up/down spin states in magnetic systems.

2. **Symmetric Weights (Interactions)**
   - The weight $w_{ij}$ between neurons $i$ and $j$ is symmetric ($w_{ij} = w_{ji}$), and self-connections are zero ($w_{ii} = 0$).

3. **Energy Function**
   - The network’s energy is defined as:
\begin{equation}
  E(\mathbf{s}) = -\frac{1}{2} \sum_{i,j} w_{ij} \, s_i \, s_j 
                   - \sum_i \theta_i \, s_i,
\end{equation}
   - Here, $\theta_i$ is the bias (threshold) for neuron $i$.
   - This is structurally identical to the Ising Hamiltonian:
\begin{equation}
  H(\{s_i\}) = -\sum_{\langle i,j \rangle} J_{ij} \, s_i \, s_j 
                - \sum_i h_i \, s_i,
\end{equation}
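     where $J_{ij} \leftrightarrow w_{ij}$ and $h_i \leftrightarrow \theta_i$. The factor $\tfrac{1}{2}$ in $E$ compensates for the double sum counting each pair $(i,j)$ twice, whereas the Ising sum counts each interacting pair once.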

4. **State Update Rules**
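   - **Asynchronous update**: At each time step, randomly select one neuron $i$ and update:
\begin{equation}
  s_i \leftarrow \mathrm{sign}\Bigl(\sum_j w_{ij} s_j + \theta_i\Bigr).
\end{equation}
     The bias enters with a $+$ sign so that this rule never increases the energy $E$ defined in item 3.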
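   - **Synchronous update**: Update all neurons simultaneously using the same rule. Unlike the asynchronous scheme, this is not guaranteed to lower the energy and can end in a cycle that alternates between two states.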

5. **Learning Rule: Hebbian Learning**
   - Given $P$ patterns $p^{(\mu)}$ ($\mu = 1, \dots, P$), set the weights by:
\begin{equation}
  w_{ij} = \frac{1}{N} \sum_{\mu=1}^P p_i^{(\mu)} \, p_j^{(\mu)}, \quad w_{ii} = 0,
\end{equation}
   - Here, $N$ is the number of neurons, i.e., the length of each pattern vector.

Thanks to these properties, Hopfield Networks serve as **content-addressable memory**, solve **optimization problems**, and provide insights into **stable states of physical systems**.
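
To see why the asynchronous rule never raises the energy, consider the zero-bias case ($\theta_i = 0$) used by the code in this notebook. What follows is a short sketch of the standard argument, where $h_i$ denotes the local field on neuron $i$ (a symbol introduced here for convenience). Because $w_{ij} = w_{ji}$ and $w_{ii} = 0$, the only terms of $E$ that contain $s_i$ add up to $-s_i h_i$, so changing $s_i$ to $s_i'$ while holding all other neurons fixed gives
\begin{equation}
  h_i = \sum_j w_{ij} \, s_j, \qquad \Delta E = -(s_i' - s_i) \, h_i .
\end{equation}
If $s_i' = \mathrm{sign}(h_i)$, then either $s_i' = s_i$ and $\Delta E = 0$, or $s_i' = -s_i$ and
\begin{equation}
  \Delta E = -2 \, s_i' \, h_i = -2 \, |h_i| \le 0 .
\end{equation}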


## 3. Weight Calculation with Hebbian Learning (`train_hopfield`)

- Implement the Hebbian learning rule:
\begin{equation}
  w_{ij} = \frac{1}{N} \sum_{\mu=1}^P p_i^{(\mu)} \, p_j^{(\mu)}, \quad w_{ii} = 0
\end{equation}
- Input: `patterns`, a list of 1D NumPy arrays of length $N$ with values +1 or -1.
- Output: Symmetric weight matrix `W` of shape $(N, N)$.

In [1]:
import numpy as np
import random

In [2]:
def train_hopfield(patterns):
    """
    Compute the weight matrix W using the Hebbian learning rule.

    Args:
        patterns (list of np.ndarray): List of pattern vectors (length N, values +1 or -1).

    Returns:
        np.ndarray: Symmetric weight matrix W of shape (N, N).
    """
    N = patterns[0].size
    W = np.zeros((N, N))
    # Hebbian learning: accumulate the outer product of each stored pattern.
    for p in patterns:
        W += np.outer(p, p)
    W /= N  # normalize by the number of neurons
    np.fill_diagonal(W, 0)  # no self-connections: w_ii = 0
    return W
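
As a quick sanity check on the Hebbian rule, the sketch below writes the same sum of outer products as a single matrix product and verifies the two structural properties listed in Section 2. The helper name `hebbian_weights_vectorized` and the two four-neuron demo patterns are assumptions made for illustration only; they are not part of the notebook's own code.

```python
import numpy as np

def hebbian_weights_vectorized(patterns):
    """Hypothetical helper: the Hebbian rule written as one matrix product."""
    X = np.stack(patterns).astype(float)   # shape (P, N), one pattern per row
    N = X.shape[1]
    W = X.T @ X / N                        # sum of outer products, scaled by 1/N
    np.fill_diagonal(W, 0.0)               # remove self-connections
    return W

# Two small orthogonal patterns, chosen only for illustration.
demo_patterns = [np.array([1, -1, 1, -1]), np.array([1, 1, -1, -1])]
W_demo = hebbian_weights_vectorized(demo_patterns)

print(np.allclose(W_demo, W_demo.T))       # symmetry check
print(np.allclose(np.diag(W_demo), 0.0))   # zero-diagonal check

# Each stored pattern should reproduce itself under the sign rule.
for p in demo_patterns:
    print(np.array_equal(np.where(W_demo @ p >= 0, 1, -1), p))
```

For these two orthogonal patterns every check prints `True`. Stored patterns that are strongly correlated are not guaranteed to be stable in general.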

## 4. Asynchronous State Update (`update_state`)

- Randomly select one neuron $i$ and update its state:
\begin{equation}
  s_i \leftarrow \mathrm{sign}\Bigl(\sum_j w_{ij} s_j + \theta_i\Bigr).
\end{equation}
- The implementation below sets every bias $\theta_i = 0$ and maps a zero input to $+1$.
- Return the updated state vector.

In [None]:
def update_state(state, weights):
    """
    Perform one asynchronous update of the network state.

    Args:
        state (np.ndarray): Current state vector (length N, values +1 or -1).
        weights (np.ndarray): Weight matrix W of shape (N, N).

    Returns:
        np.ndarray: Updated state vector after one asynchronous update.
    """
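    new_state = state.copy()
    # Pick one neuron uniformly at random (asynchronous update).
    i = random.randrange(state.size)
    # Local field on neuron i; the bias theta_i is taken to be zero here.
    total_input = np.dot(weights[i], state)
    # sign() with the convention that a zero field maps to +1.
    new_state[i] = 1 if total_input >= 0 else -1
    return new_state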
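
Because `update_state` picks a neuron at random, a single call that leaves the state unchanged does not prove convergence. One possible alternative, sketched below, is to test directly whether any neuron would flip and to sweep over all neurons in random order until none does. The helpers `is_fixed_point` and `run_until_stable` are hypothetical names introduced for this example, and the six-neuron pattern is illustrative.

```python
import numpy as np

def is_fixed_point(state, weights):
    """Hypothetical helper: True if no single-neuron update would change the state."""
    local_fields = weights @ state
    # Same tie-breaking as update_state: a field of exactly 0 maps to +1.
    updated = np.where(local_fields >= 0, 1, -1)
    return np.array_equal(updated, state)

def run_until_stable(state, weights, rng, max_sweeps=100):
    """Hypothetical helper: asynchronous sweeps in random order until a fixed point."""
    state = state.copy()
    for _ in range(max_sweeps):
        if is_fixed_point(state, weights):
            break
        for i in rng.permutation(state.size):   # visit every neuron once per sweep
            state[i] = 1 if weights[i] @ state >= 0 else -1
    return state

# Store a single illustrative pattern, corrupt one bit, and recover it.
pattern = np.array([1, -1, 1, 1, -1, -1])
W_single = np.outer(pattern, pattern) / pattern.size
np.fill_diagonal(W_single, 0.0)

noisy = pattern.copy()
noisy[2] *= -1
recovered = run_until_stable(noisy, W_single, np.random.default_rng(0))
print(np.array_equal(recovered, pattern))   # True: the single stored pattern is restored
```

With only one stored pattern and one flipped bit, recovery is guaranteed. With several stored patterns the network may instead settle into a different attractor.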

## 5. Energy Computation (`compute_energy`)

- Compute the network energy, i.e., the energy function from Section 2 with all biases $\theta_i = 0$:
\begin{equation}
  E = -\tfrac{1}{2} s^T W s.
\end{equation}
- Return the energy value.

In [None]:
def compute_energy(state, weights):
    """
    Calculate the energy of the current network state.

    Args:
        state (np.ndarray): State vector of length N.
        weights (np.ndarray): Weight matrix W of shape (N, N).

    Returns:
        float: Energy value E.
    """
    # E = -1/2 * s^T W s (biases are taken to be zero)
    E = -0.5 * np.dot(state, np.dot(weights, state))
    return E
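
The Lyapunov property from Section 1 can be checked numerically. The sketch below is self-contained and does not use the notebook's trained weights: it draws an arbitrary symmetric, zero-diagonal weight matrix (an assumption made purely for the test), applies random asynchronous updates, and records the energy after each one. All variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20

# Arbitrary symmetric weights with zero diagonal (illustrative, not Hebbian).
A = rng.normal(size=(N, N))
W_sym = (A + A.T) / 2
np.fill_diagonal(W_sym, 0.0)

state = rng.choice([-1, 1], size=N)
energies = [-0.5 * state @ W_sym @ state]

for _ in range(200):
    i = rng.integers(N)                               # pick one neuron at random
    state[i] = 1 if W_sym[i] @ state >= 0 else -1     # asynchronous sign update
    energies.append(-0.5 * state @ W_sym @ state)

energy_steps = np.diff(energies)
# Allow a tiny tolerance for floating-point rounding.
print(bool(energy_steps.max() <= 1e-9))
```

The check should print `True` for any symmetric matrix with zero diagonal. Breaking the symmetry of the weights removes this guarantee.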

## 6. Adding Noise to a Pattern (`add_noise`)

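- Flip a fraction `noise_level` of the bits in the input pattern, chosen at random without repetition. The number of flipped bits is `int(noise_level * N)`, i.e., rounded down.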

In [None]:
def add_noise(pattern, noise_level):
    """
    Add noise to a binary pattern by flipping a fraction of its bits.

    Args:
        pattern (np.ndarray): Original pattern vector (length N).
        noise_level (float): Fraction of bits to flip (0 to 1).

    Returns:
        np.ndarray: Noisy pattern vector.
    """
    noisy = pattern.copy()
    N = pattern.size
    num_flip = int(noise_level * N)  # truncates toward zero, e.g. int(0.15 * 25) == 3
    # Choose num_flip distinct positions and flip their signs.
    flip_indices = np.random.choice(N, size=num_flip, replace=False)
    noisy[flip_indices] *= -1
    return noisy
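
To make noisy inputs reproducible from run to run, one option is to pass an explicit NumPy random generator instead of relying on the global random state. The variant below, `add_noise_seeded`, is a hypothetical helper written for this illustration. It also confirms that a noise level of 0.15 on 25 pixels flips exactly 3 of them.

```python
import numpy as np

def add_noise_seeded(pattern, noise_level, rng):
    """Hypothetical variant of add_noise that takes an explicit random generator."""
    noisy = pattern.copy()
    num_flip = int(noise_level * pattern.size)   # rounds down
    flip_indices = rng.choice(pattern.size, size=num_flip, replace=False)
    noisy[flip_indices] *= -1
    return noisy

rng = np.random.default_rng(42)
clean = np.ones(25, dtype=int)
noisy = add_noise_seeded(clean, 0.15, rng)

num_changed = int(np.sum(noisy != clean))
print(num_changed)   # 3, since int(0.15 * 25) == 3
```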

## 7. Training and Evaluating the Hopfield Network

In [6]:
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from IPython.display import HTML

In [7]:
def animate_convergence(initial_state, weights, interval=25, max_steps=400, patience=80):
    """
    Animate the asynchronous convergence of a Hopfield network with convergence patience.

    Args:
        initial_state (np.ndarray): Noisy initial state vector (length 25; each frame is reshaped to 5x5).
        weights (np.ndarray): Learned weight matrix.
        interval (int): Time in milliseconds between frames.
        max_steps (int): Maximum number of update steps to attempt.
        patience (int): Number of consecutive single-neuron updates that leave the state
            unchanged before convergence is assumed.

    Returns:
        Animation object (for use in notebooks).
    """
    state = initial_state.copy()
    states = [state.copy()]
    unchanged_counter = 0

    for _ in range(max_steps):
        new_state = update_state(state, weights)
        states.append(new_state.copy())
        if np.array_equal(new_state, state):
            unchanged_counter += 1
            if unchanged_counter >= patience:
                break
        else:
            unchanged_counter = 0
        state = new_state

    fig, ax = plt.subplots(figsize=(3, 3))
    im = ax.imshow(states[0].reshape(5, 5), cmap='gray', vmin=-1, vmax=1, interpolation='none')

    # Remove ticks and add frame
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_xticks(np.arange(-0.5, 5, 1), minor=True)
    ax.set_yticks(np.arange(-0.5, 5, 1), minor=True)
    ax.grid(which='minor', color='black', linestyle='-', linewidth=1)
    ax.tick_params(which='minor', bottom=False, left=False)
    ax.set_title("Hopfield Convergence", fontsize=10)
    fig.tight_layout()

    def update(frame):
        im.set_data(states[frame].reshape(5, 5))
        return [im]

    ani = animation.FuncAnimation(
        fig, update, frames=len(states), interval=interval, blit=True, repeat=False
    )
    plt.close(fig)
    return ani

In [8]:
p1 = np.array([
    [ 1,  1, 1,  1,  1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1]
]).flatten()
p2 = np.array([
    [1, -1, -1, -1, -1],
    [1, -1, -1, -1, -1],
    [1, -1, -1, -1, -1],
    [1, -1, -1, -1, -1],
    [1,  1,  1,  1,  1]
]).flatten()
p3 = np.array([
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1],
    [ 1,  1, 1,  1,  1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1]
]).flatten()
patterns = [p1, p2, p3]
W = train_hopfield(patterns)

In [9]:
HTML(animate_convergence(add_noise(p1, 0.15).copy(), W).to_jshtml())

In [10]:
HTML(animate_convergence(add_noise(p2, 0.15).copy(), W).to_jshtml())

In [11]:
HTML(animate_convergence(add_noise(p3, 0.15).copy(), W).to_jshtml())
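
The animations show recall qualitatively. For a numeric score, one common choice is the normalized overlap between a final state and a stored pattern. The helper `overlap` below is a hypothetical addition, shown on a small illustrative vector rather than on the notebook's patterns.

```python
import numpy as np

def overlap(state, pattern):
    """Hypothetical helper: normalized overlap m = (1/N) * sum_i s_i p_i, in [-1, 1]."""
    return float(np.dot(state, pattern)) / pattern.size

demo = np.array([1, -1, 1, 1, -1, -1, 1, -1])
one_flip = demo.copy()
one_flip[0] *= -1

print(overlap(demo, demo))       # 1.0: exact recall
print(overlap(one_flip, demo))   # 0.75: one of eight bits is wrong
print(overlap(-demo, demo))      # -1.0: the inverted pattern
```

An overlap of $-1$ matters in practice: with zero biases, $E(\mathbf{s}) = E(-\mathbf{s})$, so the inverse of every stored pattern is an equally deep energy minimum and the network can settle there.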