# Compare execution time between the CPU and GPU versions

This notebook compares the execution time of a Hopfield network on the CPU and on the GPU.

For reference, in my environment (the digit in parentheses is the uncertainty in the last digit):

- CPU (AMD Ryzen 7 7700): 8.04(7) s
- GPU (NVIDIA GeForce RTX 4060): 0.042(1) s

## Common process

In [1]:
import sys
import os
import numpy as np
import cupy as cp
import cv2
import matplotlib.pyplot as plt

current_dir = os.getcwd()

if 'google.colab' in sys.modules:
    !git clone https://github.com/skrbcr/Hopfield_network_gpu.git
    %cd Hopfield_network_gpu
    sys.path.append('/content/Hopfield_network_gpu/')
    current_dir += '/Hopfield_network_gpu'
else:
    sys.path.append(os.path.dirname(current_dir))

In [2]:
image_name = f"{current_dir}/github.png"  # Pattern image
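P = 100  # Number of random patterns stored alongside the image pattern
M0 = 0.7  # Initial overlap m^0 between the starting state and the image pattern
DELTA_M = 0.001  # Convergence threshold for the change in overlap m between steps
MAX_STEPS = 100  # Maximum number of recall steps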
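
The recall cells below turn `M0` into a number of flipped spins via `int(N * (1 - M0) / 2)`. The following is a minimal NumPy sketch of why that count gives an initial overlap of `M0`. The pattern size `N = 1000`, the random pattern, and the seed are illustrative assumptions, not values from this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for this sketch
N = 1000                        # hypothetical pattern size
M0 = 0.7                        # target initial overlap, as in the notebook

pattern = rng.choice([-1, 1], size=N)  # stand-in for the image pattern xi0
state = pattern.copy()

# Each flipped spin lowers the sum pattern . state by 2, so flipping
# N * (1 - M0) / 2 spins gives an overlap of (N - 2 * n_flip) / N = M0.
n_flip = int(N * (1 - M0) / 2)
flip_idx = rng.choice(N, size=n_flip, replace=False)
state[flip_idx] = -state[flip_idx]

overlap = float(pattern @ state / N)
print(n_flip, overlap)
```

Because `int()` truncates, the overlap is exactly `M0` only when `N * (1 - M0) / 2` is a whole number; otherwise it is slightly above `M0`.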

In [3]:
image = cv2.imread(image_name, cv2.IMREAD_GRAYSCALE)
if image is None:
    print(f"[Error] Cannot open \"{image_name}\"")
    sys.exit(-1)

## GPU (NumPy and CuPy) version

In [4]:
%%timeit

# Memorize
xi0 = cp.where(cp.asarray(image.reshape(-1)) == 255, 1, -1)
N = xi0.size
width = image[0].size
height = N // width

cp.random.seed(np.uint64(1234))

xi = cp.random.choice([-1, 1], size=(P, N))
J = (cp.outer(xi0, xi0) + xi.T @ xi) / N

# Recall
s = xi0.copy()
indices = cp.random.choice(N, size=int(N * (1 - M0) / 2), replace=False)
s[indices] = -s[indices]

m = [float(cp.dot(xi0, s) / N)]

for _ in range(MAX_STEPS):
    s = cp.where((J @ s) >= 0, 1, -1)
    m.append(float(cp.dot(xi0, s) / N))
    if cp.abs(m[-1] - m[-2]) <= DELTA_M:
        break

43.7 ms ± 1.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## CPU (NumPy only) version

**NOTICE**

The CPU version takes a long time to finish: `%%timeit` repeats the cell several times, and each run takes several seconds. Please be patient.
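
To check the memorize and recall logic without waiting for the full run, the same steps can be tried on a much smaller problem. This is a rough sketch, not part of the benchmark. It assumes a random pattern in place of the image and hypothetical sizes `N = 400`, `P = 10`, and it uses NumPy's `default_rng` generator instead of the `np.random.seed` call used in the cell below.

```python
import numpy as np

rng = np.random.default_rng(1234)  # Generator API; differs from np.random.seed in the notebook

N, P = 400, 10                     # hypothetical small sizes for a quick check
M0, DELTA_M, MAX_STEPS = 0.7, 0.001, 100

# Memorize: one target pattern plus P random patterns (Hebbian rule)
xi0 = rng.choice([-1, 1], size=N)  # random stand-in for the image pattern
xi = rng.choice([-1, 1], size=(P, N))
J = (np.outer(xi0, xi0) + xi.T @ xi) / N

# Recall: start from the target with a fraction (1 - M0) / 2 of the spins flipped
s = xi0.copy()
idx = rng.choice(N, size=int(N * (1 - M0) / 2), replace=False)
s[idx] = -s[idx]

m = [float(xi0 @ s / N)]           # overlap with the target at each step
for _ in range(MAX_STEPS):
    s = np.where(J @ s >= 0, 1, -1)
    m.append(float(xi0 @ s / N))
    if abs(m[-1] - m[-2]) <= DELTA_M:
        break

print(m)
```

With so few patterns relative to `N`, the overlap should climb from 0.7 to close to 1 within a few steps. Timings from this reduced problem are not comparable to the figures reported in this notebook.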

In [5]:
%%timeit

# Memorize
xi0 = np.where(image.reshape(-1) == 255, 1, -1)
N = xi0.size
width = image[0].size
height = N // width

np.random.seed(np.uint64(1234))

xi = np.random.choice([-1, 1], size=(P, N))
J = (np.outer(xi0, xi0) + xi.T @ xi) / N

# Recall
s = xi0.copy()
indices = np.random.choice(N, size=int(N * (1 - M0) / 2), replace=False)
s[indices] = -s[indices]

m = [float(np.dot(xi0, s) / N)]

for _ in range(MAX_STEPS):
    s = np.where((J @ s) >= 0, 1, -1)
    m.append(float(np.dot(xi0, s) / N))
    if np.abs(m[-1] - m[-2]) <= DELTA_M:
        break

8.17 s ± 74.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
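
The two `%%timeit` results can be turned into a single speedup factor. This small sketch uses only the mean values printed above and ignores the reported standard deviations.

```python
cpu_s = 8.17      # CPU mean time per loop in seconds (from the %%timeit output)
gpu_s = 43.7e-3   # GPU mean time per loop in seconds (43.7 ms)

speedup = cpu_s / gpu_s
print(f"GPU speedup: about {speedup:.0f}x")
```

For this run the GPU version is roughly 190 times faster than the CPU version; the exact factor will vary with hardware and image size.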
