# Evaluating CNN Performance on RRAM-based In-Memory Computing Accelerators

In this notebook, you will use the [cross-sim](https://github.com/sandialabs/cross-sim) simulator to analyze how the performance of a Convolutional Neural Network (CNN) for MNIST digit recognition is affected when deployed on RRAM-based In-Memory-Computing (IMC) accelerators.

You will:
- Train and evaluate a CNN in standard PyTorch (software-only baseline).
- Evaluate the same trained network using cross-sim to simulate RRAM hardware effects.
- Retrain the network using Hardware-Aware Training (HAT) with cross-sim, then evaluate its performance on simulated hardware.

Finally, you can draw your own digit and see how each network performs on your input!

## 1. Setup and Imports

Let's start by importing the necessary libraries and setting up the environment.

In [None]:
import os

REPO_URL = "https://github.com/TommyR06/cross-sim-BTU-course.git"
REPO_NAME = "cross-sim-BTU-course/tutorial/BTU-course"

if not os.path.exists(REPO_NAME):
    !git clone {REPO_URL}
%cd {REPO_NAME}

!pip install -r requirements.txt

In [None]:
%matplotlib inline
import sys
sys.path.append("../../")
import numpy as np
from applications.mvm_params import set_params
import matplotlib.pyplot as plt
from simulator.algorithms.dnn.torch.convert import from_torch, reinitialize, synchronize
import torch
from torchvision import datasets, transforms
from tqdm import tqdm

np.random.seed(498)

## 2. Data Preparation

We will use the MNIST dataset. Let's load and preprocess it.

In [None]:
# Inference batch size
batch_size = 64

# Load the MNIST training set
mnist_data = datasets.MNIST("./", download=True, train=True,
                              transform=transforms.ToTensor(),
                              target_transform=transforms.Compose([
                                lambda x:torch.tensor([x]), 
                                lambda x:torch.nn.functional.one_hot(x,10).float(),
                                lambda x:x.squeeze(),
                                ]))

# Load the MNIST test set
mnist_test = datasets.MNIST("./", download=True, train=False,
                              transform=transforms.ToTensor(),
                              target_transform=transforms.Compose([
                                lambda x:torch.tensor([x]), 
                                lambda x:torch.nn.functional.one_hot(x,10).float(),
                                lambda x:x.squeeze(),
                                ]))

# Split dataset into training and validation and create data loaders
ds_train, ds_val = torch.utils.data.random_split(mnist_data, [0.8, 0.2])
mnist_loader_train = torch.utils.data.DataLoader(ds_train, batch_size=batch_size, shuffle=True)
mnist_loader_val = torch.utils.data.DataLoader(ds_val, batch_size=batch_size, shuffle=False)

# Create test set loader
mnist_loader_test = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False)
N_test = len(mnist_loader_test.dataset)

## 3. Define the CNN Model

We will use a simple CNN suitable for MNIST. This is a small network with only 7018 trainable weights.

In [16]:
# Define the CNN topology
def mnist_cnn():
    return torch.nn.Sequential(
        torch.nn.Conv2d(1, 8, 3, padding='valid', stride=2),
        torch.nn.ReLU(),
        torch.nn.Conv2d(8, 16, 3, padding='valid', stride=2),
        torch.nn.ReLU(),
        torch.nn.Flatten(),
        torch.nn.Linear(576, 10)
        )

## 4. Training and Evaluation Functions

Let's define helper functions for training and evaluating the model.


We will use the standard PyTorch wrapper below to train the CNN on MNIST. We will train using the Adam optimizer with a learning rate of $10^{-3}$.

In [17]:
# Wrapper for training the CNN
class SequentialWrapper():
    def __init__(self, net, loss, learning_rate=1e-3):
        self.net = net
        self.loss = loss
        self.learning_rate = learning_rate
        self.optimizer = torch.optim.Adam(self.net.parameters(), lr=self.learning_rate)

    def forward(self, x):
        return self.net(x)
    
    def training_step(self, batch):
        self.optimizer.zero_grad()
        pred = self.forward(batch[0])
        loss = self.loss(pred, batch[1])
        loss.backward()
        self.optimizer.step()
        return loss

    def validation_step(self, batch):
        pred = self.forward(batch[0])
        loss = self.loss(pred, batch[1])
        return loss
    
    def train_epoch(self, train_loader, val_loader):
        loss_train, loss_val = 0, 0
        for minibatch in iter(train_loader):
            loss_train += self.training_step(minibatch).detach()
        for minibatch in iter(val_loader):
            loss_val += self.validation_step(minibatch).detach()
        return loss_train/len(train_loader), loss_val/len(val_loader)
    
    def train(self, train_loader, val_loader, epochs):
        loss_train, loss_val = np.zeros(epochs), np.zeros(epochs)
        for e in tqdm(range(0, epochs)):
            lt, lv = self.train_epoch(train_loader, val_loader)
            loss_train[e] = lt
            loss_val[e] = lv
        return loss_train, loss_val

# Create the wrapped PyTorch model
mnist_cnn_pt = SequentialWrapper(mnist_cnn(), torch.nn.CrossEntropyLoss())

## 5. Case I: PyTorch Training and Inference

Train and evaluate the CNN using only PyTorch (software baseline).

We will first train this CNN as we would normally do in PyTorch, without any analog error injection during training. After training, we'll evaluate the test accuracy, again without any analog errors.

In [18]:
# Number of training epochs
N_epochs = 20

# Train the standard PyTorch CNN
loss_train_pt, loss_val_pt = mnist_cnn_pt.train(mnist_loader_train, mnist_loader_val, N_epochs)

# Perform inference on the test set, with no analog errors
y_pred, y, k = np.zeros(N_test), np.zeros(N_test), 0
for inputs, labels in mnist_loader_test:
    output = mnist_cnn_pt.net(inputs)
    y_pred_k = output.data.detach().numpy()
    y_pred = np.append(y_pred,y_pred_k.argmax(axis=-1))
    y = np.append(y,labels.detach().numpy().argmax(axis=1))

# Evaluate accuracy
accuracy_digitalTrain_digitalTest = np.sum(y == y_pred)/len(y)
print('===========')
print('No analog errors during training, no analog errors during test')
print('Test accuracy: {:.2f}%\n'.format(accuracy_digitalTrain_digitalTest*100))

100%|██████████| 20/20 [11:31<00:00, 34.59s/it]


No analog errors during training, no analog errors during test
Test accuracy: 99.07%



## 6. Case II: PyTorch Training, CrossSim Inference

Evaluate the PyTorch-trained model using cross-sim to simulate RRAM hardware effects.


**MODIFY BELOW BASED ON DEVICE AT IHP**

How well does this CNN do when analog errors are injected at inference time? Since this is MNIST, we will simulate inference assuming a memory device that has very large errors. This device will have state-independent conductance errors with $\alpha = 0.3$. We will disable all other error models to keep this demo simple.

We will run inference by first passing our trained CNN through our PyTorch layer converter as we did in Part 2. Since the device error is large, we will simulate inference ten times with re-sampled random device errors each time. This will give us a good statistical picture of the network's accuracy.

In [None]:
# Create a parameters object that models a memory device with very large errors
params_analog = set_params(
    weight_bits = 8, 
    wtmodel = "BALANCED", 
    error_model = "generic",
    proportional_error = "False",
    alpha_error = 0.3)

# Convert the layers in the trained CNN
analog_mnist_cnn_pt = from_torch(mnist_cnn_pt.net, params_analog)

# Number of inference simulations with re-sampled random analog errors
N_runs = 10

# Perform analog inference on the test set
accuracies = np.zeros(N_runs)
for i in range(N_runs):
    print("Inference simulation {:d} of {:d}".format(i+1,N_runs), end="\r")
    y_pred, y, k = np.zeros(N_test), np.zeros(N_test), 0
    for inputs, labels in mnist_loader_test:
        output = analog_mnist_cnn_pt.forward(inputs)
        y_pred_k = output.data.detach().numpy()
        y_pred = np.append(y_pred,y_pred_k.argmax(axis=-1))
        y = np.append(y,labels.detach().numpy().argmax(axis=1))
    accuracies[i] = np.sum(y == y_pred)/len(y)
    reinitialize(analog_mnist_cnn_pt)

# Evaluate average test accuracy
print('\n===========')
print('No analog errors during training, CrossSim analog errors during test')
accuracy_digitalTrain_analogTest = np.mean(accuracies)
std_digitalTrain_analogTest = np.std(accuracies)
print('Test accuracy: {:.2f}% +/- {:.3f}%'.format(100*accuracy_digitalTrain_analogTest,100*std_digitalTrain_analogTest))

Inference simulation 10 of 10
No analog errors during training, CrossSim analog errors during test
Test accuracy: 92.00% +/- 2.092%


## 7. Case III: CrossSim Training and Inference (Hardware-Aware Training)

Retrain the network using cross-sim to include hardware effects during training (HAT), then evaluate on simulated hardware.

**MODIFY BELOW BASED ON YOUR CASE**

With the inclusion of these large conductance errors, our model loses quite a bit of accuracy on MNIST.

Now let's try to see if we can make up this accuracy loss by simulating the conductance errors at inference time during the training process. As before, we will disable all other error models to keep things simple. For a practical hardware-aware training scenario, we can specify our parameters to represent the exact analog hardware configuration that would be used during inference and enable as many different error models in CrossSim as we would like.

To do this, we will use a modified training wrapper below that includes only a single new line. The "synchronize" method is called after the backward pass to update the conductance values in the AnalogCores with the new updated weight values found using the optimizer. These updated AnalogCores will then be used for the forward pass of the next training epoch.

We create another PyTorch CNN, convert its layers to be CrossSim-compatible, then wrap it with the modified training wrapper. Then we will train this model with the same large conductance errors injected during training.

In [20]:
# Modified training warpper for CrossSim-in-the-loop training
class SequentialWrapper_CrossSim(SequentialWrapper):
    def __init__(self, net, loss, learning_rate=1e-3):
        super().__init__(net, loss, learning_rate)
        
    def training_step(self, batch):
        self.optimizer.zero_grad()
        pred = self.forward(batch[0])
        loss = self.loss(pred, batch[1])
        loss.backward()
        self.optimizer.step()
        synchronize(self.net)  # <--- The only changed line in all of training!
        return loss

# Create a PyTorch model with CrossSim-compatible layers
analog_mnist_cnn = from_torch(mnist_cnn(), params_analog)

# Create the wrapped analog PyTorch model
analog_mnist_cnn_CS = SequentialWrapper_CrossSim(analog_mnist_cnn, torch.nn.CrossEntropyLoss())

# Train the analog PyTorch model
loss_train_CS, loss_val_CS = analog_mnist_cnn_CS.train(mnist_loader_train, mnist_loader_val, N_epochs)

  0%|          | 0/20 [00:00<?, ?it/s]

100%|██████████| 20/20 [20:05<00:00, 60.26s/it]


Finally, let's perform inference simulation with conductance errors to see if our model that had device-aware training (with the same conductance errors as inference) achieves higher accuracy than the model with standard training.

In [None]:
# Perform analog inference on the test set
accuracies = np.zeros(N_runs)
for i in range(N_runs):
    print("Inference simulation {:d} of {:d}".format(i+1,N_runs), end="\r")
    y_pred, y, k = np.zeros(N_test), np.zeros(N_test), 0
    for inputs, labels in mnist_loader_test:
        new_input = inputs[0:1]
        output = analog_mnist_cnn_CS.net(inputs)
        # output = analog_mnist_cnn_CS.net(new_input)
        y_pred_k = output.data.detach().numpy()
        y_pred = np.append(y_pred,y_pred_k.argmax(axis=-1))
        y = np.append(y,labels.detach().numpy().argmax(axis=1))
    accuracies[i] = np.sum(y == y_pred)/len(y)
    reinitialize(analog_mnist_cnn_CS.net)

# Evaluate average test accuracy
print('\n===========')
print('CrossSim analog errors during training, CrossSim analog errors during test')
accuracy_analogTrain_analogTest = np.mean(accuracies)
std_analogTrain_analogTest = np.std(accuracies)
print('Test accuracy: {:.2f}% +/- {:.3f}%'.format(accuracy_analogTrain_analogTest*100,std_analogTrain_analogTest*100))

Inference simulation 1 of 1
CrossSim analog errors during training, CrossSim analog errors during test
Test accuracy: 98.09% +/- 0.000%


## 8. Summary Table

In [22]:
print("Accuracy on MNIST test set")
print("================")
print("Standard training, standard inference: {:.2f}%".format(100*accuracy_digitalTrain_digitalTest))
print("Standard training, CrossSim inference: {:.2f}% +/- {:.3f}%".format(100*accuracy_digitalTrain_analogTest, 100*std_digitalTrain_analogTest))
print("CrossSim training, CrossSim inference: {:.2f}% +/- {:.3f}%".format(100*accuracy_analogTrain_analogTest, 100*std_analogTrain_analogTest))

Accuracy on MNIST test set
Standard training, standard inference: 99.07%
Standard training, CrossSim inference: 92.00% +/- 2.092%
CrossSim training, CrossSim inference: 98.17% +/- 0.308%


Device-aware training using CrossSim yielded a substantial recovery of the test accuracy in the presence of very large conductance errors!

## 9. Draw Your Own Digit!

Use the canvas below to draw a digit (0-9). The image will be preprocessed and fed to all three models. See how each model predicts your digit!

In [None]:
from ipycanvas import Canvas, hold_canvas
from IPython.display import display
import numpy as np
import matplotlib.pyplot as plt
from google.colab import output
output.enable_custom_widget_manager()

### 9.1. Canvas Generation

Re-run the following cell to clear the canvas and re-drawn.

In [None]:
canvas = Canvas(width=512, height=512, sync_image_data=True)
canvas.fill_style = 'black'
canvas.stroke_style = 'white'
canvas.line_width = 20

display(canvas)

last_pos = [None]
def handle_mouse_down(x, y): last_pos[0] = (x, y)
def handle_mouse_up(x, y): last_pos[0] = None
def handle_mouse_move(x, y):
    if last_pos[0] is not None:
        with hold_canvas(canvas):
            canvas.begin_path()
            canvas.move_to(*last_pos[0])
            canvas.line_to(x, y)
            canvas.stroke()
        last_pos[0] = (x, y)


canvas.on_mouse_down(handle_mouse_down)
canvas.on_mouse_up(handle_mouse_up)
canvas.on_mouse_move(handle_mouse_move)

### 9.2. Print the Drawn Symbol

Print out the digit from the canvas of the previous cell.

In [None]:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

def get_canvas_image(canvas, out_size=(28, 28)):
    data = np.array(canvas.get_image_data())
    gray = np.mean(data[:, :, :3], axis=2).astype(np.uint8)
    pil_img = Image.fromarray(gray)
    pil_img = pil_img.resize(out_size, Image.LANCZOS)
    return np.array(pil_img)

output_img = get_canvas_image(canvas)
plt.imshow(output_img, cmap='gray')
plt.axis('off')
plt.title("Output from Canvas")
plt.show()

### 9.3. Assess Networks Prediction

Check the prediction for the drawn symbol of the three networks.

In [None]:
writing_tranform = transforms.Compose([
    transforms.ToPILImage(),               # Convert to PIL image
    transforms.ToTensor()                 # Converts to [C, H, W] and scales to [0.0, 1.0]
])

img_tensor = writing_tranform(output_img)    # [1, 28, 28]    
img_tensor = img_tensor.unsqueeze(0)   # Add batch dimension -> [1, 1, 28, 28]

In [None]:
output_pt = mnist_cnn_pt.forward(img_tensor)
output_analog_pt = analog_mnist_cnn_pt.forward(img_tensor)
# reinitialize(analog_mnist_cnn_pt) # uncommen to see cycle-by-cycle variability (programming error )
output_analog_CS = analog_mnist_cnn_CS.net(img_tensor)
# reinitialize(analog_mnist_cnn_CS.net) # uncommen to see cycle-by-cycle variability (programming error )


y_pred_output_pt = output_pt.data.detach().numpy()
y_pred_output_analog_pt = output_analog_pt.data.detach().numpy()
y_pred_output_analog_CS = output_analog_CS.data.detach().numpy()

print("Predicted Writing Digit")
print("================")
print(f"Standard training, standard inference: {y_pred_output_pt.argmax(axis=-1)}")
print(f"Standard training, CrossSim inference: {y_pred_output_analog_pt.argmax(axis=-1)}")
print(f"CrossSim training, CrossSim inference: {y_pred_output_analog_CS.argmax(axis=-1)}\n\n")


Predicted Writing Digit
Standard training, standard inference: [3]
Standard training, CrossSim inference: [5]
CrossSim training, CrossSim inference: [3]




[W618 16:53:19.889719448 NNPACK.cpp:57] Could not initialize NNPACK! Reason: Unsupported hardware.
[W618 16:53:19.900309659 NNPACK.cpp:57] Could not initialize NNPACK! Reason: Unsupported hardware.
