
# PyTorch vs TensorFlow - ANN

This notebook series gives a side-by-side code comparison of PyTorch and TensorFlow. The two libraries are among the most widely used deep learning frameworks, each with its own syntax, features, and paradigms. A "Rosetta Stone" for translating between them lets practitioners draw on the strengths of both, makes it easier to learn and collaborate across teams, and helps models and research code move between the two ecosystems.

## In this notebook
In the first notebook of this series, I start with the basics by building a simple Artificial Neural Network (ANN) in both PyTorch and TensorFlow. We train the ANN on the MNIST dataset of handwritten digit images to illustrate the fundamental concepts and workflow of each framework. This foundational exercise sets the stage for more complex models and comparisons in subsequent notebooks.

For both PyTorch and TensorFlow, the steps for training a model are essentially the same:

1. Importing the necessary libraries.
2. Defining the model parameters and hyperparameters.
3. Loading the dataset.
4. Preprocessing the data to make it suitable for training.
5. Initializing the model architecture.
6. Training the model using the defined parameters and dataset.

These steps form the core workflow in building and training deep learning models, regardless of the framework used.
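
To make the six steps concrete before any framework is involved, here is a minimal NumPy-only sketch of the same workflow. It is an illustration under stated assumptions, not the notebook's model: the dataset is a synthetic set of Gaussian blobs, the model is a single linear layer with softmax, and all names and hyperparameter values are hypothetical.

```python
# 1. Import the necessary libraries.
import numpy as np

# 2. Define the model parameters and hyperparameters (illustrative values).
n_features, n_classes = 4, 3
n_epochs, batch_size, learning_rate = 20, 16, 0.5

# 3. Load the dataset (synthetic stand-in: one Gaussian blob per class).
rng = np.random.default_rng(0)
centers = rng.normal(scale=4.0, size=(n_classes, n_features))
labels = rng.integers(0, n_classes, size=300)
features = centers[labels] + rng.normal(size=(300, n_features))

# 4. Preprocess: standardise the features and one-hot encode the labels.
features = (features - features.mean(axis=0)) / features.std(axis=0)
targets = np.eye(n_classes)[labels]

# 5. Initialise the model (a single linear layer followed by softmax).
weights = np.zeros((n_features, n_classes))
bias = np.zeros(n_classes)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_loss(x, t):
    # Mean categorical cross-entropy with the current weights and bias.
    probs = softmax(x @ weights + bias)
    return float(-np.mean(np.sum(t * np.log(probs + 1e-12), axis=1)))

# 6. Train with mini-batch gradient descent.
loss_history = [mean_loss(features, targets)]
for epoch in range(n_epochs):
    order = rng.permutation(len(features))  # reshuffle every epoch
    for start in range(0, len(features), batch_size):
        idx = order[start:start + batch_size]
        x, t = features[idx], targets[idx]
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = (softmax(x @ weights + bias) - t) / len(idx)
        weights -= learning_rate * (x.T @ grad_logits)
        bias -= learning_rate * grad_logits.sum(axis=0)
    loss_history.append(mean_loss(features, targets))
```

The TensorFlow and PyTorch cells in this notebook follow the same six steps, with the manual gradient update replaced by each framework's optimizer. Keras hides the loop of step 6 inside `model.fit`, while PyTorch spells it out.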

# TensorFlow

In [1]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go

# Plotting the training metrics with Plotly
def plot_training_history(history):
    fig = make_subplots(rows=1, cols=2, subplot_titles=("Loss Over Epochs", "Accuracy Over Epochs"))

    # Loss plot
    fig.add_trace(go.Scatter(x=list(range(len(history.history['loss']))),
                             y=history.history['loss'],
                             mode='lines',
                             name='Training Loss'),
                  row=1, col=1)
    fig.add_trace(go.Scatter(x=list(range(len(history.history['val_loss']))),
                             y=history.history['val_loss'],
                             mode='lines',
                             name='Validation Loss'),
                  row=1, col=1)

    # Accuracy plot
    fig.add_trace(go.Scatter(x=list(range(len(history.history['accuracy']))),
                             y=history.history['accuracy'],
                             mode='lines',
                             name='Training Accuracy'),
                  row=1, col=2)
    fig.add_trace(go.Scatter(x=list(range(len(history.history['val_accuracy']))),
                             y=history.history['val_accuracy'],
                             mode='lines',
                             name='Validation Accuracy'),
                  row=1, col=2)

    # Updating layout
    fig.update_layout(title='Training Metrics',
                      xaxis_title='Epoch',
                      yaxis_title='Value',
                      showlegend=True)

    # Update xaxis labels
    fig.update_xaxes(title_text='Epoch', row=1, col=1)
    fig.update_xaxes(title_text='Epoch', row=1, col=2)

    # Update yaxis labels
    fig.update_yaxes(title_text='Loss', row=1, col=1)
    fig.update_yaxes(title_text='Accuracy', row=1, col=2)

    fig.show()


In [2]:
# 1. Importing the necessary libraries.
import tensorflow as tf
import numpy as np

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import CategoricalCrossentropy

from tensorflow.keras.datasets import mnist

# 2. Defining the model parameters and hyperparameters.
hidden_1 = 64
hidden_2 = 64
n_classes = 10
n_epochs = 10
batch_size = 32
learning_rate = 0.001

# 3. Loading the dataset.
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# 4. Preprocessing the data to make it suitable for training.
# Flatten each 28x28 image into a 784-dimensional vector and scale pixels to [0, 1].
x_train, x_test = x_train.reshape(-1, 784).astype(np.float32), x_test.reshape(-1, 784).astype(np.float32)
x_train, x_test = x_train/255, x_test/255

# One-hot encode the integer labels: row i gets a 1 in column y[i].
y_train_encoded = np.zeros(shape=(y_train.shape[0], n_classes), dtype=int)
y_test_encoded = np.zeros(shape=(y_test.shape[0], n_classes), dtype=int)

y_train_encoded[np.arange(len(y_train)), y_train] = 1
y_test_encoded[np.arange(len(y_test)), y_test] = 1

# 5. Initializing the model architecture.
model = Sequential([
    Dense(units=hidden_1, activation="relu"),
    Dense(units=hidden_2, activation="relu"),
    Dense(units=n_classes, activation="softmax")
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate), loss=CategoricalCrossentropy(), metrics=["accuracy"])

# 6. Training the model using the defined parameters and dataset.
# fit() returns a History object whose .history dict holds the per-epoch metrics.
callback = model.fit(x=x_train, y=y_train_encoded, validation_data=(x_test, y_test_encoded), batch_size=batch_size, epochs=n_epochs)

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 2ms/step - accuracy: 0.8646 - loss: 0.4784 - val_accuracy: 0.9555 - val_loss: 0.1479
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9610 - loss: 0.1306 - val_accuracy: 0.9636 - val_loss: 0.1194
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9736 - loss: 0.0871 - val_accuracy: 0.9696 - val_loss: 0.0959
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9784 - loss: 0.0695 - val_accuracy: 0.9729 - val_loss: 0.0887
Epoch 5/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9820 - loss: 0.0565 - val_accuracy: 0.9743 - val_loss: 0.0873
Epo

In [3]:
plot_training_history(history=callback)
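
Step 4 of the TensorFlow cell turns the integer labels into one-hot rows because `CategoricalCrossentropy` expects one target vector per sample. As a minimal sketch on made-up labels (the array `labels` below is illustrative, not MNIST data), index assignment and a lookup into the identity matrix produce the same encoding:

```python
import numpy as np

n_classes = 10
labels = np.array([5, 0, 4, 1, 9])  # hypothetical stand-in for y_train

# Same idea as the cell above: write a 1 at (row i, column labels[i]).
one_hot_indexed = np.zeros((len(labels), n_classes), dtype=int)
one_hot_indexed[np.arange(len(labels)), labels] = 1

# Equivalent shortcut: select rows of the identity matrix.
one_hot_eye = np.eye(n_classes, dtype=int)[labels]

# argmax along the class axis recovers the original integer labels.
recovered = one_hot_indexed.argmax(axis=1)
```

Keras also offers `tf.keras.utils.to_categorical` for this encoding, and `SparseCategoricalCrossentropy` accepts the integer labels directly, which would make the encoding step unnecessary.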

# PyTorch

In [4]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def plot_training_metrics(n_epochs, train_losses, test_losses, train_accuracies, test_accuracies):
    # Plotting the training metrics with Plotly
    fig = make_subplots(rows=1, cols=2, subplot_titles=("Loss Over Epochs", "Accuracy Over Epochs"))

    # Loss plot
    fig.add_trace(go.Scatter(x=list(range(n_epochs)), y=train_losses, mode='lines', name='Training Loss'), row=1, col=1)
    fig.add_trace(go.Scatter(x=list(range(n_epochs)), y=test_losses, mode='lines', name='Validation Loss'), row=1, col=1)

    # Accuracy plot
    fig.add_trace(go.Scatter(x=list(range(n_epochs)), y=train_accuracies, mode='lines', name='Training Accuracy'), row=1, col=2)
    fig.add_trace(go.Scatter(x=list(range(n_epochs)), y=test_accuracies, mode='lines', name='Validation Accuracy'), row=1, col=2)

    # Updating layout
    fig.update_layout(title='Training Metrics', xaxis_title='Epoch', yaxis_title='Value', showlegend=True)

    # Update xaxis labels
    fig.update_xaxes(title_text='Epoch', row=1, col=1)
    fig.update_xaxes(title_text='Epoch', row=1, col=2)

    # Update yaxis labels
    fig.update_yaxes(title_text='Loss', row=1, col=1)
    fig.update_yaxes(title_text='Accuracy', row=1, col=2)

    fig.show()


In [5]:
# 1. Importing the necessary libraries.
import torch
import torch.nn as nn
import torchvision
from tqdm import tqdm

# 2. Defining the model parameters and hyperparameters.
hidden_1 = 64
hidden_2 = 64
n_classes = 10
n_epochs = 10
batch_size = 32
learning_rate = 0.001

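# 3. Loading the dataset.
# ToTensor scales pixels to [0, 1]; Normalize((0.5,), (1.0,)) then shifts them to [-0.5, 0.5].
trans = torchvision.transforms.Compose([torchvision.transforms.ToTensor(), 
                            torchvision.transforms.Normalize((0.5,), (1.0,))])
mnist_dataset_train = torchvision.datasets.MNIST(root=".", transform=trans, train=True, download=True)
# train=False selects the test split; without it, MNIST defaults to the training split.
mnist_dataset_test = torchvision.datasets.MNIST(root=".", transform=trans, train=False, download=False)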

# 4. Preprocessing the data to make it suitable for training.
train_loader = torch.utils.data.DataLoader(
    dataset=mnist_dataset_train, 
    batch_size=batch_size, 
    shuffle=True)
test_loader = torch.utils.data.DataLoader(
    dataset=mnist_dataset_test, 
    batch_size=batch_size, 
    shuffle=False)

# 5. Initializing the model architecture.
class ANN(nn.Module):
    def __init__(self):
        super(ANN, self).__init__()
        self.fc1 = nn.Linear(28*28, hidden_1)
        self.fc2 = nn.Linear(hidden_1, hidden_2)
        self.out = nn.Linear(hidden_2, n_classes)
    
    def forward(self, x):
        x = x.reshape(-1, 28*28)
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = self.out(x)
        return x

model = ANN()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
loss_function = nn.CrossEntropyLoss()

# Tracking metrics
train_losses = []
test_losses = []
train_accuracies = []
test_accuracies = []

# 6. Training the model using the defined parameters and dataset.
for epoch in range(n_epochs):
    model.train()
    running_loss = 0
    correct_train = 0
    total_train = 0

    for batch_idx, (x_train, y_train) in tqdm(enumerate(train_loader)):
        optimizer.zero_grad()
        y_pred = model(x_train)
        loss = loss_function(y_pred, y_train)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        _, predicted = torch.max(y_pred, 1)
        correct_train += (predicted == y_train).sum().item()
        total_train += y_train.size(0)

    train_losses.append(running_loss / len(train_loader))
    train_accuracies.append(correct_train / total_train)

    model.eval()
    test_loss = 0
    correct_test = 0
    total_test = 0

    with torch.no_grad():
        for x_test, y_test in test_loader:
            y_pred = model(x_test)
            loss = loss_function(y_pred, y_test)
            test_loss += loss.item()
            _, predicted = torch.max(y_pred, 1)
            correct_test += (predicted == y_test).sum().item()
            total_test += y_test.size(0)

    test_losses.append(test_loss / len(test_loader))
    test_accuracies.append(correct_test / total_test)

    print(f'Epoch {epoch+1}/{n_epochs}, Train Loss: {train_losses[-1]:.4f}, Train Accuracy: {train_accuracies[-1]:.4f}, Test Loss: {test_losses[-1]:.4f}, Test Accuracy: {test_accuracies[-1]:.4f}')


plot_training_metrics(n_epochs, train_losses, test_losses, train_accuracies, test_accuracies)

1875it [00:17, 104.64it/s]
Epoch 1/10, Train Loss: 0.4151, Train Accuracy: 0.8761, Test Loss: 0.2615, Test Accuracy: 0.9170

1875it [00:18, 102.68it/s]
Epoch 2/10, Train Loss: 0.2191, Train Accuracy: 0.9331, Test Loss: 0.1740, Test Accuracy: 0.9464

1875it [00:17, 104.93it/s]
Epoch 3/10, Train Loss: 0.1675, Train Accuracy: 0.9477, Test Loss: 0.1525, Test Accuracy: 0.9535

1875it [00:18, 101.88it/s]
Epoch 4/10, Train Loss: 0.1381, Train Accuracy: 0.9572, Test Loss: 0.1068, Test Accuracy: 0.9668

1875it [00:17, 104.53it/s]
Epoch 5/10, Train Loss: 0.1189, Train Accuracy: 0.9623, Test Loss: 0.0931, Test Accuracy: 0.9708

1875it [00:17, 105.02it/s]
Epoch 6/10, Train Loss: 0.1044, Train Accuracy: 0.9671, Test Loss: 0.0902, Test Accuracy: 0.9717

1875it [00:17, 105.58it/s]
Epoch 7/10, Train Loss: 0.0943, Train Accuracy: 0.9699, Test Loss: 0.0824, Test Accuracy: 0.9738

1875it [00:17, 104.96it/s]
Epoch 8/10, Train Loss: 0.0844, Train Accuracy: 0.9738, Test Loss: 0.0705, Test Accuracy: 0.9770

1875it [00:17, 105.23it/s]
Epoch 9/10, Train Loss: 0.0788, Train Accuracy: 0.9750, Test Loss: 0.0846, Test Accuracy: 0.9723

1875it [00:18, 103.33it/s]
Epoch 10/10, Train Loss: 0.0741, Train Accuracy: 0.9760, Test Loss: 0.1061, Test Accuracy: 0.9632
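
One difference between the two cells is easy to miss. The Keras model ends in a softmax layer and is paired with `CategoricalCrossentropy` on one-hot targets, while the PyTorch model returns raw logits and `nn.CrossEntropyLoss` applies log-softmax internally to integer class labels. The NumPy sketch below, using made-up logits and labels, suggests why the two formulations give the same loss value. The helper names are illustrative and not part of either library.

```python
import numpy as np

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])  # hypothetical raw model outputs
labels = np.array([0, 2])             # integer class labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def keras_style_loss(probs, one_hot):
    # Cross-entropy between predicted probabilities and one-hot targets,
    # averaged over the batch.
    return float(-np.mean(np.sum(one_hot * np.log(probs), axis=1)))

def pytorch_style_loss(raw_logits, class_ids):
    # Log-softmax of the logits, then the negative log-likelihood of the
    # true class, averaged over the batch.
    shifted = raw_logits - raw_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(class_ids)), class_ids]))

loss_a = keras_style_loss(softmax(logits), np.eye(3)[labels])
loss_b = pytorch_style_loss(logits, labels)
```

In practice this means a softmax layer should not be added to the PyTorch model before `nn.CrossEntropyLoss`. On the Keras side, `CategoricalCrossentropy(from_logits=True)` is the option for a model without a final softmax.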
