Building a Simple Neural Network with nn.Module

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim

# Custom Neural Network Class
class MySimpleNN(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(num_features, 16),  # First hidden layer
            nn.ReLU(),                    # Activation function
            nn.Linear(16, 2)              # Output layer (for binary classification)
        )

    def forward(self, x):
        return self.network(x)


I first created a custom neural network class by inheriting from nn.Module.

Inside the constructor __init__, I added two fully connected layers (nn.Linear). The first one transforms the input features to a hidden layer, and the second one maps from that hidden layer to the final output layer.
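
To make the two nn.Linear layers more concrete, here is a small sketch that lists their parameter shapes. It assumes 5 input features, as in the cells further down; the variable names net, shapes and total_params are only for illustration.

```python
import torch.nn as nn

# Same layer sizes as the network above, assuming 5 input features.
net = nn.Sequential(
    nn.Linear(5, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
)

# nn.Linear stores its weight as (out_features, in_features).
shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
total_params = sum(p.numel() for p in net.parameters())

print(shapes)
print(total_params)
```

The first layer holds 5 * 16 weights plus 16 biases, and the second holds 16 * 2 weights plus 2 biases. ReLU has no parameters of its own.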

I also added a ReLU activation function between the two layers to introduce non-linearity into the model.
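
The reason the activation matters can be checked directly: without it, two stacked linear layers collapse into a single linear layer. The sketch below is only an illustration with made-up names (fc1, fc2, merged_weight, merged_bias) and is not part of the model above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
fc1 = nn.Linear(5, 16)
fc2 = nn.Linear(16, 2)
x = torch.rand(4, 5)

with torch.no_grad():
    # Two linear layers with nothing in between ...
    stacked = fc2(fc1(x))

    # ... give the same result as one linear layer with merged parameters.
    merged_weight = fc2.weight @ fc1.weight          # shape (2, 5)
    merged_bias = fc2.weight @ fc1.bias + fc2.bias   # shape (2,)
    merged = x @ merged_weight.T + merged_bias

    # With ReLU in between, the merged layer is no longer equivalent in general.
    with_relu = fc2(torch.relu(fc1(x)))

same_without_relu = torch.allclose(stacked, merged, atol=1e-5)
same_with_relu = torch.allclose(with_relu, merged, atol=1e-5)
print(same_without_relu, same_with_relu)
```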

Then, I implemented the forward() method, which defines how the input data flows through the network layers during the forward pass.
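
As a quick check of the forward pass, the sketch below sends a small batch through a network with the same layout. TinyNet is a hypothetical stand-in name so the snippet runs on its own.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical name, same layout as the class above
    def __init__(self, num_features):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(num_features, 16), nn.ReLU(), nn.Linear(16, 2)
        )

    def forward(self, x):
        return self.network(x)

net = TinyNet(num_features=5)
batch = torch.rand(4, 5)   # 4 samples, 5 features

# Calling the module runs forward() through nn.Module.__call__.
logits = net(batch)
print(logits.shape)        # one row of 2 class scores per sample
```

Calling net(batch) rather than net.forward(batch) is the usual convention, because the call also runs any hooks registered on the module.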

In [None]:
# Generate random input data
X = torch.rand(100, 5)                # 100 samples, 5 features
y = torch.randint(0, 2, (100,))       # 100 binary labels (0 or 1)


In [None]:
# Create model instance
model = MySimpleNN(num_features=5)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()               # suitable for classification
optimizer = optim.SGD(model.parameters(), lr=0.1)


In [None]:
# Training the model
epochs = 20
for epoch in range(epochs):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print loss
    if (epoch+1) % 5 == 0 or epoch == 0:
        print(f'Loss: {loss.item():.4f}')


Loss: 0.6810
Loss: 0.6804
Loss: 0.6798
Loss: 0.6794
Loss: 0.6790


In [None]:
# Evaluation: Get predictions
with torch.no_grad():
    test_outputs = model(X)
    _, predicted = torch.max(test_outputs, 1)
    accuracy = (predicted == y).sum().item() / y.size(0)

print(f'Accuracy: {accuracy*100:.2f}%')


Accuracy: 58.00%


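Instead of using a real-world dataset like MNIST, I created a random dataset. Because the labels are random, there is no real pattern to learn, which is why the loss barely moves and the accuracy stays close to chance.

I generated the data directly as PyTorch tensors with torch.rand and torch.randint, so it could be passed to the model without any conversion.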

For the loss function, I used CrossEntropyLoss, which is suitable for classification tasks.
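
To show what CrossEntropyLoss computes, here is a small sketch with hand-picked logits and targets (both made up for illustration). It expects raw scores rather than probabilities, plus integer class indices, and with its default settings it equals the mean negative log-probability of the correct class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5], [0.1, 1.2], [1.5, 1.5]])  # raw scores
targets = torch.tensor([0, 1, 0])                            # class indices

loss = nn.CrossEntropyLoss()(logits, targets)

# Same value computed by hand: log-softmax, then pick the true class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(3), targets].mean()

print(loss.item(), manual.item())
```

This is also why the model above ends with a plain nn.Linear and no softmax layer.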

For optimization, I used Stochastic Gradient Descent (SGD) through torch.optim.SGD, and trained the model using a loop that runs for several epochs.

During each epoch, I passed data through the model, calculated the loss, performed backpropagation, and updated the weights.
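
The update in a single step can be checked by hand. This sketch assumes plain SGD as configured above (no momentum, no weight decay) and uses a made-up single-layer model named layer.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
layer = nn.Linear(3, 2)
optimizer = optim.SGD(layer.parameters(), lr=0.1)
x, y = torch.rand(8, 3), torch.randint(0, 2, (8,))

loss = nn.CrossEntropyLoss()(layer(x), y)
optimizer.zero_grad()   # clear gradients left over from earlier steps
loss.backward()         # fill .grad for every parameter

# Plain SGD rule: w_new = w - lr * grad
expected = layer.weight.detach() - 0.1 * layer.weight.grad
optimizer.step()

matches = torch.allclose(layer.weight.detach(), expected)
print(matches)
```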

Custom Dataset and DataLoader

In [None]:
from torch.utils.data import Dataset, DataLoader
from sklearn.datasets import make_classification

In [None]:
X, y = make_classification(
    n_samples=10,
    n_features=2,
    n_classes=2,
    n_informative=2,    # Number of informative features
    n_redundant=0,      # Number of redundant features
    random_state=42,
)

In [None]:
# Convert the data to PyTorch tensors
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.long)

In [None]:
X

tensor([[ 1.0683, -0.9701],
        [-1.1402, -0.8388],
        [-2.8954,  1.9769],
        [-0.7206, -0.9606],
        [-1.9629, -0.9923],
        [-0.9382, -0.5430],
        [ 1.7273, -1.1858],
        [ 1.7774,  1.5116],
        [ 1.8997,  0.8344],
        [-0.5872, -1.9717]])

In [None]:
y.shape

torch.Size([10])


In [None]:
class CustomDataset(Dataset):
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return self.features.shape[0]

    def __getitem__(self, index):
        return self.features[index], self.labels[index]

I defined a class called CustomDataset that inherited from torch.utils.data.Dataset.

Inside this class, I implemented the __init__ method to accept the features and labels, and the __len__ method to return the size of the dataset.

I implemented the __getitem__() method to return a single sample and its label when given an index.
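
The sketch below shows how these methods are reached through ordinary Python syntax. PairDataset is a hypothetical stand-in with the same three methods, and the tiny tensors are made up for illustration.

```python
import torch
from torch.utils.data import Dataset

class PairDataset(Dataset):  # hypothetical name, same methods as above
    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return self.features.shape[0]

    def __getitem__(self, index):
        return self.features[index], self.labels[index]

features = torch.arange(6, dtype=torch.float32).reshape(3, 2)
labels = torch.tensor([0, 1, 0])
ds = PairDataset(features, labels)

n = len(ds)              # calls __len__
sample, label = ds[1]    # calls __getitem__(1)
print(n, sample, label)
```

DataLoader relies on exactly these two calls: len() to know how many indices exist and indexing to fetch each sample.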

In [None]:
dataset = CustomDataset(X, y)

In [None]:
len(dataset)

10

In [None]:
dataloader=DataLoader(dataset, batch_size=32, shuffle=True)

In [None]:
for batch_features, batch_labels in dataloader:
  print(batch_features.shape)
  print(batch_labels.shape)

torch.Size([10, 2])
torch.Size([10])


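After creating the custom dataset object, I wrapped it in a DataLoader with batch_size=32 and shuffle=True so that samples are drawn in random order and grouped into batches during training. I then looped through the DataLoader and printed the shape of each batch. Because the dataset has only 10 samples, fewer than the batch size, the loop produced a single batch of shape [10, 2]. This confirmed that the DataLoader returns features and labels together, but it did not exercise multi-batch behavior.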
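
To see more than one batch, the batch size has to be smaller than the dataset. The sketch below assumes batch_size=4 and uses TensorDataset as a stand-in for the custom dataset, with random data made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

features = torch.rand(10, 2)
labels = torch.randint(0, 2, (10,))
ds = TensorDataset(features, labels)

loader = DataLoader(ds, batch_size=4, shuffle=True)
batch_sizes = [batch_features.shape[0] for batch_features, _ in loader]
print(batch_sizes)   # the last batch is smaller; drop_last=True would skip it
```

With 10 samples and a batch size of 4, the loader yields two full batches and one batch with the remaining 2 samples.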

In [None]:
from torch.utils.data import random_split

# 80% train, 20% test
train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size

train_dataset, test_dataset = random_split(dataset, [train_size, test_size])


In [None]:
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)


In [None]:
class MyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 16),
            nn.ReLU(),
            nn.Linear(16, 2)
        )

    def forward(self, x):
        return self.net(x)


In [None]:
model = MyClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

epochs = 20
for epoch in range(epochs):
    model.train()
    total_loss = 0
    for features, labels in train_loader:
        outputs = model(features)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    print(f" Loss: {total_loss:.4f}")



 Loss: 0.7230
 Loss: 0.7147
 Loss: 0.7066
 Loss: 0.6989
 Loss: 0.6915
 Loss: 0.6842
 Loss: 0.6772
 Loss: 0.6704
 Loss: 0.6638
 Loss: 0.6575
 Loss: 0.6513
 Loss: 0.6453
 Loss: 0.6394
 Loss: 0.6338
 Loss: 0.6283
 Loss: 0.6229
 Loss: 0.6177
 Loss: 0.6126
 Loss: 0.6077
 Loss: 0.6029


In [None]:
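from sklearn.metrics import classification_report

# Collect predictions and true labels over the test set
model.eval()
all_preds, all_labels = [], []
with torch.no_grad():
    for features, labels in test_loader:
        preds = model(features).argmax(dim=1)
        all_preds.extend(preds.tolist())
        all_labels.extend(labels.tolist())

print(classification_report(
    all_labels,
    all_preds,
    labels=[0, 1],                         # Explicit class labels
    target_names=["Class 0", "Class 1"],  # Same order
    zero_division=0                       # Avoid divide-by-zero warning
))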


              precision    recall  f1-score   support

     Class 0       1.00      1.00      1.00         2
     Class 1       0.00      0.00      0.00         0

    accuracy                           1.00         2
   macro avg       0.50      0.50      0.50         2
weighted avg       1.00      1.00      1.00         2



After training the model and setting up the data pipeline, I moved on to evaluating the model.

First, I split the dataset into training and testing sets using random_split.
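
Since random_split draws a different split on every run, results on a dataset this small can vary a lot. One option, sketched here as an assumption rather than what the notebook does, is to pass a seeded torch.Generator. TensorDataset and the seed 42 are stand-ins for illustration.

```python
import torch
from torch.utils.data import TensorDataset, random_split

ds = TensorDataset(torch.rand(10, 2), torch.randint(0, 2, (10,)))
train_size = int(0.8 * len(ds))
test_size = len(ds) - train_size

gen = torch.Generator().manual_seed(42)
train_ds, test_ds = random_split(ds, [train_size, test_size], generator=gen)

# The two subsets partition the original indices with no overlap.
overlap = set(train_ds.indices) & set(test_ds.indices)
print(len(train_ds), len(test_ds), overlap)
```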

I used the trained model to predict labels on the test set, and compared these predictions with the true labels to calculate accuracy.
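
The accuracy calculation itself is short. This sketch uses hand-written logits and labels (made up for illustration) and takes the index of the largest score in each row as the predicted class.

```python
import torch

logits = torch.tensor([[2.0, 0.1], [0.3, 1.5], [1.0, 0.2], [0.4, 0.9]])
labels = torch.tensor([0, 1, 1, 1])

preds = logits.argmax(dim=1)   # index of the largest score per row
accuracy = (preds == labels).float().mean().item()
print(preds.tolist(), accuracy)
```

Here the third sample is misclassified, so three of four predictions are correct.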

To evaluate the model in more depth, I used classification_report from sklearn, which gives additional metrics like precision, recall, and F1 score.

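However, because the dataset was small, the test set sometimes contained only one class. In that case classification_report() failed, because the two entries in target_names no longer matched the single class present. I solved this by explicitly passing labels=[0, 1], so the report always covers both classes. A class with no test samples then shows a support of 0, and zero_division=0 sets its metrics to 0.00.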
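
The single-class case can be reproduced in isolation. The sketch below uses made-up labels where only class 0 appears, and adds output_dict=True (not used in the notebook) so the values can be inspected programmatically.

```python
from sklearn.metrics import classification_report

y_true = [0, 0]   # a test split that happens to contain only class 0
y_pred = [0, 0]

report = classification_report(
    y_true,
    y_pred,
    labels=[0, 1],                         # keep both classes in the report
    target_names=["Class 0", "Class 1"],
    zero_division=0,
    output_dict=True,
)
print(report["Class 0"], report["Class 1"])
```

Note that a perfect score on a two-sample test set says very little about the model; a larger dataset or a stratified split would give more meaningful numbers.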

Hyperparameter Tuning

I tried different learning rates (like 0.01 and 0.001) by changing the optimizer settings. A smaller learning rate made the model learn more slowly but more steadily, while a larger learning rate made it learn faster but risked overshooting the minimum.
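
The trade-off is easy to see on a toy problem. This sketch runs gradient descent on f(w) = w**2 instead of the network; the function descend and the three learning rates are chosen only for illustration.

```python
def descend(lr, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2 * w."""
    for _ in range(steps):
        w = w - lr * 2 * w
    return w

small = descend(lr=0.01)   # shrinks slowly toward the minimum at 0
medium = descend(lr=0.1)   # gets much closer in the same number of steps
large = descend(lr=1.1)    # every step overshoots and |w| grows

print(small, medium, large)
```

Each step multiplies w by (1 - 2 * lr), so the iterate converges only when that factor has magnitude below 1.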

I also experimented with different batch sizes like 32 and 64 to see how they affected convergence.
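
Batch size changes how many weight updates happen per epoch. The sketch below counts batches for a training set of 8 samples, the size produced by the 80/20 split above; TensorDataset and the extra batch sizes 2 and 4 are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

train_ds = TensorDataset(torch.rand(8, 2), torch.randint(0, 2, (8,)))

# len(loader) is the number of batches, i.e. optimizer steps per epoch.
steps = {bs: len(DataLoader(train_ds, batch_size=bs)) for bs in (2, 4, 32, 64)}
print(steps)
```

With only 8 training samples, batch sizes 32 and 64 both produce a single batch per epoch, so they behave identically here. A visible difference needs a dataset larger than the batch sizes being compared.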


Finally, I tried training for different numbers of epochs (e.g., 10, 20, 30) to observe the loss curve and accuracy trends over time.
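
One way to compare epoch counts without retraining is to record the loss at every epoch and read off the values of interest. This is a minimal sketch on synthetic data with a made-up labeling rule, using full-batch updates; it is not the notebook's exact setup.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
X = torch.rand(100, 2)
y = (X[:, 0] > X[:, 1]).long()   # synthetic labels, illustration only

model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

history = []
for epoch in range(30):
    loss = criterion(model(X), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    history.append(loss.item())

# Loss after 10, 20 and 30 epochs, taken from a single run.
checkpoints = {n: history[n - 1] for n in (10, 20, 30)}
print(checkpoints)
```

The history list can also be plotted to view the loss curve over time.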