# Quick Start: Your First Neural Network

**Course:** AI Skills Hub  
**Lesson:** Quick Start Tutorial  
**Platform:** AI Skills Hub  
**License:** MIT  
**GPU Required:** Optional (will run on CPU)  
**Estimated Runtime:** 5-10 minutes

---

## Overview

In this quick start tutorial, you'll:
- Build a simple neural network in PyTorch
- Understand the basic components of a neural network
- Train a model on the MNIST dataset
- Make predictions on handwritten digits

No prior deep learning experience required!

## Step 1: Install Dependencies

First, let's make sure we have all the required libraries.

In [None]:
# Install required packages (uncomment if needed)
# !pip install torch torchvision matplotlib

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np

print(f"PyTorch Version: {torch.__version__}")
print(f"GPU Available: {torch.cuda.is_available()}")

## Step 2: Load the MNIST Dataset

MNIST is a dataset of 70,000 handwritten digit images (0-9), each 28x28 grayscale pixels, split into 60,000 training and 10,000 test samples. It's perfect for learning!

In [None]:
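# Define data transformations: convert images to tensors in [0, 1],
# then normalize with the MNIST mean (0.1307) and standard deviation (0.3081)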
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Download and load the training and test data
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create data loaders
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

print(f"Training samples: {len(train_dataset)}")
print(f"Test samples: {len(test_dataset)}")
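
To make the numbers in this step concrete, here is a small pure-Python sketch of what `transforms.Normalize((0.1307,), (0.3081,))` does to a single pixel, and of how many batches each loader yields. It assumes `ToTensor()` has already scaled pixels to [0, 1] and that the `DataLoader` keeps the last partial batch (its default). The helper names `normalize_pixel` and `num_batches` are illustrative only, not part of PyTorch.

```python
import math

# Constants passed to transforms.Normalize in this step
MNIST_MEAN, MNIST_STD = 0.1307, 0.3081

def normalize_pixel(value):
    """Map a pixel in [0, 1] to its normalized value: (value - mean) / std."""
    return (value - MNIST_MEAN) / MNIST_STD

def num_batches(num_samples, batch_size):
    """Batches per epoch when the last partial batch is kept (hypothetical helper)."""
    return math.ceil(num_samples / batch_size)

darkest = normalize_pixel(0.0)            # black background pixel
brightest = normalize_pixel(1.0)          # fully white stroke pixel
train_batches = num_batches(60000, 64)    # 937 full batches plus one partial batch
test_batches = num_batches(10000, 1000)

print(f"Normalized range: {darkest:.2f} to {brightest:.2f}")
print(f"Batches per epoch: train={train_batches}, test={test_batches}")
```

Under these assumptions the normalized inputs range from roughly -0.42 to 2.82, and one training epoch consists of 938 batches.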

## Step 3: Visualize Some Examples

Let's look at a few samples from the training set.

In [None]:
# Get a batch of training data
examples = iter(train_loader)
example_data, example_targets = next(examples)

# Plot the first 6 examples
fig, axes = plt.subplots(2, 3, figsize=(10, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(example_data[i].squeeze(), cmap='gray')
    ax.set_title(f"Label: {example_targets[i].item()}")
    ax.axis('off')
plt.tight_layout()
plt.show()

## Step 4: Define the Neural Network

Here's where the magic happens! We'll create a simple fully connected network: 784 input pixels, one hidden layer of 128 units with ReLU, and 10 outputs (one score per digit).

In [None]:
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Hidden layer: 28x28 = 784 input pixels -> 128 units
        self.fc1 = nn.Linear(784, 128)
        self.relu = nn.ReLU()
        # Output layer: 128 units -> 10 classes (digits 0-9)
        self.fc2 = nn.Linear(128, 10)
    
    def forward(self, x):
        # Flatten each 1x28x28 image into a vector of 784 values
        x = x.view(-1, 784)
        # Pass through layers
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Create the model
model = SimpleNN()
print(model)

# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
print(f"\nUsing device: {device}")
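
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below is plain Python arithmetic, and `linear_param_count` is a hypothetical helper rather than a PyTorch function. It relies on the fact that a fully connected layer has one weight per input-output pair plus one bias per output.

```python
def linear_param_count(in_features, out_features):
    """Weights (in * out) plus biases (out) of one fully connected layer."""
    return in_features * out_features + out_features

fc1_params = linear_param_count(784, 128)  # hidden layer
fc2_params = linear_param_count(128, 10)   # output layer
total_params = fc1_params + fc2_params

print(f"fc1: {fc1_params:,}  fc2: {fc2_params:,}  total: {total_params:,}")
```

The total of 101,770 should match what `sum(p.numel() for p in model.parameters())` reports for the model defined in this step.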

## Step 5: Define Loss Function and Optimizer

The loss function measures how far the model's predictions are from the true labels, and the optimizer updates the weights to reduce that loss.

In [None]:
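# Loss function: CrossEntropyLoss for classification.
# It expects raw scores (logits) and applies log-softmax internally,
# which is why the model has no softmax layer.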
criterion = nn.CrossEntropyLoss()

# Optimizer: Adam (a popular choice)
optimizer = optim.Adam(model.parameters(), lr=0.001)

print("Loss function and optimizer ready!")
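
To see what the loss measures, here is a minimal pure-Python sketch of cross-entropy for a single sample: the negative log of the softmax probability assigned to the correct class. The function name `cross_entropy` and the example logits are made up for illustration. `nn.CrossEntropyLoss` computes the same quantity per sample and, by default, averages it over the batch.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]

# The correct class (index 0) has the highest score, so the loss is small
confident_loss = cross_entropy([2.0, 1.0, 0.1], target=0)

# All 10 classes score the same, as with a model that has learned nothing
uniform_loss = cross_entropy([0.0] * 10, target=3)

print(f"Confident and correct: {confident_loss:.4f}")
print(f"Uniform guess over 10 classes: {uniform_loss:.4f}")
```

The uniform case equals ln(10), about 2.30, which is a useful reference point: a freshly initialized 10-class model typically starts near that value, and training should push the loss well below it.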

## Step 6: Train the Model

Now let's train our neural network! We'll train for just 3 epochs (3 full passes over the 60,000 training images) to keep it quick.

In [None]:
# Training function
def train(model, device, train_loader, optimizer, criterion, epoch):
    model.train()
    total_loss = 0
    correct = 0
    
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        
        # Zero the gradients
        optimizer.zero_grad()
        
        # Forward pass
        output = model(data)
        loss = criterion(output, target)
        
        # Backward pass and optimize
        loss.backward()
        optimizer.step()
        
        # Track metrics
        total_loss += loss.item()
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()
        
        # Print progress every 100 batches
        if batch_idx % 100 == 0:
            print(f'Epoch {epoch}: [{batch_idx * len(data)}/{len(train_loader.dataset)} '
                  f'({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')
    
    avg_loss = total_loss / len(train_loader)
    accuracy = 100. * correct / len(train_loader.dataset)
    print(f'\nEpoch {epoch} Summary: Avg Loss: {avg_loss:.4f}, Accuracy: {accuracy:.2f}%\n')
    return avg_loss, accuracy

# Train for 3 epochs
num_epochs = 3
train_losses = []
train_accuracies = []

for epoch in range(1, num_epochs + 1):
    loss, acc = train(model, device, train_loader, optimizer, criterion, epoch)
    train_losses.append(loss)
    train_accuracies.append(acc)
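
The forward pass, backward pass, and optimizer step in the loop above can be hard to picture on a 100,000-parameter model. The sketch below shows the same three stages on a toy problem with a single weight, in plain Python. It uses ordinary gradient descent rather than Adam, and the function `fit_slope` and its data are hypothetical, chosen only to keep the arithmetic visible.

```python
def fit_slope(xs, ys, lr=0.1, steps=50):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        # "Forward pass": prediction errors for the current weight
        errors = [w * x - y for x, y in zip(xs, ys)]
        # "Backward pass": d/dw mean(error^2) = mean(2 * error * x)
        grad = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)
        # "Optimizer step": plain gradient descent (Adam adapts the step per parameter)
        w -= lr * grad
    return w

# Data generated from y = 2x, so the learned weight should approach 2
learned_w = fit_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(f"Learned weight: {learned_w:.4f}")
```

In the PyTorch loop, `loss.backward()` computes the gradient for every parameter automatically and `optimizer.step()` applies the update, so none of the derivatives need to be written by hand.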

## Step 7: Evaluate on Test Set

Let's see how well our model performs on unseen data!

In [None]:
def test(model, device, test_loader, criterion):
    model.eval()
    test_loss = 0
    correct = 0
    
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    
    test_loss /= len(test_loader)
    accuracy = 100. * correct / len(test_loader.dataset)
    
    print(f'Test Set: Average loss: {test_loss:.4f}, '
          f'Accuracy: {correct}/{len(test_loader.dataset)} ({accuracy:.2f}%)\n')
    return test_loss, accuracy

# Test the model
test_loss, test_accuracy = test(model, device, test_loader, criterion)
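
The accuracy figure comes from two operations: take the index of the largest score in each row (`argmax`), then count how often it equals the label. Here is that logic as a small pure-Python sketch on a made-up batch of four samples with three classes. The name `accuracy_percent` and the numbers are illustrative only.

```python
def accuracy_percent(logit_rows, targets):
    """Percentage of rows whose highest-scoring class matches the target."""
    correct = 0
    for logits, target in zip(logit_rows, targets):
        predicted = max(range(len(logits)), key=lambda i: logits[i])  # argmax
        correct += int(predicted == target)
    return 100.0 * correct / len(targets)

batch_logits = [
    [0.1, 2.0, -1.0],  # predicts class 1
    [1.5, 0.2, 0.3],   # predicts class 0
    [0.0, 0.1, 3.0],   # predicts class 2
    [0.9, 0.8, 0.7],   # predicts class 0, but the label is 1
]
batch_targets = [1, 0, 2, 1]

acc = accuracy_percent(batch_logits, batch_targets)
print(f"Accuracy: {acc:.2f}%")
```

Three of the four predictions match, giving 75%. The `test` function does the same thing with tensors, accumulating `correct` across batches before dividing by the dataset size.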

## Step 8: Make Predictions

Let's see our model in action on some test examples!

In [None]:
# Get some test examples
examples = iter(test_loader)
example_data, example_targets = next(examples)

# Make predictions
model.eval()
with torch.no_grad():
    output = model(example_data.to(device))
    predictions = output.argmax(dim=1, keepdim=True)

# Plot results
fig, axes = plt.subplots(2, 3, figsize=(12, 8))
for i, ax in enumerate(axes.flat):
    ax.imshow(example_data[i].squeeze(), cmap='gray')
    pred = predictions[i].item()
    true = example_targets[i].item()
    color = 'green' if pred == true else 'red'
    ax.set_title(f"Pred: {pred}, True: {true}", color=color, fontsize=14, fontweight='bold')
    ax.axis('off')
plt.tight_layout()
plt.show()

print("Green = Correct Prediction, Red = Wrong Prediction")
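
The model outputs raw scores, so the plots show which digit won but not by how much. One way to get a confidence value is to pass the scores through softmax, sketched below in pure Python. The helper `predict_with_confidence` and the example scores are assumptions for illustration; on real model output, `torch.softmax(output, dim=1)` performs the same conversion.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(logits):
    """Return the most likely class and its softmax probability."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=lambda i: probs[i])
    return label, probs[label]

# Made-up scores for one image: class 3 stands out from the other nine
example_logits = [0.0] * 10
example_logits[3] = 5.0

label, confidence = predict_with_confidence(example_logits)
print(f"Predicted digit: {label} (confidence {confidence:.1%})")
```

Keep in mind that a high softmax probability is not a guarantee of correctness; the model can be confidently wrong on unusual handwriting.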

## Congratulations!

You've just built, trained, and tested your first neural network! Here's what you learned:

1. **Loading data** with PyTorch datasets and DataLoaders
2. **Building a neural network** with Linear layers and ReLU activation
3. **Training** using forward pass, loss calculation, and backpropagation
4. **Evaluation** on a test set
5. **Making predictions** on new data

## Next Steps

Ready to learn more? Check out:

- **[Foundation Track](https://rajgupt.github.io/ai-for-builders/courses/foundation/)** - Deep dive into Python, Math, and ML fundamentals
- **[Core Track](https://rajgupt.github.io/ai-for-builders/courses/core/)** - Learn CNNs, RNNs, and advanced architectures
- **[Projects](https://rajgupt.github.io/ai-for-builders/projects/)** - Build portfolio-worthy AI projects

## Try These Challenges

1. **Add more layers:** Add one or two more hidden layers to the network
2. **Change activation:** Try different activation functions (Tanh, Sigmoid)
3. **Tune hyperparameters:** Experiment with learning rate, batch size, epochs
4. **Add dropout:** Prevent overfitting with dropout layers
5. **Use CNNs:** Replace linear layers with convolutional layers
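
As a possible starting point for challenges 1 and 4, here is a minimal sketch of a deeper variant with dropout. The class name `DeeperNN`, the layer sizes, and the dropout rate are arbitrary choices for illustration, not tuned or recommended values.

```python
import torch
import torch.nn as nn

class DeeperNN(nn.Module):
    """Hypothetical variant of SimpleNN with a second hidden layer and dropout."""

    def __init__(self, hidden1=256, hidden2=128, p_drop=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                # 1x28x28 image -> 784 values
            nn.Linear(784, hidden1),
            nn.ReLU(),
            nn.Dropout(p_drop),          # randomly zeroes activations during training
            nn.Linear(hidden1, hidden2),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden2, 10),      # raw scores for digits 0-9
        )

    def forward(self, x):
        return self.net(x)

# Check the output shape with a dummy batch of 4 blank images
deeper_model = DeeperNN()
dummy_batch = torch.zeros(4, 1, 28, 28)
logits = deeper_model(dummy_batch)
print(logits.shape)
```

Dropout is only active in training mode, so the existing `model.train()` and `model.eval()` calls in the training and test functions matter more for this model than for the original one.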

---

**Questions?** Join our [community discussions](https://github.com/rajgupt/ai-for-builders/discussions)

**Found this helpful?** Star our [GitHub repository](https://github.com/rajgupt/ai-for-builders)