Exercise Case Study Notebook: Deep Learning Fundamentals

1. Problem and Objective:
   - Dataset: CIFAR-10, a multi-class image classification dataset (10 classes, 50,000 training and 10,000 test images, 32x32 RGB)
   - Goal: Implement and compare various neural network architectures


2. Data Loading:

In [None]:
import requests

# URLs of the files (note: these CSVs are not referenced by the CIFAR-10 cells in this notebook)
train_data_url = 'https://www.raphaelcousin.com/modules/module4/course/module5_course_handling_duplicate_train.csv'
test_data_url = 'https://www.raphaelcousin.com/modules/module4/course/module5_course_handling_duplicate_test.csv'

# Function to download a file
def download_file(url, file_name):
    response = requests.get(url, timeout=30)  # Fail instead of hanging on a stalled connection
    response.raise_for_status()  # Ensure we notice bad responses
    with open(file_name, 'wb') as file:
        file.write(response.content)
    print(f'Downloaded {file_name} from {url}')

# Downloading the files
download_file(train_data_url, 'module5_course_handling_duplicate_train.csv')
download_file(test_data_url, 'module5_course_handling_duplicate_test.csv')

In [None]:
import torch
import torchvision
import torchvision.transforms as transforms

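# Load and preprocess the dataset: convert images to tensors,
# then scale each of the three RGB channels from [0, 1] to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])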
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)

print(f"Dataset size: {len(trainset)}")
print(f"Number of classes: {len(trainset.classes)}")


3. Deep Learning Tasks:

a. Feedforward Neural Network:
   - Task: Implement a simple feedforward network for image classification
   - Question: How does the network's depth affect its performance?
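
One way to set up the depth experiment is a small factory that builds an MLP with a configurable number of hidden layers. This is a minimal sketch, not a reference solution: `make_mlp`, the hidden width of 256, and the random stand-in batch are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

def make_mlp(depth, in_features=3 * 32 * 32, hidden=256, num_classes=10):
    """Build an MLP with `depth` hidden layers (hypothetical helper)."""
    layers = [nn.Flatten()]
    width = in_features
    for _ in range(depth):
        layers += [nn.Linear(width, hidden), nn.ReLU()]
        width = hidden
    layers.append(nn.Linear(width, num_classes))
    return nn.Sequential(*layers)

# Random stand-in for one CIFAR-10 batch: 32 images, 3 channels, 32x32 pixels.
images = torch.randn(32, 3, 32, 32)
for depth in (1, 2, 4):
    model = make_mlp(depth)
    logits = model(images)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"depth={depth}: logits {tuple(logits.shape)}, {n_params:,} parameters")
```

Training each variant with the same optimizer and number of epochs, then comparing validation accuracy, gives a first answer to the depth question.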

b. Backpropagation and Optimization:
   - Task: Implement backpropagation from scratch for a simple network
   - Question: Compare the performance of SGD, Adam, and RMSprop optimizers
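
A from-scratch backward pass can be checked against autograd before it is used for training. The sketch below assumes a tiny two-layer network (linear, ReLU, linear) with a mean squared error loss on random data; the shapes and variable names are illustrative only.

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 5)            # batch of 8 inputs with 5 features
y = torch.randn(8, 3)            # regression targets
W1 = torch.randn(5, 4, requires_grad=True)
W2 = torch.randn(4, 3, requires_grad=True)

# Forward pass: linear -> ReLU -> linear -> mean squared error.
z1 = x @ W1
h = torch.relu(z1)
pred = h @ W2
loss = ((pred - y) ** 2).mean()

# Manual backward pass via the chain rule, outside of autograd.
with torch.no_grad():
    d_pred = 2.0 * (pred - y) / pred.numel()   # dL/dpred (mean over all elements)
    grad_W2 = h.T @ d_pred                     # dL/dW2
    d_h = d_pred @ W2.T                        # dL/dh
    d_z1 = d_h * (z1 > 0).float()              # ReLU passes gradient only where z1 > 0
    grad_W1 = x.T @ d_z1                       # dL/dW1

# Reference gradients from autograd.
loss.backward()
print("W1 gradients match:", torch.allclose(grad_W1, W1.grad, atol=1e-5))
print("W2 gradients match:", torch.allclose(grad_W2, W2.grad, atol=1e-5))
```

Once the manual gradients agree with autograd, a plain SGD update is `W -= lr * grad_W` for each weight matrix.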

c. Automatic Differentiation:
   - Task: Use PyTorch's autograd to compute gradients
   - Question: How does autograd simplify the implementation of custom layers?
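
To see what autograd saves you, consider a custom layer that defines only a forward pass. `ScaledTanh` below is a hypothetical layer invented for this sketch; the analytic gradients are computed by hand purely to confirm what `backward()` produces.

```python
import torch
import torch.nn as nn

class ScaledTanh(nn.Module):
    """Hypothetical custom layer: y = scale * tanh(x) with a learnable scale."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(1.5))

    def forward(self, x):
        # Only the forward computation is written; autograd derives the backward pass.
        return self.scale * torch.tanh(x)

layer = ScaledTanh()
x = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)
layer(x).sum().backward()

# Hand-derived gradients for comparison.
expected_dx = layer.scale.detach() * (1 - torch.tanh(x.detach()) ** 2)
expected_dscale = torch.tanh(x.detach()).sum()
print("input gradient matches:", torch.allclose(x.grad, expected_dx))
print("scale gradient matches:", torch.allclose(layer.scale.grad, expected_dscale))
```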

d. Activation Functions:
   - Task: Experiment with different activation functions (ReLU, Leaky ReLU, ELU)
   - Question: Analyze the impact of activation functions on training dynamics
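
Before running full training experiments, it can help to inspect how each activation treats negative inputs, since that is where the three differ. The sketch below uses a handful of sample values and the default-style slopes (`0.01` for Leaky ReLU, `alpha=1.0` for ELU); these settings are assumptions, not prescribed values.

```python
import torch
import torch.nn as nn

activations = {
    "ReLU": nn.ReLU(),
    "LeakyReLU": nn.LeakyReLU(0.01),
    "ELU": nn.ELU(alpha=1.0),
}
x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)

grads = {}
for name, act in activations.items():
    x.grad = None                      # reset the gradient between activations
    out = act(x)
    out.sum().backward()
    grads[name] = x.grad.clone()
    print(f"{name:10s} output={out.detach().tolist()} grad={grads[name].tolist()}")
```

ReLU has zero gradient for negative inputs, while Leaky ReLU and ELU keep a small nonzero gradient there, which is one factor to look for when comparing training curves.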

e. Regularization Techniques:
   - Task: Apply dropout and batch normalization to your model
   - Question: How do these techniques affect training time and final performance?
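
Both dropout and batch normalization behave differently in training and evaluation mode, which is easy to overlook. The following sketch, assuming a small MLP block and random stand-in images, shows the difference; the layer sizes and `p=0.5` are illustrative choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
block = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 128),
    nn.BatchNorm1d(128),   # normalizes with batch statistics in train mode
    nn.ReLU(),
    nn.Dropout(p=0.5),     # randomly zeroes activations in train mode
    nn.Linear(128, 10),
)
images = torch.randn(16, 3, 32, 32)

block.train()
train_a, train_b = block(images), block(images)    # different dropout masks each call

block.eval()
with torch.no_grad():
    eval_a, eval_b = block(images), block(images)  # dropout off, running statistics used

print("train outputs identical:", torch.equal(train_a, train_b))
print("eval outputs identical: ", torch.equal(eval_a, eval_b))
```

Forgetting to call `model.eval()` before validation is a common source of noisy or misleading metrics.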

f. Convolutional Neural Networks:
   - Task: Implement a CNN for the image classification task
   - Question: Compare the CNN's performance with the feedforward network
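
A possible starting point for the CNN is two convolution and pooling stages followed by a linear classifier. `SmallCNN` and its channel counts are assumptions made for this sketch; the input is a random tensor with the CIFAR-10 image shape.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Hypothetical two-stage CNN for 3x32x32 inputs."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

cnn = SmallCNN()
logits = cnn(torch.randn(4, 3, 32, 32))
n_params = sum(p.numel() for p in cnn.parameters())
print(f"logits shape: {tuple(logits.shape)}, parameters: {n_params:,}")
```

Comparing the parameter count with that of your feedforward network is a useful companion to the accuracy comparison, because weight sharing lets the CNN use far fewer parameters per layer.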

g. Recurrent Neural Networks:
   - Task: Implement an LSTM for a sequence prediction task
   - Question: How does the LSTM handle long-term dependencies compared to a simple RNN?
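
The notebook does not specify a sequence dataset, so the sketch below assumes a toy task: predicting the next value of a sine wave from the previous 20 values. `SeqRegressor` is a hypothetical wrapper that can switch between `nn.LSTM` and `nn.RNN` so the two can be compared under identical settings.

```python
import torch
import torch.nn as nn

class SeqRegressor(nn.Module):
    """Hypothetical next-value predictor using an LSTM or a simple RNN."""
    def __init__(self, cell="lstm", hidden=32):
        super().__init__()
        rnn_cls = nn.LSTM if cell == "lstm" else nn.RNN
        self.rnn = rnn_cls(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.rnn(x)            # out: (batch, seq_len, hidden)
        return self.head(out[:, -1])    # predict from the last time step

# Toy data (assumption): sliding windows over a sine wave.
wave = torch.sin(torch.linspace(0, 20, 221))
windows = wave.unfold(0, 21, 1)                       # (201, 21)
x, y = windows[:, :20].unsqueeze(-1), windows[:, 20:]  # inputs (201, 20, 1), targets (201, 1)

torch.manual_seed(0)
model = SeqRegressor("lstm")
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
for step in range(50):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
print(f"final training MSE: {loss.item():.4f}")
```

To probe long-term dependencies, repeat the run with `SeqRegressor("rnn")` and with longer input windows, then compare the losses.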

h. Advanced RNN Architectures:
   - Task: Implement a bidirectional LSTM
   - Question: In what scenarios might a bidirectional architecture be particularly useful?
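
With `bidirectional=True`, `nn.LSTM` concatenates the forward and backward hidden states along the feature dimension, which changes the shapes downstream layers must expect. The sizes in this sketch are arbitrary assumptions, and the input is random.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bilstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True, bidirectional=True)
x = torch.randn(4, 10, 8)              # (batch, seq_len, features)
out, (h_n, c_n) = bilstm(x)

print("out:", tuple(out.shape))        # features = 2 * hidden_size
print("h_n:", tuple(h_n.shape))        # (num_directions, batch, hidden_size)

# The forward direction finishes at the last time step,
# the backward direction finishes at the first time step.
forward_final = out[:, -1, :16]
backward_final = out[:, 0, 16:]
print("forward state matches h_n[0]: ", torch.allclose(forward_final, h_n[0]))
print("backward state matches h_n[1]:", torch.allclose(backward_final, h_n[1]))

# A common sequence summary: concatenate both final states.
summary = torch.cat([h_n[0], h_n[1]], dim=1)
```

Because the backward pass needs the full sequence, this architecture suits tasks where the whole input is available at prediction time, such as tagging or classification, rather than step-by-step forecasting.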

i. Training Deep Neural Networks:
   - Task: Implement learning rate scheduling and gradient clipping
   - Question: How do these techniques impact training stability?
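
The two techniques slot into a standard training loop at different points: clipping goes between `backward()` and `optimizer.step()`, and the scheduler steps once per epoch. The sketch below assumes a toy linear model, a `StepLR` schedule that halves the rate every 2 epochs, and `max_norm=1.0`; all of these are illustrative values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)
x, y = torch.randn(64, 10), torch.randn(64, 1)

lrs = []
for epoch in range(6):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    # Rescale gradients in place so their global L2 norm is at most max_norm.
    # The return value is the norm measured before clipping.
    norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    lrs.append(optimizer.param_groups[0]["lr"])
    scheduler.step()                   # update the learning rate after the optimizer step
    print(f"epoch {epoch}: lr={lrs[-1]:.4f} grad_norm_before_clip={norm_before.item():.3f} "
          f"loss={loss.item():.4f}")
```

Logging the pre-clip gradient norm alongside the loss makes it easier to tell whether instability comes from exploding gradients or from a learning rate that is too high.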

j. Model Interpretation:
   - Task: Generate saliency maps for your CNN predictions
   - Question: What insights can you gain from visualizing saliency maps and activations?
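
A basic gradient saliency map takes the gradient of the predicted class score with respect to the input pixels. In the sketch below the model is an untrained stand-in and the image is random noise, both assumptions made so the cell runs on its own; in the notebook you would substitute your trained CNN and a real test image.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Untrained stand-in model (assumption); replace with your trained CNN.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

# Random stand-in image; requires_grad lets autograd track gradients w.r.t. pixels.
image = torch.randn(1, 3, 32, 32, requires_grad=True)
logits = model(image)
top_class = logits.argmax(dim=1).item()

# Backpropagate the score of the predicted class down to the input.
logits[0, top_class].backward()

# Saliency: largest absolute gradient across the three color channels.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)
print("saliency map shape:", tuple(saliency.shape))
```

The resulting 32x32 map can be displayed next to the input image, for example with `matplotlib.pyplot.imshow`, to see which pixels most affect the prediction.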

4. Model Comparison:
   - Task: Compare the performance of different architectures (MLP, CNN, RNN)
   - Question: Analyze the trade-offs between model complexity and performance





5. Submission:

In [None]:

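import pandas as pd

# Load the held-out test split with the same preprocessing as the training set
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False)

# Use your best model to make predictions
# (best_model must be the trained model you selected in the tasks above)
best_model.eval()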
predictions = []
with torch.no_grad():
    for images, _ in testloader:
        outputs = best_model(images)
        _, predicted = torch.max(outputs, 1)
        predictions.extend(predicted.tolist())

submission = pd.DataFrame({
    'id': range(len(testset)),
    'predicted_class': predictions
})

submission.to_csv('submission.csv', index=False)


6. Final Questions:
   - Summarize the key findings from your experiments with different neural network architectures.
   - How might you further improve the model's performance?
   - Discuss the computational requirements of training deep neural networks and strategies for efficient training.
   - What ethical considerations should be taken into account when deploying deep learning models?
