# PyTorch Operations for Artificial Neural Networks (ANNs)

This notebook covers the PyTorch operations most often used when building Artificial Neural Networks (ANNs): layer definitions, weight initialization, dropout, batch normalization, activation functions, saving and loading models, and learning rate scheduling. Each section pairs a short explanation with a runnable example so that students can apply these operations in their own models.

## 1. Defining Layers for ANN

In PyTorch, neural network layers are defined using the `torch.nn` module. This module provides a variety of pre-built layers such as fully connected layers (`Linear`), convolutional layers, pooling layers, etc. Here we focus on the fully connected layers used in ANNs.


In [None]:
# Example: Defining a Fully Connected Layer
import torch  # needed by later cells (torch.randn, torch.save, torch.optim)
import torch.nn as nn

# Define a single fully connected layer
fc_layer = nn.Linear(in_features=10, out_features=5)  # Input size 10, output size 5
print('Fully Connected Layer:', fc_layer)
print('Layer Weight Shape:', fc_layer.weight.shape)
print('Layer Bias Shape:', fc_layer.bias.shape)
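
To make the layer's behavior concrete, the following is a minimal sketch of what `nn.Linear` computes and how two such layers can be stacked. The class name `SmallANN` and the sizes 10, 5, and 2 are illustrative assumptions, not part of the notebook's own model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random weights and inputs reproducible


class SmallANN(nn.Module):
    """Hypothetical two-layer network: 10 inputs -> 5 hidden units -> 2 outputs."""

    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(10, 5)
        self.output = nn.Linear(5, 2)

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        return self.output(x)


layer = nn.Linear(in_features=10, out_features=5)
batch = torch.randn(4, 10)  # 4 samples, 10 features each

# nn.Linear computes y = x @ W^T + b, with W of shape (out_features, in_features)
manual = batch @ layer.weight.T + layer.bias
matches = torch.allclose(layer(batch), manual, atol=1e-6)

net = SmallANN()
logits = net(batch)
print('Manual result matches nn.Linear:', matches)
print('SmallANN output shape:', logits.shape)
```

Subclassing `nn.Module` registers both layers automatically, so `net.parameters()` returns all weights and biases without extra bookkeeping.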

## 2. Weight Initialization for ANN

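Weight initialization is critical for training deep learning models. Proper initialization keeps the scale of activations and gradients roughly constant from layer to layer, which helps avoid vanishing or exploding gradients. PyTorch provides several initialization methods in `torch.nn.init`, including Xavier (Glorot) initialization, usually paired with tanh or sigmoid activations, and Kaiming (He) initialization, designed for ReLU-type activations. Functions whose names end in an underscore modify the tensor in place.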


In [None]:
# Example: Weight Initialization
# Functions ending in "_" overwrite the tensor in place
nn.init.xavier_uniform_(fc_layer.weight)
print('Xavier Initialized Weights:', fc_layer.weight)

# This replaces the Xavier values set above with Kaiming values
nn.init.kaiming_uniform_(fc_layer.weight, nonlinearity='relu')
print('Kaiming Initialized Weights:', fc_layer.weight)
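
In a model with several layers, initialization is usually applied to every layer at once rather than one tensor at a time. The sketch below assumes a hypothetical helper `init_weights` and a small `nn.Sequential` model named `mlp`; it uses `Module.apply` to visit each submodule and checks the result against the Xavier uniform bound `sqrt(6 / (fan_in + fan_out))` (gain 1).

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)


def init_weights(module):
    """Hypothetical helper: Xavier-uniform weights and zero biases for Linear layers."""
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)


mlp = nn.Sequential(nn.Linear(10, 5), nn.Tanh(), nn.Linear(5, 2))
mlp.apply(init_weights)  # apply() calls init_weights on every submodule

first = mlp[0]
fan_in, fan_out = first.in_features, first.out_features
xavier_bound = math.sqrt(6.0 / (fan_in + fan_out))  # bound for gain = 1
largest_weight = first.weight.abs().max().item()

print('Xavier bound:', xavier_bound)
print('Largest |weight| in first layer:', largest_weight)
print('Bias after init:', first.bias)
```

The `isinstance` check leaves the `nn.Tanh` module untouched, since it has no parameters to initialize.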

## 3. Dropout for Regularization

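Dropout is a regularization technique that reduces overfitting by randomly zeroing a fraction `p` of its input elements during training and scaling the surviving elements by `1 / (1 - p)`, so the expected activation stays the same. PyTorch implements it as the `nn.Dropout` layer, which is active only in training mode; after calling `eval()` the layer passes its input through unchanged.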


In [None]:
# Example: Applying Dropout
dropout_layer = nn.Dropout(p=0.5)  # Dropout probability of 50%
input_tensor = torch.randn(5, 10)  # A batch of 5 samples, each of dimension 10
output_tensor = dropout_layer(input_tensor)
print('Input Tensor:', input_tensor)
print('Output Tensor with Dropout Applied:', output_tensor)
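
The difference between training and evaluation mode is easy to miss, so here is a small sketch that makes it visible. It uses an input of all ones (an assumption chosen so the scaling is easy to read) and the illustrative names `drop`, `train_out`, and `eval_out`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
x = torch.ones(5, 10)

drop.train()         # training mode: elements are zeroed at random
train_out = drop(x)  # survivors are scaled by 1 / (1 - p) = 2

drop.eval()          # evaluation mode: dropout is the identity
eval_out = drop(x)

print('Distinct values in training mode:', train_out.unique())
print('Eval output equals input:', torch.equal(eval_out, x))
```

Forgetting to switch a model to `eval()` before inference leaves dropout active and makes predictions noisy.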

## 4. Batch Normalization

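Batch normalization stabilizes and accelerates training by normalizing each feature over the mini-batch to roughly zero mean and unit variance, then applying a learnable scale and shift. It was originally motivated as a way to reduce internal covariate shift, and in practice it often allows higher learning rates. In training mode the layer uses the statistics of the current batch; in evaluation mode it uses running estimates accumulated during training.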


In [None]:
# Example: Applying Batch Normalization
batch_norm_layer = nn.BatchNorm1d(num_features=10)  # For a fully connected layer with 10 features
normalized_output = batch_norm_layer(input_tensor)
print('Batch Normalized Output:', normalized_output)
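
A quick way to see what the layer does is to inspect the statistics of its output. The sketch below assumes a batch of 32 samples with an artificial mean of about 5 and standard deviation of about 3; the names `bn`, `y_train`, and `y_eval` are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm1d(num_features=10)
x = torch.randn(32, 10) * 3.0 + 5.0  # features with mean ~5 and std ~3

bn.train()
y_train = bn(x)  # normalized with this batch's mean and variance
train_mean = y_train.mean(dim=0)
train_std = y_train.std(dim=0, unbiased=False)

bn.eval()
y_eval = bn(x)   # normalized with the running estimates instead

print('Per-feature mean (train):', train_mean)
print('Per-feature std (train):', train_std)
print('Running mean after one batch:', bn.running_mean)
```

With the default `momentum=0.1`, the running mean after a single batch is only one tenth of the batch mean, which is why the evaluation-mode output differs from the training-mode output this early on.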

## 5. Activation Functions in ANN

Activation functions introduce non-linearity into the model, allowing it to learn complex patterns. Commonly used activation functions in ANNs include ReLU, Sigmoid, and Tanh, all available in PyTorch.


In [None]:
# Example: Using Activation Functions
relu = nn.ReLU()
sigmoid = nn.Sigmoid()
tanh = nn.Tanh()

relu_output = relu(input_tensor)
sigmoid_output = sigmoid(input_tensor)
tanh_output = tanh(input_tensor)

print('ReLU Output:', relu_output)
print('Sigmoid Output:', sigmoid_output)
print('Tanh Output:', tanh_output)
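
For reference, each of these activations has a simple closed form. The sketch below recomputes them by hand on a small, hand-picked tensor (an assumption for readability) and compares the results with the PyTorch modules.

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

relu_manual = torch.clamp(x, min=0.0)         # max(0, x)
sigmoid_manual = 1.0 / (1.0 + torch.exp(-x))  # 1 / (1 + e^(-x)), range (0, 1)
tanh_manual = (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))  # range (-1, 1)

relu_ok = torch.allclose(nn.ReLU()(x), relu_manual)
sigmoid_ok = torch.allclose(nn.Sigmoid()(x), sigmoid_manual)
tanh_ok = torch.allclose(nn.Tanh()(x), tanh_manual)

print('ReLU matches:', relu_ok)
print('Sigmoid matches:', sigmoid_ok)
print('Tanh matches:', tanh_ok)
```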

## 6. Utility Functions for ANNs

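Several utility functions are essential for building and training ANNs. These include functions for saving and loading models, shuffling datasets, and adjusting learning rates. The example in this section saves and reloads a model through its `state_dict`, a dictionary that maps each parameter name to its tensor. Because the `state_dict` stores only parameters and buffers, the model architecture must be rebuilt in code before the saved values are loaded.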


In [None]:
# Example: Saving and Loading Models
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 2)
)
torch.save(model.state_dict(), 'ann_model.pth')  # Save model state

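# Rebuild the same architecture, then load the saved parameters into it
model_loaded = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 2)
)
model_loaded.load_state_dict(torch.load('ann_model.pth'))
model_loaded.eval()  # switch to evaluation mode before running inference
print('Model Loaded Successfully')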
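
When training needs to be resumed rather than just evaluated, the optimizer state is usually saved alongside the model. The following is a sketch under stated assumptions: the dictionary keys (`epoch`, `model_state`, `optimizer_state`), the file name `checkpoint.pth`, and the factory `build_model` are all hypothetical choices, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)


def build_model():
    """Hypothetical factory so the saved and restored models share one architecture."""
    return nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))


net = build_model()
opt = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)

# Take one training step so the optimizer has momentum buffers worth saving
sample = torch.randn(4, 10)
net(sample).sum().backward()
opt.step()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'checkpoint.pth')  # hypothetical file name
    torch.save({'epoch': 1,
                'model_state': net.state_dict(),
                'optimizer_state': opt.state_dict()}, path)
    checkpoint = torch.load(path)

# Restore into freshly constructed objects
restored = build_model()
restored.load_state_dict(checkpoint['model_state'])
restored_opt = torch.optim.SGD(restored.parameters(), lr=0.1, momentum=0.9)
restored_opt.load_state_dict(checkpoint['optimizer_state'])
restored.eval()

with torch.no_grad():
    same_output = torch.allclose(net(sample), restored(sample))
print('Restored epoch:', checkpoint['epoch'])
print('Restored model reproduces the original outputs:', same_output)
```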

## 7. Learning Rate Scheduling

Learning rate scheduling involves dynamically adjusting the learning rate during training to improve convergence. PyTorch provides several schedulers like `StepLR`, `ExponentialLR`, and `ReduceLROnPlateau`.


In [None]:
# Example: Using Learning Rate Scheduler
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

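# Simulate training loop
for epoch in range(20):
    # ... training logic (forward pass, loss.backward()) ...
    optimizer.step()  # call before scheduler.step(); a no-op here since no gradients exist
    print(f'Epoch {epoch+1}, Learning Rate: {scheduler.get_last_lr()}')
    scheduler.step()  # multiplies the learning rate by gamma every step_size epochs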
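
Unlike `StepLR`, `ReduceLROnPlateau` reacts to a monitored metric instead of the epoch count, so its `step` method takes that metric as an argument. The sketch below uses made-up validation losses that stop improving after the second epoch; the values of `factor` and `patience` and the names `net`, `opt`, and `plateau` are illustrative assumptions.

```python
import torch
import torch.nn as nn

net = nn.Linear(10, 2)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
plateau = torch.optim.lr_scheduler.ReduceLROnPlateau(
    opt, mode='min', factor=0.5, patience=2
)

# Made-up validation losses: no improvement after the second epoch
val_losses = [1.0, 0.9, 0.9, 0.9, 0.9]
lrs = []
for val_loss in val_losses:
    # ... training and validation logic would go here ...
    opt.step()              # a no-op here, since no gradients were computed
    plateau.step(val_loss)  # this scheduler needs the monitored metric
    lrs.append(opt.param_groups[0]['lr'])

print('Learning rate after each epoch:', lrs)
```

With `patience=2`, the scheduler tolerates two epochs without improvement and halves the learning rate on the third, which is the last epoch of this run.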

## Exercises

1. Define different types of layers (e.g., fully connected, dropout, batch normalization) and combine them into a small ANN.
2. Initialize weights using different methods and observe the effects on training.
3. Apply different activation functions and see their impact on model outputs.
4. Use a learning rate scheduler to adjust learning rates dynamically during training.
5. Experiment with saving and loading models to understand the workflow in PyTorch.