**Plan**

**1. AutoML with PyTorch**

**2. PyTorch Lightning**

**3. Hyperparameter tuning with PyTorch**

# **AutoML with PyTorch**

AutoML (Automated Machine Learning) aims to automate the process of building and tuning machine learning models. While traditional AutoML frameworks offer end-to-end solutions, there are several ways to integrate AutoML principles with PyTorch for tasks like hyperparameter optimization, architecture search, and automated model selection. Here's an overview and examples of using AutoML concepts with PyTorch.

**1. Hyperparameter Optimization**

Hyperparameter optimization is a crucial part of the AutoML process, as it involves finding the best set of hyperparameters for a model. Several libraries can assist with hyperparameter tuning in PyTorch.

**Example with Optuna**

**Optuna** is a popular hyperparameter optimization framework that works well with PyTorch.

In [None]:
import optuna
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, transforms

# Define a simple model
class SimpleNN(nn.Module):
    def __init__(self, hidden_units):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28 * 28, hidden_units)
        self.fc2 = nn.Linear(hidden_units, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Define objective function for Optuna
def objective(trial):
    # Hyperparameters to optimize
    hidden_units = trial.suggest_int('hidden_units', 64, 128)
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)  # replaces the deprecated suggest_loguniform

    # Load dataset
    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
    train_dataset = datasets.MNIST(root='data', train=True, download=True, transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    # Initialize model, optimizer, and loss function
    model = SimpleNN(hidden_units).to('cuda')
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    # Training loop
    model.train()
    for epoch in range(1):
        for batch in train_loader:
            data, target = batch
            data, target = data.to('cuda'), target.to('cuda')

            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

    # Validation (reuses the training set for simplicity; evaluate on a held-out split in practice)
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in train_loader:
            data, target = data.to('cuda'), target.to('cuda')
            output = model(data)
            _, predicted = torch.max(output.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()

    accuracy = correct / total
    return accuracy

# Optimize hyperparameters using Optuna
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

print("Best hyperparameters: ", study.best_params)
print("Best accuracy: ", study.best_value)
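
The example above scores each trial on the same data it trained on, which tends to favor configurations that overfit. A minimal sketch of holding out a validation split with `torch.utils.data.random_split` is shown here. The synthetic `TensorDataset` is only a stand-in with MNIST-like shapes, and the 800/200 split sizes are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in for the MNIST training set (hypothetical data with the same shapes)
full_dataset = TensorDataset(torch.randn(1000, 1, 28, 28), torch.randint(0, 10, (1000,)))

# Hold out 20% for validation; a seeded generator keeps the split identical across trials
generator = torch.Generator().manual_seed(0)
train_subset, val_subset = random_split(full_dataset, [800, 200], generator=generator)

# Train on one loader, compute the trial's accuracy on the other
train_loader = DataLoader(train_subset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_subset, batch_size=64, shuffle=False)

print(len(train_subset), len(val_subset))
```

Inside an Optuna objective, the training loop would iterate over `train_loader` and the accuracy returned to the study would be computed on `val_loader`.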


**2. Neural Architecture Search (NAS)**

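Neural Architecture Search involves finding the best neural network architecture for a given task. While building a NAS system from scratch is complex, there are libraries and frameworks that simplify the process.

**Example with NAT (Neural Architecture Transformer)**

NAT (Neural Architecture Transformer) is an architecture-search method. The snippet below is conceptual: the `nat` import and the `NAT` class interface shown here are illustrative and should not be read as an actual package API.

In [None]:
# Conceptual snippet: `nat.NAT` below is an illustrative interface, not a real API
from nat import NAT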

# Define a search space and a model
search_space = {
    'layers': [2, 3, 4],
    'units': [64, 128, 256],
}

# Initialize NAT
nas = NAT(search_space=search_space, model_class=SimpleNN)

# Search for the best architecture
best_architecture = nas.search()
print("Best architecture: ", best_architecture)

Note: This is a conceptual example. For actual implementation, refer to specific NAS libraries' documentation.
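
Since the snippet above is conceptual, here is a minimal runnable sketch of the same idea using only PyTorch: enumerate a small search space of depths and widths, train each candidate briefly, and keep the one with the best validation accuracy. The synthetic data, the `build_mlp` and `score` helpers, and the search-space values are assumptions made for illustration. Practical NAS methods use far more sample-efficient strategies than exhaustive enumeration.

```python
import itertools
import torch
import torch.nn as nn

def build_mlp(in_dim, out_dim, layers, units):
    """Build an MLP with `layers` hidden layers of `units` units each."""
    modules, width = [], in_dim
    for _ in range(layers):
        modules += [nn.Linear(width, units), nn.ReLU()]
        width = units
    modules.append(nn.Linear(width, out_dim))
    return nn.Sequential(*modules)

def score(model, x_train, y_train, x_val, y_val, steps=30):
    """Train briefly, then return validation accuracy as a cheap proxy score."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        criterion(model(x_train), y_train).backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        predicted = model(x_val).argmax(dim=1)
    return (predicted == y_val).float().mean().item()

torch.manual_seed(0)
# Synthetic stand-in task (hypothetical): label is 1 when the features sum to a positive number
x = torch.randn(400, 20)
y = (x.sum(dim=1) > 0).long()
x_train, y_train, x_val, y_val = x[:300], y[:300], x[300:], y[300:]

# Every (layers, units) pair is one candidate architecture
search_space = {'layers': [1, 2, 3], 'units': [16, 32, 64]}
results = []
for layers, units in itertools.product(search_space['layers'], search_space['units']):
    model = build_mlp(20, 2, layers, units)
    accuracy = score(model, x_train, y_train, x_val, y_val)
    results.append(({'layers': layers, 'units': units}, accuracy))

best_architecture, best_accuracy = max(results, key=lambda r: r[1])
print("Best architecture:", best_architecture)
print("Validation accuracy:", round(best_accuracy, 3))
```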

**3. AutoML Frameworks Integrating with PyTorch**

Several AutoML frameworks integrate with PyTorch to automate model building and hyperparameter tuning.

**Example with AutoKeras**

AutoKeras is an AutoML library built on Keras. The example below uses its TensorFlow/Keras workflow rather than PyTorch; it is included to show what an end-to-end AutoML search looks like:

In [None]:
import autokeras as ak
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train / 255.0
x_test = x_test / 255.0

# Initialize AutoKeras ImageClassifier
clf = ak.ImageClassifier(max_trials=3, overwrite=True)

# Train the model
clf.fit(x_train, y_train, epochs=10)

# Evaluate the model; evaluate() returns the loss together with the tracked metrics
results = clf.evaluate(x_test, y_test)
print(f'Loss and metrics: {results}')


**4. Automated Model Selection**

Automated model selection involves choosing the best model architecture and configuration from a set of candidates. This can be done using libraries like Optuna in conjunction with model evaluation functions.

**Example: Automated Model Selection with Optuna**

In [None]:
import optuna
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, transforms

# Define various models
class ModelA(nn.Module):
    def __init__(self):
        super(ModelA, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

class ModelB(nn.Module):
    def __init__(self):
        super(ModelB, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)
        self.fc1 = nn.Linear(32 * 26 * 26, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = x.view(-1, 32 * 26 * 26)
        x = self.fc1(x)
        return x

# Define the objective function
def objective(trial):
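    # Optuna's categorical choices should be plain values (str, int, float, bool, None),
    # so suggest the model's name and look up the class
    model_classes = {'ModelA': ModelA, 'ModelB': ModelB}
    model_name = trial.suggest_categorical('model_type', list(model_classes))
    model = model_classes[model_name]().to('cuda')

    # Hyperparameters (suggest_float with log=True replaces the deprecated suggest_loguniform)
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)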
    batch_size = trial.suggest_int('batch_size', 16, 64)

    # Load dataset
    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
    train_dataset = datasets.MNIST(root='data', train=True, download=True, transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    # Training setup
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    # Training loop
    model.train()
    for epoch in range(1):
        for inputs, targets in train_loader:
            inputs, targets = inputs.to('cuda'), targets.to('cuda')
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, targets)
            loss.backward()
            optimizer.step()

    # Evaluate accuracy (on the training set for simplicity; use a held-out split in practice)
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, targets in train_loader:
            inputs, targets = inputs.to('cuda'), targets.to('cuda')
            outputs = model(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += targets.size(0)
            correct += (predicted == targets).sum().item()

    accuracy = correct / total
    return accuracy

# Perform optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

print("Best hyperparameters: ", study.best_params)
print("Best accuracy: ", study.best_value)

# **PyTorch Lightning**

PyTorch Lightning is a high-level library built on top of PyTorch that removes training boilerplate and gives PyTorch code a standard structure. It handles the training loop, validation, and checkpointing so that you can focus on research and model development.

**Key Features of PyTorch Lightning**

- Simplified Code Structure: Encapsulates common training practices into reusable components.
- Automatic Handling of Training Loops: Manages training, validation, and testing loops.
- Seamless Multi-GPU and Distributed Training: Easy integration with multi-GPU and distributed setups.
- Integrated Logging and Checkpointing: Built-in support for logging and saving model checkpoints.
- Flexibility: Allows for easy customization and extension to fit specific needs.

**Basic Concepts**

- LightningModule: The core class where you define your model, training, validation, and testing steps.
- Trainer: Manages the training and validation process.
- DataModule: Optional but recommended class to encapsulate data loading and preprocessing.

**Example Code**

Here’s a simple example to illustrate how to use PyTorch Lightning.

**1. Define the LightningModule**

The LightningModule encapsulates your model definition and the training/validation steps.

In [None]:
import pytorch_lightning as pl
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

class LitModel(pl.LightningModule):
    def __init__(self):
        super(LitModel, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.model(x.view(x.size(0), -1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.criterion(y_hat, y)
        return loss

    def configure_optimizers(self):
        return optim.Adam(self.parameters(), lr=0.001)

    def train_dataloader(self):
        transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
        dataset = datasets.MNIST(root='data', train=True, download=True, transform=transform)
        return DataLoader(dataset, batch_size=32, shuffle=True)


**2. Training with Trainer**

Create a Trainer object to manage the training process.

In [None]:
from pytorch_lightning import Trainer

# Initialize the model
model = LitModel()

# Initialize the trainer
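# The `gpus` argument was removed in Lightning 2.0; use accelerator/devices instead
trainer = Trainer(max_epochs=5, accelerator='auto', devices=1)  # 'auto' picks a GPU if available, otherwise the CPU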

# Train the model
trainer.fit(model)


**3. Validation and Testing**

Add validation and testing steps to the LightningModule if needed.

In [None]:
class LitModel(pl.LightningModule):
    # ... previous code ...

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.criterion(y_hat, y)
        acc = (torch.argmax(y_hat, dim=1) == y).float().mean()
        # self.log averages step values over the epoch by default; this replaces
        # validation_epoch_end, which was removed in Lightning 2.0
        self.log('val_loss', loss, prog_bar=True)
        self.log('val_acc', acc, prog_bar=True)

    def test_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.criterion(y_hat, y)
        acc = (torch.argmax(y_hat, dim=1) == y).float().mean()
        self.log('test_loss', loss)
        self.log('test_acc', acc)

    def test_dataloader(self):
        # MNIST test split; needed so that trainer.test(model) has data to run on
        transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
        dataset = datasets.MNIST(root='data', train=False, download=True, transform=transform)
        return DataLoader(dataset, batch_size=32, shuffle=False)


In [None]:
# Test the model
trainer.test(model)

**Advanced Features**

**1. DataModule**

Encapsulates data loading and preprocessing logic.

In [None]:
class MNISTDataModule(pl.LightningDataModule):
    def __init__(self, batch_size=32):
        super().__init__()
        self.batch_size = batch_size

    def setup(self, stage=None):
        transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
        self.train_dataset = datasets.MNIST(root='data', train=True, download=True, transform=transform)
        self.val_dataset = datasets.MNIST(root='data', train=False, download=True, transform=transform)

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True)

    def val_dataloader(self):
        return DataLoader(self.val_dataset, batch_size=self.batch_size, shuffle=False)


**2. Callbacks**

Callbacks run custom actions during training, such as saving checkpoints or stopping early.

In [None]:
from pytorch_lightning.callbacks import ModelCheckpoint, EarlyStopping

# Define callbacks (both monitor 'val_loss', so validation_step must log it with self.log)
checkpoint_callback = ModelCheckpoint(monitor='val_loss', save_top_k=1)
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=3)

# Initialize the trainer with callbacks
trainer = Trainer(
    max_epochs=5,
    accelerator='auto',  # picks a GPU if available, otherwise the CPU
    devices=1,
    callbacks=[checkpoint_callback, early_stopping_callback]
)

# **Hyperparameter tuning with PyTorch**

Hyperparameter tuning is a crucial step in developing machine learning models, as it involves finding the best combination of hyperparameters to improve model performance. With PyTorch, there are several approaches and libraries you can use for hyperparameter optimization. Here’s an overview of popular methods and a practical example using some of these approaches.

**1. Manual Tuning**

Manual tuning involves selecting hyperparameters based on intuition, experience, and trial-and-error. While this method is straightforward, it can be time-consuming and less efficient compared to automated approaches.

**2. Grid Search**

Grid Search is an exhaustive search over a specified hyperparameter grid. It is simple to implement, but the number of combinations multiplies with every hyperparameter added, so it quickly becomes computationally expensive.

**Example with scikit-learn's GridSearchCV (for models compatible with scikit-learn):**

In [None]:
from sklearn.model_selection import GridSearchCV
import torch
import torch.nn as nn
import torch.optim as optim

# Define a plain PyTorch model (on its own it is not a scikit-learn estimator)
class PyTorchModel(nn.Module):
    def __init__(self, hidden_units):
        super(PyTorchModel, self).__init__()
        self.fc1 = nn.Linear(28 * 28, hidden_units)
        self.fc2 = nn.Linear(hidden_units, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Create the grid of hyperparameters
param_grid = {
    'hidden_units': [64, 128],
    'lr': [0.001, 0.01]
}

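# Use GridSearchCV to find the best hyperparameters
# Note: GridSearchCV expects a scikit-learn estimator (fit/predict/get_params),
# so a bare nn.Module cannot be passed to it directly


To use GridSearchCV with a PyTorch model, wrap the model in a scikit-learn-compatible estimator (for example with the skorch library). Otherwise, use a tuning library such as Optuna or Hyperopt.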
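
Grid search can also be written by hand in a few lines, which sidesteps the estimator requirement entirely. The sketch below loops over every combination in the `param_grid` from the cell above. The `evaluate` function is a hypothetical stand-in for "train a model with these settings and return its validation accuracy"; its formula is invented so that the example runs instantly.

```python
import itertools
import math

param_grid = {
    'hidden_units': [64, 128],
    'lr': [0.001, 0.01],
}

def evaluate(hidden_units, lr):
    """Hypothetical stand-in for training a model and returning validation accuracy.
    The score is made up: it peaks near hidden_units=120 and lr=1e-3."""
    return 1.0 - 0.001 * abs(hidden_units - 120) - 0.1 * abs(math.log10(lr) + 3)

# Try every combination of values in the grid
names = list(param_grid)
results = []
for values in itertools.product(*(param_grid[name] for name in names)):
    params = dict(zip(names, values))
    results.append((evaluate(**params), params))

best_score, best_params = max(results, key=lambda r: r[0])
print(f"Tried {len(results)} combinations")
print("Best hyperparameters:", best_params)
```

In a real run, `evaluate` would build the model with `hidden_units`, train it with learning rate `lr`, and return accuracy on held-out data.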

**3. Random Search**

Random Search samples hyperparameters randomly from a specified range. It’s often more efficient than Grid Search, especially with large hyperparameter spaces.
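
A minimal sketch of random search, using only the standard library, is shown here. The ranges mirror the ones used in the Optuna examples (64 to 128 hidden units, learning rate between 1e-5 and 1e-1 on a log scale). The `evaluate` function is again a hypothetical stand-in for a real training run, and the trial count of 20 is an arbitrary choice.

```python
import math
import random

def evaluate(hidden_units, lr):
    """Hypothetical stand-in for training a model and returning validation accuracy.
    The score is made up: it peaks near hidden_units=96 and lr=1e-3."""
    return 1.0 - 0.001 * abs(hidden_units - 96) - 0.1 * abs(math.log10(lr) + 3)

rng = random.Random(0)  # seeded so the search is reproducible
trials = []
for _ in range(20):
    params = {
        'hidden_units': rng.randint(64, 128),  # uniform integer in [64, 128]
        'lr': 10 ** rng.uniform(-5, -1),       # log-uniform between 1e-5 and 1e-1
    }
    trials.append((evaluate(**params), params))

best_score, best_params = max(trials, key=lambda t: t[0])
print("Best hyperparameters:", best_params)
print("Best score:", round(best_score, 4))
```

Sampling the learning rate as `10 ** uniform(-5, -1)` spreads trials evenly across orders of magnitude, which is usually what you want for a parameter that spans several decades.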

**4. Bayesian Optimization**

Bayesian Optimization builds a probabilistic model of how hyperparameters affect performance and uses it to choose the next configuration to try. It usually needs fewer trials than random or grid search, which matters when each trial is an expensive training run.

**Example with Optuna:**

Optuna is a popular library for hyperparameter optimization that integrates well with PyTorch.

In [None]:
import optuna
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, transforms

class LitModel(nn.Module):
    def __init__(self, hidden_units):
        super(LitModel, self).__init__()
        self.fc1 = nn.Linear(28 * 28, hidden_units)
        self.fc2 = nn.Linear(hidden_units, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

def objective(trial):
    hidden_units = trial.suggest_int('hidden_units', 64, 128)
    lr = trial.suggest_float('lr', 1e-5, 1e-1, log=True)  # replaces the deprecated suggest_loguniform

    # Load dataset
    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
    train_dataset = datasets.MNIST(root='data', train=True, download=True, transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    model = LitModel(hidden_units).cuda()  # the model must be on the same device as the data
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    # Training loop
    model.train()
    for epoch in range(1):
        for data, target in train_loader:
            data, target = data.cuda(), target.cuda()
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

    # Validation (reuses the training set for simplicity; evaluate on a held-out split in practice)
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in train_loader:
            data, target = data.cuda(), target.cuda()
            output = model(data)
            _, predicted = torch.max(output.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()

    accuracy = correct / total
    return accuracy

# Optimize hyperparameters
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

print("Best hyperparameters: ", study.best_params)
print("Best accuracy: ", study.best_value)


**5. Hyperparameter Optimization Libraries**

- Optuna: Provides a flexible and efficient way to perform hyperparameter optimization.
- Hyperopt: Another library for optimization that supports Bayesian optimization.
- Ray Tune: A scalable hyperparameter tuning library for distributed training.

**6. Example with Hyperopt**

Hyperopt is another popular library for hyperparameter tuning that uses Bayesian optimization.

In [None]:
from hyperopt import fmin, tpe, hp, Trials
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

class LitModel(nn.Module):
    def __init__(self, hidden_units):
        super(LitModel, self).__init__()
        self.fc1 = nn.Linear(28 * 28, hidden_units)
        self.fc2 = nn.Linear(hidden_units, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

def objective(params):
    hidden_units = int(params['hidden_units'])
    lr = params['lr']

    # Load dataset
    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
    train_dataset = datasets.MNIST(root='data', train=True, download=True, transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    model = LitModel(hidden_units).cuda()  # the model must be on the same device as the data
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    # Training loop
    model.train()
    for epoch in range(1):
        for data, target in train_loader:
            data, target = data.cuda(), target.cuda()
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

    # Validation (reuses the training set for simplicity; evaluate on a held-out split in practice)
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in train_loader:
            data, target = data.cuda(), target.cuda()
            output = model(data)
            _, predicted = torch.max(output.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()

    accuracy = correct / total
    return -accuracy  # Hyperopt minimizes the objective function, so return negative accuracy

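import math
from hyperopt import space_eval

# Define the search space
space = {
    'hidden_units': hp.choice('hidden_units', [64, 128]),
    # hp.loguniform takes its bounds in log space: it samples exp(uniform(low, high))
    'lr': hp.loguniform('lr', math.log(1e-5), math.log(1e-1))
}

# Optimize hyperparameters
trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=10, trials=trials)

# fmin reports the index of the chosen hp.choice option; space_eval maps it back to the value
print("Best hyperparameters: ", space_eval(space, best))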
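
Because `hp.loguniform(label, low, high)` draws `exp(uniform(low, high))`, its bounds are natural logarithms of the values you want, not the values themselves and not base-10 exponents. The short check below works through the arithmetic: bounds of `(-5, -1)` cover roughly 0.0067 to 0.37, while covering 1e-5 to 1e-1 needs bounds of roughly -11.51 and -2.30. The variable names are only for illustration.

```python
import math

# Bounds (-5, -1) are exponents of e, so the sampled range is e^-5 .. e^-1
low, high = -5, -1
range_from_bounds = (math.exp(low), math.exp(high))
print(range_from_bounds)

# To sample between 1e-5 and 1e-1, pass the natural logs of those endpoints
low_target, high_target = math.log(1e-5), math.log(1e-1)
print(low_target, high_target)
```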


**Summary**

- Manual Tuning: Simple but not efficient for large hyperparameter spaces.
- Grid Search: Exhaustive but computationally expensive.
- Random Search: More efficient than Grid Search.
- Bayesian Optimization: More efficient and intelligent search using libraries like Optuna and Hyperopt.
- Libraries: Optuna, Hyperopt, and Ray Tune provide advanced and efficient ways to perform hyperparameter tuning.