# Negative Log Likelihood (NLL) Loss in PyTorch

## Explanation of Key Concepts

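Negative Log Likelihood (NLL) Loss is a commonly used loss function for multi-class classification. For each sample, the loss is the negative log of the probability the model assigns to the true class, so confident correct predictions give a loss near zero and confident wrong predictions give a large loss. In PyTorch, `nn.NLLLoss` expects log-probabilities as input, so it is paired with `log_softmax` (not plain softmax) in the final layer.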
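
The definition above can be checked by hand. The following is a minimal sketch, using made-up probabilities for two samples and three classes, that computes the loss manually and compares it with `nn.NLLLoss`. The names `probs`, `targets`, and `manual_nll` are illustrative only.

```python
import torch
from torch import nn

# Made-up predicted probabilities for 2 samples over 3 classes.
probs = torch.tensor([[0.7, 0.2, 0.1],
                      [0.1, 0.3, 0.6]])
log_probs = torch.log(probs)     # nn.NLLLoss expects log-probabilities
targets = torch.tensor([0, 2])   # true class index of each sample

# Manual NLL: take the log-probability of the true class, negate, average.
picked = log_probs[torch.arange(len(targets)), targets]
manual_nll = -picked.mean()

# The same quantity from the built-in loss (default reduction='mean').
builtin_nll = nn.NLLLoss()(log_probs, targets)

print(manual_nll.item(), builtin_nll.item())
```

Both values should agree, since the default `'mean'` reduction averages the per-sample negative log-probabilities.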

## Dataset and Context

In this tutorial, we'll use the Iris dataset to demonstrate how to apply NLL loss. The Iris dataset is a popular dataset containing 150 samples of iris flowers with four features (sepal length, sepal width, petal length, petal width) and three classes (Setosa, Versicolor, Virginica).

## Parameters and Settings

In PyTorch, you can use the `nn.NLLLoss` class to compute the NLL loss. It has two main parameters:

- `reduction`: Specifies the reduction method applied to the output. It can be 'mean' (default), 'sum', or 'none'. The 'mean' calculates the mean of the loss values, 'sum' calculates their sum, and 'none' returns the individual losses without any reduction.

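- `weight`: An optional 1D tensor with one weight per class, useful when dealing with imbalanced datasets. By default, it's set to `None`. If provided, each sample's loss is multiplied by the weight of its target class; with `reduction='mean'`, the total is then divided by the sum of those weights rather than by the number of samples.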
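
As a rough illustration of both parameters, the sketch below applies each `reduction` mode to the same made-up inputs and then adds hypothetical class weights (`class_weights` is an assumption for this example, not a recommended setting).

```python
import torch
from torch import nn

log_probs = torch.log(torch.tensor([[0.7, 0.2, 0.1],
                                    [0.1, 0.3, 0.6]]))
targets = torch.tensor([0, 2])

# The three reduction modes on identical inputs.
per_sample = nn.NLLLoss(reduction='none')(log_probs, targets)  # one loss per sample
total = nn.NLLLoss(reduction='sum')(log_probs, targets)        # scalar sum
mean = nn.NLLLoss(reduction='mean')(log_probs, targets)        # scalar mean

# Hypothetical class weights: class 2 counts three times as much as the others.
class_weights = torch.tensor([1.0, 1.0, 3.0])
weighted_mean = nn.NLLLoss(weight=class_weights)(log_probs, targets)

# With weights, 'mean' divides by the sum of the target weights,
# not by the batch size.
sample_weights = class_weights[targets]
expected = (sample_weights * per_sample).sum() / sample_weights.sum()

print(per_sample, total.item(), mean.item(), weighted_mean.item())
```

Using `reduction='none'` is handy when you want to inspect or re-weight individual sample losses yourself before reducing them.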

## Training Process

First, let's import the necessary libraries, load the Iris dataset, and split it into training and testing sets.

In [None]:
import torch
from torch import nn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the dataset
iris = load_iris()
X, y = iris.data, iris.target

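# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize using statistics from the training set only,
# so no information from the test set leaks into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)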

# Convert to tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)

Now, let's create a simple feed-forward neural network that applies `log_softmax` in the final layer, so that its outputs are the log-probabilities `nn.NLLLoss` expects.

In [None]:
class IrisClassifier(nn.Module):
    def __init__(self):
        super(IrisClassifier, self).__init__()
        self.fc1 = nn.Linear(4, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 3)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = torch.log_softmax(self.fc3(x), dim=1)
        return x

model = IrisClassifier()
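
Before training, it can help to confirm that a network really emits log-probabilities. The sketch below uses a stand-in `nn.Sequential` network (named `net` here, an assumption for illustration) with the same layer sizes as `IrisClassifier`, feeds it random made-up inputs, and checks that exponentiating each output row gives values that sum to 1.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in with the same layer sizes as IrisClassifier (4 -> 16 -> 8 -> 3).
net = nn.Sequential(
    nn.Linear(4, 16), nn.ReLU(),
    nn.Linear(16, 8), nn.ReLU(),
    nn.Linear(8, 3), nn.LogSoftmax(dim=1),
)

batch = torch.randn(5, 4)   # 5 made-up samples with 4 features each
log_probs = net(batch)      # shape (5, 3), every entry <= 0

# exp() turns log-probabilities back into probabilities; each row sums to 1.
row_sums = log_probs.exp().sum(dim=1)
print(log_probs.shape, row_sums)
```

If the rows did not sum to 1 after `exp()`, the outputs would not be valid log-probabilities and the NLL loss values would not be meaningful.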

Now, let's define the NLL loss and the optimizer.

In [None]:
criterion = nn.NLLLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

Let's train the model for 100 epochs and print the training loss at each epoch.

In [None]:
num_epochs = 100

for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = model(X_train)
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()
    print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

## Evaluation and Interpretation

To evaluate the model, we'll calculate the accuracy on the test set.

In [None]:
model.eval()  # evaluation mode (no effect on this model, but a good habit)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.long)

with torch.no_grad():
    outputs = model(X_test)
    # The class with the highest log-probability is the prediction
    predicted = outputs.argmax(dim=1)
    correct = (predicted == y_test).sum().item()
    accuracy = correct / y_test.size(0)
    print(f'Accuracy: {accuracy * 100:.2f}%')
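
The accuracy computation can also be wrapped in a small reusable helper. This is a sketch only: `accuracy_from_log_probs` is a hypothetical name, and the inputs are made-up values rather than model outputs.

```python
import torch

def accuracy_from_log_probs(log_probs, targets):
    """Fraction of rows whose highest-scoring class matches the target."""
    predicted = log_probs.argmax(dim=1)
    return (predicted == targets).float().mean().item()

# Made-up probabilities for 3 samples; the third is misclassified.
probs = torch.tensor([[0.7, 0.2, 0.1],
                      [0.1, 0.3, 0.6],
                      [0.5, 0.4, 0.1]])
log_probs = torch.log(probs)
targets = torch.tensor([0, 2, 1])

acc = accuracy_from_log_probs(log_probs, targets)
print(f'Accuracy: {acc * 100:.2f}%')
```

Because the logarithm is monotonic, taking `argmax` over log-probabilities picks the same class as taking it over the probabilities themselves, so there is no need to call `exp()` first.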

## Practical Application

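NLL loss is suitable for multi-class classification problems such as text classification and image classification. When using `nn.NLLLoss`, remember that it expects log-probabilities: apply `log_softmax` (not plain softmax) to the output layer, as the model above does. Alternatively, `nn.CrossEntropyLoss` combines `log_softmax` and NLL loss in a single step and takes raw, unnormalized scores (logits) as input.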
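
The relationship between the output activation and the loss can be illustrated on made-up logits. The sketch below compares `nn.NLLLoss` on `log_softmax` outputs with `nn.CrossEntropyLoss` on the raw logits, and also shows what happens if plain softmax probabilities are passed to `nn.NLLLoss` by mistake.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)             # raw, unnormalized scores (made-up)
targets = torch.tensor([0, 2, 1, 2])

# NLLLoss on log-probabilities ...
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
# ... matches CrossEntropyLoss applied directly to the logits.
ce = nn.CrossEntropyLoss()(logits, targets)

# Common mistake: feeding plain softmax probabilities to NLLLoss.
# It runs without error but returns a negative value (minus the mean
# true-class probability), which is not the negative log likelihood.
wrong = nn.NLLLoss()(F.softmax(logits, dim=1), targets)

print(nll.item(), ce.item(), wrong.item())
```

A negative loss value from `nn.NLLLoss` is a quick sign that the inputs are probabilities (or raw scores) rather than log-probabilities.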

## Next Steps

Now that you have learned about Negative Log Likelihood (NLL) Loss in PyTorch, you can explore other loss functions, such as Cross-Entropy Loss, Binary Cross-Entropy Loss, and Mean Squared Error Loss. Understanding various loss functions will help you choose the right one for your specific problem and improve your model's performance.