# Simple Long Short-Term Memory (LSTM) Example with PyTorch
This notebook introduces the fundamental concepts of Long Short-Term Memory (LSTM) networks and demonstrates a minimal LSTM model in PyTorch. It is aimed at beginners: each code cell is preceded by a short explanation of what it does and what tensor shapes to expect.

## What is an LSTM?
An LSTM is a type of Recurrent Neural Network (RNN) designed to capture long-term dependencies in sequential data better than a plain RNN. It uses gating mechanisms to control the flow of information through time, which mitigates the vanishing gradient problem.

Key concepts:
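- **Cell State**: Long-term memory carried along the entire sequence and updated at each time step.
- **Gates**: Learned structures that regulate which information is added to, kept in, or read from the cell state (input, forget, and output gates).
- **Hidden State**: The LSTM's output at each time step, which is also passed on to the next time step.
- **Input/Output**: The feature vector fed in at each time step, and the prediction the model derives from the hidden states.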
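
To make the gate concepts above concrete, here is a minimal sketch of a single LSTM time step written out by hand and compared against PyTorch's `nn.LSTMCell`. The sizes, the seed, and the variable names are illustrative assumptions, not part of the original notebook. The sketch relies on `nn.LSTMCell` stacking its gate weights in the order input, forget, cell candidate, output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Illustrative sizes (assumptions for this sketch)
input_size, hidden_size, batch_size = 5, 8, 3
cell = nn.LSTMCell(input_size, hidden_size)

x_t = torch.randn(batch_size, input_size)      # input at one time step
h_prev = torch.randn(batch_size, hidden_size)  # previous hidden state
c_prev = torch.randn(batch_size, hidden_size)  # previous cell state

with torch.no_grad():
    # All four gates are computed in one matrix product, then split.
    gates = (x_t @ cell.weight_ih.T + cell.bias_ih
             + h_prev @ cell.weight_hh.T + cell.bias_hh)
    i, f, g, o = gates.chunk(4, dim=1)

    i = torch.sigmoid(i)  # input gate: how much new information to write
    f = torch.sigmoid(f)  # forget gate: how much old cell state to keep
    g = torch.tanh(g)     # candidate values to write into the cell state
    o = torch.sigmoid(o)  # output gate: how much of the cell state to expose

    c_t = f * c_prev + i * g   # updated cell state
    h_t = o * torch.tanh(c_t)  # updated hidden state

    # Reference result from PyTorch's own implementation
    h_ref, c_ref = cell(x_t, (h_prev, c_prev))

print("hidden states match:", torch.allclose(h_t, h_ref, atol=1e-5))
print("cell states match:  ", torch.allclose(c_t, c_ref, atol=1e-5))
```

The manual computation and the library call should agree up to floating-point tolerance, which shows that the cell state is updated additively (`f * c_prev + i * g`) while the hidden state is a gated view of it.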

## Minimal LSTM Example: Sequence Classification
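We will use a simple LSTM to classify sequences: the LSTM reads the whole sequence, and a linear layer maps its last hidden state to class scores. For demonstration, we use random data and an untrained model, so the focus is on tensor shapes and data flow rather than on meaningful predictions.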

In [1]:
import torch
import torch.nn as nn

# Define a simple LSTM model
class SimpleLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (batch_size, seq_len, input_size)
        # Initial hidden and cell states default to zeros when not provided
        out, (h_n, c_n) = self.lstm(x)  # out: (batch_size, seq_len, hidden_size)
        out = out[:, -1, :]             # Keep only the last time step: (batch_size, hidden_size)
        out = self.fc(out)              # Map to class scores: (batch_size, output_size)
        return out
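
The model above selects the last time step with `out[:, -1, :]`. As a hedged sketch, the following standalone snippet checks that, for a single-layer, unidirectional LSTM fed full-length sequences, this slice is the same tensor as the final hidden state `h_n[-1]`. The sizes are illustrative assumptions chosen for this check.

```python
import torch
import torch.nn as nn

# Illustrative sizes: 5 features per step, hidden size 8
lstm = nn.LSTM(input_size=5, hidden_size=8, batch_first=True)
x = torch.randn(3, 7, 5)  # (batch_size, seq_len, input_size)

out, (h_n, c_n) = lstm(x)

last_step = out[:, -1, :]  # (batch_size, hidden_size)
final_hidden = h_n[-1]     # h_n is (num_layers, batch_size, hidden_size)

print("out shape:", tuple(out.shape))
print("h_n shape:", tuple(h_n.shape))
print("same values:", torch.allclose(last_step, final_hidden))
```

This equivalence does not carry over to bidirectional LSTMs or to batches of padded variable-length sequences, where the last position of `out` is not necessarily the final valid state of every sequence.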

## Build the Model and Generate Random Input
We will create a model instance and use a randomly generated tensor to simulate a batch of sequences.

In [2]:
# Model parameters
input_size = 5   # Number of features per time step
hidden_size = 8  # Size of the LSTM hidden state
output_size = 2  # Number of classes
seq_len = 7      # Length of each sequence
batch_size = 3   # Number of sequences in a batch

# Create model instance
model = SimpleLSTM(input_size, hidden_size, output_size)

# Randomly generate a batch of input data
x = torch.randn(batch_size, seq_len, input_size)
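
To see how these hyperparameters translate into model size, the sketch below counts trainable parameters for layers with the same sizes. It rebuilds the two layers on their own so the snippet is self-contained; the closed-form expressions reflect how `nn.LSTM` stores four gates, each with input weights, recurrent weights, and two bias vectors.

```python
import torch.nn as nn

input_size, hidden_size, output_size = 5, 8, 2

lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
fc = nn.Linear(hidden_size, output_size)

lstm_params = sum(p.numel() for p in lstm.parameters())
fc_params = sum(p.numel() for p in fc.parameters())

# Four gates, each with input weights, recurrent weights, and two biases
expected_lstm = 4 * (input_size * hidden_size
                     + hidden_size * hidden_size
                     + 2 * hidden_size)
# Weight matrix plus bias of the linear classifier
expected_fc = hidden_size * output_size + output_size

print("LSTM parameters:  ", lstm_params, "(formula:", expected_lstm, ")")
print("Linear parameters:", fc_params, "(formula:", expected_fc, ")")
print("Total:", lstm_params + fc_params)
```

Note that `seq_len` and `batch_size` do not appear in the count: the same weights are reused at every time step and for every sequence in the batch.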

## Forward Pass
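Feed the input data into the model to get the output. Each row contains one sequence's raw class scores (logits), not probabilities. Because the weights and the input are randomly initialized and no seed is set, the exact values will differ from run to run.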

In [3]:
# Forward pass
output = model(x)
print("Output shape:", output.shape)
print("Output content:\n", output)

Output shape: torch.Size([3, 2])
Output content:
 tensor([[-0.1592,  0.2306],
        [-0.1746,  0.1796],
        [-0.1921,  0.2100]], grad_fn=<AddmmBackward0>)
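
The values above are unnormalized logits. As a minimal sketch of what would typically come next, the snippet below converts logits to class probabilities with softmax, picks the most likely class, and performs one gradient step with cross-entropy loss. The labels are hypothetical and exist only for illustration, and the layers are rebuilt inline with the same sizes so the snippet runs on its own. The choice of SGD and the learning rate are likewise assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Same sizes as in the notebook, rebuilt here for self-containment
lstm = nn.LSTM(5, 8, batch_first=True)
fc = nn.Linear(8, 2)

x = torch.randn(3, 7, 5)           # (batch_size, seq_len, input_size)
labels = torch.tensor([0, 1, 0])   # hypothetical class labels


def classify(batch):
    """Return class logits from the last time step of the LSTM output."""
    out, _ = lstm(batch)
    return fc(out[:, -1, :])


logits = classify(x)
probs = F.softmax(logits, dim=1)   # each row sums to 1
preds = probs.argmax(dim=1)        # predicted class index per sequence

# One optimization step; cross_entropy expects raw logits, not probabilities
optimizer = torch.optim.SGD(list(lstm.parameters()) + list(fc.parameters()), lr=0.1)
loss_before = F.cross_entropy(logits, labels)
optimizer.zero_grad()
loss_before.backward()
optimizer.step()

with torch.no_grad():
    loss_after = F.cross_entropy(classify(x), labels)

print("Probabilities:\n", probs)
print("Predicted classes:", preds)
print("Loss before step:", loss_before.item())
print("Loss after step: ", loss_after.item())
```

With random inputs and labels there is nothing real to learn, so this only illustrates the mechanics of a training step, not a working classifier.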


## Summary
This notebook introduced the basics of LSTM networks and demonstrated a forward pass through a simple, untrained LSTM classifier on random data. For more advanced sequence models, consider exploring GRU or Transformer architectures.