## Collaborative Filtering with Deep Neural Network for Movie Recommendations using PyTorch
# Dependencies

To run this notebook, you need the following dependencies:

- pandas
- numpy
- torch
- scikit-learn (version 1.4 or later, which provides `root_mean_squared_error`)
- matplotlib
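- contextlib (part of the Python standard library, so no installation is needed)

Install the third-party packages with pip:

```bash
pip install pandas numpy torch scikit-learn matplotlib
```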
# Installation

First, create and activate a Python virtual environment:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows use: .venv\Scripts\activate
```

# Dataset
Place the MovieLens dataset in the following folder, relative to the directory that contains this notebook. The folder must include the two files listed below:
```bash
databases/ml-latest-small/
```
- ratings.csv
- movies.csv

In [13]:
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import mean_squared_error, root_mean_squared_error
from collections import defaultdict
import sys
import matplotlib.pyplot as plt
from contextlib import contextmanager

## MovieLensDataset Class

The `MovieLensDataset` class is a custom PyTorch dataset for handling the MovieLens dataset. It inherits from `torch.utils.data.Dataset` and provides the necessary methods to work with the dataset in a PyTorch DataLoader.

### Initialization

```python
def __init__(self, users, movies, ratings):
    self.users = users      # A list or array of user IDs.
    self.movies = movies    # A list or array of movie IDs.
    self.ratings = ratings  # A list or array of ratings, one per user-movie pair.
```

In [None]:
class MovieLensDataset(Dataset):
    def __init__(self, users, movies, ratings):
        self.users = users
        self.movies = movies
        self.ratings = ratings
    # Returns the total number of user-movie-rating triplets in the dataset.
    def __len__(self):
        return len(self.users)
    '''
    Takes an index item and returns a dictionary containing:
        users: The user ID at the given index as a PyTorch tensor.
        movies: The movie ID at the given index as a PyTorch tensor.
        ratings: The rating at the given index as a PyTorch tensor.
    '''
    def __getitem__(self, item):
        return {
            "users": torch.tensor(self.users[item], dtype=torch.long),
            "movies": torch.tensor(self.movies[item], dtype=torch.long),
            "ratings": torch.tensor(self.ratings[item], dtype=torch.float),
        }

## DeepRecommenderSystem Class

The `DeepRecommenderSystem` class is a deep learning-based recommender system implemented using PyTorch. This class is designed to predict user ratings for movies based on user and movie embeddings, and a series of fully connected layers.

### Arguments

- `num_users` (int): The number of unique users in the dataset.
- `num_movies` (int): The number of unique movies in the dataset.
- `embedding_size` (int, optional): The size of the embedding vectors for users and movies. Default is 128.

### Attributes

- `user_embedding` (nn.Embedding): Embedding layer for users.
- `movie_embedding` (nn.Embedding): Embedding layer for movies.
- `layers` (nn.Sequential): A sequential container of deep neural network layers, including fully connected layers, ReLU activations, batch normalization, and dropout.

### Methods

- `__init__(self, num_users, num_movies, embedding_size=128)`: Initializes the embedding layers and the deep neural network layers. Also applies weight initialization.
- `_init_weights(self, module)`: Initializes the weights of the network layers using Xavier uniform initialization.
- `forward(self, users, movies)`: Defines the forward pass through the network. It concatenates the user and movie embeddings and passes them through the deep neural network layers. The final sigmoid yields a value between 0 and 1, which the calling code multiplies by 5 to obtain a rating prediction.

This class encapsulates the entire model architecture, from embedding layers to the final prediction layer, and includes methods for weight initialization and the forward pass.
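
As a rough check on model size, the layer widths listed above imply the parameter counts computed below. This is a plain-Python sketch rather than part of the notebook: `count_parameters` is a hypothetical helper, and the user and movie counts in the usage line are made-up example values, not figures taken from the dataset.

```python
def count_parameters(num_users, num_movies, embedding_size=128):
    """Count trainable parameters implied by the layer sizes in DeepRecommenderSystem."""
    # Each embedding table stores one vector of length embedding_size per ID.
    embeddings = (num_users + num_movies) * embedding_size

    # Widths of the fully connected stack: input, four hidden layers, output.
    widths = [2 * embedding_size, 512, 256, 128, 64, 1]
    # A Linear layer has in*out weights plus out biases.
    linear = sum(i * o + o for i, o in zip(widths, widths[1:]))

    # BatchNorm1d follows each hidden layer and learns a scale and a shift per feature.
    batch_norm = sum(2 * w for w in widths[1:-1])

    return {
        "embeddings": embeddings,
        "linear": linear,
        "batch_norm": batch_norm,
        "total": embeddings + linear + batch_norm,
    }


# Hypothetical dataset size, chosen only for illustration.
counts = count_parameters(num_users=1000, num_movies=5000)
print(counts)
```

With the default embedding size, the fully connected stack contributes the same number of parameters whatever the dataset size; only the embedding tables grow with the number of users and movies.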

In [None]:
class DeepRecommenderSystem(nn.Module):
    """
    A deep learning-based recommender system using PyTorch.
    Args:
        num_users (int): The number of unique users in the dataset.
        num_movies (int): The number of unique movies in the dataset.
        embedding_size (int, optional): The size of the embedding vectors for users and movies. Default is 128.
    Attributes:
        user_embedding (nn.Embedding): Embedding layer for users.
        movie_embedding (nn.Embedding): Embedding layer for movies.
        layers (nn.Sequential): A sequential container of deep neural network layers.
    Methods:
        _init_weights(module): Initializes the weights of the network layers.
        forward(users, movies): Forward pass through the network.
    """
    def __init__(self, num_users, num_movies, embedding_size=128):
        super(DeepRecommenderSystem, self).__init__()
        
        # Embedding layers
        self.user_embedding = nn.Embedding(num_users, embedding_size)
        self.movie_embedding = nn.Embedding(num_movies, embedding_size)
        
        # Deep Neural Network layers
        self.layers = nn.Sequential(
            nn.Linear(2 * embedding_size, 512),
            nn.ReLU(),
            nn.BatchNorm1d(512),
            nn.Dropout(0.3),
            
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.BatchNorm1d(256),
            nn.Dropout(0.2),
            
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.BatchNorm1d(128),
            nn.Dropout(0.2),
            
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.BatchNorm1d(64),
            nn.Dropout(0.1),
            
            nn.Linear(64, 1),
            nn.Sigmoid()
        )
        
        # Initialize weights
        self.apply(self._init_weights)
        
    def _init_weights(self, module):
        if isinstance(module, nn.Linear):
            torch.nn.init.xavier_uniform_(module.weight)
            if module.bias is not None:
                module.bias.data.zero_()
    
    def forward(self, users, movies):
        user_embedded = self.user_embedding(users)
        movie_embedded = self.movie_embedding(movies)
        concatenated = torch.cat([user_embedded, movie_embedded], dim=1)
        # squeeze(-1) drops only the feature dimension, so a batch of one stays 1-D
        return self.layers(concatenated).squeeze(-1)

## Train Model Function

The `train_model` function trains a given PyTorch model using the provided training and validation data loaders, records the average loss for each epoch, and saves the best-performing weights.

### Arguments

- `model` (torch.nn.Module): The model to be trained.
- `train_loader` (torch.utils.data.DataLoader): DataLoader for the training dataset.
- `val_loader` (torch.utils.data.DataLoader): DataLoader for the validation dataset.
- `device` (torch.device): The device (CPU or GPU) to perform training on.
- `epochs` (int, optional): Number of epochs to train the model. Default is 10.
- `lr` (float, optional): Learning rate for the optimizer. Default is 0.001.

### Returns

- `tuple`: A tuple containing two lists:
  - `train_losses` (list of float): List of average training losses for each epoch.
  - `val_losses` (list of float): List of average validation losses for each epoch.

### Description

1. **Initialization**:
   - Defines the loss function (`criterion`) as Mean Squared Error Loss.
   - Initializes the optimizer (`optimizer`) as Adam with the specified learning rate.
   - Sets up a learning rate scheduler (`scheduler`) to reduce the learning rate on plateau.

2. **Training Loop**:
   - Iterates over the specified number of epochs.
   - For each epoch, the model is set to training mode.
   - Iterates over the training data loader, performing the following steps for each batch:
     - Moves the batch data (users, movies, ratings) to the specified device.
     - Performs a forward pass to get predictions.
     - Scales the predictions to the 0-5 range and computes the MSE loss against the true ratings.
     - Performs backpropagation and updates the model parameters.
     - Accumulates the total loss and counts the number of batches.
     - Writes the running average loss to stderr every 100 batches.

3. **Validation Loop**:
   - After each epoch, the model is set to evaluation mode.
   - Iterates over the validation data loader, performing the following steps for each batch:
     - Moves the batch data (users, movies, ratings) to the specified device.
     - Performs a forward pass to get predictions.
     - Computes the loss and accumulates the total validation loss.

4. **Learning Rate Scheduling**:
   - Adjusts the learning rate based on the validation loss.

5. **Model Saving**:
   - Saves the model's `state_dict` to `best_model.pth` whenever the validation loss reaches a new minimum.

6. **Return**:
   - Returns the lists of average training and validation losses for each epoch.
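
The learning rate scheduling step can be illustrated without PyTorch. The sketch below is a simplified model of the reduce-on-plateau rule with `factor=0.5` and `patience=2`, as configured in `train_model`. It assumes a relative improvement threshold of `1e-4` and ignores cooldown and minimum learning rate settings; `plateau_schedule` is a hypothetical name, and the loss values are invented for the example.

```python
def plateau_schedule(val_losses, lr=0.001, factor=0.5, patience=2, threshold=1e-4):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        # An epoch counts as an improvement only if it beats the best loss
        # by more than a small relative margin (assumed threshold).
        if loss < best * (1 - threshold):
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        # After more than `patience` epochs without improvement, shrink the rate.
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
        history.append(lr)
    return history


# Invented losses: one improvement, then three stagnant epochs.
rates = plateau_schedule([1.0, 0.9, 0.9, 0.9, 0.9])
print(rates)
```

In this example the rate is halved only after the third consecutive epoch without improvement, because the counter must exceed `patience` before a reduction is triggered.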

In [None]:
def train_model(model, train_loader, val_loader, device, epochs=10, lr=0.001):
    """
    Trains a given model using the provided training and validation data loaders.
    Args:
        model (torch.nn.Module): The model to be trained.
        train_loader (torch.utils.data.DataLoader): DataLoader for the training dataset.
        val_loader (torch.utils.data.DataLoader): DataLoader for the validation dataset.
        device (torch.device): The device (CPU or GPU) to perform training on.
        epochs (int, optional): Number of epochs to train the model. Default is 10.
        lr (float, optional): Learning rate for the optimizer. Default is 0.001.
    Returns:
        tuple: A tuple containing two lists:
            - train_losses (list of float): List of average training losses for each epoch.
            - val_losses (list of float): List of average validation losses for each epoch.
    """
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=2)
    
    train_losses = []
    val_losses = []
    best_val_loss = float('inf')
    
    for epoch in range(epochs):
        # Training phase
        model.train()
        total_loss = 0
        batch_count = 0
        
        for batch_idx, batch in enumerate(train_loader):
            users = batch["users"].to(device)
            movies = batch["movies"].to(device)
            ratings = batch["ratings"].to(device)
            
            optimizer.zero_grad()
            predictions = model(users, movies)
            loss = criterion(predictions * 5.0, ratings)  # Scale predictions to 0-5 range
            
            loss.backward()
            optimizer.step()
            
            total_loss += loss.item()
            batch_count += 1
            
            if batch_idx % 100 == 0:
                avg_loss = total_loss / batch_count
                sys.stderr.write(f"\rEpoch {epoch+1}/{epochs} | Batch {batch_idx}/{len(train_loader)} | Avg Loss: {avg_loss:.6f}")
                sys.stderr.flush()
        
        avg_train_loss = total_loss / batch_count
        train_losses.append(avg_train_loss)
        
        # Validation phase
        model.eval()
        val_loss = 0
        with torch.no_grad():
            for batch in val_loader:
                users = batch["users"].to(device)
                movies = batch["movies"].to(device)
                ratings = batch["ratings"].to(device)
                
                predictions = model(users, movies)
                loss = criterion(predictions * 5.0, ratings)
                val_loss += loss.item()
        
        avg_val_loss = val_loss / len(val_loader)
        val_losses.append(avg_val_loss)
        
        # Learning rate scheduling
        scheduler.step(avg_val_loss)
        
        print(f"\nEpoch {epoch+1}/{epochs} - Train Loss: {avg_train_loss:.6f} - Val Loss: {avg_val_loss:.6f}")
        
        # Save best model
        if avg_val_loss < best_val_loss:
            best_val_loss = avg_val_loss
            torch.save(model.state_dict(), 'best_model.pth')
    
    return train_losses, val_losses

## Calculate NDCG Function

The `calculate_ndcg` function computes the Normalized Discounted Cumulative Gain (NDCG) at a specified rank `k` for a list of true and predicted ratings. NDCG is a measure of ranking quality that accounts for the position of relevant items in the predicted ranking.

### Arguments

- `true_ratings` (list of float): List of true ratings.
- `predicted_ratings` (list of float): List of predicted ratings.
- `k` (int, optional): Number of items to consider for the calculation. Default is 10.

### Returns

- `ndcg` (float): The NDCG score at rank `k`.

### Description

1. **Sort Predictions**:
   - Sorts the predicted ratings in descending order and retrieves the indices of the top `k` items.

2. **Calculate DCG (Discounted Cumulative Gain)**:
   - Initializes `dcg` to 0.
   - Iterates over the top `k` indices of the sorted predictions.
   - For each index, retrieves the corresponding true rating (`rel`).
   - Updates `dcg` using the formula: \((2^{\text{rel}} - 1) / \log_2(i + 2)\), where `i` is the position in the sorted list (starting from 0).

3. **Calculate IDCG (Ideal Discounted Cumulative Gain)**:
   - Sorts the true ratings in descending order and retrieves the indices of the top `k` items.
   - Initializes `idcg` to 0.
   - Iterates over the top `k` indices of the sorted true ratings.
   - For each index, retrieves the corresponding true rating (`rel`).
   - Updates `idcg` using the same formula as for `dcg`.

4. **Calculate NDCG**:
   - Computes the NDCG score as the ratio of `dcg` to `idcg`.
   - If `idcg` is greater than 0, returns the computed NDCG score; otherwise, returns 0.

This function provides a way to evaluate the quality of the predicted rankings by comparing them to the ideal rankings based on true ratings.
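
To make the steps above concrete, here is a small worked example in plain Python. It is a sketch that follows the same gain and discount formula but uses `sorted` from the standard library instead of NumPy; `ndcg_at_k` is a hypothetical name and the ratings are made-up values.

```python
import math


def ndcg_at_k(true_ratings, predicted_ratings, k=10):
    """NDCG@k with exponential gain, following the steps described above."""
    def dcg(order):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum((2 ** true_ratings[idx] - 1) / math.log2(i + 2)
                   for i, idx in enumerate(order))

    n = len(true_ratings)
    # Indices of the top k items, ranked by predicted and by true rating.
    by_prediction = sorted(range(n), key=lambda i: predicted_ratings[i], reverse=True)[:k]
    by_truth = sorted(range(n), key=lambda i: true_ratings[i], reverse=True)[:k]
    ideal = dcg(by_truth)
    return dcg(by_prediction) / ideal if ideal > 0 else 0.0


true = [5.0, 3.0, 4.0]
perfect = ndcg_at_k(true, [4.8, 2.9, 4.1])  # predicted order matches the true order
swapped = ndcg_at_k(true, [2.9, 4.8, 4.1])  # best item ranked last
print(perfect, swapped)
```

When the predicted order matches the true order the score is 1; ranking the highest-rated item last lowers it, because the largest gain receives the largest discount.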

In [17]:
def calculate_ndcg(true_ratings, predicted_ratings, k=10):
    # Calculate NDCG@k for a list of predictions
    
    # Args:
    #     true_ratings: List of true ratings
    #     predicted_ratings: List of predicted ratings
    #     k: Number of items to consider
    # Sort predictions and get top k indices
    top_k_indices = np.argsort(predicted_ratings)[-k:][::-1]
    
    # Get DCG
    dcg = 0
    for i, idx in enumerate(top_k_indices):
        rel = true_ratings[idx]
        dcg += (2**rel - 1) / np.log2(i + 2)  # i+2 because i starts from 0
    
    # Get IDCG (sort true ratings in descending order)
    ideal_order = np.argsort(true_ratings)[-k:][::-1]
    idcg = 0
    for i, idx in enumerate(ideal_order):
        rel = true_ratings[idx]
        idcg += (2**rel - 1) / np.log2(i + 2)
    
    # Calculate NDCG
    ndcg = dcg / idcg if idcg > 0 else 0
    return ndcg

## Calculate Metrics Function

The `calculate_metrics` function evaluates a trained model on a validation dataset and computes various evaluation metrics. These metrics help in understanding the performance of the model in terms of prediction accuracy and ranking quality.

### Arguments

- `model` (torch.nn.Module): The trained model to evaluate.
- `val_loader` (torch.utils.data.DataLoader): DataLoader for the validation dataset.
- `device` (torch.device): The device (CPU or GPU) to perform computations on.

### Returns

- `tuple`: A tuple containing the following metrics:
  - `rmse` (float): Root Mean Squared Error of the predictions.
  - `avg_precision` (float): Average precision across all users.
  - `avg_recall` (float): Average recall across all users.
  - `f_measure` (float): F-measure (harmonic mean of precision and recall).
  - `avg_ndcg` (float): Average Normalized Discounted Cumulative Gain across all users.

### Description

1. **Initialization**:
   - Sets the model to evaluation mode.
   - Initializes lists to store predictions and actual ratings.
   - Initializes dictionaries to store user-specific predictions and actual items.

2. **Prediction Loop**:
   - Iterates over the validation data loader.
   - For each batch, moves the data (users, movies, ratings) to the specified device.
   - Performs a forward pass to get predictions and scales them back to the 0-5 range.
   - Extends the predictions and actuals lists with the batch results.
   - Stores both predicted and true ratings for each user-movie pair.

3. **Calculate RMSE**:
   - Computes the Root Mean Squared Error (RMSE) between the actual and predicted ratings.

4. **Calculate Precision and Recall**:
   - Initializes lists to store precision and recall for each user.
   - For each user, sorts that user's validation items by predicted rating.
   - Extracts the movie IDs in ranked order.
   - Calculates precision and recall for the top `k` = 10 items, treating every movie the user rated in the validation set as relevant.
   - Computes the average precision and recall across all users.

5. **Calculate F-measure**:
   - Computes the F-measure as the harmonic mean of average precision and recall.

6. **Calculate NDCG**:
   - Initializes a list to store NDCG scores for each user.
   - For each user, sorts the recommendations by predicted rating.
   - Aligns the predicted and true ratings for the top `k` recommendations.
   - Computes the NDCG score for the user.
   - Computes the average NDCG score across all users.

7. **Return**:
   - Returns the computed metrics: RMSE, average precision, average recall, F-measure, and average NDCG.
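
The two scalar aggregations in this workflow, RMSE and the F-measure, can be checked by hand on a few values. The sketch below uses only the standard library; the helper names `rmse` and `f_measure` and the numbers are illustrative assumptions, and the notebook itself relies on scikit-learn for RMSE.

```python
import math


def rmse(actuals, predictions):
    """Root mean squared error over paired ratings."""
    squared = [(a - p) ** 2 for a, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def f_measure(avg_precision, avg_recall):
    """Harmonic mean of averaged precision and recall; 0 when both are 0."""
    total = avg_precision + avg_recall
    return 2 * avg_precision * avg_recall / total if total > 0 else 0.0


error = rmse([4.0, 3.0, 5.0], [3.0, 3.0, 4.0])  # squared errors are 1, 0, 1
score = f_measure(0.5, 0.25)
print(error, score)
```

Note that this F-measure is computed from the averaged precision and recall, as in `calculate_metrics`, rather than by averaging a per-user F-score.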

In [None]:
def calculate_metrics(model, val_loader, device):
    """
    Calculate various evaluation metrics for a given model on a validation dataset.
    Args:
        model (torch.nn.Module): The trained model to evaluate.
        val_loader (torch.utils.data.DataLoader): DataLoader for the validation dataset.
        device (torch.device): The device (CPU or GPU) to perform computations on.
    Returns:
        tuple: A tuple containing the following metrics:
            - rmse (float): Root Mean Squared Error of the predictions.
            - avg_precision (float): Average precision across all users.
            - avg_recall (float): Average recall across all users.
            - f_measure (float): F-measure (harmonic mean of precision and recall).
            - avg_ndcg (float): Average Normalized Discounted Cumulative Gain across all users.
    """
    model.eval()
    predictions = []
    actuals = []
    
    # Store user-specific predictions and actual items
    user_test_items = defaultdict(set)
    user_recommendations = defaultdict(list)
    
    with torch.no_grad():
        for batch in val_loader:
            users = batch["users"].to(device)
            movies = batch["movies"].to(device)
            ratings = batch["ratings"].to(device)
            
            output = model(users, movies)
            scaled_output = output * 5.0  # Scale back to 0-5 range
            
            predictions.extend(scaled_output.cpu().numpy())
            actuals.extend(ratings.cpu().numpy())
            
            # Store both predicted and true ratings
            for user, movie, pred, true in zip(users.cpu().numpy(), 
                                             movies.cpu().numpy(),
                                             scaled_output.cpu().numpy(),
                                             ratings.cpu().numpy()):
                user_test_items[user].add(movie)
                user_recommendations[user].append((movie, pred, true))
    
    # Calculate RMSE
    rmse = root_mean_squared_error(actuals, predictions)
    
    # Calculate Precision and Recall for each user
    precisions = []
    recalls = []
    
    for user_id in user_test_items.keys():
        # Sort recommendations by predicted rating
        user_recs = user_recommendations[user_id]
        user_recs.sort(key=lambda x: x[1], reverse=True)
        recommended_items = [item[0] for item in user_recs]  # Get just the movie IDs
        
        # Calculate precision and recall
        p, r = calculate_precision_recall(
            user_id=user_id,
            test_items=user_test_items[user_id],
            recommended_items=recommended_items,
            k=10
        )
        precisions.append(p)
        recalls.append(r)
    
    avg_precision = np.mean(precisions)
    avg_recall = np.mean(recalls)
    
    # Calculate F-measure
    f_measure = 2 * avg_precision * avg_recall / (avg_precision + avg_recall) if (avg_precision + avg_recall) > 0 else 0
    
    ndcg_scores = []
    for user_id in user_test_items.keys():
        user_recs = user_recommendations[user_id]
        # Sort by predicted ratings
        user_recs.sort(key=lambda x: x[1], reverse=True)
        
        # Get aligned predicted and true ratings
        pred_ratings = np.array([pred for _, pred, _ in user_recs[:10]])
        true_ratings = np.array([true for _, _, true in user_recs[:10]])
        
        ndcg = calculate_ndcg(true_ratings, pred_ratings, k=10)
        ndcg_scores.append(ndcg)
    
    avg_ndcg = np.mean(ndcg_scores)
    
    return rmse, avg_precision, avg_recall, f_measure, avg_ndcg

## Calculate Precision and Recall Function

The `calculate_precision_recall` function computes the precision and recall for a given user based on the recommended items and the actual items in the test set. These metrics help in evaluating the effectiveness of the recommendation system.

### Arguments

- `user_id` (int): The user ID. It is accepted for reference only and is not used in the calculation.
- `test_items` (set): Set of items in the test set for this user.
- `recommended_items` (list): Ranked list of recommended items; only the first `k` are used.
- `k` (int, optional): Number of recommendations to consider. Default is 10.

### Returns

- `precision` (float): The precision of the recommendations.
- `recall` (float): The recall of the recommendations.

### Description

1. **Select Top-k Recommendations**:
   - Takes only the first `k` items from the list of recommended items.

2. **Count Hits**:
   - Counts how many of the recommended items are present in the test set by computing the intersection of the recommended items and the test items.

3. **Calculate Precision**:
   - Computes precision as the ratio of hits to `k`. Precision measures the proportion of recommended items that are relevant.

4. **Calculate Recall**:
   - Computes recall as the ratio of hits to the total number of test items. Recall measures the proportion of relevant items that are recommended.

5. **Return**:
   - Returns the computed precision and recall values.

This function provides a way to evaluate the recommendation system's performance by measuring how many of the recommended items are relevant (precision) and how many of the relevant items are recommended (recall).
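
A short worked example may help here. The sketch below repeats the same arithmetic on made-up item IDs; `precision_recall_at_k` is a hypothetical name, and it omits the `user_id` argument because that argument does not take part in the calculation.

```python
def precision_recall_at_k(test_items, recommended_items, k=10):
    """Precision@k and recall@k for one user's ranked recommendations."""
    top_k = recommended_items[:k]
    # A hit is a recommended item that also appears in the test set.
    hits = len(set(top_k) & set(test_items))
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(test_items) if test_items else 0.0
    return precision, recall


# Four relevant items; two of them appear among the five recommendations.
p, r = precision_recall_at_k({10, 20, 30, 40}, [10, 99, 20, 98, 97], k=5)
print(p, r)
```

Because precision divides by `k` rather than by the number of items actually recommended, a user with fewer than `k` candidate items cannot reach a precision of 1.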

In [19]:
def calculate_precision_recall(user_id, test_items, recommended_items, k=10):
    # Args:
    #     user_id: The user ID
    #     test_items: Set of items in test set for this user
    #     recommended_items: List of recommended items (top-10)
    #     k: Number of recommendations to consider (10)

    # Take only first k recommendations
    recommended_k = recommended_items[:k]
    
    # Count how many recommended items are in test set
    hits = len(set(recommended_k) & set(test_items))
    
    # Precision = hits / k
    precision = hits / k if k > 0 else 0
    
    # Recall = hits / total test items
    recall = hits / len(test_items) if test_items else 0
    
    return precision, recall

## Recommend Movies Function

The `recommend_movies` function generates movie recommendations for a given user using a trained model. It predicts ratings for all movies and selects the top `k` recommendations.

### Arguments

- `model` (torch.nn.Module): Trained model.
- `user_id` (int): User ID (encoded).
- `movie_ids` (list of int): List of encoded movie IDs.
- `df_movies` (pandas.DataFrame): Original movies dataframe.
- `device` (torch.device): The device (CPU or GPU) to perform computations on.
- `movie_encoder` (LabelEncoder): LabelEncoder used for movie IDs.
- `top_k` (int, optional): Number of recommendations to return. Default is 10.

### Returns

- `recommended_movies` (list of dict): List of top `k` recommended movies with their titles and predicted ratings.

### Description

1. **Set Model to Evaluation Mode**:
   - Sets the model to evaluation mode, which disables dropout and makes the batch normalization layers use their running statistics instead of per-batch statistics.

2. **Create Tensors for Prediction**:
   - Creates tensors for the user and movie IDs. The user tensor contains the user ID repeated for each movie ID.

3. **Predict Ratings**:
   - Performs a forward pass through the model to get predicted ratings for all movies.
   - Scales the predicted ratings to the 0-5 range.

4. **Create Movie Recommendations**:
   - Zips the movie IDs with their predicted ratings.
   - Sorts the movies by predicted rating in descending order.
   - Selects the top `k` movies.

5. **Get Movie Details**:
   - Converts the encoded movie IDs back to their original IDs using the `movie_encoder`.
   - Retrieves the movie details (title) from the `df_movies` dataframe.
   - Appends the movie title and predicted rating to the list of recommended movies.

6. **Return**:
   - Returns the list of top `k` recommended movies with their titles and predicted ratings.

This function provides a way to generate personalized movie recommendations for a user based on the trained model's predictions.
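
The selection step, pairing each movie with its predicted rating and keeping the best `k`, can be sketched on its own. The notebook sorts the full list; the variant below instead uses `heapq.nlargest` from the standard library, which is a swapped-in alternative that avoids sorting every candidate. The name `top_k_by_score` and the IDs and scores are hypothetical.

```python
import heapq


def top_k_by_score(item_ids, scores, k=10):
    """Return the k (item_id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, zip(item_ids, scores), key=lambda pair: pair[1])


# Made-up encoded movie IDs and predicted ratings.
top = top_k_by_score([101, 102, 103, 104], [3.2, 4.7, 4.1, 2.5], k=2)
print(top)
```

For a catalogue of a few thousand movies either approach is fast; the heap-based version mainly matters when the candidate list is much larger than `k`.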

In [20]:
def recommend_movies(model, user_id, movie_ids, df_movies, device, movie_encoder, top_k=10):
    # Recommend movies for a user
    # Args:
    #     model: Trained model
    #     user_id: User ID (encoded)
    #     movie_ids: List of encoded movie IDs
    #     df_movies: Original movies dataframe
    #     device: torch device
    #     movie_encoder: LabelEncoder used for movie IDs
    #     top_k: Number of recommendations to return

    model.eval()
    
    # Create tensors for prediction
    user_tensor = torch.tensor([user_id] * len(movie_ids), dtype=torch.long).to(device)
    movie_tensor = torch.tensor(movie_ids, dtype=torch.long).to(device)
    
    with torch.no_grad():
        predictions = model(user_tensor, movie_tensor)
        predictions = predictions.cpu().numpy() * 5.0  # Scale to 0-5 range
    
    # Create movie recommendations
    movie_preds = list(zip(movie_ids, predictions))
    movie_preds.sort(key=lambda x: x[1], reverse=True)
    top_movies = movie_preds[:top_k]
    
    # Get movie details
    recommended_movies = []
    for encoded_movie_id, pred_rating in top_movies:
        # Convert encoded ID back to original movie ID
        original_movie_id = movie_encoder.inverse_transform([encoded_movie_id])[0]
        movie_info = df_movies[df_movies['movieId'] == original_movie_id]
        if not movie_info.empty:
            movie_info = movie_info.iloc[0]
            recommended_movies.append({
                'title': movie_info['title'],
                'predicted_rating': pred_rating
            })
    
    return recommended_movies

## stdout_to_file Context Manager

The `stdout_to_file` function is a context manager that redirects the standard output (stdout) and standard error (stderr) to both a file and the console. This is useful for logging output to a file while still displaying it in the console.

### Arguments

- `filename` (str): The name of the file to which stdout and stderr will be redirected.

### Description

1. **MultiOutputStream Class**:
   - A helper class that takes multiple output streams and writes text to all of them.
   - `__init__(self, *streams)`: Initializes the class with a list of streams.
   - `write(self, text)`: Writes the given text to all streams and flushes them.
   - `flush(self)`: Flushes all streams.

2. **Context Manager**:
   - Opens the specified file in write mode.
   - Backs up the current stdout and stderr.
   - Creates an instance of `MultiOutputStream` with the file and the original stdout.
   - Redirects stdout and stderr to the `MultiOutputStream` instance.
   - Yields control back to the caller.
   - Restores the original stdout and stderr after the context block is exited.
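
The same tee pattern can be tried in isolation. The sketch below is a reduced variant that redirects only stdout and writes to an in-memory `io.StringIO` buffer instead of a file, so it leaves nothing on disk; `Tee` and `tee_stdout` are hypothetical names, not part of the notebook.

```python
import io
import sys
from contextlib import contextmanager


class Tee:
    """Write-only stream that forwards text to several underlying streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextmanager
def tee_stdout(extra_stream):
    """Temporarily send stdout to both the console and `extra_stream`."""
    original = sys.stdout
    sys.stdout = Tee(extra_stream, original)
    try:
        yield
    finally:
        sys.stdout = original  # restored even if the block raises


buffer = io.StringIO()
with tee_stdout(buffer):
    print("captured")
captured = buffer.getvalue()
```

Restoring the stream in a `finally` clause matters: without it, an exception inside the block would leave `sys.stdout` pointing at the tee for the rest of the session.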

### Usage

This context manager can be used to capture and log output from a block of code:

```python
with stdout_to_file('output.log'):
    print("This will be logged to both the console and the file.")
```

In [None]:
@contextmanager
def stdout_to_file(filename):
    # Context manager to redirect stdout and stderr to both file and console
    class MultiOutputStream:
        def __init__(self, *streams):
            self.streams = streams

        def write(self, text):
            for stream in self.streams:
                stream.write(text)
                stream.flush()

        def flush(self):
            # Flush every underlying stream
            for stream in self.streams:
                stream.flush()

    with open(filename, 'w') as file:
        stdout_backup = sys.stdout
        stderr_backup = sys.stderr
        multi_stream = MultiOutputStream(file, sys.stdout)
        sys.stdout = multi_stream
        sys.stderr = multi_stream
        try:
            yield
        finally:
            sys.stdout = stdout_backup
            sys.stderr = stderr_backup

## Main Function

The `main` function orchestrates the entire process of loading data, training the model, evaluating its performance, and generating movie recommendations. It also redirects the output to both a file and the console using the `stdout_to_file` context manager.

### Description

1. **Redirect Output**:
   - Uses the `stdout_to_file` context manager to redirect stdout and stderr to `small_data_recommendation.txt` and the console.

2. **Set Device**:
   - Determines whether to use a GPU (if available) or CPU for computations and prints the selected device.

3. **Load and Preprocess Data**:
   - Loads the ratings and movies data from CSV files.
   - Encodes user and movie IDs using `LabelEncoder`.

4. **Split Data**:
   - Splits the ratings data into training and validation sets using an 80-20 split.

5. **Create Datasets**:
   - Creates `MovieLensDataset` instances for the training and validation sets.

6. **Create DataLoaders**:
   - Creates `DataLoader` instances for the training and validation datasets with a batch size of 64.

7. **Initialize Model**:
   - Initializes the `DeepRecommenderSystem` model with the number of unique users and movies.
   - Moves the model to the selected device.

8. **Train Model**:
   - Trains the model using the `train_model` function and prints the training progress.
   - Plots and saves the training and validation loss history.

9. **Calculate Metrics**:
   - Evaluates the model on the validation set using the `calculate_metrics` function.
   - Prints the calculated metrics: RMSE, Precision@10, Recall@10, F-measure, and NDCG@10.

10. **Generate Sample Recommendations**:
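    - Generates movie recommendations for a sample user using the `recommend_movies` function. The sample user is encoded user ID 1, meaning the index assigned by `LabelEncoder`, which is not necessarily the original `userId`.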
    - Prints the top 10 recommended movies with their predicted ratings.

This function provides a comprehensive workflow for training and evaluating a deep learning-based recommender system, as well as generating personalized movie recommendations.
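
The ID-encoding step exists because embedding layers are indexed by contiguous integers starting at 0, while raw dataset IDs can have gaps. The sketch below is a plain-Python stand-in for that idea, not scikit-learn's implementation: it maps each distinct ID to its position in sorted order and shows the reverse lookup used when titles are retrieved. `fit_label_encoding` is a hypothetical name and the IDs are made-up values.

```python
def fit_label_encoding(raw_ids):
    """Map each distinct raw ID to a contiguous index, in sorted order."""
    classes = sorted(set(raw_ids))
    to_index = {raw: idx for idx, raw in enumerate(classes)}
    return classes, to_index


# Made-up raw movie IDs with gaps and repeats.
raw_movie_ids = [318, 1, 2571, 1, 318]
classes, to_index = fit_label_encoding(raw_movie_ids)

encoded = [to_index[m] for m in raw_movie_ids]  # indices usable by an embedding
decoded = [classes[i] for i in encoded]         # inverse lookup back to raw IDs
print(encoded, decoded)
```

The number of distinct classes is what determines the size of each embedding table, which is why `main` passes `len(user_encoder.classes_)` and `len(movie_encoder.classes_)` to the model.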

In [22]:
def main():
    with stdout_to_file('small_data_recommendation.txt'):
        # Set device
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        print(f"Using device: {device}")
        
        # Load and preprocess data
        print("Loading data...")
        ratings_df = pd.read_csv("databases/ml-latest-small/ratings.csv")
        movies_df = pd.read_csv("databases/ml-latest-small/movies.csv")
        
        # Encode user and movie IDs
        user_encoder = LabelEncoder()
        movie_encoder = LabelEncoder()
        
        ratings_df['userId'] = user_encoder.fit_transform(ratings_df['userId'])
        ratings_df['movieId'] = movie_encoder.fit_transform(ratings_df['movieId'])
        
        # Split data
        train_df, val_df = train_test_split(ratings_df, test_size=0.2, random_state=42)
        
        # Create datasets
        train_dataset = MovieLensDataset(
            users=train_df.userId.values,
            movies=train_df.movieId.values,
            ratings=train_df.rating.values
        )
        
        val_dataset = MovieLensDataset(
            users=val_df.userId.values,
            movies=val_df.movieId.values,
            ratings=val_df.rating.values
        )
        
        # Create dataloaders
        train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
        val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)
        
        # Initialize model
        model = DeepRecommenderSystem(
            num_users=len(user_encoder.classes_),
            num_movies=len(movie_encoder.classes_)
        ).to(device)
        
        # Train model
        print("\nStarting training...")
        train_losses, val_losses = train_model(model, train_loader, val_loader, device)
        
        # Plot training history
        plt.figure(figsize=(10, 6))
        plt.plot(train_losses, label='Training Loss')
        plt.plot(val_losses, label='Validation Loss')
        plt.xlabel('Epoch')
        plt.ylabel('Loss')
        plt.title('Training History')
        plt.legend()
        plt.savefig('training_history.png')
        plt.close()
        
        # Calculate metrics
        print("\nCalculating metrics...")
        rmse, precision, recall, f_measure, ndcg = calculate_metrics(model, val_loader, device)
        print(f"RMSE: {rmse:.4f}")
        print(f"Precision@10: {precision:.4f}")
        print(f"Recall@10: {recall:.4f}")
        print(f"F-measure: {f_measure:.4f}")
        print(f"NDCG@10: {ndcg:.4f}")
        
        print("\nGenerating sample recommendations...")
        sample_user_id = 1
        movie_ids = ratings_df['movieId'].unique()
        
        # Get recommendations once
        recommendations = recommend_movies(model, sample_user_id, movie_ids, movies_df, device, movie_encoder)
        
        # Print recommendations once in a clear format
        print(f"\nTop 10 recommended movies for user {sample_user_id}:")
        for i, movie in enumerate(recommendations, 1):
            print(f"{i}. {movie['title']} - Predicted rating: {movie['predicted_rating']:.2f}")

In [23]:
if __name__ == "__main__":
    main()

Using device: cpu
Loading data...

Starting training...
Epoch 1/10 | Batch 1200/1261 | Avg Loss: 1.183118
Epoch 1/10 - Train Loss: 1.172236 - Val Loss: 0.939145
Epoch 2/10 | Batch 1200/1261 | Avg Loss: 0.911810
Epoch 2/10 - Train Loss: 0.908878 - Val Loss: 0.887655
Epoch 3/10 | Batch 1200/1261 | Avg Loss: 0.840216
Epoch 3/10 - Train Loss: 0.839532 - Val Loss: 0.846069
Epoch 4/10 | Batch 1200/1261 | Avg Loss: 0.780852
Epoch 4/10 - Train Loss: 0.779829 - Val Loss: 0.823406
Epoch 5/10 | Batch 1200/1261 | Avg Loss: 0.730021
Epoch 5/10 - Train Loss: 0.731862 - Val Loss: 0.797402
Epoch 6/10 | Batch 1200/1261 | Avg Loss: 0.686446
Epoch 6/10 - Train Loss: 0.687753 - Val Loss: 0.795351
Epoch 7/10 | Batch 1200/1261 | Avg Loss: 0.654796
Epoch 7/10 - Train Loss: 0.657188 - Val Loss: 0.797863
Epoch 8/10 | Batch 1200/1261 | Avg Loss: 0.625147
Epoch 8/10 - Train Loss: 0.625411 - Val Loss: 0.795235
Epoch 9/10 | Batch 1200/1261 | Avg Loss: 0.593791
Epoch 9/10 - Train Loss: 0.595570 - Val Loss: 0.792833