
# Neural Collaborative Filtering (NCF) for Binary Interaction Model Training



This notebook implements a Neural Collaborative Filtering (NCF) model for recommending products based on binary user interactions.

## Overview:
 1. Data loading and preprocessing
 2. Model definition (NCF architecture)
 3. Training with early stopping
 4. Evaluation and metrics
 5. Model saving for inference

## Setup and Imports

In [18]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [20]:
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import roc_auc_score, accuracy_score, precision_score, recall_score, f1_score
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm
import os
import pickle

# Set random seed for reproducibility
seed = 42
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
np.random.seed(seed)

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Define model save directory - adjust this path according to your needs
MODEL_DIR = "./models"
os.makedirs(MODEL_DIR, exist_ok=True)

Using device: cuda


## 1. Data Loading and Preprocessing

 In this section, we:
 1. Load the reviews dataset
 2. Convert ratings to binary interactions
 3. Encode user and item IDs
 4. Split data into training and test sets

### Load and Preprocess Data

In [21]:
def load_and_preprocess_data(file_path, test_size=0.2, threshold=3):
    """
    Load and preprocess the reviews data for the binary NCF model.
    Ratings >= threshold are considered positive interactions (1),
    ratings < threshold are considered negative interactions (0).
    """
    print("Loading data...")
    df = pd.read_csv(file_path, keep_default_na=False)

    # Keep only necessary columns
    df = df[['reviewerID', 'asin', 'overall']]

    # Convert ratings to binary interaction (0 or 1)
    df['interaction'] = (df['overall'] >= threshold).astype(int)

    # Display basic statistics
    print(f"Total records: {len(df)}")
    print(f"Unique users: {df['reviewerID'].nunique()}")
    print(f"Unique items: {df['asin'].nunique()}")
    print(f"Rating distribution (original):\n{df['overall'].value_counts().sort_index()}")
    print(f"Interaction distribution (binary):\n{df['interaction'].value_counts()}")

    # Encode user and item IDs
    user_encoder = LabelEncoder()
    item_encoder = LabelEncoder()

    df['user_idx'] = user_encoder.fit_transform(df['reviewerID'])
    df['item_idx'] = item_encoder.fit_transform(df['asin'])

    # Split the data using stratified sampling based on interactions
    train_df, test_df = train_test_split(
        df, test_size=test_size,
        stratify=df['interaction'],
        random_state=seed
    )

    print(f"Training set size: {len(train_df)}")
    print(f"Test set size: {len(test_df)}")

    # Get the number of users and items for embedding layers
    n_users = df['user_idx'].nunique()
    n_items = df['item_idx'].nunique()

    return train_df, test_df, n_users, n_items, user_encoder, item_encoder

### Custom Dataset for binary interaction

In [22]:
class InteractionDataset(Dataset):
    """
    Custom PyTorch Dataset for binary interaction data.
    """
    def __init__(self, df):
        self.user_ids = torch.tensor(df['user_idx'].values, dtype=torch.long)
        self.item_ids = torch.tensor(df['item_idx'].values, dtype=torch.long)
        self.interactions = torch.tensor(df['interaction'].values, dtype=torch.float)

    def __len__(self):
        return len(self.interactions)

    def __getitem__(self, idx):
        return {
            'user_id': self.user_ids[idx],
            'item_id': self.item_ids[idx],
            'interaction': self.interactions[idx]
        }

## 1. Model Definition

The Neural Collaborative Filtering (NCF) model combines:
 1. Generalized Matrix Factorization (GMF)
 2. Multi-Layer Perceptron (MLP)

This architecture captures both linear and non-linear interactions between users and items.

In [23]:
class NCF(nn.Module):
    """
    Neural Collaborative Filtering (NCF) model for binary interaction prediction.
    """
    def __init__(self, n_users, n_items, factors=32, mlp_layers=[64, 32, 16], dropout=0.2):
        super(NCF, self).__init__()

        # GMF part
        self.user_gmf_embedding = nn.Embedding(n_users, factors)
        self.item_gmf_embedding = nn.Embedding(n_items, factors)

        # MLP part
        self.user_mlp_embedding = nn.Embedding(n_users, factors)
        self.item_mlp_embedding = nn.Embedding(n_items, factors)

        # MLP layers
        self.mlp_layers = nn.ModuleList()
        input_size = 2 * factors

        for next_size in mlp_layers:
            self.mlp_layers.append(nn.Linear(input_size, next_size))
            self.mlp_layers.append(nn.ReLU())
            self.mlp_layers.append(nn.Dropout(dropout))
            input_size = next_size

        # Output layer
        self.output_layer = nn.Linear(factors + mlp_layers[-1], 1)
        self.sigmoid = nn.Sigmoid()

        # Initialize weights
        self._init_weights()

    def _init_weights(self):
        """Initialize weights using Xavier initialization."""
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Embedding):
                nn.init.normal_(m.weight, mean=0, std=0.01)

    def forward(self, user_id, item_id):
        # GMF part
        user_gmf = self.user_gmf_embedding(user_id)
        item_gmf = self.item_gmf_embedding(item_id)
        gmf_vector = user_gmf * item_gmf

        # MLP part
        user_mlp = self.user_mlp_embedding(user_id)
        item_mlp = self.item_mlp_embedding(item_id)
        mlp_vector = torch.cat([user_mlp, item_mlp], dim=1)

        for layer in self.mlp_layers:
            mlp_vector = layer(mlp_vector)

        # Concatenate GMF and MLP parts
        concat_vector = torch.cat([gmf_vector, mlp_vector], dim=1)

        # Output layer
        output = self.output_layer(concat_vector)
        output = self.sigmoid(output)

        return output.squeeze()

## 3. Training Function with Early Stopping

We implement a training function with early stopping based on validation AUC.

This prevents overfitting and ensures the model generalizes well.

In [24]:
def train_model(model, train_loader, val_loader, criterion, optimizer, epochs=20, patience=5, scheduler=None, model_save_path=None):
    """
    Train the NCF model with early stopping based on AUC.
    """
    # Move model to device
    model = model.to(device)

    # Initialize variables for training
    train_losses = []
    val_losses = []
    val_aucs = []
    best_val_auc = 0
    best_model_state = None
    patience_counter = 0

    for epoch in range(epochs):
        # Training phase
        model.train()
        running_loss = 0.0

        train_bar = tqdm(train_loader, desc=f"Epoch {epoch+1}/{epochs} [Train]")
        for batch in train_bar:
            user_ids = batch['user_id'].to(device)
            item_ids = batch['item_id'].to(device)
            interactions = batch['interaction'].to(device)

            # Zero gradients
            optimizer.zero_grad()

            # Forward pass
            predictions = model(user_ids, item_ids)

            # Calculate loss
            loss = criterion(predictions, interactions)

            # Backward pass and optimize
            loss.backward()
            optimizer.step()

            # Update statistics
            running_loss += loss.item() * len(interactions)
            train_bar.set_postfix({'loss': loss.item()})

        epoch_train_loss = running_loss / len(train_loader.dataset)
        train_losses.append(epoch_train_loss)

        # Validation phase
        model.eval()
        val_loss = 0.0
        all_preds = []
        all_labels = []

        with torch.no_grad():
            val_bar = tqdm(val_loader, desc=f"Epoch {epoch+1}/{epochs} [Val]")
            for batch in val_bar:
                user_ids = batch['user_id'].to(device)
                item_ids = batch['item_id'].to(device)
                interactions = batch['interaction'].to(device)

                # Forward pass
                predictions = model(user_ids, item_ids)

                # Calculate loss
                loss = criterion(predictions, interactions)
                val_loss += loss.item() * len(interactions)

                # Store predictions and labels for metrics
                all_preds.extend(predictions.cpu().numpy())
                all_labels.extend(interactions.cpu().numpy())

                val_bar.set_postfix({'val_loss': loss.item()})

        # Calculate epoch validation metrics
        epoch_val_loss = val_loss / len(val_loader.dataset)
        epoch_val_auc = roc_auc_score(all_labels, all_preds)

        val_losses.append(epoch_val_loss)
        val_aucs.append(epoch_val_auc)

        # Update learning rate if scheduler is provided
        if scheduler:
            scheduler.step(epoch_val_auc)

        # Print epoch results
        print(f"Epoch {epoch+1}/{epochs}")
        print(f"Train Loss: {epoch_train_loss:.4f}")
        print(f"Val Loss: {epoch_val_loss:.4f}, Val AUC: {epoch_val_auc:.4f}")

        # Early stopping based on AUC (higher is better)
        if epoch_val_auc > best_val_auc:
            best_val_auc = epoch_val_auc
            best_model_state = model.state_dict().copy()
            patience_counter = 0
        else:
            patience_counter += 1

        if patience_counter >= patience:
            print(f"Early stopping triggered after epoch {epoch+1}")
            break

    # Load the best model
    if best_model_state:
        model.load_state_dict(best_model_state)

        # Save the best model if a path is provided
        if model_save_path:
            os.makedirs(os.path.dirname(model_save_path), exist_ok=True)
            torch.save({
                'model_state_dict': best_model_state,
                'n_users': model.user_gmf_embedding.num_embeddings,
                'n_items': model.item_gmf_embedding.num_embeddings,
                'factors': model.user_gmf_embedding.embedding_dim,
                'mlp_layers': [64, 32, 16],  # Make sure this matches your model architecture
                'dropout': 0.2,              # Make sure this matches your model architecture
                'best_val_auc': best_val_auc
            }, model_save_path)
            print(f"Best model saved to {model_save_path}")

    return model, train_losses, val_losses, val_aucs

## 4. Evaluation Functions

We evaluate our model using:
 1. AUC (Area Under the ROC Curve)
 2. Accuracy, Precision, Recall, and F1 Score
 3. Cold-start item performance

### Model Evaluation

In [25]:
def evaluate_model(model, test_loader, threshold=0.5):
    """
    Evaluate the model on the test set using multiple metrics.
    """
    model.eval()
    all_preds = []
    all_probs = []  # Raw probabilities for AUC
    all_labels = []

    with torch.no_grad():
        for batch in tqdm(test_loader, desc="Evaluating"):
            user_ids = batch['user_id'].to(device)
            item_ids = batch['item_id'].to(device)
            interactions = batch['interaction'].to(device)

            # Forward pass
            predictions = model(user_ids, item_ids)

            # Store raw probabilities for AUC
            all_probs.extend(predictions.cpu().numpy())

            # Convert to binary predictions using threshold
            binary_preds = (predictions >= threshold).float()

            all_preds.extend(binary_preds.cpu().numpy())
            all_labels.extend(interactions.cpu().numpy())

    # Calculate metrics
    auc = roc_auc_score(all_labels, all_probs)
    accuracy = accuracy_score(all_labels, all_preds)
    precision = precision_score(all_labels, all_preds)
    recall = recall_score(all_labels, all_preds)
    f1 = f1_score(all_labels, all_preds)

    print("\nTest Set Metrics:")
    print(f"AUC: {auc:.4f}")
    print(f"Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}")
    print(f"F1 Score: {f1:.4f}")

    return {
        'auc': auc,
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1': f1,
        'predictions': all_probs,
        'labels': all_labels
    }

### Cold Start Evaluation

In [26]:
def evaluate_cold_start_items(model, df, test_df, user_encoder, item_encoder, min_interactions=5):
    """
    Evaluate the model specifically on cold start items from the test set.
    Cold start items are defined as items with minimal user interactions.
    """
    print("\nEvaluating model on cold start items...")

    # Count interactions per item in the full dataset
    item_counts = df['asin'].value_counts()

    # Find cold start items that appear in the test set
    cold_start_items = set(item_counts[item_counts <= min_interactions].index) & set(test_df['asin'].unique())

    if not cold_start_items:
        print(f"No cold start items found with <= {min_interactions} interactions in the test set.")
        return None

    print(f"Found {len(cold_start_items)} cold start items in the test set.")

    # Filter test data to only include cold start items
    cold_start_test_df = test_df[test_df['asin'].isin(cold_start_items)]

    if len(cold_start_test_df) == 0:
        print("No data available for cold start evaluation.")
        return None

    print(f"Number of interactions with cold start items: {len(cold_start_test_df)}")

    # Create dataset and dataloader
    cold_start_dataset = InteractionDataset(cold_start_test_df)
    cold_start_loader = DataLoader(cold_start_dataset, batch_size=1024, shuffle=False)

    # Evaluate
    model.eval()
    all_probs = []
    all_labels = []

    with torch.no_grad():
        for batch in tqdm(cold_start_loader, desc="Evaluating cold start items"):
            user_ids = batch['user_id'].to(device)
            item_ids = batch['item_id'].to(device)
            interactions = batch['interaction'].to(device)

            predictions = model(user_ids, item_ids)

            all_probs.extend(predictions.cpu().numpy())
            all_labels.extend(interactions.cpu().numpy())

    # Calculate AUC for cold start items
    cold_start_auc = roc_auc_score(all_labels, all_probs)
    print(f"Cold Start Items AUC: {cold_start_auc:.4f}")

    return cold_start_auc


## 5. Recommendation Generation

This function generates recommendations for users by finding items they haven't interacted with yet and predicting their likelihood of enjoying them.

In [27]:
def generate_recommendations(model, df, test_df, user_encoder, item_encoder, n_recommendations=5, threshold=0.5):
    """
    Generate n recommendations for users in the test set.
    Returns a dictionary of user IDs to recommended item lists.
    """
    model.eval()

    # Get unique users from test set
    test_users = test_df['reviewerID'].unique()

    # Get all items
    all_items = df['asin'].unique()

    recommendations = {}

    for user in tqdm(test_users[:10], desc="Generating recommendations"):  # Just do 10 users for demo
        user_idx = user_encoder.transform([user])[0]

        # Get items the user hasn't interacted with yet
        user_items = df[df['reviewerID'] == user]['asin'].values
        unseen_items = np.setdiff1d(all_items, user_items)

        if len(unseen_items) == 0:
            print(f"User {user} has interacted with all items!")
            continue

        # Calculate scores for unseen items
        unseen_item_idxs = item_encoder.transform(unseen_items)

        # Process in batches to avoid memory issues
        batch_size = 1024
        all_scores = []

        for i in range(0, len(unseen_item_idxs), batch_size):
            batch_idxs = unseen_item_idxs[i:i+batch_size]
            user_tensor = torch.tensor([user_idx] * len(batch_idxs), dtype=torch.long).to(device)
            item_tensor = torch.tensor(batch_idxs, dtype=torch.long).to(device)

            with torch.no_grad():
                scores = model(user_tensor, item_tensor)
                all_scores.append(scores.cpu().numpy())

        all_scores = np.concatenate(all_scores)

        # Get items that are predicted to be interacted with (binary recommendation)
        positive_idxs = np.where(all_scores >= threshold)[0]

        # If we have more positive predictions than needed, randomly sample
        if len(positive_idxs) > n_recommendations:
            np.random.shuffle(positive_idxs)
            selected_idxs = positive_idxs[:n_recommendations]
        # If we have fewer positive predictions, include some highest scoring items below threshold
        elif len(positive_idxs) < n_recommendations:
            # Find indices of items below threshold
            negative_idxs = np.where(all_scores < threshold)[0]

            # Sort by score (highest first)
            sorted_neg_idxs = negative_idxs[np.argsort(-all_scores[negative_idxs])]

            # Take as many as needed to reach n_recommendations
            needed = n_recommendations - len(positive_idxs)
            additional_idxs = sorted_neg_idxs[:needed]

            # Combine positive and additional items
            selected_idxs = np.concatenate([positive_idxs, additional_idxs])
        else:
            selected_idxs = positive_idxs

        # Get the corresponding items
        recommended_items = unseen_items[selected_idxs]

        # Store recommendations
        recommendations[user] = recommended_items.tolist()

    return recommendations

## 6. Visualization Functions

We plot the training and validation curves to monitor model performance.


In [28]:
def plot_training_curves(train_losses, val_losses, val_aucs):
    """
    Plot training and validation curves.
    """
    plt.figure(figsize=(15, 5))

    # Loss curves
    plt.subplot(1, 2, 1)
    plt.plot(train_losses, label='Train Loss')
    plt.plot(val_losses, label='Validation Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Training and Validation Loss')
    plt.legend()

    # AUC curve
    plt.subplot(1, 2, 2)
    plt.plot(val_aucs, label='Validation AUC')
    plt.xlabel('Epoch')
    plt.ylabel('AUC')
    plt.title('Validation AUC')
    plt.legend()

    plt.tight_layout()
    plt.savefig(os.path.join(MODEL_DIR, "training_curves.png"))
    plt.close()
    print("Training curves saved to models/training_curves.png")


## 7. Save Encoders

This function saves the user and item encoders for later use during inference.

In [29]:
def save_encoders(user_encoder, item_encoder, save_dir):
    """
    Save the user and item encoders to files for later use in inference.
    """
    encoder_path = os.path.join(save_dir, 'encoders.pkl')
    with open(encoder_path, 'wb') as f:
        pickle.dump({
            'user_encoder': user_encoder,
            'item_encoder': item_encoder
        }, f)
    print(f"Encoders saved to {encoder_path}")

## 8. Main Training Function

This is the main function that ties everything together:
 1. Loads and preprocesses the data
 2. Creates and trains the model
 3. Evaluates model performance
 4. Saves all necessary components for inference

In [40]:
def main():
    print("\n" + "="*80)
    print("STEP 1: LOADING DATA")
    print("="*80)

    MODEL_DIR = "/content/drive/MyDrive/bt4222data/final_models"

    os.makedirs(MODEL_DIR, exist_ok=True)

    # File path - update this to your file path
    file_path = "/content/drive/My Drive/bt4222data/Reviews Data Cleaned/cleaned_reviews.csv"
    print(f"Loading data from: {file_path}")

    # Model and encoders save paths
    model_save_path = os.path.join(MODEL_DIR, "ncf_binary_model.pt")
    print(f"Model will be saved to: {model_save_path}")

    # Load and preprocess data with binary interactions
    train_df, test_df, n_users, n_items, user_encoder, item_encoder = load_and_preprocess_data(
        file_path,
        test_size=0.2,
        threshold=3  # Ratings >= 3 are positive interactions
    )

    print("\n" + "="*80)
    print("STEP 2: SAVING ENCODERS FOR INFERENCE")
    print("="*80)

    # Save encoders for inference
    save_encoders(user_encoder, item_encoder, MODEL_DIR)

    print("\n" + "="*80)
    print("STEP 3: CREATING DATASETS AND DATALOADERS")
    print("="*80)

    # Create datasets and data loaders
    train_dataset = InteractionDataset(train_df)
    test_dataset = InteractionDataset(test_df)

    print(f"Training dataset size: {len(train_dataset)}")
    print(f"Test dataset size: {len(test_dataset)}")

    train_loader = DataLoader(train_dataset, batch_size=1024, shuffle=True, num_workers=4)
    test_loader = DataLoader(test_dataset, batch_size=1024, shuffle=False, num_workers=4)

    print("\n" + "="*80)
    print("STEP 4: CREATING NCF MODEL")
    print("="*80)

    # Create model
    model = NCF(
        n_users=n_users,
        n_items=n_items,
        factors=64,                # Embedding dimension
        mlp_layers=[128, 64, 32],  # MLP layer dimensions
        dropout=0.3                # Dropout rate
    )

    print(f"Model created with {n_users} users and {n_items} items")
    print(f"Embedding dimension: 64")
    print(f"MLP layers: [128, 64, 32]")
    print(f"Dropout rate: 0.3")

    print("\n" + "="*80)
    print("STEP 5: TRAINING THE MODEL")
    print("="*80)

    # Define loss function and optimizer
    criterion = nn.BCELoss()  # Binary Cross Entropy Loss for binary classification
    optimizer = optim.Adam(
        model.parameters(),
        lr=0.001,
        weight_decay=1e-5  # L2 regularization
    )

    # Learning rate scheduler based on AUC (higher is better, so mode is 'max')
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(
        optimizer,
        mode='max',
        factor=0.5,
        patience=2,
        verbose=True
    )

    print("Training with early stopping based on validation AUC...")
    print("Early stopping patience: 5 epochs")
    print("Learning rate: 0.001 with ReduceLROnPlateau scheduler")
    print("Weight decay (L2 regularization): 1e-5")

    # Train the model with early stopping based on AUC
    model, train_losses, val_losses, val_aucs = train_model(
        model=model,
        train_loader=train_loader,
        val_loader=test_loader,  # Using test set as validation set
        criterion=criterion,
        optimizer=optimizer,
        epochs=30,
        patience=5,
        scheduler=scheduler,
        model_save_path=model_save_path
    )

    print("\n" + "="*80)
    print("STEP 6: PLOTTING TRAINING CURVES")
    print("="*80)

    # Plot training curves
    plot_training_curves(train_losses, val_losses, val_aucs)

    print("\n" + "="*80)
    print("STEP 7: EVALUATING MODEL ON TEST SET")
    print("="*80)

    # Evaluate the model on the test set
    test_metrics = evaluate_model(model, test_loader)

    print("\n" + "="*80)
    print("STEP 8: EVALUATING ON COLD START ITEMS")
    print("="*80)
    print("Cold start items are defined as items with <= 5 interactions in the dataset")

    # Evaluate on cold start items
    cold_start_auc = evaluate_cold_start_items(
        model=model,
        df=pd.concat([train_df, test_df]),
        test_df=test_df,
        user_encoder=user_encoder,
        item_encoder=item_encoder,
        min_interactions=5  # Define cold start items as those with <= 5 interactions
    )

    print("\n" + "="*80)
    print("STEP 9: GENERATING SAMPLE RECOMMENDATIONS")
    print("="*80)

    # Generate sample recommendations
    recommendations = generate_recommendations(
        model=model,
        df=pd.concat([train_df, test_df]),
        test_df=test_df,
        user_encoder=user_encoder,
        item_encoder=item_encoder,
        n_recommendations=5
    )

    # Display sample recommendations
    print("\nSAMPLE RECOMMENDATIONS (ASINs only):")
    print("-" * 50)
    for i, (user, items) in enumerate(recommendations.items()):
        if i >= 3:  # Just show 3 users for brevity
            break
        print(f"\nUser ID: {user}")
        print(f"Top 5 recommended items (ASINs):")
        for item in items:
            print(f"{item}")

    print("\n" + "="*80)
    print("TRAINING AND EVALUATION COMPLETE")
    print("="*80)
    print(f"Model saved to: {model_save_path}")
    print(f"Encoders saved to: {os.path.join(MODEL_DIR, 'encoders.pkl')}")
    print(f"Training curves saved to: {os.path.join(MODEL_DIR, 'training_curves.png')}")
    print("\nYou can now use the inference notebook to generate recommendations for specific users.")


## Run the Training Process

In [41]:
if __name__ == "__main__":
    main()


STEP 1: LOADING DATA
Loading data from: /content/drive/My Drive/bt4222data/Reviews Data Cleaned/cleaned_reviews.csv
Model will be saved to: /content/drive/MyDrive/bt4222data/final_models/ncf_binary_model.pt
Loading data...
Total records: 1689188
Unique users: 192403
Unique items: 63001
Rating distribution (original):
overall
1.0     108725
2.0      82139
3.0     142257
4.0     347041
5.0    1009026
Name: count, dtype: int64
Interaction distribution (binary):
interaction
1    1498324
0     190864
Name: count, dtype: int64
Training set size: 1351350
Test set size: 337838

STEP 2: SAVING ENCODERS FOR INFERENCE
Encoders saved to /content/drive/MyDrive/bt4222data/final_models/encoders.pkl

STEP 3: CREATING DATASETS AND DATALOADERS
Training dataset size: 1351350
Test dataset size: 337838

STEP 4: CREATING NCF MODEL
Model created with 192403 users and 63001 items
Embedding dimension: 64
MLP layers: [128, 64, 32]
Dropout rate: 0.3

STEP 5: TRAINING THE MODEL
Training with early stopping based



Epoch 1/30 [Train]:   0%|          | 0/1320 [00:00<?, ?it/s]

Epoch 1/30 [Val]:   0%|          | 0/330 [00:00<?, ?it/s]

Epoch 1/30
Train Loss: 0.3390
Val Loss: 0.3155, Val AUC: 0.7318


Epoch 2/30 [Train]:   0%|          | 0/1320 [00:00<?, ?it/s]

Epoch 2/30 [Val]:   0%|          | 0/330 [00:00<?, ?it/s]

Epoch 2/30
Train Loss: 0.2830
Val Loss: 0.3199, Val AUC: 0.7273


Epoch 3/30 [Train]:   0%|          | 0/1320 [00:00<?, ?it/s]

Epoch 3/30 [Val]:   0%|          | 0/330 [00:00<?, ?it/s]

Epoch 3/30
Train Loss: 0.2459
Val Loss: 0.3432, Val AUC: 0.7167


Epoch 4/30 [Train]:   0%|          | 0/1320 [00:00<?, ?it/s]

Epoch 4/30 [Val]:   0%|          | 0/330 [00:00<?, ?it/s]

Epoch 4/30
Train Loss: 0.2247
Val Loss: 0.3666, Val AUC: 0.7131


Epoch 5/30 [Train]:   0%|          | 0/1320 [00:00<?, ?it/s]

Epoch 5/30 [Val]:   0%|          | 0/330 [00:00<?, ?it/s]

Epoch 5/30
Train Loss: 0.1793
Val Loss: 0.4795, Val AUC: 0.7071


Epoch 6/30 [Train]:   0%|          | 0/1320 [00:00<?, ?it/s]

Epoch 6/30 [Val]:   0%|          | 0/330 [00:00<?, ?it/s]

Epoch 6/30
Train Loss: 0.1398
Val Loss: 0.5750, Val AUC: 0.7036
Early stopping triggered after epoch 6
Best model saved to /content/drive/MyDrive/bt4222data/final_models/ncf_binary_model.pt

STEP 6: PLOTTING TRAINING CURVES
Training curves saved to models/training_curves.png

STEP 7: EVALUATING MODEL ON TEST SET


Evaluating:   0%|          | 0/330 [00:00<?, ?it/s]


Test Set Metrics:
AUC: 0.7036
Accuracy: 0.8607
Precision: 0.9028
Recall: 0.9447
F1 Score: 0.9233

STEP 8: EVALUATING ON COLD START ITEMS
Cold start items are defined as items with <= 5 interactions in the dataset

Evaluating model on cold start items...
Found 5849 cold start items in the test set.
Number of interactions with cold start items: 8685


Evaluating cold start items:   0%|          | 0/9 [00:00<?, ?it/s]

Cold Start Items AUC: 0.6394

STEP 9: GENERATING SAMPLE RECOMMENDATIONS


Generating recommendations:   0%|          | 0/10 [00:00<?, ?it/s]


SAMPLE RECOMMENDATIONS (ASINs only):
--------------------------------------------------

User ID: A3U6GQG9C4PFXI
Top 5 recommended items (ASINs):
B0001Q2DKS
B00COSHR20
B00CTUOSYS
B001O2S5QE
B001V7R5ZE

User ID: A3NHUQ33CFH3VM
Top 5 recommended items (ASINs):
B003FXD9FC
B0007WLI0M
B0009ORXE8
B00EL0EIGC
B007XQREZ8

User ID: A2KZ9WSADF4ZYY
Top 5 recommended items (ASINs):
B00D8JA2S0
B002UKZ9IG
B000I4UQL6
B007PE3FC4
B000HCUTU2

TRAINING AND EVALUATION COMPLETE
Model saved to: /content/drive/MyDrive/bt4222data/final_models/ncf_binary_model.pt
Encoders saved to: /content/drive/MyDrive/bt4222data/final_models/encoders.pkl
Training curves saved to: /content/drive/MyDrive/bt4222data/final_models/training_curves.png

You can now use the inference notebook to generate recommendations for specific users.
