# Lab 3: Contextual Bandit-Based News Article Recommendation

**`Course`:** Reinforcement Learning Fundamentals  
**`Student Name`:**  
**`Roll Number`:**  
**`GitHub Branch`:** firstname_U20230xxx  

# Imports and Setup

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score

from rlcmab_sampler import sampler


# Load Datasets

In [3]:
# Load datasets
news_df = pd.read_csv("data/news_articles.csv")
train_users = pd.read_csv("data/train_users.csv")
test_users = pd.read_csv("data/test_users.csv")

print(news_df.head())
print(train_users.head())


                                                link  \
0  https://www.huffpost.com/entry/covid-boosters-...   
1  https://www.huffpost.com/entry/american-airlin...   
2  https://www.huffpost.com/entry/funniest-tweets...   
3  https://www.huffpost.com/entry/funniest-parent...   
4  https://www.huffpost.com/entry/amy-cooper-lose...   

                                            headline   category  \
0  Over 4 Million Americans Roll Up Sleeves For O...  U.S. NEWS   
1  American Airlines Flyer Charged, Banned For Li...  U.S. NEWS   
2  23 Of The Funniest Tweets About Cats And Dogs ...     COMEDY   
3  The Funniest Tweets From Parents This Week (Se...  PARENTING   
4  Woman Who Called Cops On Black Bird-Watcher Lo...  U.S. NEWS   

                                   short_description               authors  \
0  Health experts said it is too early to predict...  Carla K. Johnson, AP   
1  He was subdued by passengers and crew when he ...        Mary Papenfuss   
2  "Until you have a dog y

## Data Preprocessing

In this section:
- Handle missing values
- Encode categorical features
- Prepare data for user classification

In [6]:
# Data Preprocessing

# 1. Load the provided user and article datasets
print("=" * 80)
print("1. LOADING DATASETS")
print("=" * 80)

news_df = pd.read_csv("data/news_articles.csv")
train_users_df = pd.read_csv("data/train_users.csv")
test_users_df = pd.read_csv("data/test_users.csv")

print(f"\nNews Articles Dataset Shape: {news_df.shape}")
print(f"Train Users Dataset Shape: {train_users_df.shape}")
print(f"Test Users Dataset Shape: {test_users_df.shape}")

print("\nNews Articles Columns:", news_df.columns.tolist())
print("Train Users Columns:", train_users_df.columns.tolist())
print("Test Users Columns:", test_users_df.columns.tolist())

# 2. Data Cleaning - Handle Missing Values
print("\n" + "=" * 80)
print("2. DATA CLEANING - HANDLING MISSING VALUES")
print("=" * 80)

# Check for missing values in each dataset
print("\nMissing values in News Articles:")
print(news_df.isnull().sum())
print(f"Total missing: {news_df.isnull().sum().sum()}")

print("\nMissing values in Train Users:")
print(train_users_df.isnull().sum())
print(f"Total missing: {train_users_df.isnull().sum().sum()}")

print("\nMissing values in Test Users:")
print(test_users_df.isnull().sum())
print(f"Total missing: {test_users_df.isnull().sum().sum()}")

# Handle missing values in news articles
# Fill missing headlines with an empty string (6 rows have no headline)
news_df['headline'] = news_df['headline'].fillna('')

# Fill missing authors with 'Unknown'
news_df['authors'] = news_df['authors'].fillna('Unknown')

# Fill missing short_description with empty string
news_df['short_description'] = news_df['short_description'].fillna('')

# Fill missing dates with the mode (most frequent date), or 'Unknown' if no mode exists
if news_df['date'].isnull().any():
    date_mode = news_df['date'].mode()
    news_df['date'] = news_df['date'].fillna(date_mode[0] if not date_mode.empty else 'Unknown')

# Handle missing values in user datasets (if any)
# Fill numerical columns with the training-set median, so that no
# test-set statistics leak into preprocessing
for col in ['age', 'income', 'clicks', 'purchase_amount']:
    train_median = train_users_df[col].median()
    train_users_df[col] = train_users_df[col].fillna(train_median)
    test_users_df[col] = test_users_df[col].fillna(train_median)

print("\nMissing values after cleaning:")
print(f"News Articles: {news_df.isnull().sum().sum()}")
print(f"Train Users: {train_users_df.isnull().sum().sum()}")
print(f"Test Users: {test_users_df.isnull().sum().sum()}")

# 3. Feature Encoding for Classification and Bandit Training
print("\n" + "=" * 80)
print("3. FEATURE ENCODING FOR CLASSIFICATION AND BANDIT TRAINING")
print("=" * 80)

# Encode categorical labels in user datasets (Convert user categories to numerical)
print("\nEncoding user labels...")
label_encoder = LabelEncoder()
train_users_df['label_encoded'] = label_encoder.fit_transform(train_users_df['label'])
test_users_df['label_encoded'] = label_encoder.transform(test_users_df['label'])

print(f"Original labels: {train_users_df['label'].unique()}")
print(f"Encoded labels: {train_users_df['label_encoded'].unique()}")
print(f"Mapping: {dict(zip(label_encoder.classes_, label_encoder.transform(label_encoder.classes_)))}")

# Encode news article categories
print("\nEncoding news article categories...")
news_category_encoder = LabelEncoder()
news_df['category_encoded'] = news_category_encoder.fit_transform(news_df['category'])

print(f"Original categories: {news_df['category'].unique()}")
print(f"Encoded categories: {news_df['category_encoded'].unique()}")
print(f"Mapping: {dict(zip(news_category_encoder.classes_, news_category_encoder.transform(news_category_encoder.classes_)))}")

# Prepare feature sets for user classification
print("\nPreparing feature sets for user classification...")

# Numerical features
feature_columns = ['age', 'income', 'clicks', 'purchase_amount']

# Create train and test feature matrices
X_train = train_users_df[feature_columns].copy()
y_train = train_users_df['label_encoded'].copy()

X_test = test_users_df[feature_columns].copy()
y_test = test_users_df['label_encoded'].copy()

print(f"\nTrain Features Shape: {X_train.shape}")
print(f"Train Labels Shape: {y_train.shape}")
print(f"Test Features Shape: {X_test.shape}")
print(f"Test Labels Shape: {y_test.shape}")

# Normalize numerical features for better classification performance
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(f"\nFeatures normalized using StandardScaler")
print(f"Train features stats after scaling:")
print(f"  Mean: {X_train_scaled.mean(axis=0)}")
print(f"  Std: {X_train_scaled.std(axis=0)}")

# Summary of preprocessed data
print("\n" + "=" * 80)
print("PREPROCESSING SUMMARY")
print("=" * 80)
print(f"\n✓ Loaded 3 datasets successfully")
print(f"✓ Handled missing values in all datasets")
print(f"✓ Encoded categorical features (user labels and article categories)")
print(f"✓ Normalized numerical features for ML training")
print(f"\nReady for user classification and contextual bandit training!")


1. LOADING DATASETS

News Articles Dataset Shape: (209527, 6)
Train Users Dataset Shape: (2000, 6)
Test Users Dataset Shape: (2000, 6)

News Articles Columns: ['link', 'headline', 'category', 'short_description', 'authors', 'date']
Train Users Columns: ['user_id', 'age', 'income', 'clicks', 'purchase_amount', 'label']
Test Users Columns: ['user_id', 'age', 'income', 'clicks', 'purchase_amount', 'label']

2. DATA CLEANING - HANDLING MISSING VALUES

Missing values in News Articles:
link                     0
headline                 6
category                 0
short_description    19712
authors              37418
date                     0
dtype: int64
Total missing: 57136

Missing values in Train Users:
user_id            0
age                0
income             0
clicks             0
purchase_amount    0
label              0
dtype: int64
Total missing: 0

Missing values in Test Users:
user_id            0
age                0
income             0
clicks             0
purchase_amount 

## User Classification

Train a classifier to predict the user category (`User1`, `User2`, `User3`),
which serves as the **context** for the contextual bandit.


In [8]:
# Task 5.2: User Classification - Context Detector (Enhanced for Higher Accuracy)

from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier, 
                              AdaBoostClassifier, VotingClassifier, StackingClassifier, ExtraTreesClassifier)
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold
from sklearn.preprocessing import PolynomialFeatures, RobustScaler, MinMaxScaler
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score, 
                             confusion_matrix, classification_report)
from sklearn.pipeline import Pipeline
import time
import warnings
warnings.filterwarnings('ignore')

print("=" * 100)
print("TASK 5.2: USER CLASSIFICATION - CONTEXT DETECTOR (ENHANCED)")
print("=" * 100)

# ============================================================================
# STEP 1: FEATURE ENGINEERING - CREATE ENHANCED FEATURES
# ============================================================================
print("\n" + "=" * 100)
print("STEP 1: FEATURE ENGINEERING")
print("=" * 100)

# Create polynomial features (degree 2 interactions)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_train_poly = poly.fit_transform(X_train_scaled)
X_test_poly = poly.transform(X_test_scaled)

# Get feature names for reference
poly_features = poly.get_feature_names_out(feature_columns)
print(f"\nOriginal features: {len(feature_columns)}")
print(f"Polynomial features (degree 2): {X_train_poly.shape[1]}")
print(f"New feature examples: {list(poly_features[-5:])}")

# Use polynomial features for training
X_train_enhanced = X_train_poly
X_test_enhanced = X_test_poly

print(f"✓ Feature engineering complete")

# ============================================================================
# STEP 2: TRAIN ADVANCED ENSEMBLE MODELS
# ============================================================================
print("\n" + "=" * 100)
print("STEP 2: TRAINING ADVANCED CLASSIFICATION MODELS")
print("=" * 100)

classifiers = {}
results = []

print("\nTraining multiple models with extensive hyperparameter tuning...\n")

# 1. Extra Trees (Extremely Randomized Trees)
print("[1/8] Extra Trees (highly randomized decision trees)...")
start_time = time.time()
et_params = {
    'n_estimators': [150, 200, 250],
    'max_depth': [8, 12, 16, 20, None],
    'min_samples_split': [2, 3, 4],
    'min_samples_leaf': [1, 2],
    'max_features': ['sqrt', 'log2']
}
et_grid = GridSearchCV(ExtraTreesClassifier(random_state=42, n_jobs=-1), et_params, 
                       cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
et_grid.fit(X_train_enhanced, y_train)
et_pred = et_grid.predict(X_test_enhanced)
et_accuracy = accuracy_score(y_test, et_pred)
et_time = time.time() - start_time
classifiers['Extra Trees'] = et_grid.best_estimator_
results.append({'Model': 'Extra Trees', 'Accuracy': et_accuracy, 'Time (s)': et_time})
print(f"   ✓ Accuracy: {et_accuracy:.4f} | Time: {et_time:.2f}s")

# 2. Gradient Boosting (Enhanced)
print("[2/8] Gradient Boosting (enhanced with more tuning)...")
start_time = time.time()
gb_params = {
    'n_estimators': [150, 200, 250],
    'learning_rate': [0.005, 0.01, 0.02, 0.05],
    'max_depth': [3, 4, 5, 6],
    'min_samples_split': [2, 3, 4],
    'min_samples_leaf': [1, 2],
    'subsample': [0.7, 0.8, 0.9, 1.0],
    'max_features': ['sqrt', 'log2']
}
gb_grid = GridSearchCV(GradientBoostingClassifier(random_state=42, validation_fraction=0.1, 
                                                   n_iter_no_change=10), gb_params, 
                       cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
gb_grid.fit(X_train_enhanced, y_train)
gb_pred = gb_grid.predict(X_test_enhanced)
gb_accuracy = accuracy_score(y_test, gb_pred)
gb_time = time.time() - start_time
classifiers['Gradient Boosting'] = gb_grid.best_estimator_
results.append({'Model': 'Gradient Boosting', 'Accuracy': gb_accuracy, 'Time (s)': gb_time})
print(f"   ✓ Accuracy: {gb_accuracy:.4f} | Time: {gb_time:.2f}s")

# 3. Random Forest (Enhanced)
print("[3/8] Random Forest (enhanced with more tuning)...")
start_time = time.time()
rf_params = {
    'n_estimators': [150, 200, 250, 300],
    'max_depth': [8, 10, 12, 15, 20, None],
    'min_samples_split': [2, 3, 4, 5],
    'min_samples_leaf': [1, 2, 3],
    'max_features': ['sqrt', 'log2'],
    'bootstrap': [True, False]
}
rf_grid = GridSearchCV(RandomForestClassifier(random_state=42, n_jobs=-1), rf_params, 
                       cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
rf_grid.fit(X_train_enhanced, y_train)
rf_pred = rf_grid.predict(X_test_enhanced)
rf_accuracy = accuracy_score(y_test, rf_pred)
rf_time = time.time() - start_time
classifiers['Random Forest'] = rf_grid.best_estimator_
results.append({'Model': 'Random Forest', 'Accuracy': rf_accuracy, 'Time (s)': rf_time})
print(f"   ✓ Accuracy: {rf_accuracy:.4f} | Time: {rf_time:.2f}s")

# 4. SVM with RBF kernel
print("[4/8] Support Vector Machine (RBF kernel with extensive tuning)...")
start_time = time.time()
svm_params = {
    'C': [0.01, 0.1, 1, 10, 100, 1000],
    'gamma': ['scale', 'auto', 0.0001, 0.001, 0.01],
    'kernel': ['rbf']
}
svm_grid = GridSearchCV(SVC(random_state=42, probability=True), svm_params, 
                        cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
svm_grid.fit(X_train_enhanced, y_train)
svm_pred = svm_grid.predict(X_test_enhanced)
svm_accuracy = accuracy_score(y_test, svm_pred)
svm_time = time.time() - start_time
classifiers['SVM (RBF)'] = svm_grid.best_estimator_
results.append({'Model': 'SVM (RBF)', 'Accuracy': svm_accuracy, 'Time (s)': svm_time})
print(f"   ✓ Accuracy: {svm_accuracy:.4f} | Time: {svm_time:.2f}s")

# 5. KNN (Enhanced)
print("[5/8] K-Nearest Neighbors (enhanced tuning)...")
start_time = time.time()
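knn_params = {
    'n_neighbors': [3, 4, 5, 6, 7, 9, 11, 13, 15],
    'weights': ['uniform', 'distance'],
    # 'p' is not searched: it only affects the Minkowski metric, so with the
    # explicit metrics below it would just duplicate every grid point
    'metric': ['euclidean', 'manhattan', 'chebyshev']
}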
knn_grid = GridSearchCV(KNeighborsClassifier(), knn_params, cv=5, 
                        scoring='accuracy', n_jobs=-1, verbose=0)
knn_grid.fit(X_train_enhanced, y_train)
knn_pred = knn_grid.predict(X_test_enhanced)
knn_accuracy = accuracy_score(y_test, knn_pred)
knn_time = time.time() - start_time
classifiers['KNN'] = knn_grid.best_estimator_
results.append({'Model': 'KNN', 'Accuracy': knn_accuracy, 'Time (s)': knn_time})
print(f"   ✓ Accuracy: {knn_accuracy:.4f} | Time: {knn_time:.2f}s")

# 6. AdaBoost (Enhanced)
print("[6/8] AdaBoost (enhanced tuning)...")
start_time = time.time()
ada_params = {
    'n_estimators': [100, 150, 200, 250],
    'learning_rate': [0.1, 0.5, 0.75, 1.0, 1.25, 1.5]
    # 'algorithm' is not searched: 'SAMME.R' is deprecated in recent
    # scikit-learn releases, so the library default is used instead
}
ada_grid = GridSearchCV(AdaBoostClassifier(random_state=42), ada_params, 
                        cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
ada_grid.fit(X_train_enhanced, y_train)
ada_pred = ada_grid.predict(X_test_enhanced)
ada_accuracy = accuracy_score(y_test, ada_pred)
ada_time = time.time() - start_time
classifiers['AdaBoost'] = ada_grid.best_estimator_
results.append({'Model': 'AdaBoost', 'Accuracy': ada_accuracy, 'Time (s)': ada_time})
print(f"   ✓ Accuracy: {ada_accuracy:.4f} | Time: {ada_time:.2f}s")

# 7. Logistic Regression (Enhanced)
print("[7/8] Logistic Regression (enhanced tuning)...")
start_time = time.time()
lr_params = {
    'C': [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000],
    # 'liblinear' is excluded: it cannot fit a multinomial model
    'solver': ['lbfgs', 'newton-cg'],
    'penalty': ['l2'],
    'max_iter': [1000, 2000]
}
# lbfgs and newton-cg fit a multinomial model for multiclass targets by default
lr_grid = GridSearchCV(LogisticRegression(random_state=42),
                       lr_params, cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
lr_grid.fit(X_train_enhanced, y_train)
lr_pred = lr_grid.predict(X_test_enhanced)
lr_accuracy = accuracy_score(y_test, lr_pred)
lr_time = time.time() - start_time
classifiers['Logistic Regression'] = lr_grid.best_estimator_
results.append({'Model': 'Logistic Regression', 'Accuracy': lr_accuracy, 'Time (s)': lr_time})
print(f"   ✓ Accuracy: {lr_accuracy:.4f} | Time: {lr_time:.2f}s")

# 8. Neural Network (Enhanced)
print("[8/8] Neural Network - MLP (enhanced architecture)...")
start_time = time.time()
mlp_params = {
    'hidden_layer_sizes': [(100,), (150,), (100, 50), (150, 100), (200, 100, 50), (150, 100, 50)],
    'activation': ['relu', 'tanh'],
    'alpha': [0.00001, 0.0001, 0.001],
    'learning_rate': ['constant', 'adaptive'],
    'batch_size': [16, 32]
}
mlp_grid = GridSearchCV(MLPClassifier(max_iter=1000, random_state=42, early_stopping=True, 
                                      validation_fraction=0.1, n_iter_no_change=10), 
                        mlp_params, cv=5, scoring='accuracy', n_jobs=-1, verbose=0)
mlp_grid.fit(X_train_enhanced, y_train)
mlp_pred = mlp_grid.predict(X_test_enhanced)
mlp_accuracy = accuracy_score(y_test, mlp_pred)
mlp_time = time.time() - start_time
classifiers['Neural Network'] = mlp_grid.best_estimator_
results.append({'Model': 'Neural Network', 'Accuracy': mlp_accuracy, 'Time (s)': mlp_time})
print(f"   ✓ Accuracy: {mlp_accuracy:.4f} | Time: {mlp_time:.2f}s")

# ============================================================================
# STEP 3: CREATE ENSEMBLE VOTING CLASSIFIER
# ============================================================================
print("\n" + "=" * 100)
print("STEP 3: ENSEMBLE VOTING")
print("=" * 100)

# Get top 5 models for ensemble
results_df_temp = pd.DataFrame(results).sort_values('Accuracy', ascending=False)
top_5_models = results_df_temp.head(5)['Model'].tolist()

print(f"\nTop 5 models selected for ensemble: {top_5_models}")

# Soft Voting Classifier (combines probability estimates)
print("\nCreating Soft Voting Classifier...")
voting_clf = VotingClassifier(
    estimators=[
        (name, classifiers[name]) for name in top_5_models
    ],
    voting='soft'
)
voting_clf.fit(X_train_enhanced, y_train)
voting_pred = voting_clf.predict(X_test_enhanced)
voting_accuracy = accuracy_score(y_test, voting_pred)
results.append({'Model': 'Voting Ensemble (Soft)', 'Accuracy': voting_accuracy, 'Time (s)': 0})
print(f"✓ Voting Ensemble Accuracy: {voting_accuracy:.4f}")

# ============================================================================
# STEP 4: RESULTS & MODEL SELECTION
# ============================================================================
print("\n" + "=" * 100)
print("FINAL RESULTS - ALL MODELS (RANKED BY ACCURACY)")
print("=" * 100)

results_df = pd.DataFrame(results).sort_values('Accuracy', ascending=False).reset_index(drop=True)
print(results_df.to_string(index=False))

# Select best model
best_model_name = results_df.iloc[0]['Model']
best_accuracy = results_df.iloc[0]['Accuracy']

# Look up the stored test-set predictions for the winning model
predictions_by_model = {
    'Extra Trees': et_pred,
    'Gradient Boosting': gb_pred,
    'Random Forest': rf_pred,
    'SVM (RBF)': svm_pred,
    'KNN': knn_pred,
    'AdaBoost': ada_pred,
    'Logistic Regression': lr_pred,
    'Neural Network': mlp_pred,
    'Voting Ensemble (Soft)': voting_pred,
}
best_predictions = predictions_by_model[best_model_name]
best_classifier = (voting_clf if best_model_name == 'Voting Ensemble (Soft)'
                   else classifiers[best_model_name])

print("\n" + "=" * 100)
print(f"🏆 BEST MODEL SELECTED: {best_model_name}")
print("=" * 100)
print(f"Test Accuracy: {best_accuracy:.4f} ({best_accuracy*100:.2f}%)")
print(f"Difference from random guessing (33.33%): {(best_accuracy - 1/3)*100:+.2f} percentage points")

# ============================================================================
# STEP 5: DETAILED EVALUATION
# ============================================================================
print("\n" + "=" * 100)
print(f"DETAILED EVALUATION - {best_model_name}")
print("=" * 100)

print("\nConfusion Matrix:")
cm = confusion_matrix(y_test, best_predictions)
print(cm)

print("\nClassification Report:")
class_report = classification_report(y_test, best_predictions, target_names=label_encoder.classes_)
print(class_report)

# Feature importance (if available)
if hasattr(best_classifier, 'feature_importances_'):
    print("\nTop 10 Feature Importances:")
    feature_importance = pd.DataFrame({
        'Feature': poly_features,
        'Importance': best_classifier.feature_importances_
    }).sort_values('Importance', ascending=False).head(10)
    print(feature_importance.to_string(index=False))

# Cross-validation
print("\n" + "=" * 100)
print("CROSS-VALIDATION ANALYSIS")
print("=" * 100)
cv_scores = cross_val_score(best_classifier, X_train_enhanced, y_train, cv=5, scoring='accuracy')
print(f"5-Fold Cross-Validation Scores: {[f'{s:.4f}' for s in cv_scores]}")
print(f"Mean CV Accuracy: {cv_scores.mean():.4f} (+/- {cv_scores.std():.4f})")

# ============================================================================
# STEP 6: SUMMARY & STORAGE
# ============================================================================
print("\n" + "=" * 100)
print("TASK 5.2 SUMMARY - ENHANCED USER CLASSIFICATION")
print("=" * 100)
print(f"✓ Feature engineering: Polynomial features (degree 2) applied")
print(f"✓ Trained 8 advanced classifiers with extensive hyperparameter tuning")
print(f"✓ Created ensemble voting classifier from top 5 models")
print(f"✓ Best model: {best_model_name}")
print(f"✓ Test accuracy: {best_accuracy:.4f} ({best_accuracy*100:.2f}%)")
print(f"✓ Difference from random guessing (33.33%): {(best_accuracy - 1/3)*100:+.2f} percentage points")
print(f"✓ Context Detector ready for contextual bandit algorithms!")

# Store for later use
context_detector = best_classifier
poly_transformer = poly  # Store for test predictions

print(f"\n✓ Stored 'context_detector' and 'poly_transformer' for Tasks 5.3 and 5.4")


TASK 5.2: USER CLASSIFICATION - CONTEXT DETECTOR (ENHANCED)

STEP 1: FEATURE ENGINEERING

Original features: 4
Polynomial features (degree 2): 14
New feature examples: ['income clicks', 'income purchase_amount', 'clicks^2', 'clicks purchase_amount', 'purchase_amount^2']
✓ Feature engineering complete

STEP 2: TRAINING ADVANCED CLASSIFICATION MODELS

Training multiple models with extensive hyperparameter tuning...

[1/8] Extra Trees (highly randomized decision trees)...
   ✓ Accuracy: 0.3260 | Time: 433.21s
[2/8] Gradient Boosting (enhanced with more tuning)...
   ✓ Accuracy: 0.3270 | Time: 990.89s
[3/8] Random Forest (enhanced with more tuning)...


KeyboardInterrupt: 

# `Contextual Bandit`

## Reward Sampler Initialization

The sampler is initialized using the student's roll number `i`.
Rewards are obtained by calling `sample(j)` on the sampler instance, where `j` is the arm index.


In [5]:
# Task 5.1: Reward Sampler Initialization

print("=" * 80)
print("TASK 5.1: REWARD SAMPLER INITIALIZATION")
print("=" * 80)

# ⚠️ IMPORTANT: Update your roll number here (e.g., 126 for U20230126)
# Extract the last 3 digits of your roll number
ROLL_NUMBER = 120  # TODO: Change this to your roll number (last 3 digits)

# Validate roll number
if ROLL_NUMBER == 0:
    print("\n❌ ERROR: Please update the ROLL_NUMBER variable with your student ID (last 3 digits)")
    print("Example: If your ID is U20230126, set ROLL_NUMBER = 126")
else:
    print(f"\nStudent Roll Number: {ROLL_NUMBER}")
    
    # Initialize the sampler with the roll number
    print("\nInitializing reward sampler...")
    sampler_instance = sampler(ROLL_NUMBER)
    print(f"✓ Sampler initialized with roll number: {ROLL_NUMBER}")
    
    # Explain the sampler interface
    print("\n" + "=" * 80)
    print("SAMPLER INTERFACE")
    print("=" * 80)
    print(f"""
The sampler is initialized with your roll number (student ID).
The sampler provides rewards based on:
  - Arm index (j): 0-11 representing different arm combinations
  - Context: Determined by user classification
  
Arm Mapping:
  - Arms 0-3: For User1 (Entertainment, Education, Tech, Crime)
  - Arms 4-7: For User2 (Entertainment, Education, Tech, Crime)
  - Arms 8-11: For User3 (Entertainment, Education, Tech, Crime)

To obtain a reward, use: sampler_instance.sample(j)
where j is the arm index (0-11)
""")
    
    # Example: Sample rewards from a few arms
    print("\n" + "=" * 80)
    print("SAMPLE REWARDS (DEMONSTRATION)")
    print("=" * 80)
    
    sample_rewards = {}
    print("\nSampling from a few arms to demonstrate:")
    for arm in [0, 1, 5, 9]:
        reward = sampler_instance.sample(arm)
        sample_rewards[arm] = reward
        print(f"Arm {arm}: Reward = {reward:.4f}")
    
    print("\n✓ Reward sampler is ready for use in contextual bandit algorithms!")
    print("✓ Use 'sampler_instance.sample(j)' to get rewards for any arm j (0-11)")


TASK 5.1: REWARD SAMPLER INITIALIZATION

Student Roll Number: 120

Initializing reward sampler...
✓ Sampler initialized with roll number: 120

SAMPLER INTERFACE

The sampler is initialized with your roll number (student ID).
The sampler provides rewards based on:
  - Arm index (j): 0-11 representing different arm combinations
  - Context: Determined by user classification

Arm Mapping:
  - Arms 0-3: For User1 (Entertainment, Education, Tech, Crime)
  - Arms 4-7: For User2 (Entertainment, Education, Tech, Crime)
  - Arms 8-11: For User3 (Entertainment, Education, Tech, Crime)

To obtain a reward, use: sampler_instance.sample(j)
where j is the arm index (0-11)


SAMPLE REWARDS (DEMONSTRATION)

Sampling from a few arms to demonstrate:
Arm 0: Reward = 9.5939
Arm 1: Reward = -1.6848
Arm 5: Reward = -3.1638
Arm 9: Reward = 4.6818

✓ Reward sampler is ready for use in contextual bandit algorithms!
✓ Use 'sampler_instance.sample(j)' to get rewards for any arm j (0-11)


## Arm Mapping

| Arm Index (j) | News Categories | User Context |
|---------------|-----------------|--------------|
| 0–3           | Entertainment, Education, Tech, Crime | User1 |
| 4–7           | Entertainment, Education, Tech, Crime | User2 |
| 8–11          | Entertainment, Education, Tech, Crime | User3 |
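
The mapping in the table can be written as a pair of small helper functions. This is a sketch only: the names `arm_index` and `arm_to_pair` are hypothetical, and it assumes that the categories within each block of four arms follow the order listed in the table.

```python
# Category and context orderings are assumed to match the arm-mapping table.
CATEGORIES = ["Entertainment", "Education", "Tech", "Crime"]
CONTEXTS = ["User1", "User2", "User3"]


def arm_index(context, category):
    """Map a (user context, news category) pair to the arm index j in 0..11."""
    return CONTEXTS.index(context) * len(CATEGORIES) + CATEGORIES.index(category)


def arm_to_pair(j):
    """Inverse mapping: arm index j -> (user context, news category)."""
    context_id, category_id = divmod(j, len(CATEGORIES))
    return CONTEXTS[context_id], CATEGORIES[category_id]


print(arm_index("User2", "Tech"))
print(arm_to_pair(11))
```

With these helpers, a reward for recommending a Tech article to a `User2` reader would be requested as `sampler_instance.sample(arm_index("User2", "Tech"))`.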

## Epsilon-Greedy Strategy

This section implements the epsilon-greedy contextual bandit algorithm.
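
One possible shape for this strategy is sketched below. It is not the lab's reference solution: the class name `EpsilonGreedyBandit`, the default `epsilon=0.1`, and the zero-initialized estimates are all assumptions. The sketch keeps a separate running-mean reward estimate for every (context, category) pair, explores a random category with probability `epsilon`, and otherwise picks the category with the highest estimate for the current context.

```python
import random

N_CONTEXTS, N_CATEGORIES = 3, 4  # User1..User3 and the four news categories


class EpsilonGreedyBandit:
    """Epsilon-greedy selection with per-context running-mean reward estimates."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [[0] * N_CATEGORIES for _ in range(N_CONTEXTS)]
        self.values = [[0.0] * N_CATEGORIES for _ in range(N_CONTEXTS)]

    def select(self, context):
        """Return a category index (0..3) for the given context index (0..2)."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(N_CATEGORIES)  # explore
        q = self.values[context]
        return max(range(N_CATEGORIES), key=q.__getitem__)  # exploit

    def update(self, context, category, reward):
        """Incrementally update the mean reward estimate for the chosen arm."""
        self.counts[context][category] += 1
        n = self.counts[context][category]
        self.values[context][category] += (reward - self.values[context][category]) / n
```

In the notebook, the chosen pair would be turned into an arm index with `j = context * 4 + category` (following the arm mapping above) and the reward obtained from `sampler_instance.sample(j)`.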


## Upper Confidence Bound (UCB)

This section implements the UCB strategy for contextual bandits.
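
A minimal UCB1-style sketch is given below, under stated assumptions: the class name `UCBBandit` and the exploration constant `c` are hypothetical, and the statistics are kept separately for each user context. For a context $s$, the sketch scores each category $a$ as $Q(s,a) + c\sqrt{\ln N(s) / N(s,a)}$, where $N(s)$ is the number of pulls seen in that context and $N(s,a)$ the number of pulls of that category.

```python
import math


class UCBBandit:
    """UCB1-style selection with separate statistics for each user context."""

    def __init__(self, c=2.0, n_contexts=3, n_categories=4):
        self.c = c  # exploration constant (an assumed default, tune as needed)
        self.counts = [[0] * n_categories for _ in range(n_contexts)]
        self.values = [[0.0] * n_categories for _ in range(n_contexts)]

    def select(self, context):
        counts, values = self.counts[context], self.values[context]
        # Try every category once before trusting the confidence bounds.
        for category, n in enumerate(counts):
            if n == 0:
                return category
        total = sum(counts)
        scores = [
            q + self.c * math.sqrt(math.log(total) / n)
            for q, n in zip(values, counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, context, category, reward):
        """Incrementally update the mean reward estimate for the chosen arm."""
        self.counts[context][category] += 1
        n = self.counts[context][category]
        self.values[context][category] += (reward - self.values[context][category]) / n
```

Larger values of `c` widen the confidence bonus and therefore explore more; comparing a few values of `c` is one way to produce the hyperparameter comparison requested later in the lab.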

## SoftMax Strategy

This section implements the SoftMax strategy with temperature $\tau = 1$.
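
As a sketch of how the selection step might look (the function names `softmax_probs` and `softmax_select` are hypothetical), the probability of choosing category $a$ is taken to be $\exp(Q_a/\tau) / \sum_b \exp(Q_b/\tau)$, where $Q$ holds the current reward estimates for the active context. The maximum estimate is subtracted before exponentiating, which leaves the probabilities unchanged but avoids overflow.

```python
import math
import random


def softmax_probs(q_values, tau=1.0):
    """Convert reward estimates into selection probabilities at temperature tau."""
    m = max(q_values)  # subtracting the max does not change the result
    exps = [math.exp((q - m) / tau) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]


def softmax_select(q_values, tau=1.0, rng=random):
    """Sample a category index according to the softmax probabilities."""
    probs = softmax_probs(q_values, tau)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Equal estimates give a uniform distribution over the four categories.
print(softmax_probs([0.0, 0.0, 0.0, 0.0]))
```

Lower temperatures concentrate probability on the best estimate, while higher temperatures approach uniform random selection.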


## Reinforcement Learning Simulation

We simulate each bandit algorithm for $T = 10{,}000$ steps and record the reward at every step.

Note: adjust $T$ as needed.
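
The loop below sketches one way to structure the simulation, using epsilon-greedy as the example policy. Several pieces are assumptions made so that the block runs on its own: `StandInSampler` is a hypothetical replacement for the real `sampler` from `rlcmab_sampler` (Gaussian rewards around fixed, made-up arm means), and the contexts are drawn uniformly at random, whereas in the notebook they would come from the context detector's predictions.

```python
import random


class StandInSampler:
    """Hypothetical stand-in for the lab's sampler: Gaussian rewards per arm."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.means = [self.rng.uniform(-5, 10) for _ in range(12)]  # made-up means

    def sample(self, j):
        return self.rng.gauss(self.means[j], 1.0)


def simulate(sampler, contexts, epsilon=0.1, seed=0):
    """Run epsilon-greedy over a sequence of context indices (0, 1 or 2)."""
    rng = random.Random(seed)
    counts = [0] * 12
    q_values = [0.0] * 12
    rewards = []
    for context in contexts:
        arms = range(context * 4, context * 4 + 4)  # arms valid for this context
        if rng.random() < epsilon:
            j = rng.choice(list(arms))  # explore
        else:
            j = max(arms, key=q_values.__getitem__)  # exploit
        reward = sampler.sample(j)
        counts[j] += 1
        q_values[j] += (reward - q_values[j]) / counts[j]
        rewards.append(reward)
    return rewards, q_values


T = 10_000
context_rng = random.Random(1)
contexts = [context_rng.randrange(3) for _ in range(T)]
rewards, q_values = simulate(StandInSampler(seed=42), contexts)
print(f"Average reward over {T} steps: {sum(rewards) / T:.3f}")
```

Swapping `StandInSampler(seed=42)` for `sampler_instance` would run the same loop against the lab's reward sampler.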


## Results and Analysis

This section presents:
- Average Reward vs Time
- Hyperparameter comparisons
- Observations and discussion
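
For the average-reward curve, a running mean of the recorded rewards is enough. The helper below is a small sketch; the name `running_average` is hypothetical and the input is assumed to be the per-step reward list produced by the simulation.

```python
from itertools import accumulate


def running_average(rewards):
    """Return the average reward up to and including each time step."""
    return [total / (t + 1) for t, total in enumerate(accumulate(rewards))]


print(running_average([2.0, 4.0, 6.0]))  # [2.0, 3.0, 4.0]
```

Each strategy's curve can then be drawn with `plt.plot(running_average(rewards), label="...")`, using the `matplotlib.pyplot` import from the setup cell.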


## Final Observations

- Comparison of Epsilon-Greedy, UCB, and SoftMax
- Effect of hyperparameters
- Strengths and limitations of each approach
