# Twitch Recommender Baselines: ALS vs. REP

This notebook benchmarks two baseline models on the Twitch dataset:
1. **ALS (Alternating Least Squares):** A matrix factorization model suited for implicit feedback.
2. **REP (Repeat/Popularity):** A heuristic baseline that predicts re-watching (high on Twitch) and global popularity.

**Evaluation Strategy:**
We evaluate on two subsets of the test data to highlight the difference between "Retention" and "Discovery":
- **Full Test Set:** Includes re-watches. (REP usually wins here).
- **New Discovery Only:** Filters out any streamer the user has watched in training. (ALS should win here).
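
The difference between the two subsets can be sketched on toy data. This is a minimal illustration, assuming interactions are represented as `(user, streamer)` tuples; the names `train_pairs`, `test_pairs`, and the sample values are hypothetical and do not come from the dataset.

```python
# Toy (user, streamer) interactions; all values are hypothetical.
train_pairs = {("u1", "alice"), ("u1", "bob"), ("u2", "alice")}
test_pairs = [("u1", "alice"), ("u1", "carol"), ("u2", "bob"), ("u3", "alice")]

# Full test set: every test interaction counts, including re-watches.
full_test = list(test_pairs)

# New discovery only: drop any pair the user already produced in training.
new_only_test = [pair for pair in test_pairs if pair not in train_pairs]

print(f"Full test rows: {len(full_test)}")
print(f"New-discovery rows: {len(new_only_test)}")
```

Only `("u1", "alice")` is a re-watch here, so it is the single row removed from the discovery subset.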


## 1. Loading Dependencies

In [None]:
import pandas as pd
import numpy as np
import scipy.sparse as sparse
import random
import os
import time
from collections import defaultdict

# Optional: Try to import implicit for ALS. If not installed, we will skip ALS or warn.
try:
    import implicit
    HAS_IMPLICIT = True
except ImportError:
    print("WARNING: 'implicit' library not found. Please run `pip install implicit`.")
    print("ALS steps will be skipped in this run.")
    HAS_IMPLICIT = False



## 2. Data Loading & Preprocessing
We load the CSV, compute watch duration as the implicit feedback weight, and map user IDs and streamer usernames to contiguous integer indices.

In [None]:
class TwitchDataLoader:
    def __init__(self, filepath):
        self.filepath = filepath
        self.df = None
        self.user_map = {}
        self.item_map = {}
        self.item_map_inv = {}
        
    def load_and_process(self):
        # Assuming no header based on description, but let's safely load
        # Columns: User ID, Stream ID, Streamer username, Time start, Time stop
        if not os.path.exists(self.filepath):
            raise FileNotFoundError(f"File {self.filepath} not found. Please ensure the CSV is in the directory.")

        try:
            self.df = pd.read_csv(self.filepath, names=['user_id', 'stream_id', 'streamer', 'start', 'stop'])
        except Exception:
            # Fallback if header exists
            self.df = pd.read_csv(self.filepath)
        
        # 1. Calculate Duration (Implicit Feedback Strength)
        self.df['duration'] = self.df['stop'] - self.df['start']
        
        # 2. Map Streamer Usernames to IDs
        unique_streamers = self.df['streamer'].unique()
        self.item_map = {name: i for i, name in enumerate(unique_streamers)}
        self.item_map_inv = {i: name for i, name in enumerate(unique_streamers)}
        self.df['item_idx'] = self.df['streamer'].map(self.item_map)
        
        # 3. Map User IDs to 0...N range
        unique_users = self.df['user_id'].unique()
        self.user_map = {uid: i for i, uid in enumerate(unique_users)}
        self.df['user_idx'] = self.df['user_id'].map(self.user_map)
        
        print(f"Loaded {len(self.df)} interactions.")
        print(f"Users: {len(self.user_map)}, Streamers: {len(self.item_map)}")
        return self.df

loader = TwitchDataLoader('100k_a.csv')
full_df = loader.load_and_process()



## 3. Temporal Split (Train / Test)
We split based on sorting by time and taking the first N% of rows.
- **Train:** First 90% of interactions (sorted by time).
- **Test:** Last 10% of interactions.

In [None]:
def temporal_split(df, split_ratio=0.9):
    # Sort by start time to ensure temporal order
    df = df.sort_values(by='start')
    
    # Split by row count index
    split_index = int(len(df) * split_ratio)
    
    print(f"Splitting data at row {split_index} out of {len(df)}")
    
    train_df = df.iloc[:split_index].copy()
    test_df = df.iloc[split_index:].copy()
    
    # Filter Test: Only keep users who exist in Train (Cold start users are a different problem)
    # train_users = set(train_df['user_idx'].unique())
    # test_df = test_df[test_df['user_idx'].isin(train_users)]
    
    print(f"Train samples: {len(train_df)}")
    print(f"Test samples:  {len(test_df)}")
    
    return train_df, test_df

train_df, test_df = temporal_split(full_df)

# Create Sparse Matrices for ALS
# Row = User, Col = Item
def create_sparse_matrix(df, num_users, num_items):
    # Sum duration if multiple interactions exist for same user-item
    grouped = df.groupby(['user_idx', 'item_idx'])['duration'].sum().reset_index()
    
    sparse_mat = sparse.csr_matrix(
        (grouped['duration'], (grouped['user_idx'], grouped['item_idx'])),
        shape=(num_users, num_items)
    )
    return sparse_mat

num_users = len(loader.user_map)
num_items = len(loader.item_map)

train_matrix = create_sparse_matrix(train_df, num_users, num_items)
print("Sparse matrices created.")



In [None]:
def calculate_user_concentration(matrix, user_idx):
    """
    Calculates a concentration score (between 0 and 1) for a user's watch history.
    Returns 0.0 for users with no history.
    
    Logic: A variant of the Herfindahl-Hirschman Index (HHI) on watch duration ratios.
    raw = sum((duration_i / total_duration)^1.5), then squashed by a sigmoid centered at 0.5.
    """
    # Get the row for the user from the sparse matrix
    # user_row is a sparse vector (1 x n_items)
    user_row = matrix[user_idx]
    
    # Extract the non-zero values (durations)
    durations = user_row.data
    
    if len(durations) == 0:
        return 0.0 # No history, can be interpreted as 0 diversity or undefined. 
        
    total_duration = np.sum(durations)
    
    if total_duration == 0:
        return 0.0
        
    # Calculate ratios
    ratios = durations / total_duration
    
    # Sum of ratios raised to the power 1.5 (classic HHI squares them);
    # the lower exponent was chosen to bring the mean closer to 0.5
    score = np.sum(ratios ** 1.5)
    
    k=5
    # k is the "steepness". k > 1 spreads values out.
    # Center input at 0 by subtracting 0.5, multiply by gain, then sigmoid
    return 1 / (1 + np.exp(-k * (score - 0.5)))
    #return float(score)

import matplotlib.pyplot as plt

# Calculate concentration scores for all users
concentration_scores = [calculate_user_concentration(train_matrix, user_idx) for user_idx in range(num_users)]

# Plot histogram
plt.figure(figsize=(8, 5))
plt.hist(concentration_scores, bins=50, color='skyblue', edgecolor='black')
plt.title('User Concentration Scores (HHI) Histogram')
plt.xlabel('Concentration Score')
plt.ylabel('Number of Users')
plt.show()


## 4. Exploratory Analysis
Some intuition-building checks on the 100k-user dataset show how dominant the most popular streamers are and how often a user watches a streamer for the first time. Hopefully this can guide how often our recommender system should recommend a streamer the user has no history of viewing.

In [None]:
train_df.head()

In [None]:
#Top ten most watched streamers in train data

unique_users_per_streamer = (
    train_df.groupby("streamer")["user_id"]
      .nunique()
      .sort_values(ascending=False)
)

import matplotlib.pyplot as plt

unique_users_per_streamer.head(10).plot(kind='bar', figsize=(12, 6))

plt.title("Top 10 Streamers by Unique Users")
plt.xlabel("Streamer Username")
plt.ylabel("Unique Users")
plt.tight_layout()
plt.show()


In [None]:
# Count novel user-streamer pairs in random sample of test data
test_sample_size = 10_000
tst_sample = test_df.sample(test_sample_size)
#trn_sample = train_df.sample(frac=1/10000)

# Build lookup set
train_pairs = set(zip(train_df['user_id'], train_df['streamer']))

novel_pairs = 0
for row in tst_sample.itertuples(index=False):
    if (row.user_id, row.streamer) not in train_pairs:
        novel_pairs += 1

print(f'Number of sampled test rows with novel user-streamer pairs: {novel_pairs} of {test_sample_size}')
print(f'Percentage of novel pairs in test data: {novel_pairs / test_sample_size * 100:.2f}%')

# Count the number of unique streamers in the dataset - gain intuition
print('Number of unique streamers in entire dataset',full_df['streamer'].nunique())

# Named sorted_df to avoid shadowing the built-in sorted()
sorted_df = full_df.sort_values('start').reset_index(drop=True)
user_watch_hist = {}  # user_id -> set of streamers seen so far
novel_instances = 0

for row in sorted_df.itertuples(index=False):
    if row.user_id not in user_watch_hist:
        novel_instances += 1
        user_watch_hist[row.user_id] = {row.streamer}
    elif row.streamer not in user_watch_hist[row.user_id]:
        novel_instances += 1
        user_watch_hist[row.user_id].add(row.streamer)

print('novel interactions (in entire dataset): ', novel_instances, '/', len(sorted_df), ' = ', novel_instances / len(sorted_df))

## 4. Evaluation Engine
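We calculate Hit Rate@K: the fraction of users for whom at least one of the top-K recommendations appears in their test interactions. We define two modes: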
1. **All Items:** Checks if recommendation exists in test set (Reward re-watching).
2. **New Items:** Checks if recommendation exists in test set AND was NOT in train set (Reward discovery).
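
The two modes can be illustrated on toy data. The sketch below is a simplified stand-in for the evaluation loop, assuming recommendations and interactions are plain dictionaries keyed by user; `hit_rate_at_k` and the sample values are hypothetical names introduced only for this example.

```python
def hit_rate_at_k(recs_by_user, test_items, train_items, k, new_only=False):
    """Fraction of eligible users with at least one relevant item in their top-k."""
    hits, eligible = 0, 0
    for user, truth in test_items.items():
        if new_only:
            # Keep only test items the user never watched in training.
            truth = truth - train_items.get(user, set())
        if not truth:
            continue  # user has nothing to hit in this mode
        eligible += 1
        if set(recs_by_user.get(user, [])[:k]) & truth:
            hits += 1
    return hits / eligible if eligible else 0.0


# Hypothetical toy data: item indices per user.
train_items = {0: {10, 11}, 1: {10}}
test_items = {0: {10, 12}, 1: {11}}
recs_by_user = {0: [10, 11], 1: [11, 12]}

all_items = hit_rate_at_k(recs_by_user, test_items, train_items, k=2)
new_items = hit_rate_at_k(recs_by_user, test_items, train_items, k=2, new_only=True)
print(all_items, new_items)  # 1.0 0.5
```

User 0 is hit only through a re-watch (item 10), so that user counts in the first mode but not in the second.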

In [None]:
def evaluate_model(model_name, recommend_func, test_df, train_df, k=10, visualize=True):
    print(f"\n--- Evaluating {model_name} @ K={k} ---")
    
    # Pre-computation for visuals
    test_user_items = test_df.groupby('user_idx')['item_idx'].apply(set).to_dict()
    train_user_items = train_df.groupby('user_idx')['item_idx'].apply(set).to_dict()
    
    # Metrics
    hits_all = 0
    total_users = 0
    hits_new = 0
    total_users_with_new = 0
    
    # Data collectors for Visuals
    all_recommendations = [] # For Popularity Bias
    user_hit_status = [] # For Concentration Plot (conc_score, hit_binary)
    
    # Main Evaluation Loop
    for u_idx, ground_truth_items in test_user_items.items():
        recs = recommend_func(u_idx, k)
        all_recommendations.extend(recs)
        
        # 1. Standard Metric
        is_hit = 0
        if len(set(recs) & ground_truth_items) > 0:
            hits_all += 1
            is_hit = 1
        total_users += 1
        
        # Collect for Concentration Visual
        if train_matrix is not None:
            conc = calculate_user_concentration(train_matrix, u_idx)
            user_hit_status.append((conc, is_hit))
        
        # 2. New Discovery Metric
        past_items = train_user_items.get(u_idx, set())
        true_new_items = ground_truth_items - past_items
        
        if len(true_new_items) > 0:
            if len(set(recs) & true_new_items) > 0:
                hits_new += 1
            total_users_with_new += 1

    precision_all = hits_all / total_users if total_users > 0 else 0
    precision_new = hits_new / total_users_with_new if total_users_with_new > 0 else 0
    
    print(f"Hit Rate (All Items): {precision_all:.4f}")
    print(f"Hit Rate (New Only):  {precision_new:.4f}")
    
    if visualize:
        fig, axes = plt.subplots(1, 3, figsize=(18, 5))
        fig.suptitle(f"Visual Analysis: {model_name}", fontsize=16)
        
        # --- Visual 1: Hit Rate @ K (Sensitivity) ---
        k_values = [1, 5, 10, 20]
        sens_scores = []
        for kv in k_values:
            # Lightweight calc for sensitivity
            h = 0
            t = 0
            for u_idx, gt in list(test_user_items.items())[:500]: # Sample 500 for speed
                r = recommend_func(u_idx, kv)
                if len(set(r) & gt) > 0: h += 1
                t += 1
            sens_scores.append(h/t if t>0 else 0)
            
        axes[0].plot(k_values, sens_scores, marker='o', color='b')
        axes[0].set_title("Sensitivity: Hit Rate @ K")
        axes[0].set_xlabel("K")
        axes[0].set_ylabel("Hit Rate")
        axes[0].grid(True, alpha=0.3)
        
        # --- Visual 2: Popularity Bias ---
        # Calculate training popularity ranks
        pop_counts = train_df['item_idx'].value_counts().reset_index()
        pop_counts.columns = ['item_idx', 'count']
        pop_counts['rank'] = pop_counts['count'].rank(ascending=False)
        rank_map = pop_counts.set_index('item_idx')['rank'].to_dict()
        
        rec_ranks = [rank_map.get(i, len(rank_map)+1) for i in all_recommendations]
        
        axes[1].hist(rec_ranks, bins=30, color='purple', alpha=0.7, log=True)
        axes[1].set_title("Popularity Bias (Log Scale)")
        axes[1].set_xlabel("Streamer Popularity Rank (1=Top)")
        axes[1].set_ylabel("Freq of Recommendation")
        
        # --- Visual 3: Performance vs Concentration ---
        if user_hit_status:
            x_vals = [x[0] for x in user_hit_status]
            y_vals = [x[1] for x in user_hit_status]
            # Add jitter to y_vals so points don't overlap completely on 0 and 1 lines
            y_jitter = [y + np.random.normal(0, 0.05) for y in y_vals]
            
            axes[2].scatter(x_vals, y_jitter, alpha=0.3, c='teal', s=10)
            axes[2].set_title("Performance vs User Focus")
            axes[2].set_xlabel("Concentration (0=Diverse, 1=Focus)")
            axes[2].set_yticks([0, 1])
            axes[2].set_yticklabels(['Miss', 'Hit'])
        
        plt.tight_layout()
        plt.show()

    return precision_all, precision_new


## 5. Baseline Model: REP (Repeat / Popularity) and POP (Simple Popularity)
These models represent the "Naive" Twitch strategy. REP follows both steps below; POP uses only the global-popularity step:
1. Recommend what the user watched most in the past.
2. If we need more items, fill with globally most popular streamers.
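
The two steps can be traced on a tiny example. This is a minimal sketch that ranks by interaction count, assuming a user's history is a plain list of streamer names; `rep_recommend` and the sample data are hypothetical and separate from the class defined in the next cell.

```python
from collections import Counter


def rep_recommend(user_history, global_counts, k):
    """Return up to k items: the user's most-watched first, then global favorites."""
    # Step 1: repeat. Most frequent items in the user's own history.
    recs = [item for item, _ in Counter(user_history).most_common(k)]
    # Step 2: fill remaining slots with globally popular items not yet chosen.
    for item, _ in global_counts.most_common():
        if len(recs) >= k:
            break
        if item not in recs:
            recs.append(item)
    return recs[:k]


# Hypothetical data: one user's watch log and global interaction counts.
history = ["a", "a", "b"]
popularity = Counter({"c": 5, "a": 4, "d": 3})

print(rep_recommend(history, popularity, k=3))  # ['a', 'b', 'c']
print(rep_recommend([], popularity, k=2))       # ['c', 'a']
```

A user with no history falls through to step 2 entirely, which is the behavior of a pure popularity baseline.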

In [None]:
class REPModel:
    def __init__(self, train_df, num_items):
        self.num_items = num_items
        
        # Precompute user favorites (sorted by number of interactions, most frequent first)
        self.user_history = train_df.groupby('user_idx')['item_idx'].apply(
            lambda x: x.value_counts().index.tolist()
        ).to_dict()
        
        # Precompute global popularity
        self.global_popular = train_df['item_idx'].value_counts().index.tolist()
        
    def recommend(self, user_idx, k=10):
        recs = []
        
        # 1. Add History (Repeat)
        if user_idx in self.user_history:
            recs.extend(self.user_history[user_idx][:k])
            
        # 2. Fill with Popular (if needed)
        if len(recs) < k:
            for item in self.global_popular:
                if item not in recs:
                    recs.append(item)
                    if len(recs) >= k:
                        break
        return recs[:k]

print("Training REP Baseline...")
rep_model = REPModel(train_df, num_items)
evaluate_model("REP (Repeat/Popularity) (k=1)", rep_model.recommend, test_df, train_df, k=1)
evaluate_model("REP (Repeat/Popularity) (k=10)", rep_model.recommend, test_df, train_df, k=10)


In [None]:
class POPModel:
    def __init__(self, train_df):
        # Precompute global popularity
        # value_counts() returns items sorted by frequency descending
        self.global_popular = train_df['item_idx'].value_counts().index.tolist()
        
    def recommend(self, user_idx, k=10):
        # Always return top k popular items
        return self.global_popular[:k]

print("Training POP Baseline...")
pop_model = POPModel(train_df)
evaluate_model("POP (Global Popularity) (K=1)", pop_model.recommend, test_df, train_df, k=1)
evaluate_model("POP (Global Popularity) (K=10)", pop_model.recommend, test_df, train_df, k=10)

## 6. Machine Learning Model: ALS (Implicit Matrix Factorization)
We use the `implicit` library. This learns embeddings based on co-occurrence.
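
Once user and item embeddings exist, ranking reduces to a dot product between one user vector and every item vector. The sketch below illustrates that scoring step only; it is not the `implicit` library's implementation, and the random factors, `top_k`, and the `liked` argument are hypothetical stand-ins for learned embeddings and the library's own filtering.

```python
import numpy as np

# Random stand-ins for learned embeddings (3 users, 5 items, 4 factors).
rng = np.random.default_rng(0)
user_factors = rng.normal(size=(3, 4))
item_factors = rng.normal(size=(5, 4))


def top_k(user_idx, k, liked=None):
    """Rank items for one user by embedding dot product, optionally hiding liked items."""
    scores = item_factors @ user_factors[user_idx]
    if liked:
        scores = scores.copy()
        scores[list(liked)] = -np.inf  # excluded items can never rank first
    order = np.argsort(-scores)[:k]
    return order, scores[order]


ids_all, _ = top_k(0, k=3)
# "Discovery" variant: pretend the user's current top item is already in their history.
ids_new, scores_new = top_k(0, k=3, liked={int(ids_all[0])})
print(ids_all, ids_new)
```

Hiding already-watched items is what separates the retention-style and discovery-style ALS runs in the cell below.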

In [None]:
if HAS_IMPLICIT:
    print("\nTraining ALS Model...")
    
    # implicit >= 0.5 expects a (users x items) CSR matrix in fit();
    # older versions expected (items x users). Check your installed version.
    # We use the newer convention here: fit(user_item)
    
    # Initialize Model
    # factors=64, regularization=0.05, iterations=20 are standard starting points
    als_model = implicit.als.AlternatingLeastSquares(
        factors=64, 
        regularization=0.05, 
        iterations=20,
        random_state=42
    )
    
    # Train
    # Note: implicit expects (users, items) sparse matrix for training in recent versions
    als_model.fit(train_matrix)
    
    def recommend_als(user_idx, k=10):
        # implicit's recommend function
        # filter_already_liked_items=False allows us to compare fairly with REP (which recommends history)
        # However, for pure discovery, we might want True. 
        # We set False here to allow the model to decide if re-watching is relevant.
        ids, scores = als_model.recommend(
            user_idx, 
            train_matrix[user_idx], 
            N=k, 
            filter_already_liked_items=False 
        )
        return ids
    
    evaluate_model("ALS (Matrix Factorization) (K=1)", recommend_als, test_df, train_df, k=1)
    evaluate_model("ALS (Matrix Factorization) (K = 10)", recommend_als, test_df, train_df, k=10)

    
    # --- ALS (Pure Discovery Mode) ---
    # Let's test ALS forced to explore (filter_already_liked_items=True)
    def recommend_als_discovery(user_idx, k=10):
        ids, scores = als_model.recommend(
            user_idx, 
            train_matrix[user_idx], 
            N=k, 
            filter_already_liked_items=True 
        )
        return ids
        
    evaluate_model("ALS (Discovery Mode) (K=1)", recommend_als_discovery, test_df, train_df, k=1)
    evaluate_model("ALS (Discovery Mode) (K=10)", recommend_als_discovery, test_df, train_df, k=10)


else:
    print("Skipping ALS evaluation (library missing).")



## 7. MF-BPR With Uniform Negative Sampling
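Uses a preliminary matrix factorization model trained with the BPR (Bayesian Personalized Ranking) criterion. In the code below, each negative is drawn from the user's own history with probability `Prepeat` and uniformly from all items otherwise.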
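
The ranking objective can be checked by hand on a few score pairs. This is a minimal pure-Python sketch of the pairwise loss, the mean of -log(sigmoid(x_ui - x_uj)), assuming scores are already computed; `bpr_loss` is a hypothetical helper and omits the regularization term and any numerical hardening.

```python
import math


def bpr_loss(pos_scores, neg_scores):
    """Mean of -log(sigmoid(x_ui - x_uj)) over (positive, negative) score pairs."""
    losses = []
    for x_ui, x_uj in zip(pos_scores, neg_scores):
        diff = x_ui - x_uj
        # -log(sigmoid(d)) equals log(1 + exp(-d))
        losses.append(math.log1p(math.exp(-diff)))
    return sum(losses) / len(losses)


# Equal scores give log(2); ranking the positive higher lowers the loss.
tie = bpr_loss([0.0], [0.0])
good = bpr_loss([2.0], [0.0])
bad = bpr_loss([0.0], [2.0])
print(tie, good, bad)
```

The loss only depends on the score difference, so the model is rewarded for ordering the watched item above the sampled negative rather than for predicting any absolute value.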

In [None]:
import tensorflow as tf
import numpy as np

train_data = train_df
test_data = test_df

# Pairs of (user_id, streamer_id) in the training data
trainInteractions = list(zip(train_data['user_idx'], train_data['item_idx']))
# For each user id, this gets the set of consumed item ids (streamers they watched)
user_consumed_items = train_data.groupby('user_idx')['item_idx'].apply(set).to_dict()
len(trainInteractions)

# Uses 'trainInteractions' and 'user_consumed_items' defined above
interactionsArr = np.array(trainInteractions)

# --- 2. Negative Sampling Logic (Helper) ---
def sampleNegativeBatch(sampleU, sampleI, num_items, Prepeat=0.5):
    """
    Takes a batch of Users and Items, generates Negatives, and returns (U, I, J).
    """
    batch_size = len(sampleU)
    sampleJ = np.zeros(batch_size, dtype=np.int64)

    for k in range(batch_size):
        u = sampleU[k]
        i = sampleI[k]

        # Logic: Try to sample from history (Repeat)
        if np.random.rand() < Prepeat:
            consumed = user_consumed_arrays[u]
            if len(consumed) > 1 or (len(consumed) == 1 and consumed[0] != i):
                j = np.random.choice(consumed)
                while j == i:
                    j = np.random.choice(consumed)
                sampleJ[k] = j
                continue

        # Logic: Novel Sample
        j = np.random.randint(0, num_items)
        while j == i:
            j = np.random.randint(0, num_items)
        sampleJ[k] = j

    return sampleU, sampleI, sampleJ

def negativeSampling(u_batch, i_batch):
    # This wrapper allows the dataset pipeline to call our Python function
    # It takes (U, I) and returns (U, I, J)
    u_out, i_out, j_out = tf.numpy_function(
        func=lambda u, i: sampleNegativeBatch(u, i, num_items, Prepeat=0.5),
        inp=[u_batch, i_batch],
        Tout=[tf.int64, tf.int64, tf.int64]
    )
    return u_out, i_out, j_out

class MFModel(tf.keras.Model):
    def __init__(self, K, lamb):
        super(MFModel, self).__init__()
        # Small random initialization: stddev=0.001 for item biases, 0.1 for latent factors
        self.betaI = tf.Variable(tf.random.normal([num_items],stddev=0.001))
        self.gammaU = tf.Variable(tf.random.normal([num_users, K], stddev=0.1))
        self.gammaI = tf.Variable(tf.random.normal([num_items, K], stddev=0.1))
        self.lamb = lamb

    def predict(self, u, i):
        p = self.betaI[i] + tf.tensordot(self.gammaU[u], self.gammaI[i], 1)
        return p

    def recommend(self, u, N=10):
        u = tf.convert_to_tensor(u, dtype=tf.int64)
        gamma_u = tf.nn.embedding_lookup(self.gammaU, u)
        interaction_scores = tf.matmul(self.gammaI, tf.expand_dims(gamma_u, axis=-1))
        scores = self.betaI + tf.squeeze(interaction_scores, axis=-1)
        top_N = tf.math.top_k(scores, k=N)
        return top_N.indices.numpy(), top_N.values.numpy()

    def reg(self):
        return self.lamb * (tf.nn.l2_loss(self.betaI) +\
                    tf.nn.l2_loss(self.gammaU) +\
                    tf.nn.l2_loss(self.gammaI))

    def score(self, sampleU, sampleI):
        u = tf.convert_to_tensor(sampleU, dtype=tf.int64)
        i = tf.convert_to_tensor(sampleI, dtype=tf.int64)
        beta_i = tf.nn.embedding_lookup(self.betaI, i)
        gamma_u = tf.nn.embedding_lookup(self.gammaU, u)
        gamma_i = tf.nn.embedding_lookup(self.gammaI, i)
        x_ui = beta_i + tf.reduce_sum(tf.multiply(gamma_u, gamma_i), 1)
        return x_ui

    def call(self, inputs):
        sampleU, sampleI, sampleJ = inputs
        x_ui = self.score(sampleU, sampleI)
        x_uj = self.score(sampleU, sampleJ)
        return -tf.reduce_mean(tf.math.log(tf.math.sigmoid(x_ui - x_uj)))

    # --- NEW: Custom Training Step for model.fit() ---
    def train_step(self, data):
        # Unpack the data tuple (u, i, j) provided by the dataset
        u, i, j = data

        with tf.GradientTape() as tape:
            loss = self.call((u, i, j))
            loss += self.reg()

        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, [self.betaI, self.gammaU, self.gammaI])

        # Update weights
        self.optimizer.apply_gradients(zip(gradients, [self.betaI, self.gammaU, self.gammaI]))

        # Return metrics
        return {"loss": loss}
    
# --- 1. Data Preparation ---
print("Converting user history to arrays...")
user_consumed_arrays = {}
for u in user_consumed_items:
    user_consumed_arrays[u] = np.array(list(user_consumed_items[u]), dtype=np.int64)

# Prepare positive training data
train_u = interactionsArr[:, 0]
train_i = interactionsArr[:, 1]

# --- 2. Create High-Performance Dataset ---
BATCH_SIZE = 2**13  # 8,192

# 1. Create dataset of (User, Item)
dataset = tf.data.Dataset.from_tensor_slices((train_u, train_i))
dataset = dataset.shuffle(buffer_size=100000, reshuffle_each_iteration=True)

# 2. Batch FIRST (so we process a full batch in Python at once, not 1 item at a time)
dataset = dataset.batch(BATCH_SIZE)

# 3. Map (User, Item) -> (User, Item, NegativeItem)
dataset = dataset.map(negativeSampling, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.prefetch(tf.data.AUTOTUNE)

# --- 3. Model Initialization ---
LAMB = 0.0001
LR = 0.0005
EPOCHS = 10

# Initialize the model
model = MFModel(K=20, lamb=LAMB)
model.compile(optimizer=tf.keras.optimizers.Adam(LR))

# --- 4. Training Loop ---
print("Starting training...")

history = model.fit(dataset, epochs=EPOCHS)

def MFRecommend(user_idx, k=10):
    # Top-k items from the trained MF-BPR model
    ids, scores = model.recommend(
        user_idx, 
        k, 
    )
    return ids
evaluate_model("MF (BPR-NegSample) Model (K=1)", MFRecommend, test_df, train_df, k=1)
evaluate_model("MF (BPR-NegSample) Model (K=10)", MFRecommend, test_df, train_df, k=10)

## 8. Hybrid Model (Stochastic Mix)
Mixes REP (Retention) and ALS Discovery (Exploration).
For each slot, draws a random number in [0, 1): above 0.4 (a 60% chance) takes the next item from REP, otherwise from ALS.

In [None]:
class HybridRandomModel:
    def __init__(self, rep_model, als_func):
        self.rep_model = rep_model
        self.als_func = als_func
        
    def recommend(self, user_idx, k=10):
        # Fetch buffers from both sources (get k from each to ensure enough candidates)
        rep_candidates = self.rep_model.recommend(user_idx, k=k)
        als_candidates = self.als_func(user_idx, k=k)
        
        recs = []
        seen = set()
        
        rep_ptr = 0
        als_ptr = 0
        
        # Fill k slots
        while len(recs) < k:
            # If both exhausted, stop
            if rep_ptr >= len(rep_candidates) and als_ptr >= len(als_candidates):
                break
                
            choice = random.random()
            use_rep = False
            
            # Decision Logic
            if choice > 0.4:
                if rep_ptr < len(rep_candidates):
                    use_rep = True
                else:
                    use_rep = False # Fallback to ALS
            else:
                if als_ptr < len(als_candidates):
                    use_rep = False
                else:
                    use_rep = True # Fallback to REP
            
            # Selection
            item = None
            if use_rep:
                item = rep_candidates[rep_ptr]
                rep_ptr += 1
            else:
                item = als_candidates[als_ptr]
                als_ptr += 1
            
            # Deduplicate
            if item not in seen:
                recs.append(item)
                seen.add(item)
                
        return recs

hybrid_model = HybridRandomModel(rep_model, recommend_als_discovery)

print("\n--- Evaluating Hybrid Model (REP + ALS Discovery) ---")
evaluate_model("Hybrid Random (K=1)", hybrid_model.recommend, test_df, train_df, k=1)
evaluate_model("Hybrid Random (K=10)", hybrid_model.recommend, test_df, train_df, k=10)


## 9. Hybrid Model with Diversity/Concentration Analysis Function
This function calculates a "Concentration Score" based on the Herfindahl-Hirschman Index (HHI):
- Close to 1.0 -> Mostly watches one streamer (high concentration)
- Close to 0.0 -> Watches many streamers about equally (low concentration)

We then use this score in the Hybrid Model to determine how often REP or ALS discovery is used: the higher a user's concentration, the more often a slot is filled from REP.
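
A few hand-checkable values show how the score behaves. The sketch below mirrors the exponent (1.5) and sigmoid steepness (5) used by `calculate_user_concentration` earlier in the notebook, but works on plain lists; the function name `concentration` and the sample durations are hypothetical.

```python
import math


def concentration(durations, power=1.5, k=5.0):
    """Sigmoid-rescaled sum of duration shares raised to `power`."""
    total = sum(durations)
    if total == 0:
        return 0.0  # no history
    raw = sum((d / total) ** power for d in durations)
    return 1.0 / (1.0 + math.exp(-k * (raw - 0.5)))


one_streamer = concentration([120.0])        # raw = 1.0
four_equal = concentration([30.0] * 4)       # raw = 4 * 0.25**1.5 = 0.5
hundred_equal = concentration([1.0] * 100)   # raw = 100**-0.5 = 0.1

print(round(one_streamer, 3), round(four_equal, 3), round(hundred_equal, 3))
```

Because of the sigmoid, a single-streamer user scores about 0.92 rather than exactly 1.0, so even the most focused users still receive some ALS discovery slots in the weighted hybrid.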

In [None]:
class WeightedHybridModel:
    def __init__(self, rep_model, als_func, train_matrix):
        self.rep_model = rep_model
        self.als_func = als_func
        self.train_matrix = train_matrix
        
    def recommend(self, user_idx, k=10):
        # Fetch buffers from both sources (get k from each to ensure enough candidates)
        rep_candidates = self.rep_model.recommend(user_idx, k=k)
        als_candidates = self.als_func(user_idx, k=k)
        diversity_score = calculate_user_concentration(self.train_matrix, user_idx)
        
        recs = []
        seen = set()
        
        rep_ptr = 0
        als_ptr = 0
        
        # Fill k slots
        while len(recs) < k:
            # If both exhausted, stop
            if rep_ptr >= len(rep_candidates) and als_ptr >= len(als_candidates):
                break
                
            choice = random.random()
            use_rep = False
            
            # Decision Logic
            if choice < diversity_score:
                if rep_ptr < len(rep_candidates):
                    use_rep = True
                else:
                    use_rep = False # Fallback to ALS
            else:
                if als_ptr < len(als_candidates):
                    use_rep = False
                else:
                    use_rep = True # Fallback to REP
            
            # Selection
            item = None
            if use_rep:
                item = rep_candidates[rep_ptr]
                rep_ptr += 1
            else:
                item = als_candidates[als_ptr]
                als_ptr += 1
            
            # Deduplicate
            if item not in seen:
                recs.append(item)
                seen.add(item)
                
        return recs

w_hybrid_model = WeightedHybridModel(rep_model, recommend_als_discovery, train_matrix)

print("\n--- Evaluating WeightedHybridModel (REP + ALS Discovery) ---")
evaluate_model("Weighted Hybrid (K=1)", w_hybrid_model.recommend, test_df, train_df, k=1)
evaluate_model("Weighted Hybrid (K=10)", w_hybrid_model.recommend, test_df, train_df, k=10)


## 10. Summary & Interpretation

**Expected Results:**
1. **Hit Rate (All Items):** **REP** should effectively tie or beat ALS. In our dataset, about 50% of consumption is re-watching. A model that simply says "watch what you watched yesterday" is incredibly hard to beat for general engagement.
2. **Hit Rate (New Only):** **ALS (Discovery Mode)** should crush REP. REP relies on history; it fails to find *new* items (except via crude global popularity). ALS uses collaborative filtering ("Users like you watched X") to find specific, niche new streamers for the user.
3. **Hybrid Model:** Should sit between REP and ALS, offering a balanced trade-off between retention (All Items Hit Rate) and exploration (New Items Hit Rate).