<a href="https://colab.research.google.com/github/Saurabh1222/HybridRecommendationSystem/blob/main/HybridRecommendationSystem.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# The Core Model: Hybrid Filtering
The primary goal is to combine the strengths of two fundamental recommendation approaches:

### Content-Based Filtering (CBF): Recommends items similar to what the user has liked in the past.

- Data Used: Restaurant/Dish features (cuisine, cost, ingredients, veg/non-veg, restaurant tags, menu descriptions).

- Example: Recommending a new North Indian restaurant to a user who frequently orders from North Indian places.

### Collaborative Filtering (CF): Recommends items that similar users have liked.

- Data Used: User-Item interaction data (ratings, past orders, clicks, searches).

- Example: Recommending a specific Biryani to a user whose peer group (users with similar ordering history) frequently orders it.

## Data Collection & Preprocessing

- Acquire or simulate a dataset containing User IDs, Restaurant IDs, Dish IDs, Ratings/Reviews, Order History, Cuisine Type, Cost, and Timestamps.

- Clean and transform raw data into a suitable format (e.g., a sparse User-Item interaction matrix).
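
As a rough illustration of the last step, the sketch below pivots order rows into a sparse user-item matrix. It assumes the `user_id`, `item_id`, and `rating` columns used throughout this notebook. The helper name `build_interaction_matrix` and the toy rows are hypothetical, and averaging repeated ratings is one possible choice, not a requirement.

```python
import pandas as pd
from scipy.sparse import csr_matrix

def build_interaction_matrix(orders_df):
    """Pivot (user_id, item_id, rating) rows into a sparse user x item matrix."""
    # Average repeated ratings so each (user, item) pair appears only once.
    agg = orders_df.groupby(['user_id', 'item_id'], as_index=False)['rating'].mean()

    # Map raw IDs to contiguous row/column positions (sorted by ID).
    user_codes, user_index = pd.factorize(agg['user_id'], sort=True)
    item_codes, item_index = pd.factorize(agg['item_id'], sort=True)

    matrix = csr_matrix(
        (agg['rating'].to_numpy(dtype=float), (user_codes, item_codes)),
        shape=(len(user_index), len(item_index)),
    )
    return matrix, user_index, item_index

# Hypothetical toy data: user 3 rated item 30 twice (2 and 4).
toy_orders = pd.DataFrame({
    'user_id': [1, 1, 2, 3, 3],
    'item_id': [10, 20, 10, 30, 30],
    'rating':  [5, 3, 4, 2, 4],
})
matrix, user_index, item_index = build_interaction_matrix(toy_orders)
print(matrix.toarray())
```

Only the observed (user, item) pairs are stored, which matters once the catalogue grows well beyond the 100 x 50 grid simulated here.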

In [9]:
#%%writefile data_generator.py
# Data Generator
import pandas as pd
import numpy as np

def generate_synthetic_data(num_users=100, num_restaurants=50, max_orders=500):
    """
    Generates synthetic data for User Orders and Restaurant Features.
    """
    np.random.seed(42)

    # 1. User-Item Interaction Data (Orders/Ratings)
    user_ids = np.random.randint(1, num_users + 1, max_orders)
    restaurant_ids = np.random.randint(1, num_restaurants + 1, max_orders)
    ratings = np.random.randint(1, 6, max_orders) # 1 to 5 star rating
    timestamps = pd.to_datetime('2024-01-01') + pd.to_timedelta(np.random.randint(0, 365, max_orders), unit='D')

    orders_df = pd.DataFrame({
        'user_id': user_ids,
        'item_id': restaurant_ids,
        'rating': ratings,
        'timestamp': timestamps
    })

    # 2. Restaurant Feature Data (for Content-Based Filtering)
    cuisines = ['Indian', 'Chinese', 'Italian', 'Mexican', 'Fast Food']
    features = ['spicy', 'vegetarian', 'delivery-focused', 'budget', 'premium']

    features_data = {
        'item_id': np.arange(1, num_restaurants + 1),
        'cuisine': np.random.choice(cuisines, num_restaurants),
        'tags': [', '.join(np.random.choice(features, np.random.randint(1, 4), replace=False)) for _ in range(num_restaurants)],
        'avg_price': np.round(np.random.uniform(200, 800, num_restaurants), 0)
    }
    features_df = pd.DataFrame(features_data)

    return orders_df, features_df


orders, features = generate_synthetic_data()
print("--- Orders Data Head ---")
print(orders.head())
print("\n--- Features Data Head ---")
print(features.head())
# Save the data for use in other modules
orders.to_csv('orders.csv', index=False)
features.to_csv('features.csv', index=False)

--- Orders Data Head ---
   user_id  item_id  rating  timestamp
0       52       12       2 2024-01-29
1       93       26       1 2024-11-13
2       15       16       4 2024-09-01
3       72       37       3 2024-08-23
4       61       22       2 2024-07-03

--- Features Data Head ---
   item_id    cuisine                                   tags  avg_price
0        1    Chinese  premium, vegetarian, delivery-focused      699.0
1        2     Indian                             vegetarian      602.0
2        3  Fast Food    vegetarian, delivery-focused, spicy      242.0
3        4    Mexican                spicy, delivery-focused      584.0
4        5    Chinese           vegetarian, delivery-focused      363.0


## Content Based Filtering
### Generating recommendations based on restaurant features using TF-IDF and Cosine Similarity.
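
To make the idea concrete before the full class, here is a minimal sketch of TF-IDF plus cosine similarity on three made-up restaurant descriptions. The strings are hypothetical and not taken from the generated dataset; only the two scikit-learn calls match what the recommender below uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical "cuisine + tags" strings for three restaurants.
descriptions = [
    "Indian spicy budget",         # item 0
    "Indian spicy premium",        # item 1
    "Italian vegetarian premium",  # item 2
]

# Each description becomes a weighted term vector.
tfidf = TfidfVectorizer().fit_transform(descriptions)

# Pairwise cosine similarity: 1.0 on the diagonal, 0.0 when no terms are shared.
sim = cosine_similarity(tfidf)
print(sim.round(3))
```

Items 0 and 1 share two terms and so score higher than items 1 and 2, which share only one; items 0 and 2 share none and score zero.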

In [17]:
#%%writefile content_filter.py
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class ContentBasedRecommender:
    def __init__(self, features_df):
        """
        Initializes the Content-Based Recommender.
        """
        self.features_df = features_df.set_index('item_id')
        self.vectorizer = TfidfVectorizer(stop_words='english')
        self.cosine_sim = None
        self._build_model()

    def _build_model(self):
        """
        Processes features (cuisine + tags) and computes the Cosine Similarity matrix.
        """
        # Combine relevant text features into one string
        self.features_df['combined_features'] = (
            self.features_df['cuisine'].str.strip() + ' ' + self.features_df['tags'].str.strip()
        )

        # Create the TF-IDF matrix
        tfidf_matrix = self.vectorizer.fit_transform(self.features_df['combined_features'])

        # Compute the cosine similarity matrix
        self.cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

    def get_recommendations(self, item_id, top_n=10):
        """
        Recommends items similar to the given item_id.
        Returns a list of (item_id, similarity_score) tuples.
        """
        if item_id not in self.features_df.index:
            return []

        # Get the index of the item that matches the item_id
        idx = self.features_df.index.get_loc(item_id)

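        # Get the pairwise similarity scores, excluding the item itself.
        # (Dropping position 0 after sorting is unsafe: another item with
        # identical features also scores 1.0 and may sort ahead of it.)
        sim_scores = [(i, score) for i, score in enumerate(self.cosine_sim[idx]) if i != idx]

        # Sort the restaurants based on the similarity scores
        sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

        # Keep the top_n most similar items
        sim_scores = sim_scores[:top_n]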

        # Get the item indices and scores
        item_indices = [i[0] for i in sim_scores]
        scores = [i[1] for i in sim_scores]

        # Map back to item_id
        recommended_items = self.features_df.iloc[item_indices].index.tolist()

        return list(zip(recommended_items, scores))


# Load data generated in the previous step
features_df = pd.read_csv('features.csv')

cbf_recommender = ContentBasedRecommender(features_df)

# Get recommendations for item_id 1
item_id_to_test = 1
recommendations = cbf_recommender.get_recommendations(item_id_to_test)

print(f"\n--- Content-Based Recommendations for Item {item_id_to_test} ---")
print(f"Top 5 Recommendations: {recommendations[:5]}")


--- Content-Based Recommendations for Item 1 ---
Top 5 Recommendations: [(5, 0.895913640502298), (29, 0.8042668552056544), (37, 0.7986381544578232), (17, 0.7840524295710705), (34, 0.7840524295710705)]


## Collaborative Filtering
Use the Surprise library to implement a collaborative filtering model (SVD).
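
For intuition, the Surprise documentation describes the SVD prediction as the global mean plus a user bias, an item bias, and the dot product of the user and item latent factors, clipped to the rating scale by default. The sketch below reproduces that arithmetic with NumPy; the function name `svd_predict` and all numbers are made up for illustration and are not learned from the dataset.

```python
import numpy as np

def svd_predict(mu, b_u, b_i, p_u, q_i, rating_scale=(1, 5)):
    """r_hat = mu + b_u + b_i + q_i . p_u, clipped to the rating scale."""
    raw = mu + b_u + b_i + float(np.dot(q_i, p_u))
    low, high = rating_scale
    return min(max(raw, low), high)

# Hypothetical parameters for one (user, item) pair with 2 latent factors.
mu = 3.0                      # global mean rating
b_u = 0.2                     # this user rates slightly above average
b_i = -0.1                    # this item is rated slightly below average
p_u = np.array([0.5, -0.2])   # user factors
q_i = np.array([0.4, 0.1])    # item factors

predicted = svd_predict(mu, b_u, b_i, p_u, q_i)
print(round(predicted, 2))    # 3.0 + 0.2 - 0.1 + 0.18 = 3.28
```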

In [4]:
#!pip install numpy==1.26.4

In [None]:
!pip install scikit-surprise

In [14]:
#%%writefile collaborative_filter.py
import pandas as pd
from surprise import Reader, Dataset, SVD
from surprise.model_selection import train_test_split
from surprise import accuracy

class CollaborativeRecommender:
    def __init__(self, orders_df):
        """
        Initializes and trains the Collaborative Filtering model (SVD).
        """
        self.orders_df = orders_df
        # Surprise requires data in a specific format (User, Item, Rating)
        self.reader = Reader(rating_scale=(1, 5))
        self.data = Dataset.load_from_df(
            self.orders_df[['user_id', 'item_id', 'rating']],
            self.reader
        )
        self.trainset = self.data.build_full_trainset()
        self.model = SVD(n_epochs=20, n_factors=50, random_state=42)
        self._train_model()

    def _train_model(self):
        """
        Trains the SVD model on the full dataset.
        """
        self.model.fit(self.trainset)

        # Optional: Evaluate the model on a test set (not strictly needed for production prediction)
        # trainset, testset = train_test_split(self.data, test_size=0.2, random_state=42)
        # self.model.fit(trainset)
        # predictions = self.model.test(testset)
        # print(f"CF Model RMSE: {accuracy.rmse(predictions, verbose=False)}")


    def predict_score(self, user_id, item_id):
        """
        Predicts the rating a user would give to an item.
        """
        # The estimate (est) is the predicted rating.
        prediction = self.model.predict(user_id, item_id)
        return prediction.est

    def get_top_n_recommendations(self, user_id, all_item_ids, top_n=10):
        """
        Generates top N item recommendations for a given user.
        """
        # Get items the user hasn't rated/ordered yet
        user_ordered_items = self.orders_df[self.orders_df['user_id'] == user_id]['item_id'].unique()
        items_to_predict = [item for item in all_item_ids if item not in user_ordered_items]

        # Predict ratings for all potential items
        predictions = []
        for item_id in items_to_predict:
            score = self.predict_score(user_id, item_id)
            predictions.append((item_id, score))

        # Sort by predicted score
        predictions.sort(key=lambda x: x[1], reverse=True)

        return predictions[:top_n]


# Load data
orders_df = pd.read_csv('orders.csv')
all_item_ids = orders_df['item_id'].unique().tolist()

cf_recommender = CollaborativeRecommender(orders_df)

# Get recommendations for user 10
user_id_to_test = 10
recommendations = cf_recommender.get_top_n_recommendations(user_id_to_test, all_item_ids)

print(f"\n--- Collaborative Filtering Recommendations for User {user_id_to_test} ---")
print(f"Top 5 Recommendations: {recommendations[:5]}")


--- Collaborative Filtering Recommendations for User 10 ---
Top 5 Recommendations: [(35, 3.684305626538732), (42, 3.367812693564362), (39, 3.361375030569888), (4, 3.315204626329262), (30, 3.2882940994001255)]


## Hybrid Recommendation
This is the main module that integrates the two models and applies contextual filtering.
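
One way the contextual step could look up the real cuisine instead of relying on an ID range is sketched below. It assumes a features table with the `item_id` and `cuisine` columns generated earlier; the function name `cuisine_time_boosts`, its parameters, and the toy rows are hypothetical.

```python
from datetime import datetime
import pandas as pd

def cuisine_time_boosts(features_df, current_time, cuisine='Indian',
                        start_hour=12, end_hour=14, boost=1.15):
    """Return {item_id: multiplier}, boosting one cuisine inside an hour window."""
    in_window = start_hour <= current_time.hour < end_hour
    boosts = {}
    for item_id, item_cuisine in zip(features_df['item_id'], features_df['cuisine']):
        # Only items of the target cuisine are boosted, and only inside the window.
        boosted = in_window and item_cuisine == cuisine
        boosts[int(item_id)] = boost if boosted else 1.0
    return boosts

# Hypothetical three-restaurant catalogue.
toy_features = pd.DataFrame({
    'item_id': [1, 2, 3],
    'cuisine': ['Chinese', 'Indian', 'Fast Food'],
})
lunch = cuisine_time_boosts(toy_features, datetime(2025, 1, 1, 13, 0))
evening = cuisine_time_boosts(toy_features, datetime(2025, 1, 1, 16, 0))
print(lunch)
print(evening)
```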

### Develop boosting/penalizing logic to adjust the final recommendation score based on real-time factors.
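
The scoring arithmetic used by the class below can be checked by hand. This sketch uses the same weights (60% collaborative, 40% content-based), the same division of the predicted rating by 5, and the same 1.15 lunchtime multiplier; the standalone function `blend_scores` and the input values are illustrative only.

```python
def blend_scores(cf_rating, cbf_similarity, cf_weight=0.6, context_multiplier=1.0):
    """Blend a predicted rating (1-5) with a similarity (0-1), then apply a context boost."""
    cf_score = cf_rating / 5.0  # scale the rating to at most 1.0
    hybrid = cf_weight * cf_score + (1.0 - cf_weight) * cbf_similarity
    return hybrid * context_multiplier

# Hypothetical item: predicted rating 4.0, similarity 0.5 to the last order.
base = blend_scores(4.0, 0.5)                              # 0.6*0.8 + 0.4*0.5 = 0.68
boosted = blend_scores(4.0, 0.5, context_multiplier=1.15)  # 0.68 * 1.15 = 0.782
print(f"{base:.3f} {boosted:.3f}")
```

Because the boost is multiplicative, it can reorder items with close hybrid scores but cannot rescue an item whose blended score is far below the rest.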

In [20]:
import pandas as pd
import numpy as np
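# These imports work only if content_filter.py and collaborative_filter.py were
# written to disk by uncommenting the %%writefile lines in the cells above.
from content_filter import ContentBasedRecommender
from collaborative_filter import CollaborativeRecommender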
from datetime import datetime

class HybridRecommender:
    def __init__(self, orders_df, features_df):
        """
        Initializes the Hybrid Recommender with both models.
        """
        self.cbf = ContentBasedRecommender(features_df)
        self.cf = CollaborativeRecommender(orders_df)
        self.all_item_ids = orders_df['item_id'].unique().tolist()

        # Hyperparameter for blending the two scores
        self.cf_weight = 0.6  # 60% Collaborative, 40% Content-Based
        self.cbf_weight = 1.0 - self.cf_weight

    def _get_time_factor(self, current_time):
        """
        Simple contextual logic: Boost certain cuisines based on time of day.
        Returns a dictionary of {item_id: score_multiplier}.
        """
        hour = current_time.hour

        # Example logic: Boost 'Indian' cuisine for Lunch (12-2 PM)
        time_boost_factor = 1.15 if 12 <= hour < 14 else 1.0


        time_boosts = {}
        for item_id in self.all_item_ids:
            # Placeholder for actual cuisine lookup
            is_indian = item_id <= 10 # Example: Items 1-10 are 'Indian'
            time_boosts[item_id] = time_boost_factor if is_indian else 1.0

        return time_boosts

    def recommend(self, user_id, last_ordered_item, top_n=10, current_time=None):
        """
        Generates hybrid and contextually filtered recommendations.
        """
        if current_time is None:
            current_time = datetime.now()

        # 1. Get CF scores (predicted ratings)
        cf_predictions = self.cf.get_top_n_recommendations(user_id, self.all_item_ids, top_n=len(self.all_item_ids))
        cf_scores = {item: score for item, score in cf_predictions}

        # 2. Get CBF scores (similarity to last ordered item)
        cbf_scores_list = self.cbf.get_recommendations(last_ordered_item, top_n=len(self.all_item_ids))
        cbf_scores = {item: score for item, score in cbf_scores_list}

        # 3. Apply Time-based Contextual Filtering
        time_factors = self._get_time_factor(current_time)

        final_recommendations = {}
        for item_id in self.all_item_ids:
            # Items the user has already ordered are excluded from the CF predictions, so skip them
            if item_id in cf_scores:
                # CBF score is missing for the last-ordered item itself (or for every item
                # when last_ordered_item is unknown); treat it as 0 in that case
                cf_score = cf_scores[item_id] / 5.0     # Scale CF rating (1-5) to (0.2-1.0)
                cbf_score = cbf_scores.get(item_id, 0)  # Cosine similarity of TF-IDF vectors lies in [0, 1]

                # **Hybrid Blending Formula**
                hybrid_score = (self.cf_weight * cf_score) + (self.cbf_weight * cbf_score)

                # **Contextual Adjustment**
                contextual_factor = time_factors.get(item_id, 1.0)
                final_score = hybrid_score * contextual_factor

                final_recommendations[item_id] = final_score

        # Sort and return top N
        sorted_recommendations = sorted(final_recommendations.items(), key=lambda item: item[1], reverse=True)
        return sorted_recommendations[:top_n]


# Load data
orders_df = pd.read_csv('orders.csv')
features_df = pd.read_csv('features.csv')

# Initialize the system
hybrid_system = HybridRecommender(orders_df, features_df)

# Test case: Recommend for User 10, whose last order was Item 5
user_id_test = 10
last_item_test = 5

print(f"\n--- Hybrid & Contextual Recommendations for User {user_id_test} ---")

# Simulate a Lunchtime request (1 PM)
lunch_time = datetime(2025, 1, 1, 13, 0, 0)
recs_lunch = hybrid_system.recommend(user_id_test, last_item_test, top_n=5, current_time=lunch_time)
print(f"\n[Lunchtime (1 PM) - Boosts Indian (ID <= 10)]")
for item_id, score in recs_lunch:
    print(f"Item ID: {item_id}, Score: {score:.4f}")

# Simulate an Off-peak request (4 PM)
off_peak_time = datetime(2025, 1, 1, 16, 0, 0)
recs_offpeak = hybrid_system.recommend(user_id_test, last_item_test, top_n=5, current_time=off_peak_time)
print(f"\n[Off-Peak (4 PM) - No Special Boost]")
for item_id, score in recs_offpeak:
    print(f"Item ID: {item_id}, Score: {score:.4f}")


--- Hybrid & Contextual Recommendations for User 10 ---

## Evaluation

The collaborative filtering model is evaluated with Precision@k, Recall@k, Root Mean Square Error (RMSE), and Mean Absolute Error (MAE).
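
As a small worked example of these metrics, the sketch below computes Precision@k and Recall@k for one ranked list and RMSE for a handful of rating pairs. The helper names `ranked_precision_recall` and `rmse`, and all item IDs and ratings, are hypothetical; this is a simplified set-based version, not the Surprise-based helper used in the evaluation cell.

```python
import math

def ranked_precision_recall(recommended, relevant, k):
    """Precision@k and Recall@k for one user's ranked recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def rmse(pairs):
    """Root mean square error over (true_rating, predicted_rating) pairs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))

# Hypothetical ranked recommendations and the items the user actually liked.
recommended = [35, 42, 39, 4, 30]
relevant = {42, 4, 7}
precision, recall = ranked_precision_recall(recommended, relevant, k=5)
print(f"Precision@5={precision:.2f} Recall@5={recall:.2f}")  # 2 hits: 0.40 and 0.67

error = rmse([(4, 3.5), (2, 3.0)])
print(f"RMSE={error:.4f}")
```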

A critical real-world metric would be Click-Through Rate (CTR) or Conversion Rate (Orders placed) for the recommended items.
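
Both online metrics are simple ratios. The sketch below assumes impression, click, and order counts are already available from a logging system; the function names and the counts are hypothetical.

```python
def click_through_rate(impressions, clicks):
    """Share of shown recommendations that were clicked."""
    return clicks / impressions if impressions else 0.0

def conversion_rate(users_shown, orders_placed):
    """Share of users shown recommendations who went on to place an order."""
    return orders_placed / users_shown if users_shown else 0.0

# Hypothetical counts for one day of traffic.
ctr = click_through_rate(impressions=20000, clicks=1500)
cr = conversion_rate(users_shown=5000, orders_placed=900)
print(f"CTR={ctr:.3f} CR={cr:.3f}")
```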

In [21]:
import pandas as pd
import numpy as np
from surprise import Reader, Dataset, SVD, accuracy
from surprise.model_selection import train_test_split, KFold
from collections import defaultdict # Used for Top-N metrics

# Helper function for calculating Precision and Recall (Standard Surprise pattern)
def get_top_n(predictions, n=10):
    """Return the top N recommendation for each user from a set of predictions."""

    # First map the predictions to each user.
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, est))

    # Then sort the predictions for each user and retrieve the n highest ones.
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]

    return top_n

def precision_recall_at_k(predictions, k=10, threshold=3.5):
    """
    Return precision and recall at k, calculated using a relevance threshold.
    A rating is considered 'relevant' if it is at or above the threshold.
    """
    user_est_true = defaultdict(list)
    for uid, _, true_r, est, _ in predictions:
        user_est_true[uid].append((est, true_r))

    precisions = dict()
    recalls = dict()
    for uid, user_ratings in user_est_true.items():
        # Sort user ratings by estimated value
        user_ratings.sort(key=lambda x: x[0], reverse=True)

        # Number of relevant items
        n_rel = sum((true_r >= threshold) for (_, true_r) in user_ratings)

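        # Number of recommended items in top k. Note: `est >= 0` is always true on a
        # 1-5 scale, so every top-k item counts as recommended; the pattern in the
        # Surprise FAQ compares `est >= threshold` here instead.
        n_rec_k = sum((est >= 0) for (est, _) in user_ratings[:k])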

        # Number of relevant recommended items in top k
        n_rel_and_rec_k = sum(((true_r >= threshold) and (est >= 0))
                              for (est, true_r) in user_ratings[:k])

        # Precision@k: Proportion of recommended items that are relevant
        precisions[uid] = n_rel_and_rec_k / n_rec_k if n_rec_k != 0 else 1

        # Recall@k: Proportion of relevant items that are recommended
        recalls[uid] = n_rel_and_rec_k / n_rel if n_rel != 0 else 1

    return precisions, recalls


class RecommenderEvaluator:
    def __init__(self, orders_df):
        self.orders_df = orders_df
        self.reader = Reader(rating_scale=(1, 5))
        self.data = Dataset.load_from_df(
            self.orders_df[['user_id', 'item_id', 'rating']],
            self.reader
        )
        self.all_item_ids = self.orders_df['item_id'].unique().tolist()

    def evaluate_collaborative_filter(self, model_class=SVD, test_size=0.2):
        """
        Trains and evaluates a Collaborative Filtering model using RMSE/MAE.
        """
        print(f"--- Evaluating Collaborative Filtering Model ({model_class.__name__}) ---")

        # Split data into training and testing sets
        trainset, testset = train_test_split(self.data, test_size=test_size, random_state=42)

        # Initialize and train the model
        model = model_class(random_state=42)
        model.fit(trainset)

        # Predict ratings on the test set
        predictions = model.test(testset)

        # Compute metrics
        rmse = accuracy.rmse(predictions, verbose=False)
        mae = accuracy.mae(predictions, verbose=False)

        print(f"  Root Mean Square Error (RMSE): {rmse:.4f}")
        print(f"  Mean Absolute Error (MAE): {mae:.4f}")

        return {'RMSE': rmse, 'MAE': mae}

    def evaluate_top_n(self, model_class=SVD, k=5, threshold=3.5):
        """
        Performs robust Top-N evaluation using K-Fold cross-validation
        and the helper function pattern for Precision and Recall.
        """
        print(f"\n--- Evaluating Top-{k} Recommendation Quality (Precision/Recall) ---")
        print(f"Using relevance threshold: {threshold} (e.g., ratings >= {threshold} are 'relevant')")

        kf = KFold(n_splits=3, random_state=42, shuffle=True)
        model = model_class(random_state=42)

        all_precisions = []
        all_recalls = []

        for trainset, testset in kf.split(self.data):
            # Train model
            model.fit(trainset)

            # Get predictions on the test set
            predictions = model.test(testset)

            # Calculate Precision and Recall for this fold
            precisions, recalls = precision_recall_at_k(predictions, k=k, threshold=threshold)

            # Store average precision and recall for this fold
            all_precisions.append(sum(prec for prec in precisions.values()) / len(precisions))
            all_recalls.append(sum(rec for rec in recalls.values()) / len(recalls))

        # Aggregate results
        avg_precision = np.mean(all_precisions)
        avg_recall = np.mean(all_recalls)

        # F1 Score is the harmonic mean
        f1_score = 2 * (avg_precision * avg_recall) / (avg_precision + avg_recall) if (avg_precision + avg_recall) > 0 else 0

        print(f"  Average Precision@{k}: {avg_precision:.4f}")
        print(f"  Average Recall@{k}: {avg_recall:.4f}")
        print(f"  F1 Score@{k}: {f1_score:.4f}")

        return {'Precision': avg_precision, 'Recall': avg_recall, 'F1': f1_score}



evaluator = RecommenderEvaluator(orders_df)

# 1. Evaluate prediction accuracy
rating_metrics = evaluator.evaluate_collaborative_filter(model_class=SVD)

# 2. Evaluate recommendation ranking quality (Top-5 items)
ranking_metrics = evaluator.evaluate_top_n(model_class=SVD, k=5, threshold=4)

--- Evaluating Collaborative Filtering Model (SVD) ---
  Root Mean Square Error (RMSE): 1.4851
  Mean Absolute Error (MAE): 1.2670

--- Evaluating Top-5 Recommendation Quality (Precision/Recall) ---
Using relevance threshold: 4 (e.g., ratings >= 4 are 'relevant')
  Average Precision@5: 0.3715
  Average Recall@5: 0.9917
  F1 Score@5: 0.5406


## A/B Test Statistical Analysis Code

In [22]:
import numpy as np
import pandas as pd
from statsmodels.stats.proportion import proportions_ztest

def analyze_ab_test_results(group_a_users, group_a_conversions, group_b_users, group_b_conversions, alpha=0.05):
    """
    Performs a two-proportion Z-test to determine if the difference
    in conversion rates between Group A and Group B is statistically significant.

    Args:
        group_a_users (int): Total number of users in the control group (A).
        group_a_conversions (int): Total number of conversions (orders) in group A.
        group_b_users (int): Total number of users in the treatment group (B).
        group_b_conversions (int): Total number of conversions (orders) in group B.
        alpha (float): The significance level (default is 0.05).
    """

    # 1. Define the input data for the Z-test
    count = np.array([group_a_conversions, group_b_conversions])
    nobs = np.array([group_a_users, group_b_users])

    # 2. Perform the two-proportion Z-test
    # The proportions_ztest function returns the z-statistic and the two-sided p-value.
    # Group A is listed first, so the z-statistic is negative when B converts better.
    z_stat, p_value = proportions_ztest(count, nobs, alternative='two-sided')

    # 3. Calculate metrics for reporting
    cr_a = group_a_conversions / group_a_users
    cr_b = group_b_conversions / group_b_users

    # 4. Determine significance
    is_significant = p_value < alpha

    # 5. Output Results
    print("--- A/B Test Analysis: Hybrid Recommender vs. Legacy ---")
    print(f"Significance Level (alpha): {alpha}")
    print("\n## 📊 Observed Metrics")
    print("-" * 35)
    print(f"Group A (Legacy) CR: {cr_a:.4f} ({group_a_conversions} / {group_a_users})")
    print(f"Group B (Hybrid) CR: {cr_b:.4f} ({group_b_conversions} / {group_b_users})")
    print(f"Observed Lift (B vs A): {((cr_b - cr_a) / cr_a) * 100:.2f}%")
    print("-" * 35)

    print("\n## 🔬 Statistical Test Results")
    print(f"Z-statistic: {z_stat:.4f}")
    print(f"P-value: {p_value:.6f}")

    print("\n## ✅ Conclusion")
    if is_significant and cr_b > cr_a:
        print(f"Result: SUCCESS! (P-value < {alpha})")
        print("Action: **Reject the Null Hypothesis.** The Hybrid System (B) drives a statistically significant higher conversion rate. Proceed with full rollout.")
    elif is_significant and cr_a > cr_b:
        print(f"Result: FAILURE! (P-value < {alpha})")
        print("Action: **Reject the Null Hypothesis.** The Hybrid System (B) is significantly worse. Stop the test and investigate the cause.")
    else:
        print(f"Result: INCONCLUSIVE (P-value >= {alpha})")
        print("Action: **Fail to Reject the Null Hypothesis.** The observed difference is likely due to chance. Iterate on the model or run the test longer.")


# --- Hypothetical A/B Test Data ---
# Assume 100,000 users were shown the recommendations in each group
N_users = 100000

# Group A: Legacy System (Control)
A_conversions = 18000 # 18.0% CR

# Group B: Hybrid System (Treatment)
B_conversions = 21500 # 21.5% CR

# Run the analysis
analyze_ab_test_results(
    group_a_users=N_users,
    group_a_conversions=A_conversions,
    group_b_users=N_users,
    group_b_conversions=B_conversions
)

--- A/B Test Analysis: Hybrid Recommender vs. Legacy ---
Significance Level (alpha): 0.05

## 📊 Observed Metrics
-----------------------------------
Group A (Legacy) CR: 0.1800 (18000 / 100000)
Group B (Hybrid) CR: 0.2150 (21500 / 100000)
Observed Lift (B vs A): 19.44%
-----------------------------------

## 🔬 Statistical Test Results
Z-statistic: -19.6583
P-value: 0.000000

## ✅ Conclusion
Result: SUCCESS! (P-value < 0.05)
Action: **Reject the Null Hypothesis.** The Hybrid System (B) drives a statistically significant higher conversion rate. Proceed with full rollout.
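
The Z-statistic reported above can be reproduced by hand. This is a minimal sketch of a pooled two-proportion Z-test using only the standard library, which is assumed to match the default behaviour of `proportions_ztest`; the function name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion Z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p = two_proportion_z(18000, 100000, 21500, 100000)
print(f"Z-statistic: {z:.4f}")
print(f"P-value below 1e-6: {p < 1e-6}")
```

The sign is negative only because Group A is subtracted from first; the magnitude is what determines significance.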
