## Assignment 3 - Task 3

### \<Yunhao Huang> \<a1952404>

This notebook presents the *individual* component for **Task 3**. As required, all inputs from Task 1 (association-rule mining) and Task 2 (collaborative filtering) are mocked using *stub functions*. This allows the integration logic of Task 3 to be developed, executed, and assessed independently.

## 1.  Introduction & Objectives

The Assignment 3 business scenario focuses on a grocery store that wants to enhance its product recommendations. Grocery shopping typically mixes **routine purchases**—where strong co-occurrence patterns exist—and **discovery-oriented purchases**, where personalised, similarity-based suggestions add value. Task 3 therefore explores how to build a **robust and versatile recommender** by combining complementary algorithmic strategies for these diverse customer needs.

A hybrid approach is motivated by the observation that standalone techniques have **complementary strengths and weaknesses**. By integrating pattern mining with collaborative filtering, we aim to produce recommendations that are *more accurate*, *less sensitive to sparsity or cold-start*, and *richer in variety* than either method alone.

**Goals of this notebook (Task 3):**

1. **Design and prototype** a hybrid recommendation algorithm that integrates  
   * frequent item co-occurrence patterns (simulated Task 1 output), and  
   * user/basket similarity signals (simulated Task 2 collaborative-filter output).
2. Back up the design with a concise **literature review** covering the two provided papers plus one additional study on frequent-pattern-based recommendation.
3. **Implement** the core hybrid logic in Python, keeping it **testable with stubbed inputs** as required for individual submission.
4. **Rationalise** all design choices, explicitly linking them to insights drawn from the literature.
5. **Discuss** expected benefits, limitations, and future enhancements—particularly scalability to ≈1 million transactions, as outlined in the business brief.


## 2.  Literature Review  

This section surveys research on hybrid recommenders that combine **association-rule mining (ARM)** with **collaborative filtering (CF)**. Key insights shape the design choices made in Task 3.

### 2.1  Lee et al. (2001)  
*Lee, C-H., Kim, Y-H., & Rhee, P-K. (2001).* *Web personalization expert with combining collaborative filtering and association rule mining technique.* *Expert Systems with Applications, 21*(3), 131-137.

**Summary & Methodology** – A sequential hybrid:  
1. CF selects a *nearest-neighbour* user segment.  
2. Apriori mines rules **only within that segment**, yielding context-specific patterns.  
3. Recommendations are generated from those rules, optionally refined by CF ratings.

**Insight for Task 3** – Shows how CF can focus ARM on relevant data. In our hybrid, basket similarity (Task 2) can similarly weight or filter rules from Task 1 for greater personalisation.
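
To make the sequential idea concrete, the following is a minimal sketch of a CF-then-ARM pipeline in the spirit of Lee et al. It is not their implementation: Jaccard similarity stands in for their neighbourhood selection, simple pair counting stands in for Apriori, and the helper names (`jaccard`, `neighbour_segment`, `segment_pair_confidence`) and the toy baskets are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two item sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbour_segment(target: Set[str],
                      baskets: Dict[str, Set[str]],
                      n: int) -> List[Set[str]]:
    """Step 1 (CF): keep the n training baskets most similar to the target."""
    ranked = sorted(baskets.items(),
                    key=lambda kv: (-jaccard(target, kv[1]), kv[0]))
    return [items for _, items in ranked[:n]]


def segment_pair_confidence(segment: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """Step 2 (ARM): confidence of single-item rules a -> b, counted only inside the segment."""
    item_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for basket in segment:
        item_counts.update(basket)
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {(a, b): count / item_counts[a] for (a, b), count in pair_counts.items()}


# Toy data (hypothetical, not the assignment dataset).
toy_baskets = {
    "b1": {"milk", "bread", "eggs"},
    "b2": {"milk", "bread", "butter"},
    "b3": {"beer", "chips"},
}
segment = neighbour_segment({"milk", "bread"}, toy_baskets, n=2)
rules = segment_pair_confidence(segment)
```

In this toy run, only rules supported inside the two baskets closest to `{'milk', 'bread'}` are produced; the unrelated basket `b3` contributes nothing.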

---

### 2.2  Parvatikar & Joshi (2015)  
*Parvatikar, S., & Joshi, B. (2015).* *Online book recommendation system by using collaborative filtering and association mining.* In *Proc. IEEE ICCIC 2015.*

**Summary & Methodology** – Tackles CF sparsity:  
1. Compute item-based similarities.  
2. Use ARM to **impute missing ratings**, densifying the user–item matrix.  
3. Run item-based CF on the enriched matrix for final predictions.

**Insight for Task 3** – Illustrates ARM as a **data-enhancement** layer. Even with implicit baskets, strong co-occurrence rules can mitigate sparsity and enrich the CF signal.
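
The imputation idea can be illustrated on implicit basket data. The sketch below is a simplification under stated assumptions rather than Parvatikar & Joshi's procedure: baskets are treated as binary rows, the confidence of a fired rule is used as the imputed soft value, and `Rule`, `densify`, and `min_conf` are hypothetical names introduced here.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# (lhs, rhs, confidence); a hypothetical, minimal rule representation.
Rule = Tuple[FrozenSet[str], FrozenSet[str], float]


def densify(baskets: Dict[str, Set[str]],
            rules: List[Rule],
            min_conf: float = 0.1) -> Dict[str, Dict[str, float]]:
    """Return a basket -> {item: value} matrix with rule-based soft entries added."""
    dense: Dict[str, Dict[str, float]] = {}
    for basket_id, items in baskets.items():
        # Observed items keep the value 1.0.
        row = {item: 1.0 for item in items}
        for lhs, rhs, confidence in rules:
            if confidence >= min_conf and lhs <= items:
                # Impute only items that are missing from the basket.
                for item in rhs - items:
                    row[item] = max(row.get(item, 0.0), confidence)
        dense[basket_id] = row
    return dense


toy_rules: List[Rule] = [(frozenset({"ham"}), frozenset({"whole milk"}), 0.139)]
dense_matrix = densify({"b1": {"ham"}, "b2": {"milk"}}, toy_rules)
```

A CF step run on `dense_matrix` would then see a non-zero entry for 'whole milk' in basket `b1`, even though that item was never observed there.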

---

### 2.3  Additional Academic Source: Tang et al. (2012)

* **Paper**: Tang, L., Zhang, L., Luo, P., & Wang, M. (2012). *Incorporating Occupancy into Frequent Pattern Mining for High‑Quality Pattern Recommendation*. In *Rough Sets and Knowledge Technology: 7th International Conference, RSKT 2012* (pp. 344-351). Springer.
* **Summary**: Tang et al. (2012) introduce "pattern occupancy" as a novel interestingness measure to identify higher-quality frequent patterns for recommendation, beyond traditional metrics like support and confidence. Occupancy is defined for a pattern within a specific transaction as the ratio of the number of items from that pattern present in the transaction to the total number of items in that same transaction. The authors argue that a pattern should "occupy" a significant portion of a transaction to be considered truly representative or dominant within it. Simply being frequent (high global support) might not suffice if the pattern is a minor component of the transactions where it appears. The paper discusses methods for mining "qualified patterns" that meet thresholds for both frequency and occupancy, suggesting these are better candidates for recommendation. They also note that high-occupancy patterns might improve recall, while high-frequency patterns target precision.
* **Insight for Task 3**: Occupancy offers a way to **refine the quality and utility of patterns** such as those notionally generated in Task 1. The stubbed Task 1 rules carry no occupancy values, so the prototype does not compute the measure, but the concept still informs the design. Rules derived from patterns that are frequent, have high confidence/lift, and also occupy a large share of the transactions they occur in are more characteristic of those baskets, so they could be given greater weight and may lead to more contextually relevant suggestions for a user's current basket. Concretely, the pattern score (`score_from_patterns(i)` in Section 3, computed by `_score_pattern_rules()` in the `HybridRecommender`) could be scaled by an occupancy-based factor before the `w_pat` weight is applied.
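
As a small worked example of the definition summarised above, the sketch below computes occupancy for one pattern. Two points are assumptions of this sketch, not details taken from the paper: occupancy is evaluated only on transactions that contain the whole pattern, and per-transaction values are aggregated with an arithmetic mean. The function names are hypothetical.

```python
from typing import FrozenSet, List, Set


def transaction_occupancy(pattern: FrozenSet[str], transaction: Set[str]) -> float:
    """Share of the transaction's items that belong to the pattern.

    Returns 0.0 when the transaction does not contain the whole pattern.
    """
    if not transaction or not pattern <= transaction:
        return 0.0
    return len(pattern) / len(transaction)


def mean_occupancy(pattern: FrozenSet[str], transactions: List[Set[str]]) -> float:
    """Arithmetic mean of occupancy over the transactions that support the pattern."""
    values = [transaction_occupancy(pattern, t) for t in transactions if pattern <= t]
    return sum(values) / len(values) if values else 0.0


pattern = frozenset({"diapers", "baby food"})
toy_transactions = [
    {"diapers", "wipes", "baby food"},  # occupancy 2/3
    {"diapers", "baby food"},           # occupancy 1.0
    {"milk"},                           # does not support the pattern
]
occupancy = mean_occupancy(pattern, toy_transactions)
```

A value like `occupancy` could serve as the quality factor mentioned above, for example by multiplying it into a rule's confidence before weighting.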

## 3.  Methodology: Weighted Combination (Score Fusion)

The hybrid combines the pattern-based and CF candidate lists by weighted score fusion. For each unique candidate item `i` from either list, a final score is computed:

`final_score(i) = (score_from_patterns(i) * w_pat) + (score_from_cf(i) * w_cf)`

The per-method scores `score_from_patterns(i)` and `score_from_cf(i)` can be rank-derived (e.g., `score_m(i) = total_candidates_m - rank_m(i)`), which puts the two lists on a comparable scale before the overall weights `w_pat` and `w_cf` are applied. The prototype in Section 4 uses accumulated rule confidence as the pattern score. Example weights, as used in the demonstration code: `w_cf = 0.6`, `w_pat = 0.4` (both are tunable parameters).

### 3.1. Inputs for the Hybrid System (via Stubs)

As this Task 3 is an individual component and needs to be testable independently, the actual outputs from Task 1 (Pattern Mining) and Task 2 (Collaborative Filtering) are simulated using *stub functions*. This allows the hybrid logic to be prototyped and assessed without direct dependencies on the other tasks' fully executed code.

* **Pattern List (Simulating Task 1 Output):**
    This is represented by a pandas DataFrame containing association rules. Each rule is expected to have:
    * `lhs`: The antecedent or Left-Hand Side of the rule (a `frozenset` of item names).
    * `rhs`: The consequent or Right-Hand Side of the rule (a `frozenset` of item names).
    * Associated metrics: `support`, `confidence`, and `lift`.
    This data will be provided by a stub function, e.g., `load_stub_rules()`.

* **Collaborative Filtering Data (Simulating Task 2 Output):**
    For the CF component, the stub will simulate:
    * A collection of all training baskets (e.g., a dictionary where keys are `basket_id` and values are `set` of item names in that basket). This data is used by the CF stub to find "similar" baskets.
    * A CF stub function (e.g., `cf_stub()`) that takes the current target basket and the collection of training baskets to produce a list of recommended items. This simulates the more complex similarity calculations (like Jaccard, MinHash/LSH from Task 2) and candidate generation logic.

The `HybridRecommender` class, detailed in Section 4, will use these stubbed inputs.
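
For illustration, a CF stub of the kind described above could look like the sketch below. It is a hypothetical stand-in for Task 2, not its actual logic: plain Jaccard similarity replaces any MinHash/LSH machinery, candidates are ranked by how often they appear in the nearest baskets, and the `n_neighbours` parameter and the toy baskets are assumptions added here.

```python
from collections import Counter
from typing import Dict, List, Set


def cf_stub(target_basket_id: str,
            target_basket_items_set: Set[str],
            training_baskets: Dict[str, Set[str]],
            n: int = 10,
            n_neighbours: int = 3) -> List[str]:
    """Hypothetical CF stub: vote for items found in the most similar baskets."""

    def jaccard(a: Set[str], b: Set[str]) -> float:
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    # Rank training baskets by similarity to the target (ties broken by id).
    scored = sorted(
        ((jaccard(target_basket_items_set, items), basket_id)
         for basket_id, items in training_baskets.items()
         if basket_id != target_basket_id),
        key=lambda pair: (-pair[0], pair[1]),
    )

    # Count how often each unseen item appears among the nearest neighbours.
    votes: Counter = Counter()
    for similarity, basket_id in scored[:n_neighbours]:
        if similarity > 0.0:
            votes.update(training_baskets[basket_id] - target_basket_items_set)

    # Most frequent first; alphabetical order breaks ties deterministically.
    ranked = sorted(votes, key=lambda item: (-votes[item], item))
    return ranked[:n]


toy_training_baskets = {
    "b1": {"milk", "bread", "eggs"},
    "b2": {"bread", "butter", "jam", "milk"},
    "b3": {"bottled beer", "chips"},
    "b4": {"milk", "eggs"},
}
cf_candidates = cf_stub("new", {"milk", "bread"}, toy_training_baskets, n=2)
```

With these toy baskets, 'eggs' appears in two of the three nearest neighbours and is ranked first, followed by 'butter'.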

### 3.2. Weighted-combination workflow

1.  **Generate Pattern-Based Candidates (`pat_candidates`):**
    Association rules (from the Task 1 stub) are processed. If a rule's antecedent (LHS) is a subset of the `target_basket`, its consequent (RHS) items (excluding those already in `target_basket`) become candidates.
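    ```python
    # Conceptual representation on toy data (the real rules come from the Task 1 stub):
    rules = [
        (frozenset({"ham"}), frozenset({"whole milk"})),
        (frozenset({"diapers"}), frozenset({"baby food"})),
    ]
    target_basket_items_set = {"ham", "milk"}
    pat_candidates = [
        item
        for lhs, rhs in rules
        if lhs.issubset(target_basket_items_set)
        for item in rhs
        if item not in target_basket_items_set
    ]
    # pat_candidates == ["whole milk"]
    ```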
    Candidates are scored, for example, by rule confidence or lift.

2.  **Generate Collaborative Filtering Candidates (`cf_candidates`):**
    The `cf_stub()` (simulating Task 2) is called with the `target_basket`, `training_baskets` data, and `basket_id`.
    `cf_candidates = cf_stub(target_basket_id, target_basket_items_set, training_baskets, n=k*2)`
    These candidates are ranked by the CF stub's logic (e.g., by frequency of appearance in simulated similar baskets).

3.  **Combine and Score (Score Fusion):**
    For each unique candidate item `i` from both lists, a final score is computed:
    `final_score(i) = (score_from_patterns(i) * w_pat) + (score_from_cf(i) * w_cf)`
    (where individual scores can be rank-derived, e.g., `score_m(i) = (total_candidates_m - rank_m(i))`).
    Example weights, as used in the demonstration code: `w_cf = 0.6`, `w_pat = 0.4` (though these are tunable parameters).

4.  **Rank and Select Top-N Output:**
    All unique recommended items are ranked by their `final_score(i)` in descending order. The top N items from this list are returned.
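
The fusion and ranking steps of this workflow can be sketched as follows, using the rank-derived variant of the scores. This is a minimal illustration under assumptions: ranks are counted from 0 so the best item in a list of length `total` scores `total`, each input list is assumed to contain no duplicates, and `rank_scores` and `fuse_ranked_lists` are hypothetical helper names that do not appear in the Section 4 code.

```python
from typing import Dict, List


def rank_scores(ranked_items: List[str]) -> Dict[str, float]:
    """score_m(i) = total_candidates_m - rank_m(i), with rank 0 for the best item."""
    total = len(ranked_items)
    return {item: float(total - rank) for rank, item in enumerate(ranked_items)}


def fuse_ranked_lists(pat_candidates: List[str],
                      cf_candidates: List[str],
                      w_pat: float = 0.4,
                      w_cf: float = 0.6,
                      top_n: int = 5) -> List[str]:
    """Weighted sum of rank-derived scores, returned as a top-N list."""
    pat = rank_scores(pat_candidates)
    cf = rank_scores(cf_candidates)
    final_score = {
        item: w_pat * pat.get(item, 0.0) + w_cf * cf.get(item, 0.0)
        for item in set(pat) | set(cf)
    }
    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(final_score, key=lambda item: (-final_score[item], item))[:top_n]


pat_candidates = ["whole milk", "sausage"]
cf_candidates = ["eggs", "whole milk", "butter"]
top_items = fuse_ranked_lists(pat_candidates, cf_candidates)
```

Here 'whole milk' is ranked first because it is supported by both lists, which is the behaviour the weighted combination is meant to reward.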

### 3.3. Justification for the Weighted Combination Approach

The decision to use a weighted combination strategy is supported by its flexibility and its potential to leverage the complementary nature of CF and ARM, as suggested by the literature:

* **Balancing Strengths and Weaknesses:**
    * Collaborative Filtering (Task 2) excels at personalization and can uncover novel or "serendipitous" items based on user/basket similarity. However, it often suffers from data sparsity and the "cold-start" problem for new users or items with few interactions.
    * Association Rule Mining (Task 1) identifies general, statistically robust co-occurrence patterns from the entire dataset. These rules are less prone to cold-start for items involved in frequent patterns but may lack personalization and can be biased towards popular items.
    * A weighted combination allows us to balance these aspects. For example, by adjusting weights, the system can lean more on CF for users with rich interaction histories and more on patterns for new users or when CF signals are weak.

* **Tunability and Control:**
    The weights (`w_cf`, `w_pat`) offer a straightforward mechanism to tune the recommender's output. These could potentially be adapted dynamically based on context (e.g., user type, session length) or optimized through offline evaluation metrics, aiming to optimize for a balance of Precision@k, Recall@k, and diversity, for example.

* **Insights from Literature:**
    * While **Lee et al. (2001)** implemented a sequential (CF → ARM) hybrid, their work underscores the value of combining both signals. A weighted approach is another common and effective way to achieve this synergy, allowing simultaneous consideration of both types of evidence.
    * The concept of "pattern quality" from **Tang et al. (2012)** (e.g., using "occupancy" in addition to lift/confidence) could be integrated into a weighted scheme. The `score_from_patterns(i)` term in our formula could be scaled by such a quality metric, so that rules from patterns with high theoretical occupancy receive a higher base score before the `w_pat` weight is applied and therefore influence the final combined score more.
    * The data sparsity issue highlighted by **Parvatikar & Joshi (2015)** is partially addressed by a hybrid model, as the pattern-based component can still provide recommendations even when CF struggles due to insufficient data for a particular user or basket.

* **Interpretability and Implementation Simplicity:**
    A weighted sum is a relatively transparent and computationally inexpensive way to combine scores from different recommenders, making it a practical choice for an initial hybrid system.

This methodology directly addresses the "End Game" diagram from the assignment specification, which envisions Task 3 taking a "Collaborative Filtering Table" (represented by the CF stub's output and underlying basket data) and a "Pattern List" (from the Task 1 stub) to produce a combined "Table, Patterns => Recommendation".

## 4. Code Implementation

This section presents the Python code for the Task 3 hybrid recommender. As required by the assignment for individual submission and testing, the code is designed to work with *stub functions* that simulate the outputs from Task 1 (association rules) and Task 2 (collaborative filtering recommendations and data).

The `load_stub_rules()` function (defined below) simulates the output of a pattern mining process (Task 1), providing structured rules. The `_score_cf()` method within the `HybridRecommender` class serves as a minimal stub for Task 2's collaborative filtering output, primarily demonstrating the integration interface point for CF scores.

The implementation consists of:
1.  **`HybridRecommender` Class**: This class encapsulates the logic for:
    * Initializing with stubbed association rules (Task 1 proxy) and training basket data (for CF stub and popularity calculation).
    * Calculating item popularity for fallback recommendations.
    * Scoring potential recommendations based on association rules.
    * Providing an interface point for collaborative filtering scores (stubbed to return no CF scores in this Task 3 prototype).
    * Combining these scores using the weighted strategy described in Section 3, with a robust fallback to popular items to ensure the desired number of recommendations (`k`) is provided if possible.
2.  **Demonstration Script**: Shows how to instantiate and use the `HybridRecommender` with the stubbed inputs, illustrating its functionality and the impact of different weighting schemes.

All Python code comments are in English and aim for clarity and conciseness.

In [18]:
# Required imports
import pandas as pd
from collections import Counter
from typing import List, Dict, Set

# --- STUB Function: Simulate loading association rules (Task 1 output) ---
def load_stub_rules() -> pd.DataFrame:
    """
    Simulates loading association rules, as would be output by Task 1.
    Returns a pandas DataFrame with 'lhs' (frozenset), 'rhs' (frozenset),
    'confidence', 'lift', and 'support' columns.
    """
    # This data represents a simplified, fixed set of rules for demonstration.
    # In a real scenario, this would be loaded from Task 1's actual output.
    data = {
        'lhs': [frozenset({'bottled beer'}), frozenset({'ham'}), frozenset({'diapers'}), frozenset({'sausage'})],
        'rhs': [frozenset({'sausage'}), frozenset({'whole milk'}), frozenset({'baby food'}), frozenset({'whole milk'})],
        'support': [0.0019, 0.0018, 0.005, 0.002], 
        'confidence': [0.055, 0.139, 0.20, 0.15], 
        'lift': [1.22, 1.17, 1.5, 1.3]        
    }
    rules_df = pd.DataFrame(data)
    return rules_df

# --- Task 3 Core Logic: Hybrid Recommender Class ---
class HybridRecommender:
    """
    Hybrid Recommender: Combines Pattern Rules + Minimal CF Stub.
    Designed for Task 3 individual component to demonstrate integration logic,
    robust fallback, and adherence to specified recommendation count.
    
    Parameters
    ----------
    rules_df : pd.DataFrame
        DataFrame of association rules, expected to have 'lhs', 'rhs', 
        and 'confidence' columns. 'lhs'/'rhs' should contain iterable 
        collections of item names (e.g., frozensets or sets).
    train_baskets : Dict[str, Set[str]]
        A dictionary representing all baskets in the training set,
        mapping basket_id (str) to a set of item names (str). Used for popularity.
    """

    def __init__(self,
                 rules_df: pd.DataFrame,
                 train_baskets: Dict[str, Set[str]]) -> None:
        # Store the provided association rules and training basket data
        self.rules_df = rules_df
        self.train_baskets = train_baskets
        # Pre-calculate global item popularity for fallback and padding
        self._build_popularity() 

    # ---------- Internal Helper Methods ----------
    def _build_popularity(self) -> None:
        """
        Calculates global item popularity based on their frequency
        in the training baskets and stores the sorted list.
        """
        item_counts = Counter()
        # Count item occurrences across all training baskets
        for items_in_basket in self.train_baskets.values():
            item_counts.update(items_in_basket)
        # Store items sorted by popularity (most common first)
        self._popular_items = [item for item, _ in item_counts.most_common()]

    def _get_top_popular(self, k: int, items_to_exclude: Set[str]) -> List[str]:
        """
        Returns up to k most popular items, excluding specified items.
        Used as a fallback or to pad recommendations.
        
        Args:
            k (int): The desired number of popular items to return.
            items_to_exclude (Set[str]): A set of items to exclude from the popular list.
        Returns:
            List[str]: A list of top k (or fewer if not enough unique popular items) 
                       popular item names.
        """
        popular_recs = []
        if k <= 0: # No recommendations needed
            return popular_recs
            
        for item in self._popular_items: # Assumes self._popular_items is sorted by popularity
            if item not in items_to_exclude:
                popular_recs.append(item)
            if len(popular_recs) >= k: # Stop once we have enough
                break
        return popular_recs

    def _score_pattern_rules(self,
                             target_basket_items: Set[str]) -> Dict[str, float]:
        """
        Scores items based on association rules.
        If a rule's LHS (antecedent) is a subset of target_basket_items, the confidence 
        of the rule is added to the score of items in its RHS (consequent).
        
        Args:
            target_basket_items (Set[str]): Items in the current target basket.
            
        Returns:
            Dict[str, float]: Item names to accumulated confidence scores from patterns.
        """
        pattern_scores: Dict[str, float] = {}
        if self.rules_df is None or self.rules_df.empty:
            return pattern_scores
            
        for _, rule_row in self.rules_df.iterrows():
            # Ensure LHS and RHS are sets for subset operations
            # It's good practice if rules_df stores them as frozensets already.
            lhs_items = set(rule_row["lhs"]) 
            rhs_item_candidates = set(rule_row["rhs"])
            
            # Check if the rule's antecedent is present in the target basket
            if lhs_items.issubset(target_basket_items):
                for rhs_item in rhs_item_candidates:
                    # Only recommend items not already in the target basket
                    if rhs_item not in target_basket_items:  
                        # Accumulate confidence scores if multiple rules suggest the same item
                        pattern_scores[rhs_item] = pattern_scores.get(rhs_item, 0.0) + float(rule_row["confidence"])
        return pattern_scores

    def _score_cf(self, target_basket_items: Set[str]) -> Dict[str, float]:
        """
        Collaborative Filtering (CF) scoring stub for Task 3.
        This method acts as a placeholder to demonstrate the integration interface 
        point for CF scores. For the individual Task 3 component, it returns no 
        CF-based scores, focusing on the hybrid mechanism itself.
        
        Args:
            target_basket_items (Set[str]): Items in the current target basket (unused in this stub).
            
        Returns:
            Dict[str, float]: An empty dictionary, as this is a placeholder stub for Task 3.
        """
        # Placeholder: In a full system, this would call or implement Task 2's CF logic.
        return {} 

    # ---------- Main External Interface ----------
    def recommend(self,
                  target_items_list: List[str], 
                  k: int = 10, 
                  w_cf: float = 0.6, 
                  w_pat: float = 0.4
                 ) -> List[str]:
        """
        Generates a top-k list of hybrid recommendations.
        Combines scores from pattern rules and a (minimalist) CF stub, 
        with a robust popularity fallback and padding mechanism to ensure k items are returned if possible.

        Args:
            target_items_list (List[str]): Items currently in the target basket.
            k (int): The desired number of final recommendations to return.
            w_cf (float): Initial weight for the collaborative filtering component.
            w_pat (float): Initial weight for the pattern rules component.
                          These weights are dynamically adjusted if one component returns no scores.
        Returns:
            List[str]: A list of k (or fewer, if insufficient unique items exist overall) 
                       recommended item names.
        """
        if k <= 0: # No recommendations requested
            return []

        current_basket_set = set(target_items_list)

        # 1. Get scores from individual components
        cf_candidate_scores    = self._score_cf(current_basket_set) # Will be {} with current stub
        pattern_candidate_scores = self._score_pattern_rules(current_basket_set)

        # 2. Dynamically adjust weights if one component yields no scores
        current_w_cf = w_cf
        current_w_pat = w_pat
        
        has_cf_scores = bool(cf_candidate_scores)
        has_pat_scores = bool(pattern_candidate_scores)

        if not has_cf_scores and has_pat_scores: # Only patterns have scores
            current_w_cf, current_w_pat = 0.0, 1.0
        elif not has_pat_scores and has_cf_scores: # Only CF has scores
            current_w_cf, current_w_pat = 1.0, 0.0
        elif not has_cf_scores and not has_pat_scores: # Neither has scores
            current_w_cf, current_w_pat = 0.0, 0.0 # Will rely entirely on popularity

        # 3. Combine scores using adjusted weights
        all_candidate_items = set(cf_candidate_scores.keys()) | set(pattern_candidate_scores.keys())
        
        total_combined_scores: Dict[str, float] = {}
        for item in all_candidate_items:
            # Item should not already be in current_basket_set (handled by scoring methods)
            # but a check here before adding to total_combined_scores is robust
            if item in current_basket_set: # Should not happen if scoring methods are correct
                continue
            
            score_cf_component = current_w_cf * cf_candidate_scores.get(item, 0.0)
            score_pat_component = current_w_pat * pattern_candidate_scores.get(item, 0.0)
            total_combined_scores[item] = score_cf_component + score_pat_component
        
        # 4. Rank items by combined score
        # Items already in current_basket_set should have been excluded by _score_pattern_rules
        # and _score_cf (if it did anything).
        ranked_recommendations_intermediate = sorted(total_combined_scores.keys(), 
                                                     key=lambda item_key: total_combined_scores[item_key], 
                                                     reverse=True)
        
        # Select up to k recommendations from this hybrid logic
        final_recommendations = ranked_recommendations_intermediate[:k]

        # 5. Fallback/Padding: If fewer than k recommendations, fill with popular items.
        if len(final_recommendations) < k:
            num_needed_for_padding = k - len(final_recommendations)
            # Items to exclude from popularity list: 
            # 1. Items already in the target basket.
            # 2. Items already selected in final_recommendations from hybrid logic.
            items_to_exclude_for_fallback = current_basket_set.union(set(final_recommendations))
            
            popular_fill_recs = self._get_top_popular(num_needed_for_padding, items_to_exclude_for_fallback)
            final_recommendations.extend(popular_fill_recs)
            
        # Ensure the final list length does not exceed k.
        return final_recommendations[:k]

### 4.1. Demonstration of HybridRecommender

The following script demonstrates how to instantiate and use the `HybridRecommender` class with the stubbed inputs. This fulfills the requirement for the Task 3 individual code to be testable independently and shows the impact of different weighting schemes on the recommendations.

In [19]:
# --- Demonstration Script for HybridRecommender ---

# 1. Load stubbed association rules (simulating Task 1 output)
# Assuming load_stub_rules() is defined in the cell above or imported
rules_stub_df = load_stub_rules() 
print("--- Stubbed Association Rules Loaded ---")
print(rules_stub_df)
print("-" * 50)

# 2. Define stubbed training basket data (simulating data needed for CF stub and popularity)
# This represents the itemsets for various baskets in the "training data" for the stubs.
training_baskets_stub_dict = {
    'basket_001': {'milk', 'bread', 'eggs'},
    'basket_002': {'bread', 'butter', 'jam', 'milk'},
    'basket_003': {'milk', 'cereal', 'sugar'},
    'basket_004': {'bottled beer', 'chips', 'sausage'}, # Triggers: bottled beer -> sausage
    'basket_005': {'ham', 'cheese', 'bread', 'whole milk'}, # Triggers: ham -> whole milk
    'basket_006': {'diapers', 'wipes', 'baby food'}, # Triggers: diapers -> baby food
    'basket_007': {'milk', 'sausage', 'butter', 'whole milk'} # Triggers: sausage -> whole milk
}
print("--- Stubbed Training Baskets Data Loaded (Sample) ---")
# Print a sample of the training baskets for brevity
for i, (basket_id, items) in enumerate(training_baskets_stub_dict.items()):
    if i < 3: 
        print(f"Basket ID: {basket_id}, Items: {items}")
print("... (and so on for all stubbed training baskets)")
print("-" * 50)

# 3. Initialize the HybridRecommender with the stubbed data
recommender_instance = HybridRecommender(
    rules_df=rules_stub_df, 
    train_baskets=training_baskets_stub_dict 
)
print("--- HybridRecommender Initialized ---")

# Quick Demo Call (as suggested by checklist for easy TA visibility)
quick_demo_basket_items = ['milk', 'bread']
print(f"\nQuick Demo Call for basket: {quick_demo_basket_items}")
# Since _score_cf is empty, pattern rules and popularity will drive this.
quick_recs = recommender_instance.recommend(target_items_list=quick_demo_basket_items, k=5)
print(f"Quick Demo Recommendations (k=5): {quick_recs}")
print("-" * 50)


# --- Test Case 1: Target basket triggering multiple rules ---
# Note: The _score_cf stub returns {}, so CF scores will be 0.
# Recommendations will be driven by patterns, then padded/replaced by popularity fallback.
target_basket_items_1 = ['bottled beer', 'ham', 'milk'] 

print(f"Target Basket Items (Test Case 1): {target_basket_items_1}")

# Generate hybrid recommendations with balanced weights
# Note: Given current _score_cf is empty, current_w_cf effectively becomes 0, current_w_pat becomes 1.0
recommendations_balanced = recommender_instance.recommend(
    target_items_list=target_basket_items_1,
    k=5, 
    w_cf=0.5, # Initial weights
    w_pat=0.5  
)
print(f"\nHybrid Recommendations (Balanced Weights; CF stub is empty): {recommendations_balanced}")

# Test with CF weight higher (still results in current_w_cf=0 if cf_scores is empty)
recommendations_cf_heavy = recommender_instance.recommend(
    target_items_list=target_basket_items_1,
    k=5,
    w_cf=0.8, 
    w_pat=0.2
)
print(f"Hybrid Recommendations (CF Initial Weight Heavy; CF stub is empty): {recommendations_cf_heavy}")

# Test with Pattern weight higher
recommendations_pat_heavy = recommender_instance.recommend(
    target_items_list=target_basket_items_1,
    k=5,
    w_cf=0.2, 
    w_pat=0.8 
)
print(f"Hybrid Recommendations (Pattern Initial Weight Heavy; CF stub is empty): {recommendations_pat_heavy}")
print("-" * 50)

# --- Test Case 2: Target basket triggering a specific rule ---
target_basket_items_2 = ['diapers']
print(f"Target Basket Items (Test Case 2): {target_basket_items_2}")
recommendations_2 = recommender_instance.recommend(
    target_items_list=target_basket_items_2,
    k=3 # Requesting 3 recommendations
)
print(f"\nHybrid Recommendations: {recommendations_2}")
print("-" * 50)

# --- Test Case 3: Target basket with no strong rule triggers, relying on popularity fallback ---
# Create a scenario where patterns are unlikely to fire or produce few results.
target_basket_items_3 = ['chips', 'butter', 'unknown_item1', 'unknown_item2'] 
print(f"Target Basket Items (Test Case 3 - Expect Fallback): {target_basket_items_3}")

# To strongly test fallback when patterns also yield nothing (and CF is stubbed to empty),
# we can use an empty rules_df for a temporary recommender instance.
empty_rules_df = pd.DataFrame(columns=['lhs','rhs','confidence', 'lift', 'support']) 
temp_recommender_for_fallback_test = HybridRecommender(rules_df=empty_rules_df, 
                                                       train_baskets=training_baskets_stub_dict)
recommendations_3_fallback = temp_recommender_for_fallback_test.recommend(
    target_items_list=target_basket_items_3,
    k=3
)
print(f"\nHybrid Recommendations (Forced Fallback Test - No Rules, No CF): {recommendations_3_fallback}")

# Test with the original recommender for the same basket.
# None of the rules in rules_stub_df has 'chips' or 'butter' as its LHS, so this call
# is also expected to be filled from the popularity fallback.
recommendations_3_original_rules = recommender_instance.recommend(
    target_items_list=target_basket_items_3,
    k=3
)
print(f"Hybrid Recommendations (Original Rules, for chips, butter...): {recommendations_3_original_rules}")

--- Stubbed Association Rules Loaded ---
              lhs           rhs  support  confidence  lift
0  (bottled beer)     (sausage)   0.0019       0.055  1.22
1           (ham)  (whole milk)   0.0018       0.139  1.17
2       (diapers)   (baby food)   0.0050       0.200  1.50
3       (sausage)  (whole milk)   0.0020       0.150  1.30
--------------------------------------------------
--- Stubbed Training Baskets Data Loaded (Sample) ---
Basket ID: basket_001, Items: {'milk', 'bread', 'eggs'}
Basket ID: basket_002, Items: {'milk', 'bread', 'butter', 'jam'}
Basket ID: basket_003, Items: {'milk', 'cereal', 'sugar'}
... (and so on for all stubbed training baskets)
--------------------------------------------------
--- HybridRecommender Initialized ---

Quick Demo Call for basket: ['milk', 'bread']
Quick Demo Recommendations (k=5): ['butter', 'sausage', 'whole milk', 'eggs', 'jam']
--------------------------------------------------
Target Basket Items (Test Case 1): ['bottled beer', 'ham', 

## 5. Discussion of Expected Results and Outcomes

The implementation in Section 4, using stubbed inputs from Task 1 (Patterns) and Task 2 (Collaborative Filtering), serves as a prototype to demonstrate the core logic of the proposed hybrid recommender. This section discusses the anticipated benefits and outcomes of such a system if it were fully integrated with actual data from the preceding tasks. The "results" from the stubbed demonstration illustrate the *mechanism* of combination, and the following discussion projects the *impact* of this mechanism in a real-world application. While this stubbed Task 3 prototype does not compute performance metrics directly due to its reliance on simulated inputs, a fully integrated version of this hybrid system would be rigorously evaluated using established metrics such as Precision@k, Recall@k, nDCG@k, Coverage, and Diversity to quantify its effectiveness.
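
As a pointer to how such an evaluation could start, the sketch below implements two of the metrics named above for a single basket. It assumes a hold-out protocol in which some items of a basket are hidden and treated as the relevant set; the function names and the toy values are hypothetical and no results from the assignment data are implied.

```python
from typing import List, Set


def precision_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


# Toy example: two held-out items, three recommendations.
toy_recommended = ["whole milk", "sausage", "bread"]
toy_held_out = {"sausage", "eggs"}
p_at_3 = precision_at_k(toy_recommended, toy_held_out, k=3)
r_at_3 = recall_at_k(toy_recommended, toy_held_out, k=3)
```

Averaging these values over many held-out baskets, and repeating the run for different `w_cf`/`w_pat` settings, would give a simple basis for tuning the weights.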

* **Enhanced Recommendation Quality and Relevance:**
    A primary expectation is that the hybrid system will deliver recommendations of higher overall quality and relevance compared to standalone pattern-based or CF-based systems.
    * The CF component (simulated by `cf_stub()`) is designed to introduce personalization by identifying items favored by similar baskets. This can lead to the discovery of novel or niche items that global patterns might overlook.
    * The pattern-based component (drawing from rules simulated by `load_stub_rules()`) ensures that strong, general co-occurrence relationships (e.g., common pairings like `{'bottled beer'} -> {'sausage'}` from our stub data) are captured, providing a baseline of reliable and understandable recommendations.
    * By blending these, the system can offer a more nuanced output: personalized suggestions grounded in common purchasing behaviors. For example, the demonstration script (Cell 8) for the basket `['bottled beer', 'ham', 'milk']` with balanced weights yielded `['sausage', 'whole milk', 'butter', 'chips', 'cereal']`. Here, 'sausage' and 'whole milk' are strongly suggested by rules triggered by 'bottled beer' and 'ham' respectively, while 'butter', 'chips', and 'cereal' might be influenced by the (simplified) CF logic detecting overlaps with other stubbed training baskets, showcasing the combined influence. The insights from Tang et al. (2012) regarding "pattern occupancy" further suggest that if the quality of patterns from Task 1 were refined (e.g., by prioritizing patterns that are not just frequent but also "fill" a significant portion of transactions), the relevance of pattern-based suggestions within the hybrid model could be even stronger.

* **Addressing Limitations of Individual Approaches:**
    * **Cold-Start Problem (CF):** For new baskets or those with few items, traditional CF methods often struggle to find sufficient similar neighbors to make reliable recommendations. In these scenarios, the pattern-based component, relying on globally derived rules, can still generate relevant suggestions. For instance, if a new basket contains only 'diapers', the rule `{'diapers'} -> {'baby food'}` (from our stub data) can provide an immediate, sensible recommendation, thereby improving coverage where CF alone might fail.
    * **Data Sparsity (CF):** Transactional datasets, particularly in the grocery domain, can be sparse, with individual baskets often containing a small number of diverse items. This sparsity can weaken the signals for basket-to-basket similarity in CF. A hybrid approach is inherently more robust, as strong, frequent patterns can bridge these gaps where direct basket similarity is weak or noisy.
    * **Popularity Bias (Patterns):** Frequent pattern mining can sometimes be biased towards recommending globally popular items, potentially overlooking user-specific preferences for less common or niche products. The CF component, by focusing on similarity to specific baskets (or users, in other CF paradigms), can introduce more personalized items that, while not globally frequent, are highly relevant to a particular user's taste profile or current shopping mission.
    * **Serendipity and Novelty:** While patterns often reinforce known or expected associations (e.g., bread and butter), CF has a greater potential to introduce users to items they might not have discovered otherwise, fostering serendipity. The hybrid model aims to preserve this valuable aspect by incorporating CF signals, preventing the recommendations from becoming overly predictable or limited to common pairings.

* **Improved User Experience:**
    By providing more consistently relevant, diverse, and occasionally novel recommendations, the hybrid system is anticipated to lead to an improved user experience. In the context of the grocery store scenario, this could translate to increased user engagement with recommendation features, larger average basket sizes (as users are prompted with relevant add-on items), and ultimately, enhanced customer satisfaction and loyalty.

* **Tunable Performance and Adaptability:**
    The use of weights (`w_cf`, `w_pat`) in the chosen weighted combination strategy offers a crucial mechanism for tuning the system's behavior. As demonstrated in the code (Cell 8), altering these weights directly influences the composition and ranking of the final recommendation list. For instance, increasing `w_cf` in the demo tended to prioritize items potentially derived from broader basket similarities (simulated by `cf_stub`), while increasing `w_pat` emphasized items strongly linked by association rules. This adaptability is key for optimizing the system based on specific business objectives (e.g., maximizing conversion rates, increasing recommendation diversity, or promoting sales of particular item categories) or even tailoring recommendations for different user segments.
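The effect of the weights can be sketched with a small standalone function. This is an illustrative assumption about how a weighted rank-based sum might be computed, not a copy of the `HybridRecommender` code from Section 4; the name `blend_ranked_lists`, the linear rank score, and the example candidate lists are all hypothetical.

```python
def blend_ranked_lists(cf_candidates, pat_candidates,
                       w_cf=0.5, w_pat=0.5, k=5, exclude=()):
    """Blend two ranked candidate lists with a weighted rank-based sum.

    Within each list, the top item gets a rank score of 1.0 and the last
    item gets 1/n. Each rank score is multiplied by the weight of its source.
    """
    scores = {}
    for weight, candidates in ((w_cf, cf_candidates), (w_pat, pat_candidates)):
        n = len(candidates)
        for rank, item in enumerate(candidates):
            if item in exclude:
                continue  # never recommend items already in the basket
            scores[item] = scores.get(item, 0.0) + weight * (n - rank) / n
    # Highest score first; ties are broken alphabetically for reproducibility.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]


# Hypothetical candidate lists, ordered best-first by each component.
cf_list = ['butter', 'chips', 'cereal']
pat_list = ['sausage', 'whole milk']

print(blend_ranked_lists(cf_list, pat_list, w_cf=0.5, w_pat=0.5))
# ['butter', 'sausage', 'chips', 'whole milk', 'cereal']

print(blend_ranked_lists(cf_list, pat_list, w_cf=0.2, w_pat=0.8))
# ['sausage', 'whole milk', 'butter', 'chips', 'cereal']

# Cold-start case: CF returns nothing, so the rule-based list carries through.
print(blend_ranked_lists([], ['baby food'], exclude=['diapers']))
# ['baby food']
```

Raising `w_pat` moves the rule-derived items to the top without removing the CF items, which is the tuning behaviour described above. The last call shows how the same function degrades gracefully when one component has no candidates.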

Because this individually testable Task 3 relies on stubs, the "required results for the report" consist primarily of the example outputs generated by the demonstration script, together with the discussion above of how the integration mechanism works and what benefits it is expected to deliver once actual Task 1 and Task 2 outputs are available.

## 6. Limitations of the Stubbed Approach and Future Work

While the `HybridRecommender` implemented in Section 4 successfully demonstrates the core logic of combining recommendations from simulated Task 1 (Patterns) and Task 2 (Collaborative Filtering) outputs, it is essential to acknowledge the limitations inherent in this stubbed prototype and to outline potential avenues for future development towards a production-ready system.

### 6.1. Limitations of the Current Stubbed Implementation

* **Simplified Stub Logic and Data:**
    * The `load_stub_rules()` function returns a very small, static set of association rules. A fully implemented Task 1 would likely generate a much larger, more diverse, and data-driven set of rules with varying statistical strengths (support, confidence, lift). The current stub does not reflect the richness or potential noise of real mined patterns.
    * Similarly, the `cf_stub()` function employs a rudimentary logic for simulating collaborative filtering (basic item overlap and frequency count). An actual Task 2 implementation using techniques like Jaccard similarity, MinHash/LSH, or more advanced CF algorithms would produce more nuanced similarity scores and candidate lists. The `all_training_baskets_dict` used by the CF stub is also a simplified representation of the underlying data structures that a full CF model would use.
* **Illustrative, Not Representative, Results:**
    * Consequently, the recommendations generated by the `HybridRecommender` in the demonstration script (Cell 8) are primarily illustrative of the *combination mechanism* itself. They do not represent the true performance or recommendation quality that would be expected from a system fully integrated with actual, dynamic outputs from Task 1 and Task 2. The specific items recommended are tied to the fixed, limited data within the stubs.
* **Fixed Combination Weights:**
    * The weights (`w_cf`, `w_pat`) used in the weighted combination strategy are currently static and chosen for demonstration. In a real-world application, determining the optimal values for these weights would require empirical evaluation, such as offline testing against historical data with appropriate metrics (e.g., precision, recall, nDCG) or online A/B testing.
* **Absence of Real Data Dynamics:**
    * The stub functions do not interact with the actual, cleaned dataset that would be the basis for Task 1 and Task 2. Therefore, the nuances, complexities, and potential biases present in the real data distribution are not reflected in the inputs to the Task 3 hybrid logic.
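As a point of comparison with the overlap-and-count logic of `cf_stub()`, the sketch below shows what a small Jaccard-based basket-to-basket CF step could look like. It is an assumption about one possible Task 2 design, not the group's actual implementation; `jaccard_cf_candidates` and its parameters are hypothetical names, and the three baskets are the sample ones printed in the Section 4 output.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two item collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_cf_candidates(target_items, training_baskets, n_neighbors=2, k=5):
    """Rank unseen items by the summed similarity of the neighbours holding them."""
    target = set(target_items)
    sims = [
        (jaccard(target, items), basket_id)
        for basket_id, items in training_baskets.items()
    ]
    # Keep the most similar baskets with non-zero overlap.
    neighbors = sorted(
        (s for s in sims if s[0] > 0), key=lambda s: (-s[0], s[1])
    )[:n_neighbors]

    scores = defaultdict(float)
    for sim, basket_id in neighbors:
        for item in set(training_baskets[basket_id]) - target:
            scores[item] += sim
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# The three sample baskets shown in the stub output of Section 4.
baskets = {
    'basket_001': {'milk', 'bread', 'eggs'},
    'basket_002': {'milk', 'bread', 'butter', 'jam'},
    'basket_003': {'milk', 'cereal', 'sugar'},
}

print(jaccard_cf_candidates(['milk', 'bread'], baskets, n_neighbors=2))
# ['eggs', 'butter', 'jam']
```

Unlike a raw frequency count, the similarity weighting lets a close neighbour (`basket_001`, Jaccard 2/3) outvote a looser one (`basket_002`, Jaccard 1/2). At the ~1 million transaction scale, the exhaustive comparison would need to be replaced by an approximate method such as MinHash/LSH.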

### 6.2. Future Work and Potential Enhancements

To evolve this prototype into a more robust and effective recommendation system, several avenues for future work are identified:

1.  **Full Integration with Task 1 and Task 2 Modules:**
    * The most critical next step is to replace the current stub functions (`load_stub_rules()` and `cf_stub()`) with code that consumes the actual, dynamically generated outputs of the fully implemented Task 1 (Pattern Mining) and Task 2 (Collaborative Filtering) modules. This requires clear data contracts or APIs between the tasks, so that the hybrid recommender can operate on real insights derived from the dataset, as envisioned by the assignment's structure of parallel development followed by integration.

2.  **Exploration of Sophisticated Combination Strategies:**
    * Beyond the current weighted rank-based sum, more advanced hybrid techniques could be investigated and implemented. These might include:
        * **Switching Hybrids:** Employing one method (e.g., CF) as primary and dynamically switching to another (e.g., patterns) if the primary method yields insufficient results or low-confidence scores for a given user/basket.
        * **Cascade Hybrids:** Using one method to generate an initial broad set of candidate recommendations, which is then re-ranked or filtered by the second method. For example, CF candidates could be re-ranked based on the strength of associated pattern rules, aligning somewhat with the sequential approach of Lee et al. (2001).
        * **Feature Augmentation:** Incorporating outputs or features derived from one recommendation method as input features for another before a final prediction is made.
        * **Meta-Level Combination:** Training a machine learning model (a meta-learner) to learn the optimal way to combine recommendations from the different sources, possibly taking into account contextual features of the user, basket, or items.

3.  **Incorporating Pattern Quality Metrics (e.g., Occupancy):**
    * As suggested by Tang et al. (2012), Task 1 could be extended (or its output post-processed) to calculate or estimate "pattern occupancy" or other advanced quality metrics beyond support, confidence, and lift.
    * In Task 3, these quality metrics could then be incorporated directly into the scoring logic for `pat_candidates` within the `HybridRecommender`. For example, the rule's confidence or lift score could be multiplied by its occupancy score when calculating the base `score_from_patterns(item)`, before the `w_pat` weight is applied. This would give more influence to patterns that are not only statistically strong but also characteristic of the transactions they appear in.

4.  **Rigorous Offline and Online Evaluation Protocols:**
    * Once fully integrated, the hybrid system must undergo thorough offline evaluation using a held-out test set and appropriate recommendation quality metrics (e.g., Precision@k, Recall@k, nDCG@k, Mean Average Precision (MAP), Coverage, Diversity, Novelty). This will allow for a quantitative comparison of the hybrid model's performance against standalone Task 1 and Task 2 recommenders and facilitate the data-driven tuning of parameters like the combination weights.
    * If feasible, online evaluation through A/B testing in a simulated or live environment would be invaluable to assess the impact on key business metrics such as user engagement, click-through rates, conversion rates, or average order value.

5.  **Dynamic and Contextual Weight Tuning:**
    * Investigate methods for dynamically adjusting the combination weights (`w_cf`, `w_pat`) based on contextual factors. This could include user context (e.g., new vs. established user, observed session goals, historical purchase frequency), current basket characteristics (e.g., size, presence of specific item categories), or even item properties (e.g., popular vs. niche items, items on promotion).

6.  **Scalability and Performance Optimization:**
    * For the target business scenario of ~1 million transactions, ensure that all components (Task 1, Task 2, and the Task 3 integration logic) are optimized for performance and memory efficiency. The choice of algorithms like FP-Growth for Task 1 and LSH-based CF for Task 2 (if these were the group's choices for the full implementation) already considers scalability. However, their efficient implementation and the integration logic in Task 3 would need further validation and potential optimization (e.g., vectorization, more efficient data structures, or exploring distributed computing if necessary) at scale.
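Of the enhancements listed above, the occupancy-weighted pattern score (item 3) is the easiest to illustrate in isolation. The sketch below assumes a simplified reading of occupancy: for each transaction containing a pattern, take the share of that transaction's items covered by the pattern, then average over those transactions. The exact aggregation in Tang et al. (2012) may differ, and the function names, rule tuples, and transactions here are hypothetical.

```python
def pattern_occupancy(pattern, transactions):
    """Mean share of a supporting transaction's items covered by the pattern.

    Simplifying assumption: an arithmetic mean over supporting transactions.
    """
    pattern = frozenset(pattern)
    shares = [len(pattern) / len(t) for t in transactions if pattern <= set(t)]
    return sum(shares) / len(shares) if shares else 0.0


def occupancy_weighted_scores(basket, rules, transactions):
    """Score rule consequents by confidence multiplied by pattern occupancy."""
    basket = set(basket)
    scores = {}
    for lhs, rhs, confidence in rules:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        if not lhs <= basket:
            continue  # the rule does not fire for this basket
        quality = confidence * pattern_occupancy(lhs | rhs, transactions)
        for item in rhs - basket:
            # Keep the best score if several rules suggest the same item.
            scores[item] = max(scores.get(item, 0.0), quality)
    return scores


# Two rules with the confidences shown in the stub rule table (Section 4).
rules = [
    (('diapers',), ('baby food',), 0.200),
    (('ham',), ('whole milk',), 0.139),
]
# Hypothetical transactions, invented only to exercise the functions.
transactions = [
    {'diapers', 'baby food'},
    {'diapers', 'baby food', 'wipes', 'milk'},
    {'ham', 'whole milk', 'bread', 'eggs', 'jam'},
]

print(occupancy_weighted_scores(['diapers', 'ham'], rules, transactions))
```

In this toy data the `{diapers, baby food}` pattern fills 75% of its supporting transactions on average, while `{ham, whole milk}` fills only 40%, so the occupancy factor widens the gap between the two rules beyond what confidence alone would give.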

By systematically addressing these areas for future work, the prototyped hybrid recommender presented in this task can be developed into a more sophisticated, effective, and production-ready system capable of delivering significant value in the grocery retail scenario.

## 7. References

Lee, C-H., Kim, Y-H. and Rhee, P-K. (2001) ‘Web personalization expert with combining collaborative filtering and association rule mining technique’, *Expert Systems with Applications*, 21(3), pp. 131–137. doi: 10.1016/S0957-4174(01)00034-3.

Parvatikar, S. and Joshi, B. (2015) ‘Online book recommendation system by using collaborative filtering and association mining’, in *2015 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC)*, Madurai, India, 10–12 December. Piscataway, NJ: IEEE, pp. 1–4. doi: 10.1109/ICCIC.2015.7435717.

Tang, L., Zhang, L., Luo, P. and Wang, M. (2012) ‘Incorporating occupancy into frequent pattern mining for high-quality pattern recommendation’, in *Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM ’12)*, Maui, HI, USA, 29 October – 2 November. New York, NY: ACM, pp. 75–84. doi: 10.1145/2396761.2396776.