## Step 5: Improvements to the Recommendation System

We can make the recommendation system smarter and more useful by implementing several enhancements. Below are possible improvements with example code for each.


### 1. Use a Larger / More Detailed Dataset
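More transactions give more reliable support and confidence estimates, especially for rarely bought items, which improves the quality of the association rules. If the data is split across several CSV files with the same columns, you can merge them into one dataset.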


In [2]:
import pandas as pd

# Example: load multiple CSV files into a single DataFrame
# files = ["dataset1.csv", "dataset2.csv", "dataset3.csv"]
# df = pd.concat([pd.read_csv(f) for f in files], ignore_index=True)

# Preview
# df.head()


### 2. Include Quantities and Frequency
Currently, each item is treated as simply present or absent in a basket. Including the quantity purchased lets recommendations be weighted by how much of an item customers actually buy.


In [3]:
# Example: if the dataset has a 'Quantity' column
# df['Weighted'] = df['Quantity']  # could multiply by price or other factor

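# Note: apriori() in mlxtend accepts only a one-hot (boolean) item matrix and has no weight parameter,
# so mine the itemsets as before and apply the weight separately, e.g. when ranking the rules
# frequent_itemsets = apriori(df_encoded, min_support=0.01, use_colnames=True)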
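
Quantity weighting can also be computed by hand. The block below is a minimal sketch of one possible scheme, not the method used elsewhere in this notebook: each transaction is weighted by the total quantity it contains, and the weighted support of a pair is the share of total weight held by the transactions containing both items. The `transactions` data and the `weighted_pair_support` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transactions: each maps item -> quantity purchased.
transactions = [
    {"pasta": 2, "olive oil": 1},
    {"pasta": 1, "tomato": 4},
    {"pasta": 3, "olive oil": 1, "tomato": 2},
    {"ice cream": 1},
]


def weighted_pair_support(transactions):
    """Return {(item_a, item_b): weighted support} for co-purchased pairs.

    Assumption: a transaction's weight is the total quantity it contains.
    The weighted support of a pair is the summed weight of the transactions
    containing both items, divided by the summed weight of all transactions.
    """
    total_weight = sum(sum(t.values()) for t in transactions)
    support = defaultdict(float)
    for t in transactions:
        weight = sum(t.values())
        # sorted() makes each pair key order-independent
        for pair in combinations(sorted(t), 2):
            support[pair] += weight / total_weight
    return dict(support)


pair_support = weighted_pair_support(transactions)
for pair, value in sorted(pair_support.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

With plain presence/absence counting, the pasta/olive oil and pasta/tomato pairs would tie at 2 of 4 transactions each. Under this weighting, pasta/tomato ranks higher because it appears in the larger baskets.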


### 3. Filter Recommendations by Basket Context
Not every recommended item fits the current basket. Filtering recommendations by category or type can make the suggestions more relevant.


In [4]:
# Example: define categories
# categories = {'pasta': 'pantry', 'olive oil': 'pantry', 'ice cream': 'frozen', 'tomato': 'produce'}
# basket = ['pasta', 'olive oil']
# recommended = ['ice cream', 'tomato', 'rolls/buns']

# Filter recommendations by category of first basket item
# filtered = [item for item in recommended if categories.get(item) == categories.get(basket[0])]
# print(filtered)  # prints [] here: none of the recommended items is in the 'pantry' category
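
Matching only the first basket item can be too strict. The block below is a minimal sketch of a variant that accepts a recommendation if its category matches any item in the basket, and makes the handling of uncategorised items explicit. The `CATEGORIES` lookup, the `filter_by_basket_categories` helper, and its `keep_unknown` flag are hypothetical names introduced for illustration.

```python
# Hypothetical item-to-category lookup; a real one would cover the full catalogue.
CATEGORIES = {
    "pasta": "pantry",
    "olive oil": "pantry",
    "rice": "pantry",
    "ice cream": "frozen",
    "tomato": "produce",
}


def filter_by_basket_categories(basket, recommended, categories, keep_unknown=False):
    """Keep recommendations whose category matches any item in the basket.

    Items missing from `categories` are dropped unless keep_unknown is True.
    """
    basket_categories = {categories[item] for item in basket if item in categories}
    kept = []
    for item in recommended:
        category = categories.get(item)
        if category is None:
            if keep_unknown:
                kept.append(item)
        elif category in basket_categories:
            kept.append(item)
    return kept


basket = ["pasta", "tomato"]
recommended = ["ice cream", "olive oil", "rice", "rolls/buns"]
filtered = filter_by_basket_categories(basket, recommended, CATEGORIES)
print(filtered)  # ['olive oil', 'rice']
```

Passing `keep_unknown=True` would also keep `'rolls/buns'`, which may be preferable while the category lookup is still incomplete.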


### 4. Collaborative Filtering
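We can suggest items based on what similar customers bought, not just co-occurrence in baskets. This needs per-customer data: a `UserID` and a rating (or a proxy such as purchase count) for each item, which plain basket data does not include.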


In [5]:
# from surprise import Dataset, Reader, KNNBasic

# Prepare dataset
# reader = Reader(rating_scale=(1, 5))
# data = Dataset.load_from_df(df[['UserID','Item','Rating']], reader)

# Item-based collaborative filtering
# sim_options = {'name': 'cosine', 'user_based': False}
# algo = KNNBasic(sim_options=sim_options)
# algo.fit(data.build_full_trainset())
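
To make the idea concrete without extra dependencies, the block below is a minimal hand-rolled sketch of item-based collaborative filtering. It is not the `surprise` implementation. It assumes that unrated items count as zero in the cosine similarity and that a candidate's score is the similarity-weighted sum of the user's own ratings. The `ratings` data and all function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [
    ("u1", "pasta", 5), ("u1", "olive oil", 4),
    ("u2", "pasta", 4), ("u2", "olive oil", 5), ("u2", "tomato", 3),
    ("u3", "ice cream", 5), ("u3", "tomato", 2),
    ("u4", "pasta", 5), ("u4", "tomato", 4),
]


def build_item_vectors(ratings):
    """Map each item to a {user: rating} vector."""
    vectors = defaultdict(dict)
    for user, item, rating in ratings:
        vectors[item][user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    dot = sum(a[u] * b[u] for u in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank items the user has not rated by similarity to items they have."""
    vectors = build_item_vectors(ratings)
    owned = {item: r for u, item, r in ratings if u == user}
    scores = {}
    for candidate, vector in vectors.items():
        if candidate in owned:
            continue
        # Similarity-weighted sum of the user's ratings for owned items
        scores[candidate] = sum(
            cosine(vector, vectors[item]) * r for item, r in owned.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("u1", ratings))  # ['tomato', 'ice cream']
```

Here `tomato` ranks first for `u1` because other customers who rated pasta and olive oil also rated tomato, while `ice cream` shares no raters with either item and scores zero.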
