# Product Bundling Recommendation System
## Objective
This notebook builds a product bundling recommendation system. We use K-Means clustering and association rule mining (the Apriori algorithm) to analyze purchase patterns and suggest bundles of products that are often bought together.

## Steps
1. Data Preprocessing
2. Exploratory Data Analysis
3. Model Building
4. Evaluation
5. Conclusion

Let's get started!

In [None]:
# Importing necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
from mlxtend.frequent_patterns import apriori, association_rules

# Setting display options
pd.set_option('display.max_columns', None)
sns.set_theme(style='whitegrid')  # set_theme is the current name; sns.set is a legacy alias

## Data Preprocessing
In this step, we load the dataset. Typical preprocessing tasks at this stage include handling missing values and converting data types; scaling is applied later, just before clustering. Let's start by loading the dataset.

In [None]:
# Load the dataset
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/00396/Sales_Transactions_Dataset_Weekly.csv')

# Display the first few rows of the dataset
df.head()
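
The missing-value and data-type checks described above are not shown in the notebook. A minimal sketch of what they might look like follows. It assumes a small stand-in frame with the same general layout as the weekly sales file: an identifier column (named `Product_Code` here as an assumption) plus `W0`, `W1`, ... columns. The values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df; values are illustrative only
demo = pd.DataFrame({
    'Product_Code': ['P1', 'P2', 'P3'],
    'W0': [11, 7, np.nan],
    'W1': [12, '6', 3],   # mixed types, as can happen after a messy CSV read
})

week_cols = demo.filter(like='W').columns

# Count missing values per column before deciding how to handle them
missing_per_column = demo.isna().sum()

# Coerce the weekly columns to numeric; unparseable entries become NaN
demo[week_cols] = demo[week_cols].apply(pd.to_numeric, errors='coerce')

# Treat a missing week as "no sales"; this is an assumption, not a general rule
demo[week_cols] = demo[week_cols].fillna(0).astype(int)
```

Whether filling with zero is appropriate depends on why the values are missing. If a gap means "not recorded" rather than "nothing sold", dropping or imputing may be a better choice.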

## Exploratory Data Analysis (EDA)
After loading the data, it's crucial to understand its structure, distribution, and patterns. EDA will help us make informed decisions during the model-building phase. We'll look at summary statistics, distributions, and correlations among features.

In [None]:
# Summary statistics
df.describe()
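
Beyond `describe()`, the distributions and correlations mentioned above can be examined with a few pandas calls. The sketch below uses a made-up stand-in frame (the `Product_Code` column name and all values are assumptions) and computes per-product totals, the share of weeks with no sales, and the correlation between weekly columns.

```python
import pandas as pd

# Hypothetical stand-in for df; values are illustrative only
demo = pd.DataFrame({
    'Product_Code': ['P1', 'P2', 'P3', 'P4'],
    'W0': [10, 0, 3, 8],
    'W1': [12, 1, 0, 9],
    'W2': [11, 0, 4, 7],
})
weekly = demo.set_index('Product_Code').filter(like='W')

# Total units sold per product across all weeks
total_sales = weekly.sum(axis=1)

# Fraction of weeks in which each product had no sales
zero_share = (weekly == 0).mean(axis=1)

# Pearson correlation between weekly columns
week_corr = weekly.corr()
```

On the real data, `total_sales.hist()` and `sns.heatmap(week_corr)` would be natural ways to visualize these results.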

## Model Building
For product bundling, we'll use two main techniques:
1. **K-Means Clustering**: To group products with similar weekly sales patterns. (Each row of this dataset is a product, so the clusters are product segments rather than customer segments.)
2. **Association Rule Mining**: To find associations between different products.

Let's start with K-Means Clustering.

In [None]:
# Data preparation for K-Means
# We'll only use the columns that represent weekly sales data
X = df.filter(like='W')

# Standardize the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Find the optimal number of clusters using the Elbow method
inertia = []
for i in range(1, 11):
    # Set n_init explicitly so results do not depend on the scikit-learn version's default
    kmeans = KMeans(n_clusters=i, n_init=10, random_state=0)
    kmeans.fit(X_scaled)
    inertia.append(kmeans.inertia_)

# Plot the Elbow graph
plt.figure(figsize=(10, 6))
plt.plot(range(1, 11), inertia, marker='o')
plt.title('Elbow Method For Optimal Number of Clusters')
plt.xlabel('Number of Clusters')
plt.ylabel('Inertia')
plt.show()
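
The elbow plot can be ambiguous, and the `silhouette_score` imported at the top of the notebook is not otherwise used. The sketch below shows one way it could complement the elbow method. It runs on synthetic data standing in for `X_scaled`; the two groups, their sizes, and the range of `k` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the weekly sales matrix: two well-separated groups
rng = np.random.default_rng(0)
low_sellers = rng.normal(loc=2.0, scale=0.5, size=(30, 5))
high_sellers = rng.normal(loc=20.0, scale=0.5, size=(30, 5))
X_demo = StandardScaler().fit_transform(np.vstack([low_sellers, high_sellers]))

# The silhouette score is only defined for two or more clusters
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_demo)
    scores[k] = silhouette_score(X_demo, labels)

# Pick the k with the highest average silhouette
best_k = max(scores, key=scores.get)
```

A higher silhouette means points sit closer to their own cluster than to the nearest other cluster. On real sales data the maximum is usually less clear-cut than in this toy case, so it is best read alongside the elbow plot.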

## Association Rule Mining
After clustering, we use association rule mining to find patterns in product purchases. The Apriori algorithm first identifies frequent itemsets, that is, sets of items whose support (the fraction of transactions containing all of them) meets a minimum threshold, and then generates association rules from those itemsets. It is commonly used in market basket analysis.
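
The metrics used to rank association rules (support, confidence, and lift) can be computed by hand on a toy example. The baskets and product names below are made up for illustration, and `support` is a helper written for this sketch, not a library function.

```python
# Hypothetical baskets; each set holds the products sold together
baskets = [
    {'bread', 'butter'},
    {'bread', 'butter', 'jam'},
    {'bread'},
    {'jam'},
]

def support(itemset):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

antecedent, consequent = {'bread'}, {'butter'}

rule_support = support(antecedent | consequent)   # share of baskets with bread and butter
confidence = rule_support / support(antecedent)   # share of bread baskets that also have butter
lift = confidence / support(consequent)           # confidence relative to butter's base rate
```

A lift above 1 means the two products appear together more often than they would if purchases were independent, which is the signal a bundle recommendation relies on.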

In [None]:
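# Data preparation for the Apriori algorithm
# Each row of df is a product and each W* column is a week, so transpose to get
# one row per week (the "transaction") and one boolean column per product.
# DataFrame.applymap is deprecated in recent pandas; a vectorized comparison does the same job.
# Column labels after the transpose are df's row index; set a product-ID column
# as the index first if readable product names are needed.
basket_data = (df.filter(like='W') > 0).T

# Apply the Apriori algorithm to find frequent itemsets.
# max_len=2 restricts the search to product pairs so it stays tractable.
frequent_itemsets = apriori(basket_data, min_support=0.05, use_colnames=True, max_len=2)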

# Generate association rules
rules = association_rules(frequent_itemsets, metric='lift', min_threshold=1)
rules.sort_values('lift', ascending=False).head(10)

## Conclusion
In this notebook, we built a product bundling recommendation system using K-Means clustering and association rule mining. K-Means grouped products with similar weekly sales patterns, while the Apriori algorithm surfaced associations between products that tend to sell in the same periods.

Such a system can help businesses increase sales by bundling products that are frequently bought together.