# Market Basket Analysis

> First, let's load and inspect the dataset.

In [6]:
import pandas as pd

# Raw string so the backslashes in the Windows path are not treated as escape sequences
data = pd.read_csv(r'D:\LearnFlow_Internship\BA task 5\data.csv', encoding='ISO-8859-1')
data.head()

| | InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
|---|---|---|---|---|---|---|---|---|
| 0 | 536365 | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 6 | 12-01-2010 08:26 | 2.55 | 17850.0 | United Kingdom |
| 1 | 536365 | 71053 | WHITE METAL LANTERN | 6 | 12-01-2010 08:26 | 3.39 | 17850.0 | United Kingdom |
| 2 | 536365 | 84406B | CREAM CUPID HEARTS COAT HANGER | 8 | 12-01-2010 08:26 | 2.75 | 17850.0 | United Kingdom |
| 3 | 536365 | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 6 | 12-01-2010 08:26 | 3.39 | 17850.0 | United Kingdom |
| 4 | 536365 | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 6 | 12-01-2010 08:26 | 3.39 | 17850.0 | United Kingdom |


The dataset loads successfully, and `data.head()` displays its first five rows. Each row is one line item of an invoice.


> Next, let's proceed with the market basket analysis to discover associations between products frequently purchased together.

In [8]:
%pip install mlxtend


Successfully installed mlxtend-0.23.1
Note: you may need to restart the kernel to use updated packages.


In [10]:
from mlxtend.frequent_patterns import apriori, association_rules
import pandas as pd

# Load the dataset again (raw string so the backslashes in the Windows path are not treated as escapes)
data = pd.read_csv(r'D:\LearnFlow_Internship\BA task 5\data.csv', encoding='ISO-8859-1')

# Preprocessing the data for market basket analysis
basket = (data.groupby(['InvoiceNo', 'Description'])['Quantity']
          .sum().unstack().reset_index().fillna(0)
          .set_index('InvoiceNo'))

# Convert quantities to True (purchased) and False (not purchased).
# A vectorized comparison replaces DataFrame.applymap, which is deprecated in recent pandas releases.
basket_sets = basket >= 1

# Apply the Apriori algorithm to find frequent itemsets
frequent_itemsets = apriori(basket_sets, min_support=0.01, use_colnames=True)

# Generate the association rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules = rules.sort_values(['confidence', 'lift'], ascending=[False, False])
rules.head()


| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
|---|---|---|---|---|---|---|---|---|---|---|
| 1368 | (REGENCY TEA PLATE PINK, REGENCY TEA PLATE ROS... | (REGENCY TEA PLATE GREEN ) | 0.011004 | 0.015585 | 0.010431 | 0.947955 | 60.823405 | 0.01026 | 18.914824 | 0.994502 |
| 1369 | (REGENCY TEA PLATE PINK, REGENCY TEA PLATE GRE... | (REGENCY TEA PLATE ROSES ) | 0.011372 | 0.018203 | 0.010431 | 0.917266 | 50.389863 | 0.010224 | 11.866933 | 0.991429 |
| 794 | (REGENCY TEA PLATE PINK) | (REGENCY TEA PLATE GREEN ) | 0.012476 | 0.015585 | 0.011372 | 0.911475 | 58.48275 | 0.011178 | 11.120239 | 0.995319 |
| 1424 | (PINK REGENCY TEACUP AND SAUCER, ROSES REGENCY... | (GREEN REGENCY TEACUP AND SAUCER) | 0.01354 | 0.04152 | 0.012313 | 0.909366 | 21.901823 | 0.011751 | 10.575228 | 0.967441 |
| 985 | (PINK REGENCY TEACUP AND SAUCER, ROSES REGENCY... | (GREEN REGENCY TEACUP AND SAUCER) | 0.024503 | 0.04152 | 0.022171 | 0.904841 | 21.79286 | 0.021154 | 10.072447 | 0.978079 |


The market basket analysis ran successfully. The rules shown above are the top five when sorted by confidence and then by lift, both in descending order.
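Item names such as 'REGENCY TEA PLATE GREEN ' in the output carry trailing spaces, which suggests the descriptions are not normalized before the baskets are built. The following is a minimal cleaning sketch, not the notebook's actual preprocessing. The toy rows and the `clean_transactions` helper are hypothetical, and treating non-positive quantities as returns to be dropped is an assumption about the data, not something the file confirms.

```python
import pandas as pd

# Hypothetical rows that mimic three columns of data.csv; the values are made up.
raw = pd.DataFrame({
    "InvoiceNo": ["1", "1", "2", "2", "3"],
    "Description": ["TEA PLATE GREEN ", "TEA PLATE PINK", None,
                    "TEA PLATE GREEN", "TEA PLATE PINK "],
    "Quantity": [6, 6, 2, -1, 3],
})

def clean_transactions(df):
    """Drop rows without a description, trim whitespace, keep positive quantities."""
    out = df.dropna(subset=["Description"]).copy()
    out["Description"] = out["Description"].str.strip()
    # Assumption: non-positive quantities are returns or corrections.
    return out[out["Quantity"] > 0]

cleaned = clean_transactions(raw)

# Same pivot as in the notebook, with missing combinations filled as 0.
basket = (cleaned.groupby(["InvoiceNo", "Description"])["Quantity"]
          .sum().unstack(fill_value=0))
basket_sets = basket > 0
print(basket_sets)
```

After trimming, 'TEA PLATE GREEN ' and 'TEA PLATE GREEN' count as the same product, so their support is no longer split across two columns.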

In [25]:
rules.head(20)

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
|---|---|---|---|---|---|---|---|---|---|---|
| 1368 | (REGENCY TEA PLATE PINK, REGENCY TEA PLATE ROS... | (REGENCY TEA PLATE GREEN ) | 0.011004 | 0.015585 | 0.010431 | 0.947955 | 60.823405 | 0.01026 | 18.914824 | 0.994502 |
| 1369 | (REGENCY TEA PLATE PINK, REGENCY TEA PLATE GRE... | (REGENCY TEA PLATE ROSES ) | 0.011372 | 0.018203 | 0.010431 | 0.917266 | 50.389863 | 0.010224 | 11.866933 | 0.991429 |
| 794 | (REGENCY TEA PLATE PINK) | (REGENCY TEA PLATE GREEN ) | 0.012476 | 0.015585 | 0.011372 | 0.911475 | 58.48275 | 0.011178 | 11.120239 | 0.995319 |
| 1424 | (PINK REGENCY TEACUP AND SAUCER, ROSES REGENCY... | (GREEN REGENCY TEACUP AND SAUCER) | 0.01354 | 0.04152 | 0.012313 | 0.909366 | 21.901823 | 0.011751 | 10.575228 | 0.967441 |
| 985 | (PINK REGENCY TEACUP AND SAUCER, ROSES REGENCY... | (GREEN REGENCY TEACUP AND SAUCER) | 0.024503 | 0.04152 | 0.022171 | 0.904841 | 21.79286 | 0.021154 | 10.072447 | 0.978079 |
| 1381 | (STRAWBERRY CHARLOTTE BAG, CHARLOTTE BAG PINK ... | (RED RETROSPOT CHARLOTTE BAG) | 0.011086 | 0.042297 | 0.010022 | 0.904059 | 21.373914 | 0.009553 | 9.982209 | 0.963899 |
| 1376 | (SET/20 RED RETROSPOT PAPER NAPKINS , SET/6 RE... | (SET/6 RED SPOTTY PAPER PLATES) | 0.012108 | 0.021558 | 0.01084 | 0.89527 | 41.528989 | 0.010579 | 9.342546 | 0.987882 |
| 962 | (JUMBO BAG RED RETROSPOT, SUKI SHOULDER BAG) | (DOTCOM POSTAGE) | 0.011536 | 0.028962 | 0.010186 | 0.882979 | 30.487709 | 0.009852 | 8.297963 | 0.978487 |
| 798 | (REGENCY TEA PLATE PINK) | (REGENCY TEA PLATE ROSES ) | 0.012476 | 0.018203 | 0.011004 | 0.881967 | 48.45072 | 0.010777 | 8.317999 | 0.991734 |
| 1423 | (PINK REGENCY TEACUP AND SAUCER, GREEN REGENCY... | (ROSES REGENCY TEACUP AND SAUCER ) | 0.014031 | 0.043606 | 0.012313 | 0.877551 | 20.124402 | 0.011701 | 7.810548 | 0.963833 |
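Several antecedents in this output are cut off with '...' because pandas truncates long cell values when displaying a frame. One possible way to see the full item names is to turn each itemset into a plain string and lift the column-width limit while printing. The sketch below uses a hypothetical `rules_demo` frame with made-up items in place of the real `rules`, and the `itemset_to_text` helper is illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for `rules`; mlxtend stores itemsets as frozensets.
rules_demo = pd.DataFrame({
    "antecedents": [frozenset({"ITEM B", "ITEM A"}), frozenset({"ITEM C"})],
    "consequents": [frozenset({"ITEM D "}), frozenset({"ITEM A"})],
    "confidence": [0.95, 0.90],
})

def itemset_to_text(itemset):
    """Join the item names of one itemset into a sorted, comma-separated string."""
    return ", ".join(sorted(item.strip() for item in itemset))

readable = rules_demo.copy()
for col in ("antecedents", "consequents"):
    readable[col] = readable[col].apply(itemset_to_text)

# max_colwidth=None disables truncation for the duration of the block.
with pd.option_context("display.max_colwidth", None):
    print(readable)
```

Applied to the real `rules` frame, the same two steps should print every antecedent in full.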


These rules indicate strong associations within product families. For example, about 91% of invoices that contain 'REGENCY TEA PLATE PINK' also contain 'REGENCY TEA PLATE GREEN' (confidence 0.911), which is roughly 58 times more often than would be expected if the two items were bought independently (lift 58.48). The Regency teacup-and-saucer sets, the Charlotte bags, and the red paper napkins and plates show similar patterns.
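To make the metric columns easier to read, here is a small sketch that computes support, confidence, and lift by hand using their standard definitions. The five toy transactions and the helper names `support` and `rule_metrics` are hypothetical and are not taken from `data.csv`.

```python
# Toy transactions (hypothetical): each set is the items on one invoice.
transactions = [
    {"PINK PLATE", "GREEN PLATE"},
    {"PINK PLATE", "GREEN PLATE", "TEACUP"},
    {"GREEN PLATE"},
    {"TEACUP"},
    {"PINK PLATE", "TEACUP"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a      # P(consequent | antecedent)
    lift = confidence / s_c      # > 1 means bought together more often than by chance
    return {"support": s_ac, "confidence": confidence, "lift": lift}

metrics = rule_metrics({"PINK PLATE"}, {"GREEN PLATE"}, transactions)
print(metrics)
```

The same arithmetic can be checked against rule 794 in the output: 0.011372 / 0.012476 is about 0.911 (confidence), and 0.911475 / 0.015585 is about 58.5 (lift).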

>Based on the association rules discovered, here are some recommendations for marketing campaigns:


<b>Bundle Promotions:</b>
 
 Create product bundles that include items frequently bought together. For example, bundle 'REGENCY TEA PLATE PINK' with 'REGENCY TEA PLATE ROSES' (rule confidence 0.88) and offer a discount for purchasing the bundle.
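One way to shortlist bundles is to look at the frequent itemsets themselves and keep those with at least two items. This is a sketch under assumptions: `itemsets_demo` is a hypothetical frame with made-up values that copies the `support` and `itemsets` columns returned by `apriori`, and the `bundle_candidates` helper and its thresholds are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for `frequent_itemsets`; values are made up.
itemsets_demo = pd.DataFrame({
    "support": [0.040, 0.022, 0.012, 0.010],
    "itemsets": [
        frozenset({"CUP GREEN"}),
        frozenset({"CUP GREEN", "CUP PINK"}),
        frozenset({"CUP GREEN", "CUP PINK", "CUP ROSES"}),
        frozenset({"PLATE GREEN", "PLATE PINK"}),
    ],
})

def bundle_candidates(itemsets, min_size=2, min_support=0.011):
    """Keep multi-item sets above a support floor, largest and most frequent first."""
    sized = itemsets.assign(size=itemsets["itemsets"].apply(len))
    keep = sized[(sized["size"] >= min_size) & (sized["support"] >= min_support)]
    return keep.sort_values(["size", "support"], ascending=[False, False])

bundles = bundle_candidates(itemsets_demo)
print(bundles)
```

Sorting by size first favors larger bundles. Sorting by support alone would instead favor the combinations that occur most often.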
 
<b>Cross-Selling:</b>
 
 Encourage customers to purchase complementary products by displaying recommendations on product pages. For instance, when a customer views 'CHARLOTTE BAG PINK POLKADOT', suggest 'RED RETROSPOT CHARLOTTE BAG' as a complementary item.
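A product-page recommendation could be driven directly by the rules table. The sketch below looks up rules whose antecedent is fully contained in the items a customer is viewing or has in the cart. The `rules_demo` frame, its item names and values, and the `recommend` helper with its thresholds are all hypothetical. Only the column names follow the `association_rules` output shown earlier.

```python
import pandas as pd

# Hypothetical stand-in for `rules`; item names and metric values are made up.
rules_demo = pd.DataFrame({
    "antecedents": [frozenset({"BAG PINK"}),
                    frozenset({"BAG PINK", "BAG STRAWBERRY"}),
                    frozenset({"NAPKINS"})],
    "consequents": [frozenset({"BAG RED"}),
                    frozenset({"BAG RED"}),
                    frozenset({"PLATES"})],
    "confidence": [0.70, 0.90, 0.89],
    "lift": [15.0, 21.0, 41.0],
})

def recommend(cart, rules, min_confidence=0.6, top_n=3):
    """Suggest items not yet in `cart`, taken from the strongest applicable rules."""
    cart = frozenset(cart)
    # A rule applies only if its whole antecedent is already in the cart.
    applicable = rules[rules["antecedents"].apply(lambda a: a <= cart)]
    applicable = applicable[applicable["confidence"] >= min_confidence]
    applicable = applicable.sort_values(["confidence", "lift"], ascending=False)
    suggestions = []
    for consequent in applicable["consequents"]:
        for item in sorted(consequent):
            if item not in cart and item not in suggestions:
                suggestions.append(item)
    return suggestions[:top_n]

print(recommend({"BAG PINK"}, rules_demo))
print(recommend({"NAPKINS", "BAG PINK"}, rules_demo))
```

Requiring the full antecedent keeps a suggestion from firing on a partial match. A looser variant could score partial overlaps instead, at the cost of weaker evidence per suggestion.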
 
<b>Email Marketing:</b>
 
 Send personalized email campaigns to customers based on their purchase history. If a customer bought 'SET/20 RED RETROSPOT PAPER NAPKINS', recommend 'SET/6 RED SPOTTY PAPER PLATES' in the follow-up email.
 
<b>In-Store Promotions:</b> 

If applicable, place complementary products near each other in the store. For example, 'JUMBO BAG RED RETROSPOT' and 'SUKI SHOULDER BAG' appear together as a rule antecedent in about 1.2% of invoices (antecedent support 0.0115), so placing them in close proximity may encourage customers to purchase both items.

<b>Loyalty Programs:</b> 

Offer loyalty points or rewards for purchasing items that are frequently bought together. This can incentivize customers to buy more and increase their overall basket size.

<b>Social Media Campaigns:</b> 

Highlight popular product combinations on social media platforms. Share posts or stories showing how 'PINK REGENCY TEACUP AND SAUCER' and 'GREEN REGENCY TEACUP AND SAUCER' can be combined into a matching crockery set.

These strategies can help increase sales and improve customer satisfaction by offering relevant and appealing product combinations.