# Upselling and Cross-Selling

Upselling is the sales technique of encouraging customers to purchase a higher-end version or add premium features to their initial product selection, while cross-selling aims to sell complementary or related products to customers alongside their primary purchase. Association rule mining provides a powerful data-driven approach to identify these opportunities by uncovering hidden patterns and relationships between products in transaction data. By analyzing which items are frequently purchased together, businesses can generate personalized product recommendations that naturally align with customer preferences and purchase history. The insights derived from association rule mining enable more targeted marketing strategies, improved product bundling decisions, and strategic placement of complementary products, ultimately driving increased average order value, enhanced customer satisfaction, and higher revenue per customer interaction.

We'll explore 2 methods of association rule mining:
1. Apriori
2. Eclat

## Importing Libraries and Data

In [13]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from apyori import apriori # pip install apyori
from pyECLAT import ECLAT # pip install pyECLAT

In our dataset, we're working with a grocery store owner who wants to increase her sales by offering BOGO (buy one, get one) deals. She doesn't know which items to pair in those deals and asked us to help. The store owner tracked every transaction at her point of sale for 7 days and gave us a CSV in which each row is one transaction listing the items purchased in it. We'll import this data and start exploring:

In [2]:
# header=None tells read_csv that the file has no column labels: every row, including the first, is a transaction basket.
# The columns carry no meaning of their own; they are just item slots within each basket.
df = pd.read_csv('transaction_data.csv', header=None)
df.head()

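|  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | shrimp | almonds | avocado | vegetables mix | green grapes | whole weat flour | yams | cottage cheese | energy drink | tomato juice | low fat yogurt | green tea | honey | salad | mineral water | salmon | antioxydant juice | frozen smoothie | spinach | olive oil |
| 1 | burgers | meatballs | eggs | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | chutney | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | turkey | avocado | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | mineral water | milk | energy bar | whole wheat rice | green tea | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |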


In [3]:
df.shape

(7501, 20)

## Apriori

In its simplest form, Apriori is 'If this, then that.' The algorithm looks for sets of items that occur together and builds rules from the frequency and consistency of those itemsets. Apriori relies on the idea that if a set of items (itemset) is frequent, then all of its subsets must also be frequent; conversely, if an itemset is infrequent, none of its supersets can be frequent. This lets the search discard large numbers of candidate itemsets without ever counting them. A common use case is analyzing point of sale transactions to make deal recommendations, like in our scenario. We'll be using the `apriori` function from the `apyori` package. It expects a list of lists, where each sub-list is a transaction and each item is a str, so we do some reformatting to pull the data out of the df and into an all-str list of lists.

### Preprocessing

In [6]:
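transactions = []
# For each row in the dataset, append its items to the transactions list.
# Short baskets are padded with NaN in the df, so we skip missing cells
# instead of turning them into the string 'nan', and cast the rest to str.
for row in df.values:
    transactions.append([str(item) for item in row if pd.notna(item)])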

print(transactions[0])

['shrimp', 'almonds', 'avocado', 'vegetables mix', 'green grapes', 'whole weat flour', 'yams', 'cottage cheese', 'energy drink', 'tomato juice', 'low fat yogurt', 'green tea', 'honey', 'salad', 'mineral water', 'salmon', 'antioxydant juice', 'frozen smoothie', 'spinach', 'olive oil']


### Training/Mining

The `apriori` function takes four main arguments:
1. transactions: a list of transactions.
2. min_support: support is the fraction of all transactions that contain a given itemset. `min_support` sets the floor an itemset must reach before its rules are reported, which filters out pairings that occur too rarely to matter.
3. min_confidence: confidence measures how reliable a rule is. Of the transactions that contain the antecedent (the "if" itemset), it is the fraction that also contain the consequent. For example, if min_confidence is 0.8, a rule is kept only if the consequent appears in at least 80% of the transactions where the antecedent is present. This is a hyperparameter we can tune to end up with a higher or lower count of rules.
4. min_lift: lift measures how much more likely the consequent is to be purchased when the antecedent is purchased, relative to how often the consequent is purchased overall. A lift of 1 indicates that the two are statistically independent (neither positive nor negative association). A lift above 1 indicates that the consequent is `lift_value` times as likely to be purchased when the antecedent is purchased. A lift below 1 indicates that buying the antecedent makes the consequent less likely: it is only `lift_value` times as likely to be purchased.
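
To make the three metrics concrete, here is a minimal sketch that computes them by hand. The five baskets are made up for illustration (they are not taken from `transaction_data.csv`), and the helper names `support`, `confidence`, and `lift` are our own, not part of `apyori`.

```python
# Toy baskets, invented for illustration only
baskets = [
    {"pasta", "shrimp"},
    {"pasta", "shrimp", "milk"},
    {"pasta", "milk"},
    {"milk"},
    {"eggs"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence of the rule divided by the consequent's overall support."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

pair_support = support({"pasta", "shrimp"}, baskets)
rule_confidence = confidence({"pasta"}, {"shrimp"}, baskets)
rule_lift = lift({"pasta"}, {"shrimp"}, baskets)
print(pair_support, rule_confidence, rule_lift)
```

In this toy data, pasta and shrimp appear together in 2 of 5 baskets (support of roughly 0.4), shrimp appears in 2 of the 3 pasta baskets (confidence of roughly 0.67), and that is about 1.67 times shrimp's overall rate of 0.4 (the lift).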

In [None]:
# Building a heuristic for min_support: let's consider itemsets that were bought at least 3 times a day.
# Our dataset covers 7 days in total, so valid itemsets should occur at least 21 times.
# min_support is expressed as a fraction of transactions, so we divide 21 by the total transaction count and round.
min_support = round(3*7/len(transactions), 4)
min_confidence = 0.2
# min_length and max_length are the minimum and maximum number of items in the itemset.
# These values depend heavily on the business problem behind the analysis. In our case we want rules
# that map onto BOGO deals, so we only want rules that are 2 items long: buy item A, get item B.
rules = apriori(transactions=transactions, min_support=min_support, min_confidence=min_confidence, min_lift=3, min_length=2, max_length=2)
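
As a quick check on the threshold arithmetic, the sketch below reproduces the `min_support` calculation using the 7,501 rows reported by `df.shape`. It assumes the mining library keeps an itemset when its support is greater than or equal to `min_support`; the variable names are ours.

```python
n_transactions = 7501          # number of rows reported by df.shape
target_count = 3 * 7           # 3 purchases per day over 7 days

exact = target_count / n_transactions   # about 0.0027996
rounded = round(exact, 4)               # the value used as min_support

print(exact, rounded)

# Rounding nudges the threshold slightly above 21/7501, so compare counts directly:
keeps_21 = target_count / n_transactions >= rounded        # bought exactly 21 times
keeps_22 = (target_count + 1) / n_transactions >= rounded  # bought 22 times
print(keeps_21, keeps_22)
```

Under that assumption, an itemset bought exactly 21 times falls just below the rounded threshold, so the effective cutoff is 22 purchases. If the intent is to include 21, passing the unrounded value (or rounding down) would avoid the off-by-one.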

### Visualization

In [8]:
# Displaying the first results coming directly from the output of the apriori function
results = list(rules)
for rule in results[:10]:
    print(rule)

RelationRecord(items=frozenset({'light cream', 'chicken'}), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset({'light cream'}), items_add=frozenset({'chicken'}), confidence=0.29059829059829057, lift=4.84395061728395)])
RelationRecord(items=frozenset({'escalope', 'mushroom cream sauce'}), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset({'mushroom cream sauce'}), items_add=frozenset({'escalope'}), confidence=0.3006993006993007, lift=3.790832696715049)])
RelationRecord(items=frozenset({'escalope', 'pasta'}), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset({'pasta'}), items_add=frozenset({'escalope'}), confidence=0.3728813559322034, lift=4.700811850163794)])
RelationRecord(items=frozenset({'honey', 'fromage blanc'}), support=0.003332888948140248, ordered_statistics=[OrderedStatistic(items_base=frozenset({'fromage blanc'}), items_add=frozenset({'honey'}), confidence=0.245098

In [9]:
print("Number of rules: ",len(results))

Number of rules:  9


With our current parameters, we ended up with 9 association rules. Let's organize them into a df for easier analysis:

In [None]:
# This helper, adapted from Stack Overflow, flattens the rules for display. For each rule it extracts the
# left and right hand side products plus the support, confidence, and lift, then zips those into a list of tuples.
# It reads only the first item on each side and the first ordered statistic, which is enough for our 2-item rules.
# The final line (outside the inspect function) casts that list of tuples to a df.
def inspect(results):
    lhs         = [tuple(result[2][0][0])[0] for result in results]
    rhs         = [tuple(result[2][0][1])[0] for result in results]
    supports    = [result[1] for result in results]
    confidences = [result[2][0][2] for result in results]
    lifts       = [result[2][0][3] for result in results]
    return list(zip(lhs, rhs, supports, confidences, lifts))
resultsinDataFrame = pd.DataFrame(inspect(results), columns = ['Left Hand Side', 'Right Hand Side', 'Support', 'Confidence', 'Lift'])
resultsinDataFrame

|  | Left Hand Side | Right Hand Side | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 0 | light cream | chicken | 0.004533 | 0.290598 | 4.843951 |
| 1 | mushroom cream sauce | escalope | 0.005733 | 0.300699 | 3.790833 |
| 2 | pasta | escalope | 0.005866 | 0.372881 | 4.700812 |
| 3 | fromage blanc | honey | 0.003333 | 0.245098 | 5.164271 |
| 4 | herb & pepper | ground beef | 0.015998 | 0.32345 | 3.291994 |
| 5 | tomato sauce | ground beef | 0.005333 | 0.377358 | 3.840659 |
| 6 | light cream | olive oil | 0.0032 | 0.205128 | 3.11471 |
| 7 | whole wheat pasta | olive oil | 0.007999 | 0.271493 | 4.12241 |
| 8 | pasta | shrimp | 0.005066 | 0.322034 | 4.506672 |
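
The index-based lookups in `inspect` (`result[2][0][0]` and so on) work, but they are hard to read. As a hedged alternative, the sketch below reads the same values by the field names visible in the printed records (`items_base`, `items_add`, `confidence`, `lift`). To keep it self-contained, it defines stand-in namedtuples that mirror those names; `flatten_rules` is our own hypothetical helper, and the sample record copies the first rule printed earlier.

```python
from collections import namedtuple

# Stand-ins mirroring the field names shown in the printed output above
OrderedStatistic = namedtuple("OrderedStatistic", "items_base items_add confidence lift")
RelationRecord = namedtuple("RelationRecord", "items support ordered_statistics")

def flatten_rules(records):
    """Return one dict per ordered statistic, reading every field by name."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append({
                "Left Hand Side": ", ".join(sorted(stat.items_base)),
                "Right Hand Side": ", ".join(sorted(stat.items_add)),
                "Support": record.support,
                "Confidence": stat.confidence,
                "Lift": stat.lift,
            })
    return rows

sample = [
    RelationRecord(
        items=frozenset({"light cream", "chicken"}),
        support=0.004532728969470737,
        ordered_statistics=[
            OrderedStatistic(frozenset({"light cream"}), frozenset({"chicken"}),
                             0.29059829059829057, 4.84395061728395),
        ],
    ),
]

rows = flatten_rules(sample)
for row in rows:
    print(row)
```

Because the function returns a list of dicts, `pd.DataFrame(flatten_rules(results))` should give the same columns as the table above. It also keeps every ordered statistic and every item on each side, which matters if `max_length` is ever raised above 2.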


Let's sort by lift to get a better sense of which rules are most important:

In [11]:
resultsinDataFrame.sort_values(by='Lift', ascending=False).head()

|  | Left Hand Side | Right Hand Side | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 3 | fromage blanc | honey | 0.003333 | 0.245098 | 5.164271 |
| 0 | light cream | chicken | 0.004533 | 0.290598 | 4.843951 |
| 2 | pasta | escalope | 0.005866 | 0.372881 | 4.700812 |
| 8 | pasta | shrimp | 0.005066 | 0.322034 | 4.506672 |
| 7 | whole wheat pasta | olive oil | 0.007999 | 0.271493 | 4.12241 |


We interpret each rule as a potential BOGO deal that our shop owner could offer: buy the `Left Hand Side` product, get the `Right Hand Side` product for free. That's not to say that each rule is a great idea. Giving a customer free chicken (expensive) when they buy light cream (cheap) may not be smart. The rules only indicate which products customers tend to purchase together, which makes them candidates to be paired in deals, placed close together in the store, or offered as an add-on at the point of sale.

Apriori is an established algorithm, but it has one critical flaw: it doesn't scale well. It needs a full scan of the transaction list for every itemset size, and between scans it has to generate and hold a potentially very large set of candidate itemsets. On larger or dense datasets, that makes Apriori both compute and memory heavy. Enter Eclat:

## Eclat

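Eclat tackles the same problem as Apriori with a simpler counting scheme. It finds the same frequent itemsets, and therefore supports the same association rules, but usually gets there more efficiently. There is some nuance to that efficiency (the intermediate transaction ID lists can consume a LOT of memory), but Eclat is generally faster than Apriori. Note that Eclat reports only itemsets and their supports; confidence and lift have to be computed from those supports afterwards.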

Apriori:
1. Scan transaction list data for 1-item sets
2. Build all possible itemsets by combining frequent sets from step 1
3. Scan data again to calculate support for each possible itemset
4. Remove itemsets that are infrequent
5. Repeat steps 2-4 until no more frequent itemsets are found
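
The five Apriori steps above can be sketched in a few lines of plain Python. This is a teaching sketch under our own naming (`apriori_frequent_itemsets` is hypothetical), not the implementation inside `apyori`, and it runs on made-up baskets.

```python
from itertools import combinations

def apriori_frequent_itemsets(baskets, min_support):
    """Level-wise (breadth-first) search for frequent itemsets."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def support(itemset):
        # One pass over all baskets per call: this is the repeated scanning
        return sum(itemset <= b for b in baskets) / n

    # Step 1: frequent 1-item sets
    items = {item for b in baskets for item in b}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Step 2: build size-k candidates from pairs of frequent (k-1)-itemsets
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Apriori pruning: drop a candidate if any (k-1)-subset is not frequent
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current for sub in combinations(c, k - 1))}
        # Steps 3 and 4: scan again and keep only the frequent candidates
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1  # Step 5: repeat with larger itemsets
    return frequent

toy_baskets = [
    {"pasta", "shrimp"},
    {"pasta", "shrimp", "milk"},
    {"pasta", "milk"},
    {"milk"},
    {"eggs"},
]
apriori_result = apriori_frequent_itemsets(toy_baskets, min_support=0.4)
for itemset, s in sorted(apriori_result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

On these toy baskets the 3-item candidate is pruned without being counted, because one of its 2-item subsets (shrimp and milk) is already known to be infrequent.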

Eclat:
1. Scan transaction list data once to build a dictionary of `{item : [list of transaction IDs where item appears]}`
2. Compute each item's support from the length of its transaction ID list, and drop the infrequent items
3. Intersect the transaction ID lists of two itemsets to get the support of their union, which yields larger frequent itemsets
4. Recursively extend itemsets in a depth-first manner

In essence, where Apriori is a breadth-first search, Eclat is a depth-first search. 
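
For comparison, here is a minimal sketch of the Eclat steps listed above, again on made-up baskets. The function name `eclat_frequent_itemsets` is hypothetical and this is not how `pyECLAT` is implemented internally; it only illustrates the transaction ID lists, the set intersections, and the depth-first recursion.

```python
from collections import defaultdict

def eclat_frequent_itemsets(baskets, min_support):
    """Depth-first search for frequent itemsets over transaction ID sets."""
    n = len(baskets)

    # Step 1: one scan to build {item: set of transaction IDs containing it}
    tidsets = defaultdict(set)
    for tid, basket in enumerate(baskets):
        for item in basket:
            tidsets[item].add(tid)

    # Step 2: an item's support is just the size of its ID set
    frequent_items = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) / n >= min_support),
        key=lambda pair: pair[0],
    )

    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            # Step 3: intersecting ID sets gives the support of the larger itemset
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) / n >= min_support:
                itemset = prefix + (item,)
                frequent[itemset] = len(new_tids) / n
                # Step 4: go deeper before moving on to the next sibling
                extend(itemset, new_tids, candidates[i + 1:])

    extend((), None, frequent_items)
    return frequent

toy_baskets = [
    {"pasta", "shrimp"},
    {"pasta", "shrimp", "milk"},
    {"pasta", "milk"},
    {"milk"},
    {"eggs"},
]
eclat_result = eclat_frequent_itemsets(toy_baskets, min_support=0.4)
for itemset, s in eclat_result.items():
    print(itemset, s)
```

After the initial scan, the baskets themselves are never read again: every support comes from intersecting ID sets. The trade-off is that those ID sets must be kept in memory while the recursion runs.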

### Training/Mining

In [24]:
eclat_model = ECLAT(data=df, verbose=True)
eclat_model.uniq_  # the list of unique items found in the dataset

100%|██████████| 120/120 [00:00<00:00, 180.06it/s]
100%|██████████| 120/120 [00:00<00:00, 5864.59it/s]
100%|██████████| 120/120 [00:00<00:00, 7501.33it/s]


['soup',
 'sparkling water',
 'gluten free bar',
 'fromage blanc',
 'red wine',
 'dessert wine',
 'shrimp',
 'body spray',
 'salmon',
 'yogurt cake',
 'cream',
 'strong cheese',
 'eggplant',
 'mushroom cream sauce',
 'burgers',
 'rice',
 'corn',
 'hand protein bar',
 'chili',
 'tomatoes',
 'chocolate',
 'clothes accessories',
 'champagne',
 'asparagus',
 'sandwich',
 'muffins',
 'shampoo',
 'oil',
 'cooking oil',
 'chocolate bread',
 'water spray',
 'flax seed',
 'mayonnaise',
 'shallot',
 'cookies',
 'energy bar',
 'tea',
 'herb & pepper',
 'brownies',
 'chicken',
 'mineral water',
 'pasta',
 'fresh tuna',
 'hot dogs',
 'white wine',
 'oatmeal',
 'pickles',
 'zucchini',
 'ham',
 'whole weat flour',
 'whole wheat pasta',
 'mashed potato',
 'fresh bread',
 'green grapes',
 'grated cheese',
 ' asparagus',
 'light mayo',
 'barbecue sauce',
 nan,
 'soda',
 'spinach',
 'salad',
 'pancakes',
 'tomato sauce',
 'eggs',
 'cereals',
 'french fries',
 'pepper',
 'bug spray',
 'melons',
 'cauliflo

In [28]:
# fit the ECLAT model to the data
eclat_indexes, eclat_supports = eclat_model.fit(min_support=min_support, min_combination=2, max_combination=2, separator=' & ', verbose=True)

Combination 2 by 2


6555it [00:42, 154.26it/s]


In [29]:
eclat_supports

{'soup & shrimp': 0.005599253432875617,
 'soup & salmon': 0.003199573390214638,
 'soup & burgers': 0.006265831222503666,
 'soup & tomatoes': 0.006932409012131715,
 'soup & chocolate': 0.010131982402346354,
 'soup & cooking oil': 0.005332622317024397,
 'soup & herb & pepper': 0.0035995200639914677,
 'soup & chicken': 0.005999200106652446,
 'soup & mineral water': 0.023063591521130515,
 'soup & grated cheese': 0.0029329422743634183,
 'soup & pancakes': 0.006799093454206106,
 'soup & eggs': 0.009065457938941474,
 'soup & french fries': 0.007598986801759766,
 'soup & olive oil': 0.008932142381015865,
 'soup & frozen vegetables': 0.007998933475536596,
 'soup & whole wheat rice': 0.0035995200639914677,
 'soup & low fat yogurt': 0.005732568990801226,
 'soup & escalope': 0.0034662045060658577,
 'soup & milk': 0.015197973603519531,
 'soup & spaghetti': 0.014264764698040262,
 'soup & turkey': 0.004932675643247567,
 'soup & ground beef': 0.009732035728569524,
 'soup & honey': 0.003199573390214638

### Visualization

In [32]:
eclat_itemsets_df = pd.DataFrame(list(eclat_supports.items()), columns=['Itemset', 'Support'])

# Sort by support in descending order
df_sorted = eclat_itemsets_df.sort_values(by='Support', ascending=False)
df_sorted

|  | Itemset | Support |
|---|---|---|
| 431 | mineral water & spaghetti | 0.059725 |
| 213 | chocolate & mineral water | 0.052660 |
| 405 | mineral water & eggs | 0.050927 |
| 429 | mineral water & milk | 0.047994 |
| 436 | mineral water & ground beef | 0.040928 |
| ... | ... | ... |
| 710 | carrots & milk | 0.002933 |
| 739 | protein bar & green tea | 0.002933 |
| 646 | olive oil & almonds | 0.002933 |
| 287 | cooking oil & fresh bread | 0.002933 |
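
Sorting by support mostly surfaces pairs of popular items (mineral water shows up in every top row), which is not the same as a strong association. To get confidence and lift for these pairs, one option is to combine pair supports with single-item supports, as in the sketch below. The function `rules_from_supports` and all of the numbers are hypothetical. With `pyECLAT`, the single-item supports could presumably come from a second `fit` call with `min_combination=1` and `max_combination=1`, but that is an assumption we have not run here. The sketch uses tuple keys rather than joined strings because one item name, `herb & pepper`, itself contains the ` & ` separator, which makes a key like `soup & herb & pepper` ambiguous to split.

```python
def rules_from_supports(item_supports, pair_supports, min_confidence=0.2, min_lift=3.0):
    """Turn itemset supports into (lhs, rhs, support, confidence, lift) rules."""
    rules = []
    for (a, b), pair_support in pair_supports.items():
        # Each pair can produce a rule in either direction
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / item_supports[lhs]
            lift = confidence / item_supports[rhs]
            if confidence >= min_confidence and lift >= min_lift:
                rules.append((lhs, rhs, pair_support, confidence, lift))
    # Highest lift first
    return sorted(rules, key=lambda rule: rule[4], reverse=True)

# Illustrative supports only, not values from the grocery dataset
item_supports = {"bread": 0.10, "jam": 0.05, "water": 0.50}
pair_supports = {("bread", "jam"): 0.04, ("bread", "water"): 0.06}

derived_rules = rules_from_supports(item_supports, pair_supports)
for rule in derived_rules:
    print(rule)
```

In this made-up example the bread and water pair has the higher support but a lift of only about 1.2, so it is filtered out, while the rarer bread and jam pair passes in both directions with a lift of about 8.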


## Conclusion

And there we have it! Association rule algorithms like Apriori and Eclat help uncover buying patterns in transaction data. The high-lift rules we found with Apriori, along with the frequently co-purchased pairs we found with Eclat, give our shop owner concrete candidates for product placement, bundles, and BOGO deals. By working these pairings into her sales strategy, the owner can improve the customer shopping experience while increasing average order value. Future work could combine these insights with customer segmentation data to create more targeted upselling approaches, and could incorporate temporal patterns to anticipate seasonal buying behavior.