# Apriori

## Importing the libraries

In [1]:
!pip install apyori

Collecting apyori
  Downloading apyori-1.1.2.tar.gz (8.6 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: apyori
  Building wheel for apyori (setup.py) ... done
  Created wheel for apyori: filename=apyori-1.1.2-py3-none-any.whl size=5954 sha256=0c13a577e3ed3554ef889e8c919536d732e54579c78a9b873108410e133cd8a4
  Stored in directory: /root/.cache/pip/wheels/c4/1a/79/20f55c470a50bb3702a8cb7c94d8ada15573538c7f4baebe2d
Successfully built apyori
Installing collected packages: apyori
Successfully installed apyori-1.1.2


In [2]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Data Preprocessing

In [5]:
# apyori's apriori() expects a list of transactions (a list of lists of strings), not a DataFrame
dataset = pd.read_csv('Market_Basket_Optimisation.csv', header = None)
transactions = []
# 7501 rows (transactions), up to 20 items each; empty cells are read as NaN and become the string 'nan'
for i in range(7501):
  transactions.append([str(dataset.values[i, j]) for j in range(20)])
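Because pandas pads short rows with NaN, the loop above stores the literal string `'nan'` as an item in most transactions. As a minimal sketch of one way to avoid that, the transactions can be read with Python's `csv` module, which keeps each row at its own length. The three-row sample and the helper name `load_transactions` are made up for illustration and are not part of the original notebook.

```python
import csv
import io

# Hypothetical three-row sample standing in for Market_Basket_Optimisation.csv
sample_csv = "shrimp,almonds,avocado\nburgers,meatballs\nchutney\n"

def load_transactions(file_obj):
    """Return one list of item names per CSV row, skipping empty cells."""
    reader = csv.reader(file_obj)
    return [[item.strip() for item in row if item.strip()] for row in reader]

transactions_sample = load_transactions(io.StringIO(sample_csv))
print(transactions_sample)
# [['shrimp', 'almonds', 'avocado'], ['burgers', 'meatballs'], ['chutney']]
```

For the real file, the same function could be applied with `with open('Market_Basket_Optimisation.csv', newline='') as f: transactions = load_transactions(f)`.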

## Training the Apriori model on the dataset



Lift compares how often the items in a rule occur together with how often they would if they were independent.

1. **Lift = 1:**
   - **Interpretation:** The items in the rule are independent of each other.
   - **Implication:** The occurrence of one item does not change the likelihood of the other.

2. **Lift < 1:**
   - **Interpretation:** The items appear together less often than expected under independence.
   - **Implication:** The association is negative: buying one item makes the other less likely.

3. **Lift > 1:**
   - **Interpretation:** The items appear together more often than expected under independence.
   - **Implication:** The association is positive, and the size of the lift indicates its strength. As a rough rule of thumb:

   - **1 < Lift < 2:** a weak positive association.
   - **2 ≤ Lift < 5:** a moderate to strong positive association.
   - **Lift ≥ 5:** a strong positive association.

In summary, lift above 1 indicates a positive association (the higher, the stronger) and lift below 1 a negative one. These thresholds are not universal, so interpret lift in the context of your data and the problem you are solving.
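To make the three cases concrete, here is a small sketch that computes support, confidence and lift by hand. The eight toy transactions and the helper functions are invented for illustration; they are not taken from the dataset and are not how apyori computes these values internally.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, lhs, rhs):
    """support(lhs and rhs together) / support(lhs)."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)

def lift(transactions, lhs, rhs):
    """confidence(lhs -> rhs) / support(rhs)."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)

# Made-up transactions, for illustration only
toy = [
    ['pasta', 'escalope'],
    ['pasta', 'escalope', 'milk'],
    ['pasta', 'milk'],
    ['milk'],
    ['eggs'],
    ['eggs', 'milk'],
    ['escalope'],
    ['eggs'],
]

print(round(lift(toy, ['pasta'], ['escalope']), 2))  # 1.78, positive association
print(round(lift(toy, ['eggs'], ['milk']), 2))       # 0.67, negative association
```

In this toy data, pasta buyers pick up escalope more often than shoppers in general (lift above 1), while egg buyers pick up milk less often than shoppers in general (lift below 1).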

In [6]:
# min_support
(3*7) / 7501

0.0027996267164378083

In [7]:
# support(A) = number of transactions containing A / total number of transactions
# assumption: only consider item sets that were bought at least 3 times a day
# the dataset covers 7 days
# min_support = (3*7) / 7501 ≈ 0.0028, rounded to 0.003
# goal: "buy A, get B free" offers, so min_length and max_length are both 2
# confidence(A->B) = number of transactions containing both A and B / number of transactions containing A
# lift(A->B) = confidence(A->B) / support(B)
from apyori import apriori
rules = apriori(transactions, min_support = 0.003, min_confidence = 0.2, min_lift = 3, min_length = 2, max_length = 2)
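Rounding the threshold from about 0.0028 to 0.003 is not entirely neutral. The sketch below converts a support threshold back into a transaction count, assuming the threshold is inclusive (item sets with support equal to `min_support` are kept). The constant and helper names are illustrative.

```python
import math

N_TRANSACTIONS = 7501  # rows in the dataset, as used in the cells above
DAYS = 7               # the dataset covers one week
MIN_PER_DAY = 3        # assumed minimum number of purchases per day

exact = MIN_PER_DAY * DAYS / N_TRANSACTIONS  # 21 / 7501
rounded = 0.003                              # value passed to apriori()

def min_count(min_support, n_transactions):
    """Smallest transaction count whose support is >= min_support."""
    return math.ceil(min_support * n_transactions)

print(round(exact, 6))                     # 0.0028
print(min_count(rounded, N_TRANSACTIONS))  # 23
```

With the rounded value, an item set therefore needs 23 transactions rather than the intended 21 (3 × 7), so item sets seen 21 or 22 times fall below the threshold.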

## Visualising the results

### Displaying the raw results returned by the apriori function

In [8]:
results = list(rules)
print(results)

[RelationRecord(items=frozenset({'chicken', 'light cream'}), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset({'light cream'}), items_add=frozenset({'chicken'}), confidence=0.29059829059829057, lift=4.84395061728395)]), RelationRecord(items=frozenset({'mushroom cream sauce', 'escalope'}), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset({'mushroom cream sauce'}), items_add=frozenset({'escalope'}), confidence=0.3006993006993007, lift=3.790832696715049)]), RelationRecord(items=frozenset({'escalope', 'pasta'}), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset({'pasta'}), items_add=frozenset({'escalope'}), confidence=0.3728813559322034, lift=4.700811850163794)]), RelationRecord(items=frozenset({'honey', 'fromage blanc'}), support=0.003332888948140248, ordered_statistics=[OrderedStatistic(items_base=frozenset({'fromage blanc'}), items_add=frozenset({'honey'}), confidence=0.24

### Organising the results into a pandas DataFrame

In [17]:
print(tuple(results[0][2][0][0])[0], type(results[0]))

light cream <class 'apyori.RelationRecord'>


In [19]:
def inspect(results):
  # each result is a RelationRecord: (items, support, ordered_statistics)
  # result[2][0] is its first OrderedStatistic: (items_base, items_add, confidence, lift)
  lhs = [tuple(result[2][0][0])[0] for result in results]  # first item of items_base
  rhs = [tuple(result[2][0][1])[0] for result in results]  # first item of items_add
  supports = [result[1] for result in results]
  confidences = [result[2][0][2] for result in results]
  lifts = [result[2][0][3] for result in results]
  return list(zip(lhs, rhs, supports, confidences, lifts))

resultsinDataFrame = pd.DataFrame(inspect(results), columns = ['Left', 'Right', 'Support', 'Confidence', 'Lift'])
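`inspect` reads only the first entry of `ordered_statistics`, so if a pair of items produced two qualifying rules (A to B and B to A), the second would be dropped. A possible alternative, sketched here with stand-in namedtuples that mirror the field names visible in the printed output above, walks every statistic and uses attribute names instead of positional indices. `rules_to_rows` is a hypothetical helper, not part of apyori.

```python
from collections import namedtuple

# Stand-ins mirroring the field names shown in the printed apyori output
RelationRecord = namedtuple('RelationRecord', ['items', 'support', 'ordered_statistics'])
OrderedStatistic = namedtuple('OrderedStatistic', ['items_base', 'items_add', 'confidence', 'lift'])

def rules_to_rows(records):
    """Return one dict per rule, covering every ordered statistic of every record."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append({
                'Left': ', '.join(sorted(stat.items_base)),
                'Right': ', '.join(sorted(stat.items_add)),
                'Support': record.support,
                'Confidence': stat.confidence,
                'Lift': stat.lift,
            })
    return rows

# First record from the output above, rebuilt with the stand-in types
sample = [RelationRecord(
    items=frozenset({'chicken', 'light cream'}),
    support=0.004532728969470737,
    ordered_statistics=[OrderedStatistic(
        items_base=frozenset({'light cream'}),
        items_add=frozenset({'chicken'}),
        confidence=0.29059829059829057,
        lift=4.84395061728395)])]

rows = rules_to_rows(sample)
print(rows[0]['Left'], '->', rows[0]['Right'])  # light cream -> chicken
```

A list of dicts like `rows` can be passed directly to `pd.DataFrame(rows)`, which takes the column names from the dict keys.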

### Displaying the unsorted results

In [20]:
resultsinDataFrame

| | Left | Right | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 0 | light cream | chicken | 0.004533 | 0.290598 | 4.843951 |
| 1 | mushroom cream sauce | escalope | 0.005733 | 0.300699 | 3.790833 |
| 2 | pasta | escalope | 0.005866 | 0.372881 | 4.700812 |
| 3 | fromage blanc | honey | 0.003333 | 0.245098 | 5.164271 |
| 4 | herb & pepper | ground beef | 0.015998 | 0.32345 | 3.291994 |
| 5 | tomato sauce | ground beef | 0.005333 | 0.377358 | 3.840659 |
| 6 | light cream | olive oil | 0.0032 | 0.205128 | 3.11471 |
| 7 | whole wheat pasta | olive oil | 0.007999 | 0.271493 | 4.12241 |
| 8 | pasta | shrimp | 0.005066 | 0.322034 | 4.506672 |


### Displaying the results sorted by descending lift

In [21]:
resultsinDataFrame.nlargest(n = 10, columns = 'Lift')

| | Left | Right | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 3 | fromage blanc | honey | 0.003333 | 0.245098 | 5.164271 |
| 0 | light cream | chicken | 0.004533 | 0.290598 | 4.843951 |
| 2 | pasta | escalope | 0.005866 | 0.372881 | 4.700812 |
| 8 | pasta | shrimp | 0.005066 | 0.322034 | 4.506672 |
| 7 | whole wheat pasta | olive oil | 0.007999 | 0.271493 | 4.12241 |
| 5 | tomato sauce | ground beef | 0.005333 | 0.377358 | 3.840659 |
| 1 | mushroom cream sauce | escalope | 0.005733 | 0.300699 | 3.790833 |
| 4 | herb & pepper | ground beef | 0.015998 | 0.32345 | 3.291994 |
| 6 | light cream | olive oil | 0.0032 | 0.205128 | 3.11471 |
