# Apriori

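Apriori is an association rule learning algorithm. Like naive Bayes, it builds on conditional probabilities estimated from counts. It determines whether one event tends to occur together with another, which is an association rather than proof of cause. For example, customers who buy item A tend to also buy item B.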


## Support

Support measures how likely event B is within the sample space: the fraction of all transactions S that contain B.

$support(B)={n(B) \over n(S)}$
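As a small illustration of the formula, the sketch below counts support on a toy list of transactions. The data and the helper name `support` are assumptions made up for this example; they are not part of the dataset or of apyori.

```python
# Toy transactions (hypothetical data, not taken from Market_Basket_Optimisation.csv).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

print(support({"bread"}, transactions))          # bread appears in 3 of 4 transactions
print(support({"bread", "milk"}, transactions))  # both appear together in 2 of 4
```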

## Confidence

Confidence measures how likely event B is given that event A has occurred. In other words, it is a conditional probability, computed from the transactions in which both A and B occur.

$confidence(A \to B)={n(A \cap B) \over n(A)} = {P(A \cap B) \over P(A)} = P(B \mid A)$
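To make the conditional-probability reading concrete, here is a minimal sketch on toy data. The transactions and the helper names `count` and `confidence` are hypothetical. Note that a transaction containing the union of the item sets A and B is exactly one in which both events occur.

```python
# Toy transactions (hypothetical data for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))

def confidence(a, b, transactions):
    """Estimate P(B | A): among transactions containing A, the share that also contain B."""
    n_a = count(a, transactions)
    if n_a == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    # Containing all items of A and B together means both events happened.
    return count(set(a) | set(b), transactions) / n_a

print(confidence({"butter"}, {"bread"}, transactions))  # every butter basket also has bread
print(confidence({"bread"}, {"milk"}, transactions))    # 2 of the 3 bread baskets have milk
```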

## Lift

Lift measures how much more likely event B is when A occurs, compared with how likely B is across the whole sample space.

$lift(A \to B)={confidence(A \to B) \over support(B)}$

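Lift is usually the most useful measure for ranking Apriori rules. A lift above 1 means B occurs more often alongside A than it does overall, a lift of 1 means A and B look independent, and a lift below 1 means they tend not to occur together.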
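The following sketch computes lift on toy data and shows one rule with lift above 1 and one below 1. The transactions and the helper names are assumptions for illustration, not part of apyori.

```python
# Toy transactions (hypothetical data for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def lift(a, b, transactions):
    """confidence(A -> B) divided by support(B)."""
    conf = support(set(a) | set(b), transactions) / support(a, transactions)
    return conf / support(b, transactions)

# Butter buyers take bread more often than shoppers overall, so lift is above 1.
print(lift({"butter"}, {"bread"}, transactions))
# Bread buyers take milk slightly less often than shoppers overall, so lift is below 1.
print(lift({"bread"}, {"milk"}, transactions))
```

Lift is symmetric in A and B, so `lift(A, B)` and `lift(B, A)` give the same value; only confidence depends on the direction of the rule.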

## Importing the libraries

In [1]:
!pip install apyori


In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import apyori

## Data Preprocessing

In [3]:
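# Each CSV row is one transaction; pandas pads shorter rows with NaN.
dataset = pd.read_csv("Market_Basket_Optimisation.csv", header=None)
transactions = []
for i in range(dataset.shape[0]):
  transaction = []
  for j in range(dataset.shape[1]):
    value = dataset.iloc[i, j]
    # pd.notna is a reliable missing-value test; `is not np.NaN` relies on
    # object identity, and the np.NaN alias was removed in NumPy 2.0.
    if pd.notna(value):
      transaction.append(value)
  transactions.append(transaction)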
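An alternative that avoids NaN padding altogether is to read the rows with the standard-library `csv` module. This is only a sketch, assuming the file is plain comma-separated text with one transaction per line; the three-row sample below is made up so the block runs without the real file.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV file (note the blank line).
sample = io.StringIO("shrimp,almonds,avocado\nburgers,meatballs\n\nchutney\n")

transactions = []
for row in csv.reader(sample):
    items = [cell.strip() for cell in row if cell.strip()]  # drop empty cells
    if items:  # skip blank lines
        transactions.append(items)

print(transactions)
```

With the real data, `sample` would be replaced by an open file handle, for example `open("Market_Basket_Optimisation.csv", newline="")`.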


## Training the Apriori model on the dataset

In [30]:
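# Pairs only (max_length=2); apyori 1.1.2 does not appear to read min_length, so it likely has no effect.
rules = apyori.apriori(transactions, min_support=0.003, min_confidence=0.2, min_lift=3, min_length=2, max_length=2)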
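To show what the three thresholds filter, here is a brute-force sketch for one-item to one-item rules. It is not apyori's implementation (it skips the candidate pruning that gives Apriori its name); the function name `pair_rules` and the toy data are assumptions for illustration.

```python
from itertools import permutations

def pair_rules(transactions, min_support, min_confidence, min_lift):
    """Brute-force all one-item -> one-item rules that pass the three thresholds."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    rules = []
    for a, b in permutations(items, 2):  # ordered pairs, so A -> B and B -> A
        s_ab = support({a, b})
        if s_ab < min_support:
            continue  # the pair is too rare to be interesting
        conf = s_ab / support({a})
        lift = conf / support({b})
        if conf >= min_confidence and lift >= min_lift:
            rules.append((a, b, s_ab, conf, lift))
    return rules

# Toy transactions (hypothetical data).
toy = [
    ["pasta", "shrimp"],
    ["pasta", "shrimp", "wine"],
    ["wine"],
    ["bread"],
    ["bread", "wine"],
]
for a, b, s, conf, lift in pair_rules(toy, min_support=0.2, min_confidence=0.5, min_lift=1.5):
    print(f"{a} -> {b}: support={s:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

On this toy data only the pasta and shrimp pair survives: the pairs involving wine are frequent enough but their lift falls below the threshold.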

## Visualising the results

### Displaying the first results coming directly from the output of the apriori function

In [31]:
results = list(rules)

In [32]:
results

[RelationRecord(items=frozenset({'chicken', 'light cream'}), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset({'light cream'}), items_add=frozenset({'chicken'}), confidence=0.29059829059829057, lift=4.84395061728395)]),
 RelationRecord(items=frozenset({'mushroom cream sauce', 'escalope'}), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset({'mushroom cream sauce'}), items_add=frozenset({'escalope'}), confidence=0.3006993006993007, lift=3.790832696715049)]),
 RelationRecord(items=frozenset({'pasta', 'escalope'}), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset({'pasta'}), items_add=frozenset({'escalope'}), confidence=0.3728813559322034, lift=4.700811850163794)]),
 RelationRecord(items=frozenset({'honey', 'fromage blanc'}), support=0.003332888948140248, ordered_statistics=[OrderedStatistic(items_base=frozenset({'fromage blanc'}), items_add=frozenset({'honey'}), confidence=0

### Organising the results into a pandas DataFrame

In [38]:
result_pd=pd.DataFrame(results)
result_pd

| | items | support | ordered_statistics |
|---|---|---|---|
| 0 | (chicken, light cream) | 0.004533 | [((light cream), (chicken), 0.2905982905982905... |
| 1 | (mushroom cream sauce, escalope) | 0.005733 | [((mushroom cream sauce), (escalope), 0.300699... |
| 2 | (pasta, escalope) | 0.005866 | [((pasta), (escalope), 0.3728813559322034, 4.7... |
| 3 | (honey, fromage blanc) | 0.003333 | [((fromage blanc), (honey), 0.2450980392156863... |
| 4 | (ground beef, herb & pepper) | 0.015998 | [((herb & pepper), (ground beef), 0.3234501347... |
| 5 | (ground beef, tomato sauce) | 0.005333 | [((tomato sauce), (ground beef), 0.37735849056... |
| 6 | (light cream, olive oil) | 0.0032 | [((light cream), (olive oil), 0.20512820512820... |
| 7 | (whole wheat pasta, olive oil) | 0.007999 | [((whole wheat pasta), (olive oil), 0.27149321... |
| 8 | (pasta, shrimp) | 0.005066 | [((pasta), (shrimp), 0.3220338983050847, 4.506... |


### Displaying the results unsorted

In [57]:
result_pd = pd.DataFrame(columns=["Item Base", "Item Add", "Support", "Confidence", "Lift"])
for i in results:
  # Only the first rule of each record is used; a record may hold more than one.
  stat = i.ordered_statistics[0]
  record = [
    list(stat.items_base),  # antecedent items (A)
    list(stat.items_add),   # consequent items (B)
    i.support,              # support of the whole item set
    stat.confidence,
    stat.lift,
  ]
  result_pd.loc[len(result_pd)] = record

result_pd
# result_pd.columns =["Item Base","Item Add","Support","Confidence","Lift"]
# result_pd.head()
# print(results[0].ordered_statistics[0].items_base)

| | Item Base | Item Add | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 0 | [light cream] | [chicken] | 0.004533 | 0.290598 | 4.843951 |
| 1 | [mushroom cream sauce] | [escalope] | 0.005733 | 0.300699 | 3.790833 |
| 2 | [pasta] | [escalope] | 0.005866 | 0.372881 | 4.700812 |
| 3 | [fromage blanc] | [honey] | 0.003333 | 0.245098 | 5.164271 |
| 4 | [herb & pepper] | [ground beef] | 0.015998 | 0.32345 | 3.291994 |
| 5 | [tomato sauce] | [ground beef] | 0.005333 | 0.377358 | 3.840659 |
| 6 | [light cream] | [olive oil] | 0.0032 | 0.205128 | 3.11471 |
| 7 | [whole wheat pasta] | [olive oil] | 0.007999 | 0.271493 | 4.12241 |
| 8 | [pasta] | [shrimp] | 0.005066 | 0.322034 | 4.506672 |
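Each record can carry more than one entry in `ordered_statistics` (for instance both A -> B and B -> A), and the loop above keeps only the first. A possible variant that keeps every rule is sketched below. The two namedtuples are stand-ins that copy the field names visible in the raw output earlier, defined here only so the sketch runs without apyori; the demo record is invented.

```python
from collections import namedtuple

# Stand-ins mirroring the field names shown in the raw apyori output (assumption:
# real apyori records expose the same attributes, as the notebook code relies on).
OrderedStatistic = namedtuple("OrderedStatistic", "items_base items_add confidence lift")
RelationRecord = namedtuple("RelationRecord", "items support ordered_statistics")

def flatten(records):
    """Return one dict per rule, covering every entry of ordered_statistics."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append({
                "Item Base": ", ".join(sorted(stat.items_base)),
                "Item Add": ", ".join(sorted(stat.items_add)),
                "Support": record.support,
                "Confidence": stat.confidence,
                "Lift": stat.lift,
            })
    return rows

# Hypothetical record holding two rules for the same item pair.
demo = [
    RelationRecord(
        items=frozenset({"a", "b"}),
        support=0.1,
        ordered_statistics=[
            OrderedStatistic(frozenset({"a"}), frozenset({"b"}), 0.5, 2.0),
            OrderedStatistic(frozenset({"b"}), frozenset({"a"}), 0.4, 2.0),
        ],
    )
]
rows = flatten(demo)
for row in rows:
    print(row)
```

A list of dicts like `rows` can be passed straight to `pd.DataFrame(rows)`, which also avoids growing the frame one row at a time.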


### Displaying the results sorted by descending lifts

In [61]:
sorted_result_pd = result_pd.sort_values(by='Lift',axis=0,ascending=False)
sorted_result_pd

| | Item Base | Item Add | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 3 | [fromage blanc] | [honey] | 0.003333 | 0.245098 | 5.164271 |
| 0 | [light cream] | [chicken] | 0.004533 | 0.290598 | 4.843951 |
| 2 | [pasta] | [escalope] | 0.005866 | 0.372881 | 4.700812 |
| 8 | [pasta] | [shrimp] | 0.005066 | 0.322034 | 4.506672 |
| 7 | [whole wheat pasta] | [olive oil] | 0.007999 | 0.271493 | 4.12241 |
| 5 | [tomato sauce] | [ground beef] | 0.005333 | 0.377358 | 3.840659 |
| 1 | [mushroom cream sauce] | [escalope] | 0.005733 | 0.300699 | 3.790833 |
| 4 | [herb & pepper] | [ground beef] | 0.015998 | 0.32345 | 3.291994 |
| 6 | [light cream] | [olive oil] | 0.0032 | 0.205128 | 3.11471 |
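As a sanity check on these numbers, the lift definition can be rearranged to support(B) = confidence / lift, so two rules with the same added item should imply the same support for that item. The sketch below does this for a few rows copied from the table above; agreement is only expected up to the rounding of the displayed values.

```python
# (item base, item add, confidence, lift) copied from the table above.
rules = [
    ("pasta", "escalope", 0.372881, 4.700812),
    ("mushroom cream sauce", "escalope", 0.300699, 3.790833),
    ("tomato sauce", "ground beef", 0.377358, 3.840659),
    ("herb & pepper", "ground beef", 0.323450, 3.291994),
]

# Group the implied support(B) = confidence / lift by the added item B.
implied = {}
for base, add, confidence, lift in rules:
    implied.setdefault(add, []).append(confidence / lift)

for item, values in implied.items():
    print(item, [round(v, 4) for v in values])
```

If the two values printed for an item differed by more than rounding error, that would point to a mistake in how the table was built.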
