***

## Apriori with mlxtend library

***

### Data Comprehension

#### Main Aim

In this work we apply Association Rule Learning (ARL) methods to a business problem. We analyze the relationships between items in an online retail dataset and suggest how a retailer could profit from the underlying connections between those items.

#### Library Discussion

For this work we use MLXTEND, an open-source Python library hosted on GitHub. It provides the Apriori algorithm, a well-known ARL method, which we use for our item recommendation model.

### Data Preprocessing

#### Importing Libraries and Dataset

In [35]:
import pandas as pd 
import numpy as np
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

In [64]:
df = pd.read_csv("C:/Users/HP Elitebook/OneDrive/Bureau/retail_dataset.csv")
df.head()

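|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| 0 | Bread | Wine | Eggs | Meat | Cheese | Pencil | Diaper |
| 1 | Bread | Cheese | Meat | Diaper | Wine | Milk | Pencil |
| 2 | Cheese | Meat | Eggs | Milk | Wine |  |  |
| 3 | Cheese | Meat | Eggs | Milk | Wine |  |  |
| 4 | Meat | Pencil | Wine |  |  |  |  |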


#### Data Extraction

In [37]:
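# Build one list of item names per basket (one basket per row of df).
# Note: str() turns missing cells (NaN) into the literal string 'nan',
# which is why 'nan' shows up as an item in the outputs below.
transaction = []
for i in range(len(df)):
    transaction.append([str(df.values[i, j]) for j in range(7)])
transaction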

[['Bread', 'Wine', 'Eggs', 'Meat', 'Cheese', 'Pencil', 'Diaper'],
 ['Bread', 'Cheese', 'Meat', 'Diaper', 'Wine', 'Milk', 'Pencil'],
 ['Cheese', 'Meat', 'Eggs', 'Milk', 'Wine', 'nan', 'nan'],
 ['Cheese', 'Meat', 'Eggs', 'Milk', 'Wine', 'nan', 'nan'],
 ['Meat', 'Pencil', 'Wine', 'nan', 'nan', 'nan', 'nan'],
 ['Eggs', 'Bread', 'Wine', 'Pencil', 'Milk', 'Diaper', 'Bagel'],
 ['Wine', 'Pencil', 'Eggs', 'Cheese', 'nan', 'nan', 'nan'],
 ['Bagel', 'Bread', 'Milk', 'Pencil', 'Diaper', 'nan', 'nan'],
 ['Bread', 'Diaper', 'Cheese', 'Milk', 'Wine', 'Eggs', 'nan'],
 ['Bagel', 'Wine', 'Diaper', 'Meat', 'Pencil', 'Eggs', 'Cheese'],
 ['Cheese', 'Meat', 'Eggs', 'Milk', 'Wine', 'nan', 'nan'],
 ['Bagel', 'Eggs', 'Meat', 'Bread', 'Diaper', 'Wine', 'Milk'],
 ['Bread', 'Diaper', 'Pencil', 'Bagel', 'Meat', 'nan', 'nan'],
 ['Bagel', 'Cheese', 'Milk', 'Meat', 'nan', 'nan', 'nan'],
 ['Bread', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan'],
 ['Pencil', 'Diaper', 'Bagel', 'nan', 'nan', 'nan', 'nan'],
 ['Meat', 'Bagel',
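
Because every cell is converted with `str()`, empty cells become an item literally named `'nan'`, and that placeholder then flows into the encoding and the rules. The following is a minimal sketch of one way to leave those cells out when building the baskets. The helper names `is_missing` and `clean_transactions` are hypothetical, and the sample rows are copied from the output above; with the real data one would pass `df.values.tolist()` instead. The rest of this notebook was run without this cleaning, so its outputs still contain `nan`.

```python
import math

def is_missing(value):
    """Return True for None, a float NaN, or the literal string 'nan'."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip().lower() == "nan"

def clean_transactions(rows):
    """Keep only real item names in each basket, preserving basket order."""
    return [[str(v) for v in row if not is_missing(v)] for row in rows]

# Sample rows shaped like df.values.tolist(); NaN marks an empty cell.
nan = float("nan")
sample_rows = [
    ["Cheese", "Meat", "Eggs", "Milk", "Wine", nan, nan],
    ["Meat", "Pencil", "Wine", nan, nan, nan, nan],
    ["Bread", nan, nan, nan, nan, nan, nan],
]

clean = clean_transactions(sample_rows)
print(clean)
```

Since the number of baskets is unchanged, the supports of genuine itemsets would stay the same; only the `nan` column and the rules involving it would disappear.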

#### Data Encoding

In [65]:
te                   = TransactionEncoder()
Encoded_transactions = te.fit(transaction).transform(transaction)
transactions         = pd.DataFrame(Encoded_transactions, columns = te.columns_)
transactions.head()

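|   | Bagel | Bread | Cheese | Diaper | Eggs | Meat | Milk | Pencil | Wine | nan |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | True | True | True | True | True | False | True | True | False |
| 1 | False | True | True | True | False | True | True | True | True | False |
| 2 | False | False | True | False | True | True | True | False | True | True |
| 3 | False | False | True | False | True | True | True | False | True | True |
| 4 | False | False | False | False | False | True | False | True | True | True |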


### Apriori Algorithm Conception

#### Model fitting

In [66]:
items_tab = apriori(transactions, min_support = 0.2, use_colnames = True)
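
Here `min_support = 0.2` keeps only the itemsets that occur in at least 20% of the baskets. As a rough illustration of what support measures, the sketch below counts it by brute force on the first five baskets shown earlier, with the `nan` placeholders left out. The helper name `support` is hypothetical, and this is not how mlxtend implements Apriori internally; it only illustrates the definition.

```python
def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    wanted = set(itemset)
    hits = sum(1 for basket in baskets if wanted <= set(basket))
    return hits / len(baskets)

# First five baskets from the extraction output, without 'nan' placeholders.
baskets = [
    ["Bread", "Wine", "Eggs", "Meat", "Cheese", "Pencil", "Diaper"],
    ["Bread", "Cheese", "Meat", "Diaper", "Wine", "Milk", "Pencil"],
    ["Cheese", "Meat", "Eggs", "Milk", "Wine"],
    ["Cheese", "Meat", "Eggs", "Milk", "Wine"],
    ["Meat", "Pencil", "Wine"],
]

MIN_SUPPORT = 0.2  # same threshold as in the cell above

for candidate in (["Bread"], ["Cheese", "Eggs"], ["Meat", "Wine"], ["Bagel"]):
    s = support(candidate, baskets)
    status = "kept" if s >= MIN_SUPPORT else "dropped"
    print(candidate, round(s, 2), status)
```

On this five-basket sample, `Bagel` never appears and would be dropped, while the other three candidates clear the threshold. The supports on the full dataset differ, since they are computed over all baskets.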

#### Rule Generation

In [68]:
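# Record the size of each frequent itemset.
items_tab['Num. of items']  = items_tab['itemsets'].apply(len)
# Derive rules from the frequent itemsets, keeping those with confidence >= 0.3.
rules                       = association_rules(items_tab, metric = 'confidence', min_threshold = 0.3)
# Record how many items each rule's antecedent contains.
rules['antecedents_length'] = rules['antecedents'].apply(len)
rules.head()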

|   | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | antecedents_length |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | (Bagel) | (Bread) | 0.425397 | 0.504762 | 0.279365 | 0.656716 | 1.301042 | 0.064641 | 1.44265 | 1 |
| 1 | (Bread) | (Bagel) | 0.504762 | 0.425397 | 0.279365 | 0.553459 | 1.301042 | 0.064641 | 1.286787 | 1 |
| 2 | (Bagel) | (Milk) | 0.425397 | 0.501587 | 0.225397 | 0.529851 | 1.056348 | 0.012023 | 1.060116 | 1 |
| 3 | (Milk) | (Bagel) | 0.501587 | 0.425397 | 0.225397 | 0.449367 | 1.056348 | 0.012023 | 1.043532 | 1 |
| 4 | (Bagel) | (nan) | 0.425397 | 0.869841 | 0.336508 | 0.791045 | 0.909413 | -0.03352 | 0.622902 | 1 |
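
The metric columns can all be derived from the three support values in each row. The sketch below recomputes them for the first rule, (Bagel) → (Bread), using the standard definitions: confidence = support / antecedent support, lift = confidence / consequent support, leverage = support − antecedent support × consequent support, and conviction = (1 − consequent support) / (1 − confidence). The function name `rule_metrics` is hypothetical, and the inputs are the rounded values printed in the table, so results should match only up to rounding.

```python
def rule_metrics(support_a, support_c, support_ac):
    """Derive rule metrics from antecedent, consequent, and joint support."""
    confidence = support_ac / support_a
    lift = confidence / support_c
    leverage = support_ac - support_a * support_c
    # Conviction is infinite when the rule never fails (confidence == 1).
    if confidence == 1:
        conviction = float("inf")
    else:
        conviction = (1 - support_c) / (1 - confidence)
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }

# Supports for (Bagel) -> (Bread), copied from row 0 of the rules table.
metrics = rule_metrics(support_a=0.425397, support_c=0.504762, support_ac=0.279365)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A lift above 1 means the antecedent and consequent occur together more often than they would if they were independent, which is why the later filter keeps only rules with `lift > 1`.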


### Targeting relevant items

In [63]:
# Keep rules with at least two antecedent items, lift above 1, and confidence above 0.5.
filter_      = (rules['antecedents_length'] >= 2) & (rules['lift'] > 1) & (rules['confidence'] > 0.5)
target_items = rules[filter_]
target_items

|   | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | antecedents_length |
|---|---|---|---|---|---|---|---|---|---|---|
| 62 | (Bagel, nan) | (Bread) | 0.336508 | 0.504762 | 0.212698 | 0.632075 | 1.252225 | 0.042842 | 1.346032 | 2 |
| 64 | (nan, Bread) | (Bagel) | 0.396825 | 0.425397 | 0.212698 | 0.536 | 1.26 | 0.04389 | 1.238369 | 2 |
| 67 | (Eggs, Meat) | (Cheese) | 0.266667 | 0.501587 | 0.215873 | 0.809524 | 1.613924 | 0.082116 | 2.616667 | 2 |
| 68 | (Eggs, Cheese) | (Meat) | 0.298413 | 0.47619 | 0.215873 | 0.723404 | 1.519149 | 0.073772 | 1.893773 | 2 |
| 69 | (Cheese, Meat) | (Eggs) | 0.32381 | 0.438095 | 0.215873 | 0.666667 | 1.521739 | 0.074014 | 1.685714 | 2 |
| 74 | (Eggs, nan) | (Cheese) | 0.336508 | 0.501587 | 0.219048 | 0.650943 | 1.297767 | 0.05026 | 1.427885 | 2 |
| 75 | (nan, Cheese) | (Eggs) | 0.393651 | 0.438095 | 0.219048 | 0.556452 | 1.270161 | 0.046591 | 1.26684 | 2 |
| 78 | (Cheese, Meat) | (Milk) | 0.32381 | 0.501587 | 0.203175 | 0.627451 | 1.250931 | 0.040756 | 1.337845 | 2 |
| 79 | (Meat, Milk) | (Cheese) | 0.244444 | 0.501587 | 0.203175 | 0.831169 | 1.657077 | 0.080564 | 2.952137 | 2 |
| 80 | (Cheese, Milk) | (Meat) | 0.304762 | 0.47619 | 0.203175 | 0.666667 | 1.4 | 0.05805 | 1.571429 | 2 |
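
Several of the selected rules (62, 64, 74 and 75) involve `nan`, which stands for an empty cell rather than a product, so they carry no business meaning. A minimal sketch of a post-hoc check that screens such rules out is shown below. The name `mentions_placeholder` is hypothetical, and the rule pairs are a subset copied from the table above; the commented lines suggest how the same test might be applied to the `rules` DataFrame, which is an untested assumption here.

```python
PLACEHOLDER = "nan"  # string produced by str() on an empty cell

def mentions_placeholder(antecedents, consequents, placeholder=PLACEHOLDER):
    """Return True if either side of the rule contains the placeholder item."""
    return placeholder in antecedents or placeholder in consequents

# (row label, antecedents, consequents) for a few rules from the table above.
selected = [
    (62, frozenset({"Bagel", "nan"}), frozenset({"Bread"})),
    (67, frozenset({"Eggs", "Meat"}), frozenset({"Cheese"})),
    (74, frozenset({"Eggs", "nan"}), frozenset({"Cheese"})),
    (79, frozenset({"Meat", "Milk"}), frozenset({"Cheese"})),
]

kept = [label for label, a, c in selected if not mentions_placeholder(a, c)]
print(kept)

# Possible application to the DataFrame (assumes `rules` and `filter_` exist):
# has_nan = rules.apply(
#     lambda r: mentions_placeholder(r["antecedents"], r["consequents"]), axis=1)
# target_items = rules[filter_ & ~has_nan]
```

Among the remaining rules, (Meat, Milk) → (Cheese) and (Eggs, Meat) → (Cheese) have the highest lift and confidence in the table.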
