## Association Rule Learning
- Association rule learning finds itemsets that frequently occur together in transactions and expresses those co-occurrences as rules. Typical applications include market basket analysis, web usage mining, and continuous production.

- For example, a customer who buys bread is also likely to buy butter, eggs, or milk, so stores place these products on the same shelf or close together.

**Types of Association Rule Learning**
 1. Apriori
 1. Eclat
 1. FP-Growth algorithm

## Apriori
- Apriori is a frequent pattern (frequent itemset) mining algorithm.

- It is an association rule learning algorithm that finds rules of the form "people who bought product A also bought product B".

### Components of the Apriori algorithm
1.  Support
    - refers to the baseline popularity of a product: the fraction of all transactions that contain it.
    - $$ Support(A) = \frac {\textrm{Transactions containing A}}{\textrm{Total transactions}}$$

1.  Confidence
    - refers to the likelihood that a customer who bought product A also bought product B.
    - $$ Confidence(A \rightarrow B) = \frac {\textrm{Transactions containing A and B}}{\textrm{Transactions containing A}}$$

1.  Lift
    - refers to how much more likely **'product B'** is to be sold when **'product A'** is sold, compared with how often **'product B'** sells overall. A lift above 1 means the two products are bought together more often than the popularity of **'product B'** alone would suggest.
    - $$ Lift(A \rightarrow B) = \frac {Confidence(A \rightarrow B)}{Support(B)}$$
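
The three measures can be checked by hand on a small example. The sketch below is only an illustration: the five transactions are made-up data, and the helper names `support`, `confidence`, and `lift` are hypothetical (they are not part of any library used in these notes).

```python
# Toy transactions (hypothetical data, for illustration only)
transactions = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "eggs"},
    {"milk"},
    {"butter", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(a, b, transactions):
    """Support of A and B together, divided by the support of A."""
    return support(set(a) | set(b), transactions) / support(a, transactions)

def lift(a, b, transactions):
    """Confidence of A -> B, divided by the support of B."""
    return confidence(a, b, transactions) / support(b, transactions)

print(support({"bread"}, transactions))               # 3 of 5 transactions
print(confidence({"bread"}, {"butter"}, transactions))  # 2 of the 3 bread baskets
print(lift({"bread"}, {"butter"}, transactions))
```

Here the lift of bread → butter comes out slightly above 1, so in this toy data butter is a little more common in baskets with bread than in baskets overall.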

### Example
Suppose we pick a supermarket dataset and want to understand customer buying behaviour. Here are a few transactions.

<img src='./arl_photos/shop_dataset.png' width='600px'>

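- The total number of people who bought French fries is 10 (circled), and the number of people who bought burgers is 40 (colored green). Of the 40 burger buyers, 7 also bought French fries, and there are 100 transactions in total.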

<img src='./arl_photos/total_buy.png' width ='400px'>

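Therefore, for the rule Burgers → French fries:<br>
- **Support(French fries) = 10/100 = 10%**
- **Confidence(Burgers → French fries) = 7/40 = 17.5%**
- **Lift(Burgers → French fries) = 17.5% / 10% = 1.75**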
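
The same arithmetic can be reproduced in a few lines. This is a sketch under two assumptions inferred from the fractions above: that there are 100 transactions in total, and that the 7 in the confidence numerator is the number of customers who bought both burgers and French fries. The variable names are hypothetical.

```python
total_transactions = 100   # assumption: implied by Support = 10/100
bought_fries = 10
bought_burgers = 40
bought_both = 7            # assumption: numerator of the confidence above

# Support of French fries: share of all transactions that contain them
support_fries = bought_fries / total_transactions

# Confidence of Burgers -> French fries: share of burger buyers who also bought fries
confidence_burgers_fries = bought_both / bought_burgers

# Lift of Burgers -> French fries: confidence relative to the support of fries
lift_burgers_fries = confidence_burgers_fries / support_fries

print(f"support    = {support_fries:.1%}")
print(f"confidence = {confidence_burgers_fries:.1%}")
print(f"lift       = {lift_burgers_fries:.2f}")
```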

**Rules for Apriori:** the rules are conditions that describe customer behaviour, e.g. people buy product B after buying product A.

### Steps of the Apriori algorithm

1. Set a minimum support and a minimum confidence.

1. Take all the itemsets (subsets of the transactions) whose support is higher than the minimum support.

1. Take all the rules from these itemsets whose confidence is higher than the minimum confidence.

1. Sort the rules by decreasing lift.
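
The four steps can be sketched in plain Python for the simplest case, rules with one product on each side. This is a brute-force illustration, not the real Apriori algorithm: Apriori builds candidate itemsets level by level and prunes any candidate that has an infrequent subset, whereas this sketch simply tests every pair. The function name `apriori_pairs` and the toy data are hypothetical.

```python
from itertools import combinations

def apriori_pairs(transactions, min_support, min_confidence):
    """Brute-force sketch of the four steps for single-item -> single-item rules."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Step 2: keep the item pairs whose support is higher than the minimum
    frequent_pairs = [
        frozenset(pair) for pair in combinations(items, 2)
        if support(set(pair)) > min_support
    ]

    # Step 3: keep the rules whose confidence is higher than the minimum
    rules = []
    for pair in frequent_pairs:
        for a in pair:
            (b,) = pair - {a}  # the other item of the pair
            conf = support(pair) / support({a})
            if conf > min_confidence:
                rules.append((a, b, support(pair), conf, conf / support({b})))

    # Step 4: sort the rules by decreasing lift (last tuple element)
    return sorted(rules, key=lambda r: r[4], reverse=True)

# Step 1: choose the thresholds (arbitrary values for this toy data)
toy = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["eggs"],
    ["butter", "bread"],
]
for rule in apriori_pairs(toy, min_support=0.3, min_confidence=0.6):
    print(rule)  # (lhs, rhs, support, confidence, lift)
```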

## Implementing the Apriori algorithm

In [18]:
# import libraries
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

# import dataset
# header=None: the file has no header row, so the first row is read as data
dataset = pd.read_csv('Market_Basket_Optimisation.csv', header=None)

**Note:** the `apriori` function does not accept a pandas DataFrame (a 2D table). It expects a list of transactions, where each transaction is itself a list of item names given as strings.

In [19]:
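transactions = []
n_rows, n_cols = dataset.shape  # 7501 transactions, 20 columns
for row in range(n_rows):
    # for each row, append every column value to the list as a string;
    # empty cells (NaN) become the string 'nan'
    transactions.append([str(dataset.values[row, col]) for col in range(n_cols)])

# show the first five transactions
for obj in transactions[:5]:
    print(obj)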

['shrimp', 'almonds', 'avocado', 'vegetables mix', 'green grapes', 'whole weat flour', 'yams', 'cottage cheese', 'energy drink', 'tomato juice', 'low fat yogurt', 'green tea', 'honey', 'salad', 'mineral water', 'salmon', 'antioxydant juice', 'frozen smoothie', 'spinach', 'olive oil']
['burgers', 'meatballs', 'eggs', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']
['chutney', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']
['turkey', 'avocado', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']
['mineral water', 'milk', 'energy bar', 'whole wheat rice', 'green tea', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']
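
The printed rows show that empty cells end up in each transaction as the literal string `'nan'`. The notebook keeps them. If you would rather drop them before mining, a minimal sketch could look like the following; the helper name `drop_missing` is hypothetical, and the two sample rows are shortened copies of rows from the output above.

```python
def drop_missing(transactions, placeholder="nan"):
    """Return a copy of each transaction without the placeholder string."""
    return [[item for item in row if item != placeholder] for row in transactions]

# Two rows shortened from the printed output
sample = [
    ["burgers", "meatballs", "eggs", "nan", "nan"],
    ["chutney", "nan", "nan", "nan", "nan"],
]
print(drop_missing(sample))  # [['burgers', 'meatballs', 'eggs'], ['chutney']]
```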


### Training the Apriori Model

In [21]:
from apyori import apriori
rules = apriori(transactions = transactions,    # list of transactions (one list of item strings each)
                min_support = 0.003,    # minimum share of transactions containing the items; a product bought 3 times a day over 7 days gives 3*7/7501 ≈ 0.003
                min_confidence = 0.2,   # chosen by trial and error
                min_lift = 3,   # chosen by trial and error
                min_length = 2,     # minimum number of products in a rule (1 on each side); can be more
                max_length = 2)     # maximum number of products in a rule (1 on each side); can be more

results = list(rules)
results

[RelationRecord(items=frozenset({'light cream', 'chicken'}), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset({'light cream'}), items_add=frozenset({'chicken'}), confidence=0.29059829059829057, lift=4.84395061728395)]),
 RelationRecord(items=frozenset({'escalope', 'mushroom cream sauce'}), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset({'mushroom cream sauce'}), items_add=frozenset({'escalope'}), confidence=0.3006993006993007, lift=3.790832696715049)]),
 RelationRecord(items=frozenset({'pasta', 'escalope'}), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset({'pasta'}), items_add=frozenset({'escalope'}), confidence=0.3728813559322034, lift=4.700811850163794)]),
 RelationRecord(items=frozenset({'fromage blanc', 'honey'}), support=0.003332888948140248, ordered_statistics=[OrderedStatistic(items_base=frozenset({'fromage blanc'}), items_add=frozenset({'honey'}), confidence=0

### Organizing the results in a pandas DataFrame

In [26]:
def inspect(results):
    # each result is (items, support, ordered_statistics); the first entry of
    # ordered_statistics is (items_base, items_add, confidence, lift)
    lhs         = [tuple(result[2][0][0])[0] for result in results]  # product on the left-hand side
    rhs         = [tuple(result[2][0][1])[0] for result in results]  # product on the right-hand side
    supports    = [result[1] for result in results]
    confidences = [result[2][0][2] for result in results]
    lifts       = [result[2][0][3] for result in results]
    return list(zip(lhs, rhs, supports, confidences, lifts))

resultsinDataFrame = pd.DataFrame(inspect(results), columns = ['LHS', 'RHS', 'Support', 'Confidence', 'Lift'])
resultsinDataFrame.nlargest(n = 10, columns = 'Lift') # returns up to 10 rows with the largest Lift, in descending order

|   | LHS | RHS | Support | Confidence | Lift |
|---|-----|-----|---------|------------|------|
| 3 | fromage blanc | honey | 0.003333 | 0.245098 | 5.164271 |
| 0 | light cream | chicken | 0.004533 | 0.290598 | 4.843951 |
| 2 | pasta | escalope | 0.005866 | 0.372881 | 4.700812 |
| 8 | pasta | shrimp | 0.005066 | 0.322034 | 4.506672 |
| 7 | whole wheat pasta | olive oil | 0.007999 | 0.271493 | 4.12241 |
| 5 | tomato sauce | ground beef | 0.005333 | 0.377358 | 3.840659 |
| 1 | mushroom cream sauce | escalope | 0.005733 | 0.300699 | 3.790833 |
| 4 | herb & pepper | ground beef | 0.015998 | 0.32345 | 3.291994 |
| 6 | light cream | olive oil | 0.0032 | 0.205128 | 3.11471 |
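
The positional indexing in `inspect` (for example `result[2][0][0]`) is easier to follow once the fields are named. The sketch below is an illustration only: it defines stand-in namedtuples with the field names visible in the printed results rather than importing the real record types, and the helper `inspect_by_name` is hypothetical. The single record is copied from the first printed result.

```python
from collections import namedtuple

# Stand-in types mirroring the field names seen in the printed results
# (assumption: defined here for illustration instead of importing apyori)
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"]
)
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"]
)

# First record from the printed results
record = RelationRecord(
    items=frozenset({"light cream", "chicken"}),
    support=0.004532728969470737,
    ordered_statistics=[
        OrderedStatistic(
            items_base=frozenset({"light cream"}),
            items_add=frozenset({"chicken"}),
            confidence=0.29059829059829057,
            lift=4.84395061728395,
        )
    ],
)

def inspect_by_name(results):
    """Build (LHS, RHS, support, confidence, lift) rows using field names."""
    rows = []
    for result in results:
        stat = result.ordered_statistics[0]  # same object as result[2][0]
        rows.append((
            next(iter(stat.items_base)),  # single product on the left-hand side
            next(iter(stat.items_add)),   # single product on the right-hand side
            result.support,
            stat.confidence,
            stat.lift,
        ))
    return rows

print(inspect_by_name([record]))
```

Because each rule here has exactly one product on each side, taking the first element of `items_base` and `items_add` loses nothing; with longer rules the whole set would need to be kept.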
