## Apriori for Association Rule Learning

### Index 
- [Equation and Method](#equation)
- [Pre processing](#preprocessing)
- [Building the model](#building)
- [Result](#result)

In [1]:
# importing some basic libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

<a id='equation'></a>
### Equation and Method

Apriori is a method for association rule learning: it estimates how strongly items are associated, using ratios of counts much like the conditional probabilities in Bayes' theorem. The algorithm is defined in terms of three measures:

- $Support$
- $Confidence$
- $Lift$

###### Support ($M$)
Support is the fraction of transactions that contain the item (or itemset) $M$, out of all transactions.

### $support(M) = \frac{\text{transactions containing } M}{\text{total number of transactions}}$
In the lift formula below, support is computed for the consequent $M2$.
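As a minimal sketch of this definition, the snippet below counts support on a handful of toy transactions. The data and the helper name `support` are made up for illustration and are not part of the dataset or library used later.

```python
# Toy transactions (hypothetical data, not the dataset used later)
transactions = [
    ["milk", "bread"],
    ["milk", "eggs"],
    ["bread", "eggs", "milk"],
    ["eggs"],
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset.issubset(t))
    return hits / len(transactions)


print(support({"milk"}, transactions))           # 3 of 4 transactions -> 0.75
print(support({"milk", "bread"}, transactions))  # 2 of 4 transactions -> 0.5
```

The same helper works for a single item or for a set of items, which is what the later steps of the algorithm need.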

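###### Confidence ($M1 \rightarrow M2$)
Confidence is the fraction of transactions containing both $M1$ and $M2$, out of the transactions containing $M1$.

### $confidence(M1 \rightarrow M2) = \frac{\text{transactions containing both } M1 \text{ and } M2}{\text{transactions containing } M1}$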
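A small sketch of the confidence ratio, again on made-up toy transactions. The helper names `count` and `confidence` are hypothetical; the point is that the measure is directional, so swapping $M1$ and $M2$ generally changes the value.

```python
# Toy transactions as sets (hypothetical data for illustration only)
transactions = [
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs", "milk"},
    {"eggs"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(m1, m2, transactions):
    """Share of the transactions containing m1 that also contain m2."""
    base = count(m1, transactions)
    both = count(m1 | m2, transactions)
    return both / base if base else 0.0


# Every basket with bread also has milk, but not the other way round.
print(confidence({"bread"}, {"milk"}, transactions))  # 2 / 2
print(confidence({"milk"}, {"bread"}, transactions))  # 2 / 3
```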

###### Lift ($M1 \rightarrow M2$)
Lift measures how much the confidence of a rule improves on the baseline support of $M2$. If $M2$ has high support, even a high confidence gives a low lift (close to 1): $M2$ is simply popular with everyone, and there is no specific association with $M1$. If the confidence is high while the support of $M2$ is low, the lift is well above 1, which suggests a strong association between the two items.

### $lift(M1 \rightarrow M2) = \frac{confidence(M1 \rightarrow M2)}{support(M2)}$
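A worked example with made-up numbers may help. Suppose there are 1000 transactions, of which 200 contain $M1$, 100 contain $M2$, and 50 contain both:

$$support(M2) = \frac{100}{1000} = 0.1$$

$$confidence(M1 \rightarrow M2) = \frac{50}{200} = 0.25$$

$$lift(M1 \rightarrow M2) = \frac{0.25}{0.1} = 2.5$$

Under these assumed counts, a customer who buys $M1$ is 2.5 times as likely to buy $M2$ as a customer picked at random. A lift of 1 would mean the two items are bought independently of each other.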


#### Algorithm

1. Set a minimum support and a minimum confidence.
2. Take all the itemsets (subsets of transactions) whose support is higher than the minimum support.
3. Take all the rules over these itemsets whose confidence is higher than the minimum confidence.
4. Sort the rules by decreasing lift.
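The four steps above can be sketched in plain Python. This is a simplified illustration, not the implementation used by any particular library: the function names, the toy data, and the use of `>=` for the thresholds are all assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 2: return {itemset: support} for itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Start with single items, then grow the itemsets one item at a time.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {s: support(s) for s in singles}
    current = {s: v for s, v in current.items() if v >= min_support}
    frequent, k = {}, 1
    while current:
        frequent.update(current)
        k += 1
        # Join: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
    return frequent


def association_rules(frequent, min_confidence):
    """Steps 3 and 4: build rules, filter by confidence, sort by lift."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for base in map(frozenset, combinations(itemset, r)):
                add = itemset - base
                conf = sup / frequent[base]
                if conf >= min_confidence:
                    rules.append((base, add, sup, conf, conf / frequent[add]))
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


# Step 1: thresholds chosen by hand for this toy data (hypothetical values)
toy = [["milk", "bread"], ["milk", "eggs"], ["bread", "eggs", "milk"], ["eggs"]]
toy_rules = association_rules(frequent_itemsets(toy, 0.5), 0.7)
for base, add, sup, conf, lift in toy_rules:
    print(sorted(base), "->", sorted(add), round(lift, 2))  # ['bread'] -> ['milk'] 1.33
```

The prune step is what gives the algorithm its name: an itemset can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset are discarded before their support is counted.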

This method is effective, but it is not very efficient: the number of candidate itemsets to count grows quickly with the number of distinct items.

The algorithm in detail is given below:

<img src='https://wikimedia.org/api/rest_v1/media/math/render/svg/8eed75c18217fe2f9b15f266c40b369ce038164d' style="margin:0px;">

<a id='preprocessing'></a>
### Pre processing

In [2]:
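dataset = pd.read_csv('Market_Basket_Optimisation.csv', header=None)  # no header row: every line is a transaction
dataset.head()

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | shrimp | almonds | avocado | vegetables mix | green grapes | whole weat flour | yams | cottage cheese | energy drink | tomato juice | low fat yogurt | green tea | honey | salad | mineral water | salmon | antioxydant juice | frozen smoothie | spinach | olive oil |
| 1 | burgers | meatballs | eggs | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | chutney | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | turkey | avocado | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | mineral water | milk | energy bar | whole wheat rice | green tea | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |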


The library we import below expects the transactions as a list of lists, one inner list per transaction, so we convert the DataFrame.

In [10]:
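# One list per transaction, skipping the NaN padding of shorter baskets
transactions = [[item for item in row if pd.notna(item)] for row in dataset.values.tolist()]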

<a id='building'></a>
### Building the model

In [13]:
from apyori import apriori

In [14]:
# Keep only the rules that meet these minimum support, confidence, and lift thresholds
rules = apriori(transactions, min_support=0.003, min_confidence=0.2, min_lift=3, min_length=2)

<a id='result'></a>
### Result

From the model that we built, we get some insight into which items customers most commonly purchase together.
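One possible way to read the result is to flatten it into rows sorted by lift. The sketch below assumes each record exposes `items`, `support`, and `ordered_statistics`, and that each statistic exposes `items_base`, `items_add`, `confidence`, and `lift`, which is how apyori's results are commonly read. The helper name `flatten_rules` and the demo records with their numbers are invented so the example runs without the dataset.

```python
from collections import namedtuple


def flatten_rules(records):
    """Turn apyori-style records into (base, add, support, confidence, lift) rows."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append((
                sorted(stat.items_base),
                sorted(stat.items_add),
                record.support,
                stat.confidence,
                stat.lift,
            ))
    # Highest lift first, matching the last step of the algorithm
    return sorted(rows, key=lambda row: row[4], reverse=True)


# Stand-in records with made-up numbers, used only to exercise the helper.
Stat = namedtuple("Stat", "items_base items_add confidence lift")
Record = namedtuple("Record", "items support ordered_statistics")
demo = [
    Record(frozenset({"pasta", "sauce"}), 0.05,
           [Stat(frozenset({"pasta"}), frozenset({"sauce"}), 0.4, 4.0)]),
    Record(frozenset({"tea", "honey"}), 0.02,
           [Stat(frozenset({"tea"}), frozenset({"honey"}), 0.3, 6.0)]),
]

rows = flatten_rules(demo)
for base, add, support, confidence, lift in rows:
    print(base, "->", add, "support:", support, "confidence:", confidence, "lift:", lift)
```

With the real model, `flatten_rules(list(rules))` would be the equivalent call. Converting to a list first is a cautious choice, since the object returned by `apriori` may only be iterable once.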