### Overview
- Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It aims to identify strong rules using measures of interestingness.
 For example, the rule {onions, potatoes} => {burger} found in the sales data of a supermarket would indicate that a customer who buys onions and potatoes together is also likely to buy hamburger meat.

- Some useful concepts related to association rules:

![chart](https://www.researchgate.net/profile/Chulhyun_Kim/publication/228827521/figure/tbl1/AS:669547573026817@1536643989215/Measures-of-interestingness.png)

![nice example](https://www.researchgate.net/publication/321053532/figure/tbl1/AS:613924688896000@1523382460012/Support-confidence-and-lift-calculation-for-patient-with-diabetes.png)
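As a rough illustration of the measures shown in the tables above, the sketch below computes support, confidence, and lift in plain Python. The transactions are made-up toy data and the helper names (`support`, `confidence`, `lift`) are chosen here for illustration; they are not part of any library. It uses the standard definitions: support is the fraction of transactions containing an item set, confidence of X => Y is support(X and Y together) divided by support(X), and lift is that confidence divided by support(Y).

```python
# Toy transactions (hypothetical data, for illustration only)
transactions = [
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "milk"},
    {"onions", "potatoes"},
    {"milk", "burger"},
    {"milk", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the support of the consequent."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))


# Rule {onions, potatoes} => {burger}
X, Y = {"onions", "potatoes"}, {"burger"}
print("support   :", support(X | Y, transactions))
print("confidence:", confidence(X, Y, transactions))
print("lift      :", lift(X, Y, transactions))
```

A lift above 1 suggests the antecedent and consequent occur together more often than they would if they were independent; a lift close to 1 suggests little association.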




- Apriori is an algorithm for frequent item set mining and association rule learning over relational databases. It proceeds by identifying the frequent individual items in the database and extending them to larger and larger item sets as long as those item sets appear sufficiently often in the database.
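The level-by-level growth described above can be sketched in plain Python. This is a minimal, unoptimized illustration, not the implementation used by any library: the function name `frequent_itemsets` and the toy transactions are assumptions made for this example. Candidates of size k are built only from frequent item sets of size k-1, and a candidate is dropped as soon as one of its (k-1)-subsets is not frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all item sets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent individual items
    items = {item for t in transactions for item in t}
    current = {frozenset([i]): support(frozenset([i])) for i in items}
    current = {s: sup for s, sup in current.items() if sup >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge pairs of frequent (k-1)-item sets into k-item sets
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        # Count step: keep only candidates that appear often enough
        current = {c: support(c) for c in candidates}
        current = {c: sup for c, sup in current.items() if sup >= min_support}
    return frequent


# Toy transactions (hypothetical data, for illustration only)
toy = [
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "milk"},
    {"onions", "potatoes"},
    {"milk", "burger"},
    {"milk", "bread"},
]
for itemset, sup in sorted(frequent_itemsets(toy, 0.4).items(),
                           key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```

The prune step is what keeps the search tractable: an item set can only be frequent if all of its subsets are frequent, so most large candidates are discarded without ever scanning the transactions.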


   ****
        Simple procedure on how Apriori works:
        - Set a minimum support and a minimum confidence
        - Take all item sets in the transactions whose support is higher than the minimum support
        - Take all rules from these item sets whose confidence is higher than the minimum confidence
        - Sort the rules by decreasing lift
        
   ****
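The four steps above can be followed literally on a small example. The sketch below is a brute-force illustration in plain Python that enumerates every possible item set, which is only workable for toy data; a real Apriori implementation avoids this by growing item sets level by level. The function name `mine_rules`, the thresholds, and the transactions are assumptions made for this example.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence):
    """Return rules as (antecedent, consequent, support, confidence, lift)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Step 2: keep every item set whose support reaches the minimum
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            sup = support(frozenset(combo))
            if sup >= min_support:
                frequent[frozenset(combo)] = sup

    # Step 3: split each frequent item set into antecedent => consequent
    rules = []
    for itemset, sup in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                # Subsets of a frequent item set are frequent, so both lookups succeed
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    lift = conf / frequent[consequent]
                    rules.append((antecedent, consequent, sup, conf, lift))

    # Step 4: strongest associations first
    rules.sort(key=lambda r: r[4], reverse=True)
    return rules


# Step 1: thresholds (arbitrary values for this toy example)
toy = [
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "milk"},
    {"onions", "potatoes"},
    {"milk", "burger"},
    {"milk", "bread"},
]
rules = mine_rules(toy, min_support=0.4, min_confidence=0.6)
for antecedent, consequent, sup, conf, lift in rules[:5]:
    print(sorted(antecedent), "=>", sorted(consequent),
          f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```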




# Implementation using libraries:

Libraries Doc Links:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler.fit_transform

http://rasbt.github.io/mlxtend/
    
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html

https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.colors.ListedColormap.html

https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.axes.Axes.scatter.html

https://matplotlib.org/3.1.1/api/colors_api.html

# Importing the libraries



import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules 


# Loading the Data 
data = pd.read_csv('https://raw.githubusercontent.com/guipsamora/pandas_exercises/master/07_Visualization/Online_Retail/Online_Retail.csv',encoding='iso-8859-1') 
data.head() 

# Exploring the columns of the data 
data.columns 

# Exploring the different regions of transactions 
data.Country.unique() 

# Stripping extra spaces in the description 
data['Description'] = data['Description'].str.strip() 

# Dropping the rows without any invoice number 
data.dropna(axis = 0, subset =['InvoiceNo'], inplace = True) 
data['InvoiceNo'] = data['InvoiceNo'].astype('str') 

# Dropping all cancelled transactions (invoice numbers containing 'C')
data = data[~data['InvoiceNo'].str.contains('C')] 

# Transactions done in France 
basket_France = (
    data[data['Country'] == "France"]
    .groupby(['InvoiceNo', 'Description'])['Quantity']
    .sum()
    .unstack()
    .reset_index()
    .fillna(0)
    .set_index('InvoiceNo')
)



# Defining the hot encoding function to make the data suitable
# for the concerned libraries: 1 if the item is in the basket, else 0
def hot_encode(x):
    # Any positive quantity means the item was bought
    return 1 if x > 0 else 0

# Encoding the datasets
# Note: on pandas >= 2.1, applymap is deprecated in favor of DataFrame.map
basket_encoded = basket_France.applymap(hot_encode)
basket_France = basket_encoded


# Building the model 
frq_items = apriori(basket_France, min_support = 0.05, use_colnames = True) 

# Collecting the inferred rules in a dataframe 
rules = association_rules(frq_items, metric ="lift", min_threshold = 1) 
rules = rules.sort_values(['confidence', 'lift'], ascending =[False, False]) 
print(rules.head()) 




                                           antecedents  \
44                        (JUMBO BAG WOODLAND ANIMALS)   
259  (RED TOADSTOOL LED NIGHT LIGHT, PLASTERS IN TI...   
271  (RED TOADSTOOL LED NIGHT LIGHT, PLASTERS IN TI...   
300  (SET/20 RED RETROSPOT PAPER NAPKINS, SET/6 RED...   
301  (SET/20 RED RETROSPOT PAPER NAPKINS, SET/6 RED...   

                         consequents  antecedent support  consequent support  \
44                         (POSTAGE)            0.076531            0.765306   
259                        (POSTAGE)            0.051020            0.765306   
271                        (POSTAGE)            0.053571            0.765306   
300  (SET/6 RED SPOTTY PAPER PLATES)            0.102041            0.127551   
301    (SET/6 RED SPOTTY PAPER CUPS)            0.102041            0.137755   

      support  confidence      lift  leverage  conviction  
44   0.076531       1.000  1.306667  0.017961         inf  
259  0.051020       1.000  1.306667  0.011974     