<a href="https://colab.research.google.com/github/PriyankaMath/256_Apriori_Algorithm/blob/main/256_MarketBasket_Apriori_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Apriori Algorithm for Market Basket Analysis

##APRIORI ALGORITHM
Apriori is an algorithm for frequent itemset mining and association rule learning over relational databases. It first identifies the individual items that occur frequently in the database, then extends them to progressively larger itemsets for as long as those itemsets still appear often enough, that is, as long as they meet a minimum support threshold. This works because every subset of a frequent itemset must itself be frequent, so any candidate that contains an infrequent subset can be discarded without counting it. The frequent itemsets found by Apriori can then be used to derive association rules that highlight general trends in the database, with applications in domains such as market basket analysis.
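
The level-by-level search described above can be sketched in plain Python. This is only an illustrative sketch, not the `mlxtend` implementation used later in this notebook: the function name `apriori_sketch` and the toy transactions are hypothetical, and the code favours readability over speed.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {itemset: support} for every itemset that meets min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item of the itemset.
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: frequent single items.
    items = {item for basket in baskets for item in basket}
    current = {frozenset([item]) for item in items}
    current = {c for c in current if support(c) >= min_support}
    frequent = {c: support(c) for c in current}

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent


# Hypothetical toy data, unrelated to the dataset used in this notebook.
toy_transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
frequent = apriori_sketch(toy_transactions, min_support=0.5)
for itemset, value in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

With this toy data every single item and every pair meets the 0.5 threshold, while the three-item set appears in only one of four baskets and is dropped.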

##Installing necessary libraries

In [1]:
!pip install mlxtend --upgrade



##Importing necessary libraries

In [2]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori,association_rules

##Mount Google Drive to access the dataset

In [3]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


##Load the data and convert it to a one-hot encoded transaction table

In [5]:
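# Load the training data (the code below relies on its order_id and product_name columns)
data_frame = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/TRAIN-ARULES.csv')
# Remove commas from the product names
data_frame.product_name = ["".join(i.split(",")) for i in data_frame.product_name]
# Collect the product names of each order into one list per order
orders = data_frame.order_id.unique()
transactions = [data_frame[data_frame.order_id == i]["product_name"].tolist() for i in orders]
# One-hot encode the transactions: one boolean column per product, one row per order
trans_encoder = TransactionEncoder().fit(transactions)
transactions = trans_encoder.transform(transactions)
transactions = pd.DataFrame(transactions,columns=trans_encoder.columns_)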
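
To make the target format concrete, here is a plain-Python sketch of the kind of boolean table the encoding step produces. It is only a rough stand-in for `TransactionEncoder`, which returns a NumPy boolean array rather than nested lists, and the toy transactions are hypothetical.

```python
# Hypothetical toy transactions, unrelated to TRAIN-ARULES.csv.
toy_transactions = [
    ["bread", "milk"],
    ["butter"],
    ["bread", "butter", "milk"],
]

# One column per distinct item, in sorted order.
columns = sorted({item for t in toy_transactions for item in t})

# One row per transaction: True where the transaction contains the column's item.
one_hot = [[item in t for item in columns] for t in toy_transactions]

print(columns)
for row in one_hot:
    print(row)
```

Each row corresponds to one order and each column to one product, which is the layout `apriori` expects as input.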

##Compute the frequent itemsets and the association rules using Apriori

In [6]:
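# Keep the itemsets that appear in at least 1% of the transactions
freq_itemset_using_apriori = apriori(transactions, min_support = 0.01, use_colnames = True)
# Derive rules filtered on confidence (min_threshold is left at its default of 0.8)
association_rules_using_apriori = association_rules(freq_itemset_using_apriori, metric = "confidence")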
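
For reference, the support, confidence and lift of a single rule can be worked out by hand. The sketch below uses the standard definitions on hypothetical toy data; the helper name `rule_metrics` is an assumption for illustration and is not part of `mlxtend`.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence and lift for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    a, c = set(antecedent), set(consequent)

    support_a = sum(a <= b for b in baskets) / n          # P(A)
    support_c = sum(c <= b for b in baskets) / n          # P(C)
    support_ac = sum((a | c) <= b for b in baskets) / n   # P(A and C)

    # Assumes the antecedent and consequent each occur at least once.
    confidence = support_ac / support_a                   # P(C | A)
    lift = confidence / support_c                         # P(C | A) / P(C)
    return {"support": support_ac, "confidence": confidence, "lift": lift}


# Hypothetical toy data, unrelated to the dataset used in this notebook.
toy_transactions = [
    ["tea", "scones"],
    ["tea", "scones", "jam"],
    ["tea"],
    ["jam"],
]
metrics = rule_metrics(toy_transactions, ["tea"], ["scones"])
print(metrics)
```

Here the rule tea -> scones has a confidence of 2/3, so a confidence threshold of 0.8 would filter it out even though its lift is above 1.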

##Print the association rules generated by Apriori

In [7]:
print("Association Rules using Apriori Method:- \n")
for index, rule in association_rules_using_apriori.iterrows():
  print(f"Rule {index + 1}: {tuple(rule.antecedents)} -> {tuple(rule.consequents)}")

Association Rules using Apriori Method:- 

Rule 1: ('Authentic French Brioche',) -> ('Petit Suisse Fruit',)
Rule 2: ('Petit Suisse Fruit',) -> ('Authentic French Brioche',)
Rule 3: ('Cran-Apple Juice Drink',) -> ('Oatmeal Crème Pies',)
Rule 4: ('Oatmeal Crème Pies',) -> ('Cran-Apple Juice Drink',)
Rule 5: ('Sparkling Water Natural Mango Essenced',) -> ('Dark Chocolate Minis',)
Rule 6: ('Grade A Extra Large Eggs',) -> ('Natural Lime Flavor Sparkling Mineral Water',)
Rule 7: ('Lemon Sparkling Water',) -> ('Orange Sparkling Water',)
Rule 8: ('Light Oaked Chardonnay',) -> ('Natural Lime Flavor Sparkling Mineral Water',)
Rule 9: ('Reduced Fat Milk',) -> ('Natural Artesian Bottled Water',)
Rule 10: ('Organic Graham Crunch Cereal',) -> ('Organic Heritage Flakes Cereal',)
Rule 11: ('Sparkling Water Natural Mango Essenced',) -> ('Organic Pink Lemonade Bunny Fruit Snacks',)
Rule 12: ('Sparkling Water Natural Mango Essenced',) -> ('Peach-Pear Sparkling Water',)
Rule 13: ('Zero Calorie Cola',) -> 

##Load the test dataset

In [8]:
data_frame_test = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/testarules.csv')
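# Remove commas from every item name so the names match the training data;
# empty (NaN) cells are left untouched
data_frame_test = data_frame_test.apply(
    lambda col: col.map(lambda v: v.replace(",", "") if isinstance(v, str) else v)
)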

  


##Check the test dataset

In [9]:
data_frame_test

Unnamed: 0,Item1,Item2,Item3,Item4,Item5
0,Dark Chocolate Minis,Organic Pink Lemonade Bunny Fruit Snacks,Peach-Pear Sparkling Water,,


##Let's analyse the support for the first item based on the training dataset

In [10]:
data_frame_test['Item1']

0    Dark Chocolate Minis
Name: Item1, dtype: object

In [11]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Dark Chocolate Minis'} ]

Unnamed: 0,support,itemsets
26,0.021157,(Dark Chocolate Minis)


##Let's analyse the support for the second item based on the training dataset

In [12]:
data_frame_test['Item2']

0    Organic Pink Lemonade Bunny Fruit Snacks
Name: Item2, dtype: object

In [13]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Organic Pink Lemonade Bunny Fruit Snacks'} ]

Unnamed: 0,support,itemsets
97,0.020451,(Organic Pink Lemonade Bunny Fruit Snacks)


##Let's analyse the support for the third item based on the training dataset

In [14]:
data_frame_test['Item3']

0    Peach-Pear Sparkling Water
Name: Item3, dtype: object

In [15]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Peach-Pear Sparkling Water'} ]

Unnamed: 0,support,itemsets
116,0.016925,(Peach-Pear Sparkling Water)
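
The three lookups above follow the same pattern, so they could be folded into one loop. The sketch below is one possible way to do that, using a plain dictionary as a stand-in for the frequent-itemset table. The helper name `lookup_support` is hypothetical, and the three supports are copied from the outputs shown above.

```python
# Stand-in for the frequent-itemset table, holding only the three rows shown above.
# In the notebook a mapping like this could be built with:
#   dict(zip(freq_itemset_using_apriori["itemsets"], freq_itemset_using_apriori["support"]))
support_by_itemset = {
    frozenset({"Dark Chocolate Minis"}): 0.021157,
    frozenset({"Organic Pink Lemonade Bunny Fruit Snacks"}): 0.020451,
    frozenset({"Peach-Pear Sparkling Water"}): 0.016925,
}

# Items of the single test basket; the empty Item4 and Item5 cells are left out.
test_items = [
    "Dark Chocolate Minis",
    "Organic Pink Lemonade Bunny Fruit Snacks",
    "Peach-Pear Sparkling Water",
]


def lookup_support(items, table):
    """Map each item to its single-item support, or None if it is not in the table."""
    return {item: table.get(frozenset({item})) for item in items}


item_supports = lookup_support(test_items, support_by_itemset)
for item, support in item_supports.items():
    print(f"{item}: {support}")
```

An item that did not reach the minimum support in the training data would simply map to `None` instead of raising an error.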
