
#Apriori Algorithm for Market Basket Analysis

##Apriori algorithm
Apriori is an algorithm for frequent item set mining and association rule learning over relational databases. It first identifies the individual items that occur frequently in the database, then extends them to larger and larger item sets for as long as those item sets still meet a minimum support threshold, that is, appear in at least a given fraction of the transactions. An item set that falls below the threshold is never extended, because no superset of an infrequent item set can be frequent. The frequent item sets found by Apriori are then used to derive association rules that highlight general trends in the database, which is the basis of applications such as market basket analysis.
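
To make the level-wise search concrete, here is a minimal pure-Python sketch of the idea. It is not the mlxtend implementation used later in this notebook; the function name `apriori_sketch` and the toy transactions are made up for illustration.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item of the itemset.
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: frequent individual items.
    items = {item for basket in baskets for item in basket}
    current = {frozenset([item]): support(frozenset([item])) for item in items}
    current = {s: sup for s, sup in current.items() if sup >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: sup for c, sup in current.items() if sup >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data, not taken from the notebook's dataset.
toy_transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]
frequent_itemsets = apriori_sketch(toy_transactions, min_support=0.5)
```

With a threshold of 0.5, the pair butter and milk appears in only one of four baskets and is dropped, so the three-item set is pruned without ever being counted.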

##Installing necessary libraries

In [1]:
!pip install mlxtend --upgrade



##Importing necessary libraries

In [2]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori,association_rules

##Mount Google Drive to access the dataset

In [3]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


##Load the training data and convert it to a one-hot encoded transaction table

In [4]:
# Load the training data, which has an order_id and a product_name column.
data_frame = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/TRAIN-ARULES.csv')
# Remove commas from the product names.
data_frame.product_name = ["".join(i.split(",")) for i in data_frame.product_name]
# Build one list of product names per order.
orders = data_frame.order_id.unique()
transactions = [data_frame[data_frame.order_id == i]["product_name"].tolist() for i in orders]
# One-hot encode: one boolean column per product, one row per order.
trans_encoder = TransactionEncoder().fit(transactions)
transactions = trans_encoder.transform(transactions)
transactions = pd.DataFrame(transactions, columns=trans_encoder.columns_)
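
The encoding step can be hard to picture, so here is a small pure-Python sketch of what a one-hot transaction table looks like. It only imitates the result of `TransactionEncoder` under the assumption that columns are the sorted unique items; `one_hot_encode` and the toy baskets are hypothetical names and data.

```python
def one_hot_encode(transactions):
    """Return (columns, rows): sorted unique items and one boolean row per basket."""
    columns = sorted({item for basket in transactions for item in basket})
    rows = []
    for basket in transactions:
        present = set(basket)
        # True where the basket contains the column's item, False otherwise.
        rows.append([item in present for item in columns])
    return columns, rows


# Hypothetical toy baskets, not taken from the notebook's dataset.
toy_transactions = [["milk", "bread"], ["bread", "butter"]]
columns, rows = one_hot_encode(toy_transactions)
```

Each row answers "does this order contain this product?" for every product, which is the input shape that `apriori` expects.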

##Compute the frequent itemsets with Apriori, then derive the association rules

In [5]:
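# Keep itemsets that occur in at least 0.45% of the training orders.
freq_itemset_using_apriori = apriori(transactions, min_support=0.0045, use_colnames=True)
# Derive rules from those itemsets; min_threshold=0.8 is the mlxtend default, written out here.
association_rules_using_apriori = association_rules(freq_itemset_using_apriori, metric="confidence", min_threshold=0.8)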
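
Since the rules are filtered on confidence (mlxtend's documented default for `min_threshold` is 0.8), it may help to see how support, confidence and lift are computed for a single rule. The sketch below uses the standard definitions on hypothetical toy data; `rule_metrics` is an illustrative helper, not part of mlxtend.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute support, confidence and lift for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur in at least one basket.
    """
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    antecedent, consequent = set(antecedent), set(consequent)

    support_a = sum(antecedent <= b for b in baskets) / n
    support_c = sum(consequent <= b for b in baskets) / n
    support_ac = sum((antecedent | consequent) <= b for b in baskets) / n

    confidence = support_ac / support_a  # P(consequent | antecedent)
    lift = confidence / support_c        # > 1 means a positive association
    return {"support": support_ac, "confidence": confidence, "lift": lift}


# Hypothetical toy data, not taken from the notebook's dataset.
toy_transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]
metrics = rule_metrics(toy_transactions, ["butter"], ["bread"])
```

Here every basket with butter also contains bread, so the rule butter -> bread has confidence 1.0 and would pass a 0.8 confidence threshold.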

##Print the association rules found by Apriori

In [6]:
print("Association Rules using Apriori Method:- \n")
for item in association_rules_using_apriori.iterrows():
  print(f"Rule {item[0] + 1}: {tuple(item[1].antecedents)} -> {tuple(item[1].consequents)}")

Association Rules using Apriori Method:- 

Rule 1: ('100% Premium Select Not From Concentrate Pure Prune Juice',) -> ('Natural Artesian Bottled Water',)
Rule 2: ('Jaipur Karhi Organic Potato Dumplings in Spicy Buttermilk',) -> ('1500 Pale Ale',)
Rule 3: ('80  Vodka Holiday Edition',) -> ('Jalapeno Pepper',)
Rule 4: ('Jalapeno Pepper',) -> ('80  Vodka Holiday Edition',)
Rule 5: ('Mixed Vegetables',) -> ('80  Vodka Holiday Edition',)
Rule 6: ("Annie's Bunny Fruit Snacks Variety",) -> ('Crunch Chocolate Peanut Butter Granola Bar',)
Rule 7: ("Annie's Bunny Fruit Snacks Variety",) -> ('Dark & Mint Filled Chocolate Squares',)
Rule 8: ("Annie's Bunny Fruit Snacks Variety",) -> ('Lemon Sparkling Water',)
Rule 9: ("Annie's Bunny Fruit Snacks Variety",) -> ('Orange Sparkling Water',)
Rule 10: ("Annie's Bunny Fruit Snacks Variety",) -> ('Organic Heritage Flakes Cereal',)
Rule 11: ('Antimo Caputo Flour',) -> ('Authentic French Brioche',)
Rule 12: ('Hazelnut Bite Size Wafer Cookies',) -> ('Antimo C

##Load the test dataset

In [7]:
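data_frame_test = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/testarules.csv')
# Strip commas from every item name, as was done for the training data; empty (NaN) cells are left untouched.
strip_commas = lambda value: value.replace(",", "") if isinstance(value, str) else value
data_frame_test = data_frame_test.apply(lambda column: column.map(strip_commas))
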

##Check the test dataset

In [8]:
data_frame_test

Unnamed: 0,Item1,Item2,Item3,Item4,Item5
0,Dark Chocolate Minis,Organic Pink Lemonade Bunny Fruit Snacks,Peach-Pear Sparkling Water,,


##Let's analyse the support of the first item based on the training dataset

In [9]:
data_frame_test['Item1']

0    Dark Chocolate Minis
Name: Item1, dtype: object

In [10]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Dark Chocolate Minis'} ]

Unnamed: 0,support,itemsets
79,0.021157,(Dark Chocolate Minis)


##Let's analyse the support of the second item based on the training dataset

In [11]:
data_frame_test['Item2']

0    Organic Pink Lemonade Bunny Fruit Snacks
Name: Item2, dtype: object

In [12]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Organic Pink Lemonade Bunny Fruit Snacks'} ]

Unnamed: 0,support,itemsets
271,0.020451,(Organic Pink Lemonade Bunny Fruit Snacks)


##Let's analyse the support of the third item based on the training dataset

In [13]:
data_frame_test['Item3']

0    Peach-Pear Sparkling Water
Name: Item3, dtype: object

In [14]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Peach-Pear Sparkling Water'} ]

Unnamed: 0,support,itemsets
333,0.016925,(Peach-Pear Sparkling Water)
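
The three lookups above follow the same pattern, so they could be folded into one loop. The sketch below uses a plain dictionary as a stand-in for the `freq_itemset_using_apriori` table, filled with the three support values shown in the outputs above; `lookup_support` is a hypothetical helper name.

```python
# Supports copied from the three lookup outputs above (training data).
single_item_support = {
    frozenset({"Dark Chocolate Minis"}): 0.021157,
    frozenset({"Organic Pink Lemonade Bunny Fruit Snacks"}): 0.020451,
    frozenset({"Peach-Pear Sparkling Water"}): 0.016925,
}

# Items of the single test basket (Item1 to Item3; Item4 and Item5 are empty).
test_basket = [
    "Dark Chocolate Minis",
    "Organic Pink Lemonade Bunny Fruit Snacks",
    "Peach-Pear Sparkling Water",
]


def lookup_support(item, supports):
    """Return the support of a single item, or None if it is not in the table."""
    return supports.get(frozenset({item}))


basket_support = {item: lookup_support(item, single_item_support) for item in test_basket}
```

A support of 0.021157 means the item appears in roughly 2.1% of the training orders, well above the 0.0045 threshold used when mining the frequent itemsets.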
