<a href="https://colab.research.google.com/github/RISHIshrivas/MARKET-BASKET-ANALYSIS-/blob/main/Market_Basket_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MARKET BASKET ANALYSIS
Market basket analysis (MBA) is a data mining method, developed for the retail industry, that finds products that are frequently bought together.
Although it is usually applied to retail transactions and consumer behavior, its concepts can be extended to Natural Language Processing (NLP) to analyze trends in text data.
In NLP, market basket analysis can reveal recurring patterns of words, phrases, or concepts in a collection of texts or documents.

In [None]:
pip install apyori

Collecting apyori
  Downloading apyori-1.1.2.tar.gz (8.6 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: apyori
  Building wheel for apyori (setup.py) ... done
  Created wheel for apyori: filename=apyori-1.1.2-py3-none-any.whl size=5954 sha256=97d872a492e4713d8e3217cabafe1d406c5e8f75408b3857ce2d214c128adff7
  Stored in directory: /root/.cache/pip/wheels/c4/1a/79/20f55c470a50bb3702a8cb7c94d8ada15573538c7f4baebe2d
Successfully built apyori
Installing collected packages: apyori
Successfully installed apyori-1.1.2




> A prominent technique in market basket analysis is the Apriori algorithm, a widely used association rule mining algorithm. It is generally regarded as more efficient than the earlier AIS and SETM algorithms. It finds frequent itemsets in the transactions and derives association rules between the items, using the measures of support and confidence. Its main drawback is frequent itemset generation: the database has to be scanned again for every itemset size, so on a large dataset this phase is computationally expensive and slows the analysis.
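
The level-wise search described above can be sketched in plain Python. This is a minimal illustration on a made-up toy dataset, not the implementation used by apyori or mlxtend; the function name `frequent_itemsets` and the sample baskets are assumptions introduced here.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: size-k candidates are built from frequent size-(k-1) sets."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1: every distinct item is a candidate itemset.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        # One pass over all baskets per level: this is the repeated database scan.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-sets into (k+1)-candidates, then prune every candidate
        # that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical toy baskets, not taken from store_data.csv.
toy_baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
    ["bread", "milk", "eggs"],
]
toy_frequent = frequent_itemsets(toy_baskets, min_support=0.4)
for itemset, sup in sorted(toy_frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```

The pruning step is what keeps the candidate set small: `{bread, milk, butter}` is never counted because its subset `{milk, butter}` is already known to be infrequent.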






In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules
from pandas.plotting import parallel_coordinates
import warnings
warnings.filterwarnings('ignore')
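# This import rebinds the name `apriori` to apyori's implementation,
# so later calls to apriori() use apyori rather than mlxtend.
from apyori import apriori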

In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
Rishi = pd.read_csv("store_data.csv",header=None)

After reading the dataset, we need the list of items in each transaction. Two nested loops build it: the outer loop runs over the transactions (rows) and the inner loop runs over the columns of each transaction. The resulting list of lists serves as the training set from which the association rules are generated.

In [None]:
print(Rishi)

                 0                  1            2                 3   \
0            shrimp            almonds      avocado    vegetables mix   
1           burgers          meatballs         eggs               NaN   
2           chutney                NaN          NaN               NaN   
3            turkey            avocado          NaN               NaN   
4     mineral water               milk   energy bar  whole wheat rice   
...             ...                ...          ...               ...   
7496         butter         light mayo  fresh bread               NaN   
7497        burgers  frozen vegetables         eggs      french fries   
7498        chicken                NaN          NaN               NaN   
7499       escalope          green tea          NaN               NaN   
7500           eggs    frozen smoothie  yogurt cake    low fat yogurt   

                4                 5     6               7             8   \
0     green grapes  whole weat flour  yams  cot

The next cell builds that list of transactions. The Apriori algorithm is then run on it to extract the association rules. The minimum support is set to 0.0045 and the minimum confidence to 0.2. The minimum lift is set to 3, and the minimum length is set to 2 because a rule needs at least two associated items.
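
To make the three thresholds concrete, the sketch below computes support, confidence, and lift for a rule by hand and checks them against the threshold values used in this notebook. The helper names (`support`, `rule_metrics`, `passes_thresholds`) and the toy baskets are hypothetical and only serve the illustration; the numbers are not results from `store_data.csv`.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t)) / len(transactions)


def rule_metrics(transactions, base, add):
    """Return (support, confidence, lift) of the rule base -> add."""
    s_both = support(transactions, set(base) | set(add))
    conf = s_both / support(transactions, base)   # P(add | base)
    lift = conf / support(transactions, add)      # confidence relative to P(add)
    return s_both, conf, lift


def passes_thresholds(metrics, min_support=0.0045, min_confidence=0.2, min_lift=3):
    """Apply the same three thresholds as the notebook's apriori call."""
    s, c, lf = metrics
    return s >= min_support and c >= min_confidence and lf >= min_lift


# Hypothetical toy baskets.
toy = [
    ["pasta", "shrimp"],
    ["pasta", "shrimp", "water"],
    ["water", "eggs"],
    ["water"],
    ["water", "eggs"],
    ["eggs"],
    ["water", "milk"],
    ["milk"],
    ["water", "eggs", "milk"],
    ["water"],
]

strong = rule_metrics(toy, ["pasta"], ["shrimp"])
weak = rule_metrics(toy, ["water"], ["eggs"])
print("pasta -> shrimp:", strong, passes_thresholds(strong))
print("water -> eggs:  ", weak, passes_thresholds(weak))
```

A lift near 1 means the two sides occur together about as often as chance would predict, which is why `water -> eggs` is rejected even though its support and confidence clear their thresholds.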

In [None]:
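# One list of item names per transaction. Rows 1..7500 are used (row 0 is skipped),
# and str() turns empty cells (NaN) into the item 'nan'.
l = []
for i in range(1, 7501):
    l.append([str(Rishi.values[i, j]) for j in range(0, 20)])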
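
Because the loop above converts every cell with `str()`, empty cells end up in the transactions as the item `'nan'`, and starting the range at 1 leaves out row 0. A possible alternative is sketched below, assuming the rows are available as plain Python lists (for example via `Rishi.values.tolist()`). The helpers `is_missing` and `rows_to_transactions` are hypothetical names, and the sample rows are copied from the first column pairs of the printed DataFrame with the padding shortened.

```python
import math


def is_missing(value):
    """True for None or a float NaN, which is how pandas marks empty cells."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def rows_to_transactions(rows):
    """Turn NaN-padded rows into lists of item names, skipping missing cells."""
    transactions = []
    for row in rows:
        items = [str(cell) for cell in row if not is_missing(cell)]
        if items:  # ignore rows that contain no items at all
            transactions.append(items)
    return transactions


nan = float("nan")
sample_rows = [
    ["burgers", "meatballs", "eggs", nan],
    ["chutney", nan, nan, nan],
    ["turkey", "avocado", nan, nan],
]
clean_transactions = rows_to_transactions(sample_rows)
print(clean_transactions)
```

With the notebook's data this would be called as `rows_to_transactions(Rishi.values.tolist())`. Note that the support values would then differ slightly from the output recorded in this notebook, since one more transaction is counted and `'nan'` no longer appears as an item.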

The cell below applies the Apriori algorithm to generate the association rules between the items. The cell after it prints the itemset behind each rule.

In [None]:
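# Apply the Apriori algorithm (apyori). As far as we can tell, apyori reads min_support,
# min_confidence, min_lift and max_length, so min_length likely has no effect here.
rules = apriori(l, min_support=0.0045, min_confidence=0.2, min_lift=3, min_length=2)
association_results = list(rules)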

In [None]:
for result in association_results:
    print(result[0])  # the itemset (a frozenset) behind each rule

frozenset({'light cream', 'chicken'})
frozenset({'escalope', 'mushroom cream sauce'})
frozenset({'escalope', 'pasta'})
frozenset({'herb & pepper', 'ground beef'})
frozenset({'ground beef', 'tomato sauce'})
frozenset({'whole wheat pasta', 'olive oil'})
frozenset({'pasta', 'shrimp'})
frozenset({'nan', 'light cream', 'chicken'})
frozenset({'chocolate', 'shrimp', 'frozen vegetables'})
frozenset({'cooking oil', 'ground beef', 'spaghetti'})
frozenset({'nan', 'escalope', 'mushroom cream sauce'})
frozenset({'nan', 'escalope', 'pasta'})
frozenset({'ground beef', 'frozen vegetables', 'spaghetti'})
frozenset({'milk', 'olive oil', 'frozen vegetables'})
frozenset({'mineral water', 'frozen vegetables', 'shrimp'})
frozenset({'olive oil', 'frozen vegetables', 'spaghetti'})
frozenset({'shrimp', 'frozen vegetables', 'spaghetti'})
frozenset({'tomatoes', 'frozen vegetables', 'spaghetti'})
frozenset({'grated cheese', 'ground beef', 'spaghetti'})
frozenset({'herb & pepper', 'ground beef', 'mineral water'})
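
An itemset on its own does not say which items form the left-hand side of a rule and which the right-hand side. apyori stores that split in the `ordered_statistics` of each record, with the fields `items_base`, `items_add`, `confidence`, and `lift`. The sketch below reads those fields by name. To stay self-contained it uses stand-in namedtuples that mirror apyori's field names, and the single demo record contains made-up numbers. `describe_rules` is a hypothetical helper, not part of apyori.

```python
from collections import namedtuple

# Stand-ins mirroring the field names of apyori's result records (assumption:
# real records returned by apyori.apriori expose the same attribute names).
RelationRecord = namedtuple("RelationRecord", ["items", "support", "ordered_statistics"])
OrderedStatistic = namedtuple("OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])


def describe_rules(records):
    """Return one text line per rule, with an explicit left and right side."""
    lines = []
    for record in records:
        for stat in record.ordered_statistics:
            base = ", ".join(sorted(stat.items_base))
            add = ", ".join(sorted(stat.items_add))
            lines.append(
                f"{base} -> {add} | support={record.support:.4f} "
                f"confidence={stat.confidence:.4f} lift={stat.lift:.4f}"
            )
    return lines


# Made-up demo record, not a result from store_data.csv.
demo_records = [
    RelationRecord(
        items=frozenset({"pasta", "shrimp"}),
        support=0.2,
        ordered_statistics=[
            OrderedStatistic(frozenset({"pasta"}), frozenset({"shrimp"}), 1.0, 5.0),
        ],
    )
]
rule_lines = describe_rules(demo_records)
for line in rule_lines:
    print(line)
```

With the notebook's results this would be called as `describe_rules(association_results)`. It also handles itemsets with more than two items, since every item lands on one side of the arrow.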


The loop below displays the rule, support, confidence, and lift of each association rule found above.

In [None]:
for item in association_results:
    # item[0] is the itemset. A frozenset has no defined order, so which item
    # prints on the left of the arrow is arbitrary, and only the first two
    # items of a larger itemset are shown.
    pair = item[0]
    items = [x for x in pair]
    print("Rule: " + items[0] + " -> " + items[1])
    # item[1] is the support of the itemset
    print("Support: " + str(item[1]))
    # item[2][0] is the first ordered statistic of the record:
    # index 2 holds its confidence and index 3 its lift
    print("Confidence: " + str(item[2][0][2]))
    print("Lift: " + str(item[2][0][3]))
    print("-----------------------------------------------------")


Rule: light cream -> chicken
Support: 0.004533333333333334
Confidence: 0.2905982905982906
Lift: 4.843304843304844
-----------------------------------------------------
Rule: escalope -> mushroom cream sauce
Support: 0.005733333333333333
Confidence: 0.30069930069930073
Lift: 3.7903273197390845
-----------------------------------------------------
Rule: escalope -> pasta
Support: 0.005866666666666667
Confidence: 0.37288135593220345
Lift: 4.700185158809287
-----------------------------------------------------
Rule: herb & pepper -> ground beef
Support: 0.016
Confidence: 0.3234501347708895
Lift: 3.2915549671393096
-----------------------------------------------------
Rule: ground beef -> tomato sauce
Support: 0.005333333333333333
Confidence: 0.37735849056603776
Lift: 3.840147461662528
-----------------------------------------------------
Rule: whole wheat pasta -> olive oil
Support: 0.008
Confidence: 0.2714932126696833
Lift: 4.130221288078346
-----------------------------------------------