<a href="https://colab.research.google.com/github/audrey-siqueira/Data-Science-Projects/blob/master/Apriori.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Arrangement of Products in the Supermarket using the Apriori Association**
---
<p align="justify">
A supermarket chain realized that where a product is placed in the store strongly influences whether customers buy it. For example, they found that the probability that a customer who buys product A also buys product B depends on where the two products are located and how far apart they are within the market.

<p align=center>
<img src="https://drive.google.com/uc?id=1IqMVgb0FS6n_cVIejfJ3NxtbjpIPR84G" width="60%"></p>

<p align="justify">
Through its data science department, the supermarket wants to know which products are most strongly associated, based on its database of customer purchases. The department received a database containing a large number of purchases, with the list of products bought in each one.

<p align=center>
<img src="https://drive.google.com/uc?id=1a5Bj3cli-I7fkZEoEE2Ux4dKES2W8__P" width="40%"></p>


The strategy in this case is to use the **Apriori** association method and return the association between products, measured by the lift, in descending order.
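
To make the measures concrete: for a rule A → B, support is the share of purchases containing both A and B, confidence is the share of purchases containing A that also contain B, and lift is the confidence divided by the share of purchases containing B. The sketch below computes them on a made-up five-basket example; the baskets and the helper names `support` and `rule_metrics` are illustrative assumptions, not part of the project data.

```python
def support(transactions, items):
    """Share of transactions that contain every product in `items`."""
    items = set(items)
    hits = sum(1 for basket in transactions if items <= set(basket))
    return hits / len(transactions)


def rule_metrics(transactions, lhs, rhs):
    """Return (support, confidence, lift) for the rule lhs -> rhs."""
    both = support(transactions, [lhs, rhs])
    confidence = both / support(transactions, [lhs])
    lift = confidence / support(transactions, [rhs])
    return both, confidence, lift


# Made-up baskets, for illustration only
toy_baskets = [
    ["pasta", "shrimp", "olive oil"],
    ["pasta", "shrimp"],
    ["pasta", "milk"],
    ["milk", "eggs"],
    ["eggs", "olive oil"],
]

sup, conf, lift = rule_metrics(toy_baskets, "pasta", "shrimp")
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 means the two products appear together more often than they would if they were bought independently of each other.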


**The code is described step by step below:**

## **Package Installation**

The platform does not come with the required library pre-installed, so it is installed in this step.

In [None]:
!pip install apyori

Collecting apyori
  Downloading https://files.pythonhosted.org/packages/5e/62/5ffde5c473ea4b033490617ec5caa80d59804875ad3c3c57c0976533a21a/apyori-1.1.2.tar.gz
Building wheels for collected packages: apyori
  Building wheel for apyori (setup.py) ... done
  Created wheel for apyori: filename=apyori-1.1.2-cp36-none-any.whl size=5975 sha256=dcd7ea930d07d814abe9d78b56af136baa49ffa67aa235ef41b8707def87b733
  Stored in directory: /root/.cache/pip/wheels/5d/92/bb/474bbadbc8c0062b9eb168f69982a0443263f8ab1711a8cad0
Successfully built apyori
Installing collected packages: apyori
Successfully installed apyori-1.1.2


## **Importing the libraries**


The 3 libraries needed for the project are imported.
- Pandas for data manipulation and analysis
- Numpy for mathematical operations
- Matplotlib for graphical visualizations

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## **Importing the data set**


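The .csv file containing the database is imported.

The first rows of the data set are shown below. The file has no header row: each row is one purchase and each column holds one product of that purchase, so purchases with fewer than 20 products leave the remaining cells empty.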

In [None]:
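dataset = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Association/Apriori/Market_Basket_Optimisation.csv', header = None)
# Build one list of product names per purchase.
# dataset.shape[0] is the number of purchases (7501 in this file).
# Empty (NaN) cells are skipped so that 'nan' is not treated as a product.
transactions = []
for i in range(dataset.shape[0]):
  transactions.append([str(item) for item in dataset.values[i] if pd.notna(item)])
dataset.head()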

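| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | shrimp | almonds | avocado | vegetables mix | green grapes | whole weat flour | yams | cottage cheese | energy drink | tomato juice | low fat yogurt | green tea | honey | salad | mineral water | salmon | antioxydant juice | frozen smoothie | spinach | olive oil |
| 1 | burgers | meatballs | eggs | | | | | | | | | | | | | | | | | |
| 2 | chutney | | | | | | | | | | | | | | | | | | | |
| 3 | turkey | avocado | | | | | | | | | | | | | | | | | | |
| 4 | mineral water | milk | energy bar | whole wheat rice | green tea | | | | | | | | | | | | | | | |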


## **Applying Apriori model on the dataset**

The **apyori** library was used to apply the **Apriori** method.



In [None]:
from apyori import apriori
rules = apriori(transactions = transactions, min_support = 0.003, min_confidence = 0.2, min_lift = 3, min_length = 2, max_length = 2)
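
To get a feel for these thresholds, `min_support = 0.003` can be translated into a number of purchases. The sketch below assumes the data set holds 7,501 purchases (the row count used when building `transactions`); `min_count` is an illustrative name.

```python
import math

n_transactions = 7501   # assumption: number of purchases in the data set
min_support = 0.003     # support threshold passed to apriori

# Smallest number of purchases a product pair must appear in to be kept
min_count = math.ceil(min_support * n_transactions)
print(min_count)
```

In the same spirit, `min_confidence = 0.2` keeps only rules where at least 20% of the purchases containing the left-hand product also contain the right-hand one, `min_lift = 3` keeps only pairs bought together at least three times as often as independence would suggest, and `max_length = 2` limits the search to pairs of products.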

## **Visualising the results**

The results are displayed sorted by lift in descending order.

In [None]:
# apriori returns a generator, so materialise the rules first
results = list(rules)
def inspect(results):
    # Each result holds the item set, its support and a list of rule statistics.
    # result[2][0] is the first rule: (left-hand items, right-hand items, confidence, lift)
    lhs         = [tuple(result[2][0][0])[0] for result in results]
    rhs         = [tuple(result[2][0][1])[0] for result in results]
    supports    = [result[1] for result in results]
    confidences = [result[2][0][2] for result in results]
    lifts       = [result[2][0][3] for result in results]
    return list(zip(lhs, rhs, supports, confidences, lifts))
resultsinDataFrame = pd.DataFrame(inspect(results), columns = ['Left Hand Side', 'Right Hand Side', 'Support', 'Confidence', 'Lift'])
# Show up to 10 rules with the highest lift
resultsinDataFrame.nlargest(n = 10, columns = 'Lift')

| | Left Hand Side | Right Hand Side | Support | Confidence | Lift |
|---|---|---|---|---|---|
| 3 | fromage blanc | honey | 0.003333 | 0.245098 | 5.164271 |
| 0 | light cream | chicken | 0.004533 | 0.290598 | 4.843951 |
| 2 | pasta | escalope | 0.005866 | 0.372881 | 4.700812 |
| 8 | pasta | shrimp | 0.005066 | 0.322034 | 4.506672 |
| 7 | whole wheat pasta | olive oil | 0.007999 | 0.271493 | 4.12241 |
| 5 | tomato sauce | ground beef | 0.005333 | 0.377358 | 3.840659 |
| 1 | mushroom cream sauce | escalope | 0.005733 | 0.300699 | 3.790833 |
| 4 | herb & pepper | ground beef | 0.015998 | 0.32345 | 3.291994 |
| 6 | light cream | olive oil | 0.0032 | 0.205128 | 3.11471 |
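
The numeric indices inside `inspect` are easier to follow with field names. The sketch below uses local stand-in named tuples that mirror how apyori appears to lay out a result (a `RelationRecord` with `items`, `support` and `ordered_statistics`, where each `OrderedStatistic` has `items_base`, `items_add`, `confidence` and `lift`). This layout is an assumption worth checking against the installed version; the tuples are defined here so the snippet runs without the library, and the values are copied from the top row of the table above.

```python
from collections import namedtuple

# Local stand-ins mirroring the assumed layout of apyori's result records
RelationRecord = namedtuple("RelationRecord", ["items", "support", "ordered_statistics"])
OrderedStatistic = namedtuple("OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])

# Values copied from the top row of the results table
stat = OrderedStatistic(frozenset({"fromage blanc"}), frozenset({"honey"}), 0.245098, 5.164271)
record = RelationRecord(frozenset({"fromage blanc", "honey"}), 0.003333, [stat])

# Named access that corresponds to the index-based access used in inspect()
first_rule = record.ordered_statistics[0]      # same as record[2][0]
row = (
    tuple(first_rule.items_base)[0],           # record[2][0][0] -> left-hand side
    tuple(first_rule.items_add)[0],            # record[2][0][1] -> right-hand side
    record.support,                            # record[1]
    first_rule.confidence,                     # record[2][0][2]
    first_rule.lift,                           # record[2][0][3]
)
print(row)
```

Only the first statistic of each record is read, which is enough here because the rules are restricted to pairs with a single product on each side.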


## **Conclusion**

Apriori returned the association between fromage blanc and honey as the strongest among all the products, with a lift of about 5.16.
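
As a rough check on what that lift means: lift is the confidence divided by the overall share of purchases containing the right-hand product, so the table values imply how common honey is overall. The sketch below uses the rounded figures from the results table and assumes 7,501 purchases in the data set; the variable names are illustrative.

```python
confidence = 0.245098     # share of fromage blanc purchases that also contain honey
lift = 5.164271           # reported lift for fromage blanc -> honey
support_pair = 0.003333   # share of all purchases containing both products
n_transactions = 7501     # assumption: number of purchases in the data set

# lift = confidence / support(honey), so support(honey) = confidence / lift
honey_share = confidence / lift
# Approximate number of purchases that contain both products
pair_count = round(support_pair * n_transactions)

print(f"honey appears in about {honey_share:.1%} of purchases")
print(f"the pair appears in about {pair_count} purchases")
```

In other words, honey shows up in roughly 1 in 20 purchases overall but in about 1 in 4 purchases that contain fromage blanc, which is what a lift of about 5 expresses.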