In [0]:
!pip install apyori

Collecting apyori
  Downloading https://files.pythonhosted.org/packages/25/fd/0561e2dd29aeed544bad2d1991636e38700cdaef9530490b863741f35295/apyori-1.1.1.tar.gz
Building wheels for collected packages: apyori
  Building wheel for apyori (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/7b/2a/35/c0c3749c1a36d4f454ea22d8396e1b854b86340d63cbbb7949
Successfully built apyori
Installing collected packages: apyori
Successfully installed apyori-1.1.1


**Apriori Principle Implementation**

Import the Libraries
The first step is to import the required libraries: pandas for loading the data, and the apriori function from apyori for mining the rules.

In [0]:
  
import pandas as pd  
from apyori import apriori  

Importing the Dataset

In [0]:
store_data = pd.read_csv('store_data.csv')  


Let's call the head() function to see how the dataset looks:

In [0]:
store_data.head()  


Unnamed: 0,shrimp,almonds,avocado,vegetables mix,green grapes,whole weat flour,yams,cottage cheese,energy drink,tomato juice,low fat yogurt,green tea,honey,salad,mineral water,salmon,antioxydant juice,frozen smoothie,spinach,olive oil
0,burgers,meatballs,eggs,,,,,,,,,,,,,,,,,
1,chutney,,,,,,,,,,,,,,,,,,,
2,turkey,avocado,,,,,,,,,,,,,,,,,,
3,mineral water,milk,energy bar,whole wheat rice,green tea,,,,,,,,,,,,,,,
4,low fat yogurt,,,,,,,,,,,,,,,,,,,


A careful look at the data shows that the header is actually the first transaction. Each row corresponds to one transaction, and each column holds one of the items purchased in it, so a row fills only as many columns as the transaction has items. A NaN means the transaction contained fewer items than there are columns, so that position is empty.

This dataset has no header row, but by default pd.read_csv treats the first row as the header. To avoid losing the first transaction, pass header=None to pd.read_csv, as shown below:

In [0]:
store_data = pd.read_csv('store_data.csv', header=None)  

In [0]:
store_data.head()  

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,shrimp,almonds,avocado,vegetables mix,green grapes,whole weat flour,yams,cottage cheese,energy drink,tomato juice,low fat yogurt,green tea,honey,salad,mineral water,salmon,antioxydant juice,frozen smoothie,spinach,olive oil
1,burgers,meatballs,eggs,,,,,,,,,,,,,,,,,
2,chutney,,,,,,,,,,,,,,,,,,,
3,turkey,avocado,,,,,,,,,,,,,,,,,,
4,mineral water,milk,energy bar,whole wheat rice,green tea,,,,,,,,,,,,,,,


The apyori library expects the dataset as a list of lists: the outer list is the whole dataset and each inner list is one transaction. Our data is currently a pandas DataFrame, so convert it with the following script:

In [0]:
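records = []  
# 7501 transactions (rows), up to 20 items (columns) each
for i in range(0, 7501):  
    # str() also turns missing cells into the string 'nan', which apriori treats as an item
    records.append([str(store_data.values[i,j]) for j in range(0, 20)])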
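
The conversion above turns every empty cell into the string 'nan', which is why a rule such as chicken -> nan shows up in the results further down. A minimal sketch of one way to avoid this is to skip missing cells while building each transaction. The names store_data_demo and records_demo and the three sample rows are hypothetical stand-ins for the real data, and rule counts obtained this way may differ from the ones shown in this notebook.

```python
import pandas as pd

NAN = float("nan")

# Hypothetical miniature of store_data: short transactions are padded with NaN.
store_data_demo = pd.DataFrame([
    ["burgers", "meatballs", "eggs"],
    ["chutney", NAN, NAN],
    ["turkey", "avocado", NAN],
])

# Keep only the cells that hold a real item name.
records_demo = [
    [str(item) for item in row if pd.notna(item)]
    for row in store_data_demo.values
]

print(records_demo)
```

Iterating over the rows also removes the need to hard-code the number of rows and columns.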

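The next step is to apply the Apriori algorithm to the dataset using the apriori function that we imported from the apyori library. It returns a generator of rules, which we convert to a list. The thresholds keep only rules whose support, confidence, and lift reach the given minimums. Note that min_length is not among the keyword arguments apyori documents (min_support, min_confidence, min_lift, max_length), so it likely has no effect here.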

In [0]:
association_rules = apriori(records, min_support=0.0045, min_confidence=0.2, min_lift=3, min_length=2)  
association_results = list(association_rules)  
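
To make the three thresholds concrete, here is a small sketch that computes support, confidence, and lift by hand for one rule. The transactions are a hypothetical toy example, not taken from store_data.csv, and the helper support() is illustrative only, not part of apyori.

```python
# Hypothetical toy transactions, not taken from store_data.csv.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"jam"},
    {"butter", "jam"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# Rule under test: bread -> butter
base, add = {"bread"}, {"butter"}

rule_support = support(base | add)         # how often both appear together
confidence = rule_support / support(base)  # how often butter appears given bread
lift = confidence / support(add)           # confidence relative to butter's baseline

print(rule_support, confidence, lift)
```

A lift above 1 means the two items occur together more often than they would if they were independent, which is why the notebook filters on min_lift=3.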

In [0]:
print(len(association_results)) 

48


In [0]:
print(association_results[0]) 

RelationRecord(items=frozenset(['chicken', 'light cream']), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset(['light cream']), items_add=frozenset(['chicken']), confidence=0.29059829059829057, lift=4.84395061728395)])
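
The numbers in this record can be turned back into transaction counts, which makes them easier to interpret. The sketch below assumes the dataset has 7,501 transactions, as the loop bound in the conversion script suggests. It uses support = count(both) / N, confidence = support(both) / support(base), and lift = confidence / support(add).

```python
n_transactions = 7501  # assumption: taken from the loop bound range(0, 7501)

# Values copied from the printed RelationRecord.
support_both = 0.004532728969470737  # support of {chicken, light cream}
confidence = 0.29059829059829057     # rule: light cream -> chicken
lift = 4.84395061728395

# Invert the three definitions to recover the underlying counts.
count_both = round(support_both * n_transactions)
count_light_cream = round(support_both / confidence * n_transactions)
count_chicken = round(confidence / lift * n_transactions)

print(count_both, count_light_cream, count_chicken)
```

Under that assumption, the rule rests on a few dozen transactions that contain both items, so a low min_support such as 0.0045 admits rules backed by fairly little data.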


In [0]:
for item in association_results:

    # item[0] is the frozenset of items in the rule. A frozenset is unordered,
    # so items[0] -> items[1] may not match the base -> add direction
    # recorded in item[2][0], and any third item is not printed.
    pair = item[0] 
    items = [x for x in pair]
    print("Rule: " + items[0] + " -> " + items[1])

    # item[1] is the support of the itemset
    print("Support: " + str(item[1]))

    # item[2][0] is the first OrderedStatistic of the record;
    # its fields at index 2 and 3 are the confidence and the lift

    print("Confidence: " + str(item[2][0][2]))
    print("Lift: " + str(item[2][0][3]))
    print("=====================================")

Rule: chicken -> light cream
Support: 0.00453272896947
Confidence: 0.290598290598
Lift: 4.84395061728
Rule: escalope -> mushroom cream sauce
Support: 0.0057325689908
Confidence: 0.300699300699
Lift: 3.79083269672
Rule: pasta -> escalope
Support: 0.00586588454873
Confidence: 0.372881355932
Lift: 4.70081185016
Rule: herb & pepper -> ground beef
Support: 0.0159978669511
Confidence: 0.323450134771
Lift: 3.29199384113
Rule: tomato sauce -> ground beef
Support: 0.00533262231702
Confidence: 0.377358490566
Lift: 3.84065948132
Rule: olive oil -> whole wheat pasta
Support: 0.00799893347554
Confidence: 0.27149321267
Lift: 4.12241009764
Rule: pasta -> shrimp
Support: 0.00506599120117
Confidence: 0.322033898305
Lift: 4.50667214774
Rule: chicken -> nan
Support: 0.00453272896947
Confidence: 0.290598290598
Lift: 4.84395061728
Rule: frozen vegetables -> shrimp
Support: 0.00533262231702
Confidence: 0.232558139535
Lift: 3.25451232211
Rule: spaghetti -> ground beef
Support: 0.00479936008532
Confidence: 0.
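
The first rule above is printed as chicken -> light cream, while the record printed earlier stores light cream as items_base and chicken as items_add. A sketch of a direction-preserving printout reads the rule from the ordered statistics by field name. The two namedtuples below are stand-ins that mirror the field names visible in that printed record so the example runs without apyori. The helper format_rules is hypothetical, and real code would loop over association_results instead of the single sample.

```python
from collections import namedtuple

# Stand-ins mirroring the fields shown in the printed RelationRecord.
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"])

# The first record from the notebook output, rebuilt by hand.
sample = RelationRecord(
    items=frozenset(["chicken", "light cream"]),
    support=0.004532728969470737,
    ordered_statistics=[OrderedStatistic(
        items_base=frozenset(["light cream"]),
        items_add=frozenset(["chicken"]),
        confidence=0.29059829059829057,
        lift=4.84395061728395)],
)

def format_rules(record):
    """Return one line per rule, keeping the base -> add direction."""
    lines = []
    for stat in record.ordered_statistics:
        base = ", ".join(sorted(stat.items_base))
        add = ", ".join(sorted(stat.items_add))
        lines.append(
            f"{base} -> {add} "
            f"(support={record.support:.4f}, "
            f"confidence={stat.confidence:.4f}, lift={stat.lift:.4f})"
        )
    return lines

for line in format_rules(sample):
    print(line)
```

Joining the sorted item names also handles rules with more than one item on either side, which the two-item printout drops.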




