# Intro to Recommender Systems Lab

Complete the exercises below to solidify your knowledge and understanding of recommender systems.

For this lab, we are going to be putting together a user similarity based recommender system in a step-by-step fashion. Our data set contains customer grocery purchases, and we will use similar purchase behavior to inform our recommender system. Our recommender system will generate 5 recommendations for each customer based on the purchases they have made.

In [1]:
import pandas as pd
from scipy.spatial.distance import pdist, squareform



In [2]:
data = pd.read_csv('../data/customer_product_sales.csv')

In [3]:
data.head()

Unnamed: 0,CustomerID,FirstName,LastName,SalesID,ProductID,ProductName,Quantity
0,61288,Rosa,Andersen,134196,229,Bread - Hot Dog Buns,16
1,77352,Myron,Murray,6167892,229,Bread - Hot Dog Buns,20
2,40094,Susan,Stevenson,5970885,229,Bread - Hot Dog Buns,11
3,23548,Tricia,Vincent,6426954,229,Bread - Hot Dog Buns,6
4,78981,Scott,Burch,819094,229,Bread - Hot Dog Buns,20


## Step 1: Create a data frame that contains the total quantity of each product purchased by each customer.

You will need to group by CustomerID and ProductName and then sum the Quantity field.

In [4]:
step1 = data.groupby(by=['CustomerID', 'ProductName'], as_index=False)[['Quantity']].sum()
step1

Unnamed: 0,CustomerID,ProductName,Quantity
0,33,Apricots - Dried,1
1,33,Assorted Desserts,1
2,33,Bandage - Flexible Neon,1
3,33,"Bar Mix - Pina Colada, 355 Ml",1
4,33,"Beans - Kidney, Canned",1
...,...,...,...
63623,98200,Vol Au Vents,50
63624,98200,Wasabi Powder,25
63625,98200,Wine - Fume Blanc Fetzer,25
63626,98200,Wine - Hardys Bankside Shiraz,25


## Step 2: Use the `pivot_table` method to create a product by customer matrix.

The rows of the matrix should represent the products, the columns should represent the customers, and the values should be the quantities of each product purchased by each customer. You will also need to replace nulls with zeros, which you can do using the `fillna` method.

In [5]:
step2 = pd.pivot_table(step1, values='Quantity', columns='CustomerID', index='ProductName', fill_value=0)
step2 = step2.transpose()
step2

ProductName,Anchovy Paste - 56 G Tube,"Appetizer - Mini Egg Roll, Shrimp",Appetizer - Mushroom Tart,Appetizer - Sausage Rolls,Apricots - Dried,Apricots - Halves,Apricots Fresh,Arizona - Green Tea,Artichokes - Jerusalem,Assorted Desserts,...,"Wine - White, Colubia Cresh","Wine - White, Mosel Gold","Wine - White, Schroder And Schyl",Wine - Wyndham Estate Bin 777,Wonton Wrappers,Yeast Dry - Fermipan,Yoghurt Tubes,"Yogurt - Blueberry, 175 Gr",Yogurt - French Vanilla,Zucchini - Yellow
CustomerID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
33,0,0,0,0,1,0,0,0,0,1,...,0,0,0,0,0,0,0,0,1,0
200,0,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,1,0,0
264,0,0,0,0,0,1,1,0,0,0,...,0,0,0,1,0,0,0,0,0,0
356,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
412,0,0,0,0,1,0,0,0,0,0,...,0,1,1,1,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
97928,0,0,0,25,0,50,0,25,0,0,...,0,25,25,0,0,0,0,0,0,0
98069,0,0,0,25,0,25,0,0,0,25,...,0,0,0,0,0,0,0,0,0,0
98159,0,0,0,0,0,0,0,0,0,0,...,0,50,0,0,0,0,25,0,0,0
98185,0,0,25,25,0,25,0,0,0,0,...,0,0,0,25,0,0,0,0,0,0


## Step 3: Create a customer similarity matrix using `squareform` and `pdist`. For the distance metric, choose "euclidean."

In [6]:

step3_distance = pdist(X=step2, metric='euclidean')
step3_distribution = squareform(step3_distance)
step3_euclid_dist = pd.DataFrame(step3_distribution,
                           index=step2.index, 
                           columns=step2.index)

step3_euclid_dist

CustomerID,33,200,264,356,412,464,477,639,649,669,...,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
CustomerID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
33,0.000000,11.916375,10.488088,11.224972,11.401754,11.090537,12.409674,11.045361,11.269428,11.489125,...,206.871941,213.180675,225.656819,198.232187,230.913404,220.501701,217.188858,228.628520,239.000000,229.773802
200,11.916375,0.000000,11.747340,12.083046,12.569805,12.288206,12.165525,12.083046,11.874342,12.000000,...,206.310446,212.635839,224.697575,197.139544,230.952376,220.202180,215.728997,228.010965,239.037654,229.704158
264,10.488088,11.747340,0.000000,11.489125,11.224972,11.445523,12.000000,11.401754,11.180340,11.747340,...,206.387984,212.946003,225.435135,197.600607,230.371439,219.136943,216.612557,228.081126,238.266657,229.773802
356,11.224972,12.083046,11.489125,0.000000,12.083046,11.789826,12.328828,11.135529,11.958261,12.165525,...,206.649462,213.082144,225.452878,197.494304,231.038958,219.952268,217.437347,228.098663,238.493186,229.464594
412,11.401754,12.569805,11.224972,12.083046,0.000000,11.704700,12.328828,11.135529,11.789826,11.747340,...,206.900942,211.679002,225.572605,197.630969,230.614397,219.733930,217.446545,227.997807,238.396728,228.927936
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
97928,220.501701,220.202180,219.136943,219.952268,219.733930,219.599636,219.538152,219.924987,219.827205,220.070443,...,283.945417,283.945417,302.076149,272.717803,278.388218,0.000000,273.861279,291.547595,306.186218,307.205143
98069,217.188858,215.728997,216.612557,217.437347,217.446545,217.425849,216.903204,217.294731,217.080630,216.751009,...,283.945417,283.945417,295.803989,283.945417,285.043856,273.861279,0.000000,287.228132,297.909382,294.745653
98159,228.628520,228.010965,228.081126,228.098663,227.997807,228.197283,228.028507,228.181945,227.868383,228.103047,...,283.945417,279.508497,300.000000,290.473751,300.000000,291.547595,287.228132,0.000000,304.138127,305.163890
98185,239.000000,239.037654,238.266657,238.493186,238.396728,239.006276,238.949786,238.468027,238.692271,239.334494,...,301.039864,315.238005,306.186218,292.617498,314.245127,306.186218,297.909382,304.138127,0.000000,303.108891


In [7]:
# por si acaso...
#step3_distance = pdist(X=step2, metric='euclidean')
#step3_distribution = squareform(step3_distance)
#step3_euclid_dist = pd.DataFrame(step3_distribution,
#                           index=step2.index, 
#                           columns=step2.index)
#
#step3_euclid_dist

## Step 4: Check your results by generating a list of the top 5 most similar customers for a specific CustomerID.

In [8]:
def top5ById(id):
    return step3_euclid_dist[id].sort_values(ascending=False).head()



In [9]:
top533 = top5ById(33)
top533 = pd.DataFrame(top533)
top533

Unnamed: 0_level_0,33
CustomerID,Unnamed: 1_level_1
92255,248.594047
97324,247.02429
95017,244.276483
92492,241.642298
91777,240.547293


## Step 5: From the data frame you created in Step 1, select the records for the list of similar CustomerIDs you obtained in Step 4.

In [10]:
index = list(top533.index)
index

[92255, 97324, 95017, 92492, 91777]

In [11]:
top533_step5 = step1[step1['CustomerID'] == index[0]]
top533_step5 = top533_step5.append(step1[step1['CustomerID'] == index[1]],ignore_index=True)
top533_step5

  top533_step5 = top533_step5.append(step1[step1['CustomerID'] == index[1]],ignore_index=True)


Unnamed: 0,CustomerID,ProductName,Quantity
0,92255,"Appetizer - Mini Egg Roll, Shrimp",24
1,92255,Appetizer - Sausage Rolls,24
2,92255,Apricots - Dried,24
3,92255,Arizona - Green Tea,24
4,92255,Bagel - Plain,24
...,...,...,...
147,97324,Wine - Redchard Merritt,25
148,97324,Wine - Vineland Estate Semi - Dry,25
149,97324,"Wine - White, Colubia Cresh",25
150,97324,Wonton Wrappers,25


In [31]:
top533_step5 = step1[step1['CustomerID'] == index[0]]
for i in index:
    if i != index[0]:
        pass
    top533_step5 = top533_step5.append(step1[step1['CustomerID'] == i],ignore_index=True)

top533_step5['CustomerID'].unique()

  top533_step5 = top533_step5.append(step1[step1['CustomerID'] == i],ignore_index=True)


array([92255, 97324, 95017, 92492, 91777])

## Step 6: Aggregate those customer purchase records by ProductName, sum the Quantity field, and then rank them in descending order by quantity.

This will give you the total number of each product purchased by the 5 most similar customers to the customer you selected in order from most purchased to least.

In [29]:
agg = top533_step5.groupby(by='ProductName')['Quantity'].sum()
agg = pd.DataFrame(agg)
agg

Unnamed: 0_level_0,Quantity
ProductName,Unnamed: 1_level_1
"Appetizer - Mini Egg Roll, Shrimp",72
Appetizer - Mushroom Tart,49
Appetizer - Sausage Rolls,48
Apricots - Dried,48
Apricots - Halves,24
...,...
"Wine - White, Schroder And Schyl",96
Wonton Wrappers,25
Yoghurt Tubes,24
"Yogurt - Blueberry, 175 Gr",72


## Step 7: Filter the list for products that the chosen customer has not yet purchased and then recommend the top 5 products with the highest quantities that are left.

- Merge the ranked products data frame with the customer product matrix on the ProductName field.
- Filter for records where the chosen customer has not purchased the product.
- Show the top 5 results.

In [20]:
# Chosen id costumer: 33
customer_33 = step1[step1['CustomerID'] == 33]
step7 = agg.merge(customer_33, how='right', on='ProductName')
step7 = step7.fillna(value=0)
list(step7[step7['Quantity_x'] == 0.0].sort_values(by='Quantity_y', ascending=False)['ProductName'].head())

['Veal - Inside, Choice',
 'Grouper - Fresh',
 'Lettuce - Spring Mix',
 'Assorted Desserts',
 'Pate - Cognac']

## Step 8: Now that we have generated product recommendations for a single user, put the pieces together and iterate over a list of all CustomerIDs.

- Create an empty dictionary that will hold the recommendations for all customers.
- Create a list of unique CustomerIDs to iterate over.
- Iterate over the customer list performing steps 4 through 7 for each and appending the results of each iteration to the dictionary you created.

In [34]:
# Functions

def getIds():
    return list(step1['CustomerID'].unique())

def top5ById(id_):
    return step3_euclid_dist[id_].sort_values(ascending=False).head()

def top5Indexes(df):
    return list(df.index)

def top5Customers(step1, indexes):
    top5 = step1[step1['CustomerID'] == indexes[0]]
    for i in indexes:
        if i != indexes[0]:
            pass
        top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)

    return top5

def aggregationPurchaseRecords(top5Customers):
    agg = top5Customers.groupby(by='ProductName')['Quantity'].sum()
    agg = pd.DataFrame(agg)
    return agg

def recommendedProducts(step1, id_, agg):
    customer = step1[step1['CustomerID'] == id_]
    rec = agg.merge(customer, how='right', on='ProductName')
    rec = rec.fillna(value=0)
    return list(rec[rec['Quantity_x'] == 0.0].sort_values(by='Quantity_y', ascending=False)['ProductName'].head())

In [43]:
customers = getIds()
perRecs = {}
for i in customers:
    
    top5 = top5ById(i)
    
    indexes = top5Indexes(top5)
    
    top5Cust = top5Customers(step1, indexes)
    
    
    agg = aggregationPurchaseRecords(top5Cust)
    
    recs = recommendedProducts(step1, i, agg)
    
    perRecs[i] = recs
    
perRecs

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

{33: ['Veal - Inside, Choice',
  'Grouper - Fresh',
  'Lettuce - Spring Mix',
  'Assorted Desserts',
  'Pate - Cognac'],
 200: ['Lamb - Ground',
  'Loquat',
  'Wiberg Super Cure',
  'Water - Spring Water 500ml',
  'Truffle Cups - Brown'],
 264: ['Apricots Fresh',
  'Oil - Shortening,liqud, Fry',
  'Wine - Sogrape Mateus Rose',
  'Wine - Chablis 2003 Champs',
  'Turkey - Oven Roast Breast'],
 356: ['Mayonnaise - Individual Pkg',
  'Tomatoes Tear Drop',
  'Sobe - Tropical Energy',
  'Scallops - 10/20',
  'Baking Powder'],
 412: ['Flavouring - Orange',
  'Sauce - Demi Glace',
  'Pernod',
  'Beef - Rib Eye Aaa',
  'Lettuce - Frisee'],
 464: ['Vaccum Bag 10x13',
  'Kellogs All Bran Bars',
  'Apricots Fresh',
  'Rambutan',
  'Wine - Sogrape Mateus Rose'],
 477: ['Wine - Charddonnay Errazuriz',
  'Lamb - Ground',
  'Lettuce - Treviso',
  'Beef - Ground, Extra Lean, Fresh',
  'Pasta - Angel Hair'],
 639: ['Squid - Tubes / Tenticles 10/20',
  'Anchovy Paste - 56 G Tube',
  'Assorted Desserts',


##  Step 9: Store the results in a Pandas data frame. The data frame should a column for Customer ID and then a column for each of the 5 product recommendations for each customer.

In [48]:
recommendations = pd.DataFrame(perRecs)
recommendations = recommendations.transpose()
recommendations

Unnamed: 0,0,1,2,3,4
33,"Veal - Inside, Choice",Grouper - Fresh,Lettuce - Spring Mix,Assorted Desserts,Pate - Cognac
200,Lamb - Ground,Loquat,Wiberg Super Cure,Water - Spring Water 500ml,Truffle Cups - Brown
264,Apricots Fresh,"Oil - Shortening,liqud, Fry",Wine - Sogrape Mateus Rose,Wine - Chablis 2003 Champs,Turkey - Oven Roast Breast
356,Mayonnaise - Individual Pkg,Tomatoes Tear Drop,Sobe - Tropical Energy,Scallops - 10/20,Baking Powder
412,Flavouring - Orange,Sauce - Demi Glace,Pernod,Beef - Rib Eye Aaa,Lettuce - Frisee
...,...,...,...,...,...
97928,Bread Crumbs - Panko,Tea - Jasmin Green,Bay Leaf,Puree - Mocha,Pate - Cognac
98069,Mustard Prepared,Hot Chocolate - Individual,Loquat,Nantuket Peach Orange,Olives - Kalamata
98159,"Wine - White, Mosel Gold",Cheese - Parmesan Cubes,Papayas,Squid - Tubes / Tenticles 10/20,Tray - 16in Rnd Blk
98185,Bagel - Plain,Whmis - Spray Bottle Trigger,"Salsify, Organic","Pasta - Penne, Rigate, Dry",Milk - 2%


## Step 10: Change the distance metric used in Step 3 to something other than euclidean (correlation, cityblock, consine, jaccard, etc.). Regenerate the recommendations for all customers and note the differences.

In [49]:
corr_distance = pdist(X=step2, metric='correlation')
corr_distribution = squareform(corr_distance)
corr_dist = pd.DataFrame(corr_distribution,
                           index=step2.index, 
                           columns=step2.index)

corr_dist

CustomerID,33,200,264,356,412,464,477,639,649,669,...,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
CustomerID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
33,0.000000,1.025211,0.887943,0.959400,0.972579,0.971542,1.051663,1.008305,1.003146,0.990052,...,1.026814,1.064510,1.026400,1.083910,1.000520,1.084359,1.027589,1.055680,1.035068,1.032223
200,1.025211,0.000000,1.001067,1.005715,1.073948,1.075028,0.926265,1.080738,1.003482,0.978399,...,0.981300,1.015439,0.930933,0.964814,1.030125,1.063924,0.870938,0.998990,1.064954,1.046097
264,0.887943,1.001067,0.000000,1.009730,0.947144,1.039546,0.987110,1.079281,0.991922,1.039807,...,0.965398,1.037471,1.001002,1.003357,0.931461,0.903754,0.953715,0.986098,0.938976,1.037179
356,0.959400,1.005715,1.009730,0.000000,1.040222,1.041743,0.992298,0.969268,1.071740,1.056114,...,1.003586,1.054821,1.005365,0.991127,1.023312,1.015781,1.065232,0.991223,0.974719,0.997480
412,0.972579,1.073948,0.947144,1.040222,0.000000,1.009738,0.979762,0.951334,1.024509,0.969367,...,1.048723,0.888568,1.032385,1.019126,0.981172,0.999745,1.077885,0.990375,0.976296,0.941690
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
97928,1.084359,1.063924,0.903754,1.015781,0.999745,0.968645,0.981220,1.002805,0.998928,1.029966,...,1.021932,0.981406,1.050846,0.960573,0.870999,0.000000,0.902651,0.964935,1.032316,1.070734
98069,1.027589,0.870938,0.953715,1.065232,1.077885,1.065183,1.012509,1.039542,1.018858,0.976137,...,1.043515,1.001031,1.027032,1.063468,0.930410,0.902651,0.000000,0.954242,0.995756,1.004420
98159,1.055680,0.998990,0.986098,0.991223,0.990375,1.003224,0.999040,0.991565,0.959505,0.991348,...,0.979906,0.913048,0.997792,1.044248,0.974952,0.964935,0.954242,0.000000,0.982682,1.017836
98185,1.035068,1.064954,0.938976,0.974719,0.976296,1.042204,1.049346,0.956425,0.999623,1.083135,...,1.067277,1.125484,1.009437,1.020435,1.040386,1.032316,0.995756,0.982682,0.000000,0.976622


In [52]:
# New function for correlation distance
def top5ByIdCorr(id_):
    return corr_dist[id_].sort_values(ascending=False).head()

In [56]:
customersCorr = getIds()
perRecsCorr = {}
for i in customers:
    
    top5 = top5ByIdCorr(i)
   
    indexes = top5Indexes(top5)
    
    top5Cust = top5Customers(step1, indexes)
    
    
    agg = aggregationPurchaseRecords(top5Cust)
    
    recs = recommendedProducts(step1, i, agg)
    
    perRecs[i] = recs
    
perRecsCorr

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                     ProductName  Quantity
0         38405       Appetizer - Sausage Rolls        20
1         38405                  Apricots Fresh        10
2         38405          Artichokes - Jerusalem        10
3         38405           Bandage - Fexible 1x3        10
4         38405  Beef - Montreal Smoked Brisket        10
..          ...                             ...       ...
385       66632        Wine - Pinot Noir Latour        17
386       66632      Wine - Two Oceans Cabernet        17
387       66632   Wine - Vidal Icewine Magnotta        17
388       66632   Wine - Wyndham Estate Bin 777        17
389       66632            Yeast Dry - Fermipan        34

[390 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         62101          Anchovy Paste - 56 G Tube        16
1         62101                Arizona - Green Tea        16
2         62101             Artichokes - Jerusalem        16
3         62101                  Ass

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                       ProductName  Quantity
0         97769           Bandage - Flexible Neon        25
1         97769            Beef - Chuck, Boneless        25
2         97769                Beef - Rib Eye Aaa        25
3         97769                 Beef - Short Loin        25
4         97769                Beef - Top Sirloin        25
..          ...                               ...       ...
380       63086          Wine - White, Mosel Gold        16
381       63086  Wine - White, Schroder And Schyl        16
382       63086                     Yoghurt Tubes        16
383       63086        Yogurt - Blueberry, 175 Gr        16
384       63086           Yogurt - French Vanilla        16

[385 rows x 3 columns]
     CustomerID                  ProductName  Quantity
0         25995    Appetizer - Mushroom Tart         7
1         25995              Banana - Leaves         7
2         25995                  Beer - Blue         7
3         25995  Beets - Candy Cane,

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde


     CustomerID                        ProductName  Quantity
0         87849  Appetizer - Mini Egg Roll, Shrimp        23
1         87849                   Apricots - Dried        23
2         87849  Bar - Granola Trail Mix Fruit Nut        23
3         87849            Beans - Kidney, Red Dry        23
4         87849             Beef - Chuck, Boneless        23
..          ...                                ...       ...
392       28751           Wine - Fume Blanc Fetzer         8
393       28751         Wine - Red, Colio Cabernet         8
394       28751                Wine - Red, Cooking         8
395       28751           Wine - White Cab Sauv.on         8
396       28751         Yogurt - Blueberry, 175 Gr         8

[397 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         55245  Appetizer - Mini Egg Roll, Shrimp        14
1         55245          Appetizer - Sausage Rolls        14
2         55245             Artichokes - Jerusalem        14

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         86595                      Bagel - Plain        22
1         86595              Bandage - Fexible 1x3        22
2         86595  Bar - Granola Trail Mix Fruit Nut        22
3         86595                           Bay Leaf        22
4         86595          Beef - Texas Style Burger        22
..          ...                                ...       ...
390       32386                         Watercress         9
391       32386         Wine - Crozes Hermitage E.         9
392       32386         Wine - Sogrape Mateus Rose         9
393       32386         Wine - Two Oceans Cabernet         9
394       32386      Wine - Wyndham Estate Bin 777         9

[395 rows x 3 columns]
     CustomerID                 ProductName  Quantity
0         11441           Apricots - Halves         3
1         11441           Assorted Desserts         6
2         11441               Bagel - Plain         3
3         11441             

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0          2187            Bandage - Flexible Neon         1
1          2187  Bar - Granola Trail Mix Fruit Nut         1
2          2187      Bar Mix - Pina Colada, 355 Ml         1
3          2187               Beef - Ground Medium         1
4          2187      Beef - Tenderlion, Center Cut         1
..          ...                                ...       ...
394       89588     Wine - Magnotta, Merlot Sr Vqa        23
395       89588    Wine - Red, Harrow Estates, Cab        23
396       89588         Wine - Sogrape Mateus Rose        23
397       89588      Wine - Vidal Icewine Magnotta        46
398       89588   Wine - White, Schroder And Schyl        23

[399 rows x 3 columns]
     CustomerID                    ProductName  Quantity
0          1920              Apricots - Halves         1
1          1920     Bacardi Breezer - Tropical         1
2          1920                     Barramundi         1
3          1920 

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                    ProductName  Quantity
0          1336                 Apricots Fresh         1
1          1336            Arizona - Green Tea         1
2          1336         Artichokes - Jerusalem         1
3          1336              Assorted Desserts         1
4          1336          Bandage - Fexible 1x3         1
..          ...                            ...       ...
392       18720        Wine - Redchard Merritt         5
393       18720         Wine - Ruffino Chianti         5
394       18720  Wine - Wyndham Estate Bin 777         5
395       18720                Wonton Wrappers         5
396       18720              Zucchini - Yellow         5

[397 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         91220  Bar - Granola Trail Mix Fruit Nut        24
1         91220             Beef - Chuck, Boneless        24
2         91220                  Beef - Short Loin        48
3         91220                    Beef Wellingt

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                       ProductName  Quantity
0         34184                        Barramundi        18
1         34184                       Beans - Wax         9
2         34184              Beef - Ground Medium         9
3         34184  Beef - Ground, Extra Lean, Fresh        18
4         34184               Beef - Striploin Aa         9
..          ...                               ...       ...
377       82161      Wine - Alsace Gewurztraminer        42
378       82161        Wine - Magnotta - Cab Sauv        21
379       82161        Wine - Red, Colio Cabernet        42
380       82161           Wine - Redchard Merritt        21
381       82161        Yogurt - Blueberry, 175 Gr        21

[382 rows x 3 columns]
     CustomerID                 ProductName  Quantity
0         87303               Baking Powder        23
1         87303             Banana - Leaves        23
2         87303     Bandage - Flexible Neon        23
3         87303     Beans - Kidney, Red 

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         72305                           Bay Leaf        19
1         72305                        Beans - Wax        19
2         72305           Beef - Top Sirloin - Aaa        19
3         72305  Beer - Alexander Kieths, Pale Ale        19
4         72305      Beer - Original Organic Lager        19
..          ...                                ...       ...
367       55876     Wine - Magnotta, Merlot Sr Vqa        15
368       55876           Wine - Pinot Noir Latour        15
369       55876      Wine - Prosecco Valdobiaddene        30
370       55876             Wine - Ruffino Chianti        15
371       55876            Yogurt - French Vanilla        15

[372 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         80849  Appetizer - Mini Egg Roll, Shrimp        21
1         80849                Arizona - Green Tea        21
2         80849                      Baking Powder        21


  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                    ProductName  Quantity
0         82440      Appetizer - Sausage Rolls        21
1         82440               Apricots - Dried        21
2         82440              Apricots - Halves        21
3         82440                 Apricots Fresh        21
4         82440                        Bananas        21
..          ...                            ...       ...
412       82146       Wine - Pinot Noir Latour        21
413       82146  Wine - Prosecco Valdobiaddene        21
414       82146    Wine - White, Colubia Cresh        21
415       82146  Wine - Wyndham Estate Bin 777        21
416       82146        Yogurt - French Vanilla        21

[417 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         55124          Anchovy Paste - 56 G Tube        14
1         55124  Bar - Granola Trail Mix Fruit Nut        14
2         55124                        Beans - Wax        14
3         55124                 Beef - Top Sirlo

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         16139  Appetizer - Mini Egg Roll, Shrimp         5
1         16139                Beef - Inside Round         5
2         16139                    Beef Wellington         5
3         16139  Beer - Alexander Kieths, Pale Ale         5
4         16139                        Beer - Blue         5
..          ...                                ...       ...
408        6185         Wine - Two Oceans Cabernet         2
409        6185      Wine - Vidal Icewine Magnotta         2
410        6185           Wine - White Cab Sauv.on         2
411        6185   Wine - White, Schroder And Schyl         2
412        6185            Yogurt - French Vanilla         2

[413 rows x 3 columns]
     CustomerID                     ProductName  Quantity
0         92492       Appetizer - Mushroom Tart        24
1         92492               Apricots - Halves        24
2         92492      Bacardi Breezer - Tropical        24
3         92

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                       ProductName  Quantity
0         85496                    Apricots Fresh        22
1         85496              Beef - Ground Medium        22
2         85496  Beef - Ground, Extra Lean, Fresh        22
3         85496    Beef - Montreal Smoked Brisket        22
4         85496         Beef - Texas Style Burger        22
..          ...                               ...       ...
377        1072           Wine - Chardonnay South         2
378        1072        Wine - Gato Negro Cabernet         1
379        1072   Wine - Red, Harrow Estates, Cab         1
380        1072       Wine - White, Colubia Cresh         1
381        1072              Yeast Dry - Fermipan         1

[382 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         97282          Appetizer - Mushroom Tart        25
1         97282                      Bagel - Plain        25
2         97282                    Banana - Leaves        25
3         97

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         51939                     Apricots Fresh        14
1         51939              Bandage - Fexible 1x3        14
2         51939  Bar - Granola Trail Mix Fruit Nut        14
3         51939                         Barramundi        14
4         51939                 Beef - Rib Eye Aaa        14
..          ...                                ...       ...
401       22454      Wine - Vidal Icewine Magnotta         6
402       22454  Wine - Vineland Estate Semi - Dry         6
403       22454           Wine - White, Mosel Gold         6
404       22454                    Wonton Wrappers         6
405       22454                  Zucchini - Yellow         6

[406 rows x 3 columns]
     CustomerID                       ProductName  Quantity
0         97029              Beans - Kidney White        25
1         97029               Beef - Striploin Aa        25
2         97029                Beef Ground Medium        25
3   

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                  ProductName  Quantity
0          3074            Apricots - Halves         1
1          3074               Apricots Fresh         1
2          3074       Artichokes - Jerusalem         1
3          3074        Bandage - Fexible 1x3         1
4          3074      Bandage - Flexible Neon         1
..          ...                          ...       ...
384       20901   Wine - Chablis 2003 Champs        12
385       20901       Wine - Ruffino Chianti         6
386       20901     Wine - White Cab Sauv.on         6
387       20901  Wine - White, Colubia Cresh         6
388       20901      Yogurt - French Vanilla         6

[389 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         21973                Arizona - Green Tea         6
1         21973             Artichokes - Jerusalem         6
2         21973                      Baking Powder         6
3         21973  Bar - Granola Trail Mix Fruit Nut         6
4         2

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0          2939                     Apricots Fresh         1
1          2939  Bar - Granola Trail Mix Fruit Nut         1
2          2939             Beans - Kidney, Canned         1
3          2939                 Beef Ground Medium         1
4          2939     Bread - Italian Corn Meal Poly         1
..          ...                                ...       ...
351       77651            Wine - Chardonnay South        20
352       77651         Wine - Magnotta - Belpaese        20
353       77651                Wine - Red, Cooking        20
354       77651    Wine - Red, Harrow Estates, Cab        20
355       77651                      Yoghurt Tubes        20

[356 rows x 3 columns]
     CustomerID                       ProductName  Quantity
0         97029              Beans - Kidney White        25
1         97029               Beef - Striploin Aa        25
2         97029                Beef Ground Medium        25
3   

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                       ProductName  Quantity
0         18680         Anchovy Paste - 56 G Tube         5
1         18680         Appetizer - Mushroom Tart         5
2         18680                  Apricots - Dried         5
3         18680                    Apricots Fresh         5
4         18680               Arizona - Green Tea         5
..          ...                               ...       ...
404       96524        Water - Spring Water 500ml        25
405       96524     Wine - Blue Nun Qualitatswein        25
406       96524           Wine - Redchard Merritt        25
407       96524        Wine - Two Oceans Cabernet        25
408       96524  Wine - White, Schroder And Schyl        25

[409 rows x 3 columns]
     CustomerID                       ProductName  Quantity
0         87228                     Bagel - Plain        23
1         87228              Beans - Kidney White        23
2         87228                       Beans - Wax        23
3         87228 

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         95314  Appetizer - Mini Egg Roll, Shrimp        25
1         95314                      Bagel - Plain        25
2         95314               Beef - Ground Medium        50
3         95314           Beef - Top Sirloin - Aaa        25
4         95314                 Beer - Labatt Blue        25
..          ...                                ...       ...
399       72330         Wine - Magnotta - Belpaese        19
400       72330         Wine - Sogrape Mateus Rose        19
401       72330         Wine - Two Oceans Cabernet        19
402       72330           Wine - White Cab Sauv.on        19
403       72330      Wine - Wyndham Estate Bin 777        19

[404 rows x 3 columns]
     CustomerID                   ProductName  Quantity
0         86786           Beef - Inside Round        22
1         86786          Beef - Prime Rib Aaa        22
2         86786            Beef Ground Medium        22
3         86786   Be

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         57497  Appetizer - Mini Egg Roll, Shrimp        30
1         57497                Beer - Rickards Red        15
2         57497          Beer - Sleemans Cream Ale        15
3         57497                       Blackberries        15
4         57497                        Blueberries        15
..          ...                                ...       ...
371        2686         Wine - Two Oceans Cabernet         1
372        2686  Wine - Vineland Estate Semi - Dry         1
373        2686           Wine - White, Mosel Gold         1
374        2686      Wine - Wyndham Estate Bin 777         1
375        2686         Yogurt - Blueberry, 175 Gr         1

[376 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         85161          Appetizer - Mushroom Tart        22
1         85161          Appetizer - Sausage Rolls        22
2         85161             Beef - Chuck, Boneless        44


  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                        ProductName  Quantity
0         31075      Bar Mix - Pina Colada, 355 Ml        24
1         31075             Beef - Chuck, Boneless         8
2         31075                Beef - Inside Round         8
3         31075  Beer - Alexander Kieths, Pale Ale         8
4         31075          Beer - Sleemans Cream Ale        16
..          ...                                ...       ...
363       43345       Wine - Alsace Gewurztraminer        11
364       43345      Wine - Blue Nun Qualitatswein        11
365       43345     Wine - Magnotta, Merlot Sr Vqa        11
366       43345             Wine - Ruffino Chianti        11
367       43345               Yeast Dry - Fermipan        11

[368 rows x 3 columns]
     CustomerID                       ProductName  Quantity
0         13823         Appetizer - Mushroom Tart         4
1         13823                    Apricots Fresh         4
2         13823        Bacardi Breezer - Tropical         8
3   

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_inde

     CustomerID                      ProductName  Quantity
0         42087        Appetizer - Mushroom Tart        11
1         42087                Apricots - Halves        11
2         42087              Arizona - Green Tea        22
3         42087           Artichokes - Jerusalem        11
4         42087                  Banana - Leaves        11
..          ...                              ...       ...
382       39110    Wine - Ej Gallo Sierra Valley        10
383       39110       Wine - Gato Negro Cabernet        10
384       39110       Wine - Red, Colio Cabernet        10
385       39110  Wine - Red, Harrow Estates, Cab        20
386       39110    Wine - Vidal Icewine Magnotta        10

[387 rows x 3 columns]
     CustomerID                 ProductName  Quantity
0         46250         Arizona - Green Tea        12
1         46250      Artichokes - Jerusalem        12
2         46250           Assorted Desserts        12
3         46250       Bandage - Fexible 1x3        1

     CustomerID                     ProductName  Quantity
0         25994               Assorted Desserts         7
1         25994                   Bagel - Plain         7
2         25994          Beans - Kidney, Canned         7
3         25994                     Beans - Wax         7
4         25994  Beef - Montreal Smoked Brisket         7
..          ...                             ...       ...
383       86786  Wine - Magnotta, Merlot Sr Vqa        22
384       86786      Wine - Red, Colio Cabernet        22
385       86786      Wine - Sogrape Mateus Rose        22
386       86786             Wine - Toasted Head        44
387       86786     Wine - White, Colubia Cresh        22

[388 rows x 3 columns]
     CustomerID                        ProductName  Quantity
0         94547                     Apricots Fresh        24
1         94547             Artichokes - Jerusalem        24
2         94547               Beef - Ground Medium        24
3         94547                  Bee

  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)
  top5 = top5.append(step1[step1['CustomerID'] == i],ignore_index=True)


{}