# Recommender Systems Deep-Dive Lab

There are many approaches to building recommender systems. In the Intro to Recommender Systems lesson and lab, we built a user-similarity-based recommender that first calculated the similarities between users and then applied a rank-based item recommender within each group of similar customers. In other words, for a given user, our recommender found the 5 most similar customers, aggregated and ranked their purchases, and then recommended the 5 most popular products among that group to the user.

In this lab, we are going to start out with the same data set, but we are going to dive deeper into the analysis of customers and products and look at an alternative way to generate recommendations.

We will begin by importing everything we will need for this lab (libraries, data set, etc.).

In [1]:
import pandas as pd
import random

from scipy.spatial.distance import pdist, squareform

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
data = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/LABS/Módulo 2/lab-recommender-deepdive-master/data/customer_product_sales.csv')

## Data Preparation

Next, we will put together the foundational transformations of the data that we will need to eventually produce recommendations. The steps in this section should be familiar to you, as you would have had to transform the data in this manner to create the user-similarity-based recommender in the Intro to Recommender Systems lab.

First, we will create a data frame that contains the total quantity of each product purchased by each customer.

In [4]:
customer_products = data.groupby(['CustomerID', 'ProductName']).agg({'Quantity':'sum'}).reset_index()

In [5]:
customer_products

,CustomerID,ProductName,Quantity
0,33,Apricots - Dried,1
1,33,Assorted Desserts,1
2,33,Bandage - Flexible Neon,1
3,33,"Bar Mix - Pina Colada, 355 Ml",1
4,33,"Beans - Kidney, Canned",1
...,...,...,...
63623,98200,Vol Au Vents,50
63624,98200,Wasabi Powder,25
63625,98200,Wine - Fume Blanc Fetzer,25
63626,98200,Wine - Hardys Bankside Shiraz,25
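
To make the aggregation step concrete, here is a minimal sketch on a hypothetical toy data set. The rows and the `toy` and `toy_totals` names are made up for illustration; only the column names mirror the lab's data. It shows how repeated purchases of the same product by the same customer collapse into a single row holding the summed quantity.

```python
import pandas as pd

# Hypothetical toy transactions (not taken from the lab's CSV file).
toy = pd.DataFrame({
    'CustomerID': [1, 1, 1, 2, 2],
    'ProductName': ['Bananas', 'Bananas', 'Bay Leaf', 'Bananas', 'Barramundi'],
    'Quantity': [2, 3, 1, 4, 1],
})

# Same pattern as the lab: one row per (customer, product) pair,
# with the quantities of repeated purchases added together.
toy_totals = (toy.groupby(['CustomerID', 'ProductName'])
                 .agg({'Quantity': 'sum'})
                 .reset_index())

print(toy_totals)
```

Customer 1 bought Bananas twice (2 and 3 units), so the result contains a single Bananas row for that customer with a quantity of 5.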


Then, we want to create a matrix that has customers on one axis, products on the other, and the quantity purchased as the values. There will be many instances where a customer has not purchased a product, which by default will be expressed with a null value. We will want to replace those nulls with zeros by appending `.fillna(0)` to our pivot table.

In [6]:
prod_cust_pivot = customer_products.pivot_table(values='Quantity', 
                                                columns='CustomerID', 
                                                index='ProductName', 
                                                aggfunc='sum').fillna(0)



In [7]:
prod_cust_pivot.head()

CustomerID,33,200,264,356,412,464,477,639,649,669,694,756,883,891,1008,1034,1066,1072,1336,1428,1435,1534,1577,1594,1754,1839,1920,2187,2329,2503,2556,2566,2582,2617,2686,2754,2776,2902,2915,2939,...,94438,94547,94599,94910,94951,95017,95034,95059,95078,95121,95314,95372,95819,96024,96088,96272,96522,96524,96560,96615,96666,96684,97029,97052,97063,97093,97201,97282,97324,97495,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
ProductName
Anchovy Paste - 56 G Tube,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Appetizer - Mini Egg Roll, Shrimp",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Appetizer - Mushroom Tart,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,25.0,0.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,25.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0
Appetizer - Sausage Rolls,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,24.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,25.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,25.0,0.0,25.0,0.0
Apricots - Dried,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,...,24.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,25.0,0.0,25.0,25.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
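
The following sketch, on a hypothetical toy table, illustrates why the nulls appear and how they are replaced. The data and variable names are assumptions for illustration. It also shows `pivot_table`'s `fill_value` argument as an alternative to appending `.fillna(0)`; both give the same values here, although the resulting dtypes may differ between pandas versions.

```python
import pandas as pd

# Hypothetical per-customer totals (not taken from the lab's CSV file).
toy_totals = pd.DataFrame({
    'CustomerID': [1, 1, 2, 2],
    'ProductName': ['Bananas', 'Bay Leaf', 'Bananas', 'Barramundi'],
    'Quantity': [5, 1, 4, 1],
})

# Without filling, product/customer pairs that never occurred are NaN.
toy_pivot = toy_totals.pivot_table(values='Quantity',
                                   columns='CustomerID',
                                   index='ProductName',
                                   aggfunc='sum')
print(toy_pivot.isna().sum().sum())  # number of missing pairs

# Option 1 (used in this lab): fill the nulls after pivoting.
filled = toy_pivot.fillna(0)

# Option 2: ask pivot_table to fill them directly.
filled_alt = toy_totals.pivot_table(values='Quantity',
                                    columns='CustomerID',
                                    index='ProductName',
                                    aggfunc='sum',
                                    fill_value=0)

print(filled)
```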


In the pivot table we created, the rows represent the products and the columns represent the customers. Depending on what we need to do with the matrix, we may instead need to transpose it so that the rows represent customers and the columns represent products. We can do this easily by appending `.T` to our product customer matrix.

In [8]:
cust_prod_pivot = prod_cust_pivot.T
cust_prod_pivot.head()

ProductName,Anchovy Paste - 56 G Tube,"Appetizer - Mini Egg Roll, Shrimp",Appetizer - Mushroom Tart,Appetizer - Sausage Rolls,Apricots - Dried,Apricots - Halves,Apricots Fresh,Arizona - Green Tea,Artichokes - Jerusalem,Assorted Desserts,Bacardi Breezer - Tropical,Bagel - Plain,Baking Powder,Banana - Leaves,Banana Turning,Bananas,Bandage - Fexible 1x3,Bandage - Flexible Neon,Bar - Granola Trail Mix Fruit Nut,"Bar Mix - Pina Colada, 355 Ml",Barramundi,Bay Leaf,Beans - Kidney White,"Beans - Kidney, Canned","Beans - Kidney, Red Dry",Beans - Wax,"Beef - Chuck, Boneless",Beef - Ground Medium,"Beef - Ground, Extra Lean, Fresh",Beef - Inside Round,Beef - Montreal Smoked Brisket,Beef - Prime Rib Aaa,Beef - Rib Eye Aaa,Beef - Short Loin,Beef - Striploin Aa,"Beef - Tenderlion, Center Cut",Beef - Texas Style Burger,Beef - Top Sirloin,Beef - Top Sirloin - Aaa,Beef Ground Medium,...,Whmis - Spray Bottle Trigger,Wiberg Super Cure,Wine - Alsace Gewurztraminer,Wine - Blue Nun Qualitatswein,"Wine - Cahors Ac 2000, Clos",Wine - Chablis 2003 Champs,Wine - Charddonnay Errazuriz,Wine - Chardonnay South,Wine - Crozes Hermitage E.,Wine - Ej Gallo Sierra Valley,Wine - Fume Blanc Fetzer,Wine - Gato Negro Cabernet,Wine - Hardys Bankside Shiraz,Wine - Magnotta - Belpaese,Wine - Magnotta - Cab Sauv,"Wine - Magnotta, Merlot Sr Vqa",Wine - Pinot Noir Latour,Wine - Prosecco Valdobiaddene,"Wine - Red, Colio Cabernet","Wine - Red, Cooking","Wine - Red, Harrow Estates, Cab",Wine - Redchard Merritt,Wine - Ruffino Chianti,Wine - Sogrape Mateus Rose,Wine - Toasted Head,Wine - Two Oceans Cabernet,Wine - Valpolicella Masi,Wine - Vidal Icewine Magnotta,Wine - Vineland Estate Semi - Dry,Wine - White Cab Sauv.on,"Wine - White, Colubia Cresh","Wine - White, Mosel Gold","Wine - White, Schroder And Schyl",Wine - Wyndham Estate Bin 777,Wonton Wrappers,Yeast Dry - Fermipan,Yoghurt Tubes,"Yogurt - Blueberry, 175 Gr",Yogurt - French Vanilla,Zucchini - Yellow
CustomerID
33,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
200,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
264,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0
356,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
412,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0


Another thing we may want to do is normalize the values across rows or columns of the matrix so that all the values are between 0 and 1. Doing this for customers would help us identify customers that may have purchased a similar mix of products even though some of those customers may have purchased large quantities while others may have purchased smaller quantities. Doing this for products would help us better identify products that have been purchased by similar groups of customers regardless of the quantities purchased.

We can normalize across rows for each matrix by dividing every value by its row total, as follows.

In [9]:
prod_cust_pivot = prod_cust_pivot.div(prod_cust_pivot.sum(axis=1), axis=0)
cust_prod_pivot = cust_prod_pivot.div(cust_prod_pivot.sum(axis=1), axis=0)
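
As a small illustration of what this row normalization does, the sketch below uses a hypothetical two-customer matrix (the values and the `toy` and `shares` names are made up). Customer 2 buys 25 times as much as customer 1 but in the same proportions, so after dividing by the row totals the two rows become identical.

```python
import pandas as pd

# Hypothetical customer-by-product quantities (not taken from the lab's data).
toy = pd.DataFrame(
    {'Bananas': [1, 25], 'Bay Leaf': [1, 25], 'Barramundi': [2, 50]},
    index=pd.Index([1, 2], name='CustomerID'),
)

# Divide every value by its row total, so each row becomes a set of shares.
# sum(axis=1) gives one total per row; axis=0 in div aligns those totals
# with the row index.
shares = toy.div(toy.sum(axis=1), axis=0)

print(shares)
print(shares.sum(axis=1))  # each row now sums to 1
```

Note that a row whose total is zero would produce NaN values under this approach, so it is worth checking for empty rows before normalizing.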

In [10]:
prod_cust_pivot

CustomerID,33,200,264,356,412,464,477,639,649,669,694,756,883,891,1008,1034,1066,1072,1336,1428,1435,1534,1577,1594,1754,1839,1920,2187,2329,2503,2556,2566,2582,2617,2686,2754,2776,2902,2915,2939,...,94438,94547,94599,94910,94951,95017,95034,95059,95078,95121,95314,95372,95819,96024,96088,96272,96522,96524,96560,96615,96666,96684,97029,97052,97063,97093,97201,97282,97324,97495,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
ProductName
Anchovy Paste - 56 G Tube,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000521,0.000000,0.000000,0.000000,0.0,0.000521,0.000521,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000521,0.000000,0.000000,0.000000,0.000000,0.000000,0.000521,0.000000,0.0,...,0.000000,0.0,0.000000,0.000000,0.013034,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.013034,0.013034,0.0,0.000000,0.0,0.000000,0.013034,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.013034,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
"Appetizer - Mini Egg Roll, Shrimp",0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000449,0.000000,0.000000,0.000000,0.000000,0.000449,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000449,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000449,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.011221,0.011221,0.000000,0.000000,0.011221,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.011221,0.011221,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
Appetizer - Mushroom Tart,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000532,0.000000,0.000000,0.000532,0.0,0.000000,0.000532,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000532,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000532,0.0,0.000000,0.000000,0.000532,0.000000,0.000000,0.000000,0.000000,0.000532,0.0,...,0.000000,0.0,0.000000,0.013298,0.000000,0.013298,0.013298,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.013298,0.000000,0.000000,0.000000,0.013298,0.000000,0.000000,0.013298,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.013298,0.000000
Appetizer - Sausage Rolls,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000575,0.000000,0.000575,0.000575,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000575,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.013809,0.000000,0.000000,0.000000,0.000000,0.014384,0.000000,0.0,0.000000,0.000000,0.014384,0.000000,0.000000,0.014384,0.014384,0.014384,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.014384,0.014384,0.014384,0.000000,0.014384,0.000000
Apricots - Dried,0.000491,0.000000,0.0,0.000000,0.000491,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000491,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000491,0.000000,0.000000,0.000491,0.000491,0.000000,0.000000,0.0,...,0.011794,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.012285,0.0,0.000000,0.012285,0.000000,0.012285,0.012285,0.012285,0.012285,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.012285,0.000000,0.012285,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Yeast Dry - Fermipan,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000595,0.000000,0.000000,0.000000,0.000000,0.000595,0.000000,0.000595,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000595,0.000595,0.000595,0.000000,0.0,...,0.000000,0.0,0.000000,0.014881,0.014881,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.014881,0.0,0.000000,0.0,0.000000,0.014881,0.000000,0.014881,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.014881,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
Yoghurt Tubes,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000969,0.000000,0.000000,0.0,0.000000,0.000000,0.000484,0.000484,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000484,0.000000,0.0,0.000000,0.0,0.000000,0.000484,0.000484,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.012112,0.012112,0.0,0.000000,0.000000,0.012112,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.012112,0.012112,0.012112,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.012112,0.000000,0.000000
"Yogurt - Blueberry, 175 Gr",0.000000,0.000478,0.0,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000478,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000478,0.0,0.000478,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000478,0.000000,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.011939,0.000000,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.011939,0.000000,0.000000,0.011939,0.011939,0.000000,0.000000,0.011939,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
Yogurt - French Vanilla,0.000457,0.000000,0.0,0.000457,0.000000,0.0,0.000914,0.000000,0.000000,0.000457,0.000000,0.0,0.000000,0.000000,0.000000,0.000457,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000457,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.011431,0.000000,0.011431,0.000000,0.000000,0.000000,0.000000,0.0,0.011431,0.0,0.011431,0.000000,0.011431,0.000000,0.000000,0.011431,0.011431,0.000000,0.000000,0.000000,0.011431,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.011431


In [11]:
cust_prod_pivot

ProductName,Anchovy Paste - 56 G Tube,"Appetizer - Mini Egg Roll, Shrimp",Appetizer - Mushroom Tart,Appetizer - Sausage Rolls,Apricots - Dried,Apricots - Halves,Apricots Fresh,Arizona - Green Tea,Artichokes - Jerusalem,Assorted Desserts,Bacardi Breezer - Tropical,Bagel - Plain,Baking Powder,Banana - Leaves,Banana Turning,Bananas,Bandage - Fexible 1x3,Bandage - Flexible Neon,Bar - Granola Trail Mix Fruit Nut,"Bar Mix - Pina Colada, 355 Ml",Barramundi,Bay Leaf,Beans - Kidney White,"Beans - Kidney, Canned","Beans - Kidney, Red Dry",Beans - Wax,"Beef - Chuck, Boneless",Beef - Ground Medium,"Beef - Ground, Extra Lean, Fresh",Beef - Inside Round,Beef - Montreal Smoked Brisket,Beef - Prime Rib Aaa,Beef - Rib Eye Aaa,Beef - Short Loin,Beef - Striploin Aa,"Beef - Tenderlion, Center Cut",Beef - Texas Style Burger,Beef - Top Sirloin,Beef - Top Sirloin - Aaa,Beef Ground Medium,...,Whmis - Spray Bottle Trigger,Wiberg Super Cure,Wine - Alsace Gewurztraminer,Wine - Blue Nun Qualitatswein,"Wine - Cahors Ac 2000, Clos",Wine - Chablis 2003 Champs,Wine - Charddonnay Errazuriz,Wine - Chardonnay South,Wine - Crozes Hermitage E.,Wine - Ej Gallo Sierra Valley,Wine - Fume Blanc Fetzer,Wine - Gato Negro Cabernet,Wine - Hardys Bankside Shiraz,Wine - Magnotta - Belpaese,Wine - Magnotta - Cab Sauv,"Wine - Magnotta, Merlot Sr Vqa",Wine - Pinot Noir Latour,Wine - Prosecco Valdobiaddene,"Wine - Red, Colio Cabernet","Wine - Red, Cooking","Wine - Red, Harrow Estates, Cab",Wine - Redchard Merritt,Wine - Ruffino Chianti,Wine - Sogrape Mateus Rose,Wine - Toasted Head,Wine - Two Oceans Cabernet,Wine - Valpolicella Masi,Wine - Vidal Icewine Magnotta,Wine - Vineland Estate Semi - Dry,Wine - White Cab Sauv.on,"Wine - White, Colubia Cresh","Wine - White, Mosel Gold","Wine - White, Schroder And Schyl",Wine - Wyndham Estate Bin 777,Wonton Wrappers,Yeast Dry - Fermipan,Yoghurt Tubes,"Yogurt - Blueberry, 175 Gr",Yogurt - French Vanilla,Zucchini - Yellow
CustomerID
33,0.0,0.0,0.000000,0.000000,0.015873,0.000000,0.000000,0.000000,0.0,0.015873,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.015873,0.000000,0.015873,0.000000,0.000000,0.000000,0.015873,0.000000,0.000000,0.015873,0.000000,0.000000,0.000000,0.000000,0.015873,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,...,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.015873,0.000000,0.000000,0.015873,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.015873,0.0,0.015873,0.015873,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.015873,0.0
200,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.012987,0.012987,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.012987,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.012987,0.000000,0.000000,0.012987,0.000000,0.000000,0.000000,0.000000,0.000000,0.012987,0.0,0.0,0.012987,...,0.000000,0.012987,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.012987,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.012987,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.012987,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.012987,0.000000,0.0
264,0.0,0.0,0.000000,0.000000,0.000000,0.015385,0.015385,0.000000,0.0,0.000000,0.015385,0.015385,0.000000,0.015385,0.000000,0.015385,0.0,0.015385,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.015385,0.000000,0.0,0.0,0.000000,...,0.000000,0.000000,0.0,0.015385,0.000000,0.015385,0.000000,0.015385,0.000000,0.000000,0.000000,0.015385,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.015385,0.000000,0.015385,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.015385,0.0,0.0,0.000000,0.000000,0.000000,0.0
356,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.014925,0.000000,0.000000,0.000000,0.0,0.000000,0.014925,0.014925,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.014925,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,...,0.000000,0.000000,0.0,0.014925,0.014925,0.000000,0.000000,0.000000,0.000000,0.014925,0.000000,0.000000,0.000000,0.014925,0.000000,0.000000,0.000000,0.0,0.0,0.014925,0.0,0.000000,0.000000,0.000000,0.0,0.014925,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.014925,0.0
412,0.0,0.0,0.000000,0.000000,0.013699,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.013699,0.000000,0.013699,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.013699,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.013699,0.013699,0.000000,0.000000,0.000000,0.0,0.0,0.000000,...,0.013699,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.013699,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.013699,0.000000,0.000000,0.0,0.0,0.000000,0.013699,0.013699,0.013699,0.0,0.0,0.000000,0.000000,0.000000,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
97928,0.0,0.0,0.000000,0.014706,0.000000,0.029412,0.000000,0.014706,0.0,0.000000,0.014706,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.014706,0.000000,0.000000,0.014706,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.014706,0.000000,0.000000,0.0,0.0,0.000000,...,0.000000,0.000000,0.0,0.000000,0.000000,0.014706,0.000000,0.000000,0.000000,0.014706,0.000000,0.000000,0.000000,0.000000,0.014706,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.014706,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.014706,0.014706,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0
98069,0.0,0.0,0.000000,0.014286,0.000000,0.014286,0.000000,0.000000,0.0,0.014286,0.000000,0.014286,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.014286,0.000000,0.000000,0.000000,0.014286,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.014286,0.000000,0.0,0.0,0.000000,...,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.014286,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.014286,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.0
98159,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.014286,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.028571,0.014286,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,...,0.000000,0.014286,0.0,0.000000,0.000000,0.000000,0.014286,0.014286,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.014286,0.0,0.0,0.000000,0.028571,0.000000,0.000000,0.0,0.0,0.014286,0.000000,0.000000,0.0
98185,0.0,0.0,0.012195,0.012195,0.000000,0.012195,0.000000,0.000000,0.0,0.000000,0.000000,0.024390,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.012195,0.000000,0.012195,0.000000,0.000000,0.012195,0.000000,0.012195,0.000000,0.012195,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,...,0.024390,0.000000,0.0,0.012195,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.012195,0.012195,0.0,0.0,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.000000,0.012195,0.0,0.0,0.000000,0.000000,0.000000,0.0


## User Similarity Based Recommendations

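The next step in creating recommendations is calculating similarities. For our user-similarity-based recommender, we calculated them between customers. We compute the Euclidean distance between every pair of customers and convert it to a similarity score with 1 / (1 + distance), so that identical customers score 1 and the score shrinks toward 0 as the distance grows.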
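
A minimal sketch of this calculation on a hypothetical three-customer matrix is shown below. The data and the `toy`, `dist`, `sim`, and `neighbours` names are assumptions for illustration. It builds the pairwise Euclidean distance matrix with `pdist` and `squareform`, converts it to similarities with 1 / (1 + distance), and ranks the other customers by similarity to customer 1.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Hypothetical normalized purchase shares (each row sums to 1).
toy = pd.DataFrame(
    [[0.5, 0.5, 0.0],
     [0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0]],
    index=pd.Index([1, 2, 3], name='CustomerID'),
    columns=['Bananas', 'Bay Leaf', 'Barramundi'],
)

# pdist returns the condensed pairwise distances between rows;
# squareform expands them into a full symmetric matrix.
dist = squareform(pdist(toy, 'euclidean'))

# A distance of 0 maps to a similarity of 1; larger distances map toward 0.
sim = pd.DataFrame(1 / (1 + dist), index=toy.index, columns=toy.index)

# Most similar customers to customer 1, excluding customer 1 itself.
neighbours = sim.loc[1].drop(1).sort_values(ascending=False)
print(neighbours)
```

Customers 1 and 2 have identical rows, so their similarity is exactly 1, while customer 3 shares no products with them and gets a lower score.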

In [12]:
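# Despite its name, cust_dist holds similarity scores, not distances:
# each entry is 1 / (1 + Euclidean distance between two customers).
cust_dist = pd.DataFrame(1/(1 + squareform(pdist(cust_prod_pivot, 'euclidean'))),
                         index=cust_prod_pivot.index, columns=cust_prod_pivot.index)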

cust_dist.head()

CustomerID,33,200,264,356,412,464,477,639,649,669,694,756,883,891,1008,1034,1066,1072,1336,1428,1435,1534,1577,1594,1754,1839,1920,2187,2329,2503,2556,2566,2582,2617,2686,2754,2776,2902,2915,2939,...,94438,94547,94599,94910,94951,95017,95034,95059,95078,95121,95314,95372,95819,96024,96088,96272,96522,96524,96560,96615,96666,96684,97029,97052,97063,97093,97201,97282,97324,97495,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
33,1.0,0.854082,0.859145,0.852783,0.856118,0.853239,0.849549,0.848801,0.851224,0.849892,0.855133,0.861119,0.841662,0.847094,0.849458,0.862064,0.850386,0.839939,0.857901,0.846442,0.85925,0.854618,0.861239,0.853198,0.854228,0.856465,0.840073,0.857805,0.852028,0.840813,0.848771,0.851798,0.846383,0.855871,0.862313,0.85654,0.851806,0.848902,0.847455,0.847293,...,0.8546,0.844065,0.847271,0.845196,0.862347,0.848284,0.854899,0.857,0.847917,0.856952,0.860661,0.853833,0.858091,0.853859,0.856508,0.854656,0.842193,0.8479,0.857975,0.851315,0.855022,0.851366,0.844645,0.854262,0.851279,0.854425,0.849289,0.859051,0.853399,0.850335,0.85513,0.846812,0.850301,0.841254,0.852221,0.846485,0.852848,0.847637,0.856465,0.851614
200,0.854082,1.0,0.857935,0.8559,0.856445,0.853129,0.863562,0.850521,0.857381,0.856621,0.860408,0.860375,0.850292,0.852987,0.856865,0.863821,0.857986,0.851578,0.858058,0.852979,0.864745,0.857129,0.860474,0.860729,0.858953,0.866773,0.853532,0.859945,0.860794,0.844753,0.851954,0.853669,0.854987,0.862149,0.868276,0.863472,0.855853,0.856086,0.853193,0.857091,...,0.865633,0.854565,0.859284,0.851551,0.873248,0.855933,0.855318,0.866739,0.854228,0.862377,0.863339,0.859561,0.862516,0.863864,0.857098,0.853656,0.855266,0.86235,0.869875,0.857962,0.850193,0.860464,0.85295,0.853964,0.856413,0.851111,0.85264,0.868435,0.863028,0.85532,0.864397,0.856014,0.862447,0.854629,0.85667,0.854046,0.86899,0.857312,0.861508,0.857184
264,0.859145,0.857935,1.0,0.851801,0.86005,0.851262,0.855847,0.846677,0.854206,0.848997,0.859012,0.861119,0.85173,0.849912,0.858068,0.865069,0.852338,0.845323,0.853472,0.85023,0.859751,0.85562,0.857294,0.860751,0.852296,0.858626,0.843966,0.855982,0.851467,0.836912,0.847693,0.852866,0.852004,0.865256,0.85796,0.862899,0.851747,0.855636,0.853996,0.85024,...,0.860293,0.847094,0.856205,0.845189,0.864464,0.852866,0.855594,0.861009,0.849904,0.854228,0.853143,0.851747,0.856463,0.856913,0.852971,0.851842,0.8528,0.852547,0.857617,0.847497,0.84684,0.857117,0.85173,0.851262,0.854211,0.851817,0.855427,0.856463,0.862149,0.847192,0.861268,0.850766,0.854178,0.848507,0.858909,0.860153,0.859773,0.854246,0.864699,0.853668
356,0.852783,0.8559,0.851801,1.0,0.852571,0.849431,0.853831,0.851902,0.847607,0.846328,0.855159,0.852641,0.851802,0.843032,0.855883,0.8593,0.846864,0.840125,0.855998,0.848432,0.8593,0.855533,0.852562,0.850688,0.851303,0.863255,0.841165,0.856659,0.854774,0.840859,0.850683,0.847646,0.848302,0.859795,0.863838,0.857417,0.85079,0.851769,0.848353,0.847458,...,0.853014,0.850533,0.851304,0.84262,0.86131,0.853458,0.849131,0.85619,0.847211,0.851413,0.85843,0.850819,0.857159,0.856745,0.850233,0.850871,0.845453,0.852366,0.8657,0.848487,0.844065,0.854233,0.846828,0.850356,0.852217,0.849857,0.848345,0.864556,0.858373,0.849319,0.857171,0.848015,0.852222,0.847706,0.851413,0.851289,0.851219,0.85226,0.860721,0.854377
412,0.856118,0.856445,0.86005,0.852571,1.0,0.855755,0.858976,0.857221,0.854857,0.855991,0.85882,0.861384,0.855069,0.854763,0.858162,0.860656,0.851347,0.845601,0.862253,0.852899,0.864687,0.861829,0.857152,0.858431,0.855685,0.865269,0.840635,0.860112,0.860077,0.840926,0.853023,0.854414,0.852622,0.859584,0.863766,0.857457,0.85158,0.851109,0.850055,0.85281,...,0.864225,0.857996,0.859531,0.849856,0.865043,0.856823,0.853614,0.86885,0.854305,0.854228,0.863652,0.85346,0.860043,0.867648,0.852315,0.860605,0.855843,0.858431,0.86518,0.85203,0.849441,0.863131,0.849425,0.856661,0.854814,0.855184,0.854648,0.862602,0.859199,0.856565,0.859117,0.862847,0.854943,0.850044,0.858385,0.856629,0.855052,0.856608,0.865269,0.862253
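
The scores in this matrix come from the transform 1 / (1 + d), where d is the Euclidean distance between two customers' rows. As a minimal illustration on made-up vectors (the helper name `euclidean_similarity` is hypothetical and not part of the lab), a distance of 0 maps to a similarity of 1, and the score shrinks toward 0 as the rows move apart:

```python
import math

def euclidean_similarity(a, b):
    """Map the Euclidean distance d between a and b to 1 / (1 + d)."""
    return 1 / (1 + math.dist(a, b))

# Toy vectors, made up for illustration only.
identical = euclidean_similarity([0.5, 0.5], [0.5, 0.5])  # distance 0
apart = euclidean_similarity([0.0, 0.0], [3.0, 4.0])      # distance 5
```

Since the rows of `cust_prod_pivot` are purchase shares that sum to 1, distances between customers stay small, which likely explains why the scores in the table above cluster around 0.85.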


Once we had our similarity matrix, we could produce recommendations for each user and package them all into a data frame.

In [13]:
recommendations = {}
customers = list(customer_products['CustomerID'].unique())

for customer in customers:
    similar_cust = list(cust_dist[customer].sort_values(ascending=False)[1:].head().index)
    sim_cust_prod = customer_products[customer_products['CustomerID'].isin(similar_cust)]
    grouped = sim_cust_prod.groupby('ProductName').agg({'Quantity':'sum'})
    ranked_products = grouped.sort_values('Quantity', ascending=False).reset_index()
    
    merged = pd.merge(ranked_products, pd.DataFrame(cust_prod_pivot.T[customer]), on='ProductName')
    merged.columns = ['ProductName', 'Quantity', 'Purchased']
    recs = merged[merged['Purchased']==0].head()
    recommendations[customer] = list(recs['ProductName'])

user_recs = pd.DataFrame.from_dict(recommendations, orient='index').reset_index()
user_recs.columns = ['CustomerID', 'Rec1', 'Rec2', 'Rec3', 'Rec4', 'Rec5']
user_recs.head()

Unnamed: 0,CustomerID,Rec1,Rec2,Rec3,Rec4,Rec5
0,33,Wine - Redchard Merritt,Bread - Calabrese Baguette,"Thyme - Lemon, Fresh",Milk Powder,Ecolab - Lime - A - Way 4/4 L
1,200,Sauce - Demi Glace,General Purpose Trigger,Cookie Chocolate Chip With,Chef Hat 20cm,Pasta - Angel Hair
2,264,Ezy Change Mophandle,Eggplant - Asian,Scallops - 10/20,Cinnamon Buns Sticky,Wine - Ej Gallo Sierra Valley
3,356,Tea - Herbal Sweet Dreams,Curry Paste - Madras,Tea - English Breakfast,Juice - Orange,Ecolab - Lime - A - Way 4/4 L
4,412,Cake - Box Window 10x10x2.5,Beef - Montreal Smoked Brisket,Bread - Raisin Walnut Oval,"Mushroom - Trumpet, Dry",Cheese - Mix


## Deeper Dive Into Our User Similarity Recommendations

Let's deconstruct what we've done and look more closely at how it fits together. Doing this will give us what we need to build an item similarity based recommender in the next section.

After creating an empty dictionary to store our recommendations and getting a unique list of customer IDs to iterate through, we first identify the 5 customers most similar to the customer we are generating recommendations for. Let's plug in customer ID 33 and see what results we get.

In [14]:
similar_cust = list(cust_dist[33].sort_values(ascending=False)[1:].head().index)
similar_cust

[60862, 27672, 6001, 79458, 33759]

What we get is a list containing the 5 customer IDs of the customers whose purchase behavior is most similar to customer 33. We then go back to our customer_products data frame and select just the purchases where the customer ID is in our list of similar customers. We aggregate on product name, summing up the total quantity purchased of each product by all 5 similar customers, and then we rank them by sorting in descending order by the total quantity.

In [15]:
sim_cust_prod = customer_products[customer_products['CustomerID'].isin(similar_cust)]
grouped = sim_cust_prod.groupby('ProductName').agg({'Quantity':'sum'})
ranked_products = grouped.sort_values('Quantity', ascending=False).reset_index()
ranked_products.head()

Unnamed: 0,ProductName,Quantity
0,Wine - Redchard Merritt,59
1,Cassis,58
2,Bread - Calabrese Baguette,45
3,Wine - Crozes Hermitage E.,45
4,"Thyme - Lemon, Fresh",42


We now have a ranked list of products that similar customers have purchased, but we haven't yet taken into consideration whether our target customer has already purchased any of those items. We want to recommend items that they might like but haven't purchased before. So we will merge the list of ranked products with our target customer's purchase list and keep only the records for items that the customer has not purchased. These will be the items that we recommend to the customer.

In [16]:
merged = pd.merge(ranked_products, pd.DataFrame(cust_prod_pivot.T[33]), on='ProductName')
merged.columns = ['ProductName', 'Quantity', 'Purchased']
recs = merged[merged['Purchased']==0].head()
recs

Unnamed: 0,ProductName,Quantity,Purchased
0,Wine - Redchard Merritt,59,0.0
2,Bread - Calabrese Baguette,45,0.0
4,"Thyme - Lemon, Fresh",42,0.0
9,Milk Powder,39,0.0
10,Ecolab - Lime - A - Way 4/4 L,39,0.0
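
The three steps above (find similar customers, rank their purchases, drop what the target already owns) can be wrapped into one function. The sketch below is an illustration on made-up data, not the lab's code: the function name `recommend_for_customer` and the toy frames are assumptions. It also drops the target customer explicitly instead of relying on the `[1:]` slice, which assumes the customer's own entry always sorts first.

```python
import pandas as pd

def recommend_for_customer(customer, cust_sim, purchases, n_similar=5, n_recs=5):
    """Rank products bought by the most similar customers, skipping
    anything the target customer has already purchased."""
    # Drop the customer explicitly rather than relying on sort order.
    similar = cust_sim[customer].drop(customer).nlargest(n_similar).index
    # Total quantity of each product across the similar customers.
    ranked = (purchases[purchases['CustomerID'].isin(similar)]
              .groupby('ProductName')['Quantity'].sum()
              .sort_values(ascending=False))
    # Products the target customer already owns.
    owned = set(purchases.loc[purchases['CustomerID'] == customer, 'ProductName'])
    return [p for p in ranked.index if p not in owned][:n_recs]

# Toy data (made up): three customers and a small similarity matrix.
toy_purchases = pd.DataFrame({
    'CustomerID': [1, 1, 2, 2, 3],
    'ProductName': ['Tea', 'Milk', 'Tea', 'Bread', 'Jam'],
    'Quantity': [2, 1, 1, 4, 3]})
toy_sim = pd.DataFrame([[1.0, 0.9, 0.2], [0.9, 1.0, 0.3], [0.2, 0.3, 1.0]],
                       index=[1, 2, 3], columns=[1, 2, 3])

toy_recs = recommend_for_customer(1, toy_sim, toy_purchases, n_similar=1)
```

With the lab's objects, the call would presumably look like `recommend_for_customer(33, cust_dist, customer_products)`.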


## Item Similarity Based Recommendations

In this section, you will create an item similarity based recommender system in a step-by-step fashion. Whereas our user similarity based recommender leveraged similarities between customers, this recommender will utilize similarities between products. You already have all the tools in your toolbox, so follow each of the steps below to complete this lab.

### Step 1: Create a product distance matrix.

In [17]:
products_customer = data.groupby(['ProductName','CustomerID']).agg({'Quantity':'sum'}).reset_index().sort_values('Quantity', ascending=False).reset_index(drop=True)
products_customer

Unnamed: 0,ProductName,CustomerID,Quantity
0,Longos - Grilled Salmon With Bbq,90069,92
1,Yeast Dry - Fermipan,80694,84
2,Bread - French Baquette,97063,75
3,Spice - Peppercorn Melange,96524,75
4,Fenngreek Seed,97029,75
...,...,...,...
63623,Pork - Inside,3472,1
63624,Pork - Inside,3885,1
63625,Pork - Inside,3903,1
63626,Dc - Frozen Momji,3472,1


In [18]:
prod_cust_pivot = products_customer.pivot_table(values='Quantity', 
                                                columns='CustomerID', 
                                                index='ProductName', 
                                                aggfunc='sum').fillna(0)
prod_cust_pivot.head()

CustomerID,33,200,264,356,412,464,477,639,649,669,694,756,883,891,1008,1034,1066,1072,1336,1428,1435,1534,1577,1594,1754,1839,1920,2187,2329,2503,2556,2566,2582,2617,2686,2754,2776,2902,2915,2939,...,94438,94547,94599,94910,94951,95017,95034,95059,95078,95121,95314,95372,95819,96024,96088,96272,96522,96524,96560,96615,96666,96684,97029,97052,97063,97093,97201,97282,97324,97495,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
Anchovy Paste - 56 G Tube,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Appetizer - Mini Egg Roll, Shrimp",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Appetizer - Mushroom Tart,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,25.0,0.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,25.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0
Appetizer - Sausage Rolls,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,24.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,25.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,25.0,25.0,0.0,25.0,0.0
Apricots - Dried,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,...,24.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,0.0,25.0,0.0,25.0,25.0,25.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,25.0,0.0,25.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [19]:
# Transpose to get a customer x product table.
cust_prod_pivot = prod_cust_pivot.T

# Divide each row by its own total so that every row sums to 1.
prod_cust_pivot = prod_cust_pivot.div(prod_cust_pivot.sum(axis=1), axis=0)
cust_prod_pivot = cust_prod_pivot.div(cust_prod_pivot.sum(axis=1), axis=0)

# Reference: https://www.geeksforgeeks.org/python-pandas-dataframe-sum/#:~:text=sum()%20function%20return%20the,the%20values%20in%20each%20column.
# axis=0 sums down each column; axis=1 sums across each row.


In [20]:
prod_cust_pivot.tail()

CustomerID,33,200,264,356,412,464,477,639,649,669,694,756,883,891,1008,1034,1066,1072,1336,1428,1435,1534,1577,1594,1754,1839,1920,2187,2329,2503,2556,2566,2582,2617,2686,2754,2776,2902,2915,2939,...,94438,94547,94599,94910,94951,95017,95034,95059,95078,95121,95314,95372,95819,96024,96088,96272,96522,96524,96560,96615,96666,96684,97029,97052,97063,97093,97201,97282,97324,97495,97697,97753,97769,97793,97900,97928,98069,98159,98185,98200
Yeast Dry - Fermipan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000595,0.0,0.0,0.0,0.0,0.000595,0.0,0.000595,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000595,0.000595,0.000595,0.0,0.0,...,0.0,0.0,0.0,0.014881,0.014881,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014881,0.0,0.0,0.0,0.0,0.014881,0.0,0.014881,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014881,0.0,0.0,0.0,0.0,0.0,0.0
Yoghurt Tubes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000969,0.0,0.0,0.0,0.0,0.0,0.000484,0.000484,0.0,0.0,0.0,0.0,0.0,0.0,0.000484,0.0,0.0,0.0,0.0,0.0,0.000484,0.000484,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012112,0.012112,0.0,0.0,0.0,0.012112,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012112,0.012112,0.012112,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012112,0.0,0.0
"Yogurt - Blueberry, 175 Gr",0.0,0.000478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000478,0.0,0.0,0.0,0.0,0.0,0.0,0.000478,0.0,0.000478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000478,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011939,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011939,0.0,0.0,0.011939,0.011939,0.0,0.0,0.011939,0.0,0.0,0.0,0.0,0.0,0.0
Yogurt - French Vanilla,0.000457,0.0,0.0,0.000457,0.0,0.0,0.000914,0.0,0.0,0.000457,0.0,0.0,0.0,0.0,0.0,0.000457,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000457,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011431,0.0,0.011431,0.0,0.0,0.0,0.0,0.0,0.011431,0.0,0.011431,0.0,0.011431,0.0,0.0,0.011431,0.011431,0.0,0.0,0.0,0.011431,0.0,0.0,0.0,0.0,0.0,0.0,0.011431
Zucchini - Yellow,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000478,0.0,0.0,0.0,0.000478,0.0,0.0,0.0,0.000478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000478,0.000478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000478,0.0,0.0,...,0.011483,0.0,0.022967,0.011962,0.0,0.0,0.0,0.0,0.011962,0.0,0.0,0.023923,0.0,0.011962,0.0,0.0,0.0,0.0,0.0,0.0,0.011962,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
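
The `div(..., axis=0)` call in cell 19 divides every row by its own total, so each product's row becomes that product's quantity share per customer. A small sketch on made-up numbers (the frame `toy` is hypothetical) shows the effect:

```python
import pandas as pd

# Toy product x customer quantity table (values are made up).
toy = pd.DataFrame([[1.0, 3.0], [2.0, 2.0]],
                   index=['Tea', 'Milk'], columns=[33, 200])

# sum(axis=1) gives one total per row; axis=0 in div aligns those
# totals with the row index, so each row is divided by its own total.
toy_share = toy.div(toy.sum(axis=1), axis=0)
row_totals = toy_share.sum(axis=1)
```

A row whose total is 0 would turn into NaN under this division. That should not occur here as long as every product in the pivot has at least one purchase with a positive quantity.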


In [21]:
product_dist = pd.DataFrame(1/(1 + squareform(pdist(prod_cust_pivot, 'euclidean'))),
                         columns=prod_cust_pivot.index, index=prod_cust_pivot.index)

product_dist.head()

ProductName,Anchovy Paste - 56 G Tube,"Appetizer - Mini Egg Roll, Shrimp",Appetizer - Mushroom Tart,Appetizer - Sausage Rolls,Apricots - Dried,Apricots - Halves,Apricots Fresh,Arizona - Green Tea,Artichokes - Jerusalem,Assorted Desserts,Bacardi Breezer - Tropical,Bagel - Plain,Baking Powder,Banana - Leaves,Banana Turning,Bananas,Bandage - Fexible 1x3,Bandage - Flexible Neon,Bar - Granola Trail Mix Fruit Nut,"Bar Mix - Pina Colada, 355 Ml",Barramundi,Bay Leaf,Beans - Kidney White,"Beans - Kidney, Canned","Beans - Kidney, Red Dry",Beans - Wax,"Beef - Chuck, Boneless",Beef - Ground Medium,"Beef - Ground, Extra Lean, Fresh",Beef - Inside Round,Beef - Montreal Smoked Brisket,Beef - Prime Rib Aaa,Beef - Rib Eye Aaa,Beef - Short Loin,Beef - Striploin Aa,"Beef - Tenderlion, Center Cut",Beef - Texas Style Burger,Beef - Top Sirloin,Beef - Top Sirloin - Aaa,Beef Ground Medium,...,Whmis - Spray Bottle Trigger,Wiberg Super Cure,Wine - Alsace Gewurztraminer,Wine - Blue Nun Qualitatswein,"Wine - Cahors Ac 2000, Clos",Wine - Chablis 2003 Champs,Wine - Charddonnay Errazuriz,Wine - Chardonnay South,Wine - Crozes Hermitage E.,Wine - Ej Gallo Sierra Valley,Wine - Fume Blanc Fetzer,Wine - Gato Negro Cabernet,Wine - Hardys Bankside Shiraz,Wine - Magnotta - Belpaese,Wine - Magnotta - Cab Sauv,"Wine - Magnotta, Merlot Sr Vqa",Wine - Pinot Noir Latour,Wine - Prosecco Valdobiaddene,"Wine - Red, Colio Cabernet","Wine - Red, Cooking","Wine - Red, Harrow Estates, Cab",Wine - Redchard Merritt,Wine - Ruffino Chianti,Wine - Sogrape Mateus Rose,Wine - Toasted Head,Wine - Two Oceans Cabernet,Wine - Valpolicella Masi,Wine - Vidal Icewine Magnotta,Wine - Vineland Estate Semi - Dry,Wine - White Cab Sauv.on,"Wine - White, Colubia Cresh","Wine - White, Mosel Gold","Wine - White, Schroder And Schyl",Wine - Wyndham Estate Bin 777,Wonton Wrappers,Yeast Dry - Fermipan,Yoghurt Tubes,"Yogurt - Blueberry, 175 Gr",Yogurt - French Vanilla,Zucchini - Yellow
Anchovy Paste - 56 G Tube,1.0,0.881127,0.882219,0.884604,0.890449,0.879859,0.885648,0.880852,0.886501,0.890573,0.883037,0.880477,0.882783,0.883193,0.885259,0.88268,0.885902,0.881536,0.880238,0.887182,0.883113,0.881864,0.884083,0.883401,0.880967,0.886188,0.883742,0.875954,0.880038,0.883716,0.889731,0.884583,0.880762,0.879185,0.887827,0.876087,0.88161,0.884851,0.883028,0.879337,...,0.876557,0.88093,0.878016,0.881549,0.882818,0.880807,0.882285,0.887599,0.885625,0.88112,0.881908,0.878821,0.882314,0.887693,0.881602,0.879385,0.882591,0.878033,0.879662,0.884903,0.879441,0.8801,0.882131,0.880725,0.881015,0.885524,0.881594,0.883722,0.883955,0.884211,0.884117,0.888124,0.886567,0.879578,0.875696,0.878427,0.886163,0.886564,0.887956,0.878223
"Appetizer - Mini Egg Roll, Shrimp",0.881127,1.0,0.883877,0.880604,0.88793,0.881577,0.885241,0.890196,0.888646,0.883484,0.884239,0.885036,0.88256,0.883238,0.88219,0.880902,0.887395,0.885753,0.879971,0.887037,0.881912,0.88808,0.884982,0.885118,0.882582,0.888405,0.888591,0.8803,0.88237,0.880962,0.888113,0.882211,0.882915,0.877746,0.883588,0.884541,0.87932,0.883545,0.884928,0.887947,...,0.882175,0.886265,0.879724,0.884176,0.886683,0.879684,0.881605,0.885806,0.885861,0.884348,0.884391,0.883569,0.882743,0.888123,0.882043,0.879421,0.884266,0.882342,0.885324,0.884958,0.882135,0.882714,0.880568,0.880483,0.879459,0.8787,0.882515,0.88513,0.88565,0.883759,0.887569,0.887831,0.888033,0.881019,0.87872,0.871915,0.888702,0.890837,0.887661,0.882616
Appetizer - Mushroom Tart,0.882219,0.883877,1.0,0.884708,0.883266,0.884124,0.886495,0.886072,0.887707,0.881019,0.891831,0.886506,0.884732,0.883474,0.881741,0.886167,0.891829,0.884157,0.882122,0.88556,0.885776,0.885725,0.884109,0.883087,0.884483,0.887445,0.888735,0.878107,0.881451,0.88585,0.89059,0.881036,0.881753,0.883677,0.886419,0.888407,0.881537,0.891155,0.883555,0.881394,...,0.887678,0.886372,0.879307,0.883224,0.883822,0.877546,0.88255,0.886033,0.891472,0.884236,0.887651,0.882241,0.882008,0.888933,0.882221,0.882714,0.88974,0.881586,0.881528,0.886441,0.888841,0.888062,0.883237,0.886393,0.89142,0.881569,0.885898,0.881861,0.884478,0.890692,0.886796,0.885968,0.885056,0.887445,0.882789,0.879286,0.890062,0.88996,0.889397,0.881077
Appetizer - Sausage Rolls,0.884604,0.880604,0.884708,1.0,0.885025,0.88502,0.881992,0.88309,0.880745,0.881226,0.880586,0.88479,0.881673,0.882953,0.882631,0.879086,0.883686,0.881267,0.878398,0.88808,0.881357,0.881317,0.881226,0.881229,0.879719,0.885686,0.88338,0.875841,0.875117,0.88562,0.887597,0.87973,0.881802,0.878879,0.885236,0.878803,0.879898,0.880235,0.879571,0.878013,...,0.880842,0.878611,0.872252,0.88578,0.882197,0.87659,0.877558,0.882533,0.883161,0.882221,0.879073,0.88114,0.879008,0.88204,0.878471,0.879969,0.88067,0.884603,0.877359,0.882943,0.878106,0.885202,0.882867,0.886386,0.881486,0.879051,0.879725,0.877579,0.880899,0.883979,0.882793,0.886507,0.884984,0.886515,0.877195,0.868976,0.88547,0.884688,0.883661,0.877701
Apricots - Dried,0.890449,0.88793,0.883266,0.885025,1.0,0.886792,0.887542,0.889058,0.886234,0.887162,0.884768,0.883129,0.888315,0.887522,0.888851,0.885563,0.890336,0.887294,0.88271,0.892127,0.885941,0.885818,0.888557,0.891156,0.883213,0.891419,0.887225,0.878677,0.884743,0.881489,0.890468,0.885432,0.882029,0.881509,0.890999,0.879535,0.883515,0.884637,0.882436,0.884456,...,0.889274,0.885842,0.878711,0.885846,0.88683,0.886424,0.883435,0.887813,0.889119,0.884367,0.881945,0.881374,0.884873,0.887666,0.882416,0.888558,0.888936,0.884762,0.884019,0.887811,0.884839,0.884839,0.885142,0.885488,0.882467,0.888155,0.885761,0.886261,0.889869,0.885726,0.886223,0.889001,0.890087,0.887478,0.878571,0.872542,0.892076,0.891209,0.889794,0.886477


### Step 2: Get the products purchased for a specific customer of your choice.

In [22]:
# Pick the CustomerID from a randomly chosen purchase row.
rnd_customer = random.choice(list(products_customer.CustomerID))
rnd_customer

29910

In [23]:
prd_spf=products_customer[products_customer['CustomerID']==rnd_customer].sort_values(by=['Quantity'], ascending=False)
prd_spf.head()

Unnamed: 0,ProductName,CustomerID,Quantity
24217,"Appetizer - Mini Egg Roll, Shrimp",29910,16
25144,Pomello,29910,16
25368,Beans - Kidney White,29910,16
25477,Bay Leaf,29910,16
26438,Ecolab - Lime - A - Way 4/4 L,29910,16


### Step 3: For each product the customer purchased, get a list of the top 5 similar products. Package the lists into a nested list, flatten the list, and then filter out any products the customer has already purchased.

In [24]:
top_5_similar = []
cust_prod_pur = list(prd_spf.ProductName.unique())
for i in cust_prod_pur:
  # Position 0 is the product itself (similarity 1.0), so take positions 1 to 5.
  prod_five = list(product_dist[i].sort_values(ascending=False).index)[1:6]
  top_5_similar.append(prod_five)

In [25]:
top_5_similar[:3]

[['Spinach - Baby',
  'Hickory Smoke, Liquid',
  'French Pastry - Mini Chocolate',
  'Pepper - Paprika, Hungarian',
  'Pastry - Raisin Muffin - Mini'],
 ['Towels - Paper / Kraft',
  'Bread - Roll, Canadian Dinner',
  'Juice - Orange',
  'Apricots Fresh',
  'Wine - Crozes Hermitage E.'],
 ['Rice - Jasmine Sented',
  'Towels - Paper / Kraft',
  'V8 - Berry Blend',
  'Scallops - 10/20',
  'Langers - Ruby Red Grapfruit']]

In [26]:
def flatten(t):
    return [item for sublist in t for item in sublist]

In [27]:
flat = flatten(top_5_similar)
flat[4:6]

['Pastry - Raisin Muffin - Mini', 'Towels - Paper / Kraft']

In [28]:
len(flat)

330

In [29]:
flat = list(flat)

In [30]:
for i in cust_prod_pur:
  if i in flat:
    flat.remove(i)

In [31]:
len(flat)

313
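
Note that `list.remove` deletes only the first matching element, so a purchased product that appears several times in `flat` can survive the loop above. If the intent is to drop every occurrence, one option is a list comprehension against a set of purchased names. The sketch below uses made-up product lists rather than the lab's data, so the counts shown above are unaffected:

```python
# Toy inputs, made up for illustration.
purchased = ['Tea', 'Milk']
candidates = ['Tea', 'Bread', 'Tea', 'Jam', 'Milk', 'Bread']

# list.remove drops only the first match for each purchased item.
first_only = list(candidates)
for item in purchased:
    if item in first_only:
        first_only.remove(item)

# A comprehension drops every occurrence of each purchased item.
purchased_set = set(purchased)
fully_filtered = [p for p in candidates if p not in purchased_set]
```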

### Step 4: Count the number of times each similar product occurs in your filtered list. Sort and return a list containing the top 5 items.

In [32]:
top_five_items = pd.Series(flat)

In [33]:
list_top_5 = list(top_five_items.value_counts().sort_values(ascending= False).index)[:5]
list_top_5

['Spinach - Baby',
 'Sun - Dried Tomatoes',
 'Rosemary - Primerba, Paste',
 'Cookies - Assorted',
 'V8 - Berry Blend']
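
Counting occurrences and keeping the most frequent items can also be done with the standard library. As a sketch on a made-up list (not the lab's `flat`), `collections.Counter.most_common` returns items ordered from most to least frequent:

```python
from collections import Counter

# Toy flattened list of similar products (made up).
toy_flat = ['Jam', 'Bread', 'Jam', 'Tea', 'Bread', 'Jam']

# most_common(n) returns (item, count) pairs, highest count first.
top_two = [name for name, _ in Counter(toy_flat).most_common(2)]
```

With `most_common`, items that have equal counts come back in the order they were first encountered, which makes tie-breaking predictable.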

### Step 5: Now that we have generated product recommendations for a single user, put the pieces together and iterate over a list of all CustomerIDs.

- Create an empty dictionary that will hold the recommendations for all customers.
- Create a list of unique CustomerIDs to iterate over.
- Iterate over the customer list performing steps 2 through 4 for each and appending the results of each iteration to the dictionary you created.

In [34]:
recom = {}
ids = list(data.CustomerID.unique())

In [35]:
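for i in ids:
  # Step 2: products this customer bought, largest quantities first.
  prd_spf=products_customer[products_customer['CustomerID']==i].sort_values(by=['Quantity'], ascending=False)
  top_5_similar = []
  cust_prod_pur = list(prd_spf.ProductName.unique())
  # Step 3: the 5 nearest products for each purchase (position 0 is the product itself).
  for g in cust_prod_pur:
    prod_five = list(product_dist[g].sort_values(ascending=False).index)[1:6]
    top_5_similar.append(prod_five)
  flat = flatten(top_5_similar)
  # Note: products the customer already bought are not filtered out in this loop.
  # Step 4: keep the 5 most frequent candidates.
  list_top_5 = list(pd.Series(flat).value_counts().sort_values(ascending= False).index)[:5]
  recom[i]=list_top_5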
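
The loop above does not apply the Step 3 filter, so a customer can be recommended something they already buy. One way to keep Steps 2 through 4 together, filter included, is a small helper. This is a sketch under assumptions: the name `recommend_items` and the toy similarity matrix are hypothetical, and it drops each product from its own neighbour list explicitly instead of slicing `[1:6]`.

```python
import pandas as pd

def recommend_items(purchased, product_sim, n_similar=5, n_recs=5):
    """Item-based recommendations: pool the nearest neighbours of each
    purchased product, drop already-purchased ones, rank by frequency."""
    owned = set(purchased)
    pooled = []
    for product in purchased:
        # Exclude the product itself explicitly, then take its neighbours.
        neighbours = product_sim[product].drop(product).nlargest(n_similar).index
        pooled.extend(p for p in neighbours if p not in owned)
    return list(pd.Series(pooled, dtype='object').value_counts().index)[:n_recs]

# Toy similarity matrix (made up).
names = ['Tea', 'Milk', 'Bread', 'Jam']
toy_sim = pd.DataFrame(
    [[1.0, 0.9, 0.8, 0.1],
     [0.9, 1.0, 0.7, 0.2],
     [0.8, 0.7, 1.0, 0.6],
     [0.1, 0.2, 0.6, 1.0]],
    index=names, columns=names)

toy_recs = recommend_items(['Tea', 'Milk'], toy_sim, n_similar=2)
```

In the lab's setting, the loop body could then presumably reduce to `recom[i] = recommend_items(cust_prod_pur, product_dist)`, though the resulting recommendations would differ from the table shown in Step 6.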

### Step 6: Store the results in a Pandas data frame. The data frame should have a column for Customer ID and then a column for each of the 5 product recommendations for each customer.

In [36]:
columns_names = ['Recommendation-1','Recommendation-2','Recommendation-3','Recommendation-4','Recommendation-5']
five_recom = pd.DataFrame.from_dict(recom, orient='index', dtype=None, columns= columns_names)
five_recom.head(10)

Unnamed: 0,Recommendation-1,Recommendation-2,Recommendation-3,Recommendation-4,Recommendation-5
61288,Oil - Shortening - All - Purpose,"Rosemary - Primerba, Paste",Cookies - Assorted,"Yogurt - Blueberry, 175 Gr","Pepper - Paprika, Hungarian"
77352,French Pastry - Mini Chocolate,Beef - Montreal Smoked Brisket,Oil - Shortening - All - Purpose,"Rosemary - Primerba, Paste",Towels - Paper / Kraft
40094,V8 - Berry Blend,Oil - Shortening - All - Purpose,Cookies - Assorted,Veal - Osso Bucco,"Rosemary - Primerba, Paste"
23548,French Pastry - Mini Chocolate,"Yogurt - Blueberry, 175 Gr",Oil - Shortening - All - Purpose,"Rosemary - Primerba, Paste",Cookies - Assorted
78981,Sun - Dried Tomatoes,Cookies - Assorted,Bread - Italian Roll With Herbs,"Chocolate - Semi Sweet, Calets",Oil - Shortening - All - Purpose
83106,Oil - Shortening - All - Purpose,V8 - Berry Blend,Spinach - Baby,Sun - Dried Tomatoes,"Yogurt - Blueberry, 175 Gr"
11253,Sun - Dried Tomatoes,Oil - Shortening - All - Purpose,"Yogurt - Blueberry, 175 Gr",Spinach - Baby,Beef - Montreal Smoked Brisket
35107,Sun - Dried Tomatoes,Spinach - Baby,Oil - Shortening - All - Purpose,"Rosemary - Primerba, Paste","Chocolate - Semi Sweet, Calets"
15088,Sun - Dried Tomatoes,"Rosemary - Primerba, Paste",Spinach - Baby,"Chocolate - Semi Sweet, Calets",Cookies - Assorted
26031,Oil - Shortening - All - Purpose,Cookies - Assorted,Sun - Dried Tomatoes,Spinach - Baby,"Rosemary - Primerba, Paste"


## Recommending Items to a New Customer

Suppose we get a new customer and on their first visit, they purchase the following items and quantities.

In [37]:
new_customer = {'Cookies - Assorted':3,
                'Flavouring - Orange':3,
                'Fenngreek Seed':1,
                'Wine - White Cab Sauv.on':1,
                'Bandage - Flexible Neon':3,
                'Oil - Shortening - All - Purpose':2,
                'Beef - Montreal Smoked Brisket':4,
                'French Pastry - Mini Chocolate':4,
                'Snapple Lemon Tea':5,
                'Pepper - White, Ground':2,
                'Spinach - Baby':5,
                'Sole - Dover, Whole, Fresh':4}

In [39]:
products_customer.head()

Unnamed: 0,ProductName,CustomerID,Quantity
0,Longos - Grilled Salmon With Bbq,90069,92
1,Yeast Dry - Fermipan,80694,84
2,Bread - French Baquette,97063,75
3,Spice - Peppercorn Melange,96524,75
4,Fenngreek Seed,97029,75


In [40]:
products_customer.columns

Index(['ProductName', 'CustomerID', 'Quantity'], dtype='object')

### Step 7: Recommend 5 products to this new customer using a user similarity approach.
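
The notebook does not include a solution for this step. One possible approach, sketched here under stated assumptions, is to turn the new basket into a purchase-share vector, measure its similarity to every existing customer with the same 1 / (1 + d) transform, and then rank what the closest customers bought. The function name `recommend_for_new_customer`, its `quantities` argument (a customer-by-product table of raw quantities, which would have to be rebuilt from `customer_products`, for example with `pivot_table`), and the toy data are all hypothetical. The sketch also assumes at least one basket item is a known product.

```python
import numpy as np
import pandas as pd

def recommend_for_new_customer(basket, quantities, n_similar=5, n_recs=5):
    """basket: {product: quantity}; quantities: customers x products table
    of raw purchase quantities. Returns up to n_recs product names."""
    # Convert every existing customer's row to purchase shares.
    shares = quantities.div(quantities.sum(axis=1), axis=0)
    # Align the new basket with the known product columns (missing -> 0).
    new_row = pd.Series(basket, dtype=float).reindex(quantities.columns, fill_value=0.0)
    new_row = new_row / new_row.sum()
    # Euclidean distance to each existing customer, then 1 / (1 + d).
    dist = np.sqrt(((shares - new_row) ** 2).sum(axis=1))
    similar = (1 / (1 + dist)).nlargest(n_similar).index
    # Rank products by total quantity among the similar customers.
    ranked = quantities.loc[similar].sum().sort_values(ascending=False)
    # Keep products that were actually bought and are not in the basket.
    ranked = ranked[ranked > 0].drop(labels=list(basket), errors='ignore')
    return list(ranked.index[:n_recs])

# Toy quantities (made up): three customers, four products.
toy_quantities = pd.DataFrame(
    [[4, 0, 2, 0],
     [0, 5, 0, 1],
     [3, 0, 0, 3]],
    index=[1, 2, 3], columns=['Tea', 'Milk', 'Bread', 'Jam'])

toy_recs = recommend_for_new_customer({'Tea': 2}, toy_quantities, n_similar=2)
```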

### Step 8: Recommend 5 products to this new customer using an item similarity approach.

In [38]:
# The new customer has no CustomerID in products_customer, so start from their basket
# and keep only products that exist in the similarity matrix.
cust_prod_pur = [p for p in new_customer if p in product_dist.columns]
top_5_similar = []
for g in cust_prod_pur:
  # Position 0 is the product itself (similarity 1.0), so take positions 1 to 5.
  prod_five = list(product_dist[g].sort_values(ascending=False).index)[1:6]
  top_5_similar.append(prod_five)
# Drop every occurrence of products the new customer has already bought.
flat = [p for p in flatten(top_5_similar) if p not in cust_prod_pur]
list_top_5 = list(pd.Series(flat).value_counts().index)[:5]
list_top_5