# Recommendation System for Amazon Clothing Products
---
## 3. Popularity Based Recommendation System

*Author*: Mariam Elsayed

*Contact*: mariamkelsayed@gmail.com

*Notebook*: 3 of 5

*Previous Notebook*: `reviews_loading_preprocessing.ipynb`

*Next Notebook*: `content_rec.ipynb`

In [53]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Table of Contents

* [Introduction](#intro)

* [Loading the Data](#loading)

* [Popularity-Based Recommendation System](#rec)

    * [Using `average_rating`](#average_rating)

    * [Including a Threshold](#threshold)

    * [Using `rank`](#rank)

* [Conclusion](#conc)

## Introduction <a class="anchor" id="intro"></a>

The first recommendation system we will consider is popularity-based. It is user-independent: it needs no information about the individual user, so every user is shown the same list of the most popular items.
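
As a rough illustration of the idea, here is a minimal sketch on a made-up catalogue. The data and the `min_reviews` cut-off are assumptions for this example only; the column names mirror the ones used later in this notebook. "Popular" is taken here to mean highly rated among products with enough reviews.

```python
import pandas as pd

# Hypothetical toy catalogue (not the real products data)
toy_df = pd.DataFrame({
    'title': ['Belt', 'Socks', 'Sandal', 'Hat'],
    'average_rating': [4.8, 5.0, 4.5, 4.9],
    'total_reviews': [662, 4, 1515, 120],
})

# Every user gets the same list: well-reviewed products, best rated first
min_reviews = 100  # assumed cut-off for this toy example
popular = (
    toy_df[toy_df['total_reviews'] >= min_reviews]
    .sort_values(by='average_rating', ascending=False)
)
print(popular['title'].tolist())
```

Note that the socks, despite their 5.0 rating, are left out because only four people reviewed them.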

## Loading the Data <a class="anchor" id="loading"></a>

This recommendation system uses the products data frame, including the review summary columns, created in the previous notebook.

In [54]:
# loading products dataframe
products_df = pd.read_csv('Data/products_summary.csv')

In [55]:
products_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 147472 entries, 0 to 147471
Data columns (total 64 columns):
 #   Column                                     Non-Null Count   Dtype  
---  ------                                     --------------   -----  
 0   category                                   147472 non-null  object 
 1   description                                147399 non-null  object 
 2   title                                      147472 non-null  object 
 3   brand                                      147472 non-null  object 
 4   rank                                       147472 non-null  int64  
 5   asin                                       147472 non-null  object 
 6   imageURL                                   133612 non-null  object 
 7   price                                      147472 non-null  float64
 8   maincat_Luggage & Travel Gear              147472 non-null  bool   
 9   maincat_Backpacks                          147472 non-null  bool   
 10  maincat_

## Popularity-Based Recommendation System<a class="anchor" id="rec"></a>

For this recommendation system, we sort products that have a large number of reviews by their average rating, and we will also investigate using `rank`.

### Using `average_rating` <a class="anchor" id="average_rating"></a>

Let's start with sorting by the highest average rating.

In [56]:
# Top 10 products by average rating
products_df.sort_values(by=['average_rating'], ascending=False).head(10)

Unnamed: 0,category,description,title,brand,rank,asin,imageURL,price,maincat_Luggage & Travel Gear,maincat_Backpacks,...,subcat_Girls,subcat_Boys,"subcat_Shoe, Jewelry & Watch Accessories",subcat_Jewelry Accessories,subcat_Shoe Care & Accessories,subcat_Contemporary & Designer,subcat_Travel Accessories,"subcat_Surf, Skate & Street",average_rating,total_reviews
35848,"['Clothing, Shoes & Jewelry', 'Costumes & Acce...",Be everyone's thunder buddy for life. Grab a c...,Rasta Imposta Ted Tunic,Rasta Imposta,1425860,B008S1NIWA,['https://images-na.ssl-images-amazon.com/imag...,36.3,False,False,...,False,False,False,False,False,False,False,False,5.0,7
118959,"['Clothing, Shoes & Jewelry', 'Novelty & More'...",Look stylish with this Katydid Soccer Mom Wome...,Soccer Mom Sports Game Day Women's Trucker Hat...,Katydid,155857,B016J9CGES,,22.95,False,False,...,False,False,False,False,False,False,False,False,5.0,6
130670,"['Clothing, Shoes & Jewelry', 'Women', 'Jewelr...",This beautiful women's bracelet will arrive in...,Hidden Hollow Beads Message Charm (84 Options)...,Hidden Hollow Beads,11220,B01AYP05A0,,11.99,False,False,...,False,False,False,False,False,False,False,False,5.0,4
60531,"['Clothing, Shoes & Jewelry', 'Women', 'Jewelr...",Stunning and classic. These earrings are truly...,Bridal Wedding Jewelry Crystal Pearl Chic Mode...,Accessoriesforever,996066,B00GYH78P8,['https://images-na.ssl-images-amazon.com/imag...,13.5,False,False,...,False,False,False,False,False,False,False,False,5.0,5
26177,"['Clothing, Shoes & Jewelry', 'Women', 'Clothi...","Camouflage backdrop aside, it's hard to blend ...","Ozone Women's Flower Camo Socks, One Size Fits...",Unknown,382786,B005KPLLWQ,,14.75,False,False,...,False,False,False,False,False,False,False,False,5.0,8
26176,"['Clothing, Shoes & Jewelry', 'Women', 'Clothi...",Ozone Design is dedicated to creating the worl...,Ozone Women's Flower Camo Socks,Unknown,595552,B005KPLLXK,['https://images-na.ssl-images-amazon.com/imag...,15.05,False,False,...,False,False,False,False,False,False,False,False,5.0,14
26172,"['Clothing, Shoes & Jewelry', 'Men', 'Watches'...",Gold-plated stainless steel case with a black ...,Invicta Men's 1150 Subaqua Noma III GMT Blue D...,Invicta,1399122,B005KPL5ZE,['https://images-na.ssl-images-amazon.com/imag...,168.72,False,False,...,False,False,False,False,False,False,False,False,5.0,4
26171,"['Clothing, Shoes & Jewelry', 'Women', 'Shoes'...",rn\x95 This stylish wedge adds a burst of brig...,Not Rated Women's Safari Wedge Sandal,Not Rated,6563011,B005KPFW3A,['https://images-na.ssl-images-amazon.com/imag...,49.0,False,False,...,False,False,False,False,False,False,False,False,5.0,5
39318,"['Clothing, Shoes & Jewelry', 'Novelty & More'...",This screen printed t-shirt features a hand dr...,ENDO Apparel AR-15 Builders Club Men's T-Shirt,ENDO Apparel,1605760,B00A3G4WC8,['https://images-na.ssl-images-amazon.com/imag...,30.0,False,False,...,False,False,False,False,False,False,False,False,5.0,5
26168,"['Clothing, Shoes & Jewelry', 'Costumes & Acce...","""Kigurumi"" comes from a combination of two Jap...",Spooky Black Cat Kigurumi - Japanese Sazac Cos...,SAZAC,1077344,B005KP1U6I,['https://images-na.ssl-images-amazon.com/imag...,59.0,False,False,...,False,False,False,False,False,False,False,False,5.0,8


The top-rated products all average 5 stars, but each has only a handful of reviews (between 4 and 14 in the output above). A perfect average over so few reviews is not an informative measure of popularity.

### Including a Threshold <a class="anchor" id="threshold"></a>

To overcome this issue, we set a threshold on the number of reviews a product needs in order to be considered, then recommend the products with the highest average rating among those that meet it.

From the description of the `total_reviews` column, 75% of the products have at least 7 reviews, 75% have 27 or fewer, and the most-reviewed product has 9846 reviews. Let's pick a threshold of 500.

In [57]:
products_df['total_reviews'].describe()

count    147472.000000
mean         37.399235
std         167.737720
min           1.000000
25%           7.000000
50%          12.000000
75%          27.000000
max        9846.000000
Name: total_reviews, dtype: float64
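
One way to sanity-check a threshold is to count how many products would survive it. The sketch below uses a small made-up series in place of `products_df['total_reviews']`; the values and the candidate thresholds are assumptions for illustration, not results from the real data.

```python
import pandas as pd

# Hypothetical review counts standing in for products_df['total_reviews']
review_counts = pd.Series([1, 7, 7, 12, 27, 40, 150, 520, 800, 9846], name='total_reviews')

# Number of products that would remain at each candidate threshold
candidate_thresholds = [10, 100, 500]
remaining = {t: int((review_counts >= t).sum()) for t in candidate_thresholds}
print(remaining)
```

Because review counts are heavily right-skewed, a high threshold keeps only a small fraction of the catalogue, which is worth keeping in mind when the recommendations are later restricted to a single category.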

In [58]:
# keeping only products with at least `threshold` reviews
threshold = 500
products_with_many_reviews = products_df[products_df['total_reviews'] >= threshold]

# sorting by highest average rating
products_with_many_reviews.sort_values(by=['average_rating'], ascending=False).head(10)

Unnamed: 0,category,description,title,brand,rank,asin,imageURL,price,maincat_Luggage & Travel Gear,maincat_Backpacks,...,subcat_Girls,subcat_Boys,"subcat_Shoe, Jewelry & Watch Accessories",subcat_Jewelry Accessories,subcat_Shoe Care & Accessories,subcat_Contemporary & Designer,subcat_Travel Accessories,"subcat_Surf, Skate & Street",average_rating,total_reviews
59148,"['Clothing, Shoes & Jewelry', 'Men', 'Accessor...",VinicioBelt prides itself in providing timeles...,Men's Holeless Leather Ratchet Click Belt - Tr...,VINICIOBELT,2010,B00GKDFOJS,['https://images-na.ssl-images-amazon.com/imag...,19.85,False,False,...,False,False,False,False,False,False,False,False,4.8,662
61358,"['Clothing, Shoes & Jewelry', 'Shoe, Jewelry &...",The Perfect Reversible Shoehorn That Fits All ...,"Shacke Metal Shoe Horn 7.5"" inches – Double Si...",Shacke,23409,B00H9RZDRM,['https://images-na.ssl-images-amazon.com/imag...,5.99,False,False,...,False,False,True,False,True,False,False,False,4.8,786
35205,"['Clothing, Shoes & Jewelry', 'Women', 'Shoes'...","Carefree, casual, and comfortable. this sporty...",Clarks Women's Breeze Sea Flip-Flop,Unknown,1158,B008KK1AZQ,['https://images-na.ssl-images-amazon.com/imag...,38.62,False,False,...,False,False,False,False,False,False,False,False,4.8,1515
141808,"['Clothing, Shoes & Jewelry', 'Women', 'Shoes'...","Carefree, casual, and comfortable. this sporty...",Clarks Women's Breeze Sea Flip-Flop,Unknown,1169,B01FH9ETBO,['https://images-na.ssl-images-amazon.com/imag...,35.82,False,False,...,False,False,False,False,False,False,False,False,4.8,1179
97504,"['Clothing, Shoes & Jewelry', 'Men', 'Accessor...",VinicioBelt prides itself in providing timeles...,Men's Holeless Leather Ratchet Click Belt - Tr...,VINICIOBELT,3339,B00VCQ09KW,['https://images-na.ssl-images-amazon.com/imag...,19.85,False,False,...,False,False,False,False,False,False,False,False,4.8,682
72008,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","', 'Pack More Clothes in a Small Space For You...",Shacke Pak - 4 Set Packing Cubes - Travel Orga...,Shacke,118,B00KKXCJQU,['https://images-na.ssl-images-amazon.com/imag...,24.99,True,False,...,False,False,False,False,False,False,True,False,4.8,1328
26385,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","With three large packing cubes included, this ...",eBags Large Packing Cubes for Travel - 3pc Set,Unknown,29099,B005LXPSHG,['https://images-na.ssl-images-amazon.com/imag...,5.35,True,False,...,False,False,False,False,False,False,True,False,4.8,799
7969,"['Clothing, Shoes & Jewelry', 'Boys', 'Jewelry...",Pick your favorite Cross color from menu above...,Horseshoe Nail Cross Necklaces -(Solid Color) ...,Horseshoe Crosses,70751,B001DQ5YZ6,['https://images-na.ssl-images-amazon.com/imag...,52.77,False,False,...,False,True,False,False,False,False,False,False,4.8,940
97083,"['Clothing, Shoes & Jewelry', 'Boys', 'Jewelry...",Pick your favorite Cross color from menu above...,Horseshoe Nail Cross Necklaces -(Solid Color) ...,Horseshoe Crosses,74357,B00V5RV0ME,['https://images-na.ssl-images-amazon.com/imag...,29.61,False,False,...,False,True,False,False,False,False,False,False,4.8,901
16623,"['Clothing, Shoes & Jewelry', 'Boys', 'Jewelry...",Pick your favorite Cross color from menu above...,Horseshoe Nail Cross Necklaces -(Solid Color) ...,Horseshoe Crosses,73166,B003YNJUHY,['https://images-na.ssl-images-amazon.com/imag...,6.75,False,False,...,False,True,False,False,False,False,False,False,4.8,900
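
All ten products above share the same 4.8 average rating, so their relative order is left to the sort. One possible refinement, shown here as a sketch on made-up data rather than as part of the notebook's method, is to break ties by the number of reviews.

```python
import pandas as pd

# Hypothetical products; the IDs and numbers are illustrative only
toy_df = pd.DataFrame({
    'asin': ['A1', 'A2', 'A3', 'A4'],
    'average_rating': [4.8, 4.8, 4.7, 4.8],
    'total_reviews': [662, 1515, 2000, 786],
})

# Sort by rating first, then by review count among equally rated products
ranked = toy_df.sort_values(
    by=['average_rating', 'total_reviews'],
    ascending=[False, False],
)
print(ranked['asin'].tolist())
```

With this ordering, a 4.8 rating backed by 1515 reviews is listed ahead of a 4.8 rating backed by 662.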


In [59]:
products_df

Unnamed: 0,category,description,title,brand,rank,asin,imageURL,price,maincat_Luggage & Travel Gear,maincat_Backpacks,...,subcat_Girls,subcat_Boys,"subcat_Shoe, Jewelry & Watch Accessories",subcat_Jewelry Accessories,subcat_Shoe Care & Accessories,subcat_Contemporary & Designer,subcat_Travel Accessories,"subcat_Surf, Skate & Street",average_rating,total_reviews
0,"['Clothing, Shoes & Jewelry', 'Women', 'Import...",Veneziana Sexy Strip 20 Open Crotch Pantyhose ...,Sexystrip,Veneziana,734888,5120053017,['https://images-na.ssl-images-amazon.com/imag...,14.95,False,False,...,False,False,False,False,False,False,False,False,3.7,38
1,"['Clothing, Shoes & Jewelry', 'Women', 'Import...",Veneziana Ar Beautiful - Hold Ups Thigh High S...,Beautiful,Veneziana,551160,5120053351,['https://images-na.ssl-images-amazon.com/imag...,17.70,False,False,...,False,False,False,False,False,False,False,False,3.9,47
2,"['Clothing, Shoes & Jewelry', 'Women', 'Matern...",Dress Length (Neck to Bottom Hem) Small - 40 i...,sofsy Soft-Touch Rayon Blend Tie Front Nursing...,Unknown,740957,5120053890,['https://images-na.ssl-images-amazon.com/imag...,34.99,False,False,...,False,False,False,False,False,False,False,False,4.6,31
3,"['Clothing, Shoes & Jewelry', 'Baby', 'Baby Gi...","Little Brother: Size 70: Length 38 CM, Bust*2 ...",Toddler Girls Big Sister T Shirt Matching Litt...,Kingte,36991,5780122040,['https://images-na.ssl-images-amazon.com/imag...,10.49,False,False,...,False,False,False,False,False,False,False,False,3.5,6
4,"['Clothing, Shoes & Jewelry', 'Women', 'Clothi...","GorgeoUS lightweight cotton dress in red, pink...",Pistachio Women's Sun Flower Flowing Knee Leng...,Pistachio,1131061,6040972467,['https://images-na.ssl-images-amazon.com/imag...,22.99,False,False,...,False,False,False,False,False,False,False,False,4.2,26
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
147467,"['Clothing, Shoes & Jewelry', 'Men', 'Shoes', ...",These men's Tubular shoes honor running-inspir...,adidas Originals Men's Tubular Shadow Running ...,Unknown,20519,B01HJDW6ZC,['https://images-na.ssl-images-amazon.com/imag...,120.08,False,False,...,False,False,False,False,False,False,False,False,4.9,8
147468,"['Clothing, Shoes & Jewelry', 'Men', 'Shoes', ...",An edgy take on Adidas running-inspired herita...,adidas Originals Men's Tubular Shadow Fashion ...,Unknown,74828,B01HJDVCJI,['https://images-na.ssl-images-amazon.com/imag...,123.72,False,False,...,False,False,False,False,False,False,False,False,5.0,3
147469,"['Clothing, Shoes & Jewelry', 'Women', 'Shoes'...",Catalina - where sporty meets sparkly! adorned...,Dansko Women's Catalina Flat Sandal,Unknown,1114685,B01HJCMR4I,['https://images-na.ssl-images-amazon.com/imag...,94.98,False,False,...,False,False,False,False,False,False,False,False,4.1,15
147470,"['Clothing, Shoes & Jewelry', 'Men', 'Shoes', ...",A classic wingtip with a subtle twist that cov...,Deer Stags Men's Hampden Oxford,Unknown,956501,B01HJH7W0W,['https://images-na.ssl-images-amazon.com/imag...,52.38,False,False,...,False,False,False,False,False,False,False,False,4.0,5


In [105]:
x = 'maincat_Contemporary & Designer' 
x in products_df.columns[products_df.columns.str.startswith('maincat')]

True

### Creating a General Function

Let's create a general function that returns the top n most popular products, either across the whole dataframe or within a single category.

In [108]:
def popularity_rec_func(n=10, category=None, threshold=100):

    '''
    popularity_rec_func: A function that returns the top n most popular products within a category or whole dataframe 
    (default), among products that meet a user-defined review threshold

    INPUT: n (int)           - Number of rows to return in popularity dataframe
                                 DEFAULT: 10 
           category (string) - Main category to return popular products from (without the 'maincat_' prefix)
                                 DEFAULT: None (whole dataframe)
           threshold (int)   - Minimum number of reviews a product must have to be ranked in popularity dataframe
                                 DEFAULT: 100 reviews total

    OUTPUT: Dataframe containing top n most popular products
    '''
    
    assert isinstance(n, int), 'n must be an integer'
    assert isinstance(threshold, int), 'threshold must be an integer'
    # only validate the category when one is given; None (the default) means the whole dataframe
    assert category is None or ('maincat_' + category) in products_df.columns, 'Category is not valid'

    if category is None:

        products_df_filtered = products_df[products_df['total_reviews'] >= threshold]

        result = products_df_filtered.sort_values(by=['average_rating'], ascending=False).head(n)

    else:

        maincat_category = 'maincat_' + category

        # all products in the category, before the review threshold is applied
        products_in_category = products_df[products_df[maincat_category] == 1]

        products_df_filtered = products_in_category[products_in_category['total_reviews'] >= threshold]

        result = products_df_filtered.sort_values(by=['average_rating'], ascending=False).head(n)
        # drop the one-hot encoded category columns from the output
        result = result.loc[:, (~result.columns.str.startswith('maincat_')
                                & ~result.columns.str.startswith('subcat_'))]

        if result.shape[0] < n:

            print(f'Limited results because fewer than {n} products in the {category} category '
                  'meet the threshold.\nSet the threshold to a number within range of the category.')

            # review counts for the whole category, to help choose a workable threshold
            print(products_in_category['total_reviews'].describe())

    return result

In [114]:
popularity_rec_func(n=5, category='Travel Accessories', threshold=40)

Unnamed: 0,category,description,title,brand,rank,asin,imageURL,price,average_rating,total_reviews
68930,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","', 'THE ULTIMATE TRAVEL ACCESSORIES -- Dont le...",Dot&Dot Large Packing Cubes for Travel - Lugga...,Dot&Dot,33924,B00JKII3K2,['https://images-na.ssl-images-amazon.com/imag...,27.99,4.9,101
141982,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","Material: Fleece, cotton Small size: 32cm x 40...",Misslo Set of 3 Cotton Breathable Dust-proof D...,MISSLO,9989,B01FJYLOB0,['https://images-na.ssl-images-amazon.com/imag...,12.99,4.9,63
105233,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","', 'THE ULTIMATE ORGANIZER! The last thing you...",Chameleon PACKING CUBES for Travel -4 Piece Se...,231,474070,B00ZQW1PRO,['https://images-na.ssl-images-amazon.com/imag...,23.87,4.8,46
27359,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","Great for tee shirts, tank tops, diapers all t...",eBags Small Packing Cubes for Travel - Organiz...,Unknown,110203,B005V0WA4I,['https://images-na.ssl-images-amazon.com/imag...,26.99,4.8,156
66451,"['Clothing, Shoes & Jewelry', 'Luggage & Trave...","""THE ULTIMATE TRAVEL ACCESSORIES Dont leave h...",Dot&Dot Medium Packing Cubes for Travel - 4 Pi...,Dot&Dot,45351,B00ISL4KMW,['https://images-na.ssl-images-amazon.com/imag...,25.98,4.8,119
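
`popularity_rec_func` reads the global `products_df` and validates its inputs with `assert`, which is skipped when Python runs with the `-O` flag. As a design alternative, the sketch below passes the dataframe in explicitly and raises `ValueError` instead. The name `top_n_popular` and the toy data are hypothetical, and the function is a simplified variant under those assumptions, not a replacement for the function above.

```python
import pandas as pd


def top_n_popular(df, n=10, category=None, threshold=100):
    """Hypothetical variant of popularity_rec_func that takes the dataframe as an argument."""
    if not isinstance(n, int) or n < 1:
        raise ValueError('n must be a positive integer')
    if not isinstance(threshold, int) or threshold < 0:
        raise ValueError('threshold must be a non-negative integer')

    candidates = df
    if category is not None:
        column = 'maincat_' + category
        if column not in df.columns:
            raise ValueError(f'Category is not valid: {category}')
        # keep only rows flagged as belonging to the category
        candidates = candidates[candidates[column]]

    # apply the review threshold, then rank by average rating
    candidates = candidates[candidates['total_reviews'] >= threshold]
    return candidates.sort_values(by='average_rating', ascending=False).head(n)


# Made-up rows for illustration only
toy_df = pd.DataFrame({
    'title': ['Packing Cubes', 'Shoe Horn', 'Garment Bag', 'Flip-Flop'],
    'maincat_Travel Accessories': [True, False, True, False],
    'average_rating': [4.8, 4.8, 4.9, 4.7],
    'total_reviews': [1328, 786, 63, 1515],
})

top_travel = top_n_popular(toy_df, n=2, category='Travel Accessories', threshold=40)
print(top_travel['title'].tolist())
```

Taking the dataframe as a parameter also makes the function easy to try on a small sample like `toy_df` before running it on the full product data.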


In [81]:
products_df.shape

(147472, 64)

## Conclusion <a class="anchor" id="conc"></a>

A simple popularity-based recommendation system was built by keeping only the products whose number of reviews meets a threshold and then sorting them by average review rating.

*Next notebook*: `content_rec.ipynb`