# Amazon Recommendation System - Lab

## Introduction

Now that you've been introduced to collaborative filtering and recommendation systems, it's time to put your skills to the test and build a recommendation system for a real-world dataset! For this lab, you'll use a dataset of book reviews from the Amazon marketplace. While the previous lesson focused on user-based recommendation systems, here you'll apply a parallel process to build an item-based recommendation system that recommends similar books at the bottom of a product page.

## Objectives

In this lab you will: 

- Use graph-based similarity metrics to create a collaborative filtering recommender system

## Load the Dataset

In [16]:
import pandas as pd
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

G = nx.Graph()

df = pd.read_csv('books_data.edgelist', names=['source', 'target', 'weight'], delimiter=' ')
df.head()

| | source | target | weight |
|---|---|---|---|
| 0 | 827229534 | 0804215715 | 0.7 |
| 1 | 827229534 | 156101074X | 0.5 |
| 2 | 827229534 | 0687023955 | 0.8 |
| 3 | 827229534 | 0687074231 | 0.8 |
| 4 | 827229534 | 082721619X | 0.7 |


In [19]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 741124 entries, 0 to 741123
Data columns (total 3 columns):
source    741124 non-null object
target    741124 non-null object
weight    741124 non-null float64
dtypes: float64(1), object(2)
memory usage: 17.0+ MB
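
The import cell creates an empty `nx.Graph()` but never fills it. The sketch below shows one way an edge list of this shape could be turned into a weighted graph and queried for a book's strongest neighbors. The three-row `toy_edges` frame and the names `toy_G` and `top_ids` are made up for illustration and stand in for the real `df`.

```python
import networkx as nx
import pandas as pd

# Hypothetical stand-in for the real edge list (IDs kept as strings)
toy_edges = pd.DataFrame({
    'source': ['0827229534', '0827229534', '0804215715'],
    'target': ['0804215715', '156101074X', '156101074X'],
    'weight': [0.7, 0.5, 0.9],
})

# Build an undirected graph whose edges carry the 'weight' attribute
toy_G = nx.from_pandas_edgelist(
    toy_edges, source='source', target='target', edge_attr='weight'
)

# Neighbors of one book, largest edge weight first
neighbors = sorted(
    toy_G['0827229534'].items(),
    key=lambda item: item[1]['weight'],
    reverse=True,
)
top_ids = [node for node, attrs in neighbors]
print(top_ids)
```

With the real data, the same call on `df` should give a graph with one node per book, although building it for roughly 741,000 edges takes noticeably longer than for this toy frame.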


## Load the Metadata 

Import the metadata from the file `'books_meta.txt'` (note that it is tab-separated, `'\t'`).

In [17]:
# Your code here
meta_df = pd.read_csv("books_meta.txt", delimiter='\t')
meta_df.head()

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 827229534 | Patterns of Preaching: A Sermon Sampler | clergi sermon subject religion preach spiritu ... | Book | 396585 | 2 | 5.0 | 8 | 0.8 |
| 1 | 2 | 738700797 | Candlemas: Feast of Flames | subject witchcraft earth religion spiritu base... | Book | 168596 | 12 | 4.5 | 9 | 0.85 |
| 2 | 3 | 486287785 | World War II Allied Fighter Planes Trading Cards | general hobbi subject craft home garden book | Book | 1270652 | 1 | 5.0 | 0 | 0.0 |
| 3 | 4 | 842328327 | Life Application Bible Commentary: 1 and 2 Tim... | spiritu translat commentari christian book gui... | Book | 631289 | 1 | 4.0 | 6 | 0.79 |
| 4 | 5 | 1577943082 | Prayers That Avail Much for Business: Executive | subject religion spiritu busi christian live w... | Book | 455160 | 0 | 0.0 | 4 | 1.0 |
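
The recommender matches the `ASIN` values in the metadata against the `source` and `target` IDs in the edge list, so both must be parsed as the same type. The sketch below uses a made-up two-row sample (not the real file) to show how type inference can turn an all-digit ID column into integers and drop leading zeros, and how `dtype={'ASIN': str}` avoids that. Whether the real file needs this depends on how pandas infers the column.

```python
import io
import pandas as pd

# Hypothetical sample in the same tab-separated layout as the metadata file
sample = "Id\tASIN\tTitle\n1\t0827229534\tBook A\n2\t0738700797\tBook B\n"

# Default parsing: an all-digit column is inferred as integers
inferred = pd.read_csv(io.StringIO(sample), delimiter='\t')

# Forcing the ID column to text keeps the leading zeros
as_text = pd.read_csv(io.StringIO(sample), delimiter='\t', dtype={'ASIN': str})

print(inferred['ASIN'].tolist())
print(as_text['ASIN'].tolist())
```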


## Select Books to Test Your Recommender On

Select a small subset of books that you are interested in generating recommendations for. 

In [83]:
# Your code here
books_to_test = meta_df.loc[:20, 'Title']
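
Note that label-based slicing with `.loc` includes the end label, so with the default integer index `meta_df.loc[:20, 'Title']` selects 21 titles rather than 20. The sketch below illustrates the difference on a made-up frame; `toy_meta` is a hypothetical stand-in for `meta_df`.

```python
import pandas as pd

# Hypothetical metadata frame with a default 0..29 integer index
toy_meta = pd.DataFrame({'Title': ['Book {}'.format(i) for i in range(30)]})

by_label = toy_meta.loc[:20, 'Title']      # labels 0 through 20, inclusive
by_position = toy_meta['Title'].iloc[:20]  # positions 0 through 19

print(len(by_label), len(by_position))
```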

## Generate Recommendations for a Few Books of Choice

The `'books_data.edgelist'` file conveniently contains the precomputed distance between items (the `weight` column). Given this preprocessed data, it's time to employ collaborative filtering to generate recommendations! Generate the top 10 recommendations for each book in the subset you chose. Be sure to print the title of the book you are generating recommendations for as well as the titles of the books being recommended.

In [87]:
def get_recs(title, num_of_recs):
    """Print up to `num_of_recs` books with the highest edge weight to `title`."""
    
    # look up the book's ASIN from its title
    book_num = meta_df[meta_df.Title == title].ASIN.values[0]
    
    # find all edges with the book number as a source or target
    edges = df[(df.source == book_num) | (df.target == book_num)]
    
    # sort by weight (highest first) and keep the top num_of_recs edges
    edges = edges.sort_values('weight', ascending=False)[:num_of_recs]
    
    # get recommended book numbers: the endpoint of each edge that is not this book
    rec_book_nums = []
    for row in edges.index:
        if edges.source[row] == book_num:
            rec_book_nums.append(edges.target[row])
        else:
            rec_book_nums.append(edges.source[row])
    
    # translate rec_book_nums to titles
    rec_book_titles = [meta_df[meta_df.ASIN == num].Title.values[0] for num in rec_book_nums]
    
    # print results
    print('Recomendations for {}'.format(title))
    if not rec_book_titles:
        print('ZERO RECOMENDATIONS')
    for n, book in enumerate(rec_book_titles):
        print("{}) {}".format(n+1, book))
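
As a possible variation on `get_recs`, the neighbor lookup can be separated from the printing so that the result is easier to reuse and check. The sketch below is an illustration only: `top_neighbors` is a hypothetical helper, `toy_edges` is made-up data in the same `source`/`target`/`weight` layout, and `np.where` replaces the explicit loop over rows.

```python
import numpy as np
import pandas as pd

def top_neighbors(book_id, edges, n=10):
    """Return up to n IDs linked to book_id, highest edge weight first."""
    # Keep edges where the book appears as either endpoint
    touching = edges[(edges['source'] == book_id) | (edges['target'] == book_id)]
    touching = touching.sort_values('weight', ascending=False).head(n)
    # For each edge, pick the endpoint that is not book_id
    other = np.where(touching['source'] == book_id,
                     touching['target'], touching['source'])
    return other.tolist()

# Hypothetical edge list for illustration
toy_edges = pd.DataFrame({
    'source': ['A', 'B', 'C', 'D'],
    'target': ['B', 'C', 'A', 'E'],
    'weight': [0.7, 0.9, 0.8, 0.6],
})

print(top_neighbors('A', toy_edges, n=2))
print(top_neighbors('Z', toy_edges))  # an ID with no edges yields an empty list
```

Returning a list instead of printing also makes the no-recommendation case explicit: the caller receives an empty list and can decide how to report it.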
    

In [88]:
for book in books_to_test:
    get_recs(book, 10)
    print('\n\n')

Recomendations for Patterns of Preaching: A Sermon Sampler
1) The Four Pages of the Sermon: A Guide to Biblical Preaching
2) Performing the Word: Preaching As Theatre
3) Witness of Preaching
4) Interpreting the Gospel: An Introduction to Preaching
5) Handbook of Contemporary Preaching
6) Just Say the Word!: Writing for the Ear
7) Preaching Liberation (Fortress Resources for Preaching)
8) The Preaching Life



Recomendations for Candlemas: Feast of Flames
1) Lammas
2) Ostara: Customs, Spells & Rituals for the Rites of Spring
3) The Pagan Book of Halloween : A Complete Guide to the Magick, Incantations, Recipes, Spells, and Lore
4) Beltane: Springtime Rituals, Lore and Celebration
5) Midsummer: Magical Celebrations of the Summer Solstice
6) Halloween: Customs, Recipes & Spells
7) Yule: A Celebration of Light and Warmth
8) The Summer Solstice : Celebrating the Journey of the Sun from May Day to Harvest
9) The Winter Solstice: The Sacred Traditions of Christmas



Recomendations for World 

## Summary

Well done! In this lab, you created an item-based recommendation system for a real-world dataset!