# Amazon Recommendation System - Lab

## Introduction

Now that you've had an introduction to collaborative filtering and recommendation systems, it's time to put your skills to the test and build a recommendation system for a real-world dataset! For this lab, you'll be using a dataset of book reviews from the Amazon marketplace. While the previous lesson focused on user-based recommendation systems, here you'll apply a parallel process to build an item-based recommendation system that could recommend similar books at the bottom of a product page.

## Objectives

In this lab you will: 

- Use graph-based similarity metrics to create a collaborative filtering recommender system

## Load the Dataset

In [1]:
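import pandas as pd
import networkx as nx

G = nx.Graph()  # empty graph; the cells below work on the edge list with pandas

# Each row is one edge: two book IDs and the weight between them
df = pd.read_csv('books_data.edgelist', names=['source', 'target', 'weight'], delimiter=' ')
df.head()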

| | source | target | weight |
|---|---|---|---|
| 0 | 827229534 | 0804215715 | 0.7 |
| 1 | 827229534 | 156101074X | 0.5 |
| 2 | 827229534 | 0687023955 | 0.8 |
| 3 | 827229534 | 0687074231 | 0.8 |
| 4 | 827229534 | 082721619X | 0.7 |
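
Since the objective mentions graph-based similarity, it may help to see how an edge list like this one maps onto a `networkx` graph. The following is a minimal sketch on a toy edge list; the IDs, weights, and the names `toy_edges` and `toy_graph` are made up for illustration and are not part of the lab data.

```python
import networkx as nx
import pandas as pd

# Toy edge list in the same three-column layout as books_data.edgelist
# (IDs and weights are invented for illustration).
toy_edges = pd.DataFrame(
    {
        "source": ["A", "A", "B"],
        "target": ["B", "C", "C"],
        "weight": [0.7, 0.5, 0.8],
    }
)

# Build an undirected graph; each edge keeps its weight as an attribute.
toy_graph = nx.from_pandas_edgelist(toy_edges, "source", "target", edge_attr="weight")

# Neighbors of "A" with their edge weights, highest weight first.
neighbors_of_a = sorted(
    ((nbr, attrs["weight"]) for nbr, attrs in toy_graph["A"].items()),
    key=lambda pair: pair[1],
    reverse=True,
)
print(neighbors_of_a)
```

The same call should work on the full `df`, assuming the `source` and `target` columns hold comparable ID values.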


## Load the Metadata 

Import the metadata available in the file `'books_meta.txt'` (note that it is `'\t'` separated). 

In [2]:
metadata = pd.read_csv('books_meta.txt', delimiter='\t')
metadata.head()

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 827229534 | Patterns of Preaching: A Sermon Sampler | clergi sermon subject religion preach spiritu ... | Book | 396585 | 2 | 5.0 | 8 | 0.8 |
| 1 | 2 | 738700797 | Candlemas: Feast of Flames | subject witchcraft earth religion spiritu base... | Book | 168596 | 12 | 4.5 | 9 | 0.85 |
| 2 | 3 | 486287785 | World War II Allied Fighter Planes Trading Cards | general hobbi subject craft home garden book | Book | 1270652 | 1 | 5.0 | 0 | 0.0 |
| 3 | 4 | 842328327 | Life Application Bible Commentary: 1 and 2 Tim... | spiritu translat commentari christian book gui... | Book | 631289 | 1 | 4.0 | 6 | 0.79 |
| 4 | 5 | 1577943082 | Prayers That Avail Much for Business: Executive | subject religion spiritu busi christian live w... | Book | 455160 | 0 | 0.0 | 4 | 1.0 |


## Select Books to Test Your Recommender On

Select a small subset of books that you are interested in generating recommendations for. 

In [3]:
# Case-sensitive match: keeps titles containing lowercase 'dragon' (e.g. 'Pendragon')
dragons = metadata[metadata.Title.str.contains('dragon')]
dragons

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 19991 | 28075 | 0380781042 | Grail (The Pendragon Cycle, Book 5) | subject stephen author l epic z fantasi histor... | Book | 38590 | 30 | 3.5 | 12 | 0.52 |
| 41710 | 58385 | 0380717573 | Pendragon : Book Four of the Pendragon Cycle (... | general subject stephen author l epic z fantas... | Book | 39704 | 13 | 4.0 | 11 | 0.64 |
| 43121 | 60373 | 1928999131 | The Arthurian Companion (Pendragon, 6208) | general subject arthurian ann literatur phylli... | Book | 578570 | 4 | 5.0 | 1 | 0.0 |
| 68199 | 95111 | 0920336531 | From Mondragon to America: Experiments in Comm... | growth event general biographi subject current... | Book | 530770 | 0 | 0.0 | 2 | 1.0 |
| 92857 | 129437 | 0801433258 | Values at Work: Employee Participation Meets M... | behavior busi technic sociolog book com labor ... | Book | 1184228 | 1 | 5.0 | 3 | 0.58 |
| 96152 | 133931 | 0743437314 | The Merchant of Death (Pendragon Series #1) | literatur action book hamster fiction general ... | Book | 22160 | 91 | 5.0 | 9 | 0.2 |
| 103715 | 144586 | 0875461824 | Making Mondragon: The Growth and Dynamics of t... | general nonfict subject sociolog social book s... | Book | 616136 | 0 | 0.0 | 2 | 1.0 |
| 113866 | 158778 | 0743437322 | The Lost City of Faar (Pendragon Series #2) | adventur subject literatur monkey magic action... | Book | 11151 | 28 | 5.0 | 1 | 0.0 |
| 113890 | 158812 | 0764546511 | The Offical Pendragon Forms¿ For Palm OS® Star... | databas os general subject system palm api int... | Book | 747165 | 4 | 3.5 | 0 | 0.0 |
| 125283 | 174764 | 0877736111 | Rumi's World : The Life and Works of the Great... | art general biographi subject religion literat... | Book | 198111 | 5 | 5.0 | 8 | 0.42 |
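
Note that the filter in the cell above is case-sensitive, which is why the matches are titles such as "Pendragon" and "Mondragon" rather than titles starting with a capitalized "Dragon". As a small sketch on made-up titles (the `toy_meta` frame is hypothetical), `str.contains` accepts `case=False` to ignore capitalization and `na=False` to treat missing titles as non-matches:

```python
import pandas as pd

# Toy metadata with invented titles, including one missing title.
toy_meta = pd.DataFrame(
    {
        "ASIN": ["001", "002", "003", "004"],
        "Title": ["Dragon Rider", "The Pendragon Cycle", None, "Garden Craft"],
    }
)

# Case-sensitive match: misses the capitalized "Dragon".
# na=False counts the missing title as a non-match instead of NaN.
case_sensitive = toy_meta[toy_meta.Title.str.contains("dragon", na=False)]

# case=False ignores capitalization.
case_insensitive = toy_meta[toy_meta.Title.str.contains("dragon", case=False, na=False)]

print(case_sensitive.Title.tolist())
print(case_insensitive.Title.tolist())
```

Whether to match case-insensitively depends on which books you want in your test subset; either choice works for the rest of the lab.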


## Generate Recommendations for a Few Books of Choice

The `'books_data.edgelist'` file conveniently contains precomputed pairwise weights between items (the code below treats a higher weight as more similar). Given this preprocessed data, it's time to employ collaborative filtering to generate recommendations! Generate the top 10 recommendations for each book in the subset you chose. Be sure to print the title of each book you are generating recommendations for, along with the titles of the recommended books. 

In [4]:
rec_dict = {}
# Map each ASIN to its title for readable output
id_name_dict = dict(zip(metadata.ASIN, metadata.Title))
for row in dragons.index:
    book_id = dragons.ASIN[row]
    book_name = id_name_dict[book_id]
    # Edges touching this book, highest weight first; copy() so new columns can be added safely
    most_similar = df[(df.source == book_id)
                      | (df.target == book_id)
                     ].sort_values(by='weight', ascending=False).head(10).copy()
    most_similar['source_name'] = most_similar['source'].map(id_name_dict)
    most_similar['target_name'] = most_similar['target'].map(id_name_dict)
    recommendations = []
    for edge in most_similar.index:
        # The recommended book is whichever endpoint is not the query book
        if most_similar.source[edge] == book_id:
            recommendations.append((most_similar.target_name[edge], most_similar.weight[edge]))
        else:
            recommendations.append((most_similar.source_name[edge], most_similar.weight[edge]))
    rec_dict[book_name] = recommendations
    print('Recommendations for:', book_name)
    for r in recommendations:
        print(r)
    print('\n\n')


Recommendations for: Grail (The Pendragon Cycle, Book 5)
('The Silver Hand: Song of Albion Book 2 (Song of Albion Trilogy)', 0.77)
('The Dragon King Saga: In the Hall of the Dragon King, The Warlords of Nin, and The Sword and the Flame', 0.77)
('The Paradise War (Song of Albion, Volume 1)', 0.77)
('The Paradise War: Song of Albion Book 1 (Song of Albion Trilogy)', 0.77)
('Merlin (The Pendragon Cycle , Book 2)', 0.75)
('Taliesin : Book One of the Pendragon Cycle (Pendragon Cycle (Paperback))', 0.75)
('Arthur (The Pendragon Cycle, Book 3)', 0.75)
('The Iron Lance (The Celtic Crusades, Book 1)', 0.73)
('Avalon: : The Return of King Arthur', 0.67)
('The Silver Hand (Song of Albion, Volume 2)', 0.67)



Recommendations for: Pendragon : Book Four of the Pendragon Cycle (Pendragon Cycle, No 4)
('The Paradise War: Song of Albion Book 1 (Song of Albion Trilogy)', 0.85)
('Taliesin : Book One of the Pendragon Cycle (Pendragon Cycle (Paperback))', 0.81)
('Merlin (The Pendragon Cycle , Book 2)', 0.
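
The lookup in the cell above can also be wrapped in a small reusable function. This is one possible sketch, not the lab's reference solution: the function name `top_n_similar` and the toy edge list are hypothetical, and it assumes, as the cell above does, that a higher weight means a more similar item.

```python
import pandas as pd


def top_n_similar(edges, item_id, n=10):
    """Return (neighbor_id, weight) pairs for item_id, highest weight first."""
    # Keep edges where the item appears on either end.
    touching = edges[(edges.source == item_id) | (edges.target == item_id)]
    top = touching.sort_values(by="weight", ascending=False).head(n)
    # The neighbor is whichever endpoint is not the item itself.
    neighbor_ids = top.target.where(top.source == item_id, top.source)
    return list(zip(neighbor_ids, top.weight))


# Toy edge list with invented IDs and weights.
toy_edges = pd.DataFrame(
    {
        "source": ["A", "A", "B", "C"],
        "target": ["B", "C", "D", "D"],
        "weight": [0.7, 0.5, 0.9, 0.4],
    }
)

print(top_n_similar(toy_edges, "B", n=2))
```

With the real data, the returned IDs could then be mapped to titles through a dictionary such as `id_name_dict`.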

## Summary

Well done! In this lab, you created a recommendation system for a real-world dataset!