# Amazon Recommendation System - Lab

## Introduction

Now that you've been introduced to collaborative filtering and recommendation systems, it's time to put your skills to the test and build a recommendation system for a real-world dataset! For this lab, you'll use a dataset of book reviews from the Amazon marketplace. While the previous lesson focused on user-based recommendation systems, you'll apply a parallel process to build an item-based recommendation system that recommends similar books at the bottom of a product page.

## Objectives

In this lab you will: 

- Use graph-based similarity metrics to create a collaborative filtering recommender system

## Load the Dataset

In [1]:
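import pandas as pd
import networkx as nx

# Empty undirected graph; the edges loaded below can be added to it later
G = nx.Graph()

# Each row of the space-delimited edge list is: source ASIN, target ASIN, weight
df = pd.read_csv('books_data.edgelist', names=['source', 'target', 'weight'], delimiter=' ')
df.head()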

Unnamed: 0,source,target,weight
0,827229534,0804215715,0.7
1,827229534,156101074X,0.5
2,827229534,0687023955,0.8
3,827229534,0687074231,0.8
4,827229534,082721619X,0.7
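
The lab creates an empty `nx.Graph()` but only works with the DataFrame afterwards. As a minimal sketch of how an edge list in this three-column layout could be turned into a weighted graph, the example below uses `nx.from_pandas_edgelist` on a small made-up table. The node labels `'A'`, `'B'`, and `'C'` and the name `toy_graph` are hypothetical placeholders, not ASINs or names from the dataset.

```python
import pandas as pd
import networkx as nx

# Hypothetical toy edge list in the same source/target/weight layout as the lab data
toy_edges = pd.DataFrame({
    'source': ['A', 'A', 'B'],
    'target': ['B', 'C', 'C'],
    'weight': [0.7, 0.5, 0.8],
})

# Build an undirected graph; every edge keeps its 'weight' as an edge attribute
toy_graph = nx.from_pandas_edgelist(
    toy_edges, source='source', target='target', edge_attr='weight'
)

print(toy_graph.number_of_nodes(), toy_graph.number_of_edges())
print(toy_graph['A']['B']['weight'])
```

With the real data, the same call on `df` would presumably give one node per book and one weighted edge per row of the edge list.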


## Load the Metadata 

Import the metadata available in the file `'books_meta.txt'` (note that it is `'\t'`-separated).

In [2]:
books = pd.read_csv('books_meta.txt', delimiter='\t')
books.head()

Unnamed: 0,Id,ASIN,Title,Categories,Group,SalesRank,TotalReviews,AvgRating,DegreeCentrality,ClusteringCoeff
0,1,827229534,Patterns of Preaching: A Sermon Sampler,clergi sermon subject religion preach spiritu ...,Book,396585,2,5.0,8,0.8
1,2,738700797,Candlemas: Feast of Flames,subject witchcraft earth religion spiritu base...,Book,168596,12,4.5,9,0.85
2,3,486287785,World War II Allied Fighter Planes Trading Cards,general hobbi subject craft home garden book,Book,1270652,1,5.0,0,0.0
3,4,842328327,Life Application Bible Commentary: 1 and 2 Tim...,spiritu translat commentari christian book gui...,Book,631289,1,4.0,6,0.79
4,5,1577943082,Prayers That Avail Much for Business: Executive,subject religion spiritu busi christian live w...,Book,455160,0,0.0,4,1.0
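
Some ASINs in this dataset start with a zero (for example `0439076706`), and pandas may drop that zero if it infers a numeric type for the column. The sketch below shows one way to guard against this by passing `dtype` to `pd.read_csv`. It reads from an in-memory string rather than the real file, and the names `sample`, `inferred`, and `as_text` are illustrative only. Whether the lab's files need this depends on how pandas infers their column types, which is not shown here.

```python
import io
import pandas as pd

# Two tab-separated rows in the metadata layout (only three of the columns, for brevity)
sample = (
    "Id\tASIN\tTitle\n"
    "12601\t0439076706\tDraw the Titanic\n"
    "36836\t0152013059\tSOS Titanic\n"
)

# Default type inference: an all-digit column is parsed as integers, losing the leading zero
inferred = pd.read_csv(io.StringIO(sample), delimiter='\t')

# Forcing the column to str keeps every ASIN exactly as written
as_text = pd.read_csv(io.StringIO(sample), delimiter='\t', dtype={'ASIN': str})

print(inferred['ASIN'].tolist())
print(as_text['ASIN'].tolist())
```

Keeping ASINs as strings in both the metadata and the edge list would make lookups such as `df.source == asin` compare like with like.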


## Select Books to Test Your Recommender On

Select a small subset of books that you are interested in generating recommendations for. 

In [10]:
titanic = books[books.Title.str.contains('Titanic')].head(10)
titanic

Unnamed: 0,Id,ASIN,Title,Categories,Group,SalesRank,TotalReviews,AvgRating,DegreeCentrality,ClusteringCoeff
8846,12601,0439076706,Draw the Titanic,art subject draw music book children,Book,814492,3,3.0,0,0.0
20367,28589,1557833559,Titanic: The Complete Book of the Musical : St...,songbook art general entertain subject perform...,Book,492241,5,4.5,2,0.0
26199,36836,0152013059,SOS Titanic,literatur boat eve book fiction general subjec...,Book,23694,52,4.0,4,0.0
27382,38481,1885508271,Titanic Reference Map,general nonfict subject literatur in fiction m...,Book,1273676,0,0.0,0,0.0
27439,38556,031331215X,The Titanic : Historiography and Annotated Bib...,europ book general subject transport store stu...,Book,2299550,1,3.0,0,0.0
29949,42023,0609601032,Douglas Adams's Starship Titanic,general subject author j terri z fantasi book ...,Book,111681,108,3.0,4,0.9
35292,49313,0393315134,Titanic: Destination Disaster : The Legends an...,subject shipwreck world titan ship book transp...,Book,221722,6,4.5,4,0.6
38994,54560,059033123X,Titanic: The Long Night,general subject boat natur world titan europ a...,Book,567346,76,5.0,1,0.0
40806,57094,0786710055,The Mammoth Book of the Titanic: Contemporary ...,historiographi general subject shipwreck world...,Book,639211,1,3.0,5,0.84
44082,61718,0965520994,Dusk to Dawn: Survivor Accounts of the Last Ni...,nonfict general subject world titan ship book ...,Book,551218,9,4.5,0,0.0
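
The selection above relies on every title being a string and on an exact-case match. As a hedged variation, the sketch below passes `case=False` and `na=False` to `Series.str.contains`, so that differently capitalized titles match and missing titles are treated as non-matches. The table `toy_books` and its placeholder ASINs are invented for illustration.

```python
import pandas as pd

# Hypothetical metadata sample, including a missing title and a lowercase title
toy_books = pd.DataFrame({
    'ASIN': ['X1', 'X2', 'X3', 'X4'],
    'Title': ['Draw the Titanic', 'SOS Titanic', None, 'raising the titanic'],
})

# case=False ignores capitalization; na=False makes missing titles count as "no match"
mask = toy_books['Title'].str.contains('titanic', case=False, na=False)
subset = toy_books[mask]

print(subset['Title'].tolist())
```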


## Generate Recommendations for a Few Books of Choice

The `'books_data.edgelist'` file conveniently contains precomputed distances between items. Given this preprocessed data, it's time to employ collaborative filtering to generate recommendations! Generate the top 10 recommendations for each book in the subset you chose. Be sure to print the title of the book you are generating recommendations for, as well as the titles of the books being recommended.

In [11]:
recommendations = {}
# Map each ASIN to its title for readable output
book_titles = dict(zip(books.ASIN, books.Title))
for idx in titanic.index:
    asin = titanic.ASIN[idx]
    title = book_titles[asin]
    # Edges touching this book, highest weight first; keep the top 10 as the task asks
    rec = df[(df.source == asin) | (df.target == asin)].sort_values(by='weight', ascending=False).head(10).copy()
    rec['source_name'] = rec['source'].map(book_titles)
    rec['target_name'] = rec['target'].map(book_titles)
    rec_list = []
    for edge in rec.index:
        # The recommended book is whichever end of the edge is not the query book
        if rec.source[edge] == asin:
            rec_list.append(rec.target_name[edge])
        else:
            rec_list.append(rec.source_name[edge])
    recommendations[title] = rec_list
    print(f'Recommendations for: {title}')
    for book in rec_list:
        print(book)
    print('\n\n')

Recommendations for: Draw the Titanic



Recommendations for: Titanic: The Complete Book of the Musical : Story and Book by Peter Stone, Music and Lyrics by Maury Yeston
Titanic: A New Musical
A Chorus Line : The Complete Book of the Musical



Recommendations for: SOS Titanic
Inside the Titanic : A Giant Cut-away Book (Giant Cutaway Book)
Across Five Aprils
Titanic Crossing
Parallel Journeys



Recommendations for: Titanic Reference Map



Recommendations for: The Titanic : Historiography and Annotated Bibliography (Bibliographies and Indexes in World History)



Recommendations for: Douglas Adams's Starship Titanic
Long Dark Tea Time of the Soul
The Salmon of Doubt: Hitchhiking the Galaxy One Last Time
Mostly Harmless
Last Chance to See



Recommendations for: Titanic: Destination Disaster : The Legends and the Reality
Titanic: A Journey Through Time
A Night to Remember
The Loss of the S.S. Titanic: Its Story and Its Lessons, By One of the Survivors
Titanic: Triumph and Tragedy



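
The lookup logic in the cell above can also be packaged as a reusable function. The following is a minimal sketch under the same assumption the lab code makes, namely that a higher `weight` means a stronger recommendation. The function name `top_n_neighbors`, the table `toy_edges`, and the single-letter labels are hypothetical and stand in for real ASINs.

```python
import pandas as pd

def top_n_neighbors(edges, asin, n=10):
    """Return up to n items linked to `asin`, highest weight first."""
    linked = edges[(edges['source'] == asin) | (edges['target'] == asin)]
    linked = linked.sort_values(by='weight', ascending=False).head(n)
    # The queried item can sit in either column, so take the opposite end of each edge
    return [t if s == asin else s for s, t in zip(linked['source'], linked['target'])]

# Hypothetical toy edge list in the source/target/weight layout
toy_edges = pd.DataFrame({
    'source': ['A', 'B', 'C', 'D'],
    'target': ['B', 'C', 'A', 'A'],
    'weight': [0.7, 0.9, 0.5, 0.8],
})

print(top_n_neighbors(toy_edges, 'A', n=2))
```

An item with no edges simply yields an empty list, which mirrors the empty recommendation lists printed for books whose `DegreeCentrality` is 0.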

## Summary

Well done! In this lab, you created a recommendation system for a real-world dataset!