# Recommendation Systems - Lab

## Introduction

Now that you've had an introduction to collaborative filtering and recommendation systems, it's time to put your skills to the test and build a recommendation system for a real-world dataset. For this exercise, you'll use a dataset of book reviews from the Amazon marketplace. While the previous lesson focused on user-based recommendation systems, here you'll apply a parallel process to build an item-based recommendation system that recommends similar books at the bottom of a product page.

## Objectives

You will be able to:
* Implement a recommendation system on a real-world dataset

## Load the Dataset

In [1]:
import pandas as pd

df = pd.read_csv('books_data.edgelist', names=['source', 'target', 'weight'], delimiter=' ')
df.head()

| | source | target | weight |
|---|---|---|---|
| 0 | 827229534 | 0804215715 | 0.7 |
| 1 | 827229534 | 156101074X | 0.5 |
| 2 | 827229534 | 0687023955 | 0.8 |
| 3 | 827229534 | 0687074231 | 0.8 |
| 4 | 827229534 | 082721619X | 0.7 |
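ASINs are identifiers rather than numbers, and some begin with `0` or end with `X`. If pandas infers a numeric type for an ID column, leading zeros are dropped and comparisons against string IDs can fail silently. The following is a minimal sketch, assuming the same space-delimited layout as the call above, that reads both ID columns as text. The in-memory sample and the leading zero on its source ID are assumptions made for illustration, not the contents of the real file.

```python
import io
import pandas as pd

# Hypothetical three-line sample in the same space-delimited layout.
# The leading zero on the source ID is an assumption for illustration.
sample = io.StringIO(
    "0827229534 0804215715 0.7\n"
    "0827229534 156101074X 0.5\n"
    "0827229534 0687023955 0.8\n"
)

edges = pd.read_csv(
    sample,
    names=['source', 'target', 'weight'],
    delimiter=' ',
    dtype={'source': str, 'target': str},  # keep IDs as text, not integers
)

print(edges.dtypes)
print(edges.source.iloc[0])
```

With the IDs stored as strings, filters such as `edges.source == some_asin` compare like with like.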


In [2]:
import networkx as nx
G = nx.Graph()
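The graph created above starts out empty. If you want to explore the data as a network, one option is to build it directly from the edge list with `nx.from_pandas_edgelist`. The sketch below uses a small stand-in DataFrame (rows copied from the output above) rather than the full file, and the name `edges` is an assumption for illustration.

```python
import networkx as nx
import pandas as pd

# Small stand-in for the edge list loaded earlier (rows from the output above)
edges = pd.DataFrame({
    'source': ['827229534', '827229534', '827229534'],
    'target': ['0804215715', '156101074X', '0687023955'],
    'weight': [0.7, 0.5, 0.8],
})

# Build an undirected graph and keep each edge's weight as an attribute
G = nx.from_pandas_edgelist(edges, source='source', target='target', edge_attr='weight')

# List the neighbors of one book, highest weight first
neighbors = sorted(G['827229534'].items(), key=lambda kv: kv[1]['weight'], reverse=True)
print([(node, attrs['weight']) for node, attrs in neighbors])
```

The rest of this lab works on the DataFrame directly, so the graph is optional.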

## Load the MetaData

In [3]:
meta = pd.read_csv('books_meta.txt', sep='\t')
meta.head()

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 827229534 | Patterns of Preaching: A Sermon Sampler | clergi sermon subject religion preach spiritu ... | Book | 396585 | 2 | 5.0 | 8 | 0.8 |
| 1 | 2 | 738700797 | Candlemas: Feast of Flames | subject witchcraft earth religion spiritu base... | Book | 168596 | 12 | 4.5 | 9 | 0.85 |
| 2 | 3 | 486287785 | World War II Allied Fighter Planes Trading Cards | general hobbi subject craft home garden book | Book | 1270652 | 1 | 5.0 | 0 | 0.0 |
| 3 | 4 | 842328327 | Life Application Bible Commentary: 1 and 2 Tim... | spiritu translat commentari christian book gui... | Book | 631289 | 1 | 4.0 | 6 | 0.79 |
| 4 | 5 | 1577943082 | Prayers That Avail Much for Business: Executive | subject religion spiritu busi christian live w... | Book | 455160 | 0 | 0.0 | 4 | 1.0 |


## Select Books to Test Your Recommender On

Select a small subset of books that you are interested in generating recommendations for. 

In [4]:
OUT = meta[meta.Title.str.contains('Outlander')]
OUT

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 20818 | 29253 | 0385302304 | Outlander | general travel subject author z gabaldon g his... | Book | 65326 | 1015 | 4.5 | 5 | 0.9 |
| 27984 | 39303 | 0373638329 | Outlanders: Tomb of Time | adventur general subject literatur fantasi boo... | Book | 730193 | 9 | 5.0 | 0 | 0.0 |
| 51498 | 71930 | 0373638388 | Talon and Fang (Outlanders #25) | adventur general subject literatur action fant... | Book | 570026 | 10 | 5.0 | 1 | 0.0 |
| 96160 | 133941 | 0812571134 | The Outlanders (The Lon Tobyn Chronicle, Book 2) | subject epic fantasi book scienc fiction | Book | 112505 | 16 | 4.0 | 3 | 0.58 |
| 100066 | 139465 | 1569715149 | Outlander - The Exile of Sharad Hett (Star War... | literatur popular book graphic fiction subject... | Book | 25245 | 6 | 4.5 | 5 | 0.84 |
| 132013 | 184167 | 0373638299 | Tigers Of Heaven (Outlanders) | adventur general subject fantasi book scienc f... | Book | 625820 | 11 | 5.0 | 2 | 1.0 |
| 132990 | 185498 | 0440212561 | Outlander | general travel subject author z paperback gaba... | Book | 1596 | 1015 | 4.5 | 19 | 0.78 |
| 147989 | 206360 | 0553473298 | Outlander | general travel subject literatur format author... | Book | 233840 | 1015 | 4.5 | 5 | 0.9 |
| 155271 | 216627 | 0553714538 | Outlander | general subject literatur diana author cd z ga... | Book | 157609 | 1015 | 4.5 | 5 | 0.9 |
| 231703 | 321778 | 0373638280 | Doom Dynasty (Outlanders # 15) | adventur general subject literatur action fant... | Book | 696835 | 7 | 5.0 | 2 | 1.0 |
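The substring filter above also matches titles such as "Outlanders: Tomb of Time". By default, `str.contains` treats the pattern as a regular expression, is case-sensitive, and propagates missing values into the mask. Here is a minimal sketch of a more defensive selection, assuming a tiny hypothetical table (`meta_demo`, with a made-up row that has no title) in place of the real metadata:

```python
import pandas as pd

# Hypothetical mini metadata table: titles come from the outputs above,
# and the last row (placeholder ASIN, missing title) is made up for illustration.
meta_demo = pd.DataFrame({
    'ASIN': ['0385302304', '0373638329', '486287785', 'PLACEHOLDER'],
    'Title': ['Outlander', 'Outlanders: Tomb of Time',
              'World War II Allied Fighter Planes Trading Cards', None],
})

# Literal, case-insensitive substring match; rows with no title count as non-matches
mask = meta_demo.Title.str.contains('outlander', case=False, regex=False, na=False)
subset = meta_demo[mask]
print(subset)

# Exact title match, for when only one specific work is wanted
exact = meta_demo[meta_demo.Title == 'Outlander']
print(exact)
```

Which variant fits depends on whether you want every related title or only the editions of a single book.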


## Generate Recommendations for a Few Books of Choice

The `books_data.edgelist` file conveniently contains a precomputed `weight` for each pair of books, so you do not need to calculate the relationship between items yourself. The solution in this lab sorts by `weight` in descending order, treating higher values as more similar. Given this preprocessed data, it's time to employ collaborative filtering to generate recommendations! Generate the top 10 recommendations for each book in the subset you chose. Be sure to print the name of the book you are generating recommendations for, as well as the names of the books being recommended.

In [5]:
rec_dict = {}
# Map each ASIN to its title so recommendations can be printed by name
id_name_dict = dict(zip(meta.ASIN, meta.Title))
for row in OUT.index:
    book_id = OUT.ASIN[row]
    book_name = id_name_dict[book_id]
    # A book can appear at either end of an edge, so check both columns
    most_similar = df[(df.source==book_id) | (df.target==book_id)].sort_values(by='weight', ascending=False).head(10).copy()
    most_similar['source_name'] = most_similar['source'].map(id_name_dict)
    most_similar['target_name'] = most_similar['target'].map(id_name_dict)
    recommendations = []
    # Use a distinct loop variable so the outer `row` is not shadowed
    for idx in most_similar.index:
        # Recommend whichever end of the edge is not the selected book
        if most_similar.source[idx] == book_id:
            recommendations.append((most_similar.target_name[idx], most_similar.weight[idx]))
        else:
            recommendations.append((most_similar.source_name[idx], most_similar.weight[idx]))
    # Note: editions that share a title overwrite each other's entry in rec_dict
    rec_dict[book_name] = recommendations
    print("Recommendations for:", book_name)
    for r in recommendations:
        print(r)
    print('\n\n')

Recommendations for: Outlander
('Dragonfly in Amber', 0.86)
('Voyager', 0.86)
('Drums of Autumn', 0.71)
('The Fiery Cross', 0.59)
('Lord John and the Private Matter', 0.17)



Recommendations for: Outlanders: Tomb of Time



Recommendations for: Talon and Fang (Outlanders #25)
('Dragoneye (Outlanders #22)', 0.91)



Recommendations for: The Outlanders (The Lon Tobyn Chronicle, Book 2)
('Rules of Ascension (Winds of the Forelands, Book 1)', 0.86)
('Children of Amarid (The Lon Tobyn Chronicle, Book 1)', 0.86)
('Seeds of Betrayal: Book 2 of the Winds of the Forelands Tetralogy', 0.86)



Recommendations for: Outlander - The Exile of Sharad Hett (Star Wars: Ongoing, Volume 2)
('Star Wars - Jedi Council: Acts of War', 1.0)
('Emissaries to Malastare (Star Wars: Ongoing, Volume 3)', 1.0)
('Star Wars: The Hunt for Aurra Sing', 0.85)
('Twilight (Star Wars: Ongoing, Volume 4)', 0.85)
('Star Wars: Darkness', 0.46)



Recommendations for: Tigers Of Heaven (Outlanders)
('Doom Dynasty (Outlanders # 
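The lookup used in this section can also be written as a small reusable function. The sketch below is one possible formulation, not the lab's official solution: the function name `top_n_similar` and the toy edge list are hypothetical, and it assumes higher weights mean greater similarity, as the sorting above does.

```python
import pandas as pd

def top_n_similar(edges, book_id, n=10):
    """Return the n neighbors of book_id with the highest edge weight."""
    # Keep only edges that touch the book at either end
    touching = edges[(edges.source == book_id) | (edges.target == book_id)]
    # The neighbor is the target when the book is the source, otherwise the source
    neighbor = touching.target.where(touching.source == book_id, touching.source)
    result = pd.DataFrame({'neighbor': neighbor, 'weight': touching.weight})
    return result.sort_values('weight', ascending=False).head(n).reset_index(drop=True)

# Hypothetical toy edge list; IDs and weights are made up for illustration
toy = pd.DataFrame({
    'source': ['A', 'B', 'A', 'C'],
    'target': ['B', 'C', 'D', 'A'],
    'weight': [0.9, 0.4, 0.2, 0.6],
})

recs = top_n_similar(toy, 'A', n=2)
print(recs)
```

Because several editions in the table above share the title "Outlander", storing results under the ASIN rather than the title avoids one edition's recommendations replacing another's.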

## Summary

Well done! In this lab, you built an item-based recommendation system for a real-world dataset!