# Recommendation Systems - Lab

## Introduction

Now that you've had an introduction to collaborative filtering and recommendation systems, it's time to put your skills to the test and build a recommendation system for a real-world dataset! For this exercise, you'll use a dataset of book reviews from the Amazon marketplace. While the previous lesson focused on user-based recommendation systems, here you'll apply a parallel process to build an item-based recommendation system that recommends similar books at the bottom of a product page.

## Objectives

You will be able to:
* Implement a recommendation system on a real-world dataset

## Load the Dataset

In [28]:
#Your code here
import pandas as pd

df = pd.read_csv('books_data.edgelist', header=None, sep=' ')
df.columns = ['source', 'target', 'weight']
df.head()

| | source | target | weight |
|---|---|---|---|
| 0 | 827229534 | 0804215715 | 0.7 |
| 1 | 827229534 | 156101074X | 0.5 |
| 2 | 827229534 | 0687023955 | 0.8 |
| 3 | 827229534 | 0687074231 | 0.8 |
| 4 | 827229534 | 082721619X | 0.7 |
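ASINs are identifiers rather than numbers: some begin with `0` and some end with `X`. If pandas infers an integer type for an ID column, the leading zeros are dropped and a string lookup such as `df['source'] == '0738700797'` would match nothing. A minimal sketch of a more defensive load, assuming the same three space-separated columns. The `sample_edges` text and the IDs in it are hypothetical stand-ins for the real file, and `edges` plays the role of `df`:

```python
import io
import pandas as pd

# Hypothetical stand-in for a few lines of 'books_data.edgelist'
# (made-up IDs, not values from the real file)
sample_edges = io.StringIO(
    "0000000011 000000002X 0.7\n"
    "0000000011 0000000033 0.5\n"
    "0000000033 000000002X 0.9\n"
)

# Name the columns at read time and force both ID columns to strings
edges = pd.read_csv(
    sample_edges,
    sep=' ',
    header=None,
    names=['source', 'target', 'weight'],
    dtype={'source': str, 'target': str},
)

# The leading zeros survive, so string comparisons behave as expected
match = edges.loc[edges['source'] == '0000000033']
print(match)
```

With the real file, the same `names` and `dtype` arguments could be passed alongside the path used in the cell above.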


## Load the MetaData

In [29]:
#Your code here
meta_data = pd.read_csv('books_meta.txt', sep='\t')
meta_data.head()

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 827229534 | Patterns of Preaching: A Sermon Sampler | clergi sermon subject religion preach spiritu ... | Book | 396585 | 2 | 5.0 | 8 | 0.8 |
| 1 | 2 | 738700797 | Candlemas: Feast of Flames | subject witchcraft earth religion spiritu base... | Book | 168596 | 12 | 4.5 | 9 | 0.85 |
| 2 | 3 | 486287785 | World War II Allied Fighter Planes Trading Cards | general hobbi subject craft home garden book | Book | 1270652 | 1 | 5.0 | 0 | 0.0 |
| 3 | 4 | 842328327 | Life Application Bible Commentary: 1 and 2 Tim... | spiritu translat commentari christian book gui... | Book | 631289 | 1 | 4.0 | 6 | 0.79 |
| 4 | 5 | 1577943082 | Prayers That Avail Much for Business: Executive | subject religion spiritu busi christian live w... | Book | 455160 | 0 | 0.0 | 4 | 1.0 |


## Select Books to Test Your Recommender On

Select a small subset of books that you are interested in generating recommendations for. 

In [32]:
#Your code here

sample = meta_data.sample(50)
len(sample)

50
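`sample()` draws different rows on every run, and a randomly chosen book may have no outgoing edges in the edge list, in which case there is nothing to recommend for it. One possible refinement, sketched on tiny hypothetical frames (`meta` and `edges` stand in for `meta_data` and `df`, and the IDs are made up), fixes the random seed and restricts the draw to books that appear as a `source`:

```python
import pandas as pd

# Hypothetical miniature versions of the metadata and edge list
meta = pd.DataFrame({
    'ASIN': ['A1', 'A2', 'A3', 'A4'],
    'Title': ['Book one', 'Book two', 'Book three', 'Book four'],
})
edges = pd.DataFrame({
    'source': ['A1', 'A1', 'A3'],
    'target': ['A2', 'A3', 'A4'],
    'weight': [0.9, 0.4, 0.7],
})

# Keep only books that have at least one outgoing edge
has_edges = meta[meta['ASIN'].isin(edges['source'])]

# random_state makes the draw repeatable across runs
subset = has_edges.sample(n=2, random_state=0)
print(subset[['ASIN', 'Title']])
```

In the lab itself, `n` would be 50 and the frames would be the full `meta_data` and `df`.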

In [33]:
sample.head()

| | ASIN | Title |
|---|---|---|
| 159822 | 521892759 | A Critique of Max Weber's Philosophy of Social... |
| 391776 | 590059750 | Abby and the Notorious Neighbor (Baby-Sitters ... |
| 39152 | 70536805 | English-Spanish/Spanish-English Medical Dictio... |
| 191062 | 310678706 | Faith Lessons on the Promised Land (Church Vol... |
| 332051 | 881064335 | The Hippopotamus (Animal Close-Ups) |


In [34]:
df.loc[df['source']=='0738700797']

| | source | target | weight |
|---|---|---|---|
| 86 | 738700797 | 738700827 | 0.57 |
| 87 | 738700797 | 1567184960 | 0.35 |
| 88 | 738700797 | 1567182836 | 0.53 |
| 89 | 738700797 | 738700525 | 0.5 |
| 90 | 738700797 | 738700940 | 0.73 |
| 91 | 738700797 | 835608158 | 0.31 |
| 92 | 738700797 | 835607690 | 0.17 |
| 93 | 738700797 | 1567187196 | 0.4 |
| 94 | 738700797 | 140196161 | 0.57 |


## Generate Recommendations for a Few Books of Choice

The 'books_data.edgelist' file conveniently contains a precomputed weight for each pair of linked books, which the code below treats as a similarity score (a higher weight means a closer match). Given this preprocessed data, it's time to employ collaborative filtering to generate recommendations! Generate the top 10 recommendations for each book in the subset you chose. Be sure to print the name of the book you are generating recommendations for, as well as the names of the books being recommended.

In [39]:
#Your code here
def book_recommendations(book_id):
    # Select the edges that start at this book, then sort them so the
    # highest-weight targets come first. Sorting into a new DataFrame
    # (rather than in place on a slice of df) avoids pandas'
    # SettingWithCopyWarning.
    similar_books = df.loc[df['source'] == book_id]
    similar_books = similar_books.sort_values(by='weight', ascending=False)
    book_list = list(similar_books['target'].head(10))

    # Look up the title of each recommended book in the metadata.
    # Each lookup returns a one-element Series, not a plain string.
    names = []
    for book in book_list:
        name = meta_data.loc[meta_data['ASIN'] == book]['Title']
        names.append(name)
    return names

In [41]:
book_recommendations('0070536805')

[326785    Delmar's English and Spanish Pocket Dictionary...
 Name: Title, dtype: object,
 369076    Diccionario Mosby Ingles-espanol/espanol-ingle...
 Name: Title, dtype: object,
 171966    English and Spanish Medical Words and Phrases ...
 Name: Title, dtype: object,
 227121    Speedy Spanish for Medical Personnel (Speedy L...
 Name: Title, dtype: object,
 312558    Spanish for Health Care Professionals
 Name: Title, dtype: object,
 100641    Medical Spanish
 Name: Title, dtype: object,
 155510    Medical Spanish: An Instant Translator
 Name: Title, dtype: object,
 244858    Say It in Spanish: A Guide for Health Care Pro...
 Name: Title, dtype: object,
 137880    NTC's Dictionary of Spanish False Cognates
 Name: Title, dtype: object]
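
The output above is a list of one-element Series rather than plain titles, and it does not name the book the recommendations are for. One way to tidy this, offered as a sketch rather than the lab's reference solution, is to join the top edges to the metadata with `merge`. The function names `recommend_titles` and `print_recommendations` and the small `edges` and `meta` frames are hypothetical; in the lab the frames would be `df` and `meta_data`.

```python
import pandas as pd

# Hypothetical miniature versions of the edge list and metadata
edges = pd.DataFrame({
    'source': ['A1', 'A1', 'A1', 'A3'],
    'target': ['A2', 'A3', 'A4', 'A4'],
    'weight': [0.4, 0.9, 0.7, 0.5],
})
meta = pd.DataFrame({
    'ASIN': ['A1', 'A2', 'A3', 'A4'],
    'Title': ['Book one', 'Book two', 'Book three', 'Book four'],
})

def recommend_titles(book_id, edges, meta, n=10):
    """Return the titles of the n highest-weight targets for book_id."""
    top = (edges.loc[edges['source'] == book_id]
                .sort_values(by='weight', ascending=False)
                .head(n))
    # A left merge keeps the rows in the order of `top`, i.e. by weight
    merged = top.merge(meta, left_on='target', right_on='ASIN', how='left')
    return merged['Title'].tolist()

def print_recommendations(book_id, edges, meta, n=10):
    """Print the source book's title followed by its recommendations."""
    title = meta.loc[meta['ASIN'] == book_id, 'Title']
    # Fall back to the raw ID if the book is missing from the metadata
    name = title.iloc[0] if not title.empty else book_id
    print(f"Recommendations for: {name}")
    for rec in recommend_titles(book_id, edges, meta, n):
        print(f"  - {rec}")

print_recommendations('A1', edges, meta)
```

Looping `print_recommendations` over the ASINs in your sample would cover the whole subset. A book with fewer than `n` edges simply yields a shorter list, as with the nine results shown above.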

## Summary

Well done! In this lab, you built an item-based recommendation system for a real-world dataset!