# Recommendation Systems - Lab

## Introduction

Now that you've had an introduction to collaborative filtering and recommendation systems, it's time to put your skills to the test and build a recommendation system for a real-world dataset! For this exercise, you'll use a dataset of book reviews from the Amazon marketplace. While the previous lesson focused on user-based recommendation systems, here you'll apply a parallel process to build an item-based recommendation system that recommends similar books at the bottom of a product page.

## Objectives

You will be able to:
* Implement a recommendation system on a real-world dataset

## Load the Dataset

In [1]:
#Your code here
import pandas as pd

df = pd.read_csv('books_data.edgelist', names=['source', 'target', 'weight'], delimiter=' ')
df.head()

| | source | target | weight |
|---|---|---|---|
| 0 | 827229534 | 0804215715 | 0.7 |
| 1 | 827229534 | 156101074X | 0.5 |
| 2 | 827229534 | 0687023955 | 0.8 |
| 3 | 827229534 | 0687074231 | 0.8 |
| 4 | 827229534 | 082721619X | 0.7 |
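
One thing to watch for with identifiers like these: some ASINs begin with a zero, and pandas drops that zero whenever it infers an all-digit column as integers. The following is a minimal sketch, using a small made-up in-memory sample rather than the real file, of how forcing the ID columns to strings keeps them intact. The sample rows and the names `sample`, `inferred`, and `as_text` are assumptions for illustration only.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the same space-delimited layout as the edge list
sample = io.StringIO(
    "0827229534 0804215715 0.7\n"
    "0827229534 0687023955 0.8\n"
    "0827229534 0687074231 0.8\n"
)
cols = ['source', 'target', 'weight']

# Default type inference: all-digit IDs become integers and lose the leading zero
inferred = pd.read_csv(sample, names=cols, delimiter=' ')

# Rewind the buffer, then read again with the ID columns forced to strings
sample.seek(0)
as_text = pd.read_csv(sample, names=cols, delimiter=' ',
                      dtype={'source': str, 'target': str})

print(inferred.loc[0, 'source'])  # integer, leading zero gone
print(as_text.loc[0, 'source'])   # string, leading zero kept
```

Whether this matters for the lab depends on how the IDs are stored in both files; the point is only that the edge list and the metadata need to agree on the ID type before they are compared.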


## Load the MetaData

In [8]:
#Your code here
meta = pd.read_csv('books_meta.txt', sep='\t')
print(meta.shape)
meta.head()

(393561, 10)


| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 827229534 | Patterns of Preaching: A Sermon Sampler | clergi sermon subject religion preach spiritu ... | Book | 396585 | 2 | 5.0 | 8 | 0.8 |
| 1 | 2 | 738700797 | Candlemas: Feast of Flames | subject witchcraft earth religion spiritu base... | Book | 168596 | 12 | 4.5 | 9 | 0.85 |
| 2 | 3 | 486287785 | World War II Allied Fighter Planes Trading Cards | general hobbi subject craft home garden book | Book | 1270652 | 1 | 5.0 | 0 | 0.0 |
| 3 | 4 | 842328327 | Life Application Bible Commentary: 1 and 2 Tim... | spiritu translat commentari christian book gui... | Book | 631289 | 1 | 4.0 | 6 | 0.79 |
| 4 | 5 | 1577943082 | Prayers That Avail Much for Business: Executive | subject religion spiritu busi christian live w... | Book | 455160 | 0 | 0.0 | 4 | 1.0 |


## Select Books to Test Your Recommender On

Select a small subset of books that you are interested in generating recommendations for. 

In [18]:
#Your code here
Data = meta[meta.Title.str.contains('Data Analysis for')]
Data

| | Id | ASIN | Title | Categories | Group | SalesRank | TotalReviews | AvgRating | DegreeCentrality | ClusteringCoeff |
|---|---|---|---|---|---|---|---|---|---|---|
| 42708 | 59816 | 205289037 | Data Analysis for Social Workers | nonfict appli mathemat subject statist technic... | Book | 582429 | 0 | 0.0 | 2 | 0.0 |
| 101211 | 141127 | 761987363 | Theory Based Data Analysis for the Social Scie... | general nonfict subject statist sociolog socia... | Book | 718337 | 0 | 0.0 | 0 | 0.0 |
| 162675 | 226907 | 333763068 | Data Construction and Data Analysis for Survey... | general nonfict subject home invest statist bu... | Book | 723293 | 0 | 0.0 | 0 | 0.0 |
| 171894 | 239492 | 471489786 | Chemometrics : Data Analysis for the Laborator... | general subject chemic analyt chemistri techni... | Book | 298288 | 0 | 0.0 | 3 | 0.0 |
| 182498 | 254370 | 963502700 | Data Analysis for Scientists and Engineers | general appli mathemat subject statist technic... | Book | 506891 | 2 | 4.5 | 5 | 0.9 |
| 208649 | 290685 | 761901078 | Methods and Data Analysis for Cross-Cultural R... | mind general counsel nonfict subject health bo... | Book | 737355 | 0 | 0.0 | 2 | 0.0 |
| 232943 | 323459 | 412063417 | Practical Data Analysis for Designed Experiments | busi technic book com probabl offic general su... | Book | 743779 | 0 | 0.0 | 0 | 0.0 |
| 301564 | 416769 | 521009766 | Experimental Design and Data Analysis for Biol... | biolog general subject methodolog educ experi ... | Book | 90629 | 0 | 0.0 | 19 | 0.48 |
| 358834 | 494701 | 764516612 | Excel Data Analysis for Dummies | introductori book spreadsheet com guid offic d... | Book | 27397 | 1 | 5.0 | 4 | 0.7 |
| 381258 | 525611 | 750650869 | Data Analysis for Database Design | analysi system busi book com offic databas gen... | Book | 1203883 | 1 | 4.0 | 0 | 0.0 |
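
The title search above is case-sensitive and assumes every title is present. As a hedged variation, the sketch below shows a case-insensitive, plain-substring match that also treats missing titles as non-matches. The `toy_meta` frame and its values are made up for illustration and are not part of the lab data.

```python
import pandas as pd

# Hypothetical miniature stand-in for the metadata table
toy_meta = pd.DataFrame({
    'ASIN': ['A1', 'A2', 'A3', 'A4'],
    'Title': ['Data Analysis for Beginners', 'data analysis for Experts',
              None, 'Cooking Basics'],
})

# regex=False: match the text literally rather than as a regular expression
# case=False:  ignore capitalisation
# na=False:    count missing titles as non-matches so the mask is purely boolean
mask = toy_meta['Title'].str.contains('data analysis for',
                                      case=False, na=False, regex=False)
subset = toy_meta[mask]
print(subset)
```

Without `na=False`, a missing title would produce a missing value in the mask, which cannot be used directly for boolean indexing.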


## Generate Recommendations for a Few Books of Choice

The `books_data.edgelist` file already contains a precomputed weight for each pair of linked books, so you don't need to compute item-to-item scores yourself. Given this preprocessed data, it's time to employ collaborative filtering to generate recommendations! Generate the top 10 recommendations for each book in the subset you chose. Be sure to print the name of the book you are generating recommendations for, as well as the names of the books being recommended. The solution below ranks by descending weight, treating a higher weight as a closer match.
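
To make the lookup concrete, here is a minimal sketch on a made-up toy edge list. It assumes edges are undirected (a book may appear as either `source` or `target`) and that a higher weight means a closer match. The `edges` frame and the `top_neighbors` helper are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy edge list: each row links two items with a weight
edges = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A'],
    'weight': [0.9, 0.4, 0.7, 0.2, 0.6],
})

def top_neighbors(edges, item, n=10):
    """Return up to n (neighbor, weight) pairs for `item`, highest weight first."""
    # An item can appear on either side of an edge, so check both columns
    touching = edges[(edges['source'] == item) | (edges['target'] == item)]
    # The neighbor is whichever endpoint is not the item itself
    other = np.where(touching['source'] == item,
                     touching['target'], touching['source'])
    ranked = (
        pd.DataFrame({'neighbor': other,
                      'weight': touching['weight'].to_numpy()})
        .sort_values('weight', ascending=False)
        .head(n)
    )
    return list(ranked.itertuples(index=False, name=None))

print(top_neighbors(edges, 'A', n=2))
```

Here `A` touches three edges (to `B`, `C`, and `D`), and asking for the top two keeps the two with the largest weights. An item with no edges simply returns an empty list.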

In [23]:
#Your code here
rec_dict = {}
# Map each ASIN to its title so results can be printed by name
id_name_dict = dict(zip(meta.ASIN, meta.Title))
for row in Data.index:
    book_id = Data.ASIN[row]
    book_name = id_name_dict[book_id]
    # A book can sit on either side of an edge; keep its 10 highest-weight edges.
    # .copy() gives an independent frame so the new columns below can be added safely.
    most_similar = df[(df.source == book_id) | (df.target == book_id)].sort_values(by='weight', ascending=False).head(10).copy()
    most_similar['source_name'] = most_similar['source'].map(id_name_dict)
    most_similar['target_name'] = most_similar['target'].map(id_name_dict)

    recommendations = []
    # Use a separate loop variable so the outer `row` is not shadowed
    for idx in most_similar.index:
        # Recommend whichever endpoint of the edge is not the query book
        if most_similar.source[idx] == book_id:
            recommendations.append((most_similar.target_name[idx], most_similar.weight[idx]))
        else:
            recommendations.append((most_similar.source_name[idx], most_similar.weight[idx]))
    rec_dict[book_name] = recommendations
    print("Recommendations for:", book_name)
    for r in recommendations:
        print(r)
    print('\n')

Recommendations for: Data Analysis for Social Workers
('Management of Human Service Programs', 0.17)
('Becoming an Effective Policy Advocate: From Policy Practice to Social Justice', 0.16)


Recommendations for: Theory Based Data Analysis for the Social Sciences (Undergraduate Research Methods & Statistics in the Social Sciences)


Recommendations for: Data Construction and Data Analysis for Survey Research


Recommendations for: Chemometrics : Data Analysis for the Laboratory and Chemical Plant
('Statistics and Chemometrics for Analytical Chemistry (4th Edition)', 0.7)
('Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building', 0.28)
('Multivariate Statistical Analysis: A Conceptual Introduction', 0.0)


Recommendations for: Data Analysis for Scientists and Engineers
('Data Reduction and Error Analysis for the Physical Sciences', 0.83)
('An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements', 0.54)
('The Statistical A

## Summary

Well done! In this lab, you created a recommendation system for a real-world dataset!