📘 Business Objective

The primary business objective of this Book Recommendation System is to increase user engagement and satisfaction by delivering personalized, relevant book suggestions. Users face more titles than they can reasonably browse, so the system simplifies the choice by finding patterns in user–item interactions and using similarity-based algorithms to recommend books that match individual preferences. Better discoverability and less search effort support platform retention, longer browsing sessions, and more intuitive exploration of the available titles. For the business, this means higher user satisfaction, improved conversion rates, and targeted content that matches user interests, which in turn supports strategic growth and long-term loyalty.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [54]:
books = pd.read_csv('/content/Books.csv')
users = pd.read_csv('/content/Users.csv')
ratings = pd.read_csv('/content/Ratings.csv')
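
A note on ISBN handling: the books.head() output shows ISBNs without leading zeros (195153448, while the image URL carries 0195153448), whereas Ratings.csv keeps them (034545104X). If the zeros are present in the CSV and are lost on load, reading the column as text would preserve them and keep the later merge on ISBN from silently dropping rows. The sketch below uses an in-memory stand-in for Books.csv, not the real file:

```python
import io

import pandas as pd

# Tiny stand-in for Books.csv; the real file is not reproduced here.
sample_csv = io.StringIO(
    "ISBN,Book-Title\n"
    "0195153448,Classical Mythology\n"
    "0002005018,Clara Callan\n"
)

# dtype={"ISBN": str} stops pandas from parsing the column as integers,
# which would strip the leading zeros.
books_sample = pd.read_csv(sample_csv, dtype={"ISBN": str})
print(books_sample["ISBN"].tolist())
```

If the zeros are already missing in the file itself, this option will not bring them back and the ISBNs would need to be repaired at the source.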

In [55]:
books.head()

Unnamed: 0,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
0,195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...
1,2005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...
2,60973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...
3,374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...
4,393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton & Company,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...


In [56]:
users.head()

Unnamed: 0,User-ID,Location,Age
0,1,"nyc, new york, usa",
1,2,"stockton, california, usa",18.0
2,3,"moscow, yukon territory, russia",
3,4,"porto, v.n.gaia, portugal",17.0
4,5,"farnborough, hants, united kingdom",


In [57]:
ratings

Unnamed: 0,User-ID,ISBN,Book-Rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6
...,...,...,...
1149775,276704,1563526298,9
1149776,276706,0679447156,0
1149777,276709,0515107662,10
1149778,276721,0590442449,10


In [58]:
books = books[['ISBN', 'Book-Title', 'Book-Author', 'Year-Of-Publication', 'Publisher']]

In [59]:
books.rename(columns = {'Book-Title':'title','Book-Author':'author','Year-Of-Publication':'year','Publisher':'publisher'},inplace = True)

In [60]:
books.head()

Unnamed: 0,ISBN,title,author,year,publisher
0,195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press
1,2005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada
2,60973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial
3,374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux
4,393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton & Company


In [61]:
users.head()

Unnamed: 0,User-ID,Location,Age
0,1,"nyc, new york, usa",
1,2,"stockton, california, usa",18.0
2,3,"moscow, yukon territory, russia",
3,4,"porto, v.n.gaia, portugal",17.0
4,5,"farnborough, hants, united kingdom",


In [62]:
users.rename(columns = {'User-ID':'user_id','Location':'location','Age':'age'},inplace = True)

In [63]:
users.head()

Unnamed: 0,user_id,location,age
0,1,"nyc, new york, usa",
1,2,"stockton, california, usa",18.0
2,3,"moscow, yukon territory, russia",
3,4,"porto, v.n.gaia, portugal",17.0
4,5,"farnborough, hants, united kingdom",


In [64]:
ratings.head()

Unnamed: 0,User-ID,ISBN,Book-Rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


In [65]:
ratings.rename(columns = {'User-ID':'user_id','Book-Rating':'rating'},inplace = True)

In [66]:
ratings.head()

Unnamed: 0,user_id,ISBN,rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


In [67]:
print(books.shape)
print(users.shape)
print(ratings.shape)

(129038, 5)
(278858, 3)
(1149780, 3)


In [68]:
x = ratings['user_id'].value_counts()>200

In [69]:
y = x[x].index

In [70]:
y

Index([ 11676, 198711, 153662,  98391,  35859, 212898, 278418,  76352, 110973,
       235105,
       ...
       116122,  44296,  28634,  59727,  73681, 274808, 188951,   9856, 155916,
       268622],
      dtype='int64', name='user_id', length=899)

In [71]:
ratings = ratings[ratings['user_id'].isin(y)]
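
The three cells above (count ratings per user, keep the user IDs over the threshold, filter with isin) can be read as one step. A minimal sketch on toy data, with hypothetical names (ratings_demo, min_ratings, active_users) and a threshold of 2 standing in for the notebook's 200:

```python
import pandas as pd

# Toy ratings: user 1 has three ratings, user 2 has one, user 3 has two.
ratings_demo = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 3, 3],
    "rating": [5, 0, 8, 7, 0, 9],
})

min_ratings = 2  # the notebook uses 200

# Number of ratings per user, indexed by user_id.
counts = ratings_demo["user_id"].value_counts()

# Keep only users with strictly more than min_ratings ratings.
active_users = counts[counts > min_ratings].index
filtered = ratings_demo[ratings_demo["user_id"].isin(active_users)]
print(filtered)
```

Only user 1 survives here, because the comparison is strict (`>`), just as in the notebook.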

In [72]:
ratings

Unnamed: 0,user_id,ISBN,rating
1456,277427,002542730X,10
1457,277427,0026217457,0
1458,277427,003008685X,8
1459,277427,0030615321,0
1460,277427,0060002050,0
...,...,...,...
1147612,275970,3829021860,0
1147613,275970,4770019572,0
1147614,275970,896086097,0
1147615,275970,9626340762,8


In [73]:
ratings_with_books = ratings.merge(books,on ='ISBN')

In [74]:
ratings_with_books.shape

(377451, 7)

In [75]:
number_rating = ratings_with_books.groupby('title')['rating'].count().reset_index()

In [76]:
number_rating.rename(columns = {'rating':'number of rating'},inplace = True)

In [77]:
number_rating

Unnamed: 0,title,number of rating
0,A Light in the Storm: The Civil War Diary of ...,2
1,Beyond IBM: Leadership Marketing and Finance ...,1
2,Dark Justice,1
3,Earth Prayers From around the World: 365 Pray...,3
4,Final Fantasy Anthology: Official Strategy Gu...,3
...,...,...
84597,Ã?coute ma diffÃ©rence (Le Temps des femmes),1
84598,Ã?ngeles fugaces (Falling Angels),1
84599,Ã?Â?rger mit Produkt X. Roman.,1
84600,Ã?Â?stlich der Berge.,1


In [78]:
final_rating = ratings_with_books.merge(number_rating,on = 'title')

In [79]:
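# .copy() gives an independent frame, so the in-place drop_duplicates below
# does not operate on a slice of final_rating.
final_ratings = final_rating[final_rating['number of rating'] >= 50].copy()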

In [80]:
final_ratings.shape

(60464, 8)
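
The count, rename, merge, and filter sequence above can also be written with groupby(...).transform("count"), which attaches the per-title count to every row directly. This is an alternative sketch on toy data, not the notebook's own code; the names demo, num_ratings, and min_count are hypothetical, and 3 stands in for the threshold of 50:

```python
import pandas as pd

# Toy merged frame: title "A" has three ratings, title "B" has two.
demo = pd.DataFrame({
    "user_id": [1, 2, 3, 1, 2],
    "title": ["A", "A", "A", "B", "B"],
    "rating": [5, 0, 8, 7, 9],
})

min_count = 3  # the notebook uses 50

# transform("count") returns one value per row: the size of that row's group.
demo["num_ratings"] = demo.groupby("title")["rating"].transform("count")

# .copy() makes later in-place operations safe on the filtered frame.
popular = demo[demo["num_ratings"] >= min_count].copy()
print(popular)
```

Both routes should select the same rows; the transform version just avoids the intermediate count table and the second merge.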

In [81]:
final_ratings.duplicated(['user_id','title']).sum()

np.int64(1859)

In [82]:
final_ratings.drop_duplicates(['user_id','title'],inplace = True)

In [83]:
final_ratings.shape

(58605, 8)

In [84]:
book_pivot = final_ratings.pivot_table(columns = 'user_id', index = 'title',values = 'rating')

In [85]:
book_pivot

title \ user_id,254,2276,2766,2977,3363,3757,4017,4385,6242,6251,...,274004,274061,274301,274308,274808,275970,277427,277478,277639,278418
1984,9.0,,,,,,,,,,...,,,,,,0.0,,,,
1st to Die: A Novel,,,,,,,,,,,...,,,,,,,,,,
2nd Chance,,10.0,,,,,,,,,...,,,,0.0,,,,,0.0,
4 Blondes,,,,,,,,,,0.0,...,,,,,,,,,,
84 Charing Cross Road,,,,,,,,,,,...,,,,,,10.0,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Year of Wonders,,,,7.0,,,,,7.0,,...,,,,,,0.0,,,,
You Belong To Me,,,,,,,,,,,...,,,,,,,,,,
Zen and the Art of Motorcycle Maintenance: An Inquiry into Values,,,,,0.0,,,,,0.0,...,,,,,,0.0,,,,
Zoya,,,,,,,,,,,...,,,,,,,,,,


In [86]:
book_pivot.fillna(0,inplace = True)

In [87]:
book_pivot

title \ user_id,254,2276,2766,2977,3363,3757,4017,4385,6242,6251,...,274004,274061,274301,274308,274808,275970,277427,277478,277639,278418
1984,9.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1st to Die: A Novel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2nd Chance,0.0,10.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4 Blondes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
84 Charing Cross Road,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,10.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Year of Wonders,0.0,0.0,0.0,7.0,0.0,0.0,0.0,0.0,7.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
You Belong To Me,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Zen and the Art of Motorcycle Maintenance: An Inquiry into Values,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Zoya,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
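
One consequence of fillna(0) is worth seeing on a small case: after filling, a rating that was recorded as 0 and a book the user never touched look identical. The sketch below uses toy data and hypothetical names (demo, pivot, observed, filled). Whether 0 in this dataset means an implicit interaction, as it does in the Book-Crossing data these column names resemble, is an assumption the notebook does not confirm:

```python
import pandas as pd

# Toy ratings, including one recorded rating of 0 (user 2, title "A").
demo = pd.DataFrame({
    "user_id": [1, 2, 1, 3],
    "title": ["A", "A", "B", "B"],
    "rating": [9, 0, 7, 10],
})

# Same orientation as book_pivot: titles as rows, users as columns.
pivot = demo.pivot_table(index="title", columns="user_id", values="rating")

observed = pivot.notna()   # True where a rating was actually recorded
filled = pivot.fillna(0)   # missing cells become 0, like the recorded 0

print(int(observed.values.sum()), "recorded ratings")
print(int((filled.values == 0).sum()), "zero cells after fillna")
```

Four ratings were recorded, but three cells are zero after filling, and only one of those zeros came from the data.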


In [88]:
from scipy.sparse import csr_matrix
book_sparse = csr_matrix(book_pivot)

In [89]:
type(book_sparse)

In [90]:
from sklearn.neighbors import NearestNeighbors

In [91]:
# book_pivot.index holds titles (strings), so comparing it with the integer 237 matches nothing
np.where(book_pivot.index == 237)

(array([], dtype=int64),)

In [92]:
model = NearestNeighbors(metric='cosine', algorithm='brute', n_jobs=1)
model.fit(book_sparse)

query = book_sparse[237].reshape(1, -1)

distances, suggestions = model.kneighbors(query, n_neighbors=6)

In [93]:
distances

array([[0.        , 0.54496057, 0.55967844, 0.58315122, 0.70913787,
        0.77524942]])
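
With metric='cosine', the distances above are cosine distances: 1 minus the cosine of the angle between two books' rating vectors. A distance of 0 means the vectors point the same way (the query book itself), and values near 1 mean the books share almost no raters. A small NumPy sketch of the formula, using made-up rating vectors and a hypothetical helper name:

```python
import numpy as np

def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Each vector holds one book's ratings from four users (made-up values).
book_a = [9, 0, 7, 0]
book_b = [8, 0, 6, 0]   # rated by the same users as book_a
book_c = [0, 10, 0, 5]  # rated by entirely different users

print(cosine_distance(book_a, book_a))  # the book against itself
print(cosine_distance(book_a, book_b))  # same raters, similar scores
print(cosine_distance(book_a, book_c))  # no raters in common
```

Because zero-filled cells contribute nothing to the dot product, the distance is driven by the users two books have in common.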

In [94]:
suggestions

array([[237, 233, 236, 234, 235, 286]])

In [95]:
for i in range(len(suggestions)):
    print(book_pivot.index[suggestions[i]])

Index(['Harry Potter and the Sorcerer's Stone (Book 1)',
       'Harry Potter and the Chamber of Secrets (Book 2)',
       'Harry Potter and the Prisoner of Azkaban (Book 3)',
       'Harry Potter and the Goblet of Fire (Book 4)',
       'Harry Potter and the Order of the Phoenix (Book 5)',
       'Jacob Have I Loved'],
      dtype='object', name='title')


In [96]:
np.where(book_pivot.index == 'Animal Farm')[0][0]

np.int64(52)

In [97]:
def recommend_book(book_index, model, data, n_neighbors=6):
    query = data[book_index].reshape(1, -1)
    distances, suggestions = model.kneighbors(query, n_neighbors=n_neighbors)
    return suggestions[0]
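
recommend_book returns row indices and includes the query book itself as the first hit. A possible convenience wrapper, sketched here under the hypothetical name recommend_titles, looks a title up, asks for one extra neighbor, and drops the query from the result. The toy matrix and titles are made up for illustration:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def recommend_titles(title, titles, model, data, n_recommendations=5):
    """Return up to n_recommendations titles nearest to `title`, excluding itself."""
    matches = np.where(titles == title)[0]
    if len(matches) == 0:
        raise KeyError(f"{title!r} is not in the title index")
    book_index = matches[0]
    query = data[book_index].reshape(1, -1)
    # Ask for one extra neighbor because the query book matches itself.
    n_neighbors = min(n_recommendations + 1, data.shape[0])
    _, suggestions = model.kneighbors(query, n_neighbors=n_neighbors)
    picks = [str(titles[i]) for i in suggestions[0] if i != book_index]
    return picks[:n_recommendations]

# Toy ratings matrix: rows are books, columns are users.
titles = np.array(["A", "B", "C", "D"])
data = np.array([
    [9.0, 0.0, 7.0, 0.0],
    [8.0, 0.0, 6.0, 0.0],
    [0.0, 10.0, 0.0, 5.0],
    [1.0, 9.0, 0.0, 6.0],
])
model = NearestNeighbors(metric="cosine", algorithm="brute").fit(data)

print(recommend_titles("A", titles, model, data, n_recommendations=2))
```

In the notebook, the same call would take book_pivot.index as titles and book_sparse as data. Raising KeyError for an unknown title is one design choice; returning an empty list would be another.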

In [98]:
book_index = np.where(book_pivot.index == 'Animal Farm')[0][0]
recommend_book(book_index, model, book_sparse)

array([ 52,   0, 331,  50,  88, 514])

In [99]:
book_index = np.where(book_pivot.index == 'Animal Farm')[0][0]
suggested_book_indices = recommend_book(book_index, model, book_sparse)

print("Recommended books for 'Animal Farm':")
for idx in suggested_book_indices:
    print(book_pivot.index[idx])

Recommended books for 'Animal Farm':
Animal Farm
1984
Midnight
Angus, Thongs and Full-Frontal Snogging: Confessions of Georgia Nicolson
Brave New World
The Catcher in the Rye


In [100]:
book_index = np.where(book_pivot.index == 'High Fidelity')[0][0]
suggested_book_indices = recommend_book(book_index, model, book_sparse)

print("Recommended books for 'High Fidelity':")
for idx in suggested_book_indices:
    print(book_pivot.index[idx])

Recommended books for 'High Fidelity':
High Fidelity
About a Boy
How to Be Good
Five Quarters of the Orange
Notes from a Small Island
White Teeth: A Novel


In [101]:
book_index = np.where(book_pivot.index == '1984')[0][0]
suggested_book_indices = recommend_book(book_index, model, book_sparse)

print("Recommended books for '1984':")
for idx in suggested_book_indices:
    print(book_pivot.index[idx])

Recommended books for '1984':
1984
Animal Farm
The Catcher in the Rye
Lord of the Flies
The Handmaid's Tale
Slaughterhouse Five or the Children's Crusade: A Duty Dance With Death


In [102]:
import pickle

In [103]:
from scipy.sparse import save_npz
save_npz("book_sparse.npz", book_sparse)

In [104]:
book_pivot.to_pickle("book_pivot1.pkl")

In [105]:
from google.colab import files

files.download("book_sparse.npz")
files.download("book_pivot1.pkl")
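
The two saved files hold the sparse matrix and the pivot table (whose index supplies the titles); the fitted NearestNeighbors model is not saved, so a consumer would presumably refit it after loading. A round-trip sketch with toy stand-ins, written to a temporary directory rather than the notebook's working directory:

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix, load_npz, save_npz

# Toy stand-ins for book_pivot and book_sparse.
pivot_demo = pd.DataFrame(
    [[9.0, 0.0], [0.0, 10.0]],
    index=pd.Index(["1984", "Animal Farm"], name="title"),
    columns=[254, 2276],
)
sparse_demo = csr_matrix(pivot_demo.values)

with tempfile.TemporaryDirectory() as tmp:
    npz_path = os.path.join(tmp, "book_sparse.npz")
    pkl_path = os.path.join(tmp, "book_pivot1.pkl")

    # Save exactly as the notebook does.
    save_npz(npz_path, sparse_demo)
    pivot_demo.to_pickle(pkl_path)

    # Load both artifacts back.
    loaded_sparse = load_npz(npz_path)
    loaded_pivot = pd.read_pickle(pkl_path)

same_values = np.array_equal(loaded_sparse.toarray(), pivot_demo.values)
same_titles = list(loaded_pivot.index) == list(pivot_demo.index)
print(same_values, same_titles)
```

Pickle files should only be loaded from a trusted source, since unpickling can execute arbitrary code.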
