# **Description:**

A collaborative-filtering book recommendation system suggests books to a user based on the preferences and behavior of other users with similar tastes. Recommendations are derived from the collective rating data of all users rather than from the content of the books themselves.

The key idea is that if two users have similar reading histories, or have rated the same books similarly, each is likely to enjoy other books that the other has rated highly. The system uses this similarity between users or between items (books) to generate recommendations.
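As a rough illustration of "similar tastes", the sketch below compares users by the cosine similarity of their rating vectors. The toy matrix and the `cosine_similarity` helper are hypothetical and are not part of this notebook's pipeline, which uses nearest neighbors on a book-by-user matrix instead.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are books, 0 means "not rated".
toy_ratings = np.array([
    [5, 4, 0, 0],   # user A
    [4, 5, 0, 1],   # user B, tastes close to A
    [0, 0, 5, 4],   # user C, no overlap with A
], dtype=float)

def cosine_similarity(u, v):
    """Cosine of the angle between two rating vectors (illustrative helper)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_ab = cosine_similarity(toy_ratings[0], toy_ratings[1])
sim_ac = cosine_similarity(toy_ratings[0], toy_ratings[2])
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

Users A and B score close to 1 because they rated the same books similarly, while A and C score 0 because they share no rated books.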


# 1. Import Libraries

In [66]:
import pandas as pd
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# 2. Import Dataset

This dataset consists of 3 files containing book IDs, user IDs, book titles, book ratings, authors, years of publication, publishers, and so on.

Reference:
https://www.kaggle.com/datasets/ra4u12/bookrecommendation?select=BX-Users.csv

In [67]:
books=pd.read_csv("BX-Books.csv",sep=';',encoding='latin-1',on_bad_lines='skip')
print(books.shape)
books.head()

(271360, 8)



index,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
0,0195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...
1,0002005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...
2,0060973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...
3,0374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...
4,0393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton &amp; Company,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...


Let's keep only a few columns: ISBN, Book-Title, Book-Author, Year-Of-Publication, Publisher, and Image-URL-L.

# 3. Data Preprocessing

In [68]:
# Keep the columns of interest and give them shorter names.
# Chaining rename() avoids renaming in place on a slice of the original frame.
books=books[['ISBN','Book-Title','Book-Author','Year-Of-Publication','Publisher','Image-URL-L']].rename(
    columns={"Book-Title":'title',
             "Book-Author":'author',
             "Year-Of-Publication":"year",
             'Image-URL-L':'image-url'})
books.head()

index,ISBN,title,author,year,Publisher,image-url
0,0195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press,http://images.amazon.com/images/P/0195153448.0...
1,0002005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada,http://images.amazon.com/images/P/0002005018.0...
2,0060973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial,http://images.amazon.com/images/P/0060973129.0...
3,0374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux,http://images.amazon.com/images/P/0374157065.0...
4,0393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton &amp; Company,http://images.amazon.com/images/P/0393045218.0...


In [69]:
users=pd.read_csv("BX-Users.csv",sep=';',on_bad_lines='skip',encoding='latin-1')
print(users.shape)
users.head()

(278858, 3)


index,User-ID,Location,Age
0,1,"nyc, new york, usa",
1,2,"stockton, california, usa",18.0
2,3,"moscow, yukon territory, russia",
3,4,"porto, v.n.gaia, portugal",17.0
4,5,"farnborough, hants, united kingdom",


In [70]:
ratings=pd.read_csv("BX-Book-Ratings.csv",sep=';',on_bad_lines='skip',encoding='latin-1')
print(ratings.shape)
ratings.head()

(1149780, 3)


index,User-ID,ISBN,Book-Rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


In [71]:
ratings.rename(columns={'User-ID':'user-id',"Book-Rating":'rating'},inplace=True)
ratings['user-id'].unique().shape
ratings.head()

index,user-id,ISBN,rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


Let's keep only the users who have given more than 200 ratings.

In [72]:
x=ratings['user-id'].value_counts()>200
print(x[x].shape)
y=x[x].index
y

(899,)


Index([ 11676, 198711, 153662,  98391,  35859, 212898, 278418,  76352, 110973,
       235105,
       ...
       260183,  73681,  44296, 155916,   9856, 274808,  28634,  59727, 268622,
       188951],
      dtype='int64', name='user-id', length=899)

In [73]:
ratings=ratings[ratings['user-id'].isin(y)]
print(ratings.shape)
ratings.head()

(526356, 3)


index,user-id,ISBN,rating
1456,277427,002542730X,10
1457,277427,0026217457,0
1458,277427,003008685X,8
1459,277427,0030615321,0
1460,277427,0060002050,0
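The user-activity filter above can be sketched on toy data as follows. The frame `toy` and the threshold `min_ratings` are hypothetical stand-ins (the notebook uses 200), chosen only so the effect is visible on six rows.

```python
import pandas as pd

# Hypothetical toy ratings: user 1 rated 3 books, user 3 rated 2, user 2 rated 1.
toy = pd.DataFrame({
    "user-id": [1, 1, 1, 2, 3, 3],
    "ISBN": ["a", "b", "c", "a", "b", "c"],
    "rating": [5, 0, 8, 7, 0, 9],
})
min_ratings = 2  # illustrative threshold; the notebook uses 200

# Count ratings per user, then keep the ids that exceed the threshold.
counts = toy["user-id"].value_counts()
active_users = counts[counts > min_ratings].index

# Keep only the rows that belong to those active users.
toy_active = toy[toy["user-id"].isin(active_users)]
print(toy_active)
```

Naming the intermediate results (`counts`, `active_users`) instead of `x` and `y` makes it easier to see that the filter applies to users, not to books.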


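Merge the ratings dataset with the books dataset on the ISBN column. The default inner join drops ratings whose ISBN has no match in the books table, so the row count falls from 526,356 to 487,671.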

In [74]:
ratings_with_books=ratings.merge(books,on='ISBN')
print(ratings_with_books.shape)
ratings_with_books.head()

(487671, 8)


index,user-id,ISBN,rating,title,author,year,Publisher,image-url
0,277427,002542730X,10,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...
1,3363,002542730X,0,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...
2,11676,002542730X,6,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...
3,12538,002542730X,10,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...
4,13552,002542730X,0,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...
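To see which ratings fail to match during a merge like the one above, one option is a left join with pandas' `indicator=True` flag. The sketch below uses hypothetical toy frames (`toy_ratings`, `toy_books`) rather than the real data.

```python
import pandas as pd

# Hypothetical toy frames: ISBN "999" has a rating but no entry in the books table.
toy_ratings = pd.DataFrame({"ISBN": ["111", "222", "999"], "rating": [8, 0, 5]})
toy_books = pd.DataFrame({"ISBN": ["111", "222"], "title": ["Book One", "Book Two"]})

# Default merge is an inner join: only ISBNs present in both frames survive.
inner = toy_ratings.merge(toy_books, on="ISBN")

# A left join with indicator=True adds a "_merge" column that marks unmatched rows.
checked = toy_ratings.merge(toy_books, on="ISBN", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(len(inner), "matched rows;", len(unmatched), "unmatched row(s)")
```

Inspecting `unmatched` on the real data would show whether the dropped rows come from ISBNs missing in the books file or from lines skipped by `on_bad_lines='skip'`.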


Let's count how many ratings each title has received, so that the count can be attached to every row of the merged dataset.

In [75]:
num_ratings=ratings_with_books.groupby('title')['rating'].count().reset_index()
num_ratings.rename(columns={'rating':'no.of.rating'},inplace=True)
num_ratings.head()

index,title,no.of.rating
0,A Light in the Storm: The Civil War Diary of ...,2
1,Always Have Popsicles,1
2,Apple Magic (The Collector's series),1
3,Beyond IBM: Leadership Marketing and Finance ...,1
4,Clifford Visita El Hospital (Clifford El Gran...,1


In [76]:
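final_ratings=ratings_with_books.merge(num_ratings,on='title')
# The result of this filter is not assigned back, so final_ratings still contains every title.
final_ratings[final_ratings['no.of.rating']>=50]
# Keep one rating per (title, user) pair, then pivot to a title-by-user matrix.
final_ratings.drop_duplicates(['title','user-id'],inplace=True)
book_pivot=final_ratings.pivot_table(columns='user-id',index='title',values='rating')
# Pairs with no rating become 0.
book_pivot.fillna(0,inplace=True)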
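Boolean indexing returns a new DataFrame rather than modifying the original, so a minimum-rating-count filter only takes effect when its result is assigned. The sketch below shows the difference on hypothetical toy data with an illustrative threshold `min_count`. Applying the same assignment to the real data would change the pivot table and every output that follows.

```python
import pandas as pd

# Hypothetical toy data: "Popular" has 3 ratings, "Obscure" has 1.
toy = pd.DataFrame({
    "title": ["Popular", "Popular", "Popular", "Obscure"],
    "user-id": [1, 2, 3, 1],
    "rating": [8, 0, 10, 5],
})
counts = toy.groupby("title")["rating"].count().reset_index()
counts = counts.rename(columns={"rating": "no.of.rating"})
merged = toy.merge(counts, on="title")

min_count = 2  # illustrative threshold; the notebook's expression uses 50

# Unassigned: the filtered frame is discarded and merged keeps all 4 rows.
merged[merged["no.of.rating"] >= min_count]

# Assigned: only rows for sufficiently rated titles remain.
filtered = merged[merged["no.of.rating"] >= min_count]
pivot = filtered.pivot_table(index="title", columns="user-id", values="rating").fillna(0)
print(pivot)
```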


In [77]:
book_pivot.sample(10)

title,254,2276,2766,2977,3363,3757,4017,4385,6242,6251,...,274004,274061,274301,274308,274808,275970,277427,277478,277639,278418
Hot Shot,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Outrageous!: The Fine Life and Flagrant Good Times of Basketball's Irresistible Force,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
High Contrast,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
The Dragon In Lyonesse (Dragon),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Abra Cadaver (Jake Merlin Mysteries),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
How to Practice : The Way to a Meaningful Life,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Midnight Ghost (Spine Chillers),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Bunnies on Their Own (Pied Piper Paperbacks),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
He Followed Me Home (Family Circle),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Pokemon Sticker Series 2 (Sticker Activity),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


# 4. Build Model
We ask for 6 neighbors per book because the queried book is normally its own nearest neighbor, which leaves up to 5 other books to recommend.

In [78]:
book_sparse=csr_matrix(book_pivot)
# Brute-force search with the default (Euclidean) distance.
model=NearestNeighbors(algorithm='brute')
model.fit(book_sparse)
# Example query: the 6 rows nearest to the book at position 237.
distance,suggestion=model.kneighbors(book_pivot.iloc[237,:].values.reshape(1,-1),n_neighbors=6)
distance
suggestion
book_pivot.index[3]
book_names=book_pivot.index
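The shape of what `kneighbors` returns is easier to see on a small matrix. The sketch below fits the same kind of model on a hypothetical 4-book pivot table (`toy_pivot`) and queries one row. The data is made up for illustration.

```python
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy pivot: rows are books, columns are user ids, 0 means "not rated".
toy_pivot = pd.DataFrame(
    [[8, 0, 9, 0],
     [7, 0, 10, 0],
     [0, 6, 0, 8],
     [0, 5, 0, 9]],
    index=["Book A", "Book B", "Book C", "Book D"],
    columns=[101, 102, 103, 104],
    dtype=float,
)

toy_model = NearestNeighbors(algorithm="brute")
toy_model.fit(csr_matrix(toy_pivot.values))

# Query with one row reshaped to (1, n_users); both results have shape (1, n_neighbors).
query = toy_pivot.loc["Book A"].values.reshape(1, -1)
dist, idx = toy_model.kneighbors(query, n_neighbors=2)
neighbours = list(toy_pivot.index[idx[0]])
print(neighbours, dist[0])
```

The first neighbor is the queried book itself at distance 0, which is why one extra neighbor is requested beyond the number of recommendations wanted.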

# 5. Build Recommendation System

In [79]:
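def recommend_book(book_name):
    # Row position of the requested title in the pivot table.
    book_id=np.where(book_pivot.index==book_name)[0][0]
    # The 6 nearest rows, which normally include the book itself.
    distance,suggestion=model.kneighbors(book_pivot.iloc[book_id].values.reshape(1,-1),n_neighbors=6)
    # suggestion has shape (1, 6); print the title at each returned position.
    for title in book_pivot.index[suggestion[0]]:
        print(title)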
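A possible variant, sketched below, returns the titles as a list, leaves out the queried book, and returns an empty list for an unknown title instead of raising an `IndexError`. The function `recommend_titles`, its parameters, and the toy data are hypothetical and are not part of the notebook.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

def recommend_titles(title, pivot, knn, n_recommendations=5):
    """Return up to n_recommendations nearest titles, excluding the query itself."""
    if title not in pivot.index:
        return []  # unknown title: nothing to recommend
    row = pivot.index.get_loc(title)
    query = pivot.iloc[row].values.reshape(1, -1)
    # Ask for one extra neighbor because the query is usually its own nearest row.
    k = min(n_recommendations + 1, len(pivot))
    _, idx = knn.kneighbors(query, n_neighbors=k)
    names = [pivot.index[i] for i in idx[0] if pivot.index[i] != title]
    return names[:n_recommendations]

# Hypothetical toy pivot: rows are books, columns are user ids.
toy_pivot = pd.DataFrame(
    [[8, 0, 9, 0], [7, 0, 10, 0], [0, 6, 0, 8], [0, 5, 0, 9]],
    index=["Book A", "Book B", "Book C", "Book D"],
    columns=[101, 102, 103, 104],
    dtype=float,
)
toy_knn = NearestNeighbors(algorithm="brute").fit(toy_pivot.values)

result = recommend_titles("Book C", toy_pivot, toy_knn, n_recommendations=2)
missing = recommend_titles("Not In Catalogue", toy_pivot, toy_knn)
print(result, missing)
```

Returning a list rather than printing makes the function easier to reuse, for example to display cover images from the `image-url` column.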

# 6. Recommendations

In [80]:
book_name='A Whisper of Blood'
recommend_book(book_name)

1001 Hints &amp; Tips for Your Garden : An Indispensable Guide to Easier and More Effective Gardening
A Gardener's Guide to Planters, Containers &amp; Raised Beds
A Whisper of Blood
Afterage
A Sponsorship Guide for 12-Step Programs
A Necessary Evil : A History of American Distrust of Government


In [81]:
book_name='The last days of MASH'
recommend_book(book_name)

A First Book of Jewelry Making,
Advanced Beadwork (Beadwork Books)
Age of Fable or Beauties of Mythology
Amelia Bedelia's Family Album
All Wired Up
A Christmas Sonata


In [82]:
book_name=r"The Rich Man's Table"
recommend_book(book_name)

A Woman of Salt
The Holy Innocents : Holy Innocents, The
The Frequency of Souls
Deep Valley Malice
Mad As the Dickens (Laura Fleming Mystery)
A Fashionable Murder
