# **Book Recommendation System Using Machine Learning**

This project builds a book recommendation system in Python using machine learning techniques.
It uses **feature extraction** (TF-IDF) to turn each book's attributes into a numeric vector and applies **cosine similarity to measure the similarity between books based on their features.** Given a book the user likes, the system recommends the books whose features are most similar to it.
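
As a rough illustration of the approach, the sketch below runs the same two steps (TF-IDF vectors, then cosine similarity) on three made-up strings. The toy strings and the helper name `most_similar` are assumptions for illustration only and do not appear in the notebook.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Three made-up "books", each described by a short text string
toy_books = [
    "Romance love story",
    "Romance love letters",
    "Photography textbook",
]

# Step 1: turn every string into a TF-IDF feature vector
toy_vectors = TfidfVectorizer().fit_transform(toy_books)

# Step 2: compare every vector with every other vector
toy_similarity = cosine_similarity(toy_vectors)


def most_similar(book_index, similarity_matrix, top_n=2):
    """Return the indices of the top_n books most similar to book_index."""
    order = np.argsort(-similarity_matrix[book_index])  # highest score first
    return [int(i) for i in order if i != book_index][:top_n]


print(most_similar(0, toy_similarity))  # → [1, 2]
```

Book 1 shares two words with book 0 and book 2 shares none, so book 1 is ranked first.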

## **Overview of the dataset**

The dataset consists of 6237 entries with the following 10 features:

*   Title: The name of the book.
*   Author: The writers associated with the book.
*   Edition: Publication dates and book formats.
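*   Reviews: The average star rating, stored as text (e.g., "4.0 out of 5 stars").
*   Ratings: The number of customer reviews, stored as text (e.g., "8 customer reviews").
*   Synopsis: A brief summary of the book.
*   Genre: The genre of the book (e.g., Action & Adventure, Romance, Contemporary Fiction).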
*   BookCategory: Specific categories the book might fall into.
*   Price: The cost associated with purchasing the book.
*   Index: A unique identifier for each book entry.

In [1]:
# importing the packages
import numpy as np
import pandas as pd
import difflib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

## 1. Given data

In [2]:
# loading the data from the CSV file into a pandas DataFrame
book_data = pd.read_csv('/content/Book.csv')

# adding a positional 'index' column (0 .. number of rows - 1) used for later lookups
book_data['index'] = range(len(book_data))

In [3]:
# printing the first 5 rows of the dataframe
book_data.head()

|   | Title | Author | Edition | Reviews | Ratings | Synopsis | Genre | BookCategory | Price | index |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | The Prisoner's Gold (The Hunters 3) | Chris Kuzneski | Paperback,– 10 Mar 2016 | 4.0 out of 5 stars | 8 customer reviews | THE HUNTERS return in their third brilliant no... | Action & Adventure (Books) | Action & Adventure | 220.0 | 0 |
| 1 | Guru Dutt: A Tragedy in Three Acts | Arun Khopkar | Paperback,– 7 Nov 2012 | 3.9 out of 5 stars | 14 customer reviews | A layered portrait of a troubled genius for wh... | Cinema & Broadcast (Books) | Biographies, Diaries & True Accounts | 202.93 | 1 |
| 2 | Leviathan (Penguin Classics) | Thomas Hobbes | Paperback,– 25 Feb 1982 | 4.8 out of 5 stars | 6 customer reviews | "During the time men live without a common Pow... | International Relations | Humour | 299.0 | 2 |
| 3 | A Pocket Full of Rye (Miss Marple) | Agatha Christie | Paperback,– 5 Oct 2017 | 4.1 out of 5 stars | 13 customer reviews | A handful of grain is found in the pocket of a... | Contemporary Fiction (Books) | Crime, Thriller & Mystery | 180.0 | 3 |
| 4 | LIFE 70 Years of Extraordinary Photography | Editors of Life | Hardcover,– 10 Oct 2006 | 5.0 out of 5 stars | 1 customer review | For seven decades, "Life" has been thrilling t... | Photography Textbooks | Arts, Film & Photography | 965.62 | 4 |


In [4]:
# number of rows and columns in the data frame

book_data.shape

(6237, 10)

## 2. Data preprocessing

In [5]:
# selecting the relevant features for recommendation

selected_features = ['Genre','BookCategory','Reviews','Ratings','Author','Title']
print(selected_features)

['Genre', 'BookCategory', 'Reviews', 'Ratings', 'Author', 'Title']


In [6]:
# combining five of the selected features into one text string per book
# ('Ratings' is listed above but is not part of the combined text)

combined_features = book_data['Genre']+' '+book_data['BookCategory']+' '+book_data['Reviews']+' '+book_data['Author']+' '+book_data['Title']

In [7]:
print(combined_features)

0       Action & Adventure (Books) Action & Adventure ...
1       Cinema & Broadcast (Books) Biographies, Diarie...
2       International Relations Humour 4.8 out of 5 st...
3       Contemporary Fiction (Books) Crime, Thriller &...
4       Photography Textbooks Arts, Film & Photography...
                              ...                        
6232    Anthropology (Books) Humour 5.0 out of 5 stars...
6233    Contemporary Fiction (Books) Crime, Thriller &...
6234    Romance (Books) Romance 3.8 out of 5 stars Jul...
6235    Action & Adventure (Books) Action & Adventure ...
6236    Action & Adventure (Books) Action & Adventure ...
Length: 6237, dtype: object
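
String concatenation in pandas propagates missing values: if any of the combined columns were empty for a row, the whole combined string for that row would be `NaN`, and `TfidfVectorizer` would reject it. The notebook does not appear to hit this problem, but a defensive version of the combining step could look like the sketch below. The helper name `combine_text_columns` and the toy frame are illustrative assumptions, not part of the notebook.

```python
import pandas as pd


def combine_text_columns(frame, columns):
    """Join the given columns into one string per row, treating missing values as ''."""
    combined = frame[columns[0]].fillna('').astype(str)
    for name in columns[1:]:
        combined = combined + ' ' + frame[name].fillna('').astype(str)
    return combined


# Toy frame with one missing author (assumed data, for illustration only)
toy_frame = pd.DataFrame({
    'Genre': ['International Relations', 'Romance (Books)'],
    'Author': ['Thomas Hobbes', None],
    'Title': ['Leviathan (Penguin Classics)', 'The Gift'],
})

toy_combined = combine_text_columns(toy_frame, ['Genre', 'Author', 'Title'])
print(toy_combined.tolist())
```

Applied to the notebook's data, the call would be `combine_text_columns(book_data, ['Genre', 'BookCategory', 'Reviews', 'Author', 'Title'])`.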


In [8]:
# converting the text data to feature vectors

vectorizer = TfidfVectorizer()

feature_vectors = vectorizer.fit_transform(combined_features)

print(feature_vectors)

  (0, 320)	0.2876182221875535
  (0, 370)	0.28612627654486994
  (0, 1454)	0.06212594444161257
  (0, 7682)	0.051197695545433894
  (0, 7576)	0.051197695545433894
  (0, 10005)	0.051197695545433894
  (0, 2073)	0.32798320899324057
  (0, 5937)	0.4161848887602325
  (0, 10554)	0.1866362531867486
  (0, 8299)	0.42760932436334437
  (0, 4261)	0.3599384712825303
  (0, 4954)	0.44233798352261017
  (1, 1454)	0.051234244716792006
  (1, 7682)	0.04222189756770298
  (1, 7576)	0.04222189756770298
  (1, 10005)	0.04222189756770298
  (1, 2098)	0.22511430759239295
  (1, 1600)	0.2266879766057271
  (1, 1342)	0.13055113674715593
  (1, 2913)	0.14024774310840107
  (1, 10844)	0.13770429260523812
  (1, 301)	0.1394955922491458
  (1, 810)	0.31840336522248014
  (1, 5748)	0.3819083670030653
  (1, 4450)	0.33552287148430066
  :	:
  (6234, 1574)	0.4234691756702787
  (6235, 320)	0.27408410482428236
  (6235, 370)	0.27266236393870397
  (6235, 1454)	0.05920255587123001
  (6235, 7682)	0.04878854491870802
  (6235, 7576)	0.04878854
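
Each printed line has the form `(row, column) weight`: the row is the book, the column is a term's position in the vocabulary, and the weight is that term's TF-IDF score in that book. To make the weights less opaque, here is a minimal pure-Python sketch of the default `TfidfVectorizer` behaviour (lowercasing, tokens of two or more word characters, smoothed IDF, L2 normalisation). The function names are hypothetical, and the two documents are rebuilt from rows 0 and 3 of the table shown earlier, so the numbers differ from the notebook's, which are computed over all 6237 books.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep tokens of two or more word characters
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_rows(documents):
    """Return one {term: weight} dict per document, L2-normalised."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Raw term count times smoothed inverse document frequency
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({t: w / norm for t, w in weights.items()} if norm else {})
    return rows


toy_documents = [
    "Action & Adventure (Books) Action & Adventure 4.0 out of 5 stars "
    "Chris Kuzneski The Prisoner's Gold (The Hunters 3)",
    "Contemporary Fiction (Books) Crime, Thriller & Mystery 4.1 out of 5 stars "
    "Agatha Christie A Pocket Full of Rye (Miss Marple)",
]

rows = tfidf_rows(toy_documents)
print(len(rows[0]))  # → 12 distinct terms, the same count as the entries printed for row 0
for term, weight in sorted(rows[0].items(), key=lambda item: -item[1]):
    print(term, round(weight, 3))
```

Terms that occur in every document (such as `stars`) receive the lowest weights, while repeated or rare terms (such as `action` or `kuzneski`) score higher.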

In [9]:
# getting the similarity scores using cosine similarity

similarity = cosine_similarity(feature_vectors)

print(similarity)

print(similarity.shape)

[[1.         0.00966797 0.00920955 ... 0.02789677 0.16801905 0.18293038]
 [0.00966797 1.         0.00759497 ... 0.03495241 0.00921303 0.00912891]
 [0.00920955 0.00759497 1.         ... 0.0088167  0.00877619 0.00869606]
 ...
 [0.02789677 0.03495241 0.0088167  ... 1.         0.01069505 0.01846937]
 [0.16801905 0.00921303 0.00877619 ... 0.01069505 1.         0.15865084]
 [0.18293038 0.00912891 0.00869606 ... 0.01846937 0.15865084 1.        ]]
(6237, 6237)
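
The matrix above has two properties that follow directly from the definition of cosine similarity: every diagonal entry is 1 (each book compared with itself) and the matrix is symmetric. The sketch below computes the same measure by hand for three small made-up vectors; the function names are hypothetical and the vectors are not taken from the dataset.

```python
import math


def cosine_similarity_pair(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention used here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarity of every vector with every other vector."""
    return [[cosine_similarity_pair(u, v) for v in vectors] for u in vectors]


toy_vectors = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 2],
]

toy_matrix = similarity_matrix(toy_vectors)
for row in toy_matrix:
    print([round(value, 3) for value in row])
```

For the full dataset, a dense 6237 x 6237 matrix of 64-bit floats occupies roughly 311 MB, which is worth keeping in mind if the dataset grows.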


## **Getting the book name from the user and recommending similar books**
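
The cell below relies on two small pieces of logic: a fuzzy title lookup with `difflib.get_close_matches`, and a filter that keeps only books priced within 20% of the chosen book. The sketch isolates both so they can be tried on their own. The helper names `find_title` and `within_price_band` and the toy title list are assumptions for illustration. The reference price of 206.0 assumes that the book matched in the sample run is the first one listed in its output.

```python
import difflib


def find_title(query, titles, cutoff=0.6):
    """Return the closest title to `query`, or None if nothing is close enough."""
    matches = difflib.get_close_matches(query, titles, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def within_price_band(candidate_price, reference_price, tolerance=0.2):
    """True when candidate_price is within `tolerance` (20% by default) of reference_price."""
    return abs(candidate_price - reference_price) <= tolerance * reference_price


toy_titles = ["Leviathan (Penguin Classics)", "Homeport", "The Gift"]

print(find_title("Homport", toy_titles))  # → Homeport
print(find_title("zzzz", toy_titles))     # → None

print(within_price_band(247.0, 206.0))    # → True  (difference 41.0, limit 41.2)
print(within_price_band(250.0, 206.0))    # → False (difference 44.0, limit 41.2)
```

Returning `None` when nothing matches avoids the `IndexError` that indexing an empty result list would raise.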

In [10]:
# getting the book name from the user

book_name = input(' Enter your favourite book name : ')

# creating a list with all the book names given in the dataset

list_of_all_titles = book_data['Title'].tolist()


# finding the close matches for the book name given by the user

find_close_match = difflib.get_close_matches(book_name, list_of_all_titles)

# stopping early with a clear message when no title is close enough

if not find_close_match:
    raise ValueError('No close match found for: ' + book_name)

close_match = find_close_match[0]


# finding the index and price of the book with the matched title

index_of_the_book = book_data[book_data.Title == close_match]['index'].values[0]
input_book_price = book_data[book_data.Title == close_match]['Price'].values[0]



# getting a list of (book index, similarity score) pairs for the matched book

similarity_score = list(enumerate(similarity[index_of_the_book]))

# sorting the books by similarity score, most similar first

sorted_similar_book = sorted(similarity_score, key=lambda x: x[1], reverse=True)


# printing up to 29 similar books priced within 20% of the input book's price

print('Books suggested for you : \n')

i = 1

for book in sorted_similar_book:
    index = book[0]
    # looking up the title and price through the 'index' column added after loading
    title_from_index = book_data[book_data['index'] == index][['Title', 'Price']].values[0]
    price_difference = abs(title_from_index[1] - input_book_price)
    if i < 30 and price_difference <= 0.2 * input_book_price:  # Check if within 20% price range
        print(i, '.', title_from_index[0], "- $", title_from_index[1])
        i += 1

 Enter your favourite book name : I love you
Books suggested for you : 

1 . P.S. I Love You - $ 206.0
2 . The Gift - $ 232.0
3 . Lyrebird - $ 207.0
4 . Thanks for the Memories - $ 199.0
5 . One Hundred Names - $ 229.0
6 . The Last Love Letter - $ 247.0
7 . Does Love Ever End?: An Inspirational Love Story - $ 220.0
8 . Different Shades of Love: Some Relations Are Beyond Love - $ 185.0
9 . Love . . .Not for Sale! - $ 175.0
10 . Bared to You: Crossfire, Book - $ 194.0
11 . Reflected in You: Crossfire Book - $ 229.0
12 . Someone Exactly Like You - $ 175.0
13 . The Collector - $ 238.0
14 . You Belong To Me (The Baltimore Series Book 1) - $ 170.0
15 . The Affair - $ 183.0
16 . Portrait in Death - $ 224.0
17 . Homeport - $ 224.0
18 . Black Hills - $ 233.0
19 . Sunset in Central Park (From Manhattan with Love) - $ 222.0
20 . Fairytale - $ 209.0
21 . Confessions of a Shopaholic - $ 247.0
22 . See Me - $ 188.0
23 . The Stars Shine Down - $ 180.0
24 . The Lucky One - $ 205.0
25 . Secret Vampire