<div style="text-align:center;">
    <span style="color:darkblue; font-size:36px;">Book Recommendation System</span>
</div>


<div style="text-align:center;">
    <img src="https://st.depositphotos.com/1741875/1237/i/450/depositphotos_12376816-stock-photo-stack-of-old-books.jpg" alt="Stack of old books" style="width:300px;height:300px;">
</div>

- **<span style="color:darkblue; font-size:24px;">About Dataset</span>**



<span style="color:darkblue; font-size:20px;">The Rise of Recommender Systems</span>

During the last few decades, with the emergence of platforms like YouTube, Amazon, Netflix, and many others, recommender systems have increasingly become integral to our lives. From e-commerce, where they suggest articles that may interest buyers, to online advertising, where they recommend content tailored to users' preferences, recommender systems now play an indispensable role in our daily online experiences.


**Dataset Contains**

- **Users:** Contains anonymized user IDs (User-ID) mapping to integers. Demographic data such as Location and Age is provided if available. Otherwise, these fields contain NULL-values.
    - **User-ID:** Anonymized user IDs mapping to integers.
    - **Location:** Location information of users.
    - **Age:** Age of users (if available).

- **Books:** Identified by their respective ISBNs. Invalid ISBNs have been removed. Content-based information includes Book-Title, Book-Author, Year-Of-Publication, and Publisher obtained from Amazon Web Services. Note that only the first author is provided for books with multiple authors. URLs linking to cover images are provided in small, medium, and large sizes (Image-URL-S, Image-URL-M, Image-URL-L), directing to the Amazon website.
    - **ISBN:** A unique identifier for each book.
    - **Book-Title:** The title of the book.
    - **Book-Author:** The author of the book (only the first author when a book has several).
    - **Year-Of-Publication:** The year the book was published.
    - **Publisher:** The publisher of the book.
    - **Image-URL-S:** URL linking to a small-sized cover image of the book.
    - **Image-URL-M:** URL linking to a medium-sized cover image of the book.
    - **Image-URL-L:** URL linking to a large-sized cover image of the book.
  
- **Ratings:** Contains book rating information.
    - **User-ID:** This column represents the unique identifier for each user.
    - **ISBN:** This column contains the ISBN (International Standard Book Number) of the books.
    - **Book-Rating:** This column provides the rating given by users for the corresponding books. Ratings are either explicit, on a scale from 1 to 10 with higher values indicating higher appreciation, or implicit, expressed as 0.
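
Because a rating of 0 records an implicit interaction rather than a score, it can help to separate the two kinds of rating before computing statistics. A minimal sketch on a toy frame (the rows are made up for illustration; only the column names come from the dataset):

```python
import pandas as pd

# Toy ratings frame; the rows are illustrative, not taken from Ratings.csv.
toy_ratings = pd.DataFrame({
    "User-ID": [1, 1, 2, 3],
    "ISBN": ["A", "B", "A", "C"],
    "Book-Rating": [0, 8, 10, 0],
})

# Explicit ratings are scores from 1 to 10; 0 marks an implicit interaction.
explicit = toy_ratings[toy_ratings["Book-Rating"] > 0]
implicit = toy_ratings[toy_ratings["Book-Rating"] == 0]

print(len(explicit), "explicit,", len(implicit), "implicit")
```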


- **<span style="color:darkblue; font-size:20px;">Import Libraries</span>**


In [1]:
#For data manipulation and analysis.
import pandas as pd

#For numerical operations.
import numpy as np

#For data visualization.
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')


- **<span style="color:darkblue; font-size:20px;">Reading Files</span>**


In [2]:
books = pd.read_csv("Books.csv")
users = pd.read_csv("Users.csv")
ratings = pd.read_csv("Ratings.csv")

- **<span style="color:darkblue; font-size:20px;">Insights into Datasets</span>**


In [3]:
#books dataset
books.head()

Unnamed: 0,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
0,0195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...
1,0002005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...
2,0060973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...
3,0374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...
4,0393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton &amp; Company,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...


In [4]:
books.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 271360 entries, 0 to 271359
Data columns (total 8 columns):
 #   Column               Non-Null Count   Dtype 
---  ------               --------------   ----- 
 0   ISBN                 271360 non-null  object
 1   Book-Title           271360 non-null  object
 2   Book-Author          271358 non-null  object
 3   Year-Of-Publication  271360 non-null  object
 4   Publisher            271358 non-null  object
 5   Image-URL-S          271360 non-null  object
 6   Image-URL-M          271360 non-null  object
 7   Image-URL-L          271357 non-null  object
dtypes: object(8)
memory usage: 16.6+ MB


<div style="font-size:16px; color:black;">
    The dataset comprises 271,360 entries distributed across 8 columns.
</div>


In [5]:
#users dataset
users.head()

Unnamed: 0,User-ID,Location,Age
0,1,"nyc, new york, usa",
1,2,"stockton, california, usa",18.0
2,3,"moscow, yukon territory, russia",
3,4,"porto, v.n.gaia, portugal",17.0
4,5,"farnborough, hants, united kingdom",


In [6]:
users.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 278858 entries, 0 to 278857
Data columns (total 3 columns):
 #   Column    Non-Null Count   Dtype  
---  ------    --------------   -----  
 0   User-ID   278858 non-null  int64  
 1   Location  278858 non-null  object 
 2   Age       168096 non-null  float64
dtypes: float64(1), int64(1), object(1)
memory usage: 6.4+ MB


<div style="font-size:16px; color:black;">
    The dataset has a total of 278,858 entries.
</div>


In [7]:
#ratings dataset
ratings.head()

Unnamed: 0,User-ID,ISBN,Book-Rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


In [8]:
ratings.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1149780 entries, 0 to 1149779
Data columns (total 3 columns):
 #   Column       Non-Null Count    Dtype 
---  ------       --------------    ----- 
 0   User-ID      1149780 non-null  int64 
 1   ISBN         1149780 non-null  object
 2   Book-Rating  1149780 non-null  int64 
dtypes: int64(2), object(1)
memory usage: 26.3+ MB


In [9]:
# DataFrame.size counts cells (rows x columns), not rows.
print("Size of the Books DataFrame: {:,} values".format(books.size))
print("Shape of the Books DataFrame: {} rows x {} columns".format(*books.shape))
print("Size of the Users DataFrame: {:,} values".format(users.size))
print("Shape of the Users DataFrame: {} rows x {} columns".format(*users.shape))
print("Size of the Ratings DataFrame: {:,} values".format(ratings.size))
print("Shape of the Ratings DataFrame: {} rows x {} columns".format(*ratings.shape))


Size of the Books DataFrame: 2,170,880 values
Shape of the Books DataFrame: 271360 rows x 8 columns
Size of the Users DataFrame: 836,574 values
Shape of the Users DataFrame: 278858 rows x 3 columns
Size of the Ratings DataFrame: 3,449,340 values
Shape of the Ratings DataFrame: 1149780 rows x 3 columns


- **<span style="color:darkblue; font-size:20px;">Data Preprocessing </span>**


<span style="color:darkblue; font-size:20px;">Removing Unnecessary Columns</span>

For a book recommendation system, the unnecessary columns are typically those holding auxiliary information (such as the cover-image URLs) and those that do not directly contribute to the recommendation process (such as Location and Age in the Users dataset, and Year-Of-Publication and Publisher in the Books dataset). Of the three image URLs, only Image-URL-M is kept.

In [10]:
#User Dataset
unnecessary_columns_users = ['Location', 'Age']
users = users.drop(columns=unnecessary_columns_users)

In [11]:
unnecessary_columns_books = ['Year-Of-Publication', 'Publisher', 'Image-URL-S','Image-URL-L']
books = books.drop(columns=unnecessary_columns_books)

In [12]:
books['Image-URL-M'] = books['Image-URL-M'].str.replace('http://', 'https://')

In [13]:
books.head()

Unnamed: 0,ISBN,Book-Title,Book-Author,Image-URL-M
0,0195153448,Classical Mythology,Mark P. O. Morford,https://images.amazon.com/images/P/0195153448....
1,0002005018,Clara Callan,Richard Bruce Wright,https://images.amazon.com/images/P/0002005018....
2,0060973129,Decision in Normandy,Carlo D'Este,https://images.amazon.com/images/P/0060973129....
3,0374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,https://images.amazon.com/images/P/0374157065....
4,0393045218,The Mummies of Urumchi,E. J. W. Barber,https://images.amazon.com/images/P/0393045218....


In [14]:
users.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 278858 entries, 0 to 278857
Data columns (total 1 columns):
 #   Column   Non-Null Count   Dtype
---  ------   --------------   -----
 0   User-ID  278858 non-null  int64
dtypes: int64(1)
memory usage: 2.1 MB


In [15]:
ratings.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1149780 entries, 0 to 1149779
Data columns (total 3 columns):
 #   Column       Non-Null Count    Dtype 
---  ------       --------------    ----- 
 0   User-ID      1149780 non-null  int64 
 1   ISBN         1149780 non-null  object
 2   Book-Rating  1149780 non-null  int64 
dtypes: int64(2), object(1)
memory usage: 26.3+ MB


- **<span style="color:darkblue; font-size:20px;">Conclusion </span>**


<b>The dataset holds 271,360 books and about 278,000 registered users, who have given roughly 1.15 million ratings (about 11.5 lakh). It is therefore large enough to serve as a reasonable basis for a recommender.</b>

In [16]:
#checking null values

print("Null values in Books dataset:\n", books.isnull().sum())
print("\n")
print("Null values in Users dataset:\n",users.isnull().sum())
print("\n")
print("Null values in Ratings dataset:\n", ratings.isnull().sum())


Null values in Books dataset:
 ISBN           0
Book-Title     0
Book-Author    2
Image-URL-M    0
dtype: int64


Null values in Users dataset:
 User-ID    0
dtype: int64


Null values in Ratings dataset:
 User-ID        0
ISBN           0
Book-Rating    0
dtype: int64


In [17]:
#Checking the duplicates

print("Duplicate values in Books dataset:\n", books.duplicated().sum())
print("\n")
print("Duplicate values in Users dataset:\n",users.duplicated().sum())
print("\n")
print("Duplicate values in Ratings dataset:\n", ratings.duplicated().sum())

Duplicate values in Books dataset:
 0


Duplicate values in Users dataset:
 0


Duplicate values in Ratings dataset:
 0


- **<span style="color:darkblue; font-size:20px;">Popularity Based Recommender System</span>**


<span style="color:darkblue; font-size:20px;">Merge ratings with books</span>

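We merge ratings with books on ISBN so that each rating row also carries the book's title, author, and cover URL. The merge is an inner join, so ratings whose ISBN does not appear in the Books dataset are dropped. A Book-Rating of 0 in the result is an implicit rating, not a missing one.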
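
To see how many ratings an inner join would discard, one option is a left join with `indicator=True`. The sketch below uses toy tables (the rows and the names `toy_ratings`, `toy_books` are illustrative assumptions, not part of the notebook):

```python
import pandas as pd

# Toy tables; values are illustrative only.
toy_ratings = pd.DataFrame({
    "User-ID": [1, 2, 3],
    "ISBN": ["A", "B", "Z"],   # "Z" has no entry in toy_books
    "Book-Rating": [5, 0, 9],
})
toy_books = pd.DataFrame({
    "ISBN": ["A", "B"],
    "Book-Title": ["Title A", "Title B"],
})

# The default merge is an inner join: the rating for ISBN "Z" is dropped.
inner = toy_ratings.merge(toy_books, on="ISBN")

# A left join with indicator=True shows which ratings found no book.
left = toy_ratings.merge(toy_books, on="ISBN", how="left", indicator=True)
unmatched = left[left["_merge"] == "left_only"]

print(len(inner), "matched,", len(unmatched), "unmatched")
```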

In [18]:
rating_on_books = ratings.merge(books, on='ISBN')
rating_on_books.head()

Unnamed: 0,User-ID,ISBN,Book-Rating,Book-Title,Book-Author,Image-URL-M
0,276725,034545104X,0,Flesh Tones: A Novel,M. J. Rose,https://images.amazon.com/images/P/034545104X....
1,2313,034545104X,5,Flesh Tones: A Novel,M. J. Rose,https://images.amazon.com/images/P/034545104X....
2,6543,034545104X,0,Flesh Tones: A Novel,M. J. Rose,https://images.amazon.com/images/P/034545104X....
3,8680,034545104X,5,Flesh Tones: A Novel,M. J. Rose,https://images.amazon.com/images/P/034545104X....
4,10314,034545104X,9,Flesh Tones: A Novel,M. J. Rose,https://images.amazon.com/images/P/034545104X....


In [19]:
number_of_Ratings=rating_on_books.groupby('Book-Title').count()['Book-Rating'].reset_index()
number_of_Ratings.rename(columns={'Book-Rating':'Number_of_ratings'},inplace=True)
number_of_Ratings

Unnamed: 0,Book-Title,Number_of_ratings
0,A Light in the Storm: The Civil War Diary of ...,4
1,Always Have Popsicles,1
2,Apple Magic (The Collector's series),1
3,"Ask Lily (Young Women of Faith: Lily Series, ...",1
4,Beyond IBM: Leadership Marketing and Finance ...,1
...,...,...
241066,Ã?Â?lpiraten.,2
241067,Ã?Â?rger mit Produkt X. Roman.,4
241068,Ã?Â?sterlich leben.,1
241069,Ã?Â?stlich der Berge.,3


In [20]:
Average_of_Ratings=rating_on_books.groupby('Book-Title')['Book-Rating'].mean().reset_index()
Average_of_Ratings.rename(columns={'Book-Rating':'average_of_ratings'},inplace=True)
Average_of_Ratings

Unnamed: 0,Book-Title,average_of_ratings
0,A Light in the Storm: The Civil War Diary of ...,2.250000
1,Always Have Popsicles,0.000000
2,Apple Magic (The Collector's series),0.000000
3,"Ask Lily (Young Women of Faith: Lily Series, ...",8.000000
4,Beyond IBM: Leadership Marketing and Finance ...,0.000000
...,...,...
241066,Ã?Â?lpiraten.,0.000000
241067,Ã?Â?rger mit Produkt X. Roman.,5.250000
241068,Ã?Â?sterlich leben.,7.000000
241069,Ã?Â?stlich der Berge.,2.666667


In [21]:
popular_df = number_of_Ratings.merge(Average_of_Ratings,on='Book-Title')

In [22]:
popular_df

Unnamed: 0,Book-Title,Number_of_ratings,average_of_ratings
0,A Light in the Storm: The Civil War Diary of ...,4,2.250000
1,Always Have Popsicles,1,0.000000
2,Apple Magic (The Collector's series),1,0.000000
3,"Ask Lily (Young Women of Faith: Lily Series, ...",1,8.000000
4,Beyond IBM: Leadership Marketing and Finance ...,1,0.000000
...,...,...,...
241066,Ã?Â?lpiraten.,2,0.000000
241067,Ã?Â?rger mit Produkt X. Roman.,4,5.250000
241068,Ã?Â?sterlich leben.,1,7.000000
241069,Ã?Â?stlich der Berge.,3,2.666667


In [24]:
popular_df = popular_df[popular_df['Number_of_ratings']>=250].sort_values('average_of_ratings',ascending=False).head(50)

In [25]:
#Top 50 Books 
popular_df

Unnamed: 0,Book-Title,Number_of_ratings,average_of_ratings
80434,Harry Potter and the Prisoner of Azkaban (Book 3),428,5.852804
80422,Harry Potter and the Goblet of Fire (Book 4),387,5.824289
80441,Harry Potter and the Sorcerer's Stone (Book 1),278,5.73741
80426,Harry Potter and the Order of the Phoenix (Boo...,347,5.501441
80414,Harry Potter and the Chamber of Secrets (Book 2),556,5.183453
191612,The Hobbit : The Enchanting Prelude to The Lor...,281,5.007117
187377,The Fellowship of the Ring (The Lord of the Ri...,368,4.94837
80445,Harry Potter and the Sorcerer's Stone (Harry P...,575,4.895652
211384,"The Two Towers (The Lord of the Rings, Part 2)",260,4.880769
219741,To Kill a Mockingbird,510,4.7


In [26]:
popular_df = popular_df.merge(books,on='Book-Title').drop_duplicates('Book-Title')

In [27]:
popular_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 50 entries, 0 to 195
Data columns (total 6 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Book-Title          50 non-null     object 
 1   Number_of_ratings   50 non-null     int64  
 2   average_of_ratings  50 non-null     float64
 3   ISBN                50 non-null     object 
 4   Book-Author         50 non-null     object 
 5   Image-URL-M         50 non-null     object 
dtypes: float64(1), int64(1), object(4)
memory usage: 2.7+ KB


- **<span style="color:darkblue; font-size:20px;">Collaborative Filtering Based Recommender System</span>**


<span style="color:darkblue; font-size:20px;">Extract users with more than 200 ratings</span>

We first extract the IDs of users who have given more than 200 ratings, then keep only the rows of the merged ratings dataframe that belong to those users.

In [28]:
x = rating_on_books.groupby('User-ID').count()['Book-Rating']>200
essential_users = x[x].index

In [29]:
filtered_user_rating = rating_on_books[rating_on_books['User-ID'].isin(essential_users)]

<span style="color:darkblue; font-size:20px;">Extract books that have received at least 50 ratings</span>

The dataframe has shrunk to about 4.8 lakh (480,000) rows, partly because the earlier merge dropped ratings whose ISBN is missing from the Books dataset. Next we count the ratings each book has received from these users by grouping on title and counting the Book-Rating column.

In [30]:
y= filtered_user_rating.groupby('Book-Title').count()['Book-Rating']>=50

In [31]:
essential_books = y[y].index

In [32]:
final_ratings = filtered_user_rating[filtered_user_rating['Book-Title'].isin(essential_books)]

In [33]:
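# drop_duplicates() returns a new DataFrame, so assign it back to keep the result
final_ratings = final_ratings.drop_duplicates()
final_ratings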

Unnamed: 0,User-ID,ISBN,Book-Rating,Book-Title,Book-Author,Image-URL-M
63,278418,0446520802,0,The Notebook,Nicholas Sparks,https://images.amazon.com/images/P/0446520802....
65,3363,0446520802,0,The Notebook,Nicholas Sparks,https://images.amazon.com/images/P/0446520802....
66,7158,0446520802,10,The Notebook,Nicholas Sparks,https://images.amazon.com/images/P/0446520802....
69,11676,0446520802,10,The Notebook,Nicholas Sparks,https://images.amazon.com/images/P/0446520802....
74,23768,0446520802,6,The Notebook,Nicholas Sparks,https://images.amazon.com/images/P/0446520802....
...,...,...,...,...,...,...
1026724,266865,0531001725,10,The Catcher in the Rye,Jerome David Salinger,https://images.amazon.com/images/P/0531001725....
1027923,269566,0670809381,0,Echoes,Maeve Binchy,https://images.amazon.com/images/P/0670809381....
1028777,271284,0440910927,0,The Rainmaker,John Grisham,https://images.amazon.com/images/P/0440910927....
1029070,271705,B0001PIOX4,0,Fahrenheit 451,Ray Bradbury,https://images.amazon.com/images/P/B0001PIOX4....


<span style="color:darkblue; font-size:20px;">Create Pivot Table</span>

In [34]:
pt = final_ratings.pivot_table(columns='User-ID', index='Book-Title', values="Book-Rating")
pt.fillna(0, inplace=True)
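
The pivot step turns the long list of ratings into a matrix with one row per title and one column per user; pairs with no rating become NaN and are then filled with 0. A small sketch on toy data (values are made up; `toy_pt` is a hypothetical name) shows the resulting shape:

```python
import pandas as pd

# Toy long-format ratings; values are illustrative only.
toy = pd.DataFrame({
    "User-ID": [1, 1, 2],
    "Book-Title": ["X", "Y", "X"],
    "Book-Rating": [9, 7, 5],
})

# Rows are titles, columns are users; missing (title, user) pairs become NaN.
# pivot_table averages by default if a pair occurs more than once.
toy_pt = toy.pivot_table(index="Book-Title", columns="User-ID", values="Book-Rating")

# Replace NaN with 0 so every row is a complete numeric vector.
toy_pt = toy_pt.fillna(0)
print(toy_pt)
```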

In [35]:
pt

User-ID,254,2276,2766,2977,3363,4017,4385,6251,6323,6543,...,271705,273979,274004,274061,274301,274308,275970,277427,277639,278418
Book-Title,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1984,9.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,10.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1st to Die: A Novel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,9.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2nd Chance,0.0,10.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4 Blondes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
A Bend in the Road,0.0,0.0,7.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Year of Wonders,0.0,0.0,0.0,7.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,9.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
You Belong To Me,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Zen and the Art of Motorcycle Maintenance: An Inquiry into Values,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Zoya,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


- **<span style="color:darkblue; font-size:20px;">Cosine Similarity</span>**

In [36]:
from sklearn.metrics.pairwise import cosine_similarity

In [37]:
#Cosine similarity is the cosine of the angle between two vectors: their dot product divided by the product of their norms.
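
For two rating vectors a and b, cosine similarity is (a · b) / (‖a‖ ‖b‖). The sketch below computes it by hand with NumPy on toy vectors; `cosine` is a hypothetical helper written for illustration and, unlike a production routine, it does not guard against all-zero vectors.

```python
import numpy as np

def cosine(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Each vector holds one book's ratings across three users (toy values).
a = np.array([9.0, 0.0, 5.0])
b = np.array([9.0, 0.0, 5.0])   # rated identically to a
c = np.array([0.0, 7.0, 0.0])   # rated by a different user only

print(cosine(a, b))   # identical rating pattern
print(cosine(a, c))   # no users in common
```

Two books rated the same way by the same users score close to 1, and books with no raters in common score 0.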

In [38]:
similarity_score = cosine_similarity(pt)

In [39]:
similarity_score[0]

array([1.        , 0.10255025, 0.01220856, 0.        , 0.05367224,
       0.02774901, 0.08216491, 0.13732869, 0.03261686, 0.03667591,
       0.02322418, 0.06766487, 0.02083978, 0.09673735, 0.13388865,
       0.08303112, 0.11153543, 0.05100411, 0.02517784, 0.11706383,
       0.        , 0.14333793, 0.07847534, 0.06150451, 0.08723968,
       0.        , 0.07009814, 0.13658681, 0.07600328, 0.12167134,
       0.00768046, 0.01473221, 0.        , 0.07965814, 0.04522617,
       0.01556271, 0.09495938, 0.0182307 , 0.02610465, 0.07984012,
       0.11679969, 0.0569124 , 0.08354155, 0.08471898, 0.08785938,
       0.05491435, 0.0548505 , 0.27026514, 0.09779123, 0.06016046,
       0.08958835, 0.06748675, 0.        , 0.04468098, 0.01920872,
       0.        , 0.05629067, 0.00557964, 0.07877059, 0.05219479,
       0.18908177, 0.        , 0.01240656, 0.02984572, 0.04279502,
       0.12680125, 0.16566735, 0.        , 0.13357242, 0.06615478,
       0.        , 0.        , 0.        , 0.10968075, 0.02806

In [40]:
#The array above holds the cosine similarity between the book at index 0 and every book in the pivot table, itself included (hence the leading 1.0).

<span style="color:darkblue; font-size:20px;">Recommendation Function</span>

In [41]:
def recommend(book_name):
    # Row position of the requested title in the pivot table
    # (raises IndexError if the title is not in pt.index).
    index = np.where(pt.index == book_name)[0][0]
    # Rank all books by similarity; skip the top entry (normally the book itself) and keep the next five.
    similar_items = sorted(enumerate(similarity_score[index]), key=lambda x: x[1], reverse=True)[1:6]

    data = []
    for i in similar_items:
        # One row per recommended title: title, author, and cover URL.
        temp_df = books[books['Book-Title'] == pt.index[i[0]]].drop_duplicates('Book-Title')
        item = []
        item.extend(list(temp_df['Book-Title'].values))
        item.extend(list(temp_df['Book-Author'].values))
        item.extend(list(temp_df['Image-URL-M'].values))
        data.append(item)
    return data


In [42]:
recommend('Message in a Bottle')

[['Nights in Rodanthe',
  'Nicholas Sparks',
  'https://images.amazon.com/images/P/0446531332.01.MZZZZZZZ.jpg'],
 ['The Mulberry Tree',
  'Jude Deveraux',
  'https://images.amazon.com/images/P/0743437640.01.MZZZZZZZ.jpg'],
 ['A Walk to Remember',
  'Nicholas Sparks',
  'https://images.amazon.com/images/P/0446608955.01.MZZZZZZZ.jpg'],
 ["River's End",
  'Nora Roberts',
  'https://images.amazon.com/images/P/0515127833.01.MZZZZZZZ.jpg'],
 ['Nightmares &amp; Dreamscapes',
  'Stephen King',
  'https://images.amazon.com/images/P/0451180232.01.MZZZZZZZ.jpg']]
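
The lookup inside `recommend` fails with an IndexError when the title is not in the pivot table. A hedged sketch of the same top-k retrieval with a guard for unknown titles follows; `titles`, `sim`, and `top_similar` are toy stand-ins invented for this example, not objects from the notebook.

```python
import numpy as np

# Toy stand-ins for pt.index and similarity_score; values are illustrative.
titles = np.array(["A", "B", "C", "D"])
sim = np.array([
    [1.0, 0.2, 0.9, 0.4],
    [0.2, 1.0, 0.1, 0.3],
    [0.9, 0.1, 1.0, 0.5],
    [0.4, 0.3, 0.5, 1.0],
])

def top_similar(title, k=2):
    """Return the k titles most similar to `title`, or [] if it is unknown."""
    matches = np.where(titles == title)[0]
    if len(matches) == 0:
        return []                        # unknown title: no IndexError
    idx = matches[0]
    order = np.argsort(sim[idx])[::-1]   # positions, most similar first
    order = order[order != idx][:k]      # drop the query book itself
    return titles[order].tolist()

print(top_similar("A"))
print(top_similar("missing"))
```

Removing the query index explicitly, rather than slicing off the first result, avoids returning the book itself when another book ties with it at similarity 1.0.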

In [43]:
import pickle
pickle.dump(popular_df,open('popular.pkl','wb'))

In [44]:
pickle.dump(pt,open('pt.pkl','wb'))
pickle.dump(books,open('books.pkl','wb'))
pickle.dump(similarity_score,open('similarity_score.pkl','wb'))
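
The saved files can be read back with `pickle.load`. The round trip below is a minimal sketch that uses a stand-in dictionary and a temporary path (both assumptions for illustration) instead of the notebook's dataframes; it also uses `with` blocks so the file handles are closed promptly.

```python
import os
import pickle
import tempfile

# Stand-in object; in the notebook this would be popular_df, pt, books,
# or similarity_score.
payload = {"titles": ["A", "B"], "scores": [1.0, 0.2]}

# Temporary location used only for this example.
path = os.path.join(tempfile.mkdtemp(), "example.pkl")

# Write the object to disk.
with open(path, "wb") as f:
    pickle.dump(payload, f)

# Read it back, for instance in a separate serving script.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == payload)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.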