<h1>Book Recommendation Engine using the Collaborative Filtering (CF) Algorithm</h1>

<h4>A recommendation system is a subclass of information filtering systems that seeks to predict the rating or preference a user would give to an item. In simple terms, it is an algorithm that suggests relevant items to users.</h4>
<h4>One of the effective personalization technologies powering the adaptive web is collaborative filtering. Collaborative filtering (CF) is the process of filtering or evaluating items through the opinions of other people. CF technology brings together the opinions of large interconnected communities on the web, supporting the filtering of substantial quantities of data. In this project we introduce the core concepts of collaborative filtering, its primary uses for users of the adaptive web, the theory and practice of CF algorithms, and design decisions regarding rating systems and the acquisition of ratings.</h4>
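To make the idea concrete, here is a minimal sketch of item-based collaborative filtering on a made-up ratings matrix. The data and the `cosine_similarity` helper are illustrative assumptions and are not part of this notebook's pipeline. Cosine similarity is used here only because it is easy to read; the model built later relies on scikit-learn's default distance.

```python
import numpy as np

# Toy ratings (made-up data): rows are books, columns are users, 0 = not rated.
ratings = np.array([
    [5, 4, 0, 0],   # book A
    [4, 5, 0, 1],   # book B
    [0, 0, 5, 4],   # book C
], dtype=float)

def cosine_similarity(u, v):
    """Cosine of the angle between two rating vectors (hypothetical helper)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Books rated highly by the same users end up with a similarity close to 1.
sim_ab = cosine_similarity(ratings[0], ratings[1])
# Books with no raters in common have a similarity of 0.
sim_ac = cosine_similarity(ratings[0], ratings[2])

print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

Books A and B share their raters, so a reader who liked A would be pointed to B before C.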

In [30]:
import pandas as pd
import numpy as np

In [31]:
# on_bad_lines="skip" (pandas >= 1.3) replaces error_bad_lines=False, which was removed in pandas 2.0
ratings_data = pd.read_csv("../input/book-crossing-dataset/BX-Book-Ratings.csv",sep=";",on_bad_lines="skip",encoding="latin-1")
ratings_data_copy = ratings_data.copy()

users_data = pd.read_csv("../input/book-crossing-dataset/BX-Users.csv",encoding="latin-1",sep=";",on_bad_lines="skip")
users_data_copy = users_data.copy()

books_data = pd.read_csv("../input/book-crossing-dataset/BX-Books.csv",sep=";",encoding="latin-1",on_bad_lines="skip",engine="python")
books_data_copy = books_data.copy()

<h5>Keeping only the columns required for model training</h5>

In [32]:
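# .copy() gives an independent frame, so the in-place rename below does not raise SettingWithCopyWarning
books_data = books_data[["ISBN","Book-Title","Book-Author","Year-Of-Publication","Publisher"]].copy()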

<h5>Changing the column names for ease.</h5>

In [33]:
books_data.rename(columns={"Book-Title":"title","Book-Author":"author","Year-Of-Publication":"year","Publisher":"publisher"},inplace=True)
users_data.rename(columns={"User-ID":"user_id","Location":"location","Age":"age"},inplace=True)
ratings_data.rename(columns={"User-ID":"user_id","Book-Rating":"rating"},inplace=True)

<h5>Counting the users who have rated at least one book</h5>

In [34]:
ratings_data["user_id"].value_counts().shape

**105283 users have actually rated the books**

**Selecting only those users who have given at least 180 ratings, so that the recommendations are based on users with enough rating history**

In [35]:
a = ratings_data["user_id"].value_counts()>=180
b = a[a].index

**Keeping only the rating rows that belong to users with at least 180 ratings**

In [36]:
ratings_data = ratings_data[ratings_data["user_id"].isin(b)]
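The two cells above can be tried on a small made-up frame to see what the filter keeps. This is only a sketch: the toy data, the `min_ratings` threshold of 2 (instead of 180), and the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up ratings: user 1 and user 3 rate several books, user 2 rates only one.
toy = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 3, 3],
    "ISBN": ["a", "b", "c", "a", "b", "c"],
    "rating": [5, 0, 8, 7, 9, 6],
})
min_ratings = 2  # illustrative threshold; the notebook uses 180

# Boolean Series indexed by user_id: True where the user has enough ratings.
is_active = toy["user_id"].value_counts() >= min_ratings
# Keep only the user ids whose flag is True.
active_users = is_active[is_active].index
# Keep only the rating rows written by those users.
filtered = toy[toy["user_id"].isin(active_users)]

print(sorted(active_users.tolist()))
print(len(filtered), "of", len(toy), "rows kept")
```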

**Merging ratings_data and books_data on the ISBN column**

In [37]:
rated_books_data = ratings_data.merge(books_data,on="ISBN")
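`DataFrame.merge` performs an inner join by default, so ratings whose ISBN has no match in the books table are dropped without notice. The sketch below, on made-up frames, shows that effect and one way to list the unmatched rows using the `indicator` option. The toy data and variable names are assumptions.

```python
import pandas as pd

# Made-up data: ISBN "999" has a rating but no entry in the books table.
toy_ratings = pd.DataFrame({
    "user_id": [1, 2, 3],
    "ISBN": ["111", "222", "999"],
    "rating": [8, 5, 7],
})
toy_books = pd.DataFrame({
    "ISBN": ["111", "222"],
    "title": ["Book One", "Book Two"],
})

# Default how="inner": only ISBNs present in both frames survive.
merged = toy_ratings.merge(toy_books, on="ISBN")

# A left join with indicator=True adds a "_merge" column that flags unmatched rows.
checked = toy_ratings.merge(toy_books, on="ISBN", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(len(merged), "matched rows;", len(unmatched), "rating without book metadata")
```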

**Counting how many ratings each book title has received**

In [38]:
no_of_ratings_data = rated_books_data.groupby("title")["rating"].count().reset_index()
no_of_ratings_data.rename(columns={"rating":"number_of_rating"},inplace=True)
no_of_ratings_data

**Merging no_of_ratings_data with rated_books_data**

In [39]:
final_rating_data = rated_books_data.merge(no_of_ratings_data,on="title")

In [40]:
final_rating_data

**Selecting books with 50 or more ratings**

In [41]:
final_rating_data = final_rating_data[final_rating_data["number_of_rating"]>=50]
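As an aside, the count, merge, and filter steps can also be written with `groupby(...).transform("count")`, which attaches the per-title count to every row without a separate merge. The sketch below uses made-up data and a threshold of 2 instead of 50. It is an alternative formulation, not the code this notebook runs.

```python
import pandas as pd

# Made-up ratings: title "A" has 3 ratings, "B" has 1, "C" has 2.
toy = pd.DataFrame({
    "title": ["A", "A", "A", "B", "C", "C"],
    "user_id": [1, 2, 3, 1, 2, 3],
    "rating": [8, 7, 9, 5, 6, 10],
})
min_count = 2  # illustrative threshold; the notebook uses 50

# transform("count") returns one value per row: the rating count of that row's title.
toy["number_of_rating"] = toy.groupby("title")["rating"].transform("count")
popular = toy[toy["number_of_rating"] >= min_count]

print(sorted(popular["title"].unique().tolist()))
```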

In [42]:
final_rating_data

**Dropping duplicate (user_id, title) pairs from final_rating_data**

In [43]:
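# Reassign instead of inplace=True: final_rating_data is a filtered slice, and in-place edits on it trigger SettingWithCopyWarning
final_rating_data = final_rating_data.drop_duplicates(["user_id","title"])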

In [44]:
final_rating_data.shape

**Creating a pivot table with columns=users, index=book titles, values=ratings; missing ratings are filled with 0**

In [45]:
rating_pivot_table = final_rating_data.pivot_table(columns="user_id",index="title",values="rating")
rating_pivot_table.fillna(0,inplace=True)
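A small made-up example shows the shape the pivot produces: one row per title, one column per user, and 0 wherever a user has not rated a title. The toy data is an assumption. Note that after `fillna(0)` a missing rating can no longer be told apart from an actual rating of 0.

```python
import pandas as pd

# Made-up long-format ratings: one row per (user, title) pair.
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "title": ["A", "B", "A", "B"],
    "rating": [8, 6, 10, 4],
})

# Rows become titles, columns become users; unrated cells start as NaN.
table = toy.pivot_table(index="title", columns="user_id", values="rating")
# Replace NaN with 0 so every title has a complete numeric vector.
table = table.fillna(0)

print(table)
```

`pivot_table` aggregates with the mean by default, so any remaining duplicate (user, title) pairs would be averaged rather than raising an error.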

**Most entries of the pivot table are 0 (no rating) and carry little information for the nearest-neighbor search, so we convert the table to a sparse matrix and use that matrix to build the model**

In [46]:
from scipy.sparse import csr_matrix
sparse_matrix = csr_matrix(rating_pivot_table)
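To see what the conversion buys, the sketch below builds a CSR matrix from a small made-up array and reports how many values are actually stored. The array is an assumption for illustration; the same `nnz` check can be run on the real matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up dense ratings with mostly zeros, as in a book-by-user table.
dense = np.array([
    [0, 0, 8, 0],
    [5, 0, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

sparse = csr_matrix(dense)

# nnz is the number of explicitly stored (non-zero) entries.
stored = sparse.nnz
# Fraction of cells that are zero and therefore not stored.
sparsity = 1 - stored / (dense.shape[0] * dense.shape[1])

print(f"{stored} stored values, sparsity = {sparsity:.2%}")
```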

<h2> MODEL BUILDING </h2>

In [47]:
from sklearn.neighbors import NearestNeighbors
model = NearestNeighbors(algorithm='brute')

In [48]:
model.fit(sparse_matrix)
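`NearestNeighbors` is created here without a `metric` argument, so it uses scikit-learn's default, the Minkowski distance with `p=2` (Euclidean distance). The sketch below, on made-up data, compares that default with `metric="cosine"`, which ignores how large the ratings are and looks only at their pattern. Switching metrics is a possible variation, not something this notebook does.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Made-up book vectors: row 1 has the same rating pattern as row 0 but larger values.
items = csr_matrix(np.array([
    [1, 1, 0, 0],
    [10, 10, 0, 0],
    [0, 0, 1, 1],
], dtype=float))

euclid = NearestNeighbors(algorithm="brute").fit(items)                  # default metric
cosine = NearestNeighbors(algorithm="brute", metric="cosine").fit(items)

# Query with row 0 and ask for all three rows, ordered by distance.
dist_e, idx_e = euclid.kneighbors(items[0], n_neighbors=3)
dist_c, idx_c = cosine.kneighbors(items[0], n_neighbors=3)

print("euclidean order:", idx_e[0].tolist())
print("cosine order:   ", idx_c[0].tolist())
```

With Euclidean distance, row 2 counts as closer to row 0 than row 1 does, simply because its values are small. With cosine distance, row 1 is as close as possible and row 2 is the farthest.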

**Finding the 6 nearest neighbors of the first book in the pivot table (the book itself is normally returned as its own closest neighbor)**

In [49]:
distances,suggestions=model.kneighbors(rating_pivot_table.iloc[0,:].values.reshape(1,-1),n_neighbors=6)

In [50]:
suggestions

In [51]:
def recommend_books(book_name):
  """Print the 6 titles nearest to book_name; the first is normally book_name itself."""
  # Row position of the requested title in the pivot table
  book_index = np.where(rating_pivot_table.index==book_name)[0][0]
  distances , suggestions = model.kneighbors(rating_pivot_table.iloc[book_index,:].values.reshape(1,-1),n_neighbors=6)
  suggestions = np.ravel(suggestions, order='C') #2d to 1d array
  for i in suggestions:
    print(rating_pivot_table.index[i])
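The function above raises an `IndexError` for a title that is not in the pivot table, and it lists the query book among its own recommendations. The sketch below is one possible variant that returns a list, skips the query title, and returns an empty list for unknown titles. The name `recommend_titles`, its parameters, and the toy table are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Made-up pivot table: titles as index, users as columns.
toy_table = pd.DataFrame(
    [[8, 9, 0, 0], [7, 9, 0, 1], [0, 0, 9, 8], [0, 1, 8, 9]],
    index=["Book A", "Book B", "Book C", "Book D"],
    columns=[101, 102, 103, 104],
    dtype=float,
)
toy_model = NearestNeighbors(algorithm="brute").fit(toy_table.values)

def recommend_titles(book_name, table, model, k=2):
    """Return up to k neighbor titles, excluding the query title (hypothetical helper)."""
    if book_name not in table.index:
        return []  # unknown title: nothing to recommend
    row = table.loc[[book_name]].values          # shape (1, n_users)
    n = min(k + 1, len(table))                   # one extra slot for the query itself
    _, idx = model.kneighbors(row, n_neighbors=n)
    titles = [table.index[i] for i in idx.ravel() if table.index[i] != book_name]
    return titles[:k]

print(recommend_titles("Book A", toy_table, toy_model, k=1))
print(recommend_titles("Unknown Title", toy_table, toy_model))
```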

In [52]:
recommend_books("1984")

In [53]:
recommend_books("Animal Farm")

**Our recommendation model is now complete**

**Making book image dataframe**

In [54]:
# Chained, non-inplace calls avoid SettingWithCopyWarning on the column slice
books_image_data = books_data_copy[["Book-Title","Image-URL-M"]].rename(columns={"Book-Title":"title","Image-URL-M":"image"})
books_image_data = books_image_data[books_image_data["title"].isin(rating_pivot_table.index)]
books_image_data = books_image_data.drop_duplicates(subset=["title"],keep='first')

In [55]:
books_image_data

<h2>Pickling the Rating Table and Image Data</h2>

In [56]:
import pickle

In [57]:
with open("rating_table.pkl","wb") as f:  # the with-block closes the file after writing
  pickle.dump(rating_pivot_table,f)

In [58]:
with open("books_image_data.pkl","wb") as f:  # the with-block closes the file after writing
  pickle.dump(books_image_data,f)
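Before relying on the pickled files in the web app, it can be worth checking that a frame survives the round trip unchanged. The sketch below does this for a small made-up frame in a temporary directory. The frame and the file name are assumptions.

```python
import os
import pickle
import tempfile

import pandas as pd

# Made-up stand-in for the rating table.
table = pd.DataFrame(
    {"user_1": [8.0, 0.0], "user_2": [0.0, 5.0]},
    index=["Book A", "Book B"],
)

# Write to a temporary location so the real pickle files are not touched.
path = os.path.join(tempfile.mkdtemp(), "rating_table_demo.pkl")

with open(path, "wb") as f:
    pickle.dump(table, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

# equals() compares values, index, and columns.
print("round trip ok:", restored.equals(table))
```

Pickle files should only be loaded from sources you trust, because unpickling can execute arbitrary code.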

<h1>Web App Code</h1>

In [59]:
import pandas as pd
import numpy as np
import pickle

In [60]:
with open("./rating_table.pkl","rb") as f:
  rating_table = pickle.load(f)
with open("./books_image_data.pkl","rb") as f:
  books_image_data = pickle.load(f)

In [61]:
from scipy.sparse import csr_matrix
new_sparse_matrix = csr_matrix(rating_table)

In [62]:
from sklearn.neighbors import NearestNeighbors
model2 = NearestNeighbors(algorithm='brute')

In [63]:
model2.fit(new_sparse_matrix)

<h2>Function for recommending books</h2>

In [64]:
def rec(book_name):
  """Return the 6 nearest titles (normally starting with book_name itself) and their image URLs."""
  recommended_books = []
  image_url = []
  book_index = np.where(rating_table.index==book_name)[0][0]
  distances , suggestions = model2.kneighbors(rating_table.iloc[book_index,:].values.reshape(1,-1),n_neighbors=6)
  suggestions = np.ravel(suggestions, order='C') #2d to 1d array
  for i in suggestions:
    recommended_books.append(rating_table.index[i])

  # books_image_data holds one row per title, so read that row's URL directly instead of string-formatting the Series
  for i in recommended_books:
    image_url.append(books_image_data[books_image_data["title"] == i ]["image"].iloc[0])

  return recommended_books,image_url

**Function to get the images**

In [65]:
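def image(book_list):
  """Return the image URL for each title in book_list."""
  image_url = []
  for i in book_list:
    # One row per title, so read that row's URL directly instead of string-formatting the Series
    image_url.append(books_image_data[books_image_data["title"] == i ]["image"].iloc[0])
  return image_url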
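Filtering the frame once per title works, but a title-to-URL dictionary built once gives the same answer with a plain lookup and a natural place for a fallback value. The sketch below uses made-up titles and URLs. `image_urls`, `url_by_title`, and the empty-string default are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-in for books_image_data (one row per title).
toy_images = pd.DataFrame({
    "title": ["Book A", "Book B"],
    "image": ["http://example.com/a.jpg", "http://example.com/b.jpg"],
})

# Build the lookup once: title -> image URL.
url_by_title = toy_images.set_index("title")["image"].to_dict()

def image_urls(book_list, lookup, default=""):
    """Return one URL per title, or `default` when a title has no image (hypothetical helper)."""
    return [lookup.get(title, default) for title in book_list]

print(image_urls(["Book B", "Missing Book"], url_by_title))
```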

In [66]:
rec("Animal Farm")[1]

In [67]:
books_image_data[books_image_data.title=="Exclusive"]

In [68]:
import pandas as pd
import numpy as np
import pickle
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors


rating_table = pickle.load(open("rating_table.pkl","rb"))
books_image_data = pickle.load(open("books_image_data.pkl","rb"))

books_name = rating_table.index.to_list()


sparse_matrix = csr_matrix(rating_table)
model = NearestNeighbors(algorithm='brute')
model.fit(sparse_matrix)


#Function for recommending books
def recommend(book_name):
  """Return the 5 nearest titles (normally starting with book_name itself) and their image URLs."""
  recommended_books = []
  image_url = []
  book_index = np.where(rating_table.index==book_name)[0][0]
  distances , suggestions = model.kneighbors(rating_table.iloc[book_index,:].values.reshape(1,-1),n_neighbors=5)
  suggestions = np.ravel(suggestions, order='C') #2d to 1d array
  for i in suggestions:
    recommended_books.append(rating_table.index[i])

  # One row per title, so read that row's URL directly instead of string-formatting the Series
  for i in recommended_books:
    image_url.append(books_image_data[books_image_data["title"] == i ]["image"].iloc[0])

  return recommended_books,image_url


In [69]:
!pip install streamlit

In [70]:
import streamlit as st

st.title("Book Recommendation Engine")

selected_book = st.selectbox(
     'Search your books here',
     books_name)

if st.button('Search'):
    books,images = recommend(selected_book) 

    container1 =st.container()
    container1.header("YOU HAVE SEARCHED FOR -")
    container1.header(books[0])
    container1.image(images[0])

    st.header("PEOPLE ALSO LIKED -")
    col1, col2, col3,col4 = st.columns(4)

    with col1:
        st.subheader(books[1])
        st.image(images[1])
    with col2:
        st.subheader(books[2])
        st.image(images[2])
    with col3:
        st.subheader(books[3])
        st.image(images[3])
    with col4:
        st.subheader(books[4])
        st.image(images[4])


In [71]:
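# Use the raw file URL: the /blob/ URL returns GitHub's HTML page, not the Python source
!wget -O main.py https://raw.githubusercontent.com/Vidrow/Book-Recommendation-Engine/7639cd4f70690854207ed37d5fe94f33c5583cc4/End%20to%20End%20Implementation/main.py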

In [72]:
!pip install pyngrok==4.1.1
from pyngrok import ngrok

In [73]:
!nohup streamlit run main.py &
url = ngrok.connect(port='8501')
url