<div style="border-radius: 10px; border: #6B8E23 solid; padding: 15px; background-color: #F5F5DC; font-size: 100%; text-align: left">

<h3 align="left"><font color='#556B2F'>📜 Introduction : </font></h3>
    
With millions of books available, choosing what to read next can be difficult. Book recommendation systems ease this choice by offering personalized suggestions. This notebook builds book recommendations from the goodbooks-10k dataset using Collaborative Filtering: Item-Based and User-Based approaches, followed by a Model-Based approach (SVD). The aim is to recommend books based on the preferences of other users with similar interests, making literature more accessible and giving each reader a personalized experience.

<a id = "1"></a><br>
<p style="font-family: 'Pacifico', cursive; font-weight: bold; letter-spacing: 2px; color: #556B2F; font-size: 160%; text-align: left; padding: 0px; border-bottom: 3px solid">✨Item-Based Recommendation System✨</p>

In [None]:
import numpy as np
import pandas as pd

import warnings
warnings.filterwarnings("ignore")

In [None]:
rating = pd.read_csv("/kaggle/input/goodbooks-10k/ratings.csv")
books = pd.read_csv("/kaggle/input/goodbooks-10k/books.csv",
                    usecols=["book_id",
                             "original_publication_year",
                             "average_rating",
                             "title"])

In [None]:
books.head()

In [None]:
rating.head()

In [None]:
df = pd.merge(books,rating, how="inner", on="book_id")

In [None]:
df.shape

In [None]:
# User-by-title matrix: True where the user has rated the title, False otherwise
user_df = df.groupby(["user_id","title"])["rating"].mean().unstack().notnull()
user_df

In [None]:
# Pick a random book title from our dataset.
sample_name = pd.Series(user_df.columns).sample(1, random_state = 42).values[0]

sample_name

In [None]:
# Select the sampled book's column (Heidi): whether each reader has rated it.

sample = user_df[sample_name]

In [None]:
sample

In [None]:
# Books most correlated with Heidi, to suggest to its readers.

user_df.corrwith(sample).sort_values(ascending=False).head(10)
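
The item-based step above can be tried on a tiny table first. The following is a minimal sketch on made-up data: the titles `A` to `D`, the user IDs, and the names `toy`, `toy_user_df`, and `toy_similar` are assumptions for illustration only and are not part of the goodbooks-10k dataset. It casts the True/False matrix to 0/1 before correlating and drops the target title from the result, since a title always correlates perfectly with itself.

```python
import pandas as pd

# Hypothetical toy ratings; titles and user IDs are invented for illustration.
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "title":   ["A", "B", "A", "B", "C", "D", "C", "D"],
    "rating":  [5, 4, 3, 5, 4, 2, 5, 3],
})

# Same pivot as in the notebook: True where a user rated a title.
toy_user_df = toy.groupby(["user_id", "title"])["rating"].mean().unstack().notnull()

# Correlate every title's "was rated" column with the column of title "A".
# Casting to int makes the correlation run on plain 0/1 values.
toy_binary = toy_user_df.astype(int)
toy_corr = toy_binary.corrwith(toy_binary["A"])

# Remove the target title itself, then rank the remaining titles.
toy_similar = toy_corr.drop("A").sort_values(ascending=False)
print(toy_similar)
```

In this toy table, users who rated `A` also rated `B` and never rated `C` or `D`, so `B` ranks first.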

<center><img src="https://i.imgur.com/Y2DRcty.jpg" width="800" height="800"></center>

<a id = "2"></a><br>
<p style="font-family: 'Pacifico', cursive; font-weight: bold; letter-spacing: 2px; color: #556B2F; font-size: 160%; text-align: left; padding: 0px; border-bottom: 3px solid">✨User-Based Recommendation System✨</p>

In [None]:
user_df = df.groupby(["user_id","title"])["rating"].mean().unstack()

In [None]:
random_user = user_df.sample(1,random_state=689).index[0]

In [None]:
random_user_df = user_df[user_df.index == random_user]
random_user_df

<a id = "3"></a><br>
<div style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#E5788F; font-size:150%; text-align:left; padding: 0px;">Other Users Reading the Same Books</div>

In [None]:
book_read = random_user_df.dropna(axis=1).columns.tolist()
book_read

<div style="border-radius:10px; border:#6B8BA0 solid; padding: 15px; background-color: #F2EADF; font-size:100%; text-align:left">

<h3 align="left"><font color='#6B8BA0'>🗨️ Comment: </font></h3>

We check which of the books read by the randomly selected reader have also been read by other readers.

In [None]:
book_read_df = user_df[book_read]
book_read_df

In [None]:
# How many of the selected reader's books each reader has also read
user_book_count = book_read_df.T.notnull().sum()
user_book_count.max()

In [None]:
# IDs of readers who have read more than 30% of the books the selected reader has read.
users_same_books = user_book_count[user_book_count > (book_read_df.shape[1] * 30 ) / 100].index
users_same_books
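
The overlap filter can be checked by hand on a small matrix. This is a sketch under assumptions: the matrix `toy_ratings`, the user IDs 10 to 13, and the variable names are hypothetical. Unlike the cell above, it also removes the target reader from the result, which is a design choice made here rather than something the notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical user-by-book rating matrix; NaN means "not rated".
toy_ratings = pd.DataFrame(
    {
        "Book A": [5, 4, np.nan, 3],
        "Book B": [4, np.nan, np.nan, 5],
        "Book C": [3, 2, np.nan, np.nan],
        "Book D": [np.nan, 5, 4, np.nan],
    },
    index=pd.Index([10, 11, 12, 13], name="user_id"),
)

target_user = 10

# Books the target user has rated.
target_books = toy_ratings.loc[target_user].dropna().index.tolist()

# For every user, count how many of those books they have also rated.
overlap = toy_ratings[target_books].notnull().sum(axis=1)

# Keep users who share more than 30% of the target's books, excluding the target.
threshold = 0.30 * len(target_books)
similar_users = overlap[overlap > threshold].drop(target_user).index.tolist()
print(similar_users)
```

Reader 12 shares none of the target's three books and is filtered out; readers 11 and 13 each share two and are kept.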

<a id = "4"></a><br>
<div style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#E5788F; font-size:150%; text-align:left; padding: 0px;">Determination of Similarity</div>

In [None]:
filted_df = book_read_df[book_read_df.index.isin(users_same_books)]
filted_df.head()

In [None]:
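# Pairwise correlations between readers, flattened to a Series indexed by (user_id, user_id).
# Note: drop_duplicates() compares correlation values, so distinct pairs that share a value are dropped too.
corr_df = filted_df.T.corr().unstack().drop_duplicates()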
corr_df

In [None]:
top_readers = pd.DataFrame(corr_df[random_user][corr_df[random_user] > 0.70], columns=["corr"])
top_readers

<div style="border-radius:10px; border:#6B8BA0 solid; padding: 15px; background-color: #F2EADF; font-size:100%; text-align:left">

<h3 align="left"><font color='#6B8BA0'>🗨️ Comment: </font></h3>
    
Readers whose rating correlation with the selected reader is above 0.70.
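
To see what the correlation step measures, here is a minimal sketch on invented ratings. The matrix `toy`, the user IDs, and the names `user_corr`, `target_corr`, and `top` are assumptions for illustration. It reads the target reader's column straight from the correlation matrix instead of flattening the matrix and removing duplicate values, which is one possible alternative, not the notebook's method.

```python
import pandas as pd

# Hypothetical ratings of four books by three readers (no missing values).
toy = pd.DataFrame(
    {
        "Book A": [5, 4, 1],
        "Book B": [4, 5, 2],
        "Book C": [2, 1, 5],
        "Book D": [1, 2, 4],
    },
    index=pd.Index([10, 11, 12], name="user_id"),
)

target_user = 10

# Transpose so readers become columns, then take pairwise Pearson correlations.
user_corr = toy.T.corr()

# Correlation of every other reader with the target reader.
target_corr = user_corr[target_user].drop(target_user)

# Keep readers above the same 0.70 cut-off used in the notebook.
top = target_corr[target_corr > 0.70].to_frame("corr")
print(top)
```

Reader 11 rates the books in nearly the same order as reader 10 and passes the cut-off; reader 12 has the opposite taste and gets a negative correlation.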

<a id = "5"></a><br>
<div style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#E5788F; font-size:150%; text-align:left; padding: 0px;">Score Calculation</div>

In [None]:
top_readers_ratings = pd.merge(top_readers, df[["user_id", "book_id", "rating"]], how='inner', on="user_id")
top_readers_ratings

In [None]:
# Weight each rating by the reader's correlation with the selected reader.
top_readers_ratings['weighted_rating'] = top_readers_ratings['corr'] * top_readers_ratings['rating']
top_readers_ratings

In [None]:
recommendation_df = top_readers_ratings.pivot_table(values="weighted_rating", index="book_id", aggfunc="mean")
recommendation_df

In [None]:
books_recommend = recommendation_df[recommendation_df["weighted_rating"] > 3.5].sort_values(by="weighted_rating", ascending=False).head()
books_recommend

In [None]:
books[books["book_id"].isin(books_recommend.index)]
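
The scoring step can be followed on a few rows. The sketch below uses made-up readers, books, and correlations; every name in it is hypothetical. Next to the notebook's score (the mean of `corr * rating` per book) it also computes a normalized weighted average, `sum(corr * rating) / sum(corr)`. That second formula is an alternative added here for comparison, not something the notebook uses; it keeps scores on the original 1 to 5 scale, which can make a fixed threshold easier to interpret.

```python
import pandas as pd

# Hypothetical ratings from two similar readers, with their correlation to the target.
toy = pd.DataFrame({
    "user_id": [11, 11, 13, 13],
    "book_id": [1, 2, 1, 3],
    "rating":  [5, 3, 4, 5],
    "corr":    [0.8, 0.8, 0.5, 0.5],
})

# Weight each rating by the reader's correlation, as in the notebook.
toy["weighted_rating"] = toy["corr"] * toy["rating"]

# Notebook-style score: mean weighted rating per book.
mean_score = toy.groupby("book_id")["weighted_rating"].mean()

# Alternative (an assumption of this sketch): normalized weighted average per book.
grouped = toy.groupby("book_id")
normalized = grouped["weighted_rating"].sum() / grouped["corr"].sum()

print(pd.DataFrame({"mean_score": mean_score, "normalized": normalized}))
```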

<center><img src="https://i.imgur.com/d9BAbkF.png" width="800" height="800"></center>

<a id = "6"></a><br>
<p style="font-family: 'Pacifico', cursive; font-weight: bold; letter-spacing: 2px; color: #556B2F; font-size: 160%; text-align: left; padding: 0px; border-bottom: 3px solid">✨Model-Based Recommendation System✨</p>

In [None]:
import pandas as pd
from surprise import Reader, SVD, Dataset, accuracy
from surprise.model_selection import GridSearchCV, train_test_split, cross_validate
pd.set_option('display.max_columns', None)

In [None]:
df.head()

In [None]:
# Pick one sample user
user_id = df["user_id"].sample(1,random_state=42).values.tolist()[0]
user_id

In [None]:
# Books that our sample user has read
sample_df = df[df["user_id"] == user_id]
sample_df

In [None]:
reader = Reader(rating_scale=(1, 5))

In [None]:
# Create the Surprise dataset before modelling
data = Dataset.load_from_df(df[['user_id',
                                       'book_id',
                                       'rating']], reader)

In [None]:
# building model
trainset, testset = train_test_split(data, test_size=.25, random_state = 42)
svd_model = SVD(random_state = 42)
svd_model.fit(trainset)
predictions = svd_model.test(testset)

In [None]:
accuracy.rmse(predictions)
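
RMSE is the square root of the mean squared difference between true and predicted ratings, so lower is better. As a small illustration of what that number means, the sketch below computes it by hand with NumPy on invented values; `true_ratings` and `estimates` are hypothetical arrays, not output of the model above.

```python
import numpy as np

# Hypothetical true ratings and model estimates for five (user, book) pairs.
true_ratings = np.array([5.0, 3.0, 4.0, 2.0, 1.0])
estimates = np.array([4.0, 3.0, 5.0, 2.0, 3.0])

# Squared errors are 1, 0, 1, 0, 4, so the mean squared error is 6 / 5 = 1.2.
squared_errors = (true_ratings - estimates) ** 2

# RMSE is the square root of that mean.
rmse = np.sqrt(squared_errors.mean())
print(round(float(rmse), 4))
```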

In [None]:
# Book IDs from ratings given by users other than the sample user
df["book_id"][~(df["user_id"]==user_id)]

In [None]:
read_ids = df.loc[df["user_id"] == user_id, "book_id"]  # books the sample user has rated
didnt_read = df.loc[~df["book_id"].isin(read_ids), "book_id"].drop_duplicates().tolist()

In [None]:
# Recommend books the user has not rated, ranked by the rating the SVD model predicts for them
def suggest(df, user_id, sug):
    # Exclude every book the user has already rated
    read_ids = df.loc[df["user_id"] == user_id, "book_id"]
    didnt_read = df.loc[~df["book_id"].isin(read_ids), "book_id"].drop_duplicates().tolist()
    # Predicted rating (the `est` field of the prediction) for each unread book
    temp_dict = {i: svd_model.predict(uid=user_id, iid=i).est for i in didnt_read}
    suggestions = pd.DataFrame(list(temp_dict.items()), columns=["book_id", "possible_rate"]).sort_values(by="possible_rate", ascending=False).head(sug)
    merged = pd.merge(suggestions, books[["book_id", "title"]], how="inner", on="book_id")
    return merged

In [None]:
suggest(df,user_id,5)

<div style="border-radius:10px; border:#6B8BA0 solid; padding: 15px; background-color: #F2EADF; font-size:100%; text-align:left">

<h3 align="left"><font color='#6B8BA0'>🗨️ Comment: </font></h3>
    
We can suggest these books to our sample reader.
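
When building the candidate list, it helps to be explicit about what "has not read" means. The sketch below, on an invented ratings table, contrasts two filters: books absent from the user's own ratings, and books that appear in any other user's ratings. The second list can still contain books the user has already rated. The table `toy` and the helper `unread_books` are hypothetical names used only for this illustration.

```python
import pandas as pd

# Hypothetical ratings table.
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "book_id": [10, 20, 10, 30, 40],
    "rating":  [5, 4, 3, 5, 2],
})


def unread_books(ratings, user_id):
    """Return the IDs of books in `ratings` that `user_id` has not rated."""
    read = set(ratings.loc[ratings["user_id"] == user_id, "book_id"])
    all_books = ratings["book_id"].drop_duplicates().tolist()
    return [book for book in all_books if book not in read]


# Books that appear in ratings by anyone other than user 1.
rated_by_others = toy.loc[toy["user_id"] != 1, "book_id"].drop_duplicates().tolist()

print(unread_books(toy, 1))
print(rated_by_others)
```

Book 10 shows up in `rated_by_others` even though user 1 has rated it, so only the first filter is safe for recommending new books.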

<center><img src="https://i.imgur.com/TKcovIp.png" width="800" height="800"></center>