<div style="border-radius:10px; border:#4E5672 solid; padding: 15px; background-color: #F8F1E8; font-size:100%; text-align:left">

<h3 align="left"><font color='#4E5672'>📝 Description:</font></h3>

* I will apply three different collaborative filtering algorithms to recommend books to users.

1. **Item-Based Collaborative Filtering:**
   - **Description:** Item-based collaborative filtering finds items that are rated in a similar way by the same users and recommends them based on a user's previous preferences. It suggests items that are similar to the items a user already liked.
    

2. **User-Based Collaborative Filtering:**
   - **Description:** User-based collaborative filtering identifies other users with similar interests based on a user's preferences and recommends items liked by these similar users. It suggests items that users with similar tastes have enjoyed.
    

3. **Model-Based Collaborative Filtering:**
   - **Description:** Model-based collaborative filtering trains a model on the observed user-item ratings and uses it to predict how a user would rate an item they have not rated yet. It makes recommendations based on these predictions. Model-based approaches often involve techniques like matrix factorization.
   - **Operation:** Each user and each item is represented by a vector of latent factors, and the learning algorithm estimates these factors from the ratings. The model then predicts the rating a user would give an item, and the items with the highest predicted ratings are recommended.
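
To make the "Operation" step concrete, here is a minimal sketch of how a biased matrix factorization model turns factors into a predicted rating (global mean + user bias + item bias + dot product of the two factor vectors). All numbers below are made-up toy values, not factors learned from the goodbooks data.

```python
import numpy as np

# Hypothetical toy factors: 2 users and 3 books in 2 latent dimensions.
user_factors = np.array([[1.0, 0.5],
                         [0.2, 1.0]])
item_factors = np.array([[0.8, 0.1],
                         [0.1, 0.9],
                         [0.5, 0.5]])
global_mean = 3.5                         # assumed average rating
user_bias = np.array([0.1, -0.2])         # assumed per-user offsets
item_bias = np.array([0.3, 0.0, -0.1])    # assumed per-book offsets

# Predicted rating = global mean + user bias + item bias + dot(user, item),
# clipped to the 1-5 rating scale.
pred = (global_mean
        + user_bias[:, None]
        + item_bias[None, :]
        + user_factors @ item_factors.T)
pred = np.clip(pred, 1, 5)
print(pred.round(2))
```

In a real model the factors and biases are not set by hand; the learning algorithm fits them so that the predictions match the known ratings as closely as possible.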

In [639]:
import pandas as pd

# <p style="background-color:#F8F1E8; font-family:newtimeroman;color:#602F44; font-size:150%; text-align:center; border-radius: 15px 50px;"> ⇣ Reading Data ⇣</p>

In [640]:
books = pd.read_csv("/kaggle/input/goodbooks-10k/books.csv",
                    usecols=["book_id",
                             "original_publication_year",
                             "title",
                             "average_rating"])
books.head(5)

Unnamed: 0,book_id,original_publication_year,title,average_rating
0,2767052,2008.0,"The Hunger Games (The Hunger Games, #1)",4.34
1,3,1997.0,Harry Potter and the Sorcerer's Stone (Harry P...,4.44
2,41865,2005.0,"Twilight (Twilight, #1)",3.57
3,2657,1960.0,To Kill a Mockingbird,4.25
4,4671,1925.0,The Great Gatsby,3.89


In [641]:
rating = pd.read_csv("/kaggle/input/goodbooks-10k/ratings.csv")
rating.head(5)

Unnamed: 0,book_id,user_id,rating
0,1,314,5
1,1,439,3
2,1,588,5
3,1,1169,4
4,1,1185,4


<div style="border-radius:10px; border:#6B8BA0 solid; padding: 15px; background-color: #F2EADF; font-size:100%; text-align:left">

<h3 align="left"><font color='#6B8BA0'>👀 Features: </font></h3>

**Books Data:**
1. **book_id:** Unique identifier for each book.
2. **original_publication_year:** The year when the book was originally published.
3. **title:** The title of the book.
4. **average_rating:** The average rating given by users to the book.

**Rating Data:**
1. **book_id:** Unique identifier for each book, linking the rating to a specific book.
2. **user_id:** Unique identifier for each user providing a rating.
3. **rating:** The rating given by the user to the specific book.


# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#A16B56; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #E2D7A7">📚 Item-Based Recommendation</p>

In [642]:
df = pd.merge(books, rating, on="book_id", how="inner")
df.head()

Unnamed: 0,book_id,original_publication_year,title,average_rating,user_id,rating
0,3,1997.0,Harry Potter and the Sorcerer's Stone (Harry P...,4.44,314,3
1,3,1997.0,Harry Potter and the Sorcerer's Stone (Harry P...,4.44,588,1
2,3,1997.0,Harry Potter and the Sorcerer's Stone (Harry P...,4.44,2077,2
3,3,1997.0,Harry Potter and the Sorcerer's Stone (Harry P...,4.44,2487,3
4,3,1997.0,Harry Potter and the Sorcerer's Stone (Harry P...,4.44,2900,3


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Merged the books and ratings data on `book_id` with an inner join.

In [643]:
df["book_id"].value_counts()

book_id
3       100
415     100
6538    100
5128    100
3061    100
       ... 
6819     60
9762     60
9418     59
8964     59
9534     57
Name: count, Length: 812, dtype: int64

In [644]:
user_df = df.groupby(["user_id","title"])["rating"].mean().unstack().notnull()
user_df

user_id,'Salem's Lot,"'Tis (Frank McCourt, #2)",1421: The Year China Discovered America,1776,1984,A Bend in the River,A Bend in the Road,A Brief History of Time,A Briefer History of Time,A Case of Need,...,"Women in Love (Brangwen Family, #2)",World War Z: An Oral History of the Zombie War,"World Without End (The Kingsbridge Series, #2)",Wuthering Heights,"Xenocide (Ender's Saga, #3)",Year of Wonders,You Shall Know Our Velocity!,Zen and the Art of Motorcycle Maintenance: An Inquiry Into Values,Zodiac,number9dream
2,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
7,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
9,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
53419,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
53420,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
53422,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
53423,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Created a dataframe with user_ids as the index, titles as columns, and boolean values indicating whether a user has rated a particular book or not.
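
The same reshaping is easier to see on a tiny example. This is only a sketch on a made-up `toy` dataframe (3 users, 3 titles), using the same `groupby` / `unstack` / `notnull` chain as above.

```python
import pandas as pd

# Hypothetical toy ratings, not taken from the goodbooks data.
toy = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "title":   ["A", "B", "A", "C", "B"],
    "rating":  [5, 3, 4, 2, 4],
})

# One row per user, one column per title; True where the user rated the title.
rated = toy.groupby(["user_id", "title"])["rating"].mean().unstack().notnull()
print(rated)
```

Each `True` marks one (user, title) pair that appears in the ratings; every other cell is `False`.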

In [645]:
random_book = pd.Series(user_df.columns).sample(1,random_state=42).values[0]
random_book

'Heidi'

In [646]:
book_name = user_df[random_book]
user_df.corrwith(book_name).sort_values(ascending=False).head(10)

title
Heidi                                                           1.000000
Harry Potter Collection (Harry Potter, #1-6)                    0.528368
Harry Potter and the Order of the Phoenix (Harry Potter, #5)    0.518334
Harry Potter and the Prisoner of Azkaban (Harry Potter, #3)     0.488230
Harry Potter and the Half-Blood Prince (Harry Potter, #6)       0.488230
Heretics of Dune (Dune Chronicles #5)                           0.478195
Harry Potter Boxed Set, Books 1-5 (Harry Potter, #1-5)          0.478195
The Lord of the Rings (The Lord of the Rings, #1-3)             0.468160
Notes from a Small Island                                       0.460460
Harry Potter and the Sorcerer's Stone (Harry Potter, #1)        0.448091
dtype: float64

<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

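*  Picked a random book and listed the 10 books most correlated with it. Because the matrix is boolean, the correlation measures how often two books are rated by the same users, not how similar the ratings are.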
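
Here is a minimal sketch of the same `corrwith` step on a made-up 0/1 matrix, to show what the correlation picks up. The matrix and the column names `A`, `B`, `C` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical "has rated" matrix: rows are users, columns are books (1 = rated).
rated = pd.DataFrame(
    {"A": [1, 1, 0, 0],
     "B": [1, 1, 0, 1],
     "C": [0, 0, 1, 1]},
    index=pd.Index([1, 2, 3, 4], name="user_id"),
)

# Correlate every book column with book "A" and sort from most to least similar.
sims = rated.corrwith(rated["A"]).sort_values(ascending=False)
print(sims)
```

Book `B` is rated by mostly the same users as `A`, so it gets a positive score, while `C` is rated by exactly the users who skipped `A` and ends up at -1. The book itself always comes first with a correlation of 1, so in practice it should be dropped from the list.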

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#A16B56; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #E2D7A7">🕵️ User-Based Recommendation 🕵️‍♀️</p>

In [647]:
user_df = df.groupby(["user_id","title"])["rating"].mean().unstack()
user_df

user_id,'Salem's Lot,"'Tis (Frank McCourt, #2)",1421: The Year China Discovered America,1776,1984,A Bend in the River,A Bend in the Road,A Brief History of Time,A Briefer History of Time,A Case of Need,...,"Women in Love (Brangwen Family, #2)",World War Z: An Oral History of the Zombie War,"World Without End (The Kingsbridge Series, #2)",Wuthering Heights,"Xenocide (Ender's Saga, #3)",Year of Wonders,You Shall Know Our Velocity!,Zen and the Art of Motorcycle Maintenance: An Inquiry Into Values,Zodiac,number9dream
2,,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
7,,,,,,,,,,,...,,,,,,,,,,
9,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
53419,,,,,,,,,,,...,,,,,,,,,,
53420,,,,,,,,,,,...,,,,,,,,,,
53422,,,,,,,,,,,...,,,,,,,,,,
53423,,,,,,,,,,,...,,,,,,,,,,


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Created a dataframe with user_ids as the index, titles as columns, and the values representing the ratings given by each user to every book.

In [648]:
random_user = user_df.sample(1,random_state=689).index[0]

In [649]:
random_user_df = user_df[user_df.index == random_user]
random_user_df

user_id,'Salem's Lot,"'Tis (Frank McCourt, #2)",1421: The Year China Discovered America,1776,1984,A Bend in the River,A Bend in the Road,A Brief History of Time,A Briefer History of Time,A Case of Need,...,"Women in Love (Brangwen Family, #2)",World War Z: An Oral History of the Zombie War,"World Without End (The Kingsbridge Series, #2)",Wuthering Heights,"Xenocide (Ender's Saga, #3)",Year of Wonders,You Shall Know Our Velocity!,Zen and the Art of Motorcycle Maintenance: An Inquiry Into Values,Zodiac,number9dream
20467,,,,,,,,,,,...,,,,,,,,,,


In [650]:
book_read = random_user_df.dropna(axis=1).columns.tolist()
book_read

['Burmese Days',
 'Daniel Deronda',
 'Freakonomics: A Rogue Economist Explores the Hidden Side of Everything (Freakonomics, #1)',
 'Harry Potter and the Half-Blood Prince (Harry Potter, #6)',
 'Harry Potter and the Prisoner of Azkaban (Harry Potter, #3)',
 'Heidi',
 'I am Charlotte Simmons',
 'Me Talk Pretty One Day',
 'Quicksilver (The Baroque Cycle, #1)',
 'The 158-Pound Marriage',
 'The Broken Wings',
 'The Corrections',
 'The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory',
 'The Fellowship of the Ring (The Lord of the Rings, #1)',
 "The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy, #1)",
 'The Known World',
 'The Long Dark Tea-Time of the Soul (Dirk Gently, #2)',
 'The Lord of the Rings (The Lord of the Rings, #1-3)',
 'The Lord of the Rings: The Art of The Fellowship of the Ring',
 'The Phantom Tollbooth',
 'Tropic of Cancer']

<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Picked a random user and obtained the list of books they have read

In [651]:
book_read_df = user_df[book_read]
book_read_df

user_id,Burmese Days,Daniel Deronda,"Freakonomics: A Rogue Economist Explores the Hidden Side of Everything (Freakonomics, #1)","Harry Potter and the Half-Blood Prince (Harry Potter, #6)","Harry Potter and the Prisoner of Azkaban (Harry Potter, #3)",Heidi,I am Charlotte Simmons,Me Talk Pretty One Day,"Quicksilver (The Baroque Cycle, #1)",The 158-Pound Marriage,...,The Corrections,"The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory","The Fellowship of the Ring (The Lord of the Rings, #1)","The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy, #1)",The Known World,"The Long Dark Tea-Time of the Soul (Dirk Gently, #2)","The Lord of the Rings (The Lord of the Rings, #1-3)",The Lord of the Rings: The Art of The Fellowship of the Ring,The Phantom Tollbooth,Tropic of Cancer
2,,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
7,,,,,,,,,,,...,,,,,,,,,,
9,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
53419,,,,,,,,,,,...,,,,,,,,,,
53420,,,,,,,,,,,...,,,,,,,,,,
53422,,,,,,,,,,,...,,,,,,,,,,
53423,,,,,,,,,,,...,,,,,,,,,,


In [652]:
user_book_count  = book_read_df.notnull().sum(axis=1)
user_book_count.max()

21

In [653]:
user_book_count = book_read_df.notnull().sum(axis=1)
user_book_count

user_id
2        0
3        0
4        0
7        0
9        0
        ..
53419    0
53420    0
53422    0
53423    0
53424    0
Length: 28906, dtype: int64

<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Counted, for every user, how many of the randomly selected user's books they have also read.

In [654]:
users_same_books = user_book_count[user_book_count > (book_read_df.shape[1] * 30 ) / 100].index
users_same_books

Index([  588,  1952,  5461,  6342,  9246, 10111, 10727, 10944, 11692, 11927,
       11945, 12381, 13544, 13794, 17984, 18031, 18361, 20467, 22602, 23576,
       23612, 24326, 26244, 26661, 28767, 29703, 32592, 32748, 32923, 33065,
       33716, 33872, 38080, 42404, 44397, 45554, 47800, 48559, 48687, 51166],
      dtype='int64', name='user_id')

<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Readers who have read more than 30% of the books read by the randomly selected user (the threshold is written as `30 / 100` in the code).
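
The overlap filter can be sketched on a small made-up rating matrix. The user ids, the book names and the `target` variable are hypothetical; only the 30% threshold is taken from the cell above.

```python
import numpy as np
import pandas as pd

# Hypothetical user x book rating matrix (NaN = not rated).
ratings = pd.DataFrame(
    {"A": [5, 4, np.nan, np.nan],
     "B": [3, np.nan, np.nan, 2],
     "C": [4, 5, np.nan, np.nan],
     "D": [np.nan, 1, 5, np.nan]},
    index=pd.Index([10, 11, 12, 13], name="user_id"),
)

target = 10                                            # the "random" user
books_read = ratings.loc[target].dropna().index.tolist()

# Number of the target user's books that each user has also rated.
overlap = ratings[books_read].notnull().sum(axis=1)

# Keep users who share more than 30% of those books.
threshold = len(books_read) * 30 / 100
similar_users = overlap[overlap > threshold].index
print(overlap.to_dict(), list(similar_users))
```

Note that the target user always passes the filter, since they share 100% of their own books; this is why the selected user's id also shows up in the index above.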

In [655]:
filted_df = book_read_df[book_read_df.index.isin(users_same_books)]
filted_df.head()

user_id,Burmese Days,Daniel Deronda,"Freakonomics: A Rogue Economist Explores the Hidden Side of Everything (Freakonomics, #1)","Harry Potter and the Half-Blood Prince (Harry Potter, #6)","Harry Potter and the Prisoner of Azkaban (Harry Potter, #3)",Heidi,I am Charlotte Simmons,Me Talk Pretty One Day,"Quicksilver (The Baroque Cycle, #1)",The 158-Pound Marriage,...,The Corrections,"The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory","The Fellowship of the Ring (The Lord of the Rings, #1)","The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy, #1)",The Known World,"The Long Dark Tea-Time of the Soul (Dirk Gently, #2)","The Lord of the Rings (The Lord of the Rings, #1-3)",The Lord of the Rings: The Art of The Fellowship of the Ring,The Phantom Tollbooth,Tropic of Cancer
588,,,,5.0,,,2.0,,,,...,,,,4.0,4.0,,3.0,,4.0,4.0
1952,,4.0,,,5.0,4.0,,,,,...,,,,5.0,4.0,,4.0,5.0,4.0,
5461,,4.0,4.0,3.0,5.0,5.0,4.0,,,,...,,,,4.0,,4.0,4.0,,,
6342,,4.0,4.0,,5.0,,,,,,...,,,,5.0,5.0,,4.0,4.0,,
9246,,,,1.0,3.0,,3.0,,,,...,,,,5.0,5.0,3.0,4.0,2.0,5.0,


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Kept only the users who passed the 30% overlap filter; the columns are the books that the randomly selected reader has read.

In [656]:
corr_df = filted_df.T.corr().unstack().drop_duplicates()
corr_df

user_id  user_id
588      588        1.000000
         1952       0.333333
         5461      -0.774597
         6342       1.000000
         9246      -0.161165
                      ...   
44397    51166      0.482382
45554    48559      0.157243
         51166     -0.904534
47800    48559      0.321288
48559    51166     -0.166667
Length: 582, dtype: float64

<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Calculated the pairwise correlation (Pearson, the pandas default) between the users' ratings of these books.
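
A small sketch of the user-user correlation step on made-up ratings follows. One thing to watch: `drop_duplicates()` works on the correlation values, so it removes any pair whose value happens to equal an earlier one, not only the mirrored (u, v) / (v, u) pairs. The variant below keeps each unordered pair exactly once with `itertools.combinations` instead; this is an alternative I am assuming for illustration, not what the cell above does.

```python
from itertools import combinations

import pandas as pd

# Hypothetical ratings: rows are users, columns are books.
ratings = pd.DataFrame(
    {"A": [5, 4, 1], "B": [3, 2, 5], "C": [4, 3, 2]},
    index=pd.Index([10, 11, 12], name="user_id"),
)

# Transpose so that users become columns; corr() then compares users pairwise.
corr = ratings.T.corr()

# Keep every unordered pair of distinct users exactly once.
pairs = pd.Series({(u, v): corr.loc[u, v] for u, v in combinations(corr.index, 2)})
print(pairs)
```

Users 10 and 11 rate the books in the same order, so their correlation is 1; user 12 has the opposite taste and correlates negatively with both.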

In [657]:
top_readers = pd.DataFrame(corr_df[random_user][corr_df[random_user] > 0.70], columns=["corr"])
top_readers

user_id,corr
23612,0.871044
26661,0.782624
33716,0.744157


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Found the users whose correlation with the random user is above 0.70.

In [658]:
top_readers_ratings = pd.merge(top_readers, df[["user_id", "book_id", "rating"]], how='inner', on="user_id")
top_readers_ratings

Unnamed: 0,user_id,corr,book_id,rating
0,23612,0.871044,3,3
1,23612,0.871044,34,1
2,23612,0.871044,2,5
3,23612,0.871044,968,3
4,23612,0.871044,1,4
...,...,...,...,...
72,33716,0.744157,36,3
73,33716,0.744157,647,3
74,33716,0.744157,304,5
75,33716,0.744157,6670,3


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Merged the correlations with the ratings dataframe to get the book_ids and ratings of these users.

In [659]:
top_readers_ratings['weighted_rating'] = top_readers_ratings['corr'] * top_readers_ratings['rating']
top_readers_ratings

Unnamed: 0,user_id,corr,book_id,rating,weighted_rating
0,23612,0.871044,3,3,2.613133
1,23612,0.871044,34,1,0.871044
2,23612,0.871044,2,5,4.355222
3,23612,0.871044,968,3,2.613133
4,23612,0.871044,1,4,3.484178
...,...,...,...,...,...
72,33716,0.744157,36,3,2.232472
73,33716,0.744157,647,3,2.232472
74,33716,0.744157,304,5,3.720787
75,33716,0.744157,6670,3,2.232472


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Calculated a weighted rating for every row by multiplying each rating by that user's correlation with the random user.

In [660]:
recommendation_df = top_readers_ratings.pivot_table(values="weighted_rating", index="book_id", aggfunc="mean")
recommendation_df.head()

book_id,weighted_rating
1,3.445153
2,4.038004
3,2.422803
5,3.444874
8,3.444874


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Calculated the mean weighted rating for every book.
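
The weighting and averaging steps can be checked by hand on a few made-up rows. The `neighbours` dataframe below is hypothetical; the column names follow the ones used above.

```python
import pandas as pd

# Hypothetical ratings from two similar users, with their correlation to the target user.
neighbours = pd.DataFrame({
    "user_id": [1, 1, 2, 2],
    "corr":    [0.9, 0.9, 0.8, 0.8],
    "book_id": [100, 200, 100, 300],
    "rating":  [5, 3, 4, 5],
})

# Weight each rating by how similar the rater is to the target user.
neighbours["weighted_rating"] = neighbours["corr"] * neighbours["rating"]

# Average the weighted ratings per book and rank the books.
scores = (neighbours.groupby("book_id")["weighted_rating"]
          .mean()
          .sort_values(ascending=False))
print(scores)
```

Book 100 gets (0.9 * 5 + 0.8 * 4) / 2 = 3.85, while book 300 scores 4.0 from a single rating. A book rated by only one similar user can therefore outrank books with more support, which is worth keeping in mind when reading the final list.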

In [661]:
books_recommend = recommendation_df[recommendation_df["weighted_rating"] > 3.5].sort_values(by="weighted_rating", ascending=False).head(5)
books_recommend

book_id,weighted_rating
21,4.355222
24,4.355222
25,4.355222
27,4.355222
1274,4.355222


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Kept the books whose mean weighted rating is above 3.5 and took the top 5 of them.

In [662]:
books[books["book_id"].isin(books_recommend.index)]

Unnamed: 0,book_id,original_publication_year,title,average_rating
373,21,2003.0,A Short History of Nearly Everything,4.19
572,1274,1998.0,"Men Are from Mars, Women Are from Venus",3.52
1459,24,2000.0,In a Sunburned Country,4.05
1975,25,1998.0,I'm a Stranger Here Myself: Notes on Returning...,3.89
2278,27,1991.0,Neither Here nor There: Travels in Europe,3.88


### <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#5E1C9F; font-size:140%; text-align:center;padding: 0px; border-bottom: 3px solid #FFFAF0">Recommendations</p>
<center><img src="https://i.imgur.com/By3JsnC.png"></center>

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#A16B56; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #E2D7A7">🤖 Model-Based Recommendation 💻</p>

In [663]:
from surprise import Reader, SVD, Dataset, accuracy
from surprise.model_selection import GridSearchCV, train_test_split, cross_validate

In [664]:
user_id = df["user_id"].sample(1,random_state=42).values.tolist()[0]
user_id

45029

In [665]:
sample_user = df[df["user_id"] == user_id]
sample_user

Unnamed: 0,book_id,original_publication_year,title,average_rating,user_id,rating
5311,2165,1952.0,The Old Man and the Sea,3.73,45029,4
7552,5359,1993.0,The Client,3.97,45029,4
19233,2373,1997.0,"The Bone Collector (Lincoln Rhyme, #1)",4.18,45029,4
61309,4630,1937.0,To Have and Have Not,3.57,45029,4
62878,5548,1988.0,What Do You Care What Other People Think?,4.28,45029,4


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Our random user has rated only 5 books in this data, so we will use a machine learning model to predict their ratings for the books they haven't read.

In [666]:
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(df[['user_id','book_id','rating']], reader)

In [667]:
trainset, testset = train_test_split(data, test_size=.25, random_state=42)
svd_model = SVD(random_state=42)
svd_model.fit(trainset)
predictions = svd_model.test(testset)
accuracy.rmse(predictions)

RMSE: 0.9115


0.9114649631121562

<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

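*  Trained an SVD model on 75% of the ratings and evaluated it on the remaining 25%. The RMSE is about 0.91 on the 1-5 rating scale.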
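
For reference, RMSE is the square root of the mean squared difference between the actual and the predicted ratings. A tiny worked example with made-up numbers (not taken from the test set):

```python
import numpy as np

# Hypothetical actual and predicted ratings for four (user, book) pairs.
actual = np.array([4, 3, 5, 2])
predicted = np.array([3.5, 3.0, 4.0, 3.0])

# Errors: 0.5, 0, 1, -1 -> squared: 0.25, 0, 1, 1 -> mean: 0.5625 -> sqrt: 0.75
rmse = np.sqrt(np.mean((actual - predicted) ** 2))
print(rmse)
```

So an RMSE of roughly 0.9 means the predictions are typically off by a bit less than one star.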

In [668]:
def suggest(df, user_id, sug):
    # Books the user has already rated; these must not be suggested again.
    read = set(df.loc[df["user_id"] == user_id, "book_id"])
    didnt_read = [b for b in df["book_id"].drop_duplicates() if b not in read]
    # Predicted rating (the `est` field of the prediction) for every unread book.
    temp_dict = {i: svd_model.predict(uid=user_id, iid=i).est for i in didnt_read}
    suggestions = pd.DataFrame(temp_dict.items(), columns=["book_id", "possible_rate"]).sort_values(by="possible_rate", ascending=False).head(sug)
    merged = pd.merge(suggestions, books[["book_id", "title"]], how="inner", on="book_id")
    return merged

In [669]:
suggest(df, user_id, 5)

Unnamed: 0,book_id,possible_rate,title
0,9566,4.823831,Still Life with Woodpecker
1,9531,4.760682,Peter and the Shadow Thieves (Peter and the St...
2,3885,4.718368,The Taste of Home Cookbook
3,4708,4.709676,The Beautiful and Damned
4,6423,4.66215,"High Five (Stephanie Plum, #5)"


<div style="border-radius:10px; border:#484366 solid; padding: 15px; background-color: #FFEBCC; font-size:100%; text-align:left">

<h3 align="left"><font color='#484366'>💬 Comment</font></h3>

*  Suggested 5 books to the random user based on the ratings the model predicts they would give.
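
The suggestion step itself does not depend on the model. Here is a minimal sketch with a hypothetical `predict_rating` stand-in (a fixed lookup table instead of a fitted model) and a made-up `ratings` dataframe, to show the "score every unread book, keep the top N" logic in isolation.

```python
import pandas as pd

# Hypothetical ratings log: which user rated which book.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "book_id": [10, 20, 10, 30, 40],
})

def predict_rating(user_id, book_id):
    # Stand-in for a fitted model: returns a fixed, made-up score per book.
    return {10: 4.5, 20: 3.0, 30: 4.8, 40: 3.9}[book_id]

def top_n_unread(ratings, user_id, n):
    # Books the user already rated are excluded from the candidates.
    read = set(ratings.loc[ratings["user_id"] == user_id, "book_id"])
    candidates = [b for b in ratings["book_id"].unique() if b not in read]
    scored = pd.Series({b: predict_rating(user_id, b) for b in candidates})
    return scored.sort_values(ascending=False).head(n)

top = top_n_unread(ratings, user_id=1, n=2)
print(top)
```

User 1 has already rated books 10 and 20, so only books 30 and 40 are scored, and book 10 is left out even though its score is high.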

### <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#5E1C9F; font-size:140%; text-align:center;padding: 0px; border-bottom: 3px solid #FFFAF0">Recommendations</p>
<center><img src="https://i.imgur.com/Kg0vc0k.png"></center>