# **Book Recommendation System**

## **Objective**
The goal is to develop a **book recommendation system** that suggests books to users based on their past interactions.  
We will implement multiple recommendation techniques, combining **neighborhood-based collaborative filtering (KNN and cosine similarity), matrix factorization (SVD), and a hybrid model** to provide personalized recommendations.

## **Criteria for Users and Books**
To ensure meaningful recommendations, we apply the following filtering criteria:

1. **User Selection**  
   - Consider **only users who have rated at least 200 books**.  
   - This ensures that recommendations are based on users with enough reading history.
   
2. **Book Selection**  
   - Include **only books that have received at least 50 ratings**.  
   - This ensures that the books considered are popular enough to have reliable ratings.
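
The two criteria can be sketched on a toy frame. This is only an illustration: the data and the lowered thresholds (`MIN_USER_RATINGS`, `MIN_BOOK_RATINGS`) are assumptions chosen so the effect is visible on six rows.

```python
import pandas as pd

# Toy ratings; values are made up for illustration
ratings = pd.DataFrame({
    "User-ID":     [1, 1, 1, 2, 2, 3],
    "Book-Title":  ["A", "B", "C", "A", "B", "A"],
    "Book-Rating": [5, 0, 8, 7, 9, 10],
})
MIN_USER_RATINGS = 2   # the notebook uses 200
MIN_BOOK_RATINGS = 2   # the notebook uses 50

# transform("count") broadcasts each group's size back onto every row,
# so the result can be used directly as a boolean row filter
user_counts = ratings.groupby("User-ID")["Book-Rating"].transform("count")
active = ratings[user_counts >= MIN_USER_RATINGS]

# The book filter is applied to the already-filtered users
book_counts = active.groupby("Book-Title")["Book-Rating"].transform("count")
filtered = active[book_counts >= MIN_BOOK_RATINGS]
print(filtered)
```

Note that the book threshold is evaluated after the user filter, so a book must collect its ratings from the retained users only.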

---

## **Recommendation Models Used**

### **1️⃣ K-Nearest Neighbors (KNN)**
KNN is a **collaborative filtering approach** that recommends books based on user-book interactions.  
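- It identifies the **k nearest users or books** based on rating similarity.
- We use **cosine distance** (1 minus cosine similarity) as the metric to find similar books or users.
- **User-based KNN**: recommends books rated by the users most similar to the target user (used in the KNN section).
- **Item-based KNN**: finds books similar to a given book (used in the hybrid model).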

---

### **2️⃣ Cosine Similarity-Based Model**
Instead of relying on matrix factorization, we use **pure cosine similarity** to recommend books:
- **User-User Cosine Similarity**: Finds similar users and recommends books they liked.
- **Item-Item Cosine Similarity**: Recommends books similar to the ones a user has rated highly.
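
As a minimal sketch of what cosine similarity measures, the toy matrix below (made-up ratings, rows as users) compares a hand-written formula with scikit-learn's `cosine_similarity`. The helper name `cosine` is hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Rows are users, columns are books; 0 means "not rated" (toy data)
ratings = np.array([
    [10, 0, 8],
    [ 5, 0, 4],   # same taste as user 0, at half the scale
    [ 0, 9, 0],   # no books in common with users 0 and 1
])

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim = cosine_similarity(ratings)  # pairwise user-user similarities
print(np.round(sim, 3))
```

Users 0 and 1 get a similarity of 1 because cosine similarity ignores the magnitude of the ratings and only compares their direction; users with no rated books in common get 0.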

---

### **3️⃣ Singular Value Decomposition (SVD) – Surprise Library**
SVD is a **matrix factorization technique** that copes with sparse rating data by identifying hidden patterns in user-book interactions.
- It decomposes the **user-item interaction matrix** into latent factors representing users and books.
- The model predicts book ratings by capturing underlying preferences.
- Implemented using the **Surprise library's `SVD` algorithm**.
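
For reference, the Surprise documentation gives the prediction rule of its `SVD` algorithm as a global mean plus user and item biases plus a dot product of latent factor vectors. The symbols below follow that documentation and are not used elsewhere in this notebook:

$$
\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top} p_u
$$

Here $\mu$ is the mean of all training ratings, $b_u$ and $b_i$ are the learned user and item biases, and $p_u$, $q_i$ are the latent factor vectors of user $u$ and book $i$. When the user or the book is unknown to the model, the corresponding bias and factor vector are taken to be zero.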

---

### **4️⃣ Hybrid Model (SVD + KNN)**
A **hybrid approach** combining **SVD for rating prediction** and **KNN for book similarity-based recommendations**:
1. **SVD predicts book ratings** for a user.
2. **Top-N books from SVD predictions** are selected.
3. **KNN finds books similar** to those highly rated by SVD.
4. The final recommendation is a **combination of both techniques**.

This **hybrid method aims to balance accuracy and diversity** by drawing on the strengths of both models.

---

## **Final Output**
The system generates book recommendations using:
✅ **KNN-based Collaborative Filtering**  
✅ **Cosine Similarity-Based Filtering**  
✅ **SVD (Matrix Factorization) from Surprise**  
✅ **Hybrid Model (SVD + KNN)**  

This ensures a **robust and diverse recommendation system** that provides personalized book suggestions. 📚✨

## **KNN-Based Recommendations**

In [1]:
import pandas as pd
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.sparse import csr_matrix

In [2]:
data = pd.read_csv("../artifacts/cleaned_data.csv",encoding='ISO-8859-1')

In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1031136 entries, 0 to 1031135
Data columns (total 12 columns):
 #   Column               Non-Null Count    Dtype  
---  ------               --------------    -----  
 0   User-ID              1031136 non-null  int64  
 1   ISBN                 1031136 non-null  object 
 2   Book-Rating          1031136 non-null  int64  
 3   Book-Title           1031136 non-null  object 
 4   Book-Author          1031134 non-null  object 
 5   Year-Of-Publication  1031136 non-null  float64
 6   Publisher            1031134 non-null  object 
 7   Image-URL-M          1031136 non-null  object 
 8   Age                  1031136 non-null  float64
 9   City                 1013763 non-null  object 
 10  State                987803 non-null   object 
 11  Country              992503 non-null   object 
dtypes: float64(2), int64(2), object(8)
memory usage: 94.4+ MB


In [4]:
data

Unnamed: 0,User-ID,ISBN,Book-Rating,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-M,Age,City,State,Country
0,276725,034545104X,0,Flesh Tones: A Novel,M. J. Rose,2002.0,Ballantine Books,http://images.amazon.com/images/P/034545104X.0...,35.0,tyler,texas,usa
1,276726,0155061224,5,Rites of Passage,Judith Rae,2001.0,Heinle,http://images.amazon.com/images/P/0155061224.0...,35.0,seattle,washington,usa
2,276727,0446520802,0,The Notebook,Nicholas Sparks,1996.0,Warner Books,http://images.amazon.com/images/P/0446520802.0...,16.0,h,new south wales,australia
3,276729,052165615X,3,Help!: Level 1,Philip Prowse,1999.0,Cambridge University Press,http://images.amazon.com/images/P/052165615X.0...,16.0,rijeka,,croatia
4,276729,0521795028,6,The Amsterdam Connection : Level 4 (Cambridge ...,Sue Leather,2001.0,Cambridge University Press,http://images.amazon.com/images/P/0521795028.0...,16.0,rijeka,,croatia
...,...,...,...,...,...,...,...,...,...,...,...,...
1031131,276704,0876044011,0,Edgar Cayce on the Akashic Records: The Book o...,Kevin J. Todeschi,1998.0,A.R.E. Press (Association of Research &amp; Enlig,http://images.amazon.com/images/P/0876044011.0...,35.0,cedar park,texas,usa
1031132,276704,1563526298,9,Get Clark Smart : The Ultimate Guide for the S...,Clark Howard,2000.0,Longstreet Press,http://images.amazon.com/images/P/1563526298.0...,36.0,cedar park,texas,usa
1031133,276706,0679447156,0,Eight Weeks to Optimum Health: A Proven Progra...,Andrew Weil,1997.0,Alfred A. Knopf,http://images.amazon.com/images/P/0679447156.0...,18.0,quebec,quebec,canada
1031134,276709,0515107662,10,The Sherbrooke Bride (Bride Trilogy (Paperback)),Catherine Coulter,1996.0,Jove Books,http://images.amazon.com/images/P/0515107662.0...,38.0,mannington,west virginia,usa


### **Counting the ratings per user** 

In [5]:
data.groupby('User-ID')['Book-Rating'].count().reset_index()


Unnamed: 0,User-ID,Book-Rating
0,2,1
1,8,17
2,9,3
3,10,1
4,12,1
...,...,...
92101,278846,1
92102,278849,4
92103,278851,23
92104,278852,1


- Only about 92k users in total have rated books.
- The majority of users have not rated any books.

### **Filtering the data to users who have rated at least 200 books**

In [6]:
users_with_200_ratings_data = data.loc[data.groupby('User-ID')['Book-Rating'].transform('count') >= 200]

In [7]:
users_with_200_ratings_data

Unnamed: 0,User-ID,ISBN,Book-Rating,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-M,Age,City,State,Country
1150,277427,002542730X,10,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994.0,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...,48.0,gilbert,arizona,usa
1151,277427,0026217457,0,Vegetarian Times Complete Cookbook,Lucy Moll,1995.0,John Wiley &amp; Sons,http://images.amazon.com/images/P/0026217457.0...,48.0,gilbert,arizona,usa
1152,277427,003008685X,8,Pioneers,James Fenimore Cooper,1974.0,Thomson Learning,http://images.amazon.com/images/P/003008685X.0...,48.0,gilbert,arizona,usa
1153,277427,0030615321,0,"Ask for May, Settle for June (A Doonesbury book)",G. B. Trudeau,1982.0,Henry Holt &amp; Co,http://images.amazon.com/images/P/0030615321.0...,48.0,gilbert,arizona,usa
1154,277427,0060002050,0,On a Wicked Dawn (Cynster Novels),Stephanie Laurens,2002.0,Avon Books,http://images.amazon.com/images/P/0060002050.0...,48.0,gilbert,arizona,usa
...,...,...,...,...,...,...,...,...,...,...,...,...
1029357,275970,1931868123,0,There's a Porcupine in My Outhouse: Misadventu...,Mike Tougias,2002.0,Capital Books (VA),http://images.amazon.com/images/P/1931868123.0...,46.0,pittsburgh,pennsylvania,usa
1029358,275970,3411086211,10,Die Biene.,Sybil GrÃ?ÃÂ¤fin SchÃ?ÃÂ¶nfeldt,1993.0,"Bibliographisches Institut, Mannheim",http://images.amazon.com/images/P/3411086211.0...,46.0,pittsburgh,pennsylvania,usa
1029359,275970,3829021860,0,The Penis Book,Joseph Cohen,1999.0,Konemann,http://images.amazon.com/images/P/3829021860.0...,46.0,pittsburgh,pennsylvania,usa
1029360,275970,4770019572,0,Musashi,Eiji Yoshikawa,1995.0,Kodansha International (JPN),http://images.amazon.com/images/P/4770019572.0...,46.0,pittsburgh,pennsylvania,usa


- Users with **200+** ratings have rated **50%** of the books

---

### **Filtering this data to books with 50+ ratings**

In [8]:
users_with_200_ratings_data.groupby('Book-Title')['Book-Rating'].count().reset_index()

Unnamed: 0,Book-Title,Book-Rating
0,A Light in the Storm: The Civil War Diary of ...,2
1,Always Have Popsicles,1
2,Apple Magic (The Collector's series),1
3,Beyond IBM: Leadership Marketing and Finance ...,1
4,Clifford Visita El Hospital (Clifford El Gran...,1
...,...,...
156132,Ã?Ã?ber das Fernsehen.,2
156133,Ã?Ã?ber die Pflicht zum Ungehorsam gegen den...,3
156134,Ã?Ã?lpiraten.,1
156135,Ã?Ã?stlich der Berge.,1


In [9]:
final_filtered_data = users_with_200_ratings_data.loc[users_with_200_ratings_data.groupby('Book-Title')['Book-Rating'].transform('count') >= 50]

In [10]:
final_filtered_data

Unnamed: 0,User-ID,ISBN,Book-Rating,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-M,Age,City,State,Country
1150,277427,002542730X,10,Politically Correct Bedtime Stories: Modern Ta...,James Finn Garner,1994.0,John Wiley &amp; Sons Inc,http://images.amazon.com/images/P/002542730X.0...,48.0,gilbert,arizona,usa
1163,277427,0060930535,0,The Poisonwood Bible: A Novel,Barbara Kingsolver,1999.0,Perennial,http://images.amazon.com/images/P/0060930535.0...,48.0,gilbert,arizona,usa
1165,277427,0060934417,0,Bel Canto: A Novel,Ann Patchett,2002.0,Perennial,http://images.amazon.com/images/P/0060934417.0...,48.0,gilbert,arizona,usa
1168,277427,0061009059,9,One for the Money (Stephanie Plum Novels (Pape...,Janet Evanovich,1995.0,HarperTorch,http://images.amazon.com/images/P/0061009059.0...,48.0,gilbert,arizona,usa
1174,277427,006440188X,0,The Secret Garden,Frances Hodgson Burnett,1998.0,HarperTrophy,http://images.amazon.com/images/P/006440188X.0...,48.0,gilbert,arizona,usa
...,...,...,...,...,...,...,...,...,...,...,...,...
1029196,275970,1400031354,0,Tears of the Giraffe (No.1 Ladies Detective Ag...,Alexander McCall Smith,2002.0,Anchor,http://images.amazon.com/images/P/1400031354.0...,46.0,pittsburgh,pennsylvania,usa
1029197,275970,1400031362,0,Morality for Beautiful Girls (No.1 Ladies Dete...,Alexander McCall Smith,2002.0,Anchor,http://images.amazon.com/images/P/1400031362.0...,46.0,pittsburgh,pennsylvania,usa
1029270,275970,1573229725,0,Fingersmith,Sarah Waters,2002.0,Riverhead Books,http://images.amazon.com/images/P/1573229725.0...,46.0,pittsburgh,pennsylvania,usa
1029309,275970,1586210661,9,Me Talk Pretty One Day,David Sedaris,2001.0,Time Warner Audio Major,http://images.amazon.com/images/P/1586210661.0...,46.0,pittsburgh,pennsylvania,usa


- Only about **58k** rating records remain: ratings of books with at least 50 ratings, given by the top users (**>=200 ratings**)

---

# **Creating the User-Book Interaction Matrix for Recommendations**

## **Why Do We Need This Matrix?**
To implement **collaborative filtering**, we require a structured representation of user-book interactions.  
A **pivot table** helps us transform raw data into a **User-Book interaction matrix**, where:
- **Rows represent users (User-ID).**
- **Columns represent books (Book-Title).**
- **Values represent ratings given by users to books.**

This matrix allows us to analyze **user behavior patterns** and find similarities between users or books.

## **Why is This Approach Effective?**
- The matrix enables **pattern recognition** in user behavior.
- It allows us to compute **similarities between users or books**.
- It forms the foundation for **personalized book recommendations**.

🚀 **This matrix is the backbone of our collaborative filtering-based recommendation system!**
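
A small sketch of the pivot on toy data shows the resulting layout. The frame `toy` is made up; the `pivot_table` arguments mirror the ones used in the cell that follows.

```python
import pandas as pd

# Made-up ratings; user 1 gave book "B" a rating of 0
toy = pd.DataFrame({
    "User-ID":     [1, 1, 2, 3],
    "Book-Title":  ["A", "B", "A", "C"],
    "Book-Rating": [8, 0, 10, 7],
})

# One row per user, one column per book; missing pairs are filled with 0
matrix = toy.pivot_table(index="User-ID", columns="Book-Title",
                         values="Book-Rating", fill_value=0)
print(matrix)
print(matrix.shape)
```

With `fill_value=0`, a book the user never rated and a book the user rated 0 look identical in the matrix, which is worth keeping in mind when interpreting similarities.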

In [11]:
# Create user-item matrix
user_item_matrix = final_filtered_data.pivot_table(index="User-ID", columns="Book-Title", values="Book-Rating", fill_value=0)

### **Creating the sparse matrix**

In [12]:
sparse_matrix = csr_matrix(user_item_matrix.values)

In [13]:
sparse_matrix

<Compressed Sparse Row sparse matrix of dtype 'float64'
	with 14345 stored elements and shape (815, 707)>
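
The summary above can be turned into a density figure with simple arithmetic. The sketch below only reuses the three numbers printed for `sparse_matrix`; the variable names are illustrative.

```python
# Figures taken from the sparse-matrix summary above
stored, n_users, n_books = 14345, 815, 707

# Share of user-book cells that hold a non-zero rating
density = stored / (n_users * n_books)
print(f"density:  {density:.2%}")
print(f"sparsity: {1 - density:.2%}")
```

Roughly 2.5% of the 576,205 user-book cells hold a non-zero rating, which is why a compressed sparse row representation is a reasonable choice here.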

### **Model training**

In [14]:
# Train KNN model
knn_model = NearestNeighbors(metric="cosine", algorithm="brute", n_neighbors=5, n_jobs=-1)
knn_model.fit(sparse_matrix)

### **Recommendations**

In [15]:
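# Find the nearest neighbors of a user in the user-item matrix
def recommend_books(user_id, num_recommendations=5):
    if user_id not in user_item_matrix.index:
        # Raise instead of returning a string, so callers that unpack
        # (distances, indices) fail with a clear message
        raise ValueError(f"User {user_id} not found in dataset.")

    user_index = user_item_matrix.index.get_loc(user_id)
    # Ask for one extra neighbor: the closest match is the user itself
    distances, indices = knn_model.kneighbors(
        sparse_matrix[user_index], n_neighbors=num_recommendations + 1
    )

    return distances, indices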

In [16]:
user_id=6563

In [17]:
distances,indices = recommend_books(user_id=user_id)

In [18]:
distances

array([[0.        , 0.72378366, 0.73647624, 0.75209908, 0.75487998,
        0.76764222]])

- The first element is the target user itself, hence the distance is zero.
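
The values returned by `kneighbors` with `metric="cosine"` are cosine distances, that is, 1 minus the cosine similarity. A minimal sketch of the conversion, using the distances printed above:

```python
import numpy as np

# Cosine distances reported by kneighbors for user 6563 (copied from above)
distances = np.array([0.0, 0.72378366, 0.73647624,
                      0.75209908, 0.75487998, 0.76764222])

# scikit-learn's "cosine" metric is a distance: 1 - cosine similarity
similarities = 1.0 - distances
print(np.round(similarities, 6))
```

The converted values for the nearest neighbors (0.276216, 0.263524, 0.247901, 0.245120) line up with the user-similarity scores shown for the same user in the cosine similarity section, which is why both approaches pick the same neighbors.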

In [19]:
indices

array([[ 10, 494, 269, 282, 769, 802]], dtype=int64)

- These are the row numbers of the similar users in the user-item matrix.

### **The actual User-IDs of the similar users** 

In [20]:
similar_users_knn = [user_item_matrix.index[i] for i in indices[0][1:]]

In [21]:
similar_users_knn

[175003, 101851, 106225, 260897, 271195]

In [22]:
books_rated_by_user = set(user_item_matrix.columns[user_item_matrix.loc[user_id] > 0])

In [23]:
books_rated_by_user

{'A Time to Kill',
 "Angela's Ashes (MMP) : A Memoir",
 'Beach Music',
 'Confessions of an Ugly Stepsister : A Novel',
 "Don't Sweat the Small Stuff and It's All Small Stuff : Simple Ways to Keep the Little Things from Taking Over Your Life (Don't Sweat the Small Stuff Series)",
 'Family Album',
 'Four Blind Mice',
 'Girl in Hyacinth Blue',
 'Good in Bed',
 'Harry Potter and the Goblet of Fire (Book 4)',
 'Harry Potter and the Order of the Phoenix (Book 5)',
 "Harry Potter and the Sorcerer's Stone (Harry Potter (Paperback))",
 'Mother of Pearl',
 'Prodigal Summer',
 "She's Come Undone (Oprah's Book Club)",
 'Smart Women',
 'Summer Sisters',
 'The Bean Trees',
 'The Client',
 'The Five People You Meet in Heaven',
 "The General's Daughter",
 'The Honk and Holler Opening Soon',
 'The Nanny Diaries: A Novel',
 'The Return Journey',
 'The Saving Graces: A Novel',
 'The Secret Life of Bees',
 "Tuesdays with Morrie: An Old Man, a Young Man, and Life's Greatest Lesson",
 'Under the Tuscan Sun'

- These are the books the user has already rated, which means he/she has already read them.
- We therefore exclude these books from the final recommendations.

In [24]:
recommended_books_knn = {}

for sim_user in similar_users_knn:
    sim_user_books = user_item_matrix.loc[sim_user]
    for book, rating in sim_user_books.items():
        if book not in books_rated_by_user and rating > 0 and book not in recommended_books_knn:
            recommended_books_knn[book] = rating

### **Unsorted recommended books**

In [25]:
recommended_books_knn

{'Chicken Soup for the Soul (Chicken Soup for the Soul)': 10.0,
 'Harry Potter and the Chamber of Secrets (Book 2)': 10.0,
 'Harry Potter and the Prisoner of Azkaban (Book 3)': 10.0,
 "Harry Potter and the Sorcerer's Stone (Book 1)": 10.0,
 'Hannibal': 6.0,
 'Lord of the Flies': 8.0,
 'Silence of the Lambs': 7.0,
 "The Bonesetter's Daughter": 10.0,
 'The Firm': 8.0,
 'The Joy Luck Club': 10.0,
 "The Kitchen God's Wife": 9.0,
 'The Tao of Pooh': 10.0,
 'A Painted House': 9.0,
 'Dying for Chocolate (Culinary Mysteries (Paperback))': 7.0,
 "From Potter's Field": 8.0,
 'Girl with a Pearl Earring': 10.0,
 'How to Be Good': 5.0,
 "Suzanne's Diary for Nicholas": 7.0,
 'The Brethren': 6.0,
 'The Chamber': 8.0,
 'The Lovely Bones: A Novel': 8.0,
 'The Pillars of the Earth': 10.0,
 'The Summons': 7.0,
 'The Surgeon': 7.0,
 "White Oleander : A Novel (Oprah's Book Club)": 8.0,
 'Winter Solstice': 6.0,
 '4 Blondes': 6.0,
 'A Heartbreaking Work of Staggering Genius': 10.0,
 "Bridget Jones's Diary": 

### **Sorted**

In [26]:
recommended_books_knn = sorted(recommended_books_knn.items(), key=lambda x: x[1], reverse=True)

In [27]:
recommended_books_knn

[('Chicken Soup for the Soul (Chicken Soup for the Soul)', 10.0),
 ('Harry Potter and the Chamber of Secrets (Book 2)', 10.0),
 ('Harry Potter and the Prisoner of Azkaban (Book 3)', 10.0),
 ("Harry Potter and the Sorcerer's Stone (Book 1)", 10.0),
 ("The Bonesetter's Daughter", 10.0),
 ('The Joy Luck Club', 10.0),
 ('The Tao of Pooh', 10.0),
 ('Girl with a Pearl Earring', 10.0),
 ('The Pillars of the Earth', 10.0),
 ('A Heartbreaking Work of Staggering Genius', 10.0),
 ("Dude, Where's My Country?", 10.0),
 ('Fried Green Tomatoes at the Whistle Stop Cafe', 10.0),
 ('The Little Prince', 10.0),
 ("The Kitchen God's Wife", 9.0),
 ('A Painted House', 9.0),
 ('Empire Falls', 9.0),
 ('Notes from a Small Island', 9.0),
 ('The Tale of the Body Thief (Vampire Chronicles (Paperback))', 9.0),
 ('The Witching Hour (Lives of the Mayfair Witches)', 9.0),
 ('Lord of the Flies', 8.0),
 ('The Firm', 8.0),
 ("From Potter's Field", 8.0),
 ('The Chamber', 8.0),
 ('The Lovely Bones: A Novel', 8.0),
 ("White

### **Top 5 recommendations**

In [28]:
recommended_books_knn[:5]

[('Chicken Soup for the Soul (Chicken Soup for the Soul)', 10.0),
 ('Harry Potter and the Chamber of Secrets (Book 2)', 10.0),
 ('Harry Potter and the Prisoner of Azkaban (Book 3)', 10.0),
 ("Harry Potter and the Sorcerer's Stone (Book 1)", 10.0),
 ("The Bonesetter's Daughter", 10.0)]

## **Cosine Similarity-Based Recommendations**

In [29]:
from sklearn.metrics.pairwise import cosine_similarity

In [30]:
# Compute cosine similarity
user_similarity_matrix = cosine_similarity(user_item_matrix)
user_similarity_matrix = pd.DataFrame(user_similarity_matrix, index=user_item_matrix.index, columns=user_item_matrix.index)

In [31]:
user_similarity_matrix

User-ID,254,2276,2766,2977,3363,4017,4385,6251,6323,6543,...,271705,273979,274004,274061,274301,274308,275970,277427,277639,278418
254,1.000000,0.000000,0.000000,0.113963,0.0,0.000000,0.000000,0.195832,0.000000,0.073324,...,0.133641,0.000000,0.000000,0.048247,0.000000,0.072985,0.173413,0.000000,0.0,0.000000
2276,0.000000,1.000000,0.120877,0.000000,0.0,0.000000,0.343401,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.140975,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000
2766,0.000000,0.120877,1.000000,0.000000,0.0,0.136117,0.000000,0.000000,0.116618,0.056298,...,0.080805,0.000000,0.112360,0.000000,0.038723,0.000000,0.000000,0.000000,0.0,0.055196
2977,0.113963,0.000000,0.000000,1.000000,0.0,0.056002,0.000000,0.089181,0.000000,0.000000,...,0.000000,0.106410,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000
3363,0.000000,0.000000,0.000000,0.000000,1.0,0.136717,0.000000,0.000000,0.000000,0.000000,...,0.117684,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
274308,0.072985,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.066143,0.077970,0.000000,...,0.000000,0.000000,0.048776,0.000000,0.049117,1.000000,0.080119,0.125455,0.0,0.000000
275970,0.173413,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.055878,0.000000,0.059623,...,0.058682,0.000000,0.000000,0.000000,0.000000,0.080119,1.000000,0.067174,0.0,0.000000
277427,0.000000,0.000000,0.000000,0.000000,0.0,0.047360,0.000000,0.077638,0.000000,0.056806,...,0.000000,0.000000,0.058421,0.000000,0.044654,0.125455,0.067174,1.000000,0.0,0.000000
277639,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,0.000000


### **Most similar users**

In [32]:
user_similarity_matrix[user_id].sort_values(ascending=False)

User-ID
6563      1.000000
175003    0.276216
101851    0.263524
106225    0.247901
260897    0.245120
            ...   
139467    0.000000
138441    0.000000
138097    0.000000
137688    0.000000
278418    0.000000
Name: 6563, Length: 815, dtype: float64

### **Top 5 similar users**

In [33]:
similar_users_cs  = user_similarity_matrix[user_id].sort_values(ascending=False)[1:6].index

In [34]:
books_rated_by_user

{'A Time to Kill',
 "Angela's Ashes (MMP) : A Memoir",
 'Beach Music',
 'Confessions of an Ugly Stepsister : A Novel',
 "Don't Sweat the Small Stuff and It's All Small Stuff : Simple Ways to Keep the Little Things from Taking Over Your Life (Don't Sweat the Small Stuff Series)",
 'Family Album',
 'Four Blind Mice',
 'Girl in Hyacinth Blue',
 'Good in Bed',
 'Harry Potter and the Goblet of Fire (Book 4)',
 'Harry Potter and the Order of the Phoenix (Book 5)',
 "Harry Potter and the Sorcerer's Stone (Harry Potter (Paperback))",
 'Mother of Pearl',
 'Prodigal Summer',
 "She's Come Undone (Oprah's Book Club)",
 'Smart Women',
 'Summer Sisters',
 'The Bean Trees',
 'The Client',
 'The Five People You Meet in Heaven',
 "The General's Daughter",
 'The Honk and Holler Opening Soon',
 'The Nanny Diaries: A Novel',
 'The Return Journey',
 'The Saving Graces: A Novel',
 'The Secret Life of Bees',
 "Tuesdays with Morrie: An Old Man, a Young Man, and Life's Greatest Lesson",
 'Under the Tuscan Sun'

### **Recommendations**

In [35]:
recommended_books_cs = {}

for sim_user in similar_users_cs:
    for book, rating in user_item_matrix.loc[sim_user].items():
        if rating > 0 and book not in books_rated_by_user and book not in recommended_books_cs:
            recommended_books_cs[book] = rating


In [36]:
recommended_books_cs

{'Chicken Soup for the Soul (Chicken Soup for the Soul)': 10.0,
 'Harry Potter and the Chamber of Secrets (Book 2)': 10.0,
 'Harry Potter and the Prisoner of Azkaban (Book 3)': 10.0,
 "Harry Potter and the Sorcerer's Stone (Book 1)": 10.0,
 'Hannibal': 6.0,
 'Lord of the Flies': 8.0,
 'Silence of the Lambs': 7.0,
 "The Bonesetter's Daughter": 10.0,
 'The Firm': 8.0,
 'The Joy Luck Club': 10.0,
 "The Kitchen God's Wife": 9.0,
 'The Tao of Pooh': 10.0,
 'A Painted House': 9.0,
 'Dying for Chocolate (Culinary Mysteries (Paperback))': 7.0,
 "From Potter's Field": 8.0,
 'Girl with a Pearl Earring': 10.0,
 'How to Be Good': 5.0,
 "Suzanne's Diary for Nicholas": 7.0,
 'The Brethren': 6.0,
 'The Chamber': 8.0,
 'The Lovely Bones: A Novel': 8.0,
 'The Pillars of the Earth': 10.0,
 'The Summons': 7.0,
 'The Surgeon': 7.0,
 "White Oleander : A Novel (Oprah's Book Club)": 8.0,
 'Winter Solstice': 6.0,
 '4 Blondes': 6.0,
 'A Heartbreaking Work of Staggering Genius': 10.0,
 "Bridget Jones's Diary": 

In [37]:
recommended_books_cs = sorted(recommended_books_cs.items(), key=lambda x: x[1], reverse=True)

In [38]:
recommended_books_cs

[('Chicken Soup for the Soul (Chicken Soup for the Soul)', 10.0),
 ('Harry Potter and the Chamber of Secrets (Book 2)', 10.0),
 ('Harry Potter and the Prisoner of Azkaban (Book 3)', 10.0),
 ("Harry Potter and the Sorcerer's Stone (Book 1)", 10.0),
 ("The Bonesetter's Daughter", 10.0),
 ('The Joy Luck Club', 10.0),
 ('The Tao of Pooh', 10.0),
 ('Girl with a Pearl Earring', 10.0),
 ('The Pillars of the Earth', 10.0),
 ('A Heartbreaking Work of Staggering Genius', 10.0),
 ("Dude, Where's My Country?", 10.0),
 ('Fried Green Tomatoes at the Whistle Stop Cafe', 10.0),
 ('The Little Prince', 10.0),
 ("The Kitchen God's Wife", 9.0),
 ('A Painted House', 9.0),
 ('Empire Falls', 9.0),
 ('Notes from a Small Island', 9.0),
 ('The Tale of the Body Thief (Vampire Chronicles (Paperback))', 9.0),
 ('The Witching Hour (Lives of the Mayfair Witches)', 9.0),
 ('Lord of the Flies', 8.0),
 ('The Firm', 8.0),
 ("From Potter's Field", 8.0),
 ('The Chamber', 8.0),
 ('The Lovely Bones: A Novel', 8.0),
 ("White

### **Top 5 Recommendations**

In [39]:
recommended_books_cs[:5]

[('Chicken Soup for the Soul (Chicken Soup for the Soul)', 10.0),
 ('Harry Potter and the Chamber of Secrets (Book 2)', 10.0),
 ('Harry Potter and the Prisoner of Azkaban (Book 3)', 10.0),
 ("Harry Potter and the Sorcerer's Stone (Book 1)", 10.0),
 ("The Bonesetter's Daughter", 10.0)]

## **Singular Value Decomposition (SVD) based Recommendations**

In [40]:
from surprise import Dataset, Reader, SVD
from surprise.model_selection import train_test_split
from surprise.accuracy import rmse

In [41]:
# Convert data to Surprise format
reader = Reader(rating_scale=(0, 10))
data = Dataset.load_from_df(final_filtered_data[["User-ID", "Book-Title", "Book-Rating"]], reader)

In [42]:
# Train-test split
trainset, testset = train_test_split(data, test_size=0.2)

In [43]:
# Train SVD model
model_svd = SVD(n_factors=50)
model_svd.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x1765930dfa0>

In [44]:
# Evaluate model
predictions = model_svd.test(testset)
print("Model RMSE:", rmse(predictions))

RMSE: 3.5559
Model RMSE: 3.5559156560003897
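
To make the metric concrete, here is a minimal hand-written RMSE over (true, estimated) rating pairs. The function name `rmse_by_hand` and the toy pairs are hypothetical; the notebook itself relies on `surprise.accuracy.rmse`.

```python
import math

def rmse_by_hand(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    squared_errors = [(true - est) ** 2 for true, est in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical ratings on the 0-10 scale, not taken from the notebook
toy_pairs = [(0, 3), (10, 6), (8, 8)]
print(rmse_by_hand(toy_pairs))
```

An RMSE of about 3.56 on a 0-10 scale means that predictions are typically off by several rating points. One possible contributor is that the filtered data contains many ratings of 0 next to ratings close to 10, as the tables above show.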


In [45]:
# Recommend books for a specific user
all_books = final_filtered_data["Book-Title"].unique()
user_books = final_filtered_data[final_filtered_data["User-ID"] == user_id]["Book-Title"].tolist()

In [46]:
book_predictions = []
for book in all_books:
    if book not in user_books:  # Avoid recommending already rated books
        pred_rating = model_svd.predict(user_id, book).est
        book_predictions.append((book, pred_rating))

In [47]:
# Sort by predicted rating and display top recommendations
book_predictions.sort(key=lambda x: x[1], reverse=True)
recommendations = [book for book, rating in book_predictions[:5]]
print(f"SVD Recommendations for User {user_id}: {recommendations}")

SVD Recommendations for User 6563: ['The Hobbit : The Enchanting Prelude to The Lord of the Rings', 'The Da Vinci Code', 'Fast Food Nation: The Dark Side of the All-American Meal', 'STONES FROM THE RIVER', 'The Phantom Tollbooth']


## **Hybrid Recommendation System (SVD + KNN/Cosine Similarity)**

In [48]:
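# Function to get top-N book recommendations for a user.
# Caveat: model_svd was fit on Book-Title item IDs (In [41]), so the ISBNs
# passed to predict() here are unseen items for the model; pass the book's
# title instead to get item-specific estimates.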
def get_svd_recommendations(user_id, n=5):
    all_books = final_filtered_data['ISBN'].unique()
    user_rated_books = final_filtered_data[final_filtered_data['User-ID'] == user_id]['ISBN'].values
    books_to_predict = list(set(all_books) - set(user_rated_books))

    predictions = [(book, model_svd.predict(user_id, book).est) for book in books_to_predict]
    predictions.sort(key=lambda x: x[1], reverse=True)
    top_books = [book for book, _ in predictions[:n]]

    return top_books

In [49]:
get_svd_recommendations(user_id=user_id)

['0679403612', '0385296495', '0156028352', '0812967259', '0385312660']

In [50]:
# Build KNN Model (Item-based similarity)
def build_knn_model():
    book_pivot = final_filtered_data.pivot_table(index='ISBN', columns='User-ID', values='Book-Rating').fillna(0)
    knn = NearestNeighbors(metric='cosine', algorithm='brute')
    knn.fit(book_pivot.values)
    
    return knn, book_pivot

knn_model, book_pivot = build_knn_model()

In [51]:
# Function to find similar books using KNN
def get_knn_recommendations(book_isbn, n=5):
    if book_isbn not in book_pivot.index:
        return []

    book_idx = book_pivot.index.get_loc(book_isbn)
    distances, indices = knn_model.kneighbors([book_pivot.iloc[book_idx].values], n_neighbors=n+1)
    
    similar_books = book_pivot.iloc[indices.flatten()[1:]].index.tolist()
    return similar_books

In [52]:
# Hybrid recommendation function
def hybrid_recommendations(user_id, n=5):
    svd_books = get_svd_recommendations(user_id, n)
    final_recommendations = set(svd_books)

    for book in svd_books:
        knn_books = get_knn_recommendations(book, n=3)
        final_recommendations.update(knn_books)

    return list(final_recommendations)[:n]

hybrid_recommendation_booksID = hybrid_recommendations(user_id=user_id)

In [53]:
hybrid_recommendation_booksID

['1580601200', '0679403612', '0385297661', '0385296495', '0451516508']

In [54]:
print(f"Hybrid Recommendations for User {user_id} are:\n")
# Use `isbn` as the loop variable to avoid shadowing the built-in id()
for i, isbn in enumerate(hybrid_recommendation_booksID):
    print(f"({i+1}) {final_filtered_data[final_filtered_data['ISBN'] == isbn]['Book-Title'].iloc[0]}")

Hybrid Recommendations for User 6563 are:

(1) Beloved
(2) Saint Maybe
(3) Daddy
(4) Zoya
(5) Wuthering Heights
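
Because `hybrid_recommendations` collects candidates in a `set` before slicing to `n`, the books that survive the cut do not depend on their SVD rank. One possible alternative, sketched below under the assumption that each SVD pick should be followed by its KNN neighbors, keeps first-seen order while removing duplicates. The names `merge_ranked` and `toy_neighbors` are hypothetical and the data is made up.

```python
def merge_ranked(svd_books, neighbors_of, n=5):
    """Interleave each SVD pick with its KNN neighbors, keeping first-seen order."""
    merged, seen = [], set()
    for book in svd_books:
        # The SVD pick comes first, followed by the books most similar to it
        for candidate in [book, *neighbors_of(book)]:
            if candidate not in seen:
                seen.add(candidate)
                merged.append(candidate)
    return merged[:n]

# Hypothetical stand-in for get_knn_recommendations
toy_neighbors = {"A": ["B", "X"], "B": ["A", "Y"], "C": ["Z"]}
print(merge_ranked(["A", "B", "C"], lambda b: toy_neighbors.get(b, []), n=5))
```

In the notebook, `neighbors_of` could be something like `lambda isbn: get_knn_recommendations(isbn, n=3)`. Whether neighbors should rank ahead of lower SVD picks is a design choice this sketch does not settle.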
