# **Project 2: Building a Recommender System**
> **Philip Corrado and John Rempe**



Instructor: Dr. Binod Rimal, Department of Mathematics, UTampa

Course: DSC 201

Due: Tuesday, November 5, 11:59 pm




**Where is the data from?**

https://www.kaggle.com/datasets/arashnic/book-recommendation-dataset?select=Users.csv

The dataset is hosted on Kaggle, a widely used platform for sharing public datasets.

## **Project Introduction**
Our goal for this project is to take the book recommendation dataset from Kaggle and complete the following steps:

1. Clean/Preprocess Data in preparation for recommender systems
2. Build a Content-Based Recommender System for books
3. Build a Collaborative-Based Recommender System for books


## **Part 1: Data Cleaning/Preprocessing**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from google.colab import drive

In [None]:
drive.mount('/content/drive')
books = pd.read_csv('/content/drive/MyDrive/DSC_201/Books.csv')
books.head()

Mounted at /content/drive




Unnamed: 0,ISBN,Book-Title,Book-Author,Year-Of-Publication,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
0,195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...,http://images.amazon.com/images/P/0195153448.0...
1,2005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...,http://images.amazon.com/images/P/0002005018.0...
2,60973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...,http://images.amazon.com/images/P/0060973129.0...
3,374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...,http://images.amazon.com/images/P/0374157065.0...
4,393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton &amp; Company,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...,http://images.amazon.com/images/P/0393045218.0...


In [None]:
#read books data (PHIL'S DIRECTORY)

# data_path = "/content/drive/MyDrive/ColabNotebooks/Books.csv"

# books = pd.read_csv(data_path)
# books.head()

In [None]:
books.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 271360 entries, 0 to 271359
Data columns (total 8 columns):
 #   Column               Non-Null Count   Dtype 
---  ------               --------------   ----- 
 0   ISBN                 271360 non-null  object
 1   Book-Title           271360 non-null  object
 2   Book-Author          271358 non-null  object
 3   Year-Of-Publication  271360 non-null  object
 4   Publisher            271358 non-null  object
 5   Image-URL-S          271360 non-null  object
 6   Image-URL-M          271360 non-null  object
 7   Image-URL-L          271357 non-null  object
dtypes: object(8)
memory usage: 16.6+ MB


All columns are stored with the `object` data type, so we convert each one explicitly: `.apply()` for the text columns and `pd.to_numeric()` for the numeric ones.

In [None]:
#convert a single value to a string; missing values (NaN) become the string 'nan'
def to_string(value):
    try:
        return str(value)
    except ValueError:
        return 'nan'

In [None]:
#convert Book Title to string
books['Book-Title'] = books['Book-Title'].apply(to_string)

#convert Book Author to string
books['Book-Author'] = books['Book-Author'].apply(to_string)

#convert Publisher to string
books['Publisher'] = books['Publisher'].apply(to_string)

#convert the image URLs to strings
books['Image-URL-S'] = books['Image-URL-S'].apply(to_string)
books['Image-URL-M'] = books['Image-URL-M'].apply(to_string)
books['Image-URL-L'] = books['Image-URL-L'].apply(to_string)

In [None]:
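#convert ISBN and Year of Publication to the nullable integer data type
#note: errors='coerce' turns non-numeric values (e.g. ISBNs ending in 'X') into <NA>,
#and the integer conversion drops any leading zeros from the ISBN
books['ISBN'] = pd.to_numeric(books['ISBN'], errors='coerce').astype('Int64')
books['Year-Of-Publication'] = pd.to_numeric(books['Year-Of-Publication'], errors='coerce').astype('Int64')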
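The numeric conversion above is lossy for ISBNs, which are identifiers rather than quantities. The sketch below uses three ISBN strings taken from this notebook's own outputs to show what `pd.to_numeric(..., errors='coerce')` does to them, and one possible alternative that keeps them as normalized strings. The variable names (`isbn_text`, `isbn_numeric`, `isbn_clean`) are illustrative and not part of the project code.

```python
import pandas as pd

# Three ISBNs in the dataset's 10-character text form.
isbn_text = pd.Series(["0195153448", "000637610X", "381440176X"])

# Numeric conversion: the leading zero disappears and the two ISBNs
# ending in the 'X' check digit are coerced to <NA>.
isbn_numeric = pd.to_numeric(isbn_text, errors="coerce").astype("Int64")
print(isbn_numeric.tolist())

# Possible alternative (an assumption, not the notebook's approach):
# keep ISBNs as strings and only normalize whitespace and case.
isbn_clean = isbn_text.astype("string").str.strip().str.upper()
print(isbn_clean.tolist())
```

Because every coerced ISBN becomes the same missing value, a later `drop_duplicates(subset=['ISBN'])` would keep only one of those rows, so the string form may be the safer key when books must be matched to ratings.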

In [None]:
#change names of columns
books.rename(columns={'Book-Title':'Title', 'Book-Author':'Author', 'Year-Of-Publication':'PublicationYear'}, inplace=True)

In [None]:
books = books.sort_values('ISBN')
books

Unnamed: 0,ISBN,Title,Author,PublicationYear,Publisher,Image-URL-S,Image-URL-M,Image-URL-L
254249,913154,The Way Things Work: An Illustrated Encycloped...,C. van Amerongen (translator),1967,Simon &amp; Schuster,http://images.amazon.com/images/P/0000913154.0...,http://images.amazon.com/images/P/0000913154.0...,http://images.amazon.com/images/P/0000913154.0...
215806,1010565,Mog's Christmas,Judith Kerr,1992,Collins,http://images.amazon.com/images/P/0001010565.0...,http://images.amazon.com/images/P/0001010565.0...,http://images.amazon.com/images/P/0001010565.0...
42562,1046438,Liar,Stephen Fry,0,Harpercollins Uk,http://images.amazon.com/images/P/0001046438.0...,http://images.amazon.com/images/P/0001046438.0...,http://images.amazon.com/images/P/0001046438.0...
112555,1046713,Twopence to Cross the Mersey,Helen Forrester,1992,HarperCollins Publishers,http://images.amazon.com/images/P/0001046713.0...,http://images.amazon.com/images/P/0001046713.0...,http://images.amazon.com/images/P/0001046713.0...
146193,1046934,The Prime of Miss Jean Brodie,Muriel Spark,1999,Trafalgar Square Publishing,http://images.amazon.com/images/P/0001046934.0...,http://images.amazon.com/images/P/0001046934.0...,http://images.amazon.com/images/P/0001046934.0...
...,...,...,...,...,...,...,...,...
271323,,You Got an Ology,Maureen Lipman,1990,HarperCollins Publishers,http://images.amazon.com/images/P/000637610X.0...,http://images.amazon.com/images/P/000637610X.0...,http://images.amazon.com/images/P/000637610X.0...
271335,,"Ein Fall fÃ?Â¼r TKKG, Bd.50, Sklaven fÃ?Â¼r Wu...",Stefan Wolf,1989,Pelikan,http://images.amazon.com/images/P/381440176X.0...,http://images.amazon.com/images/P/381440176X.0...,http://images.amazon.com/images/P/381440176X.0...
271343,,The Unified Modeling Language Reference Manual...,James Rumbaugh,1998,Addison-Wesley Professional,http://images.amazon.com/images/P/020130998X.0...,http://images.amazon.com/images/P/020130998X.0...,http://images.amazon.com/images/P/020130998X.0...
271353,,Anti Death League,Kingsley Amis,1975,Viking Press,http://images.amazon.com/images/P/014002803X.0...,http://images.amazon.com/images/P/014002803X.0...,http://images.amazon.com/images/P/014002803X.0...


In [None]:
#drop duplicate books (rows whose ISBN is <NA> count as duplicates of each other, so only one is kept)
books = books.drop_duplicates(subset=['ISBN'])

In [None]:
#reset index
books = books.reset_index(drop=True)

In [None]:
users = pd.read_csv('/content/drive/MyDrive/DSC_201/Users.csv')
users.head()

Unnamed: 0,User-ID,Location,Age
0,1,"nyc, new york, usa",
1,2,"stockton, california, usa",18.0
2,3,"moscow, yukon territory, russia",
3,4,"porto, v.n.gaia, portugal",17.0
4,5,"farnborough, hants, united kingdom",


In [None]:
# #read users data (PHIL'S DIRECTORY)

# data_path = "/content/drive/MyDrive/ColabNotebooks/Users.csv"

# users = pd.read_csv(data_path)
# users.head()

In [None]:
ratings = pd.read_csv('/content/drive/MyDrive/DSC_201/Ratings.csv')
ratings.head()

Unnamed: 0,User-ID,ISBN,Book-Rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


In [None]:
# #read ratings data (PHIL'S DIRECTORY)

# data_path = "/content/drive/MyDrive/ColabNotebooks/Ratings.csv"

# ratings = pd.read_csv(data_path)
# ratings.head()

## **Part 2: Content-Based Recommender System**


In [None]:
# Creating a new column with column name 'combined_features'
# which takes the value/information from selected features.

def combined_features(row):
    return (
        str(row['ISBN']) + " " +
        str(row['Title']) + " " +
        str(row['Author']) + " " +
        str(row['PublicationYear']) + " " +
        str(row['Publisher'])
    )

In [None]:
#add a new column called combined_features
books["combined_features"] = books.apply(combined_features, axis =1)
books.head()

Unnamed: 0,ISBN,Title,Author,PublicationYear,Publisher,Image-URL-S,Image-URL-M,Image-URL-L,combined_features
0,913154,The Way Things Work: An Illustrated Encycloped...,C. van Amerongen (translator),1967,Simon &amp; Schuster,http://images.amazon.com/images/P/0000913154.0...,http://images.amazon.com/images/P/0000913154.0...,http://images.amazon.com/images/P/0000913154.0...,913154 The Way Things Work: An Illustrated Enc...
1,1010565,Mog's Christmas,Judith Kerr,1992,Collins,http://images.amazon.com/images/P/0001010565.0...,http://images.amazon.com/images/P/0001010565.0...,http://images.amazon.com/images/P/0001010565.0...,1010565 Mog's Christmas Judith Kerr 1992 Collins
2,1046438,Liar,Stephen Fry,0,Harpercollins Uk,http://images.amazon.com/images/P/0001046438.0...,http://images.amazon.com/images/P/0001046438.0...,http://images.amazon.com/images/P/0001046438.0...,1046438 Liar Stephen Fry 0 Harpercollins Uk
3,1046713,Twopence to Cross the Mersey,Helen Forrester,1992,HarperCollins Publishers,http://images.amazon.com/images/P/0001046713.0...,http://images.amazon.com/images/P/0001046713.0...,http://images.amazon.com/images/P/0001046713.0...,1046713 Twopence to Cross the Mersey Helen For...
4,1046934,The Prime of Miss Jean Brodie,Muriel Spark,1999,Trafalgar Square Publishing,http://images.amazon.com/images/P/0001046934.0...,http://images.amazon.com/images/P/0001046934.0...,http://images.amazon.com/images/P/0001046934.0...,1046934 The Prime of Miss Jean Brodie Muriel S...


In [None]:
books.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 249027 entries, 0 to 249026
Data columns (total 9 columns):
 #   Column             Non-Null Count   Dtype 
---  ------             --------------   ----- 
 0   ISBN               249026 non-null  Int64 
 1   Title              249027 non-null  object
 2   Author             249027 non-null  object
 3   PublicationYear    249025 non-null  Int64 
 4   Publisher          249027 non-null  object
 5   Image-URL-S        249027 non-null  object
 6   Image-URL-M        249027 non-null  object
 7   Image-URL-L        249027 non-null  object
 8   combined_features  249027 non-null  object
dtypes: Int64(2), object(7)
memory usage: 17.6+ MB


In [None]:
#import necessary libraries
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [None]:
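#initialize a TfidfVectorizer and build the TF-IDF matrix
#(the variable keeps the name CountMatrix, but it holds TF-IDF weights, not raw counts)
#only the first 5000 books are vectorized to keep the similarity matrix small
tfidf = TfidfVectorizer(stop_words='english')
CountMatrix = tfidf.fit_transform(books["combined_features"][0:5000])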

In [None]:
CountMatrix.shape

(5000, 14789)

In [None]:
cosine_sim = cosine_similarity(CountMatrix)
cosine_sim

array([[1.        , 0.        , 0.        , ..., 0.01481915, 0.01308978,
        0.        ],
       [0.        , 1.        , 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 1.        , ..., 0.        , 0.        ,
        0.01030515],
       ...,
       [0.01481915, 0.        , 0.        , ..., 1.        , 0.09023107,
        0.        ],
       [0.01308978, 0.        , 0.        , ..., 0.09023107, 1.        ,
        0.        ],
       [0.        , 0.        , 0.01030515, ..., 0.        , 0.        ,
        1.        ]])
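To make the matrix above easier to interpret, here is a minimal sketch of the same TF-IDF and cosine similarity pipeline on three made-up feature strings. The strings and the `toy_` names are hypothetical; only `TfidfVectorizer` and `cosine_similarity` come from the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical combined-feature strings: the first two share an author.
toy_features = [
    "Liar Stephen Fry",
    "Another Book Stephen Fry",
    "Classical Mythology Mark Morford",
]

# Same vectorizer settings as the notebook.
toy_tfidf = TfidfVectorizer(stop_words="english")
toy_matrix = toy_tfidf.fit_transform(toy_features)

# Entry [i, j] is the cosine of the angle between rows i and j.
toy_sim = cosine_similarity(toy_matrix)
print(toy_sim.round(3))
```

The diagonal is 1 because each book is identical to itself, the first two rows score above 0 because they share the author tokens, and rows with no tokens in common score exactly 0.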

In [None]:
book_title = input("Enter your book: ")

Enter your book: Liar


In [None]:
def get_index_from_title(title):
    return books[books.Title == title].index.values[0]

In [None]:
book_index=get_index_from_title(book_title)
book_index

np.int64(2)

In [None]:
def get_title_from_index(index):
    return books[books.index == index]["Title"].values[0]

In [None]:
get_title_from_index(book_index)

'Liar'

In [None]:
pairwise_cosine= list(enumerate(cosine_sim[book_index]))

In [None]:
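# idea: compute the cosine similarity of the chosen book against all others, one row at a time,
# instead of building the full matrix (note that ~book_index is a bitwise NOT on an integer,
# not "every other row", so the query row is compared against the whole matrix here)

# pairwise_cosine = cosine_similarity(CountMatrix[book_index:book_index + 1], CountMatrix)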
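The one-row idea in the comment above can be sketched as follows, assuming a small made-up corpus. Passing a single-row slice as the first argument of `cosine_similarity` returns a 1 x N array, which avoids storing the full N x N matrix. The documents and the `toy_docs`, `query_index`, `row_scores`, and `top_two` names are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical feature strings; rows 0 and 1 share the author tokens.
toy_docs = [
    "liar stephen fry",
    "second novel stephen fry",
    "classical mythology oxford",
    "decision normandy harper",
]
toy_matrix = TfidfVectorizer().fit_transform(toy_docs)

query_index = 0
# Slice (rather than integer-index) so the query stays a 2-D, one-row matrix.
query_row = toy_matrix[query_index:query_index + 1]
row_scores = cosine_similarity(query_row, toy_matrix).ravel()

# Exclude the query itself, then take the two highest-scoring rows.
row_scores[query_index] = -1.0
top_two = np.argsort(row_scores)[::-1][:2]
print(top_two, row_scores[top_two])
```

With this approach the memory cost grows linearly with the number of books, which may make it feasible to search more than the first 5000 rows.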

In [None]:
pairwise_cosine[0:10]

[(0, np.float64(0.0)),
 (1, np.float64(0.0)),
 (2, np.float64(1.0000000000000002)),
 (3, np.float64(0.00985586622595097)),
 (4, np.float64(0.0)),
 (5, np.float64(0.011593072121428804)),
 (6, np.float64(0.0)),
 (7, np.float64(0.0)),
 (8, np.float64(0.0)),
 (9, np.float64(0.0))]

In [None]:
# Keep the 26 highest-scoring entries: the query book itself plus its 25 most similar books
sorted_cosine = sorted(pairwise_cosine,
                               key=lambda x:x[1],
                               reverse=True)[0:26]

In [None]:
num_recomendations = int(input("Enter number of recommendations: "))

Enter number of recommendations: 10


In [None]:
print("Books similar to", book_title, "are")
print("*********************************************")
#skip the first entry (the query book itself) and print exactly the requested number of titles
for book in sorted_cosine[1:num_recomendations + 1]:
    print(get_title_from_index(book[0]))
print("*********************************************")

Books similar to Liar are
*********************************************
One Tree
Scorpio Illusion Uk
In Her Defense
Traces
Stephen and Violet
Card Games (Collins Gem)
Origami (Collins Gems Series)
Helliconia
CARETAKERS
The Bottle Boy
*********************************************
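The lookup and ranking steps in Part 2 could also be wrapped in one reusable function. The sketch below is one possible shape for it, not the notebook's implementation: `recommend_similar` is a hypothetical helper, and it assumes the rows of the similarity matrix are in the same order as the title Series. It returns an empty list for an unknown title instead of raising an `IndexError`.

```python
import numpy as np
import pandas as pd

def recommend_similar(titles, sim_matrix, title, n=5):
    """Return up to n titles most similar to `title`, or [] if it is unknown."""
    # Positions (not labels) of rows whose title matches exactly.
    matches = np.flatnonzero((titles == title).to_numpy())
    if len(matches) == 0:
        return []
    query = matches[0]
    # Rank all rows by similarity to the query, highest first.
    order = np.argsort(sim_matrix[query])[::-1]
    # Drop the query itself and keep the first n remaining rows.
    picks = [int(i) for i in order if i != query][:n]
    return titles.iloc[picks].tolist()

# Toy data: three titles and a hand-written symmetric similarity matrix.
toy_titles = pd.Series(["A", "B", "C"])
toy_sim = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])
print(recommend_similar(toy_titles, toy_sim, "A", n=2))
print(recommend_similar(toy_titles, toy_sim, "Z"))
```

Filtering the query out by position, rather than assuming it sorts first, keeps the result correct even when another book ties with a similarity of 1.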


## **Part 3: Collaborative-Based Recommender System**


In [None]:
from scipy.sparse.linalg import svds

In [None]:
#first, we must merge the dataframes on User-ID into a new DF
df = pd.merge(users, ratings, on='User-ID')

In [None]:
df.head()

Unnamed: 0,User-ID,Location,Age,ISBN,Book-Rating
0,2,"stockton, california, usa",18.0,195153448,0
1,7,"washington, dc, usa",,34542252,0
2,8,"timmins, ontario, canada",,2005018,5
3,8,"timmins, ontario, canada",,60973129,0
4,8,"timmins, ontario, canada",,374157065,0


In [None]:
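#rebuild the user-ratings merge, then attach the book details
df = pd.merge(users, ratings, on='User-ID')

# Convert 'ISBN' in both DataFrames to strings so the merge keys share a type
# Note: books['ISBN'] was converted to integers earlier, so its strings have no leading zeros
# and no 'X' check digits; only ratings whose ISBN text matches exactly will join
df['ISBN'] = df['ISBN'].astype(str)
books['ISBN'] = books['ISBN'].astype(str)

# Inner join on ISBN
df = pd.merge(df, books, on='ISBN')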
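One way to check how many ratings survive a merge like the one above is a left join with `indicator=True`, which adds a `_merge` column marking matched and unmatched rows. The sketch below uses three ISBN and title pairs that appear in this notebook's outputs; the rating values and the `toy_`, `lossy_`, and `match_rate_` names are illustrative assumptions.

```python
import pandas as pd

# ISBNs in their original text form, as in Ratings.csv.
toy_ratings = pd.DataFrame({
    "ISBN": ["0195153448", "1881320189", "000637610X"],
    "Book-Rating": [0, 7, 5],
})
toy_books = pd.DataFrame({
    "ISBN": ["0195153448", "1881320189", "000637610X"],
    "Title": ["Classical Mythology", "Goodbye to the Buttermilk Sky", "You Got an Ology"],
})

# Reproduce the round trip through integers, then back to strings.
lossy_keys = pd.to_numeric(toy_books["ISBN"], errors="coerce").astype("Int64").astype(str)
lossy_books = toy_books.assign(ISBN=lossy_keys)

# Left join keeps every rating; _merge says whether a book was found.
checked_lossy = toy_ratings.merge(lossy_books, on="ISBN", how="left", indicator=True)
match_rate_lossy = (checked_lossy["_merge"] == "both").mean()

checked_text = toy_ratings.merge(toy_books, on="ISBN", how="left", indicator=True)
match_rate_text = (checked_text["_merge"] == "both").mean()

print(match_rate_lossy, match_rate_text)
```

In this toy case only the ISBN without a leading zero or an `X` survives the integer round trip, which is consistent with the merged rows shown in this notebook all having ISBNs that start with a nonzero digit.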

In [None]:
df = df.sort_values('User-ID')

In [None]:
df['ISBN'] = pd.to_numeric(df['ISBN'], errors='coerce').astype('Int64')

In [None]:
df.head(20)

Unnamed: 0,User-ID,Location,Age,ISBN,Book-Rating,Title,Author,PublicationYear,Publisher,Image-URL-S,Image-URL-M,Image-URL-L,combined_features
4,8,"timmins, ontario, canada",,1881320189,7,Goodbye to the Buttermilk Sky,Julia Oliver,1994,River City Pub,http://images.amazon.com/images/P/1881320189.0...,http://images.amazon.com/images/P/1881320189.0...,http://images.amazon.com/images/P/1881320189.0...,1881320189 Goodbye to the Buttermilk Sky Julia...
1,8,"timmins, ontario, canada",,1558746218,0,A Second Chicken Soup for the Woman's Soul (Ch...,Jack Canfield,1998,Health Communications,http://images.amazon.com/images/P/1558746218.0...,http://images.amazon.com/images/P/1558746218.0...,http://images.amazon.com/images/P/1558746218.0...,1558746218 A Second Chicken Soup for the Woman...
2,8,"timmins, ontario, canada",,1567407781,6,The Witchfinder (Amos Walker Mystery Series),Loren D. Estleman,1998,Brilliance Audio - Trade,http://images.amazon.com/images/P/1567407781.0...,http://images.amazon.com/images/P/1567407781.0...,http://images.amazon.com/images/P/1567407781.0...,1567407781 The Witchfinder (Amos Walker Myster...
3,8,"timmins, ontario, canada",,1575663937,6,More Cunning Than Man: A Social History of Rat...,Robert Hendrickson,1999,Kensington Publishing Corp.,http://images.amazon.com/images/P/1575663937.0...,http://images.amazon.com/images/P/1575663937.0...,http://images.amazon.com/images/P/1575663937.0...,1575663937 More Cunning Than Man: A Social His...
0,8,"timmins, ontario, canada",,1552041778,5,Jane Doe,R. J. Kaiser,1999,Mira Books,http://images.amazon.com/images/P/1552041778.0...,http://images.amazon.com/images/P/1552041778.0...,http://images.amazon.com/images/P/1552041778.0...,1552041778 Jane Doe R. J. Kaiser 1999 Mira Books
5,10,"albacete, wisconsin, spain",26.0,1841721522,0,New Vegetarian: Bold and Beautiful Recipes for...,Celia Brooks Brown,2001,Ryland Peters &amp; Small Ltd,http://images.amazon.com/images/P/1841721522.0...,http://images.amazon.com/images/P/1841721522.0...,http://images.amazon.com/images/P/1841721522.0...,1841721522 New Vegetarian: Bold and Beautiful ...
6,12,"fort bragg, california, usa",,1879384493,10,If I'd Known Then What I Know Now: Why Not Lea...,J. R. Parrish,2003,Cypress House,http://images.amazon.com/images/P/1879384493.0...,http://images.amazon.com/images/P/1879384493.0...,http://images.amazon.com/images/P/1879384493.0...,1879384493 If I'd Known Then What I Know Now: ...
7,22,"erfurt, thueringen, germany",,3404921038,7,Wie Barney es sieht.,Mordecai Richler,2002,LÃ?Â¼bbe,http://images.amazon.com/images/P/3404921038.0...,http://images.amazon.com/images/P/3404921038.0...,http://images.amazon.com/images/P/3404921038.0...,3404921038 Wie Barney es sieht. Mordecai Richl...
9,22,"erfurt, thueringen, germany",,3442410665,0,Sturmzeit. Roman.,Charlotte Link,1991,Goldmann,http://images.amazon.com/images/P/3442410665.0...,http://images.amazon.com/images/P/3442410665.0...,http://images.amazon.com/images/P/3442410665.0...,3442410665 Sturmzeit. Roman. Charlotte Link 19...
10,22,"erfurt, thueringen, germany",,3442446937,0,Tage der Unschuld.,Richard North Patterson,2000,Goldmann,http://images.amazon.com/images/P/3442446937.0...,http://images.amazon.com/images/P/3442446937.0...,http://images.amazon.com/images/P/3442446937.0...,3442446937 Tage der Unschuld. Richard North Pa...


In [None]:
df = df[0:5000]

In [None]:
df

Unnamed: 0,User-ID,Location,Age,ISBN,Book-Rating,Title,Author,PublicationYear,Publisher,Image-URL-S,Image-URL-M,Image-URL-L,combined_features
4,8,"timmins, ontario, canada",,1881320189,7,Goodbye to the Buttermilk Sky,Julia Oliver,1994,River City Pub,http://images.amazon.com/images/P/1881320189.0...,http://images.amazon.com/images/P/1881320189.0...,http://images.amazon.com/images/P/1881320189.0...,1881320189 Goodbye to the Buttermilk Sky Julia...
1,8,"timmins, ontario, canada",,1558746218,0,A Second Chicken Soup for the Woman's Soul (Ch...,Jack Canfield,1998,Health Communications,http://images.amazon.com/images/P/1558746218.0...,http://images.amazon.com/images/P/1558746218.0...,http://images.amazon.com/images/P/1558746218.0...,1558746218 A Second Chicken Soup for the Woman...
2,8,"timmins, ontario, canada",,1567407781,6,The Witchfinder (Amos Walker Mystery Series),Loren D. Estleman,1998,Brilliance Audio - Trade,http://images.amazon.com/images/P/1567407781.0...,http://images.amazon.com/images/P/1567407781.0...,http://images.amazon.com/images/P/1567407781.0...,1567407781 The Witchfinder (Amos Walker Myster...
3,8,"timmins, ontario, canada",,1575663937,6,More Cunning Than Man: A Social History of Rat...,Robert Hendrickson,1999,Kensington Publishing Corp.,http://images.amazon.com/images/P/1575663937.0...,http://images.amazon.com/images/P/1575663937.0...,http://images.amazon.com/images/P/1575663937.0...,1575663937 More Cunning Than Man: A Social His...
0,8,"timmins, ontario, canada",,1552041778,5,Jane Doe,R. J. Kaiser,1999,Mira Books,http://images.amazon.com/images/P/1552041778.0...,http://images.amazon.com/images/P/1552041778.0...,http://images.amazon.com/images/P/1552041778.0...,1552041778 Jane Doe R. J. Kaiser 1999 Mira Books
...,...,...,...,...,...,...,...,...,...,...,...,...,...
4239,11676,"n/a, n/a, n/a",,1565843428,9,Working: People Talk About What They Do All Da...,Studs Terkel,2004,New Press,http://images.amazon.com/images/P/1565843428.0...,http://images.amazon.com/images/P/1565843428.0...,http://images.amazon.com/images/P/1565843428.0...,1565843428 Working: People Talk About What The...
4240,11676,"n/a, n/a, n/a",,1565920872,10,Linux Network Administrator's Guide,Olaf Kirch,1994,O'Reilly,http://images.amazon.com/images/P/1565920872.0...,http://images.amazon.com/images/P/1565920872.0...,http://images.amazon.com/images/P/1565920872.0...,1565920872 Linux Network Administrator's Guide...
4241,11676,"n/a, n/a, n/a",,1565923715,0,Java Examples in A Nutshell,David Flanagan,1997,O'Reilly,http://images.amazon.com/images/P/1565923715.0...,http://images.amazon.com/images/P/1565923715.0...,http://images.amazon.com/images/P/1565923715.0...,1565923715 Java Examples in A Nutshell David F...
4242,11676,"n/a, n/a, n/a",,1566193087,10,Wuthering Heights,Emily Bronte,1994,Dorset Press,http://images.amazon.com/images/P/1566193087.0...,http://images.amazon.com/images/P/1566193087.0...,http://images.amazon.com/images/P/1566193087.0...,1566193087 Wuthering Heights Emily Bronte 1994...


In [None]:
# Create a user-item rating matrix
user_book_ratings = df.pivot(index='User-ID',columns='ISBN',values='Book-Rating').fillna(0)

In [None]:
user_book_ratings.head()

ISBN,345245504,374237131,380976587,385504209,394700031,446611778,449911004,679444815,684717972,1400000408,...,9875500526,9875500534,9879397274,9879630130,9960340112,9972847012,9974560004,9974643058,9997511417,9997522052
User-ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
10,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
22,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
64,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [None]:
user_book_ratings.index

Index([    8,    10,    12,    22,    64,    68,    69,    70,    75,    79,
       ...
       11529, 11577, 11601, 11621, 11624, 11629, 11638, 11652, 11659, 11676],
      dtype='int64', name='User-ID', length=1067)

In [None]:
# Calculate each user's mean rating across all books
# (unrated books were filled with 0 above, so they pull the mean toward 0)
user_book_mean = user_book_ratings.mean(axis=1)

In [None]:
# Normalize the user-item rating matrix by subtracting the mean rating of each user
user_book_ratings_normalized = user_book_ratings.sub(user_book_mean, axis=0)
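Because the pivot table was filled with zeros before the mean was taken, each user's mean is averaged over every book in the matrix, not only the books that user rated. The sketch below contrasts that with a common alternative, centering on the mean of the observed ratings only. It is an illustration on made-up numbers, and the `toy`, `mean_all`, `mean_rated`, and `centered` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Two users, three books; NaN marks a book the user has not rated.
toy = pd.DataFrame(
    [[8.0, np.nan, 6.0],
     [np.nan, 4.0, np.nan]],
    index=[1, 2],
    columns=["a", "b", "c"],
)

# Notebook-style mean: missing ratings count as 0.
mean_all = toy.fillna(0).mean(axis=1)

# Alternative: pandas skips NaN by default, so this averages rated books only.
mean_rated = toy.mean(axis=1)

# Center the observed ratings, then fill the gaps with 0 (the user's average).
centered = toy.sub(mean_rated, axis=0).fillna(0)

print(mean_all.round(3).tolist())
print(mean_rated.tolist())
print(centered)
```

With the second approach a filled-in 0 means "this user's average" rather than "the lowest possible rating", which may give the decomposition a more meaningful baseline.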

In [None]:
# Convert the DataFrame to a NumPy array
user_book_ratings_array = user_book_ratings_normalized.values

In [None]:
# Perform Singular Value Decomposition (SVD)
U, sigma, Vt = svds(user_book_ratings_array, k=75)

In [None]:
# Convert sigma to a diagonal matrix
sigma_diag_matrix = np.diag(sigma)

In [None]:
# Reconstruct the predicted ratings
predicted_ratings = np.dot(np.dot(U, sigma_diag_matrix), Vt) + user_book_mean.values.reshape(-1, 1)
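The reconstruction step can be checked on a small example. The sketch below builds a made-up 4 x 5 matrix of rank 2, factors it with `svds` using `k=2`, and multiplies the factors back together. Since `k` equals the rank here, the product should recover the matrix up to numerical error; with a smaller `k`, or on real ratings, it is only a low-rank approximation. All `toy` names are hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import svds

# A rank-2 matrix built from two outer products (floats are required by svds).
toy_ratings = (
    np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.0, 1.0])
    + np.outer([1.0, -1.0, 1.0, -1.0], [0.0, 1.0, 0.0, 1.0, 0.0])
)

# k must be smaller than both dimensions of the matrix.
U_toy, sigma_toy, Vt_toy = svds(toy_ratings, k=2)

# Rebuild the matrix: U (4x2) times diag(sigma) (2x2) times Vt (2x5).
approx = U_toy @ np.diag(sigma_toy) @ Vt_toy
max_error = np.abs(approx - toy_ratings).max()

print(U_toy.shape, sigma_toy.shape, Vt_toy.shape)
print(max_error)
```

In the notebook the same product is computed on the mean-centered matrix, which is why the user means are added back afterward.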

In [None]:
predicted_ratings

array([[-3.66291025e-03,  8.36394008e-03,  8.36394008e-03, ...,
         9.15019613e-04,  8.36394008e-03,  8.36394008e-03],
       [ 6.31717960e-17,  1.93795290e-19,  1.93795290e-19, ...,
        -8.30226493e-18,  1.93795290e-19,  1.93795290e-19],
       [-9.44089491e-04,  3.41394708e-03,  3.41394708e-03, ...,
         3.68993739e-04,  3.41394708e-03,  3.41394708e-03],
       ...,
       [-2.70776452e-04,  3.69679972e-04,  3.69679972e-04, ...,
         8.62001375e-05,  3.69679972e-04,  3.69679972e-04],
       [ 7.56807131e-18,  4.61518950e-20,  4.61518950e-20, ...,
         1.94723512e-19,  4.61518950e-20,  4.61518950e-20],
       [-6.46861291e-04,  1.21832090e-03,  1.21832090e-03, ...,
         6.99914986e+00,  1.21832090e-03,  1.21832090e-03]])

In [None]:
# Create a DataFrame for predicted ratings
predicted_ratings_df = pd.DataFrame(predicted_ratings,
                                    columns=user_book_ratings.columns,
                                    index=user_book_ratings.index)

In [None]:
# Define a function to get book recommendations for a given user
def get_book_recommendations_svd(user_id, num_recommendations):

    # Get the user's predicted ratings
    user_predicted_ratings = predicted_ratings_df.loc[user_id]

    # Find books that the user has not already rated (stored as 0 in the rating matrix)
    unrated_books = user_book_ratings.loc[user_id][user_book_ratings.loc[user_id] == 0].index

    # Get the predicted ratings for those unrated books and sort them in descending order
    top_rated_books = user_predicted_ratings[unrated_books].sort_values(ascending=False).index

    # Choose the top k books for recommendation
    recommended_books = top_rated_books[:num_recommendations]

    print(f"Recommended books for user {user_id}:")
    for isbn in recommended_books:
        book_title = df[df['ISBN'] == isbn]['Title'].values[0]
        print(f"Book ISBN: {isbn}, Title: {book_title}")

    # The function only prints its results, so it returns None
    print("\n Enjoy reading recommended books!!!")

In [None]:
user_id = int(input("Enter user ID: "))
num_recommendations = int(input("Enter number of recommendations: "))
get_book_recommendations_svd(user_id, num_recommendations)

Enter user ID: 8
Enter number of recommendations: 10
Recommended books for user 8:
Book ISBN: 1560980087, Title: Bertha Lum (American Printmakers)
Book ISBN: 1860462995, Title: The Swan: A Novel
Book ISBN: 1558534245, Title: Sailing on the Ice: And Other Stories from the Old Squire's Farm
Book ISBN: 1573229350, Title: Best Friends
Book ISBN: 1892065444, Title: Cats In Cyberspace
Book ISBN: 1551665999, Title: Driving Lessons (Mira)
Book ISBN: 1931520038, Title: The Mount: A Novel
Book ISBN: 1558612815, Title: Allegra Maud Goldman
Book ISBN: 2070388905, Title: Comme un roman
Book ISBN: 2070408507, Title: Le Petit Prince

 Enjoy reading recommended books!!!
