BookWise: A Python-Based Book Recommender System.
Step into the enchanting world of literary wonders, where you will embark on a captivating journey through the realm of book recommendation systems. Just as avid readers rely on platforms like Goodreads to discover their next literary adventure, you will become a bibliophilic virtuoso, ready to unlock the hidden gems within the vast realm of book data.

Your adventure begins with a mission as vital as setting the stage for a classic novel—cleaning the dataset. Visualize yourself as the curator, painstakingly ensuring that every page is perfectly bound, preparing the data for a literary masterpiece. It's all about creating the perfect canvas for a data-driven literary saga.

Once the dataset gleams with pristine quality, you will delve into the heart of this literary universe—data analysis. Here, you become the book critic, deciphering what makes a book resonate with readers, spotting trends in literary tastes, and unveiling the secrets behind reading preferences. This journey is all about transforming raw data into captivating literary insights.

As the project unfolds, you will emerge as the unsung hero behind the scenes. Just like a bustling team of editors and publishers tirelessly working to craft unforgettable stories, your work will enhance the book discovery experience for avid readers. Your insights will help book enthusiasts discover the perfect reads for every mood, and authors will gain valuable insights to craft better literary experiences.

In this world of data-driven literature, you are the curator, the data wizard, and the storyteller. Your dedication to cleaning, analyzing, and implementing recommendation algorithms adds to the magic of book discovery, making every reading experience extraordinary. Your journey is one of data, literature, and endless possibilities.

This project is your opportunity to explore recommendation algorithms, including collaborative filtering (both user-based and item-based) and content-based recommendations. Armed with the datasets Books.csv, Users.csv, and Ratings.csv, you will study reading preferences and provide book enthusiasts with tailored recommendations. The notebook below cleans the three datasets, builds a popularity ranking from rating counts and average ratings, and then recommends similar books using cosine similarity between the books' rating vectors.

In [None]:
%load_ext sql
%sql mysql+pymysql://bd02905e:Cab#22se@localhost/bd02905e

In [None]:
#--- Import pandas and read in the books dataset ---
import pandas as pd
from pandas import DataFrame
books = pd.read_csv('./Books.csv')
books

In [None]:

#--- Read in the users dataset ---
users = pd.read_csv('./Users.csv')

In [None]:
#--- Read in the ratings dataset (read_csv already returns a DataFrame) ---
ratings = pd.read_csv('./Ratings.csv')

In [None]:

#--- Count missing values per column in books ---
null_values_books = books.isnull().sum()

In [None]:

#--- Count missing values per column in users ---
null_values_users = users.isnull().sum()

In [None]:
#--- Count missing values per column in ratings ---
null_values_ratings =ratings.isnull().sum()
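
The three null checks above return raw counts per column. A minimal sketch of how the counts could be reported next to the share of affected rows, using a made-up toy frame (`toy_books` and its values are hypothetical, not taken from Books.csv):

```python
import pandas as pd

# Hypothetical toy frame standing in for Books.csv; the values are made up.
toy_books = pd.DataFrame({
    "ISBN": ["0001", "0002", "0003"],
    "Book-Title": ["A", "B", None],
    "Book-Author": ["X", None, None],
})

# Number of missing values in each column.
null_counts = toy_books.isnull().sum()

# Percentage of rows that are missing in each column.
null_share = (toy_books.isnull().mean() * 100).round(1)

# Side-by-side summary, one row per column of the original frame.
summary = pd.DataFrame({"missing": null_counts, "percent": null_share})
print(summary)
```

Seeing the percentage alongside the count can help decide whether dropping rows is acceptable or whether a column is too sparse to use.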

In [None]:
#--- Drop every row of books that has at least one missing value ---
books = books.dropna()

In [None]:
#--- Store the publication year as a string ---
books['Year-Of-Publication'] = books['Year-Of-Publication'].astype(str)

In [None]:
#--- Drop the 'Age' column from users ---
users = users.drop(columns=['Age'])

In [None]:
#--- Attach book details to every rating by joining on ISBN (inner join) ---
ratings_with_name = pd.merge(ratings, books, on='ISBN')

#--- Inspect data ---
ratings_with_name
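
`pd.merge` defaults to an inner join, so ratings whose ISBN has no match in `books` are dropped without any message. The sketch below, on made-up toy frames (all names and values are hypothetical), shows one way to see how many ratings are lost, using the `indicator` option of a left join:

```python
import pandas as pd

# Hypothetical toy data; the ISBN "9999" has no matching book.
toy_ratings = pd.DataFrame({
    "User-ID": [1, 1, 2, 3],
    "ISBN": ["0001", "0002", "0001", "9999"],
    "Book-Rating": [8, 0, 10, 7],
})
toy_books = pd.DataFrame({
    "ISBN": ["0001", "0002"],
    "Book-Title": ["A", "B"],
})

# Inner join (the default): only ratings with a matching book survive.
merged = pd.merge(toy_ratings, toy_books, on="ISBN")

# Left join with an indicator column marks the ratings that found no book.
checked = pd.merge(toy_ratings, toy_books, on="ISBN", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]

print(len(toy_ratings), len(merged), len(unmatched))
```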

In [None]:
#--- Count how many ratings each title received ---
num_rating_df = ratings_with_name.groupby('Book-Title')['Book-Rating'].count().reset_index(name='num_ratings')



#--- Inspect data ---

num_rating_df

In [None]:
#--- Compute the average rating of each title, rounded to two decimals ---
avg_rating_df = ratings_with_name.groupby('Book-Title')['Book-Rating'].mean().reset_index(name='avg_rating').round(2)


#--- Inspect data ---
avg_rating_df

In [None]:
#--- Combine rating counts and average ratings into one table per title ---
popular_df = pd.merge(num_rating_df, avg_rating_df, on='Book-Title')

#--- Inspect data ---
popular_df
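
The count, the mean, and the merge above can also be expressed as a single grouped aggregation. This is a sketch of an alternative, not the notebook's method, shown on a made-up toy frame (`toy` and `popular_toy` are hypothetical names):

```python
import pandas as pd

# Hypothetical toy ratings: two ratings for "A", three for "B".
toy = pd.DataFrame({
    "Book-Title": ["A", "A", "B", "B", "B"],
    "Book-Rating": [8, 10, 5, 6, 7],
})

# Named aggregation computes both statistics in one groupby pass,
# so no separate merge on 'Book-Title' is needed afterwards.
popular_toy = (
    toy.groupby("Book-Title")["Book-Rating"]
    .agg(num_ratings="count", avg_rating="mean")
    .round(2)
    .reset_index()
)
print(popular_toy)
```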

In [None]:
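#--- Keep titles with at least 250 ratings ---
no_of_ratings_df = popular_df.loc[popular_df['num_ratings'] >= 250]

sorted_df = no_of_ratings_df.sort_values(by='avg_rating', ascending=False)

#--- Attach book details; an inner merge follows the row order of 'books', not of 'sorted_df' ---
merge_of = books.merge(sorted_df, on='Book-Title', how='inner')

#--- Keep one row per title, then re-sort so the highest-rated titles come first ---
remove_duplicate_df = merge_of.drop_duplicates(subset='Book-Title').sort_values(by='avg_rating', ascending=False)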
#--- Inspect data ---
remove_duplicate_df

In [None]:
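#--- Keep only ratings from active users (more than 200 ratings each) ---
user_rating_counts = ratings_with_name.groupby('User-ID')['Book-Rating'].count()
# 'padhe_likhe' is Hindi for 'well-read': the IDs of users with more than 200 ratings
padhe_likhe_users = user_rating_counts[user_rating_counts > 200].index
filtered_rating = ratings_with_name[ratings_with_name['User-ID'].isin(padhe_likhe_users)]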


#--- Inspect data ---
filtered_rating
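
The same "count, threshold, filter" pattern is applied first to users and then to books. A minimal sketch of a reusable helper, assuming a hypothetical function name `filter_by_count` and made-up toy data; note that it uses `>=`, so the notebook's `> 200` would correspond to `min_count=201`:

```python
import pandas as pd

def filter_by_count(df, key, min_count):
    """Keep the rows whose value in column `key` occurs at least `min_count` times."""
    # value_counts gives occurrences per key; map broadcasts them back to rows.
    counts = df[key].map(df[key].value_counts())
    return df[counts >= min_count]

# Hypothetical toy ratings: user 1 rated three books, user 2 two, user 3 one.
toy = pd.DataFrame({
    "User-ID": [1, 1, 1, 2, 2, 3],
    "Book-Title": ["A", "B", "C", "A", "B", "A"],
})

# First keep users with at least 2 ratings, then books with at least 2 ratings
# among those remaining users.
active = filter_by_count(toy, "User-ID", 2)
final = filter_by_count(active, "Book-Title", 2)
print(final)
```

The order of the two filters matters: book counts are taken after inactive users have been removed, as in the cells of this notebook.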

In [None]:

#--- Keep books with at least 50 ratings from the active users ---
book_rating_counts = filtered_rating.groupby('Book-Title')['Book-Rating'].count()
famous_books = book_rating_counts[book_rating_counts >= 50].index
final_ratings = filtered_rating[filtered_rating['Book-Title'].isin(famous_books)]

#--- Build the book x user rating matrix; missing (book, user) pairs become 0 ---
pt = final_ratings.pivot_table(index='Book-Title', columns='User-ID', values='Book-Rating', fill_value=0)
pt = pt.astype(float)

#--- Inspect data ---
pt
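
To make the shape of `pt` concrete, here is a small sketch on made-up toy ratings (`toy` and `pt_toy` are hypothetical). Each row is a book, each column is a user, and `fill_value=0` fills the pairs with no rating; `pivot_table` averages duplicates by default if the same user rated the same title more than once:

```python
import pandas as pd

# Hypothetical toy ratings: three books, three users, five ratings.
toy = pd.DataFrame({
    "Book-Title": ["A", "A", "B", "C", "C"],
    "User-ID": [1, 2, 1, 2, 3],
    "Book-Rating": [8, 6, 9, 7, 10],
})

# Rows are books, columns are users, cells hold the rating (0 if absent).
pt_toy = toy.pivot_table(
    index="Book-Title", columns="User-ID", values="Book-Rating", fill_value=0
).astype(float)
print(pt_toy)
```

One consequence of this layout is that a 0 in the matrix cannot be told apart from a rating that is genuinely 0 in the source data.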

In [None]:
from sklearn.metrics.pairwise import cosine_similarity
#--- Compute the cosine similarity between every pair of books (rows of pt) ---
similarity_scores = cosine_similarity(pt)
similarity_scores_df = pd.DataFrame(similarity_scores, index=pt.index, columns=pt.index)

# Inspect the similarity scores
similarity_scores_df
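
For intuition about what the similarity matrix holds, the sketch below computes cosine similarity directly with NumPy on a made-up 3 x 3 rating matrix. The function name `cosine_sim_matrix` is hypothetical, and this is an illustration of the formula (dot product divided by the product of the row norms), not a replacement for the scikit-learn call:

```python
import numpy as np

def cosine_sim_matrix(m):
    """Pairwise cosine similarity between the rows of a 2-D array."""
    m = np.asarray(m, dtype=float)
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zeros instead of dividing by 0
    unit = m / norms         # scale every row to unit length
    return unit @ unit.T     # dot products of unit vectors are the cosines

# Hypothetical book x user ratings (rows: books, columns: users).
toy_matrix = np.array([
    [8, 6, 0],
    [9, 0, 0],
    [0, 7, 10],
])
sim = cosine_sim_matrix(toy_matrix)
print(sim.round(3))
```

Books 1 and 2 share one rater, so their similarity is 72 / (10 * 9) = 0.8; books 2 and 3 share no raters, so their similarity is 0.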


In [None]:
# Step 1: Locate the query book's row in the pivot table
book_name = 'Harry Potter and the Prisoner of Azkaban (Book 3)'
book_index = pt.index.get_loc(book_name)

# Step 2: Rank all books by similarity to the query, highest first.
# Position 0 is assumed to be the query itself (similarity 1), so it is skipped.
similar_items = sorted(list(enumerate(similarity_scores[book_index])), key=lambda x: x[1], reverse=True)[1:5]

# Step 3: Extract the indices of the top 4 similar books
top_4_indices = [item[0] for item in similar_items]

# Step 4: Get the titles of the top 4 similar books
top_4_books = pt.index[top_4_indices]

# Step 5: Provide book details ('Book-Title', 'Book-Author', 'Image-URL-M').
# The 'data' DataFrame below is hard-coded for a handful of candidate titles.
data = pd.DataFrame({
    'Book-Title': [
        'Harry Potter and the Goblet of Fire (Book 4)',
        'Harry Potter and the Chamber of Secrets (Book 2)',
        'Harry Potter and the Order of the Phoenix (Book 5)',
        'Harry Potter and the Sorcerer\'s Stone (Book 1)',
        'The Hobbit',
        'The Lord of the Rings'
    ],
    'Book-Author': [
        'J. K. Rowling',
        'J. K. Rowling',
        'J. K. Rowling',
        'J. K. Rowling',
        'J. R. R. Tolkien',
        'J. R. R. Tolkien'
    ],
    'Image-URL-M': [
        'http://images.amazon.com/images/P/0439139597.01.MZZZZZZZ.jpg',
        'http://images.amazon.com/images/P/0439064872.01.MZZZZZZZ.jpg',
        'http://images.amazon.com/images/P/043935806X.01.MZZZZZZZ.jpg',
        'http://images.amazon.com/images/P/0590353403.01.MZZZZZZZ.jpg',
        'http://images.amazon.com/images/P/0261102389.01.MZZZZZZZ.jpg',
        'http://images.amazon.com/images/P/0261102354.01.MZZZZZZZ.jpg'
    ]
})

# Keep only the recommended titles that have details in 'data'
book_details = data[data['Book-Title'].isin(top_4_books)][['Book-Title', 'Book-Author', 'Image-URL-M']]

# Restore the similarity ranking: index by title, then select rows in top_4_books order.
# Note: .loc raises a KeyError if any title in top_4_books is missing from 'data'.
book_details = book_details.set_index('Book-Title')
df = book_details.loc[top_4_books].reset_index()


In [None]:

# Inspect the DataFrame
df
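
The recommendation steps above can be sketched as one reusable function. This is a minimal sketch under stated assumptions, not the notebook's implementation: the function name `recommend`, its parameters, and the toy data are all hypothetical. It differs from the cells above in two ways: it excludes the query book by index rather than assuming it sorts first, and it uses `reindex` so that titles without details yield empty fields instead of an error.

```python
import numpy as np
import pandas as pd

def recommend(title, pivot, sim, details, k=4):
    """Return the k titles most similar to `title`, joined with their details."""
    idx = pivot.index.get_loc(title)
    # Sort by descending similarity and drop the query book itself by position.
    ranked = [i for i in np.argsort(-sim[idx], kind="stable") if i != idx][:k]
    titles = pivot.index[ranked]
    info = details.drop_duplicates("Book-Title").set_index("Book-Title")
    # reindex keeps the ranking order; titles without details get NaN fields.
    return info.reindex(titles).reset_index()

# Hypothetical book x user matrix and details table (values are made up).
pivot = pd.DataFrame(
    [[8, 6, 0], [9, 0, 0], [0, 7, 10], [8, 6, 1]],
    index=pd.Index(["A", "B", "C", "D"], name="Book-Title"),
    columns=[1, 2, 3],
    dtype=float,
)
unit = pivot.values / np.linalg.norm(pivot.values, axis=1, keepdims=True)
sim = unit @ unit.T  # cosine similarity between rows

details = pd.DataFrame({
    "Book-Title": ["A", "B", "D"],
    "Book-Author": ["Author A", "Author B", "Author D"],
})

result = recommend("A", pivot, sim, details, k=3)
print(result)
```

In the notebook itself, the `details` argument could be the `books` DataFrame restricted to the columns of interest, which would avoid maintaining a hand-written table.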