# Item-Based Recommender

Using the book dataset, the function `item_based_recommender` returns the `n` books most similar to the book with a given `isbn`.

Cosine similarity is computed for every pair of books. A book is recommended only if more than 5 users have rated both it and the target book.
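
As a rough illustration of the measure, the sketch below computes the cosine similarity of two rating vectors by hand with NumPy and counts the users who rated both books. The vectors `book_a` and `book_b` are made-up values, not taken from the dataset, and a rating of 0 is assumed to mean "not rated".

```python
import numpy as np

# Hypothetical ratings: one entry per user, 0 means the user did not rate the book
book_a = np.array([5.0, 0.0, 3.0, 4.0])
book_b = np.array([4.0, 0.0, 0.0, 5.0])

# Cosine similarity: dot product divided by the product of the vector norms
similarity = book_a @ book_b / (np.linalg.norm(book_a) * np.linalg.norm(book_b))

# Number of users who rated both books
co_raters = int(((book_a > 0) & (book_b > 0)).sum())

print(round(similarity, 4), co_raters)
```

A high similarity backed by only a handful of shared raters is weak evidence, which is why the co-rater count is used as a filter.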

In [15]:
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

In [None]:
books = pd.read_csv("../data/books/clean/books.csv", dtype="object")
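ratings = pd.read_csv("../data/books/clean/ratings.csv", dtype="object")
# Ratings are read as strings; convert them to numbers so they can be aggregated and compared
ratings["Book-Rating"] = pd.to_numeric(ratings["Book-Rating"])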

## Create item-based recommender

In [19]:
# Create a user-item matrix
user_item_matrix = pd.pivot_table(
    data=ratings, values="Book-Rating", index="User-ID", columns="ISBN", fill_value=0
)

# Compute the cosine similarity between the books
items_cosines_matrix = pd.DataFrame(
    cosine_similarity(user_item_matrix.T), columns=user_item_matrix.columns,
    index=user_item_matrix.columns
)
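
To make the two steps above concrete, here is a small sketch on made-up data. `toy_ratings` is a hypothetical stand-in for `ratings`, and the similarity is computed with plain NumPy by normalising each item vector. This should give the same values as `cosine_similarity` on the transposed matrix, but it is shown only as an illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings in the same long format as ratings.csv
toy_ratings = pd.DataFrame({
    "User-ID": ["u1", "u1", "u2", "u3", "u3"],
    "ISBN": ["A", "B", "A", "B", "C"],
    "Book-Rating": [8, 6, 10, 7, 9],
})

# Users become rows, books become columns, missing ratings are filled with 0
toy_matrix = pd.pivot_table(
    data=toy_ratings, values="Book-Rating", index="User-ID", columns="ISBN", fill_value=0
)

# One row per book, scaled to unit length; the dot products are then the cosines
item_vectors = toy_matrix.T.to_numpy(dtype=float)
unit_vectors = item_vectors / np.linalg.norm(item_vectors, axis=1, keepdims=True)
toy_similarity = pd.DataFrame(
    unit_vectors @ unit_vectors.T, index=toy_matrix.columns, columns=toy_matrix.columns
)

print(toy_matrix)
print(toy_similarity.round(3))
```

Because unrated books are filled with 0, two books with no raters in common (here `A` and `C`) get a similarity of 0.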

The main function is below.

In [21]:
def item_based_recommender(isbn, n):
    """
    Recommends the n most similar books for a given book

    Parameters
    ----------
    isbn : str
        ISBN of the book to recommend similar books for

    n : int
        Number of books to recommend

    Returns
    -------
    pd.DataFrame
        DataFrame with the top n most similar books
    """
    # Create a DataFrame using the values from 'items_cosines_matrix' for the requested book
    target_cosines_df = pd.DataFrame(items_cosines_matrix[isbn])

    # Rename the column 'isbn' to 'target_cosine'
    target_cosines_df = target_cosines_df.rename(columns={isbn: 'target_cosine'})

    # Remove the row with the index 'isbn'
    target_cosines_df = target_cosines_df[target_cosines_df.index != isbn]

    # Sort 'target_cosines_df' by the 'target_cosine' column in descending order
    target_cosines_df = target_cosines_df.sort_values(by="target_cosine", ascending=False)

    # Count the users who rated both the target book and each other book
    no_of_users_rated_both_books = [sum((user_item_matrix[isbn] > 0) &
                                        (user_item_matrix[other_isbn] > 0))
                                    for other_isbn in target_cosines_df.index]

    # Create a column for the number of users who rated the target book and the other book
    target_cosines_df['users_who_rated_both_books'] = no_of_users_rated_both_books

    # Keep only recommendations where more than 5 users rated both books
    target_cosines_df = target_cosines_df[target_cosines_df["users_who_rated_both_books"] > 5]

    # Combine with the titles and authors, joining explicitly on the ISBN column
    target_top_cosine = target_cosines_df.head(n).reset_index().merge(books, how='left', on='ISBN')

    # Selecting specific columns from the merged DataFrame to include in the final result
    target_top_cosine = target_top_cosine[[
        "ISBN",
        "Book-Title",
        "Book-Author",
        "Year-Of-Publication",
    ]]

    return target_top_cosine
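
The co-rating count inside the function loops over every other book in Python, which can be slow on a wide matrix. One possible alternative, sketched here on a made-up matrix, is to turn the ratings into 0/1 flags and take a single dot product. The helper `co_rating_counts` and the data in `toy_matrix` are hypothetical and not part of the notebook.

```python
import pandas as pd

# Hypothetical user-item matrix: rows are users, columns are ISBNs, 0 = not rated
toy_matrix = pd.DataFrame(
    {"A": [8, 10, 0, 5], "B": [6, 0, 7, 9], "C": [0, 0, 9, 4]},
    index=["u1", "u2", "u3", "u4"],
)


def co_rating_counts(matrix, isbn):
    """Count, for every other book, the users who rated both it and `isbn`."""
    # 1 where a user rated the book, 0 otherwise
    rated = (matrix > 0).astype(int)
    # The dot product of two 0/1 columns counts the users with a 1 in both
    counts = rated.T.dot(rated[isbn])
    # Drop the target book itself
    return counts.drop(isbn)


counts = co_rating_counts(toy_matrix, "A")
print(counts)
```

The result is a Series indexed by ISBN, so it could be assigned to a column of a DataFrame that shares the same index.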

Example usage: the 10 books most similar to the second Harry Potter book (ISBN 0439064872).

In [22]:
harry_potter_book2_isbn = "0439064872"

item_based_recommender(harry_potter_book2_isbn, 10)

| | ISBN | Book-Title | Book-Author | Year-Of-Publication |
|---|---|---|---|---|
| 0 | 059035342X | Harry Potter and the Sorcerer's Stone (Harry P... | J. K. Rowling | 1999 |
| 1 | 043935806X | Harry Potter and the Order of the Phoenix (Boo... | J. K. Rowling | 2003 |
| 2 | 0439139597 | Harry Potter and the Goblet of Fire (Book 4) | J. K. Rowling | 2000 |
| 3 | 0439136350 | Harry Potter and the Prisoner of Azkaban (Book 3) | J. K. Rowling | 1999 |
| 4 | 0064400557 | Charlotte's Web (Trophy Newbery) | E. B. White | 1974 |
| 5 | 0553284789 | F Is for Fugitive (Kinsey Millhone Mysteries (... | Sue Grafton | 1990 |
| 6 | 0345339703 | The Fellowship of the Ring (The Lord of the Ri... | J.R.R. TOLKIEN | 1986 |
| 7 | 0060928336 | Divine Secrets of the Ya-Ya Sisterhood: A Novel | Rebecca Wells | 1997 |
| 8 | 0842329129 | Left Behind: A Novel of the Earth's Last Days ... | Tim Lahaye | 1996 |
| 9 | 0553280341 | B Is for Burglar (Kinsey Millhone Mysteries (P... | Sue Grafton | 1986 |
