# Popularity Recommender

Using the books dataset, the function `popularity_recommender` returns the `n` most popular books.

A book is considered only if it has more than 50 user ratings (the current threshold); qualifying books are then ranked by their mean rating.
To keep the results diverse, at most one book per author is returned. (This is mainly because the Harry Potter and Lord of the Rings franchises would otherwise dominate the ranking.)

In [1]:
import pandas as pd

In [3]:
books = pd.read_csv("../data/books/clean/books.csv", dtype="object")
ratings = pd.read_csv("../data/books/clean/ratings.csv", dtype="object")

## Create popularity recommender

In [5]:
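# Create a minimal DataFrame containing the mean and count of ratings per ISBN.
# Everything was read as strings (dtype="object"), so convert ratings to numbers first.
rating_count = ratings.drop(columns="User-ID")
rating_count["Book-Rating"] = pd.to_numeric(rating_count["Book-Rating"])
rating_count = rating_count.groupby("ISBN")["Book-Rating"].agg(["mean", "count"]).reset_index()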
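
To make the shape of `rating_count` concrete, here is a minimal sketch on a made-up ratings table (the ISBNs and ratings are invented for illustration, not taken from the dataset). Since the CSV files are read with `dtype="object"`, the sketch assumes the ratings arrive as strings and converts them with `pd.to_numeric` before aggregating.

```python
import pandas as pd

# Toy ratings table; values are strings, as they would be after reading with dtype="object"
toy_ratings = pd.DataFrame({
    "User-ID": ["1", "2", "3", "1", "2"],
    "ISBN": ["A", "A", "A", "B", "B"],
    "Book-Rating": ["10", "8", "6", "4", "5"],
})

# Convert ratings to numbers so that the mean is computed arithmetically
toy_ratings["Book-Rating"] = pd.to_numeric(toy_ratings["Book-Rating"])

# One row per ISBN with the mean rating and the number of ratings
toy_rating_count = (
    toy_ratings.drop(columns="User-ID")
    .groupby("ISBN")["Book-Rating"]
    .agg(["mean", "count"])
    .reset_index()
)
print(toy_rating_count)
```

In this toy table, book `A` has three ratings with a mean of 8.0 and book `B` has two ratings with a mean of 4.5.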

The main function is below.

In [6]:
def popularity_recommender(n):
    """
    Recommends the n most popular books.

    Parameters
    ----------
    n : integer
        Number of books to recommend.

    Returns
    -------
    pd.DataFrame
        DataFrame with the top n most popular books.
    """
    count_threshold = 50

    # Keep only books with more than count_threshold ratings
    mask = rating_count["count"] > count_threshold

    # Get the best rated books sorted in descending order of their mean rating
    top_rated = rating_count[mask].sort_values("mean", ascending=False)

    # Attach book metadata (title, author, year) to the ratings by ISBN
    top_rated_books = top_rated.merge(books, on="ISBN").drop(columns=["mean", "count"])

    # Ensure diverse results by only taking one book per author
    top_rated_books = top_rated_books.drop_duplicates(subset=["Book-Author"])

    # Grab the top n books
    top_rated_books = top_rated_books.head(n).reset_index(drop=True)

    # Keep only the columns needed in the final result
    top_rated_books = top_rated_books[[
        "ISBN",
        "Book-Title",
        "Book-Author",
        "Year-Of-Publication",
    ]]

    return top_rated_books
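
The steps inside `popularity_recommender` can be traced on a small in-memory example. This is only a sketch: the toy tables, the helper name `toy_recommender`, and the `count_threshold` parameter (exposed here so the toy data can use a threshold of 1) are assumptions for illustration and not part of the notebook.

```python
import pandas as pd

# Made-up aggregated ratings and book metadata (not from the real dataset)
toy_rating_count = pd.DataFrame({
    "ISBN": ["A", "B", "C", "D"],
    "mean": [9.0, 8.5, 8.0, 9.5],
    "count": [3, 4, 2, 1],
})
toy_books = pd.DataFrame({
    "ISBN": ["A", "B", "C", "D"],
    "Book-Title": ["Book A", "Book B", "Book C", "Book D"],
    "Book-Author": ["Author X", "Author X", "Author Y", "Author Z"],
    "Year-Of-Publication": ["1990", "1995", "2000", "2005"],
})

def toy_recommender(n, count_threshold=1):
    # Hypothetical variant of popularity_recommender with the threshold as a parameter
    mask = toy_rating_count["count"] > count_threshold
    top = toy_rating_count[mask].sort_values("mean", ascending=False)
    top = top.merge(toy_books, on="ISBN")
    # Keep only the best-rated book of each author
    top = top.drop_duplicates(subset=["Book-Author"])
    columns = ["ISBN", "Book-Title", "Book-Author", "Year-Of-Publication"]
    return top.head(n).reset_index(drop=True)[columns]

result = toy_recommender(2)
print(result)
```

Book D has the highest mean but only one rating, so the threshold filters it out. Book B is dropped because Author X is already represented by the higher-rated Book A, which leaves Book A and Book C as the two recommendations.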

Example usage: the call below returns the 10 most popular books in the dataset.

In [7]:
popularity_recommender(10)

| | ISBN | Book-Title | Book-Author | Year-Of-Publication |
|---|---|---|---|---|
| 0 | 345339738 | The Return of the King (The Lord of the Rings,... | J.R.R. TOLKIEN | 1986 |
| 1 | 439139597 | Harry Potter and the Goblet of Fire (Book 4) | J. K. Rowling | 2000 |
| 2 | 446310786 | To Kill a Mockingbird | Harper Lee | 1988 |
| 3 | 441172717 | Dune (Remembering Tomorrow) | Frank Herbert | 1996 |
| 4 | 451524934 | 1984 | George Orwell | 1990 |
| 5 | 812550706 | Ender's Game (Ender Wiggins Saga (Paperback)) | Orson Scott Card | 1994 |
| 6 | 440498058 | A Wrinkle In Time | MADELEINE L'ENGLE | 1998 |
| 7 | 553296981 | Anne Frank: The Diary of a Young Girl | ANNE FRANK | 1993 |
| 8 | 345348036 | The Princess Bride: S Morgenstern's Classic Ta... | WILLIAM GOLDMAN | 1987 |
| 9 | 345342968 | Fahrenheit 451 | RAY BRADBURY | 1987 |
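
The author column in this output mixes casing and spacing styles (`J.R.R. TOLKIEN`, `J. K. Rowling`), and `drop_duplicates` compares strings exactly, so two spellings of the same author would count as different authors. One possible mitigation, sketched below, is to deduplicate on a normalized key. The helper `normalize_author` and the sample rows are hypothetical; whether the cleaned dataset actually contains such variants is not something this notebook verifies.

```python
import pandas as pd

def normalize_author(name):
    # Hypothetical helper: lowercase and drop everything except letters and digits,
    # so "J.R.R. TOLKIEN" and "J. R. R. Tolkien" map to the same key
    return "".join(ch for ch in name.lower() if ch.isalnum())

# Made-up rows, assumed to be already sorted by mean rating in descending order
toy = pd.DataFrame({
    "Book-Title": ["Title 1", "Title 2", "Title 3"],
    "Book-Author": ["J.R.R. TOLKIEN", "J. R. R. Tolkien", "Harper Lee"],
})

# Deduplicate on the normalized key, then drop the helper column
toy["author_key"] = toy["Book-Author"].map(normalize_author)
deduped = toy.drop_duplicates(subset=["author_key"]).drop(columns="author_key")
print(deduped)
```

Only the first row for each normalized author is kept, so the original spelling of the highest-ranked book is what appears in the result.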
