# A Simple Book Recommendation System

###### Done by __Safae HAJJOUT (@Ariyes)__ and __Mounia BADDOU (@MTheCreator)__ for Algorithmics class of 2022 / 2023.
###### Mohammed VI Polytechnic University, School of Computer Science, 1st year.


### ______________________________

##### This is a walkthrough of our simple recommendation system, implemented using Python 3.10 and primitive data structures such as linked lists, arrays and some built-in Python structures (lists, dictionaries, ...).

##### The goal of this recommender system is to give the most accurate recommendations possible based on multiple criteria: titles, ratings, genres, descriptions and others. We also used some libraries to ease the parts of the recommendation process that involve natural language processing (NLP).

#### We will now go through the purpose of every chunk of code. Happy reading :)

### _____________________________

#### __First, import what is needed!__

In [2]:
# Every library needed by our code can be installed from here!
#! pip install --upgrade click
#! pip install nltk
#! pip install spacy
#! python -m spacy download en_core_web_sm
# json is part of the Python standard library, so it needs no installation

In [3]:
import spacy
import json
import nltk
from nltk.corpus import stopwords
from useful_classes import Node, LinkedList, Queue, heapsort

nlp = spacy.load("en_core_web_sm")
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\hp\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

#### _______________________________

#### We can now load our book-related data

In [4]:
with open('books.json') as file:
    books = json.load(file)

#### The following function will come in handy in the next chunks of code.

In [None]:
def change_value(queue: Queue, old_value, new_value):
    """
    Modifies a given queue by replacing occurrences of an old value with a new value.

    Args:
        queue (Queue): The queue to be modified.
        old_value: The value to be replaced.
        new_value: The new value to replace the old value with.

    Returns:
        None: This function modifies the queue in-place.

    """
    temp_queue = Queue()

    # Move every element to a temporary queue, replacing matches on the way
    while len(queue) > 0:
        element = queue.dequeue()
        if element == old_value:
            # Update the value
            element = new_value
        temp_queue.enqueue(element)

    # Enqueue the elements back into the original queue, in their original order
    while len(temp_queue) > 0:
        queue.enqueue(temp_queue.dequeue())
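
##### A minimal sketch of the same replacement done without a temporary queue, assuming only that the queue offers `enqueue`, `dequeue` and `len()`. The `Queue` class below is a hypothetical stand-in (the one from `useful_classes` is not shown here), and `replace_in_queue` is an illustrative name, not part of our code.

```python
from collections import deque


class Queue:
    """Hypothetical stand-in for the Queue in useful_classes (FIFO)."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


def replace_in_queue(queue, old_value, new_value):
    """Replace every occurrence of old_value, keeping the FIFO order."""
    # Rotate the queue exactly once: each element leaves the front and
    # re-enters at the back, so the final order matches the initial one.
    for _ in range(len(queue)):
        element = queue.dequeue()
        queue.enqueue(new_value if element == old_value else element)


library = Queue()
for entry in [("Dune", 0), ("Emma", 0), ("Ulysses", 0)]:
    library.enqueue(entry)

replace_in_queue(library, ("Emma", 0), ("Emma", 4))
contents = [library.dequeue() for _ in range(len(library))]
print(contents)
```

##### Rotating the queue once visits every element, so the cost stays linear in the size of the queue while avoiding the second queue.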

#### _________________________________

## __Book Class__

##### One of our main classes. It holds every valuable piece of information about each book in our dataset as attributes (author, desc, title, ...).
##### This class also contains two methods: `__str__` and `similar_books`. The first serves as a display function, and the second as a recommendation method based on the rankings given by our users.
##### More information in our docstrings!

In [5]:
class Book:
    def __init__(self, author: str, desc: str, genre: list, isbn: str, pages: int, rating: int, title: str, totalratings: int):
        self.author = author                    # Author of the book
        self.desc = desc                        # Short description of the content of the book
        self.genre = genre                      # List of the genres the book belongs to
        self.isbn = isbn                        # ID of the book, used to ease access to the Book objects
        self.pages = pages                      # Number of pages in the book, another criterion for recommendation
        self.rating = rating                    # Rating out of 5, with 5 being the most recommendable and 0 the least
        self.title = title                      # Title of the book
        self.totalratings = totalratings        # Total number of ratings given to the book, a measure of popularity
        self.rate = self.rating * self.totalratings # Rate used for sorting
        
    def __str__(self):
        """ A display method where the __str__ function was overridden to meet the needs of our recommendation system. """
        return self.title + " by " + self.author + "\trate: " + str(self.rate)
    
    def similar_books(self, data: dict):
        """
        Recommends similar books based on the provided data.

        Args:
            data (dict): A dictionary containing book data, where the keys are ISBNs and the values are Book objects.

        Returns:
            list: A list of Book objects representing similar books.

        """
        pass
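
##### A small worked example of the `rate` attribute, which multiplies the average rating by the number of ratings. The numbers are made up for illustration and do not come from our dataset.

```python
# Made-up numbers, only to illustrate the formula rate = rating * totalratings.
niche_book_rate = 5.0 * 10        # perfect average, but only 10 ratings
popular_book_rate = 4.0 * 2000    # lower average, many more ratings

print(niche_book_rate)    # 50.0
print(popular_book_rate)  # 8000.0
```

##### With this formula, a widely rated book ranks above a rarely rated one even when its average rating is lower, so the sort favours popularity.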

##### In the following chunk of code, we store our Book objects in a Python list and sort it with a heapsort, to ease recommendations based on ratings.
##### After sorting, the index of a book in the list represents its rank.

In [6]:
books_list = []
for book in books:
    author, desc, genre, isbn, pages, rating, title, totalratings = list(
        book.values())
    obj = Book(author, desc, list({g.lower() for g in genre.split(',')}), isbn, pages, rating, title, totalratings)
    books_list.append(obj)

# sort books by rate with heapsort (the methods below read the list from the end to get the highest rates first)
heapsort(books_list)
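
##### The `heapsort` we import from `useful_classes` is not shown in this walkthrough. As a rough sketch of the technique only, here is an in-place heapsort that orders items by a key in ascending order. The name `heapsort_by` and its `key` parameter are assumptions made for this example, not the signature of our function.

```python
def heapsort_by(items, key):
    """Sort items in place in ascending order of key(item) using a max-heap."""

    def sift_down(root, end):
        # Push items[root] down until it is not smaller than its children
        # (only indices below `end` belong to the heap).
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and key(items[child]) < key(items[child + 1]):
                child += 1
            if key(items[root]) >= key(items[child]):
                return
            items[root], items[child] = items[child], items[root]
            root = child

    n = len(items)
    # Build the max-heap from the last parent up to the root.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n)
    # Repeatedly move the current maximum to the end of the unsorted part.
    for end in range(n - 1, 0, -1):
        items[0], items[end] = items[end], items[0]
        sift_down(0, end)


rates = [("A", 50.0), ("B", 8000.0), ("C", 1200.0)]
heapsort_by(rates, key=lambda pair: pair[1])
print(rates)
```

##### Heapsort runs in O(n log n) time and needs no extra list, which suits a dataset that is sorted once and then only read.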

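##### In this chunk, we tokenize the title and description of every book to ease the recommendation: words are lowercased, and English stopwords (taken from NLTK) and non-alphanumeric tokens are dropped.
##### This reduces the number of words we use to recommend. Stemming and lemmatization would shrink the vocabulary further, but they are not applied in this cell.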

In [None]:
books_dict = {}
stopwords_set = set(stopwords.words('english'))
for book in books_list:
    tokenized_title = list({word.lower() for word in book.title.split(' ') if word.lower() not in stopwords_set and word.isalnum()})
    tokenized_desc = list({word.lower() for word in book.desc.split(' ') if word.lower() not in stopwords_set and word.isalnum()})
    books_dict[book] = set(tokenized_title + book.genre + tokenized_desc)
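
##### One possible way to use such keyword sets is to compare them with the Jaccard similarity (size of the intersection divided by size of the union). This is only a sketch under that assumption: `jaccard` and `most_similar` are illustrative names, the toy data is made up, and this is not the body of our `similar_books` method.

```python
def jaccard(a: set, b: set) -> float:
    """Share of keywords two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(target, keyword_sets: dict, top_n: int = 3) -> list:
    """Return up to top_n keys whose keyword sets best overlap with target's."""
    scores = [
        (jaccard(keyword_sets[target], words), other)
        for other, words in keyword_sets.items()
        if other != target
    ]
    # Highest score first; the sort is stable, so ties keep insertion order.
    scores.sort(key=lambda pair: -pair[0])
    return [other for score, other in scores[:top_n] if score > 0]


# Toy keyword sets keyed by title (made-up data, not from books.json).
toy_sets = {
    "Space Saga": {"space", "war", "fiction", "empire"},
    "Star Empire": {"space", "empire", "fiction", "rebels"},
    "Garden Notes": {"plants", "soil", "gardening"},
}
print(most_similar("Space Saga", toy_sets))
```

##### Here the two space titles share 3 of 5 distinct keywords (similarity 0.6), while the gardening title shares none and is left out.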

#### _______________________________

## __User Class__

##### Our second class for this code, where we store every piece of information related to the user, such as the id, name, library, wishlist and others.

##### Almost every recommendation is made through the User class, since this makes it easy to produce recommendations for each individual user.

##### The methods of this class are explained in the docstrings that accompany the code.

In [None]:
class user:
    def __init__(self, id: int, name: str, fav_genres: LinkedList, library: Queue, wishlist: Queue):
        """
        Initializes a User object with the provided attributes.

        Args:
            id (int): The ID of the user.
            name (str): The name of the user.
            fav_genres (LinkedList): A linked list containing the user's favorite genres.
            library (Queue): A queue representing the user's library of books.
            wishlist (Queue): A queue representing the user's wishlist of books.

        """
        self.id = id
        self.name = name
        self.fav_genres = fav_genres
        self.library = library
        self.wishlist = wishlist

    def read_book(self, book: Book):
        """
            Adds a book to the user's library.

            Args:
                book (Book): The book to be added to the library.

            Returns:
                None

        """
        self.library.enqueue((book, 0))
        print(f"the book {book.title} was added successfully to your library ... ")

    def rate_book(self, book: Book, rate: int):
        """
        Rates a book in the user's library.

        Args:
            book (Book): The book to be rated.
            rate (int): The rating to assign to the book.

        Returns:
            None

        """
        change_value(self.library, (book, 0), (book, rate))
        print(f"You have rated the book {book.title} by {book.author}: {rate} ... ")

    def add_to_wishlist(self, book: Book):
        """
        Adds a book to the user's wishlist.

        Args:
            book (Book): The book to be added to the wishlist.

        Returns:
            None

        """
        self.wishlist.enqueue(book)

    def search_author(self, author: str) -> LinkedList:
        """
        Searches for books by a specific author.

        Args:
            author (str): The author's name to search for.

        Returns:
            LinkedList: A linked list of books written by the specified author.

        """
        books = LinkedList()
        for book in books_list:
            if author in book.author:
                books.prepend(book)
        return books

    def search_book(self, title: str) -> LinkedList:
        """
        Searches for books with a specific title.

        Args:
            title (str): The title of the book to search for.

        Returns:
            LinkedList: A linked list of books with the specified title.

        """
        books = LinkedList()
        for book in books_list:
            if title in book.title:
                books.prepend(book)
        return books

    def search_book_isbn(self, isbn: str) -> LinkedList:
        """
        Searches for a book with a specific ISBN.

        Args:
            isbn (str): The ISBN of the book to search for.

        Returns:
            LinkedList: A linked list containing the book with the specified ISBN.

        """
        books = LinkedList()
        for book in books_list:
            if isbn in book.isbn:
                # prepend to get in decreasing order with low cost of operation (add at head)
                books.prepend(book)
        return books

    def search_genre(self, genre: str) -> LinkedList:
        """
        Searches for books within a specific genre.

        Args:
            genre (str): The genre to search for.

        Returns:
            LinkedList: A linked list of books belonging to the specified genre.

        """
        books = LinkedList()
        for book in books_list:
            if genre.lower() in book.genre:
                # prepend to get in decreasing order with low cost of operation (add at head)
                books.prepend(book)
        return books

    def search_keywords(self, *keywords: str):
        """
        Searches for books based on one or more keywords.

        Args:
            *keywords (str): Variable number of keyword arguments.

        Returns:
            None

        """
        pass

    def _find_max_count(self, data: list) -> str:
        """
        Finds the element with the maximum count in a list.

        Args:
            data (list): A list of elements.

        Returns:
            str: The element with the maximum count.

        """
        # using count dictionary approach
        counts = dict()
        for elem in data:
            # set count to 0 if checked for the first time else add 1
            counts[elem] = counts.get(elem, 0) + 1

        max_count = 0
        for (elem, count) in counts.items():
            (max_name, max_count) = (
                elem, count) if max_count < count else (max_name, max_count)

        return max_name

    def most_checked_genre(self) -> str:
        """
        Returns the genre that has been checked out the most in the user's library.

        Returns:
            str: The genre with the highest check-out count (None if the library is empty).

        """
        # library entries are (book, rate) tuples, see read_book
        genres = [genre for (book, _) in self.library for genre in book.genre]
        if genres != []:
            return self._find_max_count(genres)

    def most_checked_author(self) -> str:
        """
        Returns the author who has been checked out the most in the user's library.

        Returns:
            str: The author with the highest check-out count (None if the library is empty).

        """
        # same approach as genre; strip spaces left around the commas
        authors = [
            author.strip() for (book, _) in self.library for author in book.author.split(',')]
        if authors != []:
            return self._find_max_count(authors)

    def recommend_books_genre(self) -> dict:
        """
        Recommends books based on the user's favorite genres.

        Returns:
            dict: A dictionary where the keys are the user's favorite genres and the values are linked lists of recommended books.

        """
        books = dict()
        for genre in self.fav_genres:
            count = 3
            books[genre.val] = LinkedList()
            # traverse books_list (sorted with heapsort) in reverse order to get values in decreasing order
            for i in range(len(books_list) - 1, -1, -1):
                if count == 0:
                    # three books found for this genre, no need to scan further
                    break
                if genre.val in books_list[i].genre:
                    books[genre.val].append(books_list[i])
                    count -= 1
        return books
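
##### The search methods above prepend each match to a linked list while walking `books_list` from its first element. A minimal sketch of why this yields decreasing order at low cost, assuming the list was sorted in ascending order. The `Node` and `LinkedList` classes below are hypothetical stand-ins, since the ones from `useful_classes` are not shown here.

```python
class Node:
    """Hypothetical singly linked node (the notebook's Node is not shown)."""

    def __init__(self, val, next=None):
        self.val = val
        self.next = next


class LinkedList:
    """Hypothetical minimal list with O(1) insertion at the head."""

    def __init__(self):
        self.head = None

    def prepend(self, val):
        # The new node points at the old head, so no traversal is needed.
        self.head = Node(val, self.head)

    def __iter__(self):
        # Iteration yields nodes, matching the `genre.val` usage above.
        node = self.head
        while node is not None:
            yield node
            node = node.next


ascending_rates = [50.0, 1200.0, 8000.0]   # e.g. rates after an ascending sort
results = LinkedList()
for rate in ascending_rates:
    results.prepend(rate)

ordered = [node.val for node in results]
print(ordered)
```

##### Each `prepend` is a constant-time operation, and the last (highest) value visited ends up at the head of the list.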

In [2]:
# Draft implementation of the search_keywords method of the user class.
# To use it, paste it into the class body in place of the stub above.
# word_tokenize needs NLTK's Punkt tokenizer data to be downloaded first.
from nltk.tokenize import word_tokenize

def search_keywords(self, *keywords: str) -> LinkedList:
    """
    Searches for books based on one or more keywords.

    Args:
        *keywords (str): Variable number of keyword arguments.

    Returns:
        LinkedList: A linked list of books that match the provided keywords.

    """
    books = LinkedList()
    wanted = {keyword.lower() for keyword in keywords}

    for book in books_list:
        title_keywords = {token.lower() for token in word_tokenize(book.title)}
        desc_keywords = {token.lower() for token in word_tokenize(book.desc)}

        # add each book at most once, even if several keywords match
        if wanted & (title_keywords | desc_keywords):
            books.prepend(book)

    return books

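
##### Tokenizing every title and description on each search can get slow. One possible alternative, sketched here under the assumption that keyword sets like those in `books_dict` are already available, is an inverted index that maps each keyword to the books containing it. `build_index` and `search` are illustrative names and the toy data is made up.

```python
from collections import defaultdict


def build_index(keyword_sets: dict) -> dict:
    """Map each keyword to the set of keys (books) that contain it."""
    index = defaultdict(set)
    for key, words in keyword_sets.items():
        for word in words:
            index[word].add(key)
    return index


def search(index: dict, *keywords: str) -> list:
    """Return keys matching at least one keyword, best matches first."""
    hits = defaultdict(int)
    for keyword in keywords:
        for key in index.get(keyword.lower(), ()):
            hits[key] += 1
    # More matched keywords first; ties are ordered by the key's text.
    return sorted(hits, key=lambda key: (-hits[key], str(key)))


# Toy keyword sets keyed by title (made-up data, not from books.json).
toy_sets = {
    "Space Saga": {"space", "war", "fiction"},
    "Star Empire": {"space", "empire", "fiction"},
    "Garden Notes": {"plants", "soil"},
}
index = build_index(toy_sets)
found = search(index, "Space", "war")
print(found)
```

##### The index is built once, after which each search only touches the books that contain at least one of the requested keywords.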