Real-World Problem Solving with Algorithms and Data Structures - Book Recommendation Model

In [4]:
import pandas as pd
import ipytest

- Reading dataset from Amazon Books Dataset: Genre, Sub-genre, and Books (available at https://www.kaggle.com/datasets/tanisha1604/amazon-books-dataset-genre-sub-genre-and-books)

In [5]:
books_df = pd.read_csv("Books_df.csv")

In [6]:
print("Books DataFrame:")
books_df.head()

Books DataFrame:


Unnamed: 0.1,Unnamed: 0,Title,Author,Main Genre,Sub Genre,Type,Price,Rating,No. of People rated,URLs
0,0,The Complete Novel of Sherlock Holmes,Arthur Conan Doyle,"Arts, Film & Photography",Cinema & Broadcast,Paperback,₹169.00,4.4,19923.0,https://www.amazon.in/Complete-Novels-Sherlock...
1,1,Black Holes (L) : The Reith Lectures [Paperbac...,Stephen Hawking,"Arts, Film & Photography",Cinema & Broadcast,Paperback,₹99.00,4.5,7686.0,https://www.amazon.in/Black-Holes-Lectures-Ste...
2,2,The Kite Runner,Khaled Hosseini,"Arts, Film & Photography",Cinema & Broadcast,Kindle Edition,₹175.75,4.6,50016.0,https://www.amazon.in/Kite-Runner-Khaled-Hosse...
3,3,Greenlights: Raucous stories and outlaw wisdom...,Matthew McConaughey,"Arts, Film & Photography",Cinema & Broadcast,Paperback,₹389.00,4.6,32040.0,https://www.amazon.in/Greenlights-Raucous-stor...
4,4,The Science of Storytelling: Why Stories Make ...,Will Storr,"Arts, Film & Photography",Cinema & Broadcast,Paperback,₹348.16,4.5,1707.0,https://www.amazon.in/Science-Storytelling-Wil...


- Converting price to Euro:

In [7]:
exchange_rate_inr_to_eur = 1 / 89.43  # 1 Euro = 89.43 INR

# Removing commas and the currency symbol from the prices, then converting to float
books_df['Price'] = books_df['Price'].str.replace(',', '').str.replace('₹', '').astype(float)

# Calculating the price in Euros
books_df['Price_EUR'] = books_df['Price'] * exchange_rate_inr_to_eur

# Rounding the price in Euros to two decimal places
books_df['Price_EUR'] = books_df['Price_EUR'].round(2)

print("Books DataFrame with Prices in Euros:")
books_df.head()


Books DataFrame with Prices in Euros:


Unnamed: 0.1,Unnamed: 0,Title,Author,Main Genre,Sub Genre,Type,Price,Rating,No. of People rated,URLs,Price_EUR
0,0,The Complete Novel of Sherlock Holmes,Arthur Conan Doyle,"Arts, Film & Photography",Cinema & Broadcast,Paperback,169.0,4.4,19923.0,https://www.amazon.in/Complete-Novels-Sherlock...,1.89
1,1,Black Holes (L) : The Reith Lectures [Paperbac...,Stephen Hawking,"Arts, Film & Photography",Cinema & Broadcast,Paperback,99.0,4.5,7686.0,https://www.amazon.in/Black-Holes-Lectures-Ste...,1.11
2,2,The Kite Runner,Khaled Hosseini,"Arts, Film & Photography",Cinema & Broadcast,Kindle Edition,175.75,4.6,50016.0,https://www.amazon.in/Kite-Runner-Khaled-Hosse...,1.97
3,3,Greenlights: Raucous stories and outlaw wisdom...,Matthew McConaughey,"Arts, Film & Photography",Cinema & Broadcast,Paperback,389.0,4.6,32040.0,https://www.amazon.in/Greenlights-Raucous-stor...,4.35
4,4,The Science of Storytelling: Why Stories Make ...,Will Storr,"Arts, Film & Photography",Cinema & Broadcast,Paperback,348.16,4.5,1707.0,https://www.amazon.in/Science-Storytelling-Wil...,3.89
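
The conversion above can be checked by hand on a few values. The following is a minimal pure-Python sketch, assuming the same fixed rate of 89.43 INR per Euro; the helper name `inr_to_eur` and the sample price '₹1,299.00' are illustrative and do not come from the notebook or the dataset.

```python
# Same fixed rate as in the cell above: 1 Euro = 89.43 INR
exchange_rate_inr_to_eur = 1 / 89.43

def inr_to_eur(price_text):
    """Convert a price string such as '₹169.00' to Euros, rounded to 2 decimals."""
    # Strip the thousands separator and the currency symbol before parsing
    amount_inr = float(price_text.replace(',', '').replace('₹', ''))
    return round(amount_inr * exchange_rate_inr_to_eur, 2)

# '₹1,299.00' is a made-up value that exercises the comma handling
for raw in ['₹169.00', '₹99.00', '₹1,299.00']:
    print(raw, '->', inr_to_eur(raw))
```

The first two values match the Price_EUR column shown above (1.89 and 1.11).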


UserManagement class:

- register_user(username, password): Registers a new user with the provided username and password. If the username already exists, it prompts the user to choose a different one. Otherwise, it adds the user to the system and prints a success message;
  
- login_user(username, password): Authenticates a user by checking if the provided username and password match the stored credentials. If the credentials are correct, it prints a success message and returns True; otherwise, it prints an error message and returns False;
  
- set_preference(username, preference_key, preference_value): Adds a preference for the user identified by username. Each key maps to a list of values, so repeated calls with the same key append to that list rather than overwrite it. It prints a success message after storing the preference, or an error message if the user is not found;
  
- get_preferences(username): Retrieves the preferences associated with a user identified by their username. It returns the preferences dictionary if the user is found and has preferences set; otherwise, it prints an error message and returns None;
  
- input_credentials(): Prompts the user to enter their username and password and returns them as a tuple;

- input_preferences(): Prompts the user for a favorite genre, author, and book type and returns them as a tuple;

- check_all_users(): Prints a list of all registered users along with their passwords and preferences, if any. If no users are registered, it prints a corresponding message;

- get_user_by_username(username): Retrieves user information by hashing the username to a table index and probing linearly for a match, returning the user's record if found, or None otherwise.

In [8]:
class UserManagement:
  def __init__(self):
    # Simulate a fixed-size hash table (adjust SIZE as needed)
    self.SIZE = 10
    self.users = [None] * self.SIZE  # Array to store user information

  def _hash(self, username):
    # Simple hash function (can be improved for better distribution)
    return sum(ord(char) for char in username) % self.SIZE

  def _find_slot(self, username):
    # Find an empty slot or the slot containing the user
    index = self._hash(username)
    for i in range(self.SIZE):
      probe_index = (index + i) % self.SIZE  # Linear probing for collisions
      if self.users[probe_index] is None or self.users[probe_index]['username'] == username:
        return probe_index
    raise Exception("Hash table is full!")

  def register_user(self, username, password):
    index = self._find_slot(username)
    if self.users[index] is None:
      self.users[index] = {'username': username, 'password': password, 'preferences': {}}
      print("Registration successful!")
    else:
      print("Username already exists. Please choose a different username.")

  def login_user(self, username, password):
    index = self._find_slot(username)
    if self.users[index] is not None and self.users[index]['username'] == username and self.users[index]['password'] == password:
      print("Login successful!")
      return True
    else:
      print("Invalid username or password. Please try again.")
      return False

  def set_preference(self, username, preference_key, preference_value):
    index = self._find_slot(username)
    if self.users[index] is not None:
      if 'preferences' not in self.users[index]:
        self.users[index]['preferences'] = {}
      if preference_key in self.users[index]['preferences']:
        self.users[index]['preferences'][preference_key].append(preference_value)
      else:
        self.users[index]['preferences'][preference_key] = [preference_value]
      print("Preference set successfully!")
    else:
      print("User not found. Please login again.")

  def get_preferences(self, username):
    index = self._find_slot(username)
    if self.users[index] is not None and 'preferences' in self.users[index]:
      return self.users[index]['preferences']
    else:
      print("User not found or no preferences set.")
      return None

  def input_credentials(self):
    username = input("Enter your username: ")
    password = input("Enter your password: ")
    return username, password

  def input_preferences(self):
    favorite_genre = input("Enter your favorite genre: ")
    favorite_author = input("Enter your favorite author: ")
    favorite_book_type = input("Enter your favorite type of book: ")
    return favorite_genre, favorite_author, favorite_book_type

  def check_all_users(self):
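    # self.users always holds SIZE slots, so check for at least one occupied slot
    if any(user is not None for user in self.users):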
      print("List of all users:")
      for i in range(self.SIZE):
        if self.users[i] is not None:
          print(f"Username: {self.users[i]['username']}, Password: {self.users[i]['password']}, Preferences: {self.users[i]['preferences']}")
    else:
      print("No users registered yet.")

  def get_user_by_username(self, username):
        index = self._find_slot(username)
        if self.users[index] is not None and self.users[index]['username'] == username:
            return self.users[index]
        else:
            return None
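
The hashing and linear probing used by `_hash` and `_find_slot` can be traced in isolation. The sketch below is a simplified stand-in that stores only usernames, assuming the same table size of 10; the function names `slot_hash`, `probe_sequence`, and `insert`, and the username 'Sane', are illustrative and not part of the class above.

```python
SIZE = 10  # same table size as in UserManagement

def slot_hash(username):
    # Sum of character codes modulo the table size, as in _hash
    return sum(ord(ch) for ch in username) % SIZE

def probe_sequence(username):
    # Slots visited by linear probing, starting at the hashed index
    start = slot_hash(username)
    return [(start + i) % SIZE for i in range(SIZE)]

def insert(table, username):
    # Use the first slot that is free or already holds this username
    for idx in probe_sequence(username):
        if table[idx] is None or table[idx] == username:
            table[idx] = username
            return idx
    raise RuntimeError("Hash table is full!")

table = [None] * SIZE
for name in ['Sean', 'Sane', 'Finn']:
    print(name, 'hash =', slot_hash(name), 'stored at', insert(table, name))
```

'Sean' and 'Sane' contain the same characters, so the sum-based hash sends both to slot 1 and probing moves the second one to slot 2. Any two anagrams collide in this way, which is one reason the hash function is described as improvable. The fixed size also caps the table at 10 users.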



- Testing UserManagement methods:

- Registering users on the system:

In [9]:
user_manager = UserManagement()

# Register users
user_manager.register_user('Sean', 'password123')
user_manager.register_user('Saoirse', 'password789')
user_manager.register_user('Finn', 'password456')
user_manager.register_user('Aoife', 'password997')

Registration successful!
Registration successful!
Registration successful!
Registration successful!


- Please input your registration information:

In [10]:
# Prompt for credentials
username, password = user_manager.input_credentials()

# Register the user based on the provided credentials
user_manager.register_user(username, password)


Enter your username:  Andressa
Enter your password:  1234


Registration successful!


In [11]:
# Check all users
user_manager.check_all_users()

List of all users:
Username: Sean, Password: password123, Preferences: {}
Username: Aoife, Password: password997, Preferences: {}
Username: Finn, Password: password456, Preferences: {}
Username: Saoirse, Password: password789, Preferences: {}
Username: Andressa, Password: 1234, Preferences: {}


- Users setting preferences:

- Sean:

In [12]:
user_manager.set_preference('Sean', 'favorite_genre', 'Science Fiction')
user_manager.set_preference('Sean', 'favorite_author', 'J.K. Rowling')
user_manager.set_preference('Sean', 'favorite_author', 'David Nicholls')
user_manager.set_preference('Sean', 'favorite_type', 'Paperback')

Preference set successfully!
Preference set successfully!
Preference set successfully!
Preference set successfully!


- Saoirse:

In [13]:
user_manager.set_preference('Saoirse', 'favorite_genre', 'Arts, Film & Photography')
user_manager.set_preference('Saoirse', 'favorite_type', 'Kindle Edition')
user_manager.set_preference('Sean', 'favorite_author', 'Oscar Wilde')  # Note: this adds an author to Sean's preferences, not Saoirse's

Preference set successfully!
Preference set successfully!
Preference set successfully!


- Finn:

In [14]:
user_manager.set_preference('Finn', 'favorite_genre', 'Cinema & Broadcast')
user_manager.set_preference('Finn', 'favorite_author', 'Aristotle')
user_manager.set_preference('Finn', 'favorite_type', 'Hardcover')

Preference set successfully!
Preference set successfully!
Preference set successfully!


- Aoife:

In [15]:
user_manager.set_preference('Aoife', 'favorite_genre', 'Economics')
user_manager.set_preference('Aoife', 'favorite_genre', 'Administration & Policy')
user_manager.set_preference('Aoife', 'favorite_author', 'Paramahansa Yogananda')
user_manager.set_preference('Aoife', 'favorite_type', 'Paperback')

Preference set successfully!
Preference set successfully!
Preference set successfully!
Preference set successfully!


- Please insert your preferences:

In [16]:
username = input("Enter your username: ").strip() 

user_info = user_manager.get_user_by_username(username)

if user_info is not None:

    favorite_author = input("Enter your favorite author: ")
    # If the dataset doesn't include books by your favorite author, you can pick one of these (for testing purposes):
    # ["Zoey Draven", "John Marrs", "Victoria Aveline", "Ivy Barrett", "Talia Rhea", "Bella Matthews", "C.W. Farnsworth", "Franz Kafka"]

    favorite_genre = input("Enter your favorite genre: ")
    # Examples of genres in the dataset: ["Romance", "Science & Mathematics", "Sports", "Fantasy"]

    favorite_book_type = input("Enter your favorite type of book: ")
    # Examples of book types in the dataset: ["Paperback", "Kindle Edition", "Hardcover"]

    user_manager.set_preference(username, 'favorite_author', favorite_author)
    user_manager.set_preference(username, 'favorite_genre', favorite_genre)
    user_manager.set_preference(username, 'favorite_book_type', favorite_book_type)
else:
    print("User not found. Please register first.")


Enter your username:  Andressa
Enter your favorite author:  Fraz Kafka
Enter your favorite genre:  Fantasy
Enter your favorite type of book:  Paperback


Preference set successfully!
Preference set successfully!
Preference set successfully!


In [17]:
# Check all users
user_manager.check_all_users()

List of all users:
Username: Sean, Password: password123, Preferences: {'favorite_genre': ['Science Fiction'], 'favorite_author': ['J.K. Rowling', 'David Nicholls', 'Oscar Wilde'], 'favorite_type': ['Paperback']}
Username: Aoife, Password: password997, Preferences: {'favorite_genre': ['Economics', 'Administration & Policy'], 'favorite_author': ['Paramahansa Yogananda'], 'favorite_type': ['Paperback']}
Username: Finn, Password: password456, Preferences: {'favorite_genre': ['Cinema & Broadcast'], 'favorite_author': ['Aristotle'], 'favorite_type': ['Hardcover']}
Username: Saoirse, Password: password789, Preferences: {'favorite_genre': ['Arts, Film & Photography'], 'favorite_type': ['Kindle Edition']}
Username: Andressa, Password: 1234, Preferences: {'favorite_author': ['Fraz Kafka'], 'favorite_genre': ['Fantasy'], 'favorite_book_type': ['Paperback']}
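
The listing above shows that each preference key holds a list of values. A minimal sketch of that append-or-create pattern, written with `dict.setdefault` instead of the explicit if/else in `set_preference`, is given below; the helper name `add_preference` is illustrative.

```python
def add_preference(preferences, key, value):
    # Create the list on first use, then append the new value to it
    preferences.setdefault(key, []).append(value)
    return preferences

prefs = {}
add_preference(prefs, 'favorite_genre', 'Economics')
add_preference(prefs, 'favorite_genre', 'Administration & Policy')
add_preference(prefs, 'favorite_type', 'Paperback')
print(prefs)
```

The resulting dictionary has the same shape as Aoife's genre and type preferences in the output above.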


GenreRecommender class:

- recommend_books(): This method recommends books based on a user's favorite genre by retrieving user information, validating login credentials, filtering books by the favorite genre, and returning the top recommended books;
  
- filter_books_by_genre(): This method filters the books DataFrame to the rows whose Main Genre contains the favorite genre (case-insensitive). The preference may be passed as a string or as a list, which is how preferences are stored;
  
- get_top_recommendations(): This method retrieves the top recommended books from a filtered DataFrame based on ratings, selecting specific columns for the recommendations and limiting the number of recommendations to three;
  
- get_user_info(): This method retrieves user information based on the provided username, calling the get_user_by_username method from the user manager and returning the user information.

In [18]:
class GenreRecommender:
    def __init__(self, user_manager, books_df):
        self.user_manager = user_manager
        self.books_df = books_df

    def recommend_books(self, username):
        """Recommends books based on user's favorite genre from preferences"""
        user_info = self.user_manager.get_user_by_username(username)
        if user_info:
            if self.user_manager.login_user(username, user_info['password']):
                favorite_genre = user_info['preferences'].get('favorite_genre')
                if favorite_genre:
                    filtered_books = self.filter_books_by_genre(favorite_genre)
                    recommendations = self.get_top_recommendations(filtered_books)
                    print(f"Here are some recommendations for {username} based on favorite genre '{favorite_genre}':")
                    return recommendations
                else:
                    print(f"{username} doesn't have a favorite genre set in preferences.")
                    return pd.DataFrame()  # Return an empty DataFrame
            else:
                print("Invalid username or password. Please try again.")
                return pd.DataFrame()  # Return an empty DataFrame
        else:
            print(f"User {username} not found.")
            return pd.DataFrame()  # Return an empty DataFrame

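    def filter_books_by_genre(self, favorite_genre):
        # Preferences are stored as lists; accept a single string as well
        genres = favorite_genre if isinstance(favorite_genre, list) else [favorite_genre]
        main_genre = self.books_df['Main Genre'].astype(str)
        # Keep a book if its main genre contains any favorite genre (literal, case-insensitive match)
        mask = pd.Series(False, index=self.books_df.index)
        for genre in genres:
            mask |= main_genre.str.contains(genre, case=False, regex=False)
        return self.books_df[mask]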

    def get_top_recommendations(self, filtered_books):
        selected_columns = ['Title', 'Author', 'Price_EUR', 'Rating', 'URLs']
        sorted_books = filtered_books.sort_values(by='Rating', ascending=False)[selected_columns]
        recommendation_count = 3
        return sorted_books.head(recommendation_count)

    def get_user_info(self, username):
        """Retrieve user information"""
        user_info = self.user_manager.get_user_by_username(username)
        return user_info
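
The filter-then-rank flow of `filter_books_by_genre` and `get_top_recommendations` can also be sketched without pandas. The example below is an illustration under stated assumptions, not the notebook's method: it uses a made-up miniature catalogue of dictionaries, a hypothetical function name `top_rated_in_genres`, and `heapq.nlargest` in place of sorting the whole table.

```python
import heapq

# Hypothetical catalogue; titles, genres, and ratings are invented for illustration
books = [
    {'title': 'Book A', 'main_genre': 'Fantasy', 'rating': 4.2},
    {'title': 'Book B', 'main_genre': 'Fantasy', 'rating': 4.9},
    {'title': 'Book C', 'main_genre': 'Romance', 'rating': 5.0},
    {'title': 'Book D', 'main_genre': 'Science Fiction & Fantasy', 'rating': 4.5},
    {'title': 'Book E', 'main_genre': 'Fantasy', 'rating': 3.8},
    {'title': 'Book F', 'main_genre': 'Economics', 'rating': 4.7},
]

def top_rated_in_genres(books, favorite_genres, count=3):
    # Case-insensitive substring match against any of the favorite genres
    wanted = [genre.lower() for genre in favorite_genres]
    matches = [b for b in books if any(g in b['main_genre'].lower() for g in wanted)]
    # nlargest keeps only the best `count` items instead of sorting everything
    return heapq.nlargest(count, matches, key=lambda b: b['rating'])

for book in top_rated_in_genres(books, ['Fantasy']):
    print(book['title'], book['rating'])
```

For a fixed small number of recommendations, `heapq.nlargest` avoids a full sort of the matching books, which matters little at this dataset size but fits the data-structures theme of the notebook.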

In [19]:
# Create an instance of GenreRecommender
genre_recommender = GenreRecommender(user_manager, books_df)

recommendations = genre_recommender.recommend_books('Saoirse')
styled_recommendations = recommendations.style
styled_recommendations


Login successful!
Here are some recommendations for Saoirse based on favorite genre '['Arts, Film & Photography']':


Unnamed: 0,Title,Author,Price_EUR,Rating,URLs
132,"Funny Jokes for 15 Year Old Teens: The Ultimate Q&A, One-Liner, Dad, Knock-Knock, Riddle, and Tongue Twister Collection! Hilarious and Silly Humor for Teenagers",Cooper The Pooper,1.85,5.0,https://www.amazon.in/Funny-Jokes-Year-Old-Teens-ebook/dp/B0CVFKC999/ref=zg_bs_g_1318063031_d_sccl_3/000-0000000-0000000?psc=1
135,अभिनेता जीवन-एक संघर्ष: Evaluate yourself before you enter in the industry! (Hindi Edition),Pankaj Gupta,1.11,5.0,https://www.amazon.in/%E0%A4%85%E0%A4%AD%E0%A4%BF%E0%A4%A8%E0%A5%87%E0%A4%A4%E0%A4%BE-%E0%A4%9C%E0%A5%80%E0%A4%B5%E0%A4%A8-%E0%A4%8F%E0%A4%95-%E0%A4%B8%E0%A4%82%E0%A4%98%E0%A4%B0%E0%A5%8D%E0%A4%B7-aspiring-actors-ebook/dp/B092TB1XJ4/ref=zg_bs_g_1318063031_d_sccl_6/000-0000000-0000000?psc=1
120,அசுரனின் காதல் (Tamil Edition),Ebin Rider,4.47,5.0,https://www.amazon.in/%E0%AE%85%E0%AE%9A%E0%AF%81%E0%AE%B0%E0%AE%A9%E0%AE%BF%E0%AE%A9%E0%AF%8D-%E0%AE%95%E0%AE%BE%E0%AE%A4%E0%AE%B2%E0%AF%8D-Tamil-Ebin-Rider-ebook/dp/B0CTFRN16B/ref=zg_bs_g_1318063031_d_sccl_21/000-0000000-0000000?psc=1


In [21]:
# Instantiate GenreRecommender
genre_recommender = GenreRecommender(user_manager, books_df)

# Prompt the user to enter their username
username = input("Enter your username: ").strip()

# Retrieve recommendations for the user
recommendations = genre_recommender.recommend_books(username)
styled_recommendations = recommendations.style
styled_recommendations


Enter your username:  Andressa


Login successful!
Here are some recommendations for Andressa based on favorite genre '['Fantasy']':


Unnamed: 0,Title,Author,Price_EUR,Rating,URLs
3046,Metamorphosis (Pocket Classics),Franz Kafka,0.89,5.0,https://www.amazon.in/Metamorphosis-Pocket-Classics-Franz-Kafka/dp/8119623037/ref=zg_bs_g_1318165031_d_sccl_17/000-0000000-0000000?psc=1
2997,Rogue Ascension: Book 5: First Ascension: A Progression LitRPG,Hunter Mythos,4.63,5.0,https://www.amazon.in/Rogue-Ascension-First-Progression-LitRPG-ebook/dp/B0CRHVNTBX/ref=zg_bs_g_1318163031_d_sccl_18/000-0000000-0000000?psc=1
3016,"JUJUTSU KAISEN, VOL. 20",Gege Akutami,7.02,4.9,https://www.amazon.in/JUJUTSU-KAISEN-VOL-Gege-Akutami/dp/1974738744/ref=zg_bs_g_1318165031_d_sccl_17/000-0000000-0000000?psc=1


GetBooksByFavoriteAuthors function:

- GetBooksByFavoriteAuthors(): This function recommends books based on a user's favorite authors. It takes the username, a user manager instance, a DataFrame containing book information, and an optional list of desired columns as input. It first retrieves the user information for the given username and checks that the user exists and has preferences set. Then it reads the user's favorite authors from the preferences. For each favorite author, it filters the books DataFrame to the rows whose Author matches exactly, sorts them by rating in descending order, and keeps the top 3 rows. Finally, it concatenates the per-author results into a single DataFrame and returns it, or returns an empty DataFrame if no books are found. Duplicate listings of the same title are not removed, so the same book can appear more than once.

In [28]:
def GetBooksByFavoriteAuthors(username, user_manager, books_df, desired_columns=['Title', 'Author', 'Rating', 'Price_EUR', 'URLs']):
    """
    This function recommends books based on a user's favorite authors, selecting
    specified columns for the recommendations. Duplicate listings are not removed.

    Args:
        username: The username of the user.
        user_manager: An instance of UserManagement containing user information.
        books_df: A Pandas DataFrame containing book information.
        desired_columns (list, optional): A list of column names to select
            from the recommended books. Defaults to ['Title', 'Author',
            'Rating', 'Price_EUR', 'URLs'].

    Returns:
        A Pandas DataFrame containing recommended books (empty if none found).
    """

    user_info = user_manager.get_user_by_username(username)

    if user_info is None:
        print("User not found.")
        return pd.DataFrame()

    if 'preferences' not in user_info:
        print(f"User '{username}' has no preferences set.")
        return pd.DataFrame()

    favorite_authors = user_info['preferences'].get('favorite_author', [])

    if not favorite_authors:
        print(f"User '{username}' has no favorite authors set.")
        return pd.DataFrame()

    recommended_books = []

    for author in favorite_authors:
        author_books = books_df[books_df['Author'] == author]
        if not author_books.empty:
            # Sort the books by rating in descending order
            sorted_books = author_books.sort_values(by='Rating', ascending=False)
            # Select only the top 3 books per author
            top_3_books = sorted_books.head(3)[desired_columns]
            recommended_books.append(top_3_books)
        else:
            print(f"No books found for author {author}.")

    if recommended_books:
        # Concatenate the list of DataFrames into a single DataFrame
        df = pd.concat(recommended_books)
        return df
    else:
        print("No books found matching the criteria.")
        return pd.DataFrame()

  #insert method here for the user input!


In [27]:
# Call the function with the appropriate arguments
recommendations = GetBooksByFavoriteAuthors('Sean', user_manager, books_df)
styled_recommended_books_by_author = recommendations.style
styled_recommended_books_by_author


Unnamed: 0,Title,Author,Rating,Price_EUR,URLs
7742,Harry Potter and the Prisoner of Azkaban: MinaLima Edition,J.K. Rowling,4.9,28.65,https://www.amazon.in/Harry-Potter-Prisoner-Azkaban-MinaLima/dp/1526666324/ref=zg_bs_g_67803449031_d_sccl_15/000-0000000-0000000?psc=1
932,Harry Potter and the Prisoner of Azkaban: MinaLima Edition,J.K. Rowling,4.9,28.65,https://www.amazon.in/Harry-Potter-Prisoner-Azkaban-MinaLima/dp/1526666324/ref=zg_bs_g_67803457031_d_sccl_3/000-0000000-0000000?psc=1
2956,Harry Potter and the Half-Blood Prince,J.K. Rowling,4.8,3.51,https://www.amazon.in/Harry-Potter-Half-Blood-Prince-Rowling-ebook/dp/B019PIOJZE/ref=zg_bs_g_1318163031_d_sccl_7/000-0000000-0000000?psc=1
10,One Day,David Nicholls,4.2,9.97,https://www.amazon.in/One-Day/dp/B079P515NK/ref=zg_bs_g_1318054031_d_sccl_11/000-0000000-0000000?psc=1
4499,US (REISSUE),David Nicholls,4.0,9.72,https://www.amazon.in/Us-David-Nicholls/dp/0340897015/ref=zg_bs_g_1318187031_d_sccl_20/000-0000000-0000000?psc=1
5377,Us: The Booker Prize-longlisted novel from the author of ONE DAY,David Nicholls,4.0,3.13,https://www.amazon.in/Us-David-Nicholls-ebook/dp/B00IWZNYTE/ref=zg_bs_g_89265414031_d_sccl_1/000-0000000-0000000?psc=1
3002,The Picture of Dorian Gray (Deluxe Hardbound Edition),Oscar Wilde,4.7,3.62,https://www.amazon.in/Picture-Dorian-Gray-Deluxe-Hardbound/dp/9354402178/ref=zg_bs_g_1318165031_d_sccl_3/000-0000000-0000000?psc=1
3571,The Picture of Dorian Gray (Deluxe Hardbound Edition),Oscar Wilde,4.7,3.62,https://www.amazon.in/Picture-Dorian-Gray-Deluxe-Hardbound/dp/9354402178/ref=zg_bs_g_1318159031_d_sccl_22/000-0000000-0000000?psc=1
3629,The Picture of Dorian Gray (Deluxe Hardbound Edition),Oscar Wilde,4.7,3.62,https://www.amazon.in/Picture-Dorian-Gray-Deluxe-Hardbound/dp/9354402178/ref=zg_bs_g_1318161031_d_sccl_30/000-0000000-0000000?psc=1
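
The output above lists the same title several times because the dataset contains one row per listing. One possible way to keep each title once per author is sketched below in plain Python, using titles and ratings copied from the output above; the function name `top_unique_by_author` is hypothetical. In pandas, `DataFrame.drop_duplicates(subset=[...])` plays a similar role.

```python
# Rows taken from the output above (only title, author, and rating are kept)
books = [
    {'title': 'Harry Potter and the Prisoner of Azkaban: MinaLima Edition', 'author': 'J.K. Rowling', 'rating': 4.9},
    {'title': 'Harry Potter and the Prisoner of Azkaban: MinaLima Edition', 'author': 'J.K. Rowling', 'rating': 4.9},
    {'title': 'Harry Potter and the Half-Blood Prince', 'author': 'J.K. Rowling', 'rating': 4.8},
    {'title': 'The Picture of Dorian Gray (Deluxe Hardbound Edition)', 'author': 'Oscar Wilde', 'rating': 4.7},
    {'title': 'The Picture of Dorian Gray (Deluxe Hardbound Edition)', 'author': 'Oscar Wilde', 'rating': 4.7},
]

def top_unique_by_author(books, author, count=3):
    # Rank the author's books by rating, highest first
    ranked = sorted((b for b in books if b['author'] == author),
                    key=lambda b: b['rating'], reverse=True)
    seen_titles = set()
    unique = []
    for book in ranked:
        # Keep only the first (highest-rated) listing of each title
        if book['title'] not in seen_titles:
            seen_titles.add(book['title'])
            unique.append(book)
    return unique[:count]

for book in top_unique_by_author(books, 'J.K. Rowling'):
    print(book['title'], book['rating'])
```

Deduplicating before taking the top 3 means the slots freed by repeated listings can go to other titles by the same author, when the dataset has any.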


In [None]:
  #insert method  call here for the user input!

PriceSort class:

- PriceSort class: This class is designed to sort a DataFrame containing book information by price in ascending order and provides a method to find the book closest to a target price with a minimum rating requirement;
  
- __init__ method: Initialises the PriceSort object by sorting the input DataFrame (books_df) by price in ascending order. It first converts the 'Price_EUR' column to numeric type, removes rows with NaN values in the 'Price_EUR' column, and then sorts the DataFrame by price;
  
- find_closest_book(target_price, min_rating=4): Uses binary search over the price-sorted DataFrame to look for a book near the target price with a rating of at least min_rating. It initializes the low and high indices and sets closest_book to None. At each midpoint it reads the price and rating; if the rating meets the minimum and the price equals the target, it returns that book immediately, and otherwise it updates closest_book when the current book is closer to the target price. It then narrows the search range based on the price. Only books visited along the search path are considered, so the result is the closest qualifying book encountered, which is not guaranteed to be the closest qualifying book in the whole dataset.

In [24]:
class PriceSort:
  def __init__(self, books_df):
        """
        Sorts the book dataframe by price in ascending order.
        """
        # Convert Price_EUR column to numeric type
        books_df['Price_EUR'] = pd.to_numeric(books_df['Price_EUR'], errors='coerce')
        # Remove rows with NaN values in Price_EUR column
        books_df = books_df.dropna(subset=['Price_EUR'])
        
        # Sort the dataframe by price in ascending order
        self.sorted_books_df = books_df.sort_values(by='Price_EUR')

  def find_closest_book(self, target_price, min_rating=4):
    """
    Uses binary search to find the book closest to the target price with a rating of min_rating or over.
    """
    low = 0
    high = len(self.sorted_books_df) - 1
    closest_book = None  # Initialize closest book to None

    while low <= high:
        mid = (low + high) // 2
        current_price = self.sorted_books_df.iloc[mid]['Price_EUR']
        current_rating = self.sorted_books_df.iloc[mid]['Rating']

        # Check if current book meets rating criteria
        if current_rating >= min_rating:
            # If price matches, return the book
            if current_price == target_price:
                return self.sorted_books_df.iloc[mid]
            # Update closest_book if closer to the target price
            elif closest_book is None or abs(current_price - target_price) < abs(closest_book['Price_EUR'] - target_price):
                closest_book = self.sorted_books_df.iloc[mid]

        # Adjust search range based on price
        if current_price < target_price:
            low = mid + 1
        else:
            high = mid - 1

    return closest_book  # Return the closest book found
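
Because the loop above applies the rating check only to books on the binary search path, a qualifying book off that path is never examined. One alternative, sketched below under stated assumptions, filters by rating first and then uses the standard `bisect` module, so every remaining candidate qualifies and only the two neighbours of the insertion point need comparing. The function name `closest_book_by_price` and the sample books are hypothetical.

```python
import bisect

# Hypothetical sample; titles, prices, and ratings are invented for illustration
books = [
    {'title': 'Book A', 'price_eur': 2.0, 'rating': 4.5},
    {'title': 'Book B', 'price_eur': 5.0, 'rating': 3.0},
    {'title': 'Book C', 'price_eur': 9.0, 'rating': 4.8},
    {'title': 'Book D', 'price_eur': 10.0, 'rating': 2.5},
    {'title': 'Book E', 'price_eur': 20.0, 'rating': 4.1},
]

def closest_book_by_price(books, target_price, min_rating=4):
    # Filter by rating first, then sort the remaining books by price
    eligible = sorted((b for b in books if b['rating'] >= min_rating),
                      key=lambda b: b['price_eur'])
    if not eligible:
        return None
    prices = [b['price_eur'] for b in eligible]
    # Index of the first price that is >= target_price
    pos = bisect.bisect_left(prices, target_price)
    # The closest price is either just below or at the insertion point
    candidates = eligible[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda b: abs(b['price_eur'] - target_price))

print(closest_book_by_price(books, 10.0))
```

Here 'Book D' costs exactly the target price but is excluded by its rating, so the search settles on 'Book C'. The filtering step is linear in the number of books; when many queries share the same min_rating, the filtered and sorted list could be built once and reused.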


In [25]:
# Ensure the user enters the target price in Euros
target_price_eur = float(input("Enter your target price in Euros: "))
min_rating = int(input("Enter minimum rating (optional, defaults to 4): ") or 4)

# Create an instance of PriceSort
price_sort = PriceSort(books_df.copy()) 

# Find the closest book
closest_book = price_sort.find_closest_book(target_price_eur, min_rating)

if closest_book is not None:
    print(f"The book closest to your target price of €{target_price_eur} with a rating of {min_rating} or over is:")
    display(pd.DataFrame(closest_book[['Title', 'Author', 'Price_EUR', 'Rating','URLs']]).style)
else:
    print(f"No book found within your budget (€{target_price_eur}) and minimum rating of {min_rating}.")


Enter your target price in Euros:  100
Enter minimum rating (optional, defaults to 4):  3


The book closest to your target price of €100.0 with a rating of 3 or over is:


Unnamed: 0,4165
Title,Kaplan & Sadock's Concise Textbook of Clinical Psychiatry
Author,Marcia Verduin
Price_EUR,98.350000
Rating,4.600000
URLs,https://www.amazon.in/Sadocks-Concise-Textbook-Clinical-Psychiatry/dp/1975167481/ref=zg_bs_g_4149702031_d_sccl_16/000-0000000-0000000?psc=1
