# BookRecommendation AI 
##### By Mahalakshmi Totad

## Part 1: Project Introduction

The two main principles of writing a good prompt:
1.  Providing clear instructions,
2.  Enhancing LLM reasoning capabilities.

Under providing clear instructions, you learnt how to structure the body of a prompt through 5 components - **Task, Role, Context, Guidelines and Output Format**.


Let’s now get started on designing our first LLM application - BookRecommendation AI.


#### Project Background

Book Recommendation Chatbot can be used to discover exciting books tailored to user preferences and interests. From different geners like thrilling mysteries, heartwarming romances, or thought-provoking non-fiction. **BookRecommendation AI, a chatbot that combines the power of large language models and rule-based functions to ensure relevant recommendations**.


#### Problem Statement

*Given a dataset containing information about books (book name, author,genere, rating , description), build a chatbot that parses the dataset and provides accurate book recommendations based on user's interest*.


In [1]:
# Install OpenAI library
# !pip install -U -q openai tenacity

In [2]:
# pip install openai --upgrade

^C
[31mERROR: Operation cancelled by user[0m[31m
[0mNote: you may need to restart the kernel to use updated packages.


In [3]:
# from google.colab import drive
# drive.mount('/content/drive')

In [4]:
# import os
# # os.chdir('/content/drive/MyDrive/GenAI Revamp - 2024/ShopAssist')
# os.chdir('/goodreads_data.csv')
# !ls

In [35]:
# Import the libraries
import pandas as pd
import re
from IPython.display import display, HTML
# Set the display width to control the output width
pd.set_option('display.width', 100)
# Read the dataset and read the Laptop Dataset


Unnamed: 0.1,Unnamed: 0,Book,Author,Description,Genres,Avg_Rating,Num_Ratings,URL
0,0,To Kill a Mockingbird,Harper Lee,The unforgettable novel of a childhood in a sl...,"['Classics', 'Fiction', 'Historical Fiction', ...",4.27,5691311,https://www.goodreads.com/book/show/2657.To_Ki...
1,1,Harry Potter and the Philosopher’s Stone (Harr...,J.K. Rowling,Harry Potter thinks he is an ordinary boy - un...,"['Fantasy', 'Fiction', 'Young Adult', 'Magic',...",4.47,9278135,https://www.goodreads.com/book/show/72193.Harr...
2,2,Pride and Prejudice,Jane Austen,"Since its immediate success in 1813, Pride and...","['Classics', 'Fiction', 'Romance', 'Historical...",4.28,3944155,https://www.goodreads.com/book/show/1885.Pride...
3,3,The Diary of a Young Girl,Anne Frank,Discovered in the attic in which she spent the...,"['Classics', 'Nonfiction', 'History', 'Biograp...",4.18,3488438,https://www.goodreads.com/book/show/48855.The_...
4,4,Animal Farm,George Orwell,Librarian's note: There is an Alternate Cover ...,"['Classics', 'Fiction', 'Dystopia', 'Fantasy',...",3.98,3575172,https://www.goodreads.com/book/show/170448.Ani...
...,...,...,...,...,...,...,...,...
9995,9995,"Breeders (Breeders Trilogy, #1)",Ashley Quigley,How far would you go? If human society was gen...,"['Dystopia', 'Science Fiction', 'Post Apocalyp...",3.44,276,https://www.goodreads.com/book/show/22085400-b...
9996,9996,Dynamo,Eleanor Gustafson,Jeth Cavanaugh is searching for a new life alo...,[],4.23,60,https://www.goodreads.com/book/show/20862902-d...
9997,9997,The Republic of Trees,Sam Taylor,This dark fable tells the story of four Englis...,"['Fiction', 'Horror', 'Dystopia', 'Coming Of A...",3.29,383,https://www.goodreads.com/book/show/891262.The...
9998,9998,"Waking Up (Healing Hearts, #1)",Renee Dyer,For Adriana Monroe life couldn’t get any bette...,"['New Adult', 'Romance', 'Contemporary Romance...",4.13,263,https://www.goodreads.com/book/show/19347252-w...


In [None]:
df = pd.read_csv('goodreads_data.csv')
df

#### Approach:

1. **Conversation and Information Gathering**: The chatbot will utilize language models to understand and generate natural responses. Through a conversational flow, it will ask relevant questions to gather information about the user's requirements.
2. **Information Extraction**: Once the essential information is collected, rule-based functions come into play, extracting top 3 books that best matches the user's interest.
3. **Personalized Recommendation**: Leveraging this extracted information, the chatbot engages in further dialogue with the user, efficiently addressing their questions to narrow down the book.

## Part 2: System Design


#### Dataset

We have a dataset `goodreads_data.csv` where  each row describes the features of a single book and also has a small description and link at the end. The chatbot that we build will leverage LLMs to parse this `Genere` , `Rating' and  `Description` column and provide recommendations

- Determine the user's interest. For simplicity, we have used below features to encapsulate the user's interest. The features are as follows:
    - Genere
    - Short description (capture user specific interest within provided Genres)
    - Preferred Author (Optional)
    - Rating

- Confirm if the user's requirements have been correctly captured at the end.

After that the chatbot lists down the top 3 books that are the most relevant, and engages in further conversation to help the user find the best one.


#### Building the Chatbot

Now let's go ahead and understand the system design for the chatbot.


I'm leveraging the same system designed used for learning and build the chat bot in following stages
|`Stage 1`

- Intent Clarity Layer
- Intent Confirmation Layer

`Stage 2`

- Book Mapping Layer
- Book Information Extraction Layer

`Stage 3`

- Book Recommendation Layer

##### Major functions behind the Chatbot

Let's now look at a brief overview of the major functions that form the chatbot. We'll take a deep dive later



- `initialize_conversation()`: This initializes the variable conversation with the system message.
- `get_chat_completions()`: This takes the ongoing conversation as the input and returns the response by the assistant
- `moderation_check()`: This checks if the user's or the assistant's message is inappropriate. If any of these is inappropriate, it ends the conversation.
- `intent_confirmation_layer()`: This function takes the assistant's response and evaluates if the chatbot has captured the user's profile clearly. Specifically, this checks if the following properties for the user has been captured or not  Author, Genere and short description containing interest
- `dictionary_present()`: This function checks if the final understanding of user's profile is returned by the chatbot as a python dictionary or not. If there is a dictionary, it extracts the information as a Python dictionary.
- `compare_book_with_userinterest()`: This function compares the user's profile with the different books and come back with the top 3 recommendations baed on rating.
- `initialize_conv_reco()`: Initializes the recommendations conversation

In the next sections, we will look at how to write the code for the above functions.

## Part 3: Implementation

## Stage 1

### 3.1 - Import the libraries

Let's start by importing the libraries that we'll require for this project. Following are the ones:
- openai
- pandas
- os, json, ast

Make sure the api key is stored in the text file `OPENAI_API_Key.txt`.

In [37]:
# Import the libraries
import os, json, ast
import openai
from tenacity import retry, wait_random_exponential, stop_after_attempt

In [59]:
# Read the OpenAI API key
openai.api_key = open("OPENAI_API_Key.txt", "r").read().strip()
os.environ['OPENAI_API_KEY'] = openai.api_key

In [39]:
# # Recall that messages to the LLM is a list of dicts containing system_message, user_input and assistant_message
# conversation = [{"role": "system", "content": system_message},
#                 {"role": "user", "content": user_input},
#                 {"role": "assistant", "content": assistant_message}]

### 3.2 - Implementing Intent Clarity and Intent Confirmation Layers

Let's start with the first part of the implementation - building the `intent clarity` and `intent confirmation` layers. As mentioned earlier, this layer helps in identifying the user requirements and passing it on to the book matching layer. Here are the functions that we would be using for building these layers:

- `initialize_conversation()`


### `initialize_conversation()`:
This initializes the variable conversation with the system message. Using prompt engineering and chain of thought reasoning, the function will enable the chatbot to keep asking questions until the user requirements have been captured in a dictionary. It also includes Few Shot Prompting(sample conversation between the user and assistant) to align the model about user and assistant responses at each step.

In [190]:
def initialize_conversation():
    '''
    Returns a list [{"role": "system", "content": system_message}]
    '''

    delimiter = "####"

    example_user_dict = {'Genres': ["Fantasy","Classic"],
                        'Author':"J.K.Rowling",
                        'Description': "something involving castles",
                        'Rating': "greater than 3.5"
                        }

    example_user_req = {'Genres': ["_"],
                        'Author': "_",
                        'Description': "_",
                        'Rating': "_"
                        }

    system_message = f"""
    You are an intelligent book recommendation expert and your goal is to find the best book for a user.
    You need to ask relevant questions and understand the user interest  by analysing the user's interest in Genress.
    You final objective is to fill the values for the different keys ('Genres','Author','Description','Rating') in the python dictionary and be confident of the values.
    These key value pairs define the user's profile.
    The python dictionary looks like this
    {{'Genres': ['values'],'Author': 'values','Description': 'values','Rating': 'values'}}
    The value for 'Rating' should include a numerical value extracted from the user's response , it can also include with less than or greater than or absolute value as well.
    The values of 'Description' should include users preferrence based on provided Genres values, as stated by user.
    All the values in the example dictionary are only representative values.
    {delimiter}
    Here are some instructions around the values for the different keys. If you do not follow this, you'll be heavily penalised:
    - The values for 'Genres' should have atlest one or more Genress given as input by users and format it into comma seperated list
    - The values for 'Author' can be optional but do ask the user for thier preference, it's ok if author is not provided.
    - The value for 'Description' should be a short description of what user is looking for
    - The value for 'Rating' should include a numerical value extracted from the user's response , it can also include with less than or greater than or absolute value as well..
    - 'Rating' value needs to be between 1 and 5 and user input can say 'greater than a certain number'. The value can also be fraction value between 1 and 5. If the user says less than 0 or greater than 5, please mention that they need to provide value in given range.
    - Do not randomly assign values to any of the keys.
    - The values need to be inferred from the user's response.
    {delimiter}

    To fill the dictionary, you need to have the following chain of thoughts:
    Follow the chain-of-thoughts below and only output the final updated python dictionary for the keys as described in {example_user_req}. \n
    {delimiter}
    Thought 1: Ask a question to understand the user's profile and requirements. \n
    If their Genres or short description of the book is unclear. Ask followup questions to understand their and gather proper Genres.
    You are trying to fill the values of all the keys {{'Genres','Author','Description','Rating'}} in the python dictionary by understanding the user requirements.
    Identify the keys for which you can fill the values confidently using the understanding. \n
    Remember the instructions around the values for the different keys.
    If the necessary information has been extracted, only then proceed to the next step. \n
    Otherwise, rephrase the question to capture their profile or interest clearly. \n

    {delimiter}
    Thought 2: Now, you are trying to fill the values for the rest of the keys which you couldn't in the previous step.
    Remember the instructions around the values for the different keys.
    Ask questions you might have for all the keys to strengthen your understanding of the user's profile or interest.
    If yes, move to the next Thought. If no, ask question on the keys whose values you are unsure of. \n
    It is a good practice to ask question with a sound logic as opposed to directly citing the key you want to understand value for.
    {delimiter}

    {delimiter}
    Thought 3: Check if you have correctly updated the values for the different keys in the python dictionary.
    If you are not confident about any of the values, ask clarifying questions.
    {delimiter}

    {delimiter}
    Here is a sample conversation between the user and assistant:
    User: "Hi, I like fiction."
    Assistant: "Great! As an fiction reader, you like books that transport you to different worlds, times, and experiences, offering a delightful escape from reality. May I know if you have any preferred authors? Do you preferrer any theme within fiction? Understanding the theme or preferences will help me tailor my recommendations accordingly. Let me know if my understanding is correct until now."
    User: "I read works of J.K.Rowling and similar."
    Assistant: "Tell about the kind of stories you love, to help you discover your next favorite book. Whether you're in the mood for a gripping thriller, a tender romance, or a fantastical adventure"
    User: "Yes, I'm in the mood for gripping thriller."
    Assistant:"Thank you for the information. Could you kindly let me know what rating are you looking for the recommended books? This will help me find options that fit within your range and I might be able to find good book in rating range while meeting the specified requirements as well."
    User: "my rating is greater than 3.5"
    Assistant: "{example_user_dict}"
    {delimiter}

    Start with a short welcome message and encourage the user to share their requirements.
    """
    conversation = [{"role": "system", "content": system_message}]
    # conversation = system_message
    return conversation

Let's see what does `initialize_conversation()` does.

<br>

We have added a prefix `debug_` to each of the variables so that we can play around with the inputs and outputs and it doesn't disturb the main function.

In [191]:
debug_conversation = initialize_conversation()
print(debug_conversation)

[{'role': 'system', 'content': '\n    You are an intelligent book recommendation expert and your goal is to find the best book for a user.\n    You need to ask relevant questions and understand the user interest  by analysing the user\'s interest in Genress.\n    You final objective is to fill the values for the different keys (\'Genres\',\'Author\',\'Description\',\'Rating\') in the python dictionary and be confident of the values.\n    These key value pairs define the user\'s profile.\n    The python dictionary looks like this\n    {\'Genres\': [\'values\'],\'Author\': \'values\',\'Description\': \'values\',\'Rating\': \'values\'}\n    The value for \'Rating\' should include a numerical value extracted from the user\'s response , it can also include with less than or greater than or absolute value as well.\n    The values of \'Description\' should include users preferrence based on provided Genres values, as stated by user.\n    All the values in the example dictionary are only repre

In [192]:
# Let's look at the content in the debug_conversation key
print(debug_conversation[0]['content'])


    You are an intelligent book recommendation expert and your goal is to find the best book for a user.
    You need to ask relevant questions and understand the user interest  by analysing the user's interest in Genress.
    You final objective is to fill the values for the different keys ('Genres','Author','Description','Rating') in the python dictionary and be confident of the values.
    These key value pairs define the user's profile.
    The python dictionary looks like this
    {'Genres': ['values'],'Author': 'values','Description': 'values','Rating': 'values'}
    The value for 'Rating' should include a numerical value extracted from the user's response , it can also include with less than or greater than or absolute value as well.
    The values of 'Description' should include users preferrence based on provided Genres values, as stated by user.
    All the values in the example dictionary are only representative values.
    ####
    Here are some instructions around the val

In [193]:
# Let's initialise conversation
system_message = initialize_conversation()
print(system_message[0]["content"])


    You are an intelligent book recommendation expert and your goal is to find the best book for a user.
    You need to ask relevant questions and understand the user interest  by analysing the user's interest in Genress.
    You final objective is to fill the values for the different keys ('Genres','Author','Description','Rating') in the python dictionary and be confident of the values.
    These key value pairs define the user's profile.
    The python dictionary looks like this
    {'Genres': ['values'],'Author': 'values','Description': 'values','Rating': 'values'}
    The value for 'Rating' should include a numerical value extracted from the user's response , it can also include with less than or greater than or absolute value as well.
    The values of 'Description' should include users preferrence based on provided Genres values, as stated by user.
    All the values in the example dictionary are only representative values.
    ####
    Here are some instructions around the val

Let's now look at the next function.
- `get_chat_completions()`: This takes the ongoing conversation as the input and returns the response by the assistant. We'll use the Chat Completions function for performing LLM calls to OpenAI.

### `get_chat_completions()`:

This function perform LLM call using the Chat Completions API to get the LLM response.

In [194]:
# Define a Chat Completions API call
# Retry up to 6 times with exponential backoff, starting at 1 second and maxing out at 20 seconds delay
# @retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def get_chat_completions(input, json_format = False):
    MODEL = 'gpt-3.5-turbo'

    system_message_json_output = """<<. Return output in JSON format to the key output.>>"""

    # If the output is required to be in JSON format
    if json_format == True:
        # Append the input prompt to include JSON response as specified by OpenAI
        input[0]['content'] += system_message_json_output

        # JSON return type specified
        chat_completion_json = openai.chat.completions.create(
            model = MODEL,
            messages = input,
            response_format = { "type": "json_object"},
            seed = 1234)

        output = json.loads(chat_completion_json.choices[0].message.content)

    # No JSON return type specified
    else:
        chat_completion = openai.chat.completions.create(
            model = MODEL,
            messages = input,
            seed = 2345)

        output = chat_completion.choices[0].message.content

    return output

In [49]:
# Testing the OpenAI functions defined above
# input_prompt ='What is the capital of France?'
# messages = [{'role':'user','content':input_prompt}]
# # system_message_json_output = """<<. Return output in JSON format.>>"""
# # messages[0]['content']+=system_message_json_output
# messages

[{'role': 'user', 'content': 'What is the capital of France?'}]

In [50]:
## Get LLM Outputs - normal
# get_chat_completions(messages) ## Chat Completions API

'The capital of France is Paris.'

### iterate_response() - Helper Function:
We've created a small helper test function to ensure the model's response is consistent.
Uncomment the code blocks and run the function `iterate_response(response)` to check if the response of the `intent_confirmation_layer`is consistent.}

In [195]:
def iterate_llm_response(funct, debug_response, num = 10):
    """
    Calls a specified function repeatedly and prints the results.
    This function is designed to test the consistency of a response from a given function.
    It calls the function multiple times (default is 10) and prints out the iteration count,
    the function's response(s).
    Args:
        funct (function): The function to be tested. This function should accept a single argument
                          and return the response value(s).
        debug_response (dict): The input argument to be passed to 'funct' on each call.
        num (int, optional): The number of times 'funct' will be called. Defaults to 10.
    Returns:
        This function only returns the results to the console.
    """
    i = 0  # Initialize counter

    while i < num:  # Loop to call the function 'num' times

        response = funct(debug_response)  # Call the function with the debug response

        # Print the iteration number, result, and reason from the response
        print("Iteration: {0}".format(i))
        print(response)
        print('-' * 50)  # Print a separator line for readability
        i += 1  # Increment the counter

# Example usage: test the consistency of responses from 'intent_confirmation_layer'
# iterate_llm_response(get_chat_completions, messages)

Let's pass the initialized conversation `debug_conversation` and see what is the assistant's response.

In [196]:
debug_user_input = "Hi, I am Mahalakshmi. I need a book from classic and non-fiction Genres."

In [197]:
debug_conversation.append({"role": "user", "content": debug_user_input})

In [198]:
# print(debug_conversation[0]["content"]) # System Message
print(debug_conversation[1]["content"]) # User Input

Hi, I am Mahalakshmi. I need a book from classic and non-fiction Genres.


In [199]:
# Let's look at the debug_conversation list
display(debug_conversation)

[{'role': 'system',
  'content': '\n    You are an intelligent book recommendation expert and your goal is to find the best book for a user.\n    You need to ask relevant questions and understand the user interest  by analysing the user\'s interest in Genress.\n    You final objective is to fill the values for the different keys (\'Genres\',\'Author\',\'Description\',\'Rating\') in the python dictionary and be confident of the values.\n    These key value pairs define the user\'s profile.\n    The python dictionary looks like this\n    {\'Genres\': [\'values\'],\'Author\': \'values\',\'Description\': \'values\',\'Rating\': \'values\'}\n    The value for \'Rating\' should include a numerical value extracted from the user\'s response , it can also include with less than or greater than or absolute value as well.\n    The values of \'Description\' should include users preferrence based on provided Genres values, as stated by user.\n    All the values in the example dictionary are only rep

In [69]:
#Test example
# Getting the response from the Assistant by passing the conversation to the Chat Completions API
# debug_response_assistant = get_chat_completions(debug_conversation)
# display(debug_response_assistant)

'Great! Fantasy and fiction genres offer a wide range of imaginative and captivating stories. Do you have any preferred authors in mind or any specific themes within these genres that you enjoy? Understanding your preferences will help me tailor my recommendations accordingly. Let me know if my understanding is correct until now.'

In [200]:
# Getting the response from the Assistant by passing the conversation to the Chat Completions API
debug_response_assistant = get_chat_completions(debug_conversation)
display(debug_response_assistant)

'Great! As a reader who enjoys classic and non-fiction genres, you appreciate timeless literary works and factual narratives that expand your knowledge and understanding of the world. Could you please provide me with more details about your preferences within these genres? Do you have any favorite authors in mind or specific themes that you are interested in classic and non-fiction books? Understanding your preferences will allow me to tailor my recommendations effectively.'

Let's play around a bit and add the following user's input `debug_user_input` to the conversation `debug_conversation` and see what the assistant responds with.

In [201]:
# Let's append this to the conversation list
debug_conversation.append(({"role": "system", "content": debug_response_assistant}))
debug_conversation

[{'role': 'system',
  'content': '\n    You are an intelligent book recommendation expert and your goal is to find the best book for a user.\n    You need to ask relevant questions and understand the user interest  by analysing the user\'s interest in Genress.\n    You final objective is to fill the values for the different keys (\'Genres\',\'Author\',\'Description\',\'Rating\') in the python dictionary and be confident of the values.\n    These key value pairs define the user\'s profile.\n    The python dictionary looks like this\n    {\'Genres\': [\'values\'],\'Author\': \'values\',\'Description\': \'values\',\'Rating\': \'values\'}\n    The value for \'Rating\' should include a numerical value extracted from the user\'s response , it can also include with less than or greater than or absolute value as well.\n    The values of \'Description\' should include users preferrence based on provided Genres values, as stated by user.\n    All the values in the example dictionary are only rep

Typically, whenever the chatbot is interacting with the user, all the conversations should be moderated to identify any inappropriate content. Let's look at the function that can help with it.

### `moderation_check()`:
 This checks if the user's or the assistant's message is inappropriate. If any of these is inappropriate, you can add a break statement to end the conversation.

In [202]:
# Define a function called moderation_check that takes user_input as a parameter.

def moderation_check(user_input):
    # Call the OpenAI API to perform moderation on the user's input.
    response = openai.moderations.create(input=user_input)

    # Extract the moderation result from the API response.
    moderation_output = response.results[0].flagged
    # Check if the input was flagged by the moderation system.
    if response.results[0].flagged == True:
        # If flagged, return "Flagged"
        return "Flagged"
    else:
        # If not flagged, return "Not Flagged"
        return "Not Flagged"

In [83]:
moderation_check("I want to kill them.")

'Flagged'

Let's test moderation on the `debug_user_input`

In [84]:
debug_moderation = moderation_check(debug_user_input)
print(debug_moderation)

Not Flagged


So, this moderation api may not be perfect but if you ask this to the ChatGPT or it's API (GPT 3.5), it'll not help you with such requests. Remember, moderation should also be applied on the GPT 3.5's output.

Let's now check moderation on the assistant's response `debug_response_assistant`.

In [None]:
moderation_check(debug_response_assistant)

As mentioned earlier, you need to understand the user's profile, which essentially means that all the features: Genres, Author, Description, Rating are captured or not. Let's look at the function that helps us verify that.

### `intent_confirmation_layer()`:

This function takes the assistant's response and evaluates if the chatbot has captured the user's profile clearly. Specifically, this checks if the following properties for the user has been captured or not
   - Genres
   - Author(optional)
   - Description
   - Rating



```
def intent_confirmation_layer(response_assistant):
    """
    This function serves as an intent confirmation layer for a laptop recommendation system using OpenAI LLM API.

    Parameters:
    - response_assistant (str): The input text containing user requirements captured through 6 keys:
        'GPU intensity', 'Display quality', 'Portability', 'Multitasking', 'Processing speed', and 'Budget'.

    Returns:
    - str: A one-word string in JSON format indicating if the values for the specified keys are correctly filled.
        - 'Yes' if the values are correctly filled for all keys ('GPU intensity', 'Display quality', 'Portability',
          'Multitasking', 'Processing speed') based on the importance as stated by the user.
        - 'No' otherwise.

    Note:
    - The values for all keys, except 'Budget', should be 'low', 'medium', or 'high' based on their importance as stated by the user.
    - The input text should be structured such that it contains the necessary keys and their corresponding values.
    - The function uses OpenAI's Chat Completion API to evaluate the correctness of the input values.
    """
```



In [203]:
def intent_confirmation_layer(response_assistant):

    delimiter = "####"

    # allowed_values = {'low','medium','high'}

    # The values for 'Genres' should have atlest one or more Genress given as input by users
    # - The values for 'Author' can be optional but do ask the user for thier preference, it's ok if author is not provided.
    # - The value for 'Description' should be a short description of what user is looking for
    # - The value for 'Rating' should be a numerical value extracted from the user's response.
    # - 'Rating' value needs to be between 1 and 5 and user input can say 'greater than a certain number'. The value can also be fraction value between 1 and 5. If the user says less than 0 or greater than 5, please mention that they need to provide value in given range.

    
    prompt = f"""
    You are a senior evaluator who has an eye for detail.The input text will contain a user requirement captured through 3 or 4 keys.
    You are provided an input. You need to evaluate if the input text has the following keys'Author' being optional:
    {{
    'Genres': ['values'],
    'Author':'values',
    'Description':'values',
    'Rating':'values'
    }}
    The values for the 'Description' should have short description of user interest
    The 'Rating' key can take a numerical value along with key words like greater than or less tham.
    Next you need to evaluate if the keys have the the values filled correctly.
    Only output a one-word string in JSON format at the key 'result' - Yes/No.
    Thought 1 - Output a string 'Yes' if the values are correctly filled for all keys, otherwise output 'No'.
    Thought 2 - If the answer is No, mention the reason in the key 'reason'.
    THought 3 - Think carefully before the answering.
    """

    messages=[{"role": "system", "content":prompt },
              {"role": "user", "content":f"""Here is the input: {response_assistant}""" }]

    response = openai.chat.completions.create(
                                    model="gpt-3.5-turbo",
                                    messages = messages,
                                    response_format={ "type": "json_object" },
                                    seed = 1234
                                    # n = 5
                                    )

    json_output = json.loads(response.choices[0].message.content)

    return json_output

In [88]:
# Here are some sample input output pairs for better understanding:
# {delimiter}
# input: "{{'GPU intensity': 'low', 'Display quality': 'high', 'Portability': 'low', 'Multitasking': 'high', 'Processing speed': 'low'}}"
# output: No

# input: "{{'GPU intensity': 'low', 'Display quality': 'high', 'Portability': 'low', 'Multitasking': 'high', 'Processing speed': '', 'Budget': '90000'}}"
# output: No

# input: "Here is your user profile 'GPU intensity': 'high','Display quality': 'high','Portability': 'medium','Multitasking': 'low','Processing speed': 'high','Budget': '200000'"
# output: Yes

# input: "Here is your recommendation {{'GPU intensity': 'low', 'Display quality': 'high', 'Portability': 'low', 'Multitasking': 'high', 'Processing speed': 'low', 'Budget': '90000'}}"
# output: Yes

# input: "Here is your recommendation - 'GPU intensity': 'high' - 'Display quality': 'low' - 'Portability': 'low'  - 'Multitasking': 'high' - 'Processing speed': 'high' - 'Budget': '90000' "
# output: Yes

# input: "You can look at this - GPU intensity: high - Display quality: low - Portability: low  - Multitasking: high - Processing speed: high - Budget: 90000"
# output: Yes

# input: "{{GPU intensity: low, Display quality: high, Portability: low, Multitasking:high,Processing speed: Low, Budget: 70000}}"
# output: No

# {delimiter}

Let's apply the function to the assistant's reponse and see if it has captured the user profile.

In [204]:
debug_response_assistant

'Great! As a reader who enjoys classic and non-fiction genres, you appreciate timeless literary works and factual narratives that expand your knowledge and understanding of the world. Could you please provide me with more details about your preferences within these genres? Do you have any favorite authors in mind or specific themes that you are interested in classic and non-fiction books? Understanding your preferences will allow me to tailor my recommendations effectively.'

In [205]:
debug_confirmation = intent_confirmation_layer(debug_response_assistant)
display(debug_confirmation)

{'result': 'No', 'reason': 'Missing keys Genres, Rating, Description'}

In [206]:
# Printing the value for better clarity
print("Result:",debug_confirmation.get('result'),"\t", "Reason:", debug_confirmation.get('reason'))

Result: No 	 Reason: Missing keys Genres, Rating, Description


#### Inference: We can confirm that this layer is working fine since Author key is defined as an optinal key and in the reason from confirmation layer it's not listed as missing key.

Now, you can keep adding user and assistant responses to debug_conversation and get to a point where intent_confirmation_layer() gives yes as a response. Let's see if the following response by the assistant passes the intent_confirmation_layer() test.

In [207]:
#Let's add the above assistant response to the debug_conversation.
debug_conversation.append({"role": "assistant", "content": debug_response_assistant})

In [208]:
debug_conversation

[{'role': 'system',
  'content': '\n    You are an intelligent book recommendation expert and your goal is to find the best book for a user.\n    You need to ask relevant questions and understand the user interest  by analysing the user\'s interest in Genress.\n    You final objective is to fill the values for the different keys (\'Genres\',\'Author\',\'Description\',\'Rating\') in the python dictionary and be confident of the values.\n    These key value pairs define the user\'s profile.\n    The python dictionary looks like this\n    {\'Genres\': [\'values\'],\'Author\': \'values\',\'Description\': \'values\',\'Rating\': \'values\'}\n    The value for \'Rating\' should include a numerical value extracted from the user\'s response , it can also include with less than or greater than or absolute value as well.\n    The values of \'Description\' should include users preferrence based on provided Genres values, as stated by user.\n    All the values in the example dictionary are only rep

Let's say that after a series of conversations you get the following response from the assistant.

In [None]:
# # Example 1 - Let's check with the confirmation_layer if all the keys are present
# debug_response_assistant_1 = f"""
# Great, thank you for clarifying your requirements.
# Based on your inputs, here is the final profile for the laptop you are looking for:
# {{'GPU intensity':'high',
#  'Display quality':'high',
#  'Portability':'low',
#  'Multitasking':'low',
#  'Processing speed':'low',
#  'Budget':'50000 INR'}}
# """
# #Note that you are using double curly braces

# print(debug_response_assistant_1)

<!-- Do you think it'll pass the `intent_confirmation_layer()` test?

 Let's try it out. -->

In [None]:
# response = intent_confirmation_layer(debug_response_assistant_1)
# response.get('result') # Extract the result key from the dictionary

In [None]:
# # Example 2 - Let's check confirmation_layer if all the keys are present
# debug_response_assistant_2 = f"""
# Great, thank you for clarifying your requirements.
# Based on your inputs, here is the final profile for the laptop you are looking for:
# {{'GPU intensity':'high',
#  'Display quality':'high',
#  'Portability':'low',
#  'Multitasking':'low',
#  'Processing speed':'low'}}
# """
# #Note that you are using double curly braces

# print(debug_response_assistant_2)

In [None]:
# intent_confirmation_layer(debug_response_assistant_2)
# # iterate_llm_response(intent_confirmation_layer, debug_response_assistant_2)

In [None]:
# # Example 3 - Let's check confirmation_layer if all the keys are present
# debug_response_assistant_3 = f"""
# Great, thank you for clarifying your requirements.
# Based on your inputs, here is the final profile for the laptop you are looking for:
# {{'GPU intensity':'high',
#  'Display quality':'high',
#  'Portability':'low',
#  'Multitasking':'low',
#  'Processing speed':'low',
#  'Budget':'50000'}}
# """
# #Note that you are using double curly braces

# print(debug_response_assistant_3)

In [None]:
# intent_confirmation_layer(debug_response_assistant_3)
# # iterate_llm_response(intent_confirmation_layer, debug_response_assistant_3)

Let's now look at the working of `dictionary_present()`.

### `dictionary_present()`:

This function checks if the final understanding of user's profile is returned by the chatbot is a Python dictionary or not. This is important as it'll be used later on for finding the right laptops using dictionary matching.

In [210]:
def dictionary_present(response):
    delimiter = "####"

    # user_req = {'GPU intensity': 'high',
    #             'Display quality': 'high',
    #             'Portability': 'medium',
    #             'Multitasking': 'high',
    #             'Processing speed': 'high',
    #             'Budget': '200000'}
    
    user_req = {'Genres': ['Classic , Fiction'],
                'Author':'J.K. rowling ',
                'Description':'something involving castles',
                'Rating':'greater than 3.5'
                }
    
    

    prompt = f"""You are a python expert. You are provided an input.
            You have to check if there is a python dictionary present in the string.
            It will have the following format {user_req}.
            Your task is to just extract the relevant values from the input and return only the python dictionary in JSON format.
            The output should match the format as {user_req}.

            {delimiter}
            Make sure that the value of rating is also present in the user input. ###
            The output should contain the exact keys and values as present in the input.
            Ensure the keys and values are in the given format:
            {{
            'Genres': ['values'],
            'Author':'values',
            'Description':'values',
            'Rating':'values'
            }}
            Here are some sample input output pairs for better understanding:
            {delimiter}
            input 1: - Genres: Fiction - Author: J.K.Rowling - Description: something involving castles - Rating: greater than 3.5
            output 1: {{'Genres': 'Fiction', 'Author': 'J.K.Rowling', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}}

            input 2: - Genres: Motivational - Author:  - Description: book to motivate towards healthy life - Rating: greater than 3.5
            output 2: {{'Genres': 'Motivational', 'Author': '', 'Description': 'book to motivate towards healthy life', 'Rating': 'greater than 3.5'}}

            input 3: - Genres: Biography - Author:  - Description: biography of scientists - Rating: greater than 3.5
            output 3: {{'Genres': 'Biography', 'Author': 'J.K.Rowling', 'Description': 'biography of scientists', 'Rating': 'greater than 3.5'}}
            {delimiter}
            """
    messages = [{"role": "system", "content":prompt },
                {"role": "user", "content":f"""Here is the user input: {response}""" }]

    confirmation = get_chat_completions(messages, json_format = True)

    return confirmation

Let's start by passing the `debug_response_assistant`.

In [211]:
debug_response_assistant_n = """
{'Genres': ['Classic , Fiction'],
'Author':'J.K. rowling ',
'Description':'something involving castles',
'Rating':'greater than 3.5'
}
"""

In [212]:
response_dict_n = dictionary_present(debug_response_assistant_n)
display(response_dict_n)

{'Genres': ['Classic , Fiction'],
 'Author': 'J.K. rowling ',
 'Description': 'something involving castles',
 'Rating': 'greater than 3.5'}

In [213]:
type(response_dict_n)

dict

What if you pass something like this where it is not in the form of a dictionary? Or some key or some values are missing? Let's see.

In [214]:
debug_response_assistant_n = f"""Thank you for providing your information.
Based on your requirements  I will consider this while recommending suitable books options for you.
Here is the final recommendation for your book/s:
'Genres': 'Classic , Fiction',
'Author':'J.K. rowling ',
'Description':'something involving castles',
'Rating':'greater than 3.5'


Please note that these specifications are based on your requirements.
Let me know if there's anything else I can assist you with!"""

In [215]:
response_dict_n = dictionary_present(debug_response_assistant_n)
display(response_dict_n)

{'Genres': 'Classic , Fiction',
 'Author': 'J.K. rowling ',
 'Description': 'something involving castles',
 'Rating': 'greater than 3.5'}

In [216]:
type(response_dict_n)

dict

In [217]:
# Check for LLM function's consistency
iterate_llm_response(dictionary_present, debug_response_assistant_n)

Iteration: 0
{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}
--------------------------------------------------
Iteration: 1
{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}
--------------------------------------------------
Iteration: 2
{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}
--------------------------------------------------
Iteration: 3
{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}
--------------------------------------------------
Iteration: 4
{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}
--------------------------------------------------
Itera

Let's quickly take a look at the code that we have run until now.

In [None]:
# debug_conversation

In [236]:
debug_conversation = initialize_conversation()
debug_user_input = "Hi, I am Mahalakshmi. I need a book recommendation for Fiction and classic."
debug_moderation = moderation_check(debug_user_input)
debug_conversation.append({"role": "user", "content": debug_user_input})

debug_response_assistant = get_chat_completions(debug_conversation)
debug_moderation = moderation_check(debug_response_assistant)
debug_conversation.append({"role": "assistant", "content": debug_response_assistant})

debug_confirmation = intent_confirmation_layer(debug_response_assistant)
# After a series of conversation...
response_dict_n = dictionary_present(debug_response_assistant_n)
print(response_dict_n)

{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}


So, now that you have the user profile stored in `response_dict_n`. We'll use this to generate recommendations. Before that, we need to create a similar profile for every laptop. Let's see how we do it.

## Stage 2

[Stage 2 Flowchart](https://cdn.upgrad.com/uploads/production/c71aa254-32db-4265-a083-f4a540dac014/Stage+2.jpg)

### 3.3 Implementing the Product Mapping and Information Extraction Layers
This stage consists of the steps that will extract information and form book features .

### `product_map_layer()`:

-  Use a prompt that assign it the role of a Book Specifications Classifier, whose objective is to extract key features and classify them based on book fields and descriptions.

- Provide step-by-step instructions for extracting book features from description as well. This function is responsible for extracting key features and criteria from book descriptions. 

- Since all the details of the books are already available in given columns and there is no logic is to be applied for the segregation of data we will use simple pandas commands to extact the required information and create books faetures column which will contin a dictionary with our key and values

- Assign specific rules for each feature (e.g., Author, Description , Genres, Avg Rating) and associate them with the appropriate classification rating value.

- Includes Few Shot Prompting (sample conversation between the user and assistant) to demonstrate the expected result of the feature extraction and classification process.

In [315]:
book_df= pd.read_csv('goodreads_data.csv')

In [316]:
book_df

Unnamed: 0.1,Unnamed: 0,Book,Author,Description,Genres,Avg_Rating,Num_Ratings,URL
0,0,To Kill a Mockingbird,Harper Lee,The unforgettable novel of a childhood in a sl...,"['Classics', 'Fiction', 'Historical Fiction', ...",4.27,5691311,https://www.goodreads.com/book/show/2657.To_Ki...
1,1,Harry Potter and the Philosopher’s Stone (Harr...,J.K. Rowling,Harry Potter thinks he is an ordinary boy - un...,"['Fantasy', 'Fiction', 'Young Adult', 'Magic',...",4.47,9278135,https://www.goodreads.com/book/show/72193.Harr...
2,2,Pride and Prejudice,Jane Austen,"Since its immediate success in 1813, Pride and...","['Classics', 'Fiction', 'Romance', 'Historical...",4.28,3944155,https://www.goodreads.com/book/show/1885.Pride...
3,3,The Diary of a Young Girl,Anne Frank,Discovered in the attic in which she spent the...,"['Classics', 'Nonfiction', 'History', 'Biograp...",4.18,3488438,https://www.goodreads.com/book/show/48855.The_...
4,4,Animal Farm,George Orwell,Librarian's note: There is an Alternate Cover ...,"['Classics', 'Fiction', 'Dystopia', 'Fantasy',...",3.98,3575172,https://www.goodreads.com/book/show/170448.Ani...
...,...,...,...,...,...,...,...,...
9995,9995,"Breeders (Breeders Trilogy, #1)",Ashley Quigley,How far would you go? If human society was gen...,"['Dystopia', 'Science Fiction', 'Post Apocalyp...",3.44,276,https://www.goodreads.com/book/show/22085400-b...
9996,9996,Dynamo,Eleanor Gustafson,Jeth Cavanaugh is searching for a new life alo...,[],4.23,60,https://www.goodreads.com/book/show/20862902-d...
9997,9997,The Republic of Trees,Sam Taylor,This dark fable tells the story of four Englis...,"['Fiction', 'Horror', 'Dystopia', 'Coming Of A...",3.29,383,https://www.goodreads.com/book/show/891262.The...
9998,9998,"Waking Up (Healing Hearts, #1)",Renee Dyer,For Adriana Monroe life couldn’t get any bette...,"['New Adult', 'Romance', 'Contemporary Romance...",4.13,263,https://www.goodreads.com/book/show/19347252-w...


In [317]:
def create_row_dict(row):
    return row.to_dict()

In [318]:
book_df.columns

Index(['Unnamed: 0', 'Book', 'Author', 'Description', 'Genres', 'Avg_Rating', 'Num_Ratings',
       'URL'],
      dtype='object')

In [223]:
# book_df_dropped = book_df.drop(columns=[ 'URL', 'Num_Ratings','Book'])
# book_df_dropped.columns

Index(['Unnamed: 0', 'Author', 'Description', 'Genres', 'Avg_Rating'], dtype='object')

In [321]:
book_df_cleaned = book_df.loc[:, ~book_df.columns.str.contains('^Unnamed')]

In [322]:
book_df_cleaned.columns

Index(['Book', 'Author', 'Description', 'Genres', 'Avg_Rating', 'Num_Ratings', 'URL'], dtype='object')

In [323]:
# Run this code once to extract book info in the form of a dictionary
# Create a new column "book_feature" that contains the dictionary of the product features
book_df_cleaned['book_features'] = book_df_cleaned.apply(create_row_dict, axis=1)
book_df_cleaned['book_features']

0       {'Book': 'To Kill a Mockingbird', 'Author': 'H...
1       {'Book': 'Harry Potter and the Philosopher’s S...
2       {'Book': 'Pride and Prejudice', 'Author': 'Jan...
3       {'Book': 'The Diary of a Young Girl', 'Author'...
4       {'Book': 'Animal Farm', 'Author': 'George Orwe...
                              ...                        
9995    {'Book': 'Breeders (Breeders Trilogy, #1)', 'A...
9996    {'Book': 'Dynamo', 'Author': 'Eleanor Gustafso...
9997    {'Book': 'The Republic of Trees', 'Author': 'S...
9998    {'Book': 'Waking Up (Healing Hearts, #1)', 'Au...
9999    {'Book': 'Bits and Pieces: Tales and Sonnets',...
Name: book_features, Length: 10000, dtype: object

In [324]:
# book_df['book_feature'] = book_df.apply(lambda x: product_map_layer(x))

In [325]:
book_df_cleaned.to_csv("goodreads_updated_data.csv",index=False,header = True)

### `compare_books_with_user()`:

This function compares the user's profile with the different books and come back with the top recommendations. It will perform the following steps:
- It will take the user requirements dictionary as input
- Filter the books based on their rating, keeping only the ones within the user's rating perference.
- Calculate a score for each book based on how well it matches the user's requirements.
- Sort the books based on their scores in descending order.
- Return the top 3 books as a JSON-formatted string.

In [237]:
response_dict_n

{'Genres': 'Classic , Fiction',
 'Author': 'J.K. rowling ',
 'Description': 'something involving castles',
 'Rating': 'greater than 3.5'}

In [180]:
# user_requirements = response_dict_n

# numbers = re.findall(r'\d+\.\d+|\d+', user_requirements.get('Rating', '0'))

# # extract_numbers = re.findall(r'\d+', user_requirements.get('Rating', '0'))

# numbers = [float(num) for num in numbers]
# type(numbers)
# numbers

[3.5]

In [181]:
# numbers = re.findall(r'\d+\.\d+|\d+', user_requirements.get('Rating', '0'))
# rating = [float(num) for num in numbers]

# filtered_books_1 = book_df_cleaned.copy()
# filtered_books_1['Avg_Rating'] = filtered_books_1['Avg_Rating'].astype(float)
# filtered_books_1 = filtered_books_1[filtered_books_1['Avg_Rating'] >= rating[0]].copy()
# filtered_books_1

Unnamed: 0,Author,Description,Genres,Avg_Rating,book_features
0,Harper Lee,The unforgettable novel of a childhood in a sl...,"['Classics', 'Fiction', 'Historical Fiction', ...",4.27,"{'Author': 'Harper Lee', 'Description': 'The u..."
1,J.K. Rowling,Harry Potter thinks he is an ordinary boy - un...,"['Fantasy', 'Fiction', 'Young Adult', 'Magic',...",4.47,"{'Author': 'J.K. Rowling', 'Description': 'Har..."
2,Jane Austen,"Since its immediate success in 1813, Pride and...","['Classics', 'Fiction', 'Romance', 'Historical...",4.28,"{'Author': 'Jane Austen', 'Description': 'Sinc..."
3,Anne Frank,Discovered in the attic in which she spent the...,"['Classics', 'Nonfiction', 'History', 'Biograp...",4.18,"{'Author': 'Anne Frank', 'Description': 'Disco..."
4,George Orwell,Librarian's note: There is an Alternate Cover ...,"['Classics', 'Fiction', 'Dystopia', 'Fantasy',...",3.98,"{'Author': 'George Orwell', 'Description': 'Li..."
...,...,...,...,...,...
9993,Tom Vetter,"In Call To Crusade, Tom Vetter begins the Sieg...","['Historical Fiction', 'Historical']",4.56,"{'Author': 'Tom Vetter', 'Description': 'In Ca..."
9994,Marc-Uwe Kling,"""Kannst du heute mal bezahlen?"", fragt das Kän...","['Humor', 'Fiction', 'Audiobook', 'German Lite...",4.30,"{'Author': 'Marc-Uwe Kling', 'Description': '""..."
9996,Eleanor Gustafson,Jeth Cavanaugh is searching for a new life alo...,[],4.23,"{'Author': 'Eleanor Gustafson', 'Description':..."
9998,Renee Dyer,For Adriana Monroe life couldn’t get any bette...,"['New Adult', 'Romance', 'Contemporary Romance...",4.13,"{'Author': 'Renee Dyer', 'Description': 'For A..."


#### Scoring Criterion:
- Genre : Score is incremented for every Genre match 
    - Eg: if user requirement is Classic and Fiction and the book feature has both then the score would be incremented by 2

- Author: Score is incremented by 1 if there is match on Author name as well

- Description : If the user desciption word/s are present in book feature description then score is incremented by only 1 for each hits. 


In [268]:
user_requirements = response_dict_n

In [272]:
#logic for Genres and description
str1 = "Classic , Fiction"
str2 = "['Classics', 'Fiction', 'Historical Fiction', 'School', 'Literature', 'Young Adult', 'Historical']"

# Split str1 into individual strings
str1_list = [s.strip() for s in user_str1.split(' ')]

# Check if any of the strings in str1_list are present in str2
found_strings = [s for s in str1_list if s in book_str2]

# Print result
if found_strings:
    filtered_list = [s for s in found_strings if len(s) > 1]
    print(f"Found filtered strings: {filtered_list}")
    print("Score" , len(filtered_list))
else:
    print("No strings found")

Found filtered strings: ['Classic', 'Fiction']
Score 2


In [274]:
# Logic for Author
# book_str1 = "J.K. Rowling"
# user_str2 = "J.K. rowling"

def normalize_string(s):
    # Convert to lowercase
    s = s.lower()
    # Remove punctuation and extra spaces
    s = re.sub(r'\W+', ' ', s)
    # Remove leading and trailing whitespace
    s = s.strip()
    return s

# Strings to compare
book_str1 = "J.K. Rowling"
user_str2 = "j   K rowling"

# Normalize the strings
normalized_book_str1 = normalize_string(book_str1)
normalized_user_str2 = normalize_string(user_str2)

# Check if the normalized strings are the same
are_strings_same = normalized_book_str1 == normalized_user_str2

print(f"Are the strings the same? {are_strings_same}")


Are the strings the same? True


In [285]:
#logic for description
str2 = "The unforgettable novel of a childhood in a sleepy Southern town and the crisis of conscience that rocked it. To Kill A Mockingbird became both an instant bestseller and a critical success when it was first published in 1960. It went on to win the Pulitzer Prize in 1961 and was later made into an Academy Award-winning film, also a classic.Compassionate, dramatic, and deeply moving, To Kill A Mockingbird takes readers to the roots of human behavior - to innocence and experience, kindness and cruelty, love and hatred, humor and pathos. Now with over 18 million copies in print and translated into forty languages, this regional story by a young Alabama woman claims universal appeal. Harper Lee always considered her book to be a simple love story. Today it is regarded as a masterpiece of American literature."

str1 = "something involving  castles"

# Split str1 into individual strings
str1_list = [s.strip() for s in str1.split(' ')]

# Check if any of the strings in str1_list are present in str2
found_strings = [s for s in str1_list if s in str2]

# Print result
if found_strings:
    filtered_list = [s for s in found_strings if len(s) > 1]
    print(f"Found filtered strings: {filtered_list}")
    print("Score = 1")
          
else:
    print("No strings found")

Found filtered strings: []
Score = 1


In [298]:
print(response_dict_n)

{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 3.5'}


In [299]:
response_dict_n = {'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 4.4'}

In [300]:
print(response_dict_n)

{'Genres': 'Classic , Fiction', 'Author': 'J.K. rowling ', 'Description': 'something involving castles', 'Rating': 'greater than 4.4'}


In [344]:
import pandas as pd  # Importing the pandas library for data manipulation
import json
import time
def compare_books_with_user(user_req_string):
    #It will take the user requirements dictionary as input
    user_requirements = response_dict_n
    
    #Filter the books based on their rating, keeping only the ones within the user's rating perference.
    book_df = pd.read_csv('goodreads_updated_data.csv')

    # Extracting the rating value from user_requirements and converting it to an float value
    numbers = re.findall(r'\d+\.\d+|\d+', user_requirements.get('Rating', '0'))
    rating = [float(num) for num in numbers]
   
    # Creating a copy of the DataFrame and filtering laptops based on the rating
    filtered_books = book_df.copy()
    filtered_books['Avg_Rating'] = filtered_books['Avg_Rating'].astype(float)
    filtered_books = filtered_books[filtered_books['Avg_Rating'] >= rating[0]].copy()
    # filtered_books

    ##  Calculate a score for each book based on how well it matches the user's requirements.
    
    # Creating a new column 'Score' in the filtered DataFrame and initializing it to 0
    filtered_books['Score'] = 0
    # Start time
    start_time = time.time()
    # # # Iterating over each book in the filtered DataFrame to calculate scores based on user requirements
    for index, row in filtered_books.iterrows():
    
        book_features = row['book_features']
        
        book_features_values = dictionary_present(book_features)
        score = 0
        
        # Breaking out of loop after 30 seconds to get results soon 
        #Check if 30 seconds have elapsed
        if time.time() - start_time > 120:
            # print("30 seconds elapsed. Exiting loop.")
            break
#         print("book_values",book_features_values) 
#         print("user_requirements",user_requirements)

    #     # Comparing user requirements with book features and updating scores
        for key, user_value in user_requirements.items():
            # if key.lower() == 'budget':
            
            # print("key",key)
            # print("user_value",user_value)
            # print(type(user_value))
            
            if key == 'Rating':
                continue  # Skipping rating comparison
            
            if key == 'Genres':
                #logic for Genres
                # str1 = "Classic , Fiction"
                # str2 = "['Classics', 'Fiction', 'Historical Fiction', 'School', 'Literature', 'Young Adult', 'Historical']"

                # Split str1 into individual strings
                str1_list = [s.strip() for s in user_value.split(' ')]

                # Check if any of the strings in str1_list are present in str2
                found_strings = [s for s in str1_list if s in book_features_values.get(key , None)]

                # Print result
                if found_strings:
                    filtered_list = [s for s in found_strings if len(s) > 1]
                    # print(f"Found filtered strings: {filtered_list}")
                    # print("Score" , len(filtered_list))
                    score += len(filtered_list)
                    
            if key == 'Author':
                # Normalize the strings
                normalized_book_value = normalize_string(book_features_values.get(key , None))
                normalized_user_value = normalize_string(user_value)

                # Check if the normalized strings are the same
                are_strings_same = normalized_book_value == normalized_user_value

                if are_strings_same:
                    score += 1

            if key == 'Description':
                # Split str1 into individual strings 
                str1_list = [s.strip() for s in user_value.split(' ')]

                # Check if any of the strings in str1_list are present in str2
                found_strings = [s for s in str1_list if s in book_features_values.get(key , None)]

                # Print result
                if found_strings:
                    filtered_list = [s for s in found_strings if len(s) > 1]
                    # print(f"Found filtered strings: {filtered_list}")
                    # print("Score" , len(filtered_list))
                    if len(filtered_list) > 0: 
                        score += 1

            
            filtered_books.loc[index, 'Score'] = score # Updating the 'Score' column in the DataFrame
            # print("score",filtered_books['Score'])
         
    # Sorting books by score in descending order and selecting the top 3 products
    top_books = filtered_books.drop('book_features', axis=1)
    top_books = top_books.sort_values('Score', ascending=False).head(3)
    
    # Converting the top books DataFrame to JSON format
    top_books_json = top_books.to_json(orient='records')  

    # top_laptops
    return top_books_json

Now that you have the `compare_books_with_user()` function ready, let's pass the `response_dict_n` to the function to get top 3 recommendation.

In [327]:
display(response_dict_n, '\n',type(response_dict_n))

{'Genres': 'Classic , Fiction',
 'Author': 'J.K. rowling ',
 'Description': 'something involving castles',
 'Rating': 'greater than 4.4'}

'\n'

dict

In [328]:
dictionary_present(response_dict_n)

{'Genres': ' Classic , Fiction',
 'Author': 'J.K. rowling ',
 'Description': 'something involving castles',
 'Rating': 'greater than 4.4'}

In [329]:
top_3_books = compare_books_with_user(response_dict_n)

display(top_3_books)

30 seconds elapsed. Exiting loop.


'[{"Book":"Harry Potter and the Philosopher\\u2019s Stone (Harry Potter, #1)","Author":"J.K. Rowling","Description":"Harry Potter thinks he is an ordinary boy - until he is rescued by an owl, taken to Hogwarts School of Witchcraft and Wizardry, learns to play Quidditch and does battle in a deadly duel. The Reason ... HARRY POTTER IS A WIZARD!","Genres":"[\'Fantasy\', \'Fiction\', \'Young Adult\', \'Magic\', \'Childrens\', \'Middle Grade\', \'Classics\']","Avg_Rating":4.47,"Num_Ratings":"9,278,135","URL":"https:\\/\\/www.goodreads.com\\/book\\/show\\/72193.Harry_Potter_and_the_Philosopher_s_Stone","Score":3},{"Book":"Harry Potter and the Deathly Hallows (Harry Potter, #7)","Author":"J.K. Rowling","Description":"Harry has been burdened with a dark, dangerous and seemingly impossible task: that of locating and destroying Voldemort\'s remaining Horcruxes. Never has Harry felt so alone, or faced a future so full of shadows. But Harry must somehow find within himself the strength to complete

In [330]:
# Get output in JSON Format
top_3_books_json = json.loads(top_3_books)
# type(top_3_laptops_json)
top_3_books_json

[{'Book': 'Harry Potter and the Philosopher’s Stone (Harry Potter, #1)',
  'Author': 'J.K. Rowling',
  'Description': 'Harry Potter thinks he is an ordinary boy - until he is rescued by an owl, taken to Hogwarts School of Witchcraft and Wizardry, learns to play Quidditch and does battle in a deadly duel. The Reason ... HARRY POTTER IS A WIZARD!',
  'Genres': "['Fantasy', 'Fiction', 'Young Adult', 'Magic', 'Childrens', 'Middle Grade', 'Classics']",
  'Avg_Rating': 4.47,
  'Num_Ratings': '9,278,135',
  'URL': 'https://www.goodreads.com/book/show/72193.Harry_Potter_and_the_Philosopher_s_Stone',
  'Score': 3},
 {'Book': 'Harry Potter and the Deathly Hallows (Harry Potter, #7)',
  'Author': 'J.K. Rowling',
  'Description': "Harry has been burdened with a dark, dangerous and seemingly impossible task: that of locating and destroying Voldemort's remaining Horcruxes. Never has Harry felt so alone, or faced a future so full of shadows. But Harry must somehow find within himself the strength to 

### `product_validation_layer()`:

This function verifies that the laptop recommendations are good enough, has score greater than 2, and matches the user's requirements.

In [332]:
def recommendation_validation(book_recommendation):
    data = json.loads(book_recommendation)
    data1 = []
    for i in range(len(data)):
        if data[i]['Score'] >= 2:
            data1.append(data[i])

    return data1

In [333]:
validated_data = recommendation_validation(top_3_books)
display(validated_data,'\n')

[{'Book': 'Harry Potter and the Philosopher’s Stone (Harry Potter, #1)',
  'Author': 'J.K. Rowling',
  'Description': 'Harry Potter thinks he is an ordinary boy - until he is rescued by an owl, taken to Hogwarts School of Witchcraft and Wizardry, learns to play Quidditch and does battle in a deadly duel. The Reason ... HARRY POTTER IS A WIZARD!',
  'Genres': "['Fantasy', 'Fiction', 'Young Adult', 'Magic', 'Childrens', 'Middle Grade', 'Classics']",
  'Avg_Rating': 4.47,
  'Num_Ratings': '9,278,135',
  'URL': 'https://www.goodreads.com/book/show/72193.Harry_Potter_and_the_Philosopher_s_Stone',
  'Score': 3},
 {'Book': 'Harry Potter and the Deathly Hallows (Harry Potter, #7)',
  'Author': 'J.K. Rowling',
  'Description': "Harry has been burdened with a dark, dangerous and seemingly impossible task: that of locating and destroying Voldemort's remaining Horcruxes. Never has Harry felt so alone, or faced a future so full of shadows. But Harry must somehow find within himself the strength to 

'\n'

Now that you the top 3 books extracted, let's pass it to the recommendation layer that'll send it to the user and the user can ask questions around it.

## Stage 3

[Stage 3 Flowchart](https://cdn.upgrad.com/uploads/production/4c12bc73-8c12-4095-90f2-3dfca0f277e5/Stage+3.jpg)

### 3.4: Product Recommendation Layer

Finally, we come to the product recommendation layer. It takes the output from the `compare_books_with_user` function in the previous layer and provides the recommendations to the user. It has the following steps.
1. Initialize the conversation for recommendation.
2. Generate the recommendations and display in a presentable format.
3. Ask questions basis the recommendations.



In [334]:
def initialize_conv_reco(products):
    system_message = f"""
    You are an intelligent reccomender and an expert on all kinds of books and you are tasked with the objective to \
    solve the user queries about any product from the catalogue in the user message \
    You should keep the user profile in mind while answering the questions.\

    Start with a brief summary of each book in the following format, in decreasing order of rating of books:
    1. <Book Name> : <Author Name> , <Short description of book>, <Genre>, <Rating>
    2. <Book Name> : <Author Name>  , <Short description of book>,<Genre>, <Rating>

    """
    user_message = f""" These are the user's products: {products}"""
    conversation = [{"role": "system", "content": system_message },
                    {"role":"user","content":user_message}]
    # conversation_final = conversation[0]['content']
    return conversation

Let's initialize the conversation for recommendation.

In [335]:
debug_conversation_reco = initialize_conv_reco(top_3_books)
debug_conversation_reco

[{'role': 'system',
  'content': '\n    You are an intelligent reccomender and an expert on all kinds of books and you are tasked with the objective to     solve the user queries about any product from the catalogue in the user message     You should keep the user profile in mind while answering the questions.\n    Start with a brief summary of each book in the following format, in decreasing order of rating of books:\n    1. <Book Name> : <Author Name> , <Short description of book>, <Genre>, <Rating>\n    2. <Book Name> : <Author Name>  , <Short description of book>,<Genre>, <Rating>\n\n    '},
 {'role': 'user',
  'content': ' These are the user\'s products: [{"Book":"Harry Potter and the Philosopher\\u2019s Stone (Harry Potter, #1)","Author":"J.K. Rowling","Description":"Harry Potter thinks he is an ordinary boy - until he is rescued by an owl, taken to Hogwarts School of Witchcraft and Wizardry, learns to play Quidditch and does battle in a deadly duel. The Reason ... HARRY POTTER I

Let's see what the assistant responds with the new initialization.

In [336]:
debug_recommendation = get_chat_completions(debug_conversation_reco)
print(debug_recommendation + '\n')

1. Harry Potter and the Deathly Hallows (Harry Potter, #7) : J.K. Rowling  , Harry has been burdened with a dark, dangerous and seemingly impossible task: that of locating and destroying Voldemort's remaining Horcruxes. Never has Harry felt so alone, or faced a future so full of shadows. But Harry must somehow find within himself the strength to complete the task he has been given. He must leave the warmth, safety and companionship of The 
Burrow and follow without fear or hesitation the inexorable path laid out for him...In this final, seventh installment of the Harry Potter series, J.K. Rowling unveils in spectacular fashion the answers to the many questions that have been so eagerly awaited., Fantasy, 4.62
2. Harry Potter and the Prisoner of Azkaban (Harry Potter, #3) : J.K. Rowling  , Harry Potter, along with his best friends, Ron and Hermione, is about to start his third year at Hogwarts School of Witchcraft and Wizardry. Harry can't wait to get back to school after the summer hol

Now, you can converse with the chatbot on the filtered products.

In [337]:
response_dict_n

{'Genres': 'Classic , Fiction',
 'Author': 'J.K. rowling ',
 'Description': 'something involving castles',
 'Rating': 'greater than 4.4'}

In [338]:
debug_conversation_reco.append({"role": "user", "content": "This is my user book preference profile" + str(response_dict_n)})
debug_conversation_reco.append({"role": "assistant", "content": debug_recommendation})

In [339]:
debug_user_input = "Which one is ideal for kids to start with?"

In [340]:
debug_conversation_reco.append({"role": "user", "content": debug_user_input})
debug_response_asst_reco = get_chat_completions(debug_conversation_reco)
display('\n' + debug_response_asst_reco + '\n')

'\nFor kids who are starting to read the Harry Potter series, it is ideal to begin with "Harry Potter and the Philosopher’s Stone (Harry Potter, #1)" by J.K. Rowling. This book introduces readers to the magical world of Hogwarts School of Witchcraft and Wizardry, and sets the foundation for the adventures that follow in the rest of the series. It is a great starting point for young readers to dive into the world of Harry Potter and his friends.\n'

You can repeat the process of appending the assistant and user messages and test the chatbot out.

## Combining all the 3 stages

In this layer, we combine all the three stages that we defined above.

`Stage 1` + `Stage 2` + `Stage 3`

### 3.5 Dialogue Management System

Bringing everything together, we create a `diagloue_mgmt_system()` function that contains the logic of how the different layers would interact with each other. This will be the function that we'll call to initiate the chatbot

In [342]:
def dialogue_mgmt_system():
    conversation = initialize_conversation()

    introduction = get_chat_completions(conversation)

    display(introduction + '\n')

    top_3_books = None

    user_input = ''

    while(user_input != "exit"):

        user_input = input("")

        moderation = moderation_check(user_input)
        if moderation == 'Flagged':
            display("Sorry, this message has been flagged. Please restart your conversation.")
            break

        if top_3_books is None:

            conversation.append({"role": "user", "content": user_input})

            response_assistant = get_chat_completions(conversation)
            moderation = moderation_check(response_assistant)
            if moderation == 'Flagged':
                display("Sorry, this message has been flagged. Please restart your conversation.")
                break


            confirmation = intent_confirmation_layer(response_assistant)

            print("Intent Confirmation Yes/No:",confirmation.get('result'))

            if "No" in confirmation.get('result'):
                conversation.append({"role": "assistant", "content": str(response_assistant)})
                print("\n" + str(response_assistant) + "\n")

            else:
                print("\n" + str(response_assistant) + "\n")
                print('\n' + "Variables extracted!" + '\n')

                response = dictionary_present(response_assistant)

                print("Thank you for providing all the information. Kindly wait, while I fetch the products: \n")
                top_3_books = compare_books_with_user(response)

                print("top 3 laptops are", top_3_books)

                validated_reco = recommendation_validation(top_3_books)

                conversation_reco = initialize_conv_reco(validated_reco)

                conversation_reco.append({"role": "user", "content": "This is my user profile" + str(response)})

                recommendation = get_chat_completions(conversation_reco)

                moderation = moderation_check(recommendation)
                if moderation == 'Flagged':
                    display("Sorry, this message has been flagged. Please restart your conversation.")
                    break

                conversation_reco.append({"role": "assistant", "content": str(recommendation)})

                print(str(recommendation) + '\n')
        else:
            conversation_reco.append({"role": "user", "content": user_input})

            response_asst_reco = get_chat_completions(conversation_reco)

            moderation = moderation_check(response_asst_reco)
            if moderation == 'Flagged':
                print("Sorry, this message has been flagged. Please restart your conversation.")
                break

            print('\n' + response_asst_reco + '\n')
            conversation.append({"role": "assistant", "content": response_asst_reco})

In [346]:
dialogue_mgmt_system()

"Hello! It's great to have you here. Please share your book preferences or any specific requirements you have so that I can recommend the perfect book for you.\n"

 I prefer fictional crime mysteries


Intent Confirmation Yes/No: No

Great choice! Fictional crime mysteries are known for their thrilling plots and intriguing characters. To help you discover your next favorite book in this genre, could you please share if you have any preferred authors or specific themes within fictional crime mysteries? Understanding your preferences will ensure that my recommendations are tailored to your taste.



 Agatha Christie is one of my favourites


Intent Confirmation Yes/No: No

That's fantastic! Agatha Christie is a legendary author known for her captivating crime mysteries. To further tailor my recommendations, could you please describe the kind of fictional crime mysteries you enjoy? Whether you prefer intricate plots, detective-driven stories, or a specific setting? Understanding your preferences will help me suggest books that align with your taste.



 detective-driven stories it is


Intent Confirmation Yes/No: No

Thank you for sharing that preference! Detective-driven stories are filled with suspense and intrigue, focusing on the investigative journey of the protagonist. Based on your love for detective-driven stories and Agatha Christie's works, I will recommend books that offer thrilling mysteries with a strong emphasis on solving crimes. 
Could you please specify the rating range you are looking for in the recommended books? This information will help me tailor my recommendations to meet your expectations.



 more than 4


Intent Confirmation Yes/No: Yes

{'Genres': ['Fictional Crime Mysteries'], 'Author': 'Agatha Christie', 'Description': 'Engaging detective-driven stories with suspenseful plots and intriguing characters', 'Rating': 'more than 4'}


Variables extracted!

Thank you for providing all the information. Kindly wait, while I fetch the products: 

top 3 laptops are [{"Book":"Harry Potter and the Philosopher\u2019s Stone (Harry Potter, #1)","Author":"J.K. Rowling","Description":"Harry Potter thinks he is an ordinary boy - until he is rescued by an owl, taken to Hogwarts School of Witchcraft and Wizardry, learns to play Quidditch and does battle in a deadly duel. The Reason ... HARRY POTTER IS A WIZARD!","Genres":"['Fantasy', 'Fiction', 'Young Adult', 'Magic', 'Childrens', 'Middle Grade', 'Classics']","Avg_Rating":4.47,"Num_Ratings":"9,278,135","URL":"https:\/\/www.goodreads.com\/book\/show\/72193.Harry_Potter_and_the_Philosopher_s_Stone","Score":3},{"Book":"The Complete Sherlock Holmes","Author

 which is better for experienced reader?



For an experienced reader who enjoys fictional crime mysteries, "The Complete Sherlock Holmes" by Arthur Conan Doyle would be a better choice. This classic collection of detective stories featuring Sherlock Holmes offers intricate plots, well-developed characters, and engaging mysteries that are sure to captivate an experienced reader with a preference for the genre.



 exit



If you have any more questions in the future, feel free to ask. Have a great day!



### Challenges faced and Future Possible Improvement 

- In book mapping step extracting and storing features from 10k records from dataset was time consuming.
- Similarly in book extraction step matching user interest with each book feature was time comsuming
- These two steps consume lot of time and affect user interractions and overall experience. I have tried to solve this by using time out but there might be more elegant solutions out there