<center><p float="center">
  <img src="https://upload.wikimedia.org/wikipedia/commons/e/e9/4_RGB_McCombs_School_Brand_Branded.png" width="300" height="100"/>
  <img src="https://mma.prnewswire.com/media/1458111/Great_Learning_Logo.jpg?p=facebook" width="200" height="100"/>
</p></center>

<center><font size=10>Generative AI for Business Applications</font></center>
<center><font size=6>Large Language Models & Prompt Engineering - Week 2</font></center>

<center><p float="center">
  <img src="https://images.pexels.com/photos/262918/pexels-photo-262918.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=1" width=720/>
</p></center>

<center><font size=6>Restaurant Review Analysis</font></center>

## Problem Statement

### Business Context

In the food industry, customer satisfaction plays a pivotal role in shaping the success of individual outlets and the overall brand. A leading global food aggregator is keen on understanding and improving customer experiences across the diverse range of restaurants it lists on its platform. The company recognizes the significance of customer reviews in gaining insights into service quality, food offerings, and overall satisfaction.

### Objective

The objective is to develop a **Large Language Model (LLM)-based sentiment analysis system** that can extract meaningful insights from restaurant reviews using only **prompt engineering** (without Retrieval-Augmented Generation). The system will:

1. Identify the **overall sentiment** (positive, negative, neutral) for each review.
2. Capture **aspect-level sentiments** for key experience categories such as food quality, service, and ambience.
3. Extract **liked and disliked features** within each aspect to provide granular insights for each restaurant.

This approach aims to enable **scalable, automated review analysis** that helps restaurants understand customer feedback in detail, improve service quality, and enhance customer satisfaction, all achieved through **carefully designed prompts**.

### Data Dictionary

The dataset comprises three columns:

1. **restaurant\_id** – Unique identifier for each restaurant.
2. **rating\_review** – Numerical or categorical rating provided by the customer.
3. **review\_full** – Full text of the customer’s review.


## Installing and Importing Necessary Libraries

In [None]:
!pip install -q transformers==4.53.2 \
                  accelerate==1.8.1 \
                  bitsandbytes==0.46.1

**Note**:
- After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
- When you run the install cell above, you might see a warning about package dependencies. It can be ignored: the pinned versions above are the ones needed to run the code in ***this notebook***.

**Prompt:**

<font size=3 color="#4682B4"><b>I want to analyze the provided CSV data and work with AI models to understand the restaurant reviews. Help me import the necessary Python libraries to:

1. Read and manipulate the data
2. Work with the system environment
3. Use models from Hugging Face with AutoTokenizer and AutoModelForCausalLM

</font>

<font size=3 color="#4682B4"><b>
These libraries will help us load the data, connect with AI models, and prepare for further steps in the project.

</font>

In [None]:
import pandas as pd
import json
import os
from transformers import AutoTokenizer, AutoModelForCausalLM

## Import the dataset

***Prompt***:

<font size=3 color="#4682B4"><b> Mount the Google Drive
</font>

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')

***Prompt***:

<font size=3 color="#4682B4"><b> Load the CSV file named "restaurant_reviews.csv" and store it in the variable data.
</font>

In [None]:
data = pd.read_csv("restaurant_reviews.csv")

## Data Overview

***Prompt***:

<font size=3 color="#4682B4"><b> Display the first 5 rows of the `data`.
</font>

In [None]:
# checking the first five rows of the data
data.head()

***Prompt***:

<font size=3 color="#4682B4"><b> Display the number of rows and columns in the `data`.
</font>

In [None]:
data.shape

**Observations**

- Data has 20 rows and 3 columns

In [None]:
# checking for missing values
data.isnull().sum()

**Observations**

- There are no missing values in the data

In [None]:
# creating a copy of the data
df = data.copy()

# Model Loading

**NOTE**

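1. We're loading the entire model, which might take some time to initialize. To fit it on a single GPU, we use 8-bit loading, which roughly halves memory usage compared with 16-bit weights, typically with little impact on output quality. Note that 8-bit loading mainly saves memory; it does not necessarily make inference faster.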

2. Before loading the model, you must first agree to its terms and conditions on Hugging Face. To do this, search for the model on the Hugging Face website, review its license or usage restrictions, and click “Agree and Access” to enable programmatic access via code.
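As a rough, back-of-the-envelope check on why 8-bit loading matters, the sketch below estimates the memory needed for the model weights at different precisions. The parameter count of about 7.2 billion is an assumed approximation for a 7B-class model, and the estimate covers weights only (activations, the KV cache, and quantization overhead are ignored).

```python
# Rough weight-memory estimate; ignores activations, KV cache, and quantization overhead.
APPROX_PARAMS = 7.2e9  # assumed approximate parameter count for a 7B-class model

# Bytes used to store one parameter at each precision
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params, bytes_per_param):
    """Return the approximate size of the model weights in GiB."""
    return num_params * bytes_per_param / (1024 ** 3)


estimates = {
    name: weight_memory_gib(APPROX_PARAMS, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, gib in estimates.items():
    print(f"{name:>8}: ~{gib:.1f} GiB")
```

Under these assumptions, the 8-bit weights take about half the space of the 16-bit weights, which is what makes the model easier to fit on a single mid-range GPU.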


In [None]:
file_name = 'config.json'                                                       # Name of the configuration file
with open(file_name, 'r') as file:                                              # Open the config file in read mode
    config = json.load(file)                                                    # Load the JSON content as a dictionary
    HF_TOKEN = config.get("HF_TOKEN")


# Store API credentials in environment variables
os.environ['HF_TOKEN'] = HF_TOKEN


***Prompt***:

<font size=3 color="#4682B4"><b> Load the `mistralai/Mistral-7B-Instruct-v0.1` model from Hugging Face using 8-bit quantization.

</font>

In [None]:
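import torch
from transformers import BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Passing load_in_8bit directly to from_pretrained is deprecated;
# the quantization settings go in a BitsAndBytesConfig instead.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # Load the model with 8-bit quantization
    torch_dtype=torch.float16,         # Use 16-bit floats on GPU
    device_map="auto",                 # Automatically assign GPU or CPU
    token=HF_TOKEN
)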

* `load_in_8bit=True`: Loads the model using 8-bit quantization to save memory.
* `torch_dtype=torch.float16`: Uses half-precision (16-bit) floats for faster computation on GPU.
* `device_map="auto"`: Automatically places model layers across available devices.


The Hugging Face model is now ready. Let’s test it on an example input.

***Prompt***:

<font size=3 color="#4682B4"><b> Ask the Mistral model: What is the capital of France?
</font>

In [None]:
# Define the prompt (question)
prompt = "### Question: What is the capital of France?\n### Answer:"

# Tokenize input
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

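# Generate response (cap the number of new tokens so the answer is not cut off by the short default length)
outputs = model.generate(**inputs, max_new_tokens=50)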

# Decode and print the output
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Now that the model is returning results successfully, let’s define a function that takes a `prompt` and a `query` as inputs and returns the model’s output. This will make it easier to reuse the model across different inputs.

***Prompt***:

<font size=3 color="#4682B4"><b> Create a function that accepts a prompt and query, and returns the response generated by the Mistral model.
</font>

In [None]:
def query_mistral(prompt, query):
    """
    Queries the Mistral model with a given prompt and query.

    Args:
        prompt (str): The prompt for the model.
        query (str): The query to be answered by the model.

    Returns:
        str: The model's response.
    """
    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": query}
    ]
    inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

    pad_token_id = tokenizer.pad_token_id or tokenizer.eos_token_id

    attention_mask = (inputs != pad_token_id).long()

    outputs = model.generate(
        inputs,
        attention_mask=attention_mask,
        max_new_tokens=300,  # Adjust as needed
        do_sample=True,
        temperature=0.7,     # Adjust as needed
        top_p=0.9,           # Adjust as needed
        pad_token_id=pad_token_id  # Prevents warning
    )

    # Decode only the newly generated tokens, skipping the input tokens
    response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
    return response


In the code snippet defined above, the following components were used before generation:

1. `tokenizer.apply_chat_template()`: This method converts the `messages` list into a single formatted string (e.g., adding special tokens or chat-style formatting), and tokenizes it into a tensor using PyTorch (`return_tensors="pt"`). The `.to(model.device)` part ensures the tokenized input is moved to the same device as the model (like a GPU or CPU).

2. `pad_token_id`: This variable is assigned the padding token ID used by the tokenizer. If the tokenizer does not explicitly define a `pad_token_id`, it falls back to the `eos_token_id` (end-of-sequence token). This is needed to handle padding properly during attention and generation.

3. `attention_mask`: This creates a mask that tells the model which tokens should be attended to (represented by 1) and which should be ignored (usually padding tokens, represented by 0). It ensures the model focuses only on valid input tokens during processing.
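To make the masking and slicing steps concrete, here is a minimal sketch in plain Python. It mirrors the logic of `(inputs != pad_token_id)` and of `outputs[0][inputs.shape[-1]:]` on ordinary lists; the token IDs and the helper name `build_attention_mask` are made up for illustration and do not come from the Mistral tokenizer.

```python
def build_attention_mask(token_ids, pad_token_id):
    """Return 1 for real tokens and 0 for padding tokens."""
    return [0 if tok == pad_token_id else 1 for tok in token_ids]


# Hypothetical token IDs: three real tokens followed by two padding tokens.
PAD_ID = 0
token_ids = [101, 2057, 2293, PAD_ID, PAD_ID]
mask = build_attention_mask(token_ids, PAD_ID)
print(mask)

# generate() returns the prompt followed by the new tokens,
# so slicing off the prompt length keeps only the model's reply.
prompt_len = 3
generated = [101, 2057, 2293, 7, 8, 9]  # made-up output sequence
new_tokens = generated[prompt_len:]
print(new_tokens)
```

One caveat worth keeping in mind: when the padding ID falls back to the end-of-sequence ID, any end-of-sequence token that appears inside the input would also be masked out by this comparison. For a single unpadded prompt this is usually harmless.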

In the `generate()` function defined above, the following arguments were used:

1. `max_new_tokens`: This parameter sets the maximum number of new tokens the model may generate, not counting the tokens in the prompt. In the provided code, `max_new_tokens` is set to 300, which means the generated response will not exceed 300 tokens.

2. `temperature`: The temperature parameter controls the level of randomness in the generation process. A higher temperature (e.g., closer to 1) makes the output more diverse and creative but potentially less focused, while a lower temperature (e.g., close to 0) produces more deterministic and focused but potentially repetitive outputs. In the code, temperature is set to 0.7, a moderate value that keeps the output fairly focused while still allowing some variation.

3. `do_sample`: This is a boolean parameter that determines whether to use sampling during generation (do_sample=True) or use greedy decoding (do_sample=False). When set to True, as in the provided code, the model samples from the distribution of predicted tokens at each step, introducing randomness in the generation process.

4. `top_p`: Enables nucleus sampling, which restricts sampling to the most probable tokens. With `top_p=0.9`, as in the code, the model samples from the smallest set of tokens whose combined probability is at least 90%, balancing creativity and coherence.
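The effect of `temperature` and `top_p` can be sketched on a toy distribution. The example below is a simplified illustration in plain Python, not the library's internal implementation; the logits and the helper names `softmax_with_temperature` and `top_p_filter` are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
sharp = softmax_with_temperature(logits, 0.7)  # more peaked
flat = softmax_with_temperature(logits, 1.5)   # more spread out
nucleus = top_p_filter(sharp, 0.9)             # candidate tokens that survive top-p

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
print(sorted(nucleus))
```

With the lower temperature, the most likely token receives a larger share of the probability mass, and the top-p filter then drops the unlikely tail before sampling.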


# Reviews Sentiment Analysis

## 1. Overall Sentiment Analysis

In [None]:
# defining the instructions for the model
instruction_1 = """
    You are an AI analyzing restaurant reviews. Classify the sentiment of the provided review into the following categories:
    - Positive
    - Negative
    - Neutral

    And return in only JSON format. No extra text and analysis
    {"Sentiment":"Positive"}
"""

***Prompt***:

<font size=3 color="#4682B4"><b> Define a function named classify_sentiment that takes instruction_1 and the review text as input, gets the result from the query_mistral function, and returns the result in JSON format.
</font>

In [None]:
def classify_sentiment(prompt, query):
    try:
        response_text = query_mistral(prompt, query)
        # Attempt to parse the response text as JSON
        classification_result = json.loads(response_text)
        return classification_result
    except json.JSONDecodeError as e:
        print(f"Error decoding JSON from model response: {e}")
        print(f"Raw model response: {response_text}")
        return None
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return None
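Instruction-tuned models sometimes wrap the JSON in extra text despite the prompt, which makes a direct `json.loads` call fail. One possible mitigation, shown here as a sketch only, is to parse just the span between the first `{` and the last `}`. The helper name `extract_json` is hypothetical and is not part of the notebook's pipeline.

```python
import json


def extract_json(response_text):
    """Parse the span from the first '{' to the last '}'; return None if that fails."""
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end <= start:
        return None  # no brace-delimited span found
    try:
        return json.loads(response_text[start:end + 1])
    except json.JSONDecodeError:
        return None  # the span was not valid JSON


# Made-up responses for illustration
print(extract_json('Sure! Here is the result: {"Sentiment": "Positive"}'))
print(extract_json("The review is positive."))
```

This keeps the strict behavior for clean responses while recovering cases where the model adds a short preamble or trailing remark.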



***Prompt***:

<font size=3 color="#4682B4"><b>Generate the sentiment for each review_full in the DataFrame using the classify_sentiment function, and store the result in a new column.

</font>

In [None]:
# Apply the classification function once per review; failed parses yield None
df['Sentiment'] = df['review_full'].apply(lambda x: (classify_sentiment(instruction_1, x) or {}).get('Sentiment'))

In [None]:
df

In [None]:
df['Sentiment'].value_counts()

Across the different restaurants, negative sentiment slightly outweighs positive and neutral feedback, indicating more dissatisfaction overall.


## 2. Sentiment toward Different Aspects of the Experience

In [None]:
# defining the instructions for the model
instruction_2 = """
    You are an AI analyzing restaurant reviews. Identify the following aspects in the review and classify the sentiment of each aspect as "Positive", "Negative", or "Neutral":
    1. "Food Quality"
    2. "Service"
    3. "Ambience"

    Output the sentiment for each aspect in JSON format with the following keys:
    {
        "Food Quality": "your_sentiment_prediction",
        "Service": "your_sentiment_prediction",
        "Ambience": "your_sentiment_prediction"
    }

    In case one of the three aspects is not mentioned in the review, set "Not Applicable" (including quotes) for the corresponding JSON key value.
    Only return the JSON, do not return any other information.
"""

***Prompt***:

<font size=3 color="#4682B4"><b>Define a function that takes the instruction_2 prompt and query as input, gets the result from the query_mistral function, and returns the result in JSON format.
</font>

In [None]:
def classify_aspect_sentiment(prompt, query):
    """
    Classifies the sentiment of aspects of the review using the Mistral model
    and returns the result as a JSON object.

    Args:
        prompt (str): The prompt for the model (instruction_2).
        query (str): The review text.

    Returns:
        dict: A dictionary containing the sentiment classification for each aspect,
              or None if JSON decoding fails.
    """
    try:
        response_text = query_mistral(prompt, query)
        # Attempt to parse the response text as JSON
        classification_result = json.loads(response_text)
        return classification_result
    except json.JSONDecodeError as e:
        print(f"Error decoding JSON from model response: {e}")
        print(f"Raw model response: {response_text}")
        return None
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return None
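Because the model's output is free text, the parsed dictionary may contain unexpected keys or label spellings. The sketch below shows one way to validate it against the keys and labels named in `instruction_2`. The helper `normalize_aspects` and its policy of mapping anything unexpected to `None` are assumptions for illustration.

```python
ASPECTS = ("Food Quality", "Service", "Ambience")
ALLOWED = {"Positive", "Negative", "Neutral", "Not Applicable"}


def normalize_aspects(result):
    """Return a dict with exactly the expected keys; unexpected or missing values become None."""
    if not isinstance(result, dict):
        result = {}  # e.g. the classifier returned None after a failed parse
    cleaned = {}
    for aspect in ASPECTS:
        value = result.get(aspect)
        if isinstance(value, str):
            value = value.strip().title()  # "negative " -> "Negative"
        cleaned[aspect] = value if value in ALLOWED else None
    return cleaned


# Made-up model outputs for illustration
print(normalize_aspects({"Food Quality": "positive", "Service": "Negative ", "Ambience": "not applicable"}))
print(normalize_aspects(None))
```

Applying a check like this before counting labels helps keep `value_counts()` from splitting one sentiment across several spellings.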

***Prompt***:

<font size=3 color="#4682B4"><b>Generate the aspect_sentiment for each review in the DataFrame using classify_aspect_sentiment, store it in a new column, and extract individual fields into separate columns.
</font>

In [None]:
# Apply the classification function to each row in the DataFrame
df['aspect_sentiment'] = df['review_full'].apply(lambda x: classify_aspect_sentiment(instruction_2, x))

# Normalize the JSON results into separate columns
aspect_sentiment_df = pd.json_normalize(df['aspect_sentiment'])

# Concatenate the original DataFrame with the new columns
df = pd.concat([df, aspect_sentiment_df], axis=1)

In [None]:
df

In [None]:
df['Food Quality'].value_counts()

Overall, food quality feedback leans negative, with 8 unfavorable mentions compared to 6 positive, while a few reviews were neutral or not applicable.


In [None]:
df['Service'].value_counts()

Service-related feedback leans negative, with 10 unfavorable mentions compared to 8 positive and 2 neutral reviews.


In [None]:
df['Ambience'].value_counts()

Ambience is the most positively viewed aspect, with 8 favorable mentions and only 1 negative mention.


## 3. Identifying Liked/Disliked Features of the Different Aspects of the Experience

In [None]:
# defining the instructions for the model
instruction_3 = """

You are an AI model assigned to analyze restaurant reviews. Your task is to extract the **specific features** that the customer **liked or disliked**, categorized under the following aspects of the dining experience:

* Food Quality
* Service
* Ambience

Return the result in the following strict JSON format:

{
  "Food Quality Features": ["specific liked/disliked features"],
  "Service Features": ["specific liked/disliked features"],
  "Ambience Features": ["specific liked/disliked features"]
}

**Instructions:**

* Only list **concrete features** (e.g., “taste”, “temperature”, “presentation”, “waiting time”, “staff behavior”, “lighting”, “music volume”) that are mentioned positively or negatively in the review.
* Do **not** include generic phrases like “liked feature” or “disliked feature”.
* If a particular aspect has no feature mentioned, return an empty list for that aspect.
* Output **only the JSON**, with keys exactly as specified. Do not add any explanations or comments.

"""

***Prompt***:

<font size=3 color="#4682B4"><b>Define a function that takes the instruction_3 prompt and query as input, gets the result from the query_mistral function, and returns the result in JSON format.
</font>

In [None]:
def classify_features(prompt, query):
    """
    Extracts liked/disliked features from the review using the Mistral model
    and returns the result as a JSON object.

    Args:
        prompt (str): The prompt for the model (instruction_3).
        query (str): The review text.

    Returns:
        dict: A dictionary containing the extracted features for each aspect,
              or None if JSON decoding fails.
    """
    try:
        response_text = query_mistral(prompt, query)
        # Attempt to parse the response text as JSON
        classification_result = json.loads(response_text)
        return classification_result
    except json.JSONDecodeError as e:
        print(f"Error decoding JSON from model response: {e}")
        print(f"Raw model response: {response_text}")
        return None
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return None

***Prompt***:

<font size=3 color="#4682B4"><b>Generate the Features for each review in the DataFrame using classify_features, store them in a new column, and extract individual fields into separate columns.
</font>

In [None]:
# Apply the classification function to each row in the DataFrame
df['aspect_features'] = df['review_full'].apply(lambda x: classify_features(instruction_3, x))

# Normalize the JSON results into separate columns
aspect_features_df = pd.json_normalize(df['aspect_features'])

# Concatenate the original DataFrame with the new columns
df = pd.concat([df, aspect_features_df], axis=1)

In [None]:
df
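Once the features are extracted, they can be aggregated to see which ones come up most often. The sketch below counts feature mentions with `collections.Counter`. The rows are made-up examples in the shape requested by `instruction_3` (not real model output), and the helper name `top_features` is hypothetical.

```python
from collections import Counter


def top_features(rows, column, n=3):
    """Count how often each feature appears in `column` across rows; return the n most common."""
    counts = Counter()
    for row in rows:
        if not isinstance(row, dict):
            continue  # skip rows where parsing failed
        features = row.get(column) or []  # tolerate missing keys and None values
        counts.update(f.strip().lower() for f in features if isinstance(f, str))
    return counts.most_common(n)


# Made-up rows for illustration only
sample_rows = [
    {"Service Features": ["waiting time", "staff behavior"]},
    {"Service Features": ["Waiting time"]},
    {"Service Features": []},
    None,
]

print(top_features(sample_rows, "Service Features"))
```

Assuming the feature columns exist in `df` after normalization, the same helper could be called on `df.to_dict("records")`, or on the records of each `restaurant_id` group, to obtain restaurant-level summaries.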

## 4. Sharing a Response

In [None]:
# defining the instructions for the model
instruction_4 = """
You are an AI analyzing restaurant reviews. Your task is to generate a **polite and empathetic response** directly based on the sentiment of the review.

Follow this structure:

* Start with a thank you for their feedback.
* Then:

  1. If the review is positive, say you’re glad they enjoyed the experience and that it would be great to have them again.
  2. If the review is neutral, thank them and ask what the restaurant could have done better.
  3. If the review is negative, apologize for the inconvenience and mention that the team will look into the concerns raised.

Constraints:

* Do not start with “Dear Customer” or any greeting.
* Only output the final response. No sentiment label, explanation, or extra text.
"""

***Prompt***:

<font size=3 color="#4682B4"><b>Define a function that takes the instruction_4 prompt and query as input, gets the result from the query_mistral function, and returns the result
</font>

In [None]:
def generate_customer_response(prompt, query):
    """
    Generates a customer response based on the review using the Mistral model.

    Args:
        prompt (str): The prompt for the model (instruction_4).
        query (str): The review text.

    Returns:
        str: The generated customer response.
    """
    response_text = query_mistral(prompt, query)
    return response_text
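Since `instruction_4` places constraints on the reply, a lightweight check can flag responses that ignore them before they are shared. The following is a minimal sketch with simple string rules. The helper `find_violations` and its specific rules are assumptions, and such checks complement rather than replace a manual review.

```python
def find_violations(response_text):
    """Return a list of constraint violations found in a generated reply (empty if none)."""
    problems = []
    text = response_text.strip()
    lowered = text.lower()
    if not text:
        problems.append("empty response")
    if lowered.startswith(("dear ", "hello", "hi ", "hi,")):
        problems.append("starts with a greeting")
    if lowered.startswith("sentiment:"):
        problems.append("starts with a sentiment label")
    return problems


# Made-up replies for illustration
print(find_violations("Thank you for your feedback. We're glad you enjoyed your visit."))
print(find_violations("Dear Customer, thank you for your feedback."))
```

Replies that return a non-empty list could be regenerated or routed to a person for editing.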

***Prompt***:

<font size=3 color="#4682B4"><b>Generate the response for each review in the DataFrame using generate_customer_response, and store them in a new column.
</font>

In [None]:
# Apply the response-generation function to each row in the DataFrame
df['customer_response'] = df['review_full'].apply(lambda x: generate_customer_response(instruction_4, x))

In [None]:
df

## Conclusions

We used a Large Language Model (LLM) in a **multi-stage process** to progressively extract richer insights from restaurant reviews:

1. We began by identifying the **overall sentiment** of each review, which showed that across restaurants, negative sentiment (8 reviews) slightly outweighed positive (6) and neutral (6).
2. We then extended the analysis to capture **sentiment for specific aspects** of the customer experience (food quality, service, ambience):

   * **Food Quality** – 8 negative, 6 positive, 5 neutral, 1 not applicable
   * **Service** – 10 negative, 8 positive, 2 neutral (most criticized aspect)
   * **Ambience** – 8 positive, 6 neutral, 5 not applicable, 1 negative (strongest positive driver)
3. Next, we extracted the **liked and disliked features** mentioned in each review for every aspect (food quality, service, ambience), enabling restaurant-specific insights.
4. Finally, we generated a **personalized response** that could be shared with the customer, based on the review text and with its tone adapted to the review's sentiment.

To evaluate the LLM's performance, we can **manually label** a subset of data (for overall and aspect-level sentiments) and **compare it with the model's output** to obtain a quantitative measure of accuracy and reliability.
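The comparison described above can be sketched as follows. The labels here are made up purely to show the mechanics and are not results from this notebook; `accuracy` and `confusion_counts` are hypothetical helper names.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of items where the model's label matches the manual label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count each (manual label, model label) pair to show where the model disagrees."""
    return Counter(zip(y_true, y_pred))


# Made-up labels for illustration only
manual_labels = ["Positive", "Negative", "Neutral", "Negative", "Positive"]
model_labels = ["Positive", "Negative", "Negative", "Negative", "Positive"]

print(f"Accuracy: {accuracy(manual_labels, model_labels):.2f}")
for (true_label, predicted), count in confusion_counts(manual_labels, model_labels).items():
    print(f"manual={true_label:<8} model={predicted:<8} count={count}")
```

The pair counts make it easy to see which labels the model tends to confuse, for example neutral reviews being read as negative.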

To further improve performance, we can explore several tuning strategies, including:

* **Refining the prompt** for clarity and specificity
* **Adjusting model parameters** such as `temperature`, `top_p`, and others to control response diversity and confidence

This step-by-step approach allows for scalable, automated review analysis while maintaining control over **insight quality, depth, and customer engagement tone**.


<font size=6 color='#4682B4'>Power Ahead</font>
___