
<center><font size=6>Introduction to Prompt Engineering</center></font>

<center><font size=6>Restaurant Review Analysis</center></font>

## Problem Statement

### Business Context

The company receives large volumes of customer reviews across its restaurants. These reviews contain valuable insights that can guide business and marketing decisions.

### Problem Definition

Manual analysis of unstructured text reviews is slow and unscalable.
The company struggles to automatically extract and understand customer sentiments (positive, negative, or neutral) from this data.

### Objective

Build an automated sentiment analysis model using two LLMs to predict customer sentiment, enabling data-driven insights and improved customer satisfaction.

## Installing and Importing Necessary Libraries

In [4]:
# Installation for GPU llama-cpp-python
# uncomment and run the following code in case GPU is being used
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.45 --force-reinstall --no-cache-dir -q

# Installation for CPU llama-cpp-python
# uncomment and run the following code in case GPU is not being used
# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.45 --force-reinstall --no-cache-dir -q

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m36.7/36.7 MB[0m [31m109.7 MB/s[0m eta [36m0:00:00[0m
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Installing backend dependencies ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m62.1/62.1 kB[0m [31m250.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m45.5/45.5 kB[0m [31m261.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m134.9/134.9 kB[0m [31m414.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m16.6/16.6 MB[0m [31m339.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.6/44.6 kB[0m [31m268.1 MB/s[0m eta [36m0:00:00[0m
[?25h  Building wheel for llama-cpp-python (

**Note**: pip's dependency error can be ignored as it does not affect further execution.

In [2]:
# For downloading the models from HF Hub
!pip install huggingface_hub



In [5]:
# Importing library for data manipulation
import pandas as pd

# Function to download the model from the Hugging Face model hub
from huggingface_hub import hf_hub_download

# Importing the Llama class from the llama_cpp module
from llama_cpp import Llama

# Importing the json module
import json

## Import the dataset

In [6]:
# from google.colab import drive
# drive.mount('/content/drive')

Mounted at /content/drive


In [8]:
data = pd.read_csv("restaurant_reviews.csv")

## Data Overview

In [9]:
# checking the first five rows of the data
data.head()

Unnamed: 0,restaurant_ID,rating_review,review_full
0,FLV202,5,"Totally in love with the Auro of the place, re..."
1,SAV303,5,Kailash colony is brimming with small cafes no...
2,YUM789,5,Excellent taste and awesome decorum. Must visi...
3,TST101,5,I have visited at jw lough/restourant. There w...
4,EAT456,5,Had a great experience in the restaurant food ...


In [10]:
data['review_full'][3]

'I have visited at jw lough/restourant. There were a first class service at lough, specially Ms.laxmi  who were superbed for handling the client need, me and my family lots enjoyed her specialty in the manner, and Laxmi is a very very good in the client service, I hope when I will come against I would definitely serve from Ms. Laxmi and she is wonderful girl in that service. See you again Ms. Laxmi for the your best service which I have received from you at jw lough/resourant. Thank you JW Marriott Hotel at Atrocity, Delhi'

In [11]:
# checking the shape of the data
data.shape

(20, 3)

**Observations**

- Data has 20 rows and 3 columns

In [12]:
# checking for missing values
data.isnull().sum()

Unnamed: 0,0
restaurant_ID,0
rating_review,0
review_full,0


**Observations**

- There are no missing values in the data

## Model Building

| Model                                  | Parameters | Quantization | Approx. RAM/VRAM Need | Comment                                          |
| -------------------------------------- | ---------- | ------------ | --------------------- | ------------------------------------------------ |
| **Mistral-7B-Instruct-v0.2.Q6_K.gguf** | 7B         | Q6_K         | ~8–10 GB RAM          | Efficient and fast; good quality                 |
| **Llama-2-13B-Chat.Q5_K_M.gguf**       | 13B        | Q5_K_M       | ~13–16 GB RAM         | Heavier; may need lower batch or smaller context |


| Term                     | Meaning                                                         |
| ------------------------ | --------------------------------------------------------------- |
| **GGUF(GPT-Generated Unified Format.)**                 | Modern, compact format for running LLMs locally                 |
| **Quantization (Q4–Q6)** | Reduces model size and speeds up inference                      |
| **Why use it**           | Runs big models on small GPUs or CPUs efficiently               |
| **Who uses it**          | TheBloke, llama.cpp, ctransformers, text-generation-webui, etc. |


### Loading the model (Llama)

In [13]:
model_name_or_path = "TheBloke/Llama-2-13B-chat-GGUF"
model_basename = "llama-2-13b-chat.Q5_K_M.gguf" # the model is in gguf format

In [14]:
# Using hf_hub_download to download a model from the Hugging Face model hub
# The repo_id parameter specifies the model name or path in the Hugging Face repository
# The filename parameter specifies the name of the file to download
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


llama-2-13b-chat.Q5_K_M.gguf:   0%|          | 0.00/9.23G [00:00<?, ?B/s]

In [15]:
Llama_llm = Llama(
    model_path=model_path,
    n_threads=2,  # Number of CPU threads used for parallel processing.
    n_batch=512,  # Number of tokens processed in one forward pass (like batch size for text). Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
    n_gpu_layers=43,  # uncomment and change this value based on GPU VRAM pool.Number of layers offloaded to GPU. If you comment this out, everything runs on CPU.
    n_ctx=4096,  # Context window — how many tokens the model can “see” at once.
)

llama_model_loader: loaded meta data with 19 key-value pairs and 363 tensors from /root/.cache/huggingface/hub/models--TheBloke--Llama-2-13B-chat-GGUF/snapshots/4458acc949de0a9914c3eab623904d4fe999050a/llama-2-13b-chat.Q5_K_M.gguf (version GGUF V2)
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = LLaMA v2
llama_model_loader: - kv   2:                       llama.context_length u32              = 4096
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 5120
llama_model_loader: - kv   4:                          llama.block_count u32              = 40
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 13824
llama_model_loader: - kv   6:                 llama.rope.dimension_

### Loading the model (Mistral)

In [16]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [17]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [18]:
mistral_llm = Llama(
    model_path=model_path,n_ctx=1024)

llama_model_loader: loaded meta data with 24 key-value pairs and 291 tensors from /root/.cache/huggingface/hub/models--TheBloke--Mistral-7B-Instruct-v0.2-GGUF/snapshots/3a6fbf4a41a1d52e415a4958cde6856d34b2db93/mistral-7b-instruct-v0.2.Q6_K.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = mistralai_mistral-7b-instruct-v0.2
llama_model_loader: - kv   2:                       llama.context_length u32              = 32768
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loade

### Defining Model Response Parameters

In [19]:
# Prompt construction + model inference
def generate_llama_response(instruction, review):

    # System message explicitly instructing not to include the review text
    system_message = """
        [INST]<<SYS>>
        {}
        <</SYS>>[/INST]
    """.format(instruction)

    # Combine user_prompt and system_message to create the prompt
    prompt = f"{review}\n{system_message}"

    # Generate a response from the LLaMA model
    response = Llama_llm(
        prompt=prompt,
        max_tokens=1024,
        temperature=0,
        top_p=0.95,
        repeat_penalty=1.2,
        top_k=50,
        stop=['INST'],
        echo=False,
        seed=42,
    )

    # Extract the sentiment from the response
    response_text = response["choices"][0]["text"]
    return response_text

- **`max_tokens`**: This parameter **specifies the maximum number of tokens that the model should generate** in response to the prompt.

- **`temperature`**: This parameter **controls the randomness of the generated response**. A higher temperature value will result in a more random response, while a lower temperature value will result in a more predictable response.

- **`top_p`**: This parameter **controls the diversity of the generated response by establishing a cumulative probability cutoff for token selection**. A higher value of top_p will result in a more diverse response, while a lower value will result in a less diverse response.

- **`repeat_penalty`**: This parameter **controls the penalty for repeating tokens in the generated response**. A higher value of repeat_penalty will result in a lower probability of repeating tokens, while a lower value will result in a higher probability of repeating tokens.

- **`top_k`**: This parameter **controls the maximum number of most-likely next tokens to consider** when generating the response at each step.

- **`stop`**: This parameter is a **list of tokens that are used to dynamically stop response generation** whenever the tokens in the list are encountered.

- **`echo`**: This parameter **controls whether the input (prompt) to the model should be returned** in the model response.

- **`seed`**: This parameter **specifies a seed value that helps replicate results**.


### Utility function

In [20]:
# Response cleanup + JSON parsing
def extract_json_data(json_str):
    try:
        # Find the indices of the opening and closing curly braces
        json_start = json_str.find('{')
        json_end = json_str.rfind('}')

        if json_start != -1 and json_end != -1:
            extracted_sentiment = json_str[json_start:json_end + 1]  # Extract the JSON object
            data_dict = json.loads(extracted_sentiment)
            return data_dict
        else:
            print(f"Warning: JSON object not found in response: {json_str}")
            return {}
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return {}

## 1. Sentiment Analysis (Llama)

In [21]:
# creating a copy of the data
data_1 = data.copy()

In [22]:
# defining the instructions for the model
instruction_1 = """
    You are an AI analyzing restaurant reviews. Classify the sentiment of the provided review into the following categories:
    - Positive
    - Negative
    - Neutral
"""

In [23]:
data_1['model_response'] = data_1['review_full'].apply(lambda x: generate_llama_response(instruction_1, x))


llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =      86.62 ms /   151 runs   (    0.57 ms per token,  1743.31 tokens per second)
llama_print_timings: prompt eval time =    1013.12 ms /   237 tokens (    4.27 ms per token,   233.93 tokens per second)
llama_print_timings:        eval time =    8253.95 ms /   150 runs   (   55.03 ms per token,    18.17 tokens per second)
llama_print_timings:       total time =    9880.58 ms /   387 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =      55.97 ms /    98 runs   (    0.57 ms per token,  1750.94 tokens per second)
llama_print_timings: prompt eval time =     751.02 ms /   307 tokens (    2.45 ms per token,   408.78 tokens per second)
llama_print_timings:        eval time =    5515.84 ms /    97 runs   (   56.86 ms per token,    17.59 tokens per second)
llama_print_timings:       total time =    6649.18 ms /   404 

In [24]:
data_1['model_response'].head()

Unnamed: 0,model_response
0,Sure! Here's the sentiment analysis of the re...
1,Sure! Here's the sentiment analysis of the pr...
2,"Sure, I can help you with that! Based on the ..."
3,Sure! Here's the classification of the review...
4,Sure! Here's the sentiment analysis of the re...


In [25]:
data.shape

(20, 3)

In [26]:
i = 2
print(data_1.loc[i, 'review_full'])

Excellent taste and awesome decorum. Must visit. Subham Barnwal had given us a great service. One of the best experience.


In [27]:
print(data_1.loc[i, 'model_response'])

 Sure, I can help you with that! Based on the review you provided, I would classify it as:

Positive

Reason: The reviewer enjoyed their experience at the restaurant, mentioning that it has "excellent taste" and "awesome decorum." They also appreciated the "great service" provided by Subham Barnwal, which suggests that the staff was attentive and helpful. Overall, the review expresses a positive opinion of the restaurant.


In [28]:
def extract_sentiment(model_response):
    if 'positive' in model_response.lower():
        return 'Positive'
    elif 'negative' in model_response.lower():
        return 'Negative'
    elif 'neutral' in model_response.lower():
        return 'Neutral'

In [29]:
# applying the function to the model response
data_1['sentiment'] = data_1['model_response'].apply(extract_sentiment)
data_1['sentiment'].head()

Unnamed: 0,sentiment
0,Positive
1,Positive
2,Positive
3,Positive
4,Positive


In [30]:
data_1['sentiment'].value_counts()

Unnamed: 0_level_0,count
sentiment,Unnamed: 1_level_1
Positive,16
Negative,3
Neutral,1


In [31]:
final_data_1 = data_1.drop(['model_response'], axis=1)
final_data_1.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,sentiment
0,FLV202,5,"Totally in love with the Auro of the place, re...",Positive
1,SAV303,5,Kailash colony is brimming with small cafes no...,Positive
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,Positive
3,TST101,5,I have visited at jw lough/restourant. There w...,Positive
4,EAT456,5,Had a great experience in the restaurant food ...,Positive


## 1. Sentiment Analysis (Mistral)

In [32]:
# creating a copy of the data
data_1 = data.copy()

**We are going to use an instruction-tuned Mistral model. Hence, the format of the input to the model varies from that of Llama.**

In [33]:
#Defining the response funciton for Task 1.
def response_1(prompt,review):
    model_output = mistral_llm(
      f"""
      Q: {prompt}
      Review: {review}
      A:
      """,
      max_tokens=32,
      stop=["Q:", "\n"],
      temperature=0.01,
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]

    return temp_output

In [34]:
# defining the instructions for the model
instruction_1 = """
    You are an AI analyzing restaurant reviews. Classify the sentiment of the provided review into the following categories:
    - Positive
    - Negative
    - Neutral
"""

In [41]:
data_1['model_response'] = data_1['review_full'].apply(lambda x: response_1(instruction_1, x))

Llama.generate: prefix-match hit

llama_print_timings:        load time =    3031.00 ms
llama_print_timings:      sample time =      17.52 ms /    32 runs   (    0.55 ms per token,  1826.28 tokens per second)
llama_print_timings: prompt eval time =    2580.35 ms /   166 tokens (   15.54 ms per token,    64.33 tokens per second)
llama_print_timings:        eval time =   25996.99 ms /    31 runs   (  838.61 ms per token,     1.19 tokens per second)
llama_print_timings:       total time =   28713.08 ms /   197 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =    3031.00 ms
llama_print_timings:      sample time =      18.56 ms /    32 runs   (    0.58 ms per token,  1723.95 tokens per second)
llama_print_timings: prompt eval time =    2999.11 ms /   235 tokens (   12.76 ms per token,    78.36 tokens per second)
llama_print_timings:        eval time =   26158.63 ms /    31 runs   (  843.83 ms per token,     1.19 tokens per second)
llama_print_timings:       to

In [42]:
data_1['model_response'].head()

Unnamed: 0,model_response
0,This review expresses a very positive sentime...
1,"Based on the given review, the sentiment can ..."
2,The sentiment expressed in this review is pos...
3,"Based on the given review, it can be classifi..."
4,"Based on the provided review, the sentiment c..."


In [43]:
i = 2
print(data_1.loc[i, 'review_full'])

Excellent taste and awesome decorum. Must visit. Subham Barnwal had given us a great service. One of the best experience.


In [44]:
print(data_1.loc[i, 'model_response'])

 The sentiment expressed in this review is positive. The use of words like "excellent taste," "awesome decorum," "must visit," "g


In [45]:
def extract_sentiment(model_response):
    if 'positive' in model_response.lower():
        return 'Positive'
    elif 'negative' in model_response.lower():
        return 'Negative'
    elif 'neutral' in model_response.lower():
        return 'Neutral'

In [46]:
# applying the function to the model response
data_1['sentiment'] = data_1['model_response'].apply(extract_sentiment)
data_1['sentiment'].head()

Unnamed: 0,sentiment
0,Positive
1,Positive
2,Positive
3,Positive
4,Positive


In [47]:
data_1['sentiment'].value_counts()

Unnamed: 0_level_0,count
sentiment,Unnamed: 1_level_1
Positive,10
Negative,7
Neutral,3


In [48]:
final_data_1 = data_1.drop(['model_response'], axis=1)
final_data_1.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,sentiment
0,FLV202,5,"Totally in love with the Auro of the place, re...",Positive
1,SAV303,5,Kailash colony is brimming with small cafes no...,Positive
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,Positive
3,TST101,5,I have visited at jw lough/restourant. There w...,Positive
4,EAT456,5,Had a great experience in the restaurant food ...,Positive


## 2. Sentiment Analysis and Returning Structured Output (Llama)

In [49]:
# creating a copy of the data
data_2 = data.copy()

In [50]:
# defining the instructions for the model
instruction_2 = """
    You are an AI analyzing restaurant reviews. Classify the sentiment of the provided review into the following categories:
    - Positive
    - Negative
    - Neutral

    Format the output as a JSON object with a single key-value pair as shown below:
    {"sentiment": "your_sentiment_prediction"}
"""

In [51]:
data_2['model_response'] = data_2['review_full'].apply(lambda x: generate_llama_response(instruction_2, x))

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =      96.73 ms /   161 runs   (    0.60 ms per token,  1664.34 tokens per second)
llama_print_timings: prompt eval time =     760.32 ms /   272 tokens (    2.80 ms per token,   357.75 tokens per second)
llama_print_timings:        eval time =   10609.92 ms /   160 runs   (   66.31 ms per token,    15.08 tokens per second)
llama_print_timings:       total time =   12093.63 ms /   432 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     136.43 ms /   225 runs   (    0.61 ms per token,  1649.23 tokens per second)
llama_print_timings: prompt eval time =     833.78 ms /   343 tokens (    2.43 ms per token,   411.38 tokens per second)
llama_print_timings:        eval time =   15798.28 ms /   224 runs   (   70.53 ms per token,    14.18 tokens per second)
llama_print_timings:       to

In [52]:
data_2['model_response'].head()

Unnamed: 0,model_response
0,Sure! Here's the sentiment analysis of the re...
1,Sure! Here's my analysis of the review you pr...
2,"Sure, I can help you with that! Based on the ..."
3,Sure! Here's the sentiment analysis of the re...
4,Sure! Here's the sentiment analysis of the re...


In [53]:
i = 2
print(data_2.loc[i, 'review_full'])

Excellent taste and awesome decorum. Must visit. Subham Barnwal had given us a great service. One of the best experience.


In [54]:
print(data_2.loc[i, 'model_response'])

 Sure, I can help you with that! Based on the review you provided, I would classify the sentiment as Positive. Here's the JSON object with the sentiment prediction:

{"sentiment": "Positive"}


In [55]:
# applying the function to the model response
data_2['model_response_parsed'] = data_2['model_response'].apply(extract_json_data)
data_2['model_response_parsed'].head()

Unnamed: 0,model_response_parsed
0,{'sentiment': 'Positive'}
1,{'sentiment': 'Positive'}
2,{'sentiment': 'Positive'}
3,{'sentiment': 'Positive'}
4,{'sentiment': 'Positive'}


In [56]:
model_response_parsed_df_2 = pd.json_normalize(data_2['model_response_parsed'])
model_response_parsed_df_2.head()

Unnamed: 0,sentiment
0,Positive
1,Positive
2,Positive
3,Positive
4,Positive


In [57]:
data_with_parsed_model_output_2 = pd.concat([data_2, model_response_parsed_df_2], axis=1)
data_with_parsed_model_output_2.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,model_response,model_response_parsed,sentiment
0,FLV202,5,"Totally in love with the Auro of the place, re...",Sure! Here's the sentiment analysis of the re...,{'sentiment': 'Positive'},Positive
1,SAV303,5,Kailash colony is brimming with small cafes no...,Sure! Here's my analysis of the review you pr...,{'sentiment': 'Positive'},Positive
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,"Sure, I can help you with that! Based on the ...",{'sentiment': 'Positive'},Positive
3,TST101,5,I have visited at jw lough/restourant. There w...,Sure! Here's the sentiment analysis of the re...,{'sentiment': 'Positive'},Positive
4,EAT456,5,Had a great experience in the restaurant food ...,Sure! Here's the sentiment analysis of the re...,{'sentiment': 'Positive'},Positive


In [58]:
final_data_2 = data_with_parsed_model_output_2.drop(['model_response','model_response_parsed'], axis=1)
final_data_2.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,sentiment
0,FLV202,5,"Totally in love with the Auro of the place, re...",Positive
1,SAV303,5,Kailash colony is brimming with small cafes no...,Positive
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,Positive
3,TST101,5,I have visited at jw lough/restourant. There w...,Positive
4,EAT456,5,Had a great experience in the restaurant food ...,Positive


In [59]:
final_data_2['sentiment'].value_counts()

Unnamed: 0_level_0,count
sentiment,Unnamed: 1_level_1
Neutral,7
Negative,7
Positive,6


## 3. Identifying Overall Sentiment and Sentiment of Aspects of the Experience (Llama)

In [60]:
# creating a copy of the data
data_3 = data.copy()

In [61]:
# defining the instructions for the model
instruction_3 = """
    You are an AI analyzing restaurant reviews. Classify the overall sentiment of the provided review into the following categories:
    - "Positive"
    - "Negative"
    - "Neutral"

    Once that is done, check for a mention of the following aspects in the review and classify the sentiment of each aspect as "Positive", "Negative", or "Neutral":
    1. "Food Quality"
    2. "Service"
    3. "Ambience"

    Output the overall sentiment and sentiment for each category in a JSON format with the following keys:
    {
        "Overall": "your_sentiment_prediction",
        "Food Quality": "your_sentiment_prediction",
        "Service": "your_sentiment_prediction",
        "Ambience": "your_sentiment_prediction"
    }

    In case one of the three aspects is not mentioned in the review, set "Not Applicable" (including quotes) for the corresponding JSON key value.
    Only return the JSON, do not return any other information.
"""

In [62]:
data_3['model_response'] = data_3['review_full'].apply(lambda x: generate_llama_response(instruction_3, x))

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     611.33 ms /  1024 runs   (    0.60 ms per token,  1675.03 tokens per second)
llama_print_timings: prompt eval time =     958.65 ms /   451 tokens (    2.13 ms per token,   470.45 tokens per second)
llama_print_timings:        eval time =   68596.74 ms /  1023 runs   (   67.05 ms per token,    14.91 tokens per second)
llama_print_timings:       total time =   75000.86 ms /  1474 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     111.44 ms /   190 runs   (    0.59 ms per token,  1704.89 tokens per second)
llama_print_timings: prompt eval time =    1364.03 ms /   522 tokens (    2.61 ms per token,   382.69 tokens per second)
llama_print_timings:        eval time =   12485.04 ms /   189 runs   (   66.06 ms per token,    15.14 tokens per second)
llama_print_timings:       to

In [63]:
data_3['model_response'].head()

Unnamed: 0,model_response
0,"\n {\n ""Overall"": ""Positive"",\n ..."
1,"\n {\n ""Overall"": ""Positive"",\n ..."
2,"{\n ""Overall"": ""Positive"",\n ..."
3,"{\n ""Overall"": ""Positive"",\n ""F..."
4,"\n {\n ""Overall"": ""Positive"",\n ..."


In [64]:
i = 2
print(data_3.loc[i, 'review_full'])

Excellent taste and awesome decorum. Must visit. Subham Barnwal had given us a great service. One of the best experience.


In [65]:
print(data_3.loc[i, 'model_response'])

 {
            "Overall": "Positive",
            "Food Quality": "Positive",
            "Service": "Positive",
            "Ambience": "Positive"
          }


In [66]:
# applying the function to the model response
data_3['model_response_parsed'] = data_3['model_response'].apply(extract_json_data)
data_3['model_response_parsed'].head()

Unnamed: 0,model_response_parsed
0,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
1,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
2,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
3,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
4,"{'Overall': 'Positive', 'Food Quality': 'Posit..."


In [67]:
model_response_parsed_df_3 = pd.json_normalize(data_3['model_response_parsed'])
model_response_parsed_df_3.head()

Unnamed: 0,Overall,Food Quality,Service,Ambience
0,Positive,Positive,Positive,Positive
1,Positive,Positive,Neutral,Positive
2,Positive,Positive,Positive,Positive
3,Positive,Positive,Positive,Positive
4,Positive,Positive,Positive,Positive


In [68]:
data_with_parsed_model_output_3 = pd.concat([data_3, model_response_parsed_df_3], axis=1)
data_with_parsed_model_output_3.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,model_response,model_response_parsed,Overall,Food Quality,Service,Ambience
0,FLV202,5,"Totally in love with the Auro of the place, re...","\n {\n ""Overall"": ""Positive"",\n ...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive
1,SAV303,5,Kailash colony is brimming with small cafes no...,"\n {\n ""Overall"": ""Positive"",\n ...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Neutral,Positive
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,"{\n ""Overall"": ""Positive"",\n ...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive
3,TST101,5,I have visited at jw lough/restourant. There w...,"{\n ""Overall"": ""Positive"",\n ""F...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive
4,EAT456,5,Had a great experience in the restaurant food ...,"\n {\n ""Overall"": ""Positive"",\n ...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive


In [69]:
final_data_3 = data_with_parsed_model_output_3.drop(['model_response','model_response_parsed'], axis=1)
final_data_3.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,Overall,Food Quality,Service,Ambience
0,FLV202,5,"Totally in love with the Auro of the place, re...",Positive,Positive,Positive,Positive
1,SAV303,5,Kailash colony is brimming with small cafes no...,Positive,Positive,Neutral,Positive
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,Positive,Positive,Positive,Positive
3,TST101,5,I have visited at jw lough/restourant. There w...,Positive,Positive,Positive,Positive
4,EAT456,5,Had a great experience in the restaurant food ...,Positive,Positive,Positive,Positive


In [70]:
final_data_3['Overall'].value_counts()

Unnamed: 0_level_0,count
Overall,Unnamed: 1_level_1
Neutral,7
Negative,7
Positive,6


In [71]:
final_data_3['Food Quality'].value_counts()

Unnamed: 0_level_0,count
Food Quality,Unnamed: 1_level_1
Positive,7
Neutral,7
Negative,3
Mixed,2
Not Applicable,1


In [72]:
final_data_3['Service'].value_counts()

Unnamed: 0_level_0,count
Service,Unnamed: 1_level_1
Negative,9
Positive,8
Neutral,1
Slow,1
Inconsistent,1


In [73]:
final_data_3['Ambience'].value_counts()

Unnamed: 0_level_0,count
Ambience,Unnamed: 1_level_1
Positive,9
Not Applicable,6
Neutral,4
Negative,1


## 3. Identifying Overall Sentiment and Sentiment of Aspects of the Experience (Mistral)

In [74]:
# creating a copy of the data
data_3 = data.copy()

In [75]:
def response_2(prompt,review,sentiment):
    model_output = llm(
      f"""
      Q: {prompt}
      review: {review}
      sentiment: {sentiment}
      A:
      """,
      max_tokens=64,
      stop=["Q:", "\n"],
      temperature=0.01,
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
    final_output = temp_output[temp_output.index('{'):]

    return final_output

**Note:** We have already predicted the sentiment of the review. We can use this information while designing the prompt for this task. This way, it will reduce the computational complexity.

The sentiment is stored in the 'final_data_1' dataframe which is from the TASK 1.

In [76]:
# defining the instructions for the model
instruction_3 = """
    You are provided a review and it's sentiment.

    Instructions:
    Classify the sentiment of each aspect as either of "Positive", "Negative", or "Neutral" only and not any other for the given review:
    1. "Food Quality"
    2. "Service"
    3. "Ambience"
    In case one of the three aspects is not mentioned in the review, return "Not Applicable" (including quotes) for the corresponding JSON key value.
    Return the output in the format {"Overall": given sentiment input,"Food Quality": "your_sentiment_prediction","Service": "your_sentiment_prediction","Ambience": "your_sentiment_prediction"}

"""

In [77]:
data_3['model_response'] = final_data_1[['review_full','sentiment']].apply(lambda x: response_2(instruction_3, x[0],x[1]),axis=1)

  data_3['model_response'] = final_data_1[['review_full','sentiment']].apply(lambda x: response_2(instruction_3, x[0],x[1]),axis=1)


NameError: name 'llm' is not defined

In [None]:
data_3['model_response'].values

In [None]:
i = 2
print(data_3.loc[i, 'review_full'])

In [None]:
print(data_3.loc[i, 'model_response'])

In [None]:
# applying the function to the model response
data_3['model_response_parsed'] = data_3['model_response'].apply(extract_json_data)
data_3['model_response_parsed']

In [None]:
model_response_parsed_df_3 = pd.json_normalize(data_3['model_response_parsed'])
model_response_parsed_df_3

In [None]:
model_response_parsed_df_3 = model_response_parsed_df_3.apply(lambda x: x.astype(str).str.lower())

In [None]:
data_with_parsed_model_output_3 = pd.concat([data_3, model_response_parsed_df_3], axis=1)
data_with_parsed_model_output_3.head()

In [None]:
final_data_3 = data_with_parsed_model_output_3.drop(['model_response','model_response_parsed'], axis=1)
final_data_3.head()

In [None]:
final_data_3['Overall'].value_counts()

In [None]:
final_data_3['Food Quality'].value_counts()

**Note:** One of the sentiment is 'if not exceptional'. This is most likely positive.

In [None]:
final_data_3['Service'].value_counts()

In [None]:
final_data_3['Ambience'].value_counts()

## 4. Identifying Overall Sentiment, Sentiment of Aspects of the Experience, and the Liked/Disliked Features of the Different Aspects of the Experience (Llama)

In [78]:
# creating a copy of the data
data_4 = data.copy()

In [79]:
# defining the instructions for the model
instruction_4 = """
    You are an AI tasked with analyzing restaurant reviews. Your goal is to classify the overall sentiment of the provided review into the following categories:
        - Positive
        - Negative
        - Neutral

    Subsequently, assess the sentiment of specific aspects mentioned in the review, namely:
        1. Food quality
        2. Service
        3. Ambience

    Further, identify liked and/or disliked features associated with each aspect in the review.

    Return the output in the specified JSON format, ensuring consistency and handling missing values appropriately:

    {
        "Overall": "your_sentiment_prediction",
        "Food Quality": "your_sentiment_prediction",
        "Service": "your_sentiment_prediction",
        "Ambience": "your_sentiment_prediction",
        "Food Quality Features": ["liked/disliked features"],
        "Service Features": ["liked/disliked features"],
        "Ambience Features": ["liked/disliked features"]
    }

    The sentiment prediction for Overall, Food Quality, Service, and Ambience should be one of "Positive", "Negative", or "Neutral" only.
    In case one of the three aspects is not mentioned in the review, set "Not Applicable" (including quotes) in the corresponding JSON key value for the sentiment.
    In case there are no liked/disliked features for a particular aspect, assign an empty list in the corresponding JSON key value for the aspect.
    Only return the JSON, do NOT return any other text or information.
"""

In [80]:
data_4['model_response'] = data_4['review_full'].apply(lambda x: generate_llama_response(instruction_4, x).replace('\n', ''))

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     174.08 ms /   294 runs   (    0.59 ms per token,  1688.88 tokens per second)
llama_print_timings: prompt eval time =    1442.66 ms /   574 tokens (    2.51 ms per token,   397.88 tokens per second)
llama_print_timings:        eval time =   19248.26 ms /   293 runs   (   65.69 ms per token,    15.22 tokens per second)
llama_print_timings:       total time =   22077.05 ms /   867 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     148.35 ms /   246 runs   (    0.60 ms per token,  1658.23 tokens per second)
llama_print_timings: prompt eval time =    1635.70 ms /   645 tokens (    2.54 ms per token,   394.33 tokens per second)
llama_print_timings:        eval time =   17117.89 ms /   245 runs   (   69.87 ms per token,    14.31 tokens per second)
llama_print_timings:       to

In [82]:
print(data_4.loc[i, 'model_response'])

 Sure, I can assist you in that! Here's my analysis of your provided review: {"Overall": "Positive", "Food Quality": "Positive", "Service": "Positive", "Ambience": "Positive", "Food Quality Features": ["Excellent taste"], "Service Features": ["Great service"], "Ambience Features": ["Awesome decorum"]} Based on the review, I have determined that: * The overall sentiment is positive, as mentioned in the review's opening sentence. * The food quality is also positive, as described as "Excellent taste." * The service is positive, as described as "Great." * The ambience is positive, as described as "Awesome decorum." * There were no mentioned liked/disliked features for any aspect, so I have assigned empty lists for those values. * I have set "Not Applicable" for the sentiment prediction for "Ambience" since it was not mentioned in the review. Please note that I have handled missing values appropriately by assigning appropriate values based on the context of the review.


In [83]:
# applying the function to the model response
data_4['model_response_parsed'] = data_4['model_response'].apply(extract_json_data)
data_4['model_response_parsed'].head()

Unnamed: 0,model_response_parsed
0,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
1,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
2,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
3,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
4,"{'Overall': 'Positive', 'Food Quality': 'Posit..."


In [84]:
data_4[data_4.model_response_parsed == {}]

Unnamed: 0,restaurant_ID,rating_review,review_full,model_response,model_response_parsed


- There are three model responses that the JSON parser function could not parse
- We'll manually add the values for these three responses

In [85]:
print(data_4.loc[3, 'model_response'])

 Sure! Here's the JSON output based on your review:{    "Overall": "Positive",    "Food Quality": "Positive",    "Service": "Positive",    "Ambience": "Neutral",    "Food Quality Features": ["superb", "first-class"],    "Service Features": ["superb", "handling"],    "Ambience Features": [""]}Here's how I analyzed your review:* Overall: You mentioned that you had a "superb" experience at JW Marriott Hotel Aerocity, Delhi, which suggests a positive sentiment.* Food Quality: You praised the "first-class" food quality, which also suggests a positive sentiment.* Service: You appreciated the "superb" service you received, which further supports a positive sentiment.* Ambience: You did not mention anything specific about the ambience, so I have marked it as "Neutral".For each aspect, I've listed the liked features (in quotes) and disliked features (empty list), based on your review:* Food Quality Features: ["superb", "first-class"] (liked)* Service Features: ["superb", "handling"] (liked)* Am

In [86]:
print(data_4.loc[6, 'model_response'])

    Sure! Here's my analysis of the review you provided:    {        "Overall": "Neutral",        "Food Quality": "Mixed",        "Service": "Positive",        "Ambience": "Positive",        "Food Quality Features": ["pullled duck was prepared and seasoned well but the meat had been marinated with too much balsamic vinegar"],        "Service Features": ["friendly and not intimidating"],        "Ambience Features": ["cozy feel"]    }Here's my analysis:* Overall: Neutral - The reviewer had mixed feelings about their experience at the restaurant.* Food Quality: Mixed - While the pulled duck was prepared and seasoned well, the meat was marinated too much in balsamic vinegar.* Service: Positive - The service was described as friendly and not intimidating.* Ambience: Positive - The reviewer enjoyed the cozy feel of the restaurant's ambience.Liked features:* Food Quality: prepared and seasoned well* Service: friendly* Ambience: cozy feelDisliked features:* Food Quality: too much balsamic vine

In [87]:
print(data_4.loc[7, 'model_response'])

    Sure! Here's the JSON output for the review you provided:{    "Overall": "Neutral",    "Food Quality": "Mixed",    "Service": "Slow",    "Ambience": "Not Applicable",    "Food Quality Features": ["mixed"],    "Service Features": ["slow"],    "Ambience Features": ["Not Applicable"]}Here's the breakdown:* Overall: Neutral - Based on the review, the reviewer had a neutral impression of their visit to Green Bites.* Food Quality: Mixed - While the reviewer mentions that the food quality is "mixed", they do not provide any specific liked or disliked features for this aspect.* Service: Slow - The reviewer notes that the service was slow, which contributes to the neutral sentiment for this aspect.* Ambience: Not Applicable - As the reviewer does not mention the ambience at all, this aspect is not applicable for this review.I hope this helps! Let me know if you have any further questions or if you'd like me to analyze any other reviews.


In [88]:
upd_val_1 = {
    "Overall": "Positive",
    "Food Quality": "Positive",
    "Service": "Positive",
    "Ambience": "Not Applicable",
    "Food Quality Features": [],
    "Service Features": ["excellent service"],
    "Ambience Features": []
}

upd_val_2 = {
    "Overall": "Neutral",
    "Food Quality": "Neutral",
    "Service": "Neutral",
    "Ambience": "Not Applicable",
    "Food Quality Features": ["well prepared"],
    "Service Features": ["slow and inattentive"],
    "Ambience Features": ["interior is friendly", "not intimidating"]
}

upd_val_3 = {
    "Overall": "Neutral",
    "Food Quality": "Positive",
    "Service": "Negative",
    "Ambience": "Positive",
    "Food Quality Features": ["Some tasty, others average"],
    "Service Features": ["Attentive staff", "Slow service"],
    "Ambience Features": []
}

# defining the list of indices to update
idx_list = [3,6,7]
data_4.loc[idx_list, 'model_response_parsed'] = [upd_val_1, upd_val_2, upd_val_3]

**Note**: The values model responses that cannot be parsed correctly by the JSON parser function may vary with execution due to the randomness associated with LLMs. Kindly update as observed when run in your system.

In [89]:
model_response_parsed_df_4 = pd.json_normalize(data_4['model_response_parsed'])
model_response_parsed_df_4.head()

Unnamed: 0,Overall,Food Quality,Service,Ambience,Food Quality Features,Service Features,Ambience Features
0,Positive,Positive,Positive,Positive,[straight from the oven],[staff wearing masks],[quite fancy]
1,Positive,Positive,Positive,Neutral,[exquisite taste],[freshly made],[]
2,Positive,Positive,Positive,Positive,[Excellent taste],[Great service],[Awesome decorum]
3,Positive,Positive,Positive,Not Applicable,[],[excellent service],[]
4,Positive,Positive,Positive,Neutral,[fabulous],[professional],[]


In [90]:
data_with_parsed_model_output_4 = pd.concat([data_4, model_response_parsed_df_4], axis=1)
data_with_parsed_model_output_4.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,model_response,model_response_parsed,Overall,Food Quality,Service,Ambience,Food Quality Features,Service Features,Ambience Features
0,FLV202,5,"Totally in love with the Auro of the place, re...","{ ""Overall"": ""Positive"", ""Fo...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive,[straight from the oven],[staff wearing masks],[quite fancy]
1,SAV303,5,Kailash colony is brimming with small cafes no...,Sure! Here's my analysis of your restauran...,"{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Neutral,[exquisite taste],[freshly made],[]
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,"Sure, I can assist you in that! Here's my ana...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive,[Excellent taste],[Great service],[Awesome decorum]
3,TST101,5,I have visited at jw lough/restourant. There w...,Sure! Here's the JSON output based on your re...,"{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Not Applicable,[],[excellent service],[]
4,EAT456,5,Had a great experience in the restaurant food ...,"{ ""Overall"": ""Positive"", ""Fo...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Neutral,[fabulous],[professional],[]


In [91]:
final_data_4 = data_with_parsed_model_output_4.drop(['model_response','model_response_parsed'], axis=1)
final_data_4.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,Overall,Food Quality,Service,Ambience,Food Quality Features,Service Features,Ambience Features
0,FLV202,5,"Totally in love with the Auro of the place, re...",Positive,Positive,Positive,Positive,[straight from the oven],[staff wearing masks],[quite fancy]
1,SAV303,5,Kailash colony is brimming with small cafes no...,Positive,Positive,Positive,Neutral,[exquisite taste],[freshly made],[]
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,Positive,Positive,Positive,Positive,[Excellent taste],[Great service],[Awesome decorum]
3,TST101,5,I have visited at jw lough/restourant. There w...,Positive,Positive,Positive,Not Applicable,[],[excellent service],[]
4,EAT456,5,Had a great experience in the restaurant food ...,Positive,Positive,Positive,Neutral,[fabulous],[professional],[]


In [92]:
final_data_4['Overall'].value_counts()

Unnamed: 0_level_0,count
Overall,Unnamed: 1_level_1
Neutral,7
Positive,6
Negative,6
your_sentiment_prediction,1


In [93]:
final_data_4['Food Quality'].value_counts()

Unnamed: 0_level_0,count
Food Quality,Unnamed: 1_level_1
Positive,8
Neutral,7
Not Applicable,2
Mixed,1
your_sentiment_prediction,1
Negative,1


In [94]:
final_data_4['Service'].value_counts()

Unnamed: 0_level_0,count
Service,Unnamed: 1_level_1
Negative,9
Positive,8
Neutral,1
Slow,1
your_sentiment_prediction,1


In [95]:
final_data_4['Ambience'].value_counts()

Unnamed: 0_level_0,count
Ambience,Unnamed: 1_level_1
Positive,6
Neutral,6
Not Applicable,6
your_sentiment_prediction,1
Negative,1


## 5. Identifying Overall Sentiment, Sentiment of Aspects of the Experience, Liked/Disliked Features of the Different Aspects of the Experience, and Sharing a Response (Llama)

In [96]:
# creating a copy of the data
data_5 = data.copy()

In [97]:
# defining the instructions for the model
instruction_5 = """
    You are an AI analyzing restaurant reviews. Classify the overall sentiment of the provided review into the following categories:
    - "Positive"
    - "Negative"
    - "Neutral"

    Once that is done, check for a mention of the following aspects in the review and clasify the sentiment of each aspect as positive, negative, or neutral:
    1. Food quality
    2. Service
    3. Ambience

    Once that is done, look for liked and/or disliked features mentioned against each of the above aspects in the review and extract them.

    Finally, draft a response for the customer based on the review. Start out with a thank you note and then add on to it as per the following:
    1. If the review is positive, mention that it would be great to have them again
    2. If the review is neutral, ask them for what the restaurant could have done better
    3. If the review is negative, apologive for the inconvenience and mention that we'll be looking into the points raised

    Return the output in the specified JSON format, ensuring consistency and handling missing values appropriately Ensure that all values in the JSON are formatted as strings, and each element within the lists should be enclosed in double quotes:

    {
        "Overall": "your_sentiment_prediction",
        "Food Quality": "your_sentiment_prediction",
        "Service": "your_sentiment_prediction",
        "Ambience": "your_sentiment_prediction",
        "Food Quality Features": ["liked/disliked features"],
        "Service Features": ["liked/disliked features"],
        "Ambience Features": ["liked/disliked features"],
        "Response": "your_response_to_the_customer_review",
    }

    The sentiment prediction for Overall, Food Quality, Service, and Ambience should be one of "Positive", "Negative", or "Neutral" only.
    In case one of the three aspects is not mentioned in the review, set "Not Applicable" (including quotes) in the corresponding JSON key value for the sentiment.
    In case there are no liked/disliked features for a particular aspect, assign an empty list in the corresponding JSON key value for the aspect.
    Be polite and empathetic in the response to the customer review.
    Only return the JSON, do NOT return any other text or information.
"""

In [99]:
data_5['model_response'] = data_5['review_full'].apply(lambda x: generate_llama_response(instruction_5, x))

Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     371.46 ms /   370 runs   (    1.00 ms per token,   996.06 tokens per second)
llama_print_timings: prompt eval time =    1683.01 ms /   764 tokens (    2.20 ms per token,   453.95 tokens per second)
llama_print_timings:        eval time =   24384.83 ms /   369 runs   (   66.08 ms per token,    15.13 tokens per second)
llama_print_timings:       total time =   28877.14 ms /  1133 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =    1013.46 ms
llama_print_timings:      sample time =     214.35 ms /   360 runs   (    0.60 ms per token,  1679.46 tokens per second)
llama_print_timings: prompt eval time =    1848.80 ms /   835 tokens (    2.21 ms per token,   451.65 tokens per second)
llama_print_timings:        eval time =   24444.81 ms /   359 runs   (   68.09 ms per token,    14.69 tokens per second)
llama_print_timings:       to

In [100]:
i = 2
print(data_5.loc[i, 'review_full'])

Excellent taste and awesome decorum. Must visit. Subham Barnwal had given us a great service. One of the best experience.


In [101]:
print(data_5.loc[i, 'model_response'])

 {
"Overall": "Positive",
"Food Quality": "Positive",
"Service": "Positive",
"Ambience": "Positive",
"Food Quality Features": ["Excellent taste"],
"Service Features": ["Great service"],
"Ambience Features": ["Awesome decorum"],
"Response": "Thank you for taking the time to share your positive review with us! We're thrilled to hear that you enjoyed our food quality, service, and ambience. We're constantly striving to improve, so your feedback is invaluable to us. We hope to have you back again soon!"]
}


In [102]:
# applying the function to the model response
data_5['model_response_parsed'] = data_5['model_response'].apply(extract_json_data)
data_5['model_response_parsed'].head()

Error parsing JSON: Expecting property name enclosed in double quotes: line 10 column 5 (char 539)
Error parsing JSON: Expecting ',' delimiter: line 9 column 277 (char 503)
Error parsing JSON: Expecting property name enclosed in double quotes: line 10 column 5 (char 579)
Error parsing JSON: Expecting property name enclosed in double quotes: line 10 column 5 (char 669)


Unnamed: 0,model_response_parsed
0,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
1,{}
2,{}
3,"{'Overall': 'Positive', 'Food Quality': 'Posit..."
4,{}


In [103]:
model_response_parsed_df_5 = pd.json_normalize(data_5['model_response_parsed'])
model_response_parsed_df_5.head()

Unnamed: 0,Overall,Food Quality,Service,Ambience,Food Quality Features,Service Features,Ambience Features,Response
0,Positive,Positive,Positive,Positive,[straight from the oven],[quite delicious],[quite quaint and cute],Thank you for taking the time to review us! We...
1,,,,,,,,
2,,,,,,,,
3,Positive,Positive,Positive,Positive,[superb food],[superb service],[very nice ambience],Thank you so much for taking the time to revie...
4,,,,,,,,


In [104]:
model_response_parsed_df_5['Response'][0]

"Thank you for taking the time to review us! We're thrilled to hear that you enjoyed our food quality, service, and ambience. We're especially glad that you appreciated our open kitchen concept and our use of disposable cutlery to ensure your safety during these challenging times. We hope to have you back again soon! If you have any further suggestions or feedback, please don't hesitate to reach out to us."

In [105]:
data_with_parsed_model_output_5 = pd.concat([data_5, model_response_parsed_df_5], axis=1)
data_with_parsed_model_output_5.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,model_response,model_response_parsed,Overall,Food Quality,Service,Ambience,Food Quality Features,Service Features,Ambience Features,Response
0,FLV202,5,"Totally in love with the Auro of the place, re...","\n {\n ""Overall"": ""Positive"",\n ...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive,[straight from the oven],[quite delicious],[quite quaint and cute],Thank you for taking the time to review us! We...
1,SAV303,5,Kailash colony is brimming with small cafes no...,"\n {\n ""Overall"": ""Positive"",\n ...",{},,,,,,,,
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,"{\n""Overall"": ""Positive"",\n""Food Quality"": ""P...",{},,,,,,,,
3,TST101,5,I have visited at jw lough/restourant. There w...,"{\n ""Overall"": ""Positive"",\n ""F...","{'Overall': 'Positive', 'Food Quality': 'Posit...",Positive,Positive,Positive,Positive,[superb food],[superb service],[very nice ambience],Thank you so much for taking the time to revie...
4,EAT456,5,Had a great experience in the restaurant food ...,"\n {\n ""Overall"": ""Positive"",\n ...",{},,,,,,,,


In [106]:
final_data_5 = data_with_parsed_model_output_5.drop(['model_response','model_response_parsed'], axis=1)
final_data_5.head()

Unnamed: 0,restaurant_ID,rating_review,review_full,Overall,Food Quality,Service,Ambience,Food Quality Features,Service Features,Ambience Features,Response
0,FLV202,5,"Totally in love with the Auro of the place, re...",Positive,Positive,Positive,Positive,[straight from the oven],[quite delicious],[quite quaint and cute],Thank you for taking the time to review us! We...
1,SAV303,5,Kailash colony is brimming with small cafes no...,,,,,,,,
2,YUM789,5,Excellent taste and awesome decorum. Must visi...,,,,,,,,
3,TST101,5,I have visited at jw lough/restourant. There w...,Positive,Positive,Positive,Positive,[superb food],[superb service],[very nice ambience],Thank you so much for taking the time to revie...
4,EAT456,5,Had a great experience in the restaurant food ...,,,,,,,,


In [107]:
final_data_5['Overall'].value_counts()

Unnamed: 0_level_0,count
Overall,Unnamed: 1_level_1
Neutral,9
Negative,5
Positive,2


In [108]:
final_data_5['Food Quality'].value_counts()

Unnamed: 0_level_0,count
Food Quality,Unnamed: 1_level_1
Neutral,9
Positive,3
Negative,2
Mixed,1
Not Applicable,1


In [109]:
final_data_5['Service'].value_counts()

Unnamed: 0_level_0,count
Service,Unnamed: 1_level_1
Negative,9
Positive,5
Neutral,2


In [110]:
final_data_5['Ambience'].value_counts()

Unnamed: 0_level_0,count
Ambience,Unnamed: 1_level_1
Neutral,10
Positive,4
Not Applicable,1
Negative,1


## Conclusions

- We used an LLM to do multiple tasks, one stage at a time
    1. We first identified the overall sentiment of the review using the LLM
    2. We then identified the overall sentiment of the review and got the output in a structured format from the LLM for ease-of-access
    3. Next, we identified the overall sentiment of the review as well as sentiment of specific aspects of the experience
    4. Next, in addition to the overall sentiment of the review as well as sentiment of specific aspects of the experience, we also identified the liked/disliked features of the different aspects of the experience
    5. Finally, in addition to all the above, we also got a response we can share with the customer based on their review

- One can manually label the data (overall sentiment and sentiments of different aspects) and then compare the model's output with the same to get a quantitative measure of the models performance.

- To try and improve the model performance, one can try the following:
    1. Update the prompt
    2. Update the model parameters (`temparature`, `top_p`, ...)



___