# LLM-based Customer Review Analyzer with Actionable Feedback Generation

### ✅ Problem Statement

Customer reviews contain valuable insights for product improvement.
However, due to the overwhelming volume of reviews, manual analysis is difficult, time-consuming, and often inconsistent.

Companies struggle to identify and act on recurring customer complaints, which leads to missed opportunities for improving product quality and customer satisfaction.

<br>

### 💡 Proposed Solution

This project developed a system that automatically classifies customer reviews by sentiment (positive, negative, improvement) using a Large Language Model (LLM), and then generates actionable suggestions for improvement based on negative and improvement feedback.

The pipeline also includes a feedback category tagging system (e.g., quality, delivery, price), allowing businesses to monitor which areas receive the most criticism.

For example:

“The material is too thin” → “Consider improving the product fabric quality.”

The results are structured in a way that allows for real-time feedback monitoring, making it easier for teams to prioritize product enhancements and customer support initiatives.

<br>

### 🧩 Key Features
- LLM-based sentiment classification with confidence score
- Automated extraction of reasoning behind sentiment
- Actionable feedback generation from negative and improvement reviews
- Feedback category labeling for targeted analysis

# 1. Data Ingestion

Load the dataset used for **few-shot prompting** and for the **review and feedback analysis**.

Why the amazonreviews dataset? <br>
It provides real, natural customer reviews, which serve two purposes:

1. Creating few-shot examples (such as positive and negative samples), which are shown to the LLM inside the prompt.
2. Supplying real reviews for testing, which are passed to the model for sentiment classification and feedback generation.

In short:
✅ Prompt examples + ✅ Test inputs → all come from amazonreviews

In [1]:
import bz2

train_lines = [] 
# extract train data
with bz2.open("/kaggle/input/amazonreviews/train.ft.txt.bz2", "rt", encoding="utf-8") as f:
    for i in range(10):
        line = next(f).strip()
        train_lines.append(line)

# check the train dataset
for data in train_lines[:5]:
    print(data)

__label__2 Stuning even for the non-gamer: This sound track was beautiful! It paints the senery in your mind so well I would recomend it even to people who hate vid. game music! I have played the game Chrono Cross but out of all of the games I have ever played it has the best music! It backs away from crude keyboarding and takes a fresher step with grate guitars and soulful orchestras. It would impress anyone who cares to listen! ^_^
__label__2 The best soundtrack ever to anything.: I'm reading a lot of reviews saying that this is the best 'game soundtrack' and I figured that I'd write a review to disagree a bit. This in my opinino is Yasunori Mitsuda's ultimate masterpiece. The music is timeless and I'm been listening to it for years now and its beauty simply refuses to fade.The price tag on this is pretty staggering I must say, but if you are going to buy any cd for this much money, this is the only one that I feel would be worth every penny.
__label__2 Amazing!: This soundtrack is m

Each input line consists of a label (`__label__1` or `__label__2`) followed by the review title and text.
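The line format can be illustrated with a small parser. This is a minimal sketch, assuming every line looks like `__label__N Title: body` as in the samples printed above; `parse_fasttext_line` is a hypothetical helper and is not used elsewhere in this notebook.

```python
def parse_fasttext_line(line):
    """Split '__label__N Title: body' into (label, title, body)."""
    label, _, rest = line.strip().partition(" ")
    title, sep, body = rest.partition(": ")
    if not sep:
        # no "Title: " prefix found, so treat everything as the body
        title, body = "", rest
    return label, title, body


sample = "__label__2 Amazing!: This soundtrack is my favorite music"
print(parse_fasttext_line(sample))
```

The preprocessing cell below keeps the title and body together in one `review` column, which is enough for sentiment classification.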

### Preprocess text

In [2]:
import pandas as pd

# separate labels from reviews
data = []
for line in train_lines:
    if line.startswith("__label__"):
        label = line.split(' ')[0].strip() # __label__2
        text = ' '.join(line.split(' ')[1:]).strip() # review
        data.append([label, text])

# Data Frame
df = pd.DataFrame(data, columns=['label', 'review'])

df.head()

| | label | review |
|---|---|---|
| 0 | `__label__2` | Stuning even for the non-gamer: This sound tra... |
| 1 | `__label__2` | The best soundtrack ever to anything.: I'm rea... |
| 2 | `__label__2` | Amazing!: This soundtrack is my favorite music... |
| 3 | `__label__2` | Excellent Soundtrack: I truly like this soundt... |
| 4 | `__label__2` | Remember, Pull Your Jaw Off The Floor After He... |


In [3]:
df['label'].unique()

array(['__label__2', '__label__1'], dtype=object)

In [4]:
# inspect rows labeled __label__1 to see what the label means
df[df['label']=='__label__1'].head(3)

| | label | review |
|---|---|---|
| 6 | `__label__1` | Buyer beware: This is a self-published book, a... |


In [5]:
# map each label to a sentiment: __label__2 -> positive, anything else -> negative
def map_label(lbl):
    if lbl == '__label__2':
        return 'positive'
    else:
        return 'negative'

df['sentiment'] = df['label'].apply(map_label)

In [6]:
df.head()

| | label | review | sentiment |
|---|---|---|---|
| 0 | `__label__2` | Stuning even for the non-gamer: This sound tra... | positive |
| 1 | `__label__2` | The best soundtrack ever to anything.: I'm rea... | positive |
| 2 | `__label__2` | Amazing!: This soundtrack is my favorite music... | positive |
| 3 | `__label__2` | Excellent Soundtrack: I truly like this soundt... | positive |
| 4 | `__label__2` | Remember, Pull Your Jaw Off The Floor After He... | positive |


# 2. Sentiment Classification

## Pick samples for few-shot prompting and set a prompt template

For a more detailed analysis, I added a third label called **"improvement"**. A review is relabeled as improvement when it contains a contrast or hedging keyword such as "but" or "however".

In [7]:
imp_keywords = [
    "but", "however", "although", "except", "unfortunately",
    "wish", "could be better", "not great", "only issue", "a bit", "too bad"
]

df['review_lower'] = df['review'].str.lower()

def detect_improvement(row):
    if row['sentiment'] in ['positive', 'negative']:
        for keyword in imp_keywords:
            if keyword in row['review_lower']:
                return 'improvement'
    return row['sentiment']

df['sentiment'] = df.apply(detect_improvement, axis=1)
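One caveat of the substring test above is that `"but" in text` also matches words such as "button". The following is a minimal sketch of a stricter variant that matches whole words only; `IMP_PATTERN` and `has_improvement_cue` are hypothetical names and are not part of the original pipeline.

```python
import re

IMP_KEYWORDS = [
    "but", "however", "although", "except", "unfortunately",
    "wish", "could be better", "not great", "only issue", "a bit", "too bad",
]

# \b anchors each keyword to word boundaries, so "but" no longer matches "button"
IMP_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in IMP_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def has_improvement_cue(review):
    """Return True if the review contains one of the keywords as a whole word or phrase."""
    return IMP_PATTERN.search(review) is not None


print(has_improvement_cue("The button works great"))          # no whole-word keyword
print(has_improvement_cue("Nice fit, but the zipper broke"))  # contains "but"
```

Whether this stricter rule is preferable depends on how many false positives the keyword list produces on the real data.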

#### Classify sentiment into: Positive, Improvement, Negative

In [8]:
# one positive sample
pos_examples = df[df['sentiment'] == 'positive'].sample(1, random_state=42)

# one negative sample
neg_examples = df[df['sentiment'] == 'negative'].sample(1, random_state=42)

# one improvement sample
imp_examples = df[df['sentiment'] == 'improvement'].sample(1, random_state=42)

In [9]:
print('Positive reviews :')
for review in pos_examples['review']:
    print(f'{review}\n')
print('----------------------')
print('Negative reviews :')
for review in neg_examples['review']:
    print(f'{review}\n')
print('----------------------')
print('Improvement reviews :')
for review in imp_examples['review']:
    print(f'{review}\n')

Positive reviews :
Remember, Pull Your Jaw Off The Floor After Hearing it: If you've played the game, you know how divine the music is! Every single song tells a story of the game, it's that good! The greatest songs are without a doubt, Chrono Cross: Time's Scar, Magical Dreamers: The Wind, The Stars, and the Sea and Radical Dreamers: Unstolen Jewel. (Translation varies) This music is perfect if you ask me, the best it can be. Yasunori Mitsuda just poured his heart on and wrote it down on paper.

----------------------
Negative reviews :
Buyer beware: This is a self-published book, and if you want to know why--read a few paragraphs! Those 5 star reviews must have been written by Ms. Haddon's family and friends--or perhaps, by herself! I can't imagine anyone reading the whole thing--I spent an evening with the book and a friend and we were in hysterics reading bits and pieces of it to one another. It is most definitely bad enough to be entered into some kind of a "worst book" contest. I 

The sentiment analysis prompt is built with **LangChain’s FewShotPromptTemplate** and **PromptTemplate**.

The `build_sentiment_prompt` function takes a batch of customer reviews and generates a complete few-shot prompt:

- It first formats each review with its ID and text, then injects them into the suffix section of the prompt through the `{formatted_reviews}` variable.
- After generating the base prompt with the examples and task instructions, it appends a directive at the end to enforce a strict JSON output format.

The **JSON output requirement** specifies:

- Each result must include the review’s ID, predicted sentiment (positive, negative, or improvement), a short reason, and a confidence score.
- If any text contains double quotes, they must be escaped with a backslash (`\"`) to ensure proper JSON formatting.
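To make the output requirement concrete, the sketch below parses a response in the requested shape and checks each item against it. The sample response and the `check_sentiment_item` helper are hypothetical illustrations, not part of the pipeline in this notebook.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "improvement"}
REQUIRED_KEYS = {"id", "sentiment", "reason", "confidence"}


def check_sentiment_item(item):
    """Return a list of problems found in one parsed result (empty if it is valid)."""
    problems = []
    missing = REQUIRED_KEYS - set(item)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if item.get("sentiment") not in ALLOWED_SENTIMENTS:
        problems.append(f"unexpected sentiment: {item.get('sentiment')!r}")
    conf = item.get("confidence")
    is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
    if not is_number or not 0 <= conf <= 1:
        problems.append("confidence must be a number between 0 and 1")
    return problems


# hypothetical LLM response; the inner quotes are escaped as the prompt requires
raw = '[{"id": "7", "sentiment": "improvement", "reason": "Calls the fit \\"too tight\\".", "confidence": 0.8}]'
items = json.loads(raw)
print(items[0]["reason"])
print(check_sentiment_item(items[0]))
```

Without the backslashes, the inner quotes would end the JSON string early and `json.loads` would raise an error, which is why the prompt insists on escaping.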

In [10]:
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

#define the format
example_prompt = PromptTemplate(
    input_variables=["id", "review", "sentiment", "reason", "confidence"],
    template='''id: {id}
review: {review}
sentiment: {sentiment}
reason: {reason}
confidence: {confidence}
''',
    template_format="f-string"
)

# examples for few-shot
examples = [
    {
        "id": "0",
        "review": "Egbert is such a wonderful name...",
        "sentiment": "positive",
        "reason": "The review expresses satisfaction with the product.",
        "confidence": 0.9
    },
    {
        "id" : "1",
        "review": "TERRIBLE!! DO NOT BUY THIS: I bought this for my wife...",
        "sentiment": "negative",
        "reason": "The review expresses dissatisfaction.",
        "confidence": 0.85
    },
    {
        "id" : "2",
        "review": "Not the best shaper: This would be great if it had boning in it...",
        "sentiment": "improvement",
        "reason": "The review is mostly positive but mentions an issue to improve.",
        "confidence": 0.88
    }
]

# explain what the LLM needs to do
prefix = '''You are a sentiment analysis assistant.

Your task is to analyze customer product reviews and classify them into one of the following sentiment labels:
- positive
- negative
- improvement

Return your response with the following keys:
- id (corresponding to the review input)
- sentiment
- reason
- confidence (float from 0 to 1)

Do not make up, modify, or skip any ID. Use the exact same ID provided in the input.
'''
# the final part of the prompt where the actual input(the reviews to be analyzed) is inserted
suffix = "Now analyze the following reviews(each with an ID):\n{formatted_reviews}"

# LangChain FewShotPromptTemplate
few_shot_sentiment_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["formatted_reviews"]
)

# create prompt using base prompt for sentiment
def build_sentiment_prompt(batch_reviews):
    formatted_reviews = "\n".join([f"[id: \"{row['id']}\"] {row['review']}" for _, row in batch_reviews.iterrows()])
    prompt_value = few_shot_sentiment_prompt.format_prompt(formatted_reviews=formatted_reviews)
    return prompt_value.to_string() + """

ONLY return the results in the following JSON array format.

IMPORTANT:
If any value contains double quotes ("), you MUST escape them with a backslash like this: \\".

[
  {
    "id": "string",
    "sentiment": "positive | negative | improvement",
    "reason": "short explanation",
    "confidence": float (between 0 and 1)
  },
  ...
]
"""

In [11]:
# to check prompt
test_reviews = ["The fabric was soft, but the stitching came loose."]
test_df = pd.DataFrame({
    "id": ['0'], 
    "review": test_reviews
})
prompt = build_sentiment_prompt(test_df)

print("🔍 Final Prompt:")
print(prompt)

🔍 Final Prompt:
You are a sentiment analysis assistant.

Your task is to analyze customer product reviews and classify them into one of the following sentiment labels:
- positive
- negative
- improvement

Return your response with the following keys:
- id (corresponding to the review input)
- sentiment
- reason
- confidence (float from 0 to 1)

Do not make up, modify, or skip any ID. Use the exact same ID provided in the input.


id: 0
review: Egbert is such a wonderful name...
sentiment: positive
reason: The review expresses satisfaction with the product.
confidence: 0.9


id: 1
review: TERRIBLE!! DO NOT BUY THIS: I bought this for my wife...
sentiment: negative
reason: The review expresses dissatisfaction.
confidence: 0.85


id: 2
review: Not the best shaper: This would be great if it had boning in it...
sentiment: improvement
reason: The review is mostly positive but mentions an issue to improve.
confidence: 0.88


Now analyze the following reviews(each with an ID):
[id: "0"] The fa

In [12]:
# for using LLM
# needed once
# pip install -U google-generativeai 

In [13]:
# setting API key
import os

from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("90-AI") # bring my project api key


import google.generativeai as genai
genai.configure(api_key=secret_value_0)


# For testing: try running it with Flash first
model = genai.GenerativeModel("gemini-1.5-flash")
#model = genai.GenerativeModel("models/gemini-1.5-pro")

# # for test
# # process sentiment analysis
# response = model.generate_content(prompt)

# # check results
# print("Sentiment review analysis results:")
# print(response.text.strip())

# 3. Feedback Generation

* For negative and improvement reviews only, generate summaries and actionable suggestions
* Use the LLM for natural language generation (NLG)
(e.g., “Delivery was slow” -> “Delivery process may need improvement”)

Note that the confidence value returned with each sentiment result isn’t calculated as a real probability by the model.

Instead, it’s a subjective estimate that the LLM generates to match the structure requested in the prompt.

In [14]:
example_prompt = PromptTemplate(
    input_variables=["id", "review", "feedback"],
    template='''id: {id}
review: {review}
feedback: {feedback}
''',
    template_format="f-string"
)

# for few shot
examples = [
    {
        "id": "101000",
        "review": "The material is too thin and feels cheap.",
        "feedback": "Improve the fabric quality for better durability."
    },
    {
        "id": "101001",
        "review": "The size chart was confusing and inaccurate.",
        "feedback": "Revise the size chart for clarity and accuracy."
    },
    {
        "id": "101002",
        "review": "The packaging was damaged when it arrived.",
        "feedback": "Use more protective packaging to avoid damage during delivery."
    }
]

# explain what the LLM needs to do
prefix = """You are a product improvement assistant.

Your task is to analyze each customer review and suggest one clear, actionable product improvement.

Return your response in a JSON array format like this:
[
  {{
    "id": "same_as_input",
    "feedback": "Your suggestion"
  }},
  ...
]

IMPORTANT:
- Copy the exact same ID.
- Do NOT generate, modify, or skip IDs.
- If any value contains double quotes ("), escape them with a backslash.
"""

# the reviews the LLM should analyze, followed by the required output format
suffix = """Now analyze the following reviews:
{formatted_reviews}

Return ONLY the output in the following format (no markdown or extra text):

[
  {{
    "id": "same_as_input",
    "feedback": "Your suggestion here"
  }},
  ...
]

IMPORTANT:
- Copy the exact same ID.
- If any value contains double quotes ("), you MUST escape them with a backslash like this: \\"
"""

# FewShotPromptTemplate
few_shot_feedback_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["formatted_reviews"]
)

def build_feedback_prompt(batch_df):
    formatted_reviews = "\n\n".join(
        [f'### Input\n[id: "{row["id"]}"] {row["review"]}' for _, row in batch_df.iterrows()]
    )
    prompt = few_shot_feedback_prompt.format_prompt(formatted_reviews=formatted_reviews)
    return prompt.to_string()
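The feedback prompt writes its JSON skeleton with doubled braces (`{{` and `}}`). The likely reason is that the f-string template format follows Python's `str.format` rules, where single braces mark variables. The sketch below shows the same rule with plain `str.format`; it does not call LangChain, and the template text is a shortened, hypothetical stand-in.

```python
# doubled braces are emitted as literal braces; single braces are placeholders
template = 'Return JSON like {{"id": "same_as_input"}} for:\n{formatted_reviews}'

rendered = template.format(formatted_reviews='[id: "7"] Too small.')
print(rendered)
```

The sentiment prompt avoids the doubling by appending its JSON skeleton as a plain string after the template has been formatted.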

In [15]:
#pip install --upgrade langchain

# 4. Sentiment and Feedback Analysis Process

In [16]:
import time
import json
import os
import pandas as pd
from tqdm import tqdm # progress bar

# retry the LLM call when a rate limit is hit
def safe_generate(prompt, retries=3, delay=60):
    for i in range(retries):
        try:
            return model.generate_content(prompt)
        except Exception as e:
            if "RESOURCE_EXHAUSTED" in str(e) or "429" in str(e):
                print(f"\u26a0\ufe0f Rate limit hit. Waiting {delay} seconds... (attempt {i+1})")
                time.sleep(delay)
            else:
                raise e
    raise RuntimeError("\u274c Failed after multiple retries.")

# main function
def analyze_reviews_with_id(df_reviews, batch_size=5, checkpoint_path="/kaggle/working/results_with_feedback.csv",
                              fail_log_path="/kaggle/working/failed_ids.txt"):
    results = []
    problem_ids = []

    # load the IDs that were already processed
    existing_ids = set()
    if os.path.exists(checkpoint_path):
        print("found the checkpoint")
        existing_df = pd.read_csv(checkpoint_path, dtype={"id": str})
        existing_ids = set(existing_df["id"])

    failed_ids = set()
    if os.path.exists(fail_log_path):
        with open(fail_log_path, "r") as f:
            failed_ids = set(line.strip() for line in f)

    excluded_ids = existing_ids.union(failed_ids)
    df_reviews["id"] = df_reviews["id"].astype(str)

    print(f"Excluding {len(excluded_ids)} previously processed or failed reviews.")

    for start_idx in tqdm(range(0, len(df_reviews), batch_size), desc="Building Sentiment"):
        batch_df = df_reviews.iloc[start_idx : start_idx + batch_size].copy()
        batch_df = batch_df[~batch_df["id"].isin(excluded_ids)]

        if batch_df.empty:
            continue

        try:
            prompt = build_sentiment_prompt(batch_df)
            response = safe_generate(prompt)
            cleaned = clean_json_response(response.text.strip())
            batch_results = json.loads(cleaned)

            invalid_ids = validate_llm_response(batch_df, batch_results, log_prefix=f"Batch {start_idx // batch_size + 1}")
            if invalid_ids:
                with open(fail_log_path, "a") as f:
                    for i in invalid_ids:
                        f.write(str(i) + "\n")
                print(f"❌ Skipping batch {start_idx // batch_size + 1} due to validation errors.")
                continue
                 
            for result in batch_results:
                result_id = str(result.get("id"))
                sentiment = result.get("sentiment", "").lower()
                if sentiment in ["negative", "improvement"]:
                    problem_ids.append(result_id)

            # save check point - sentiment
            df_batch_result = pd.DataFrame(batch_results)
            df_batch_result.to_csv(checkpoint_path, mode="a", index=False, header=not os.path.exists(checkpoint_path))

        except Exception as e:
            print(f"\u274c Failed to parse sentiment batch {start_idx // batch_size + 1}")
            print("Error:", e)
            with open(fail_log_path, "a") as f:
                for i in batch_df["id"]:
                    f.write(str(i) + "\n")

        time.sleep(2.0)

    print("\n\u2705 Sentiment analysis done. Starting feedback generation...\n")

    # create feedback 
    feedback_map = {}
    problem_df = df_reviews[df_reviews["id"].astype(str).isin(problem_ids)]

    for start_idx in tqdm(range(0, len(problem_df), batch_size), desc="📝 Building Feedback"):
        batch_df = problem_df.iloc[start_idx : start_idx + batch_size].copy()

        try:
            feedback_prompt = build_feedback_prompt(batch_df)
            response = safe_generate(feedback_prompt)
            cleaned = clean_json_response(response.text.strip())
            batch_feedbacks = json.loads(cleaned)

            f_invalid_ids = validate_llm_response(batch_df, batch_feedbacks, log_prefix=f"Feedback Batch {start_idx // batch_size + 1}")
            if f_invalid_ids:
                with open(fail_log_path, "a") as f:
                    for i in f_invalid_ids:
                        f.write(str(i) + "\n")
                print(f"❌ Skipping feedback batch {start_idx // batch_size + 1} due to validation errors.")
                continue
                
            expected_ids = set(batch_df["id"].astype(str))
            for feedback in batch_feedbacks:
                fid = str(feedback.get("id"))
                if fid not in expected_ids:
                    print(f"⚠️ Unexpected ID returned: {fid}")
                    continue
                feedback_map[fid] = feedback.get("feedback", None)

        except Exception as e:
            print(f"\u274c Failed to parse feedback batch {start_idx // batch_size + 1}")
            print("Error:", e)
            with open(fail_log_path, "a") as f:
                for i in batch_df["id"]:
                    f.write(str(i) + "\n")

        time.sleep(2.0)

    # final result
    final_df = pd.read_csv(checkpoint_path, dtype={"id": str}) # check point - sentiment + feedback
    final_df["feedback"] = final_df.apply(lambda row: feedback_map.get(row["id"], row.get("feedback")), axis=1)
    final_df.to_csv(checkpoint_path, index=False)
    print("Saved to csv file. Congrats!")
    
    return final_df
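`safe_generate` waits a fixed 60 seconds between attempts. A common alternative is exponential backoff, where the wait doubles after each failed attempt. The sketch below is an assumption-laden variant, not the notebook's method: `retry_with_backoff` and `is_rate_limit` are hypothetical names, and unlike `safe_generate` it re-raises the last error instead of raising a `RuntimeError`.

```python
import time


def is_rate_limit(exc):
    """Same heuristic as safe_generate: look for rate-limit markers in the error text."""
    text = str(exc)
    return "RESOURCE_EXHAUSTED" in text or "429" in text


def retry_with_backoff(call, retries=3, base_delay=1.0,
                       is_retryable=is_rate_limit, sleep=time.sleep):
    """Run call(); on a retryable error wait base_delay * 2**attempt and try again."""
    for attempt in range(retries):
        try:
            return call()
        except Exception as exc:
            # re-raise errors that are not rate limits, and the final failure
            if not is_retryable(exc) or attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1x, 2x, 4x, ... the base delay
```

In this notebook it could be used as `retry_with_backoff(lambda: model.generate_content(prompt), base_delay=15)`, where the base delay is an arbitrary illustrative value.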


#### Why did I use **json.loads()**?
- `json.loads()` converts a JSON-formatted string (returned by the LLM) into a usable Python object, such as a list of dictionaries.
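As a small illustration of the conversion, the sketch below parses a response string and handles the case where the text is not valid JSON. `parse_llm_json` is a hypothetical helper and the sample strings are made up.

```python
import json


def parse_llm_json(text):
    """Parse LLM output into a list of dicts, or return None if it is not valid JSON."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        print(f"Could not parse response: {exc.msg} (position {exc.pos})")
        return None
    # always hand back a list so callers can iterate over the results
    return data if isinstance(data, list) else [data]


good = parse_llm_json('[{"id": "1", "feedback": "Use thicker fabric."}]')
bad = parse_llm_json("Here is the result: [")
print(good)
print(bad)
```

In the pipeline, a parse failure is caught by the surrounding `try`/`except`, and the IDs of the whole batch are written to the fail log.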

In [17]:
# clean the raw LLM text so that json.loads() does not fail
def clean_json_response(text):
    text = text.strip()

    # strip a Markdown code fence if the response is wrapped in one
    if text.startswith("```"):
        parts = text.split("```")
        if len(parts) >= 2:
            text = parts[1].strip()

    # drop any introduction such as "Here's the result:" (or a leading "json" fence tag)
    lines = text.splitlines()
    json_start = next((i for i, line in enumerate(lines) if line.strip().startswith(("[", "{"))), 0)
    text = "\n".join(lines[json_start:]).strip()

    # wrap a single dict in a list so callers always receive a list
    if text.startswith("{"):
        text = "[" + text + "]"

    return text

**The validate_llm_response function** is a utility that checks ID order, detects duplicate IDs, and identifies missing IDs in LLM responses.

In [18]:
from collections import Counter # counts how often each ID appears in the LLM response

def validate_llm_response(batch_df, batch_results, log_prefix=""):
    """
    Validates ID-related issues in the LLM response results.
    
    - Checks for order consistency
    - Detects duplicate IDs
    - Identifies missing IDs
    """
    input_ids = list(batch_df["id"].astype(str))
    expected_ids = set(input_ids)
    returned_ids = [str(result.get("id")) for result in batch_results]

    invalid_ids = set()

    # 1. check ID order
    for i, (expected, returned) in enumerate(zip(input_ids, returned_ids)):
        if expected != returned:
            print(f"⚠️ {log_prefix} ID mismatch at position {i}: expected '{expected}', got '{returned}'")
            invalid_ids.add(expected)

    # 2. detect duplicate IDs
    counter = Counter(returned_ids)
    for rid, count in counter.items():
        if count > 1:
            print(f"⚠️ {log_prefix} Duplicate ID found in response: '{rid}' ({count} times)")
            invalid_ids.add(rid)

    # 3. identify missing IDs
    missing = expected_ids - set(returned_ids)
    if missing:
        print(f"⚠️ {log_prefix} Missing IDs in response: {missing}")
        invalid_ids.update(missing)

    return list(invalid_ids)
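To see what the three checks flag, here is a minimal sketch of the same logic on plain lists, without a DataFrame or logging. `find_invalid_ids` is a hypothetical stand-in for `validate_llm_response`, and the IDs are made up.

```python
from collections import Counter


def find_invalid_ids(input_ids, returned_ids):
    """Return the set of IDs that fail the order, duplicate, or missing checks."""
    invalid = set()
    # 1. order: the i-th returned ID should equal the i-th input ID
    for expected, returned in zip(input_ids, returned_ids):
        if expected != returned:
            invalid.add(expected)
    # 2. duplicates: an ID that appears more than once in the response
    invalid.update(rid for rid, n in Counter(returned_ids).items() if n > 1)
    # 3. missing: input IDs that never appear in the response
    invalid.update(set(input_ids) - set(returned_ids))
    return invalid


# the response skipped "11" and repeated "12"
flagged = find_invalid_ids(["10", "11", "12"], ["10", "12", "12"])
print(sorted(flagged))  # → ['11', '12']
```

Because the pipeline skips a whole batch when any ID is flagged, one misbehaving response costs at most `batch_size` reviews.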

In [19]:
# JSON conversion test
df_sample = pd.DataFrame([{
    "id": "999",
    "review": "The fit was nice, but the material feels cheap and weak."
}])

prompt = build_sentiment_prompt(df_sample)
print("📥 S Prompt:")
print(prompt)

feedback_prompt = build_feedback_prompt(df_sample)
print("📥 f prompt")
print(feedback_prompt)

response = safe_generate(prompt)
raw_text = response.text.strip()

print("📥 Raw LLM Response:")
print(raw_text)

# parsing
cleaned = clean_json_response(raw_text)
parsed = json.loads(cleaned)

print("✅ Parsed Response:")
for r in parsed:
    print(r)

📥 S Prompt:
You are a sentiment analysis assistant.

Your task is to analyze customer product reviews and classify them into one of the following sentiment labels:
- positive
- negative
- improvement

Return your response with the following keys:
- id (corresponding to the review input)
- sentiment
- reason
- confidence (float from 0 to 1)

Do not make up, modify, or skip any ID. Use the exact same ID provided in the input.


id: 0
review: Egbert is such a wonderful name...
sentiment: positive
reason: The review expresses satisfaction with the product.
confidence: 0.9


id: 1
review: TERRIBLE!! DO NOT BUY THIS: I bought this for my wife...
sentiment: negative
reason: The review expresses dissatisfaction.
confidence: 0.85


id: 2
review: Not the best shaper: This would be great if it had boning in it...
sentiment: improvement
reason: The review is mostly positive but mentions an issue to improve.
confidence: 0.88


Now analyze the following reviews(each with an ID):
[id: "999"] The fit 

Load a sample of reviews from the test split (`test.ft.txt.bz2`) for the demo.

## Input

In [20]:
# run once
# build the sample data
import random

# for the test of product review feedback generator
test_data = []
positive = []
negative = []
num_each = 5000  # num_each reviews for each positive/negative 

# the reviews to analyze
with bz2.open("/kaggle/input/amazonreviews/test.ft.txt.bz2", "rt", encoding="utf-8") as f:
    for line in f:
        if line.startswith('__label__1') and len(negative) < num_each:
            _, review = line.strip().split(" ", 1)
            negative.append(review)
        elif line.startswith('__label__2') and len(positive) < num_each:
            _, review = line.strip().split(" ", 1)
            positive.append(review)
        
        if len(positive) >= num_each and len(negative) >= num_each:
            break


balanced_reviews = positive + negative #list
random.shuffle(balanced_reviews) 

df_reviews = pd.DataFrame({
    "id": [str(i) for i in range(len(balanced_reviews))], # give ID
    "review": balanced_reviews
})

df_reviews.to_csv("balanced_reviews_fixed.csv", index=False)
df_reviews = pd.read_csv("balanced_reviews_fixed.csv", dtype={"id": str})  # keep IDs as strings

### Why each review is assigned a **unique ID**?

- Each review is assigned a unique ID to simplify and **speed up the LLM's response**.
- Instead of having the LLM return the full original review along with its feedback, which would increase token usage and processing time, we only ask for a **minimal** response containing the **id and feedback**.

- Using lightweight numeric IDs allows us to later map the feedback back to the original reviews on the backend.
- This approach helps **reduce token costs and improves overall efficiency in batch processing**.
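The ID-based mapping can be sketched in a few lines. The reviews and the LLM output below are hypothetical; the point is only that the response carries IDs and feedback, and the original text is looked up locally.

```python
# original reviews stay on our side, keyed by ID
reviews = {
    "0": "The material is too thin.",
    "1": "Arrived two weeks late.",
}

# hypothetical minimal LLM output: IDs and feedback only, in arbitrary order
llm_output = [
    {"id": "1", "feedback": "Shorten delivery times."},
    {"id": "0", "feedback": "Use a thicker fabric."},
]

# index the feedback by ID, then join it back onto the reviews
feedback_by_id = {item["id"]: item["feedback"] for item in llm_output}
joined = [
    {"id": rid, "review": text, "feedback": feedback_by_id.get(rid)}
    for rid, text in reviews.items()
]

for row in joined:
    print(row)
```

The pipeline does the same thing with `feedback_map` and the checkpoint DataFrame.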


In [21]:
df_reviews["id"] = df_reviews["id"].astype(str)

# 5. Run the Code

In [22]:
df_test = df_reviews.iloc[:30].copy()  # pass a copy so the function does not write to a slice of df_reviews
df_result = analyze_reviews_with_id(df_test, batch_size=5)
# df_second_half = df_reviews.iloc[30:60].copy()
# df_second_half = df_reviews.iloc[60:3000].copy()
# df_second_half = df_reviews.iloc[3000:7000].copy()
# df_second_half = df_reviews.iloc[7000:10000].copy()
# result_df = analyze_reviews_with_id(df_second_half, batch_size=5, checkpoint_path="/kaggle/working/results_with_feedback.csv")


Excluding 0 previously processed or failed reviews.


Building Sentiment: 100%|██████████| 6/6 [00:26<00:00,  4.46s/it]



✅ Sentiment analysis done. Starting feedback generation...



📝 Building Feedback: 100%|██████████| 3/3 [00:11<00:00,  3.83s/it]

Saved to csv file. Congrats!





In [23]:
# IDs that failed in the last run remain in the fail log,
# which can cause IDs that have since succeeded to be treated as failed again.
# Uncomment the next line to clear the log before a rerun:
# open("failed_ids.txt", "w").close()

In [24]:
df_result # output from LLM

| | id | sentiment | reason | confidence | feedback |
|---|---|---|---|---|---|
| 0 | 0 | positive | The reviewer found the DVD helpful and enjoyed... | 0.95 | |
| 1 | 1 | positive | The review expresses appreciation for the book... | 0.8 | |
| 2 | 2 | negative | The reviewer states the product is "useless" f... | 0.9 | Explore alternative designs or materials that ... |
| 3 | 3 | positive | The review is overwhelmingly positive, praisin... | 0.98 | |
| 4 | 4 | negative | The review expresses strong disapproval, descr... | 0.92 | This feedback is not about a product. It's a ... |
| 5 | 5 | negative | The review describes the movie as a "dog's din... | 0.8 | This feedback is not about a product. It's a m... |
| 6 | 6 | negative | The review expresses disappointment with the C... | 0.75 | This feedback is not about a product. It's a m... |
| 7 | 7 | positive | The review praises the Sesame Street product f... | 0.95 | |
| 8 | 8 | positive | The review expresses strong approval for Clay ... | 0.9 | |
| 9 | 9 | positive | The review expresses satisfaction with the pro... | 0.92 | |


# 6. Create Feedback Categories

In [25]:
# Build prompt for category classification
example_prompt = PromptTemplate(
    input_variables=["id", "feedback", "category"],
    template='''id: {id}
feedback: {feedback}
category: {category}
''',
    template_format="f-string"
)

example_data = [
    # Shipping
    {"id": "100001", "feedback": "The package arrived late", "category": "Shipping"},
    {"id": "100002", "feedback": "It took too long to be delivered", "category": "Shipping"},

    # Quality
    {"id": "100003", "feedback": "The material tore after one wash", "category": "Quality"},
    {"id": "100004", "feedback": "Feels poorly made and flimsy", "category": "Quality"},

    # Price
    {"id": "100005", "feedback": "Not worth the money", "category": "Price"},
    {"id": "100006", "feedback": "Too expensive for the quality", "category": "Price"},

    # Design
    {"id": "100007", "feedback": "Style looks old-fashioned", "category": "Design"},
    {"id": "100008", "feedback": "The design was not what I expected", "category": "Design"},

    # Service
    {"id": "100009", "feedback": "Had trouble reaching service team", "category": "Service"},
    {"id": "100010", "feedback": "Customer support was not helpful", "category": "Service"},

    # Other
    {"id": "100011", "feedback": "Nothing stood out about this product", "category": "Other"},
    {"id": "100012", "feedback": "Just okay, nothing memorable", "category": "Other"},
]

prefix = """You are a customer feedback categorization assistant.

Your task is to classify each of the following product improvement suggestions into **one** of these categories:
["Shipping", "Quality", "Design", "Price", "Service", "Other"]

Each feedback contains a unique ID like this: [id: "42"]
⚠️ You MUST copy the exact ID into your response.  
Do NOT make up, skip, or modify any ID.

Your output should contain:
- id (same as the input)
- category (one of the 6 categories above)
"""

suffix = "Now categorize the following feedbacks:\n{formatted_feedbacks}"


# few-shot
category_prompt_template = FewShotPromptTemplate(
    examples=example_data,
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["formatted_feedbacks"]
)


def build_category_prompt(batch_df):
    formatted_feedbacks = "\n".join([
        f'[id: "{row["id"]}"] {row["feedback"]}' for _, row in batch_df.iterrows()
    ])
    prompt_value = category_prompt_template.format_prompt(formatted_feedbacks=formatted_feedbacks)
    return prompt_value.to_string() + """

ONLY return the results in the following JSON array format.

IMPORTANT:
If any value contains double quotes ("), you MUST escape them with a backslash like this: \\".

[
  {
    "id": "string",
    "category": "Shipping | Quality | Design | Price | Service | Other"
  },
  ...
]
"""

In [26]:
# Category classification main function
def categorize_feedbacks(df_feedback, batch_size=5, checkpoint_path="categorized_feedback.csv", fail_log_path="category_failed_ids.txt"):
    """Categorize feedback in batches, appending each valid batch to a CSV checkpoint.

    Returns only the rows categorized during this call; rows from earlier runs stay in checkpoint_path.
    """
    df_result = pd.DataFrame()

    # Skip IDs that a previous run already wrote to the checkpoint
    existing_ids = set()
    if os.path.exists(checkpoint_path):
        print("📍 Found existing checkpoint.")
        existing_df = pd.read_csv(checkpoint_path, dtype={"id": str})
        existing_ids = set(existing_df["id"])

    pending_df = df_feedback[~df_feedback["id"].astype(str).isin(existing_ids)].copy()
    pending_df["id"] = pending_df["id"].astype(str)

    print(f"Excluded {len(df_feedback) - len(pending_df)} IDs. Processing {len(pending_df)} feedback entries.")

    for start in tqdm(range(0, len(pending_df), batch_size), desc="🧠 Categorizing"):
        batch_df = pending_df.iloc[start : start + batch_size].copy()
        if batch_df.empty:
            continue

        try:
            prompt = build_category_prompt(batch_df)
            response = safe_generate(prompt)
            cleaned = clean_json_response(response.text.strip())
            batch_results = json.loads(cleaned)

            invalid_ids = validate_llm_response(batch_df, batch_results, log_prefix=f"Batch {start // batch_size + 1}")
            if invalid_ids:
                with open(fail_log_path, "a") as f:
                    for i in invalid_ids:
                        f.write(str(i) + "\n")
                print(f"❌ Skipping batch {start // batch_size + 1} due to validation errors.")
                continue

            df_batch = pd.DataFrame(batch_results)
            df_result = pd.concat([df_result, df_batch], ignore_index=True)
            
            df_batch.to_csv(checkpoint_path, mode="a", index=False, header=not os.path.exists(checkpoint_path))

        except Exception as e:
            print(f"❌ Error (Batch {start // batch_size + 1}):", e)
            with open(fail_log_path, "a") as f:
                for i in batch_df["id"]:
                    f.write(str(i) + "\n")

        time.sleep(2.0)

    print("\n✅ Category classification complete. Results saved.")

    return df_result
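
The loop relies on `validate_llm_response` to decide whether a batch is kept. As an illustration of the kind of checks such a step could perform (this is an assumption, not that helper's actual code), the sketch below flags IDs that were dropped, invented, or labeled with a category outside the six allowed ones. `find_invalid_ids`, `ALLOWED_CATEGORIES`, and the sample response are hypothetical.

```python
import json

# The six categories listed in the prompt prefix
ALLOWED_CATEGORIES = {"Shipping", "Quality", "Design", "Price", "Service", "Other"}


def find_invalid_ids(expected_ids, batch_results):
    """Return sorted IDs that are missing, unexpected, or carry an unknown category."""
    expected = {str(i) for i in expected_ids}
    returned = {str(item.get("id")): item.get("category") for item in batch_results}

    invalid = expected - set(returned)    # IDs the model dropped
    invalid |= set(returned) - expected   # IDs the model invented
    invalid |= {i for i, cat in returned.items() if cat not in ALLOWED_CATEGORIES}
    return sorted(invalid)


# Hypothetical model output for a batch that contained IDs 2, 4 and 5
raw_response = """[
  {"id": "2", "category": "Design"},
  {"id": "4", "category": "Packaging"},
  {"id": "9", "category": "Other"}
]"""

invalid_ids = find_invalid_ids(["2", "4", "5"], json.loads(raw_response))
print(invalid_ids)
```

Here ID 5 is missing, ID 9 was never sent, and ID 4 carries a label that is not in the allowed list, so all three are reported.
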

In [27]:
df = pd.read_csv("/kaggle/working/results_with_feedback.csv")
df = df[df["feedback"].notna()]  # Remove empty feedbacks
df_export = df[['id', 'feedback']]
df_c = categorize_feedbacks(df_export[:30], batch_size=5)  # trial run on at most the first 30 rows

Excluded 0 IDs. Processing 15 feedback entries.


🧠 Categorizing: 100%|██████████| 3/3 [00:09<00:00,  3.22s/it]


✅ Category classification complete. Results saved.





In [28]:
df_c

|   | id | category |
|---|----|----------|
| 0 | 2  | Design   |
| 1 | 4  | Other    |
| 2 | 5  | Other    |
| 3 | 6  | Other    |
| 4 | 14 | Other    |
| 5 | 17 | Design   |
| 6 | 18 | Price    |
| 7 | 20 | Design   |
| 8 | 21 | Design   |
| 9 | 23 | Design   |
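
One way to see which areas receive the most criticism is to count the categories. The following is a minimal sketch that uses the ten category labels from the output above, typed in by hand so it runs without pandas. On the real DataFrame, `df_c["category"].value_counts()` would give the same tally.

```python
from collections import Counter

# Category labels copied from the df_c output shown above
categories = [
    "Design", "Other", "Other", "Other", "Other",
    "Design", "Price", "Design", "Design", "Design",
]

counts = Counter(categories)
for category, n in counts.most_common():
    share = n / len(categories)
    print(f"{category}: {n} ({share:.0%})")
```
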


In [29]:
# Retry any IDs that failed because of batch errors, token limits, or LLM issues
def retry_failed_reviews(df_reviews,
                         original_checkpoint="/kaggle/working/results_with_feedback.csv",
                         failed_ids_path="/kaggle/working/failed_ids.txt",
                         retry_checkpoint="/kaggle/working/retry_results.csv",
                         retry_failed_log="/kaggle/working/retry_failed_ids.txt",
                         merge_result=True,
                         merged_output_path="/kaggle/working/results_with_feedback_merged.csv"):
    """
    Retries only the failed IDs, saves the results, and optionally merges them with the original data.

    Args:
        df_reviews: DataFrame containing all reviews.
        original_checkpoint: Path to the original checkpoint file.
        failed_ids_path: Path to the text file containing IDs of failed reviews.
        retry_checkpoint: Path to save the retry results.
        retry_failed_log: Path to save IDs that failed again during retry.
        merge_result: Whether to merge the retry results with the original data.
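        merged_output_path: Path to save the final merged result.

    Returns:
        The merged DataFrame if merge_result is True and original_checkpoint exists,
        the retry results otherwise, or None if failed_ids_path does not exist.
    """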

    # Load the failed IDs
    if not os.path.exists(failed_ids_path):
        print(f"❌ {failed_ids_path} not found. Skipping retry.")
        return None

    with open(failed_ids_path, "r") as f:
        # Skip blank lines so an empty string is never treated as an ID
        failed_ids = {line.strip() for line in f if line.strip()}

    print(f"🔁 Retrying {len(failed_ids)} failed reviews...")

    # Select the reviews whose IDs failed
    df_retry = df_reviews[df_reviews["id"].astype(str).isin(failed_ids)].copy()

    # Re-run the analysis on the failed reviews only
    df_retried = analyze_reviews_with_id(
        df_retry,
        checkpoint_path=retry_checkpoint,
        fail_log_path=retry_failed_log
    )

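    # Merge the retry results into the original checkpoint
    if merge_result and os.path.exists(original_checkpoint):
        df_original = pd.read_csv(original_checkpoint, dtype={"id": str})
        if "id" in df_retried.columns:
            # Match the checkpoint's string IDs so duplicates are detected
            df_retried["id"] = df_retried["id"].astype(str)
        df_final = pd.concat([df_original, df_retried], ignore_index=True)
        # Default keep="first": the original row wins when an ID appears twice
        df_final = df_final.drop_duplicates(subset="id")
        df_final.to_csv(merged_output_path, index=False)
        print(f"✅ Merged results saved to: {merged_output_path}")
        return df_final

    return df_retried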
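
The merge step concatenates the original and retried rows and then drops duplicate IDs, keeping the first occurrence. The sketch below mirrors that behavior with plain dictionaries to show why IDs should share one type before deduplicating: without the `str()` cast, `"1"` and `1` would count as two different reviews. `merge_by_id` and the sample rows are hypothetical.

```python
def merge_by_id(original_rows, retried_rows):
    """Combine two lists of row dicts, keeping the first row seen for each ID."""
    merged = {}
    for row in original_rows + retried_rows:
        # Cast to str so "1" (read from CSV) and 1 (from a DataFrame) collide
        merged.setdefault(str(row["id"]), row)
    return list(merged.values())


# Hypothetical rows, for illustration only
original_rows = [{"id": "1", "sentiment": "positive"}]
retried_rows = [
    {"id": 1, "sentiment": "negative"},  # same review, integer ID
    {"id": 2, "sentiment": "improvement"},
]

merged_rows = merge_by_id(original_rows, retried_rows)
print(merged_rows)
```

Because the first occurrence wins, a retried row only replaces an original one if the order of concatenation is reversed or `keep="last"` is passed to `drop_duplicates`.
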