In [1]:
import numpy as np
import pandas as pd
import re
import time
import requests
import json


In [5]:
df = pd.read_csv('2. Cleaned_laptop_df.csv')
df.head(2)

Unnamed: 0,Index,ASIN,Model Number,Manufacturer Name,Generic Name,Product Title,Price,Overall Rating,Rating Count,Review Count,...,Customer Say,Insight,Individual Rating,Review Title,Review Date,Product Description,Verified Purchase,Review Text,Helpful Votes,Cleaned Review Text
0,1,B0DZDC247V,MW0Y3HN/A,Apple,MacBook Air,"Apple 2025 MacBook Air (13-inch, Apple M4 chip...",89990,4.5,33,15,...,,,5.0,Fast and Powerful!,Reviewed in India on 19 April 2025,Colour: Starlight|Size: 16GB Unified Memory|St...,True,I recently purchased the MacBook M4 with 512GB...,38,i recently purchased the macbook m4 with 512gb...
1,2,B0DZDC247V,MW0Y3HN/A,Apple,MacBook Air,"Apple 2025 MacBook Air (13-inch, Apple M4 chip...",89990,4.5,33,15,...,,,5.0,Elevate Your Productivity with the Perfect Laptop,Reviewed in India on 4 April 2025,Colour: Midnight|Size: 16GB Unified Memory|Sty...,True,I placed the order on March 28th and received ...,118,i placed the order on march 28th and received ...


## 🤖 Building the Prompt for LLM-Based Review Analysis

The function `build_amazon_prompt(text)` builds a few-shot prompt that asks a local LLM to return the sentiment and the aspect-specific pain points of an Amazon product review as a flat JSON object.

In [87]:
def build_amazon_prompt(text):
    return f"""
You are an intelligent assistant extracting structured insights from Amazon product reviews written by Indian customers.

Return a valid flat JSON object with these fields:
- "Sentiment": One of ["Positive", "Negative", "Neutral"]
- "Aspect-PainPoint Pairs": Extract all mentioned aspect-specific complaints and return them as a list of strings in the format "Aspect - Pain Point"

Use only the review content. If a field is not present in the review, return null for it.

### Examples - Learn from these examples.

Review:
"the hp laptop 15s with amd ryzen 3 5300u is perfect for daily tasks. the 15.6-inch fhd display is clear and vibrant, great for work and entertainment. its lightweight 1.69 kg and easy to carry around, making it ideal for students and professionals.performance is smooth with 8gb ram and a 512gb ssd, ensuring fast boot times and multitasking. the dual speakers provide clear audio, and the preloaded windows 11 and ms office 2019 are a huge plus.overall, its a stylish, reliable laptop at a great price. highly recommended!"
Output:
{{
  "Sentiment": "Positive",
  "Aspect-PainPoint Pairs": null
}}

Review:
"battery low @charging is low every time.i think its not good quality.my suggestion is dont buy electric items online"
Output:
{{
  "Sentiment": "Negative",
  "Aspect-PainPoint Pairs": [
    "Battery - Low battery backup",
    "Charging - Slow charging",
    "Build Quality - Poor quality"
  ]
}}

Review:
"avoid buying this laptop, especially the hp 15s series. the display quality is terrible even a slight side view turns the screen black and white. the worst part is the wi-fi it disconnects after 1520 minutes of use, which is frustrating during important tasks. totally disappointing."
Output:
{{
  "Sentiment": "Negative",
  "Aspect-PainPoint Pairs": [
    "Display - Poor viewing angles",
    "Wi-Fi - Disconnects frequently"
  ]
}}

Review:
"touch pad is not working.i bought the product on 13 th of may so obviously my warranty should be from the date of buy to one year. but the date of expiry is shown in website for this pc is 18 november 2025 and start date of the laptop is 16 november 2024 . i dont know that i bought the laptop recently in 2025 but how come the warranty got started from 2024."
Output:
{{
  "Sentiment": "Negative",
  "Aspect-PainPoint Pairs": [
    "Touchpad - Not working",
    "Warranty - Incorrect start date"
  ]
}}

Review:
"if you can shift 3 to 4000 upper side then you should go to ryzen 5 it will be more suitable for you according to price and it will give you next level performance and note for hp ryzen 3 it gives only three to four and maximum is 5.5 hours battery backup which will frustrate you after purchasing"
Output:
{{
  "Sentiment": "Negative",
  "Aspect-PainPoint Pairs": [
    "Battery - Short backup time",
    "Processor - Low performance"
  ]
}}

### Now, analyze the following review:

\"\"\"{text}\"\"\"

Output:
"""

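The doubled braces in the example outputs are there because the prompt is an f-string: `{{` and `}}` render as literal braces, while `{text}` is replaced by the review. A minimal sketch with a shortened template illustrates this; `build_mini_prompt` is a hypothetical stand-in and not part of the notebook.

```python
def build_mini_prompt(text):
    # Hypothetical, shortened stand-in for build_amazon_prompt
    return f"""Output:
{{
  "Sentiment": "Positive"
}}

Review:
\"\"\"{text}\"\"\"
"""

prompt = build_mini_prompt("battery drains fast")
print(prompt)  # the braces appear once; the review is wrapped in triple quotes
```

Without the doubling, Python would try to evaluate the JSON example as an f-string expression and raise an error when the function is called.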

## 🧠 Extracting Insights from Reviews using LLM (LLaMA 3 via Ollama)

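This block sends each cleaned review to a LLaMA 3 model served locally by Ollama (`http://localhost:11434/api/generate`), parses the JSON object out of each response, and joins the extracted fields back onto the original DataFrame.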
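Ollama's `/api/generate` endpoint also accepts a `format` field set to `"json"` and an `options` object for sampling parameters such as `temperature`. The notebook does not use them. The sketch below only shows how such a request body could be assembled, on the assumption that constraining the output to JSON and lowering the temperature makes the responses easier to parse; `build_payload` is a hypothetical helper.

```python
import json

def build_payload(prompt, model="llama3"):
    # Hypothetical helper: request body for POST /api/generate
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                # return one complete response
        "format": "json",               # ask Ollama to constrain output to JSON
        "options": {"temperature": 0},  # assumption: less variation between runs
    }

payload = build_payload("Review: ...")
print(json.dumps(payload, indent=2))
```

The resulting dictionary could be passed as the `json=` argument of `requests.post` in place of the inline dictionary.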

In [None]:
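def extract_info(text):
    """Send one review to the local Ollama server and return the raw model output."""
    prompt = build_amazon_prompt(text)
    try:
        response = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": "llama3", "prompt": prompt, "stream": False},
            timeout=120,  # assumed limit so a stalled server cannot hang the loop
        )
        response.raise_for_status()  # report HTTP errors instead of a KeyError below
        return response.json()["response"]
    except Exception as e:
        print(f"❌ Error calling LLM: {e}")
        return None

def parse_response(resp):
    """Parse the text between the first '{' and the last '}' as JSON."""
    try:
        match = re.search(r'\{.*\}', resp, re.DOTALL)
        if match:
            return json.loads(match.group())
        else:
            return {}
    except Exception as e:
        print(f"❌ JSON parse error: {e}")
        return {}

# Use a clean 0..n-1 index so positional and label access agree
df = df.reset_index(drop=True)
df["llm_raw"] = None
df["llm_parsed"] = None

for idx in range(len(df)):
    text = df.loc[idx, "Cleaned Review Text"]

    # Skip missing or empty reviews instead of sending "nan" to the model
    raw = extract_info(text) if isinstance(text, str) and text.strip() else None
    parsed = parse_response(raw) if raw else {}

    df.at[idx, "llm_raw"] = raw
    df.at[idx, "llm_parsed"] = parsed

    sentiment = parsed.get("Sentiment", None)
    aspect_pairs = parsed.get("Aspect-PainPoint Pairs", None)

    print(f"✅ Row {idx} | Sentiment: {sentiment} | Aspect-PainPoint Pairs: {aspect_pairs}")

    time.sleep(1)  # brief pause between requests

# Expand the parsed dictionaries into columns and attach them to the original rows
df_extracted = pd.json_normalize(df["llm_parsed"].tolist()).reset_index(drop=True)
final_df = pd.concat([df.reset_index(drop=True), df_extracted], axis=1)

final_df.head()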
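The greedy pattern `\{.*\}` in `parse_response` spans from the first `{` to the last `}`, so parsing fails when the model adds text containing braces after the JSON object. One alternative, shown here as a sketch rather than what the notebook uses, is `json.JSONDecoder.raw_decode`, which parses a single JSON value starting at a given index and ignores whatever follows. `parse_first_object` is a hypothetical name.

```python
import json

def parse_first_object(resp):
    """Return the first JSON object found in resp, or {} if there is none."""
    decoder = json.JSONDecoder()
    start = resp.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(resp, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON from this brace; try the next one
        start = resp.find("{", start + 1)
    return {}

sample = (
    'Here is the result:\n'
    '{"Sentiment": "Negative", "Aspect-PainPoint Pairs": ["Battery - Short backup time"]}\n'
    'Note: {not json}'
)
print(parse_first_object(sample))
```

On this sample the greedy regex would capture everything up to the final `}` and `json.loads` would reject it, while the decoder stops at the end of the first object.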

> ⚠️ The full output of the cell above has been removed to keep the notebook readable. A sample of the printed output is shown below.

✅ Row 18 | Sentiment: Negative | Aspect-PainPoint Pairs: ['Performance - Frequent freezing and system lag', 'Cooling - Extreme overheating', 'Noise - Loud fan noise', 'Network/Wi-Fi - Disconnects frequently', 'Customer Support - Unhelpful']  
✅ Row 19 | Sentiment: Negative | Aspect-PainPoint Pairs: ['Battery - Short backup time']  
✅ Row 20 | Sentiment: Positive | Aspect-PainPoint Pairs: ['Heat - Exhaust on bottom side']  
✅ Row 21 | Sentiment: Positive | Aspect-PainPoint Pairs: ['Display - Poor viewing angles', 'Battery - Short backup time']  
✅ Row 22 | Sentiment: Positive | Aspect-PainPoint Pairs: ['Graphics - Weak']  
✅ Row 23 | Sentiment: Negative | Aspect-PainPoint Pairs: ['Sound - Low sound quality']  
✅ Row 24 | Sentiment: Negative | Aspect-PainPoint Pairs: ['Performance - Hanging issues']  
✅ Row 25 | Sentiment: Neutral | Aspect-PainPoint Pairs: None  
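Each extracted pair is a single string in the form "Aspect - Pain Point". For aggregation it can help to separate the two parts. A minimal sketch, assuming the separator is always a hyphen surrounded by spaces, so that hyphenated aspects such as `Wi-Fi` stay intact; `split_pair` is a hypothetical helper and not part of the notebook.

```python
def split_pair(pair):
    """Split 'Aspect - Pain Point' into (aspect, pain_point)."""
    aspect, sep, pain_point = pair.partition(" - ")
    if not sep:
        # No separator found: keep the whole string as the aspect
        return pair.strip(), None
    return aspect.strip(), pain_point.strip()

pairs = ["Network/Wi-Fi - Disconnects frequently", "Battery - Short backup time"]
split = [split_pair(p) for p in pairs]
print(split)
```

Splitting on the first `" - "` only means a pain point that itself contains the separator is kept whole.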


In [103]:
# final_df.to_csv('3. llm_outpu_new1.csv', index=False)