# AI-Assisted Customer Review Triage System

Author: Christopher Bacani, 2026 

## Problem Statement

High-volume guest reviews require structured triage to:
- Summarize feedback
- Detect sentiment and escalation risk
- Ground responses in policy context
- Generate brand-safe replies

This notebook demonstrates a clean, end-to-end proof of concept.

## Solution Components

This proof of concept includes:

- **Structured LLM Triage**  
  Extracts summary, sentiment label, sentiment score, and escalation risk.

- **Hybrid Risk Logic**  
  Combines rule-based keyword triggers with LLM reasoning; a review is escalated if either one fires, which reduces missed escalations.

- **Semantic FAQ Retrieval**  
  Uses sentence embeddings and cosine similarity to ground responses in policy context.

- **Guardrailed Response Generation**  
  Drafts brand-safe replies that avoid liability admission or unsupported commitments.

- **End-to-End Orchestration**  
  A single `process_review()` function executes the complete workflow.

## Reproducibility

1. Install dependencies:

   `pip install -r requirements.txt`

2. Create a `.env` file in the project root containing:

   `OPENAI_API_KEY=your generated API key`

   Generate an API key at https://platform.openai.com

   Do not commit your `.env` file to version control.

3. Restart the kernel and run all cells (Restart & Run All). The notebook executes cleanly without manual cell reordering.

## Dataset Import

In [2]:
from pathlib import Path
import pandas as pd

PROJECT_ROOT = Path(".")
DATA_PATH = PROJECT_ROOT / "data" / "DisneylandReviews.csv"

DATA_PATH

PosixPath('data/DisneylandReviews.csv')

In [3]:
reviews_df = pd.read_csv(DATA_PATH, encoding="latin-1")
print(reviews_df.shape)
reviews_df.head()

(42656, 6)


Unnamed: 0,Review_ID,Rating,Year_Month,Reviewer_Location,Review_Text,Branch
0,670772142,4,2019-4,Australia,If you've ever been to Disneyland anywhere you...,Disneyland_HongKong
1,670682799,4,2019-5,Philippines,Its been a while since d last time we visit HK...,Disneyland_HongKong
2,670623270,4,2019-4,United Arab Emirates,Thanks God it wasn t too hot or too humid wh...,Disneyland_HongKong
3,670607911,4,2019-4,Australia,HK Disneyland is a great compact park. Unfortu...,Disneyland_HongKong
4,670607296,4,2019-4,United Kingdom,"the location is not in the city, took around 1...",Disneyland_HongKong


#### Quick data clean

In [4]:
reviews_df = reviews_df.dropna(subset=["Review_Text"]).copy()

reviews_df["Review_Text"] = reviews_df["Review_Text"].astype(str).str.strip()

reviews_df = reviews_df.reset_index(drop = True)

print(reviews_df.shape)
reviews_df.head()

(42656, 6)


Unnamed: 0,Review_ID,Rating,Year_Month,Reviewer_Location,Review_Text,Branch
0,670772142,4,2019-4,Australia,If you've ever been to Disneyland anywhere you...,Disneyland_HongKong
1,670682799,4,2019-5,Philippines,Its been a while since d last time we visit HK...,Disneyland_HongKong
2,670623270,4,2019-4,United Arab Emirates,Thanks God it wasn t too hot or too humid wh...,Disneyland_HongKong
3,670607911,4,2019-4,Australia,HK Disneyland is a great compact park. Unfortu...,Disneyland_HongKong
4,670607296,4,2019-4,United Kingdom,"the location is not in the city, took around 1...",Disneyland_HongKong


#### Creating Small Working Subset `sample_df`

In [5]:
sample_df = reviews_df.sample(20, random_state = 42).copy()

sample_df.reset_index(drop = True, inplace = True)

print(sample_df.shape)
sample_df.head()

(20, 6)


Unnamed: 0,Review_ID,Rating,Year_Month,Reviewer_Location,Review_Text,Branch
0,540713188,5,2017-9,Malta,Disneyland is so beautiful and large.To see al...,Disneyland_Paris
1,119781124,1,2011-10,Canada,"The lines for rides are too long. Yes, the fas...",Disneyland_California
2,576395715,5,2018-4,Australia,Loved Hong Kong Disneyland although it is much...,Disneyland_HongKong
3,310041955,5,2015-9,United States,Love Disneyland! We are annual pass holders an...,Disneyland_California
4,184009554,4,2013-11,United States,The California Adventure Park is much improved...,Disneyland_California


## Environment & API Setup

#### Setup of OpenAI Client

In [6]:
from openai import OpenAI
import os
from dotenv import load_dotenv

In [7]:
load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")

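if api_key:
    print("API key successfully loaded")
else:
    # Fail fast: OpenAI() cannot authenticate without a key
    raise RuntimeError("OPENAI_API_KEY not found; check your .env file.")

# OpenAI() reads OPENAI_API_KEY from the environment
client = OpenAI()
print("Client initialized successfully.")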

API key successfully loaded
Client initialized successfully.


#### Testing API Call

In [8]:
response = client.chat.completions.create(
    model = "gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise and friendly assistant."},
        {"role": "user", "content": "Summarize this sentence: Disneyland was way too crowded and everything was so expensive but the food was very tasty and the experience was magical."}
    ],
    temperature = 0
)

print(response.choices[0].message.content)

Disneyland was crowded and pricey, but the food was delicious and the experience magical.


## Triage Logic

#### Defining risk keywords

In [27]:
RISK_KEYWORDS = [
    "injury",
    "hurt",
    "discrimination",
    "unsafe",
    "refund",
    "charged twice",
    "lawsuit",
    "legal",
    "harassment",
    "fraud",
    "scam"
]

#### Creating a function to determine `rule_based_risk()`

In [28]:
def rule_based_risk(review_text):
    text = review_text.lower()
    for keyword in RISK_KEYWORDS:
        if keyword in text:
            return True, f"Keyword trigger: {keyword}"
    return False, None
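
The substring check above also fires when a keyword appears inside a longer word (for example, "scam" inside "scampi"). A minimal sketch of a stricter variant, assuming whole-word matching is acceptable, is shown here; `rule_based_risk_strict` and `RISK_PATTERN` are hypothetical names that are not part of the notebook.

```python
import re

# Same keywords as RISK_KEYWORDS above, repeated so the sketch is self-contained.
RISK_KEYWORDS = [
    "injury", "hurt", "discrimination", "unsafe", "refund", "charged twice",
    "lawsuit", "legal", "harassment", "fraud", "scam",
]

# One compiled pattern; \b anchors each keyword to word boundaries.
RISK_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in RISK_KEYWORDS) + r")\b",
    flags=re.IGNORECASE,
)

def rule_based_risk_strict(review_text):
    """Hypothetical variant of rule_based_risk() that matches whole words only."""
    match = RISK_PATTERN.search(review_text)
    if match:
        return True, f"Keyword trigger: {match.group(1).lower()}"
    return False, None

print(rule_based_risk_strict("The garlic scampi was excellent."))
print(rule_based_risk_strict("I was charged twice and want a REFUND."))
```

The trade-off is recall: whole-word matching skips inflected forms such as "refunded" or "hurts" unless they are added to the list, so the original substring check may still be preferable when missing an escalation is costlier than a false alarm.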

#### Building `analyze_review()` function

In [10]:
import json
import re

def analyze_review(review_text):
    rule_flag, rule_reason = rule_based_risk(review_text)
    prompt = f"""
    You are an AI customer support triage assistant.

    Analyze the following review and return ONLY valid JSON with the following fields:
    - summary (concise)
    - sentiment_label (positive, neutral, negative)
    - sentiment_score (number between -1 and 1)
    - model_risk_flag (true ONLY if the review includes safety concerns, legal threats, discrimination claims, fraud accusations, refund disputes, or harassment allegations. Otherwise false.)
    - model_risk_reason (short explanation if true, otherwise null)

    Review:
    \"\"\"{review_text}\"\"\"
    """

    response = client.chat.completions.create(
        model = "gpt-4o-mini",
        messages = [
        {"role": "system", "content": "You output strictly valid JSON only."},
        {"role": "user", "content": prompt}
        ],
        temperature = 0.2
    )

    content = response.choices[0].message.content

    # Remove markdown fences if present
    content = re.sub(r"```json|```", "", content).strip()

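    try:
        parsed = json.loads(content)
    except json.JSONDecodeError:
        # Show the raw reply to make malformed output easy to diagnose
        print("JSON parsing unsuccessful, raw output:")
        print(content)
        raise

    # Escalate if either the keyword rule or the model flags the review
    final_risk = rule_flag or parsed["model_risk_flag"]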

    return {
        "summary": parsed["summary"],
        "sentiment_label": parsed["sentiment_label"],
        "sentiment_score": parsed["sentiment_score"],
        "risk_flag": final_risk,
        "risk_reason": rule_reason if rule_flag else parsed["model_risk_reason"]
    }
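
`analyze_review()` trusts the model to return every field in the requested shape. A minimal sketch of a validation step that could sit between the API call and the return value follows; `parse_triage_json` and `REQUIRED_FIELDS` are hypothetical names, and clamping the score to [-1, 1] is an assumption based on the range stated in the prompt.

```python
import json
import re

REQUIRED_FIELDS = {
    "summary", "sentiment_label", "sentiment_score",
    "model_risk_flag", "model_risk_reason",
}

def parse_triage_json(content):
    """Hypothetical helper: parse and validate the model's JSON reply.

    Returns a dict on success, or None if the reply is not usable.
    """
    # Strip optional markdown code fences (three backticks, optionally tagged json)
    cleaned = re.sub(r"`{3}(?:json)?", "", content).strip()
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict) or not REQUIRED_FIELDS.issubset(parsed):
        return None
    # Clamp the score to the [-1, 1] range requested in the prompt.
    try:
        score = float(parsed["sentiment_score"])
    except (TypeError, ValueError):
        return None
    parsed["sentiment_score"] = max(-1.0, min(1.0, score))
    return parsed

fence = "`" * 3  # built programmatically to keep the example readable
raw = (
    fence + "json\n"
    '{"summary": "Long queues.", "sentiment_label": "negative", '
    '"sentiment_score": -1.4, "model_risk_flag": false, "model_risk_reason": null}'
    "\n" + fence
)
print(parse_triage_json(raw))
print(parse_triage_json("Sorry, I cannot help with that."))
```

Returning `None` instead of raising lets a batch loop skip or retry a single review without stopping the whole run.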


In [11]:
test_review = reviews_df.loc[0, "Review_Text"]

result = analyze_review(test_review)

result

{'summary': 'Disneyland Hong Kong has a familiar layout and enjoyable rides, despite being busy and hot.',
 'sentiment_label': 'positive',
 'sentiment_score': 0.7,
 'risk_flag': False,
 'risk_reason': None}

#### Batch test

In [12]:
from tqdm import tqdm

results = []

for review in tqdm(sample_df["Review_Text"]):
    result = analyze_review(review)
    results.append(result)

results[:3]

100%|███████████████████████████████████████████| 20/20 [00:36<00:00,  1.85s/it]


[{'summary': 'Positive experience at Disneyland, recommending a 3-day stay due to crowds.',
  'sentiment_label': 'positive',
  'sentiment_score': 0.9,
  'risk_flag': False,
  'risk_reason': None},
 {'summary': 'Long wait times for rides detract from the experience, leading to frustration.',
  'sentiment_label': 'negative',
  'sentiment_score': -0.7,
  'risk_flag': False,
  'risk_reason': None},
 {'summary': 'Positive experience at a smaller, less crowded Hong Kong Disneyland.',
  'sentiment_label': 'positive',
  'sentiment_score': 0.9,
  'risk_flag': False,
  'risk_reason': None}]

In [13]:
triage_df = pd.DataFrame(results)

triage_df.head()

Unnamed: 0,summary,sentiment_label,sentiment_score,risk_flag,risk_reason
0,"Positive experience at Disneyland, recommendin...",positive,0.9,False,
1,Long wait times for rides detract from the exp...,negative,-0.7,False,
2,"Positive experience at a smaller, less crowded...",positive,0.9,False,
3,"Positive experience at Disneyland, highlightin...",positive,0.9,False,
4,California Adventure Park has improved with Ca...,positive,0.6,False,


In [14]:
# Check whether any reviews in the sample were flagged
triage_df["risk_flag"].value_counts()

risk_flag
False    20
Name: count, dtype: int64

## FAQ Retrieval (Semantic Matching)

Because we do not have permission to use the internal knowledge base of the Disney Parks subsidiaries, this project uses a small hard-coded set of FAQs instead.

#### FAQ Corpus

In [15]:
faq_data = [
    {
        "question": "What is the refund policy for Disneyland tickets?",
        "answer": "Tickets are generally non-refundable unless purchased with flexible cancellation options."
    },
    {
        "question": "What are the height requirements for rides?",
        "answer": "Each ride has specific height requirements for safety. Guests should check signage before queuing."
    },
    {
        "question": "What should I do if I am double charged?",
        "answer": "Guests who believe they were double charged should contact customer service with their transaction details."
    },
    {
        "question": "What should I do if someone is injured in the park?",
        "answer": "Guests should immediately notify park staff or visit first aid stations located throughout the park."
    },
    {
        "question": "How long are wait times for rides?",
        "answer": "Wait times vary depending on season and demand. Guests can check the official app for live updates."
    }
]

In [16]:
faq_df = pd.DataFrame(faq_data)
faq_df

Unnamed: 0,question,answer
0,What is the refund policy for Disneyland tickets?,Tickets are generally non-refundable unless pu...
1,What are the height requirements for rides?,Each ride has specific height requirements for...
2,What should I do if I am double charged?,Guests who believe they were double charged sh...
3,What should I do if someone is injured in the ...,Guests should immediately notify park staff or...
4,How long are wait times for rides?,Wait times vary depending on season and demand...


In [17]:
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

In [18]:
faq_embeddings = embedding_model.encode(faq_df["question"].tolist())

In [19]:
def match_faq(review_text, threshold = 0.65):
    review_embedding = embedding_model.encode([review_text])

    similarities = cosine_similarity(review_embedding, faq_embeddings)[0]

    best_idx = similarities.argmax()
    best_score = similarities[best_idx]

    if best_score >= threshold:
        return {
            "faq_question": faq_df.iloc[best_idx]["question"],
            "faq_answer": faq_df.iloc[best_idx]["answer"],
            "faq_score": float(best_score)
        }

    else:
        return None
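
To make the threshold behaviour concrete, here is a small sketch of the same argmax-then-threshold logic in plain Python. The three-dimensional vectors are made up for illustration and stand in for the real sentence embeddings; `cosine` and `best_match` are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def best_match(query_vec, corpus_vecs, threshold=0.65):
    """Return (index, score) of the closest corpus vector, or None below threshold."""
    scores = [cosine(query_vec, v) for v in corpus_vecs]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    if scores[best_idx] >= threshold:
        return best_idx, scores[best_idx]
    return None

# Toy vectors (invented for illustration, not real embeddings)
faq_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(best_match([0.9, 0.1, 0.0], faq_vecs))  # close to FAQ 0, so a match is returned
print(best_match([0.5, 0.5, 0.7], faq_vecs))  # about 0.50 to both, so None
```

The same rule explains the debug table further down: the first sampled review's best similarity is about 0.41, which is below the 0.65 default, so `match_faq()` returns `None` for it.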

In [20]:
def debug_faq_match(review_text):
    review_embedding = embedding_model.encode([review_text])
    similarities = cosine_similarity(review_embedding, faq_embeddings)[0]

    debug_df = faq_df.copy()
    debug_df["similarity"] = similarities

    return debug_df.sort_values("similarity", ascending=False)

In [21]:
print(sample_df.Review_Text[0])
debug_faq_match(sample_df.loc[0, "Review_Text"])

Disneyland is so beautiful and large.To see all you need to stay there at least 3 days,as there is a lot of people and large quits, but it is worth it! Very good organized! We have the best holidays experience!


Unnamed: 0,question,answer,similarity
0,What is the refund policy for Disneyland tickets?,Tickets are generally non-refundable unless pu...,0.410492
4,How long are wait times for rides?,Wait times vary depending on season and demand...,0.277537
1,What are the height requirements for rides?,Each ride has specific height requirements for...,0.175846
3,What should I do if someone is injured in the ...,Guests should immediately notify park staff or...,0.144898
2,What should I do if I am double charged?,Guests who believe they were double charged sh...,-0.109987


## Response Generation

#### Building guardrailed `draft_response()` function

In [22]:
def draft_response(summary, sentiment_label, risk_flag, park_name, faq_match=None):
    base_prompt = f"""
    You are Morty F., a customer support agent for {park_name} drafting a professional 
    response to a Disneyland guest review.

    Context:
    - Review summary: {summary}
    - Sentiment: {sentiment_label}
    - Escalation required: {risk_flag}

    If escalation is required, respond empathetically and indicate that the issue will
    be reviewed by the appropriate team.

    If not, provide a polite and brand-safe response.

    Do NOT admit legal liability.
    Do NOT promise refunds unless policy is explicitly provided.
    """

    if faq_match:
        base_prompt += f"""
        Relevant policy context:
        {faq_match}
        Use this information if helpful, but do not quote it verbatim.
        """

    # Signature and output instructions apply whether or not an FAQ matched
    base_prompt += """
    End the response with:
    — Morty F., Guest Experience Team

    Return only the response text.
    """

    response = client.chat.completions.create(
        model = "gpt-4o",
        messages = [
            {"role": "system", "content": "You are a professional theme park support representative."},
            {"role": "user", "content": base_prompt}
        ],
        temperature = 0.5
    )

    return response.choices[0].message.content.strip()
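
The guardrails above live entirely in the prompt, and nothing in the notebook checks the generated text afterwards. One possible complement, sketched here under the assumption that a simple deny-list is adequate for a proof of concept, is a post-generation scan. `FORBIDDEN_PATTERNS` and `guardrail_violations` are hypothetical, and the patterns are illustrative rather than a vetted policy.

```python
import re

# Hypothetical deny-list; a real deployment would source these from legal and brand guidance.
FORBIDDEN_PATTERNS = [
    r"\bwe (?:are|were) (?:liable|at fault|responsible)\b",
    r"\bwe (?:will|shall) (?:refund|reimburse|compensate)\b",
    r"\bguarantee\b",
]

def guardrail_violations(response_text):
    """Return the deny-list patterns found in a drafted reply (empty list if clean)."""
    return [
        pattern for pattern in FORBIDDEN_PATTERNS
        if re.search(pattern, response_text, flags=re.IGNORECASE)
    ]

draft = "We are sorry to hear this. We will refund your tickets right away."
print(guardrail_violations(draft))
```

A non-empty result could route the draft to a human reviewer or trigger a regeneration instead of sending the reply as written.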
        

In [23]:
row = triage_df.iloc[0]

park_name = sample_df.loc[0, "Branch"] 

faq_result = match_faq(sample_df.loc[0, "Review_Text"])

response_text = draft_response(
    summary=row["summary"],
    sentiment_label=row["sentiment_label"],
    risk_flag=row["risk_flag"],
    park_name=park_name,
    faq_match=faq_result["faq_answer"] if faq_result else None
)

print(response_text)


Dear Valued Guest,

Thank you for taking the time to share your delightful experience at Disneyland Paris with us! We are thrilled to hear that you enjoyed your visit and appreciate your recommendation of a 3-day stay to fully immerse in the magic amidst the lively atmosphere. Your kind words serve as a wonderful encouragement to our team, who are dedicated to creating memorable moments for all our guests.

We hope to welcome you back soon for another enchanting adventure. Until then, have a magical day!

Warm regards,

Morty F.  
Disneyland Paris Customer Support


In [24]:
import random

# Pick random index
rand_idx = random.randint(0, len(reviews_df) - 1)

review_text = reviews_df.loc[rand_idx, "Review_Text"]
park_name = reviews_df.loc[rand_idx, "Branch"]  

print("=== RAW REVIEW ===")
print(review_text)
print("\n")

# Run triage analysis
analysis = analyze_review(review_text)

print("=== TRIAGE JSON OUTPUT ===")
print(analysis)
print("\n")

# FAQ match
faq_result = match_faq(review_text)

print("=== FAQ MATCH ===")
print(faq_result)
print("\n")

# Draft response
response_text = draft_response(
    summary=analysis["summary"],
    sentiment_label=analysis["sentiment_label"],
    risk_flag=analysis["risk_flag"],
    park_name=park_name,
    faq_match=faq_result["faq_answer"] if faq_result else None
)

print("=== DRAFT RESPONSE ===")
print(response_text)


=== RAW REVIEW ===
Loved the hotel, disliked that the hotel does not have it's own shuttle service. Smokers allowed to light up too close to the breakfast dining room entrance.


=== TRIAGE JSON OUTPUT ===
{'summary': 'Enjoyed the hotel but unhappy about the lack of shuttle service and smoking near the dining area.', 'sentiment_label': 'neutral', 'sentiment_score': 0.0, 'risk_flag': False, 'risk_reason': None}


=== FAQ MATCH ===
None


=== DRAFT RESPONSE ===
Dear Valued Guest,

Thank you for taking the time to share your feedback regarding your recent stay at Disneyland California. We're delighted to hear that you enjoyed your time at the hotel. Your comfort and satisfaction are very important to us.

We apologize for any inconvenience you experienced due to the lack of shuttle service and the presence of smoking near the dining area. We strive to maintain a comfortable and enjoyable environment for all our guests, and your comments are invaluable in helping us improve our services.



#### Packaging end-to-end orchestration function `process_review()`

In [25]:
def process_review(review_text, park_name):
    """
    End-to-end triage pipeline for a single guest review.

    This function orchestrates the full review processing workflow:
    1. Performs structured LLM-based analysis (summary, sentiment, risk detection).
    2. Retrieves relevant FAQ context using semantic similarity.
    3. Generates a guardrailed, brand-safe response tailored to the park.

    Parameters
    ----------
    review_text : str
        Raw guest review text.
    park_name : str
        Name of the Disneyland park associated with the review.

    Returns
    -------
    dict
        Structured triage output including:
        - summary
        - sentiment_label
        - sentiment_score
        - risk_flag
        - risk_reason
        - faq_question (if matched)
        - faq_score (if matched)
        - response (drafted reply text)
    """
    
    # Step 1: Structured triage analysis
    analysis = analyze_review(review_text)

    # Step 2: FAQ Search and retrieval
    faq_result = match_faq(review_text)

    faq_answer = faq_result["faq_answer"] if faq_result else None
    faq_question = faq_result["faq_question"] if faq_result else None
    faq_score = faq_result["faq_score"] if faq_result else None

    # Step 3: Draft response
    response_text = draft_response(
        summary=analysis["summary"],
        sentiment_label=analysis["sentiment_label"],
        risk_flag=analysis["risk_flag"],
        park_name=park_name,
        faq_match=faq_answer
    )

    # Step 4: Return structured object
    return {
        "summary": analysis["summary"],
        "sentiment_label": analysis["sentiment_label"],
        "sentiment_score": analysis["sentiment_score"],
        "risk_flag": analysis["risk_flag"],
        "risk_reason": analysis["risk_reason"],
        "faq_question": faq_question,
        "faq_score": faq_score,
        "response": response_text
    }
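
Because `process_review()` makes two API calls per review, a single failure (a network error or malformed JSON) would abort a loop over many reviews. A minimal sketch of a batch wrapper that isolates failures per row follows; `process_batch` is a hypothetical helper, and `fake_process_review` is a stand-in so the example runs without API access.

```python
def process_batch(rows, process_fn):
    """Apply process_fn(review_text, park_name) to each row, isolating failures.

    rows: iterable of dicts with "Review_Text" and "Branch" keys.
    Returns (results, errors); errors holds (row_index, message) pairs.
    """
    results, errors = [], []
    for i, row in enumerate(rows):
        try:
            output = process_fn(row["Review_Text"], row["Branch"])
        except Exception as exc:  # broad on purpose: one bad row should not stop the batch
            errors.append((i, f"{type(exc).__name__}: {exc}"))
            continue
        results.append({"row_index": i, **output})
    return results, errors

# Stand-in for process_review() so the sketch runs offline.
def fake_process_review(review_text, park_name):
    if not review_text:
        raise ValueError("empty review")
    return {"summary": review_text[:20], "risk_flag": False}

rows = [
    {"Review_Text": "Great park, long queues.", "Branch": "Disneyland_Paris"},
    {"Review_Text": "", "Branch": "Disneyland_HongKong"},
]
results, errors = process_batch(rows, fake_process_review)
print(len(results), errors)
```

With the notebook's data, `sample_df.to_dict("records")` would supply `rows` and `process_review` would replace the stand-in.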

#### Demonstration of end-to-end function; random review

In [26]:
import random
import json

rand_idx = random.randint(0, len(reviews_df) - 1)

review_text = reviews_df.loc[rand_idx, "Review_Text"]
park_name = reviews_df.loc[rand_idx, "Branch"]  

print("=== RAW REVIEW ===")
print(review_text)
print("\n")

result = process_review(review_text, park_name)

print("=== STRUCTURED OUTPUT ===")
print(json.dumps(result, indent=2))


=== RAW REVIEW ===
My first visit to Disneyland was 3 days after it opened in 1955. And now 57 years later it is still a class act, the original. Going around I can vividly remember past attractions and past rides as well as appreciate the newer ones. My favorites continue to be Matterhorn Bobsleds, Star Tours, Big Thunder Railroad, Pirates, Mark Twain, & Haunted Mansion. Now at holiday time Haunted Mansion is all done over in Tim Burton's  Nightmare Before Christmas  and it was outstanding. But my favorite attraction is Indiana Jones. It had been closed for 3 months and just reopened when we were there. Rode it twice; truly an excellent adventure. Try to get a dinning reservation for Blue Bayou or Cafe Orleans. Both are great fun spots to eat (though pricey of course).


=== STRUCTURED OUTPUT ===
{
  "summary": "The reviewer reflects positively on their long history with Disneyland, highlighting favorite attractions and dining experiences.",
  "sentiment_label": "positive",
  "sentime

## Conclusion

This prototype demonstrates a modular, end-to-end review triage workflow integrating LLM reasoning, hybrid rule-based safeguards, semantic FAQ grounding, and controlled response generation. In production, the FAQ corpus and escalation logic would be connected to internal knowledge systems and monitored for calibration.
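
As one illustration of what calibration monitoring could look like, the sketch below compares the model's sentiment label with the star rating already present in the dataset. The rating-to-label mapping is an assumption, and `rating_to_label` and `sentiment_agreement` are hypothetical names. The first five pairs mirror the sampled rows shown earlier; the sixth is invented to show a mismatch.

```python
from collections import Counter

def rating_to_label(rating):
    """Assumed mapping from a 1-5 star rating to an expected sentiment label."""
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"

def sentiment_agreement(ratings, predicted_labels):
    """Return (agreement_rate, Counter of (expected, predicted) disagreements)."""
    pairs = list(zip(ratings, predicted_labels))
    mismatches = Counter(
        (rating_to_label(r), p) for r, p in pairs if rating_to_label(r) != p
    )
    agreement = 1 - sum(mismatches.values()) / len(pairs)
    return agreement, mismatches

ratings = [5, 1, 5, 5, 4, 3]
labels = ["positive", "negative", "positive", "positive", "positive", "positive"]
rate, mismatches = sentiment_agreement(ratings, labels)
print(round(rate, 2), dict(mismatches))
```

A falling agreement rate over time would be a signal to review the prompt or the model version, though star ratings are only a rough proxy for the sentiment of the review text.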
