# Customer Support Dispute Intake Assistant

This notebook demonstrates **short-term memory management** for multi-turn conversations in a customer support context.

## Key Concepts Demonstrated

1. **Conversation History Management**: Last-k message window to prevent context overflow
2. **Sliding Session Summary**: Running 3-5 bullet summary updated after each turn
3. **State vs Memory Distinction**: Captured fields (state) vs conversation facts (memory)
4. **Memory-Driven Consistency**: Avoiding re-asking for known information

## Scenario
Customer support intake for card transaction disputes: the customer describes the issue over several turns, and the assistant must retain earlier details so that it does not ask for them again.

In [1]:
import os
import json
from datetime import datetime
from typing import Dict, List, Optional, Literal
from dataclasses import dataclass, asdict
from openai import OpenAI

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Initialize OpenAI client
client = OpenAI(
    base_url="https://openai.vocareum.com/v1",
    api_key=os.getenv("OPENAI_API_KEY")
)


In [2]:
@dataclass
class DisputeData:
    """Data model for transaction dispute information"""
    card_last4: Optional[str] = None
    merchant: Optional[str] = None
    amount: Optional[float] = None
    currency: str = "USD"
    txn_date: Optional[str] = None  # ISO date format
    channel: Optional[Literal["IN_STORE", "ONLINE", "IN_APP", "CONTACTLESS", "OTHER"]] = None
    claim_reason: Optional[Literal["NOT_RECOGNIZED", "CHARGED_TWICE", "NOT_RECEIVED", "NOT_AS_DESCRIBED", "FRAUD"]] = None
    additional_notes: Optional[str] = None
    
    def get_missing_fields(self) -> List[str]:
        """Return list of fields that still need to be collected"""
        required_fields = ["card_last4", "merchant", "amount", "txn_date", "channel", "claim_reason"]
        missing = []
        for field in required_fields:
            if getattr(self, field) is None:
                missing.append(field)
        return missing
    
    def get_captured_fields(self) -> Dict:
        """Return dictionary of fields that have been captured"""
        captured = {}
        for field, value in asdict(self).items():
            if value is not None:
                captured[field] = value
        return captured
    
    def is_complete(self) -> bool:
        """Check if all required fields have been collected"""
        return len(self.get_missing_fields()) == 0

In [3]:
@dataclass
class ConversationTurn:
    """Represents a single turn in the conversation"""
    user_input: str
    assistant_response: str
    timestamp: str
    
class DisputeIntakeAssistant:
    """Customer support assistant with short-term memory management"""
    
    def __init__(self, max_history_turns: int = 6):
        self.dispute_data = DisputeData()
        self.conversation_history: List[ConversationTurn] = []
        self.session_summary: List[str] = []
        self.max_history_turns = max_history_turns
        
    def add_conversation_turn(self, user_input: str, assistant_response: str):
        """Add a turn to conversation history with sliding window management"""
        turn = ConversationTurn(
            user_input=user_input,
            assistant_response=assistant_response,
            timestamp=datetime.now().isoformat()
        )
        
        self.conversation_history.append(turn)
        
        # Implement sliding window: keep only last k turns
        if len(self.conversation_history) > self.max_history_turns:
            self.conversation_history = self.conversation_history[-self.max_history_turns:]
    
    def update_session_summary(self, user_input: str):
        """Update the running session summary with new information"""
        summary_prompt = f"""
Current session summary (3-5 key facts):
{chr(10).join('- ' + fact for fact in self.session_summary) if self.session_summary else 'None yet'}

Latest user input: "{user_input}"

Update the session summary to include any new key facts from the user input.
Keep it to 3-5 concise bullet points focusing on:
- Transaction details mentioned
- Customer situation/context
- Dispute reasoning
- Any emotional context or urgency

Return ONLY the updated bullet points, one per line, without bullet symbols.
"""
        
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": summary_prompt}],
                temperature=0.1,
                max_tokens=200
            )
            
            # Parse response into bullet points
            new_summary = [
                line.strip() 
                for line in response.choices[0].message.content.strip().split('\n') 
                if line.strip()
            ]
            
            # Cap the summary at 5 lines (the prompt asks for 3-5); keep the last 5 if the model returns more
            self.session_summary = new_summary[-5:]
            
        except Exception as e:
            print(f"Error updating session summary: {e}")
    
    def extract_dispute_info(self, user_input: str) -> bool:
        """Extract structured dispute information from user input"""
        extraction_prompt = f"""
Extract transaction dispute information from this user input: "{user_input}"

Current captured data:
{json.dumps(self.dispute_data.get_captured_fields(), indent=2)}

Extract and return ONLY new information in this exact JSON format. You must respond with valid JSON only, no other text:

{{
    "card_last4": "four digits or null",
    "merchant": "merchant name or null", 
    "amount": number or null,
    "currency": "currency code or null",
    "txn_date": "YYYY-MM-DD or null",
    "channel": "IN_STORE|ONLINE|IN_APP|CONTACTLESS|OTHER or null",
    "claim_reason": "NOT_RECOGNIZED|CHARGED_TWICE|NOT_RECEIVED|NOT_AS_DESCRIBED|FRAUD or null",
    "additional_notes": "any extra details or null"
}}

Rules:
- Only include fields with new/updated information
- Use null for fields not mentioned or unclear
- For dates, convert to YYYY-MM-DD format; if no year is given, assume {datetime.now().year} (e.g., "July 2nd" becomes "{datetime.now().year}-07-02")
- For amounts, extract numeric value only
- Match channel to: IN_STORE, ONLINE, IN_APP, CONTACTLESS, OTHER
- Match claim_reason to: NOT_RECOGNIZED, CHARGED_TWICE, NOT_RECEIVED, NOT_AS_DESCRIBED, FRAUD
- Return valid JSON only, no explanations or additional text
"""
        
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": extraction_prompt}],
                temperature=0.1,
                max_tokens=300
            )
            
            response_content = response.choices[0].message.content.strip()
            
            # Clean up the response to extract JSON
            if "```json" in response_content:
                response_content = response_content.split("```json")[1].split("```")[0].strip()
            elif "```" in response_content:
                response_content = response_content.split("```")[1].strip()
            
            # Parse JSON response
            extracted_data = json.loads(response_content)
            
            # Update dispute data with extracted information
            updated = False
            for field, value in extracted_data.items():
                if value is not None and value != "null" and hasattr(self.dispute_data, field):
                    current_value = getattr(self.dispute_data, field)
                    if current_value is None or current_value != value:
                        setattr(self.dispute_data, field, value)
                        updated = True
            
            return updated
            
        except json.JSONDecodeError as e:
            print(f"JSON parsing error: {e}")
            print(f"Raw response: {response.choices[0].message.content}")
            return False
        except Exception as e:
            print(f"Error extracting dispute info: {e}")
            return False
    
    def generate_response(self, user_input: str) -> str:
        """Generate contextual response using memory and current state"""
        
        # Build context from memory
        memory_context = ""
        if self.session_summary:
            memory_context = f"""
Session Summary (key facts remembered):
{chr(10).join('- ' + fact for fact in self.session_summary)}
"""
        
        # Build recent conversation context
        conversation_context = ""
        if self.conversation_history:
            conversation_context = "Recent conversation:\n"
            for turn in self.conversation_history[-3:]:  # Last 3 turns for context
                conversation_context += f"User: {turn.user_input}\nAssistant: {turn.assistant_response}\n\n"
        
        captured_fields = self.dispute_data.get_captured_fields()
        missing_fields = self.dispute_data.get_missing_fields()
        
        response_prompt = f"""
You are a professional customer support assistant helping with a card transaction dispute intake.

{memory_context}

{conversation_context}

Current captured information:
{json.dumps(captured_fields, indent=2) if captured_fields else 'None yet'}

Still needed: {', '.join(missing_fields) if missing_fields else 'All required info collected'}

User just said: "{user_input}"

Guidelines:
1. Be professional, empathetic, and efficient
2. Based on session summary, acknowledge what you already know about their situation
3. Never re-ask for information you already have (check captured fields carefully)
4. If missing fields exist, ask for ONE missing field at a time to avoid overwhelming the customer
5. Prioritize asking for: card_last4, merchant, amount, txn_date, channel, claim_reason
6. If you can infer information from the session summary, acknowledge it and ask for confirmation
7. If all info is collected, offer to generate the dispute summary
8. Keep responses concise (2-3 sentences max)

Examples of what NOT to do:
- Don't ask for merchant if they already mentioned "Uber"
- Don't ask for amount if they already said "$15.50"
- Don't ask for date if they said "July 2nd"
- Don't ask for channel if they said "Uber app"

Respond naturally as a customer support agent:
"""
        
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": response_prompt}],
                temperature=0.3,
                max_tokens=200
            )
            
            return response.choices[0].message.content.strip()
            
        except Exception as e:
            print(f"Error generating response: {e}")
            return "I apologize, but I'm experiencing technical difficulties. Could you please repeat your request?"
    
    def generate_final_summary(self) -> str:
        """Generate final dispute summary for human agent handoff"""
        summary_prompt = f"""
Create a concise dispute summary for a human agent based on this information:

Captured Data:
{json.dumps(self.dispute_data.get_captured_fields(), indent=2)}

Session Summary:
{chr(10).join('- ' + fact for fact in self.session_summary)}

Generate a professional dispute summary in exactly this format:

TRANSACTION DISPUTE SUMMARY
Card: ****{self.dispute_data.card_last4 or 'XXXX'}
Merchant: {self.dispute_data.merchant or 'Unknown'}
Amount: ${self.dispute_data.amount or 'Unknown'} {self.dispute_data.currency}
Date: {self.dispute_data.txn_date or 'Unknown'}
Channel: {self.dispute_data.channel or 'Unknown'}
Claim: {self.dispute_data.claim_reason or 'Unknown'}
Notes: {self.dispute_data.additional_notes or 'None provided'}

Keep it under 8 short lines total.
"""
        
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": summary_prompt}],
                temperature=0.1,
                max_tokens=300
            )
            
            return response.choices[0].message.content.strip()
            
        except Exception as e:
            print(f"Error generating final summary: {e}")
            return "Error generating summary. Please review captured data manually."
    
    def process_user_input(self, user_input: str) -> str:
        """Main processing function that coordinates memory, extraction, and response"""
        
        # 1. Update session summary with new information
        self.update_session_summary(user_input)
        
        # 2. Extract structured information
        self.extract_dispute_info(user_input)
        
        # 3. Generate contextual response
        assistant_response = self.generate_response(user_input)
        
        # 4. Add to conversation history (sliding window)
        self.add_conversation_turn(user_input, assistant_response)
        
        return assistant_response

In [4]:
# Quick test of the extraction method
test_assistant = DisputeIntakeAssistant(max_history_turns=4)

# Test with a simple input
test_input = "Hi, I need help with a dispute. I was charged twice by Uber on July 2nd."
print("Testing extraction with:", test_input)
print()

# Test the extraction
result = test_assistant.extract_dispute_info(test_input)
print(f"Extraction successful: {result}")
print(f"Captured fields: {test_assistant.dispute_data.get_captured_fields()}")

Testing extraction with: Hi, I need help with a dispute. I was charged twice by Uber on July 2nd.

Extraction successful: True
Captured fields: {'merchant': 'Uber', 'currency': 'USD', 'txn_date': '2023-07-02', 'claim_reason': 'CHARGED_TWICE'}
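
The extraction step above assigns whatever the model returns directly to `DisputeData`, so a malformed value (for example a three-digit `card_last4` or an unknown channel) would be stored as-is. A minimal sketch of a validation pass that could run before the `setattr` loop is shown here. The helper name `validate_extracted` and the specific checks are assumptions for illustration, not part of the notebook.

```python
from datetime import date
from typing import Any, Dict

# Allowed values copied from the DisputeData type hints
ALLOWED_CHANNELS = {"IN_STORE", "ONLINE", "IN_APP", "CONTACTLESS", "OTHER"}
ALLOWED_REASONS = {"NOT_RECOGNIZED", "CHARGED_TWICE", "NOT_RECEIVED", "NOT_AS_DESCRIBED", "FRAUD"}


def validate_extracted(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: keep only fields whose values pass basic checks."""
    clean: Dict[str, Any] = {}
    for field, value in raw.items():
        if value is None or value == "null":
            continue  # nothing extracted for this field
        if field == "card_last4":
            text = str(value)
            if len(text) == 4 and text.isascii() and text.isdigit():
                clean[field] = text
        elif field == "amount":
            try:
                amount = float(value)
            except (TypeError, ValueError):
                continue
            if amount > 0:
                clean[field] = amount
        elif field == "txn_date":
            try:
                date.fromisoformat(str(value))  # rejects impossible dates
            except ValueError:
                continue
            clean[field] = str(value)
        elif field == "channel":
            if isinstance(value, str) and value in ALLOWED_CHANNELS:
                clean[field] = value
        elif field == "claim_reason":
            if isinstance(value, str) and value in ALLOWED_REASONS:
                clean[field] = value
        elif field in {"merchant", "currency", "additional_notes"}:
            text = str(value).strip()
            if text:
                clean[field] = text
    return clean
```

In `extract_dispute_info`, the loop over `extracted_data.items()` could iterate over `validate_extracted(extracted_data).items()` instead, so rejected values leave the corresponding field in the missing list and the assistant asks again.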


## Demo: Multi-Turn Dispute Intake Conversation

Let's simulate a customer conversation that demonstrates:
1. **Memory persistence** across multiple turns
2. **Avoiding repetition** of already-captured information  
3. **Sliding window management** when conversation gets long
4. **Consistent responses** based on session summary

In [5]:
# Initialize the assistant
assistant = DisputeIntakeAssistant(max_history_turns=4)  # Small window to demo sliding

def simulate_conversation():
    """Simulate a multi-turn customer conversation"""
    
    # Conversation turns to demonstrate memory management
    conversation_turns = [
        "Hi, I need help with a dispute. I was charged twice by Uber on July 2nd.",
        "It was $15.50 and I only took one ride. My card ends in 4567.",
        "It was through the Uber app on my phone.",
        "I definitely didn't authorize two charges. I can see both transactions on my statement.",
        "Actually, let me check... the first charge was $15.50 and the second was $15.75. Still wrong though.",
        "This is really frustrating. I've been a customer for years and this has never happened.",
        "Yes, I want to dispute both charges since I only took one ride.",
        "Can you help me file this dispute now? I need this resolved quickly."
    ]
    
    print("🎭 CUSTOMER SUPPORT DISPUTE INTAKE SIMULATION")
    print("=" * 60)
    print()
    
    for i, user_input in enumerate(conversation_turns, 1):
        print(f"👤 CUSTOMER (Turn {i}): {user_input}")
        
        # Process input and get response
        response = assistant.process_user_input(user_input)
        print(f"🎧 ASSISTANT: {response}")
        print()
        
        # Show memory state after every second turn
        if i % 2 == 0:
            print("🧠 MEMORY STATE:")
            print(f"   📝 Session Summary: {assistant.session_summary}")
            print(f"   💾 Captured Fields: {list(assistant.dispute_data.get_captured_fields().keys())}")
            print(f"   ❓ Missing Fields: {assistant.dispute_data.get_missing_fields()}")
            print(f"   📚 History Length: {len(assistant.conversation_history)} turns")
            print()
        
        # Generate final summary when ready
        if assistant.dispute_data.is_complete():
            print("✅ ALL INFORMATION COLLECTED!")
            print("📋 FINAL DISPUTE SUMMARY:")
            print("-" * 40)
            final_summary = assistant.generate_final_summary()
            print(final_summary)
            break
    
    return assistant

# Run the simulation
demo_assistant = simulate_conversation()

🎭 CUSTOMER SUPPORT DISPUTE INTAKE SIMULATION

👤 CUSTOMER (Turn 1): Hi, I need help with a dispute. I was charged twice by Uber on July 2nd.
🎧 ASSISTANT: Thank you for reaching out about the duplicate charges from Uber on July 2nd. I understand how frustrating this situation can be, and I'm here to help you resolve it. Could you please provide the last four digits of the card you used for the transaction?

👤 CUSTOMER (Turn 2): It was $15.50 and I only took one ride. My card ends in 4567.
🎧 ASSISTANT: Thank you for confirming the details about your Uber charges. I see that you were charged $15.50 on July 2nd for a ride you only took once, and your card ends in 4567. Could you please let me know the channel you used for this transaction (e.g., Uber app, website)?

🧠 MEMORY STATE:
   📝 Session Summary: ['User was charged twice by Uber on July 2nd for a total of $15.50.', 'User only took one ride, indicating a duplicate charge.', "User's card ends in 4567, which may be 

## Memory Management Analysis

Let's examine how the assistant managed memory throughout the conversation:

In [6]:
print("🔍 DETAILED MEMORY ANALYSIS")
print("=" * 50)
print()

print("1️⃣ FINAL SESSION SUMMARY:")
for i, fact in enumerate(demo_assistant.session_summary, 1):
    print(f"   {i}. {fact}")
print()

print("2️⃣ CONVERSATION HISTORY (Sliding Window):")
for i, turn in enumerate(demo_assistant.conversation_history, 1):
    print(f"   Turn {i}:")
    print(f"      👤 User: {turn.user_input}")
    print(f"      🎧 Assistant: {turn.assistant_response}")
    print()

print("3️⃣ EXTRACTED STRUCTURED DATA:")
final_data = demo_assistant.dispute_data.get_captured_fields()
for field, value in final_data.items():
    print(f"   {field}: {value}")
print()

print("4️⃣ MEMORY EFFECTIVENESS METRICS:")
# conversation_history is capped at max_history_turns, so this counts retained turns, not all turns processed
print(f"   📊 Turns retained in window: {len(demo_assistant.conversation_history)}")
print(f"   🎯 Fields captured: {len(final_data)}/8 possible")
print(f"   🧠 Summary facts: {len(demo_assistant.session_summary)}")
print(f"   ✅ Complete intake: {demo_assistant.dispute_data.is_complete()}")

🔍 DETAILED MEMORY ANALYSIS

1️⃣ FINAL SESSION SUMMARY:
   1. User was charged twice by Uber on July 2nd for a total of $15.50.
   2. User only took one ride, indicating a duplicate charge.
   3. User's card ends in 4567, which may be relevant for dispute resolution.
   4. User is seeking help with a financial discrepancy involving Uber through the app on their phone.
   5. There is a sense of urgency to resolve the issue promptly.

2️⃣ CONVERSATION HISTORY (Sliding Window):
   Turn 1:
      👤 User: Hi, I need help with a dispute. I was charged twice by Uber on July 2nd.
      🎧 Assistant: Thank you for reaching out about the duplicate charges from Uber on July 2nd. I understand how frustrating this situation can be, and I'm here to help you resolve it. Could you please provide the last four digits of the card you used for the transaction?

   Turn 2:
      👤 User: It was $15.50 and I only took one ride. My card ends in 4567.
      🎧 Assistant: Thank you for confi

## Key Learning Points Demonstrated

### 1. **Short-Term Memory Components**
- **Conversation History**: Last-k turn window (configurable; the demo stores k=4 turns, and the response prompt includes only the 3 most recent)
- **Session Summary**: 3-5 bullet points of key facts, refreshed from each new user message
- **Structured State**: Captured dispute fields separate from conversational memory
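
The last-k window can also be expressed with `collections.deque`, which drops the oldest entry automatically once `maxlen` is reached. The sketch below is an alternative to the list slicing used in `add_conversation_turn`, not the notebook's implementation; the `make_history` helper and the sample messages are made up for illustration.

```python
from collections import deque
from typing import Deque, Tuple


def make_history(k: int) -> Deque[Tuple[str, str]]:
    """Hypothetical helper: a history buffer that keeps only the last k turns."""
    return deque(maxlen=k)


history = make_history(4)
for i in range(1, 7):
    # Each entry is a (user_input, assistant_response) pair
    history.append((f"user message {i}", f"assistant reply {i}"))

# Turns 1 and 2 have been evicted; turns 3-6 remain, oldest first
retained = [user for user, _ in history]
```

With `maxlen` set, no explicit length check is needed after each append, which removes one place where the window logic could drift from the configured size.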

### 2. **Memory vs State Distinction**
- **State**: Structured data fields (card_last4, merchant, amount, etc.)
- **Memory**: Conversational facts and context preserved across turns
- **Integration**: Memory informs responses while state tracks progress

### 3. **Sliding Window Benefits**
- **Context Limits**: Caps the number of turns sent to the model, which limits prompt growth in long conversations
- **Relevancy**: Recent turns + summary maintain continuity
- **Performance**: Faster processing with bounded history
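
A turn-count window bounds how many turns are kept, but not how long each turn is, so a few very long messages could still produce a large prompt. One way to address this is to trim by an approximate token budget instead. The sketch below assumes a rough heuristic of 4 characters per token, which is only an approximation and would be replaced by a real tokenizer in practice; the function name `trim_to_budget` is hypothetical.

```python
from typing import List, Tuple


def trim_to_budget(
    turns: List[Tuple[str, str]],
    max_tokens: int,
    chars_per_token: int = 4,  # assumption: crude estimate, not a real tokenizer
) -> List[Tuple[str, str]]:
    """Keep the most recent turns whose estimated token cost fits the budget."""
    kept: List[Tuple[str, str]] = []
    used = 0
    # Walk from newest to oldest so recent context is preferred
    for user, assistant in reversed(turns):
        cost = (len(user) + len(assistant)) // chars_per_token + 1
        if used + cost > max_tokens:
            break
        kept.append((user, assistant))
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

The session summary still carries facts from turns that fall outside the budget, so trimming here affects only the verbatim history.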

### 4. **Consistency Through Memory**
- **No Re-asking**: The prompt lists the captured fields and instructs the model not to ask for them again
- **Context Awareness**: References previous details naturally
- **Progressive Collection**: Builds understanding over multiple turns
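
The notebook relies on prompt guidelines to make the model ask for one missing field at a time in a fixed priority order. If stricter guarantees are wanted, the next question can be chosen in code from the output of `get_missing_fields()`. This is a sketch under that assumption; `FIELD_PRIORITY` mirrors the order given in the prompt, while `QUESTIONS` and `next_question` are hypothetical names with illustrative wording.

```python
from typing import Dict, List, Optional

# Same order as guideline 5 in the response prompt
FIELD_PRIORITY: List[str] = [
    "card_last4", "merchant", "amount", "txn_date", "channel", "claim_reason",
]

# Illustrative question wording, not taken from the notebook
QUESTIONS: Dict[str, str] = {
    "card_last4": "What are the last four digits of the card?",
    "merchant": "Which merchant is the charge from?",
    "amount": "What was the amount of the charge?",
    "txn_date": "On what date did the transaction occur?",
    "channel": "How was the purchase made (in store, online, in app, contactless)?",
    "claim_reason": "What is the reason for the dispute?",
}


def next_question(missing_fields: List[str]) -> Optional[str]:
    """Return the question for the highest-priority missing field, or None."""
    for field in FIELD_PRIORITY:
        if field in missing_fields:
            return QUESTIONS[field]
    return None  # nothing left to collect
```

The selected question could be passed to the model as the single thing to ask, which leaves the phrasing to the model while the choice of field stays deterministic.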

### 5. **Production Considerations**
- **Persistence**: In production, store memory in databases/sessions
- **Tuning**: Adjust window size based on context limits and use case
- **Recovery**: Handle memory corruption and session restoration
- **Privacy**: Clear memory appropriately for data protection
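
As a rough illustration of the persistence and recovery points, session memory can be serialized to JSON and restored with a fallback to an empty session when the stored payload is unreadable. The `SessionSnapshot` class and both functions are hypothetical; a real deployment would write the string to a database or session store rather than keep it in a variable.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class SessionSnapshot:
    """Hypothetical container for the three memory components."""
    captured_fields: Dict[str, Any] = field(default_factory=dict)
    session_summary: List[str] = field(default_factory=list)
    history: List[Dict[str, str]] = field(default_factory=list)


def save_snapshot(snapshot: SessionSnapshot) -> str:
    """Serialize the snapshot to a JSON string."""
    return json.dumps(asdict(snapshot))


def load_snapshot(payload: str) -> SessionSnapshot:
    """Restore a snapshot; fall back to an empty session if the payload is corrupt."""
    try:
        return SessionSnapshot(**json.loads(payload))
    except (json.JSONDecodeError, TypeError):
        # Invalid JSON, a non-object payload, or unexpected keys
        return SessionSnapshot()


snapshot = SessionSnapshot(
    captured_fields={"merchant": "Uber", "card_last4": "4567"},
    session_summary=["User was charged twice by Uber."],
    history=[{"user": "I was charged twice.", "assistant": "Which card was used?"}],
)
restored = load_snapshot(save_snapshot(snapshot))
```

Falling back to an empty session means the assistant would re-collect the fields after a corrupted restore, which may be preferable to continuing with partially trusted data.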