# 💬 Chat Mode Testing Notebook

This notebook tests the new **Chat Mode** feature that enables conversational Q&A
without scoring/analysis.

**Key Differences from Analysis Mode:**
- No score is generated (`score = None`)
- No categories are identified (`categories = []`)
- `result_type = "chat"` instead of `"analysis"`
- Validation is skipped (no score to validate); the result is reported as `PASS` with details `Chat mode - validation skipped`
- UI displays as chat bubble instead of score card
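
The differences above can be checked mechanically. The following is a minimal sketch, not project code: `check_chat_invariants` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the results the Processor returns later in this notebook.

```python
from types import SimpleNamespace


def check_chat_invariants(result) -> list[str]:
    """Return a list of violated chat-mode invariants (empty list means OK)."""
    problems = []
    if result.score is not None:
        problems.append(f"score should be None, got {result.score!r}")
    if list(result.categories) != []:
        problems.append(f"categories should be empty, got {result.categories!r}")
    if result.result_type != "chat":
        problems.append(f"result_type should be 'chat', got {result.result_type!r}")
    return problems


# Stand-in objects for illustration only (hypothetical values).
chat_like = SimpleNamespace(score=None, categories=[], result_type="chat")
analysis_like = SimpleNamespace(score=72, categories=["AML"], result_type="analysis")

print(check_chat_invariants(chat_like))           # []
print(len(check_chat_invariants(analysis_like)))  # 3
```

Any object exposing `score`, `categories`, and `result_type` attributes can be passed in, so the same check should apply to a stored result row.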

**Structure:**
1. Setup & Imports
2. Test LLM Service in Chat Mode
3. Test Processor in Chat Mode
4. Compare Analysis vs Chat Mode
5. Test Edge Cases
6. View System Prompts
7. Summary


## 1. Setup & Imports


In [1]:
# Add project root to path
import sys
from pathlib import Path

PROJECT_ROOT = Path().absolute().parent
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

print(f"Project root: {PROJECT_ROOT}")

Project root: c:\Users\zamko\Documents\vlzm\kyc-analyzer


In [2]:
# Set environment variables BEFORE imports
import os

# Required for local development
os.environ.setdefault("ENV", "LOCAL")
os.environ.setdefault("LLM_PROVIDER", "openai")  # or "ollama", "azure", "anthropic"
os.environ.setdefault("OLLAMA_BASE_URL", "http://localhost:11434")
os.environ.setdefault("OLLAMA_MODEL", "llama3.2")

# Database settings (matches docker-compose.yml)
# Run: docker compose up -d db
os.environ["DATABASE_HOST"] = "localhost"
os.environ["DATABASE_PORT"] = "5432"
os.environ["DATABASE_NAME"] = "app_db"
os.environ["DATABASE_USER"] = "postgres"
os.environ["DATABASE_PASSWORD"] = "localdevpassword123"

print("Environment configured:")
print(f"  ENV: {os.environ['ENV']}")
print(f"  LLM_PROVIDER: {os.environ['LLM_PROVIDER']}")
print(f"  DATABASE: {os.environ['DATABASE_NAME']} @ {os.environ['DATABASE_HOST']}:{os.environ['DATABASE_PORT']}")

Environment configured:
  ENV: LOCAL
  LLM_PROVIDER: openai
  DATABASE: app_db @ localhost:5432


In [3]:
# Import core modules
from app.database import init_db, get_session
from app.models import Request, RequestCreate, AnalysisResult
from app.services.llm_service import get_llm_service
from app.services.processor import Processor
from app.services.auth_mock import get_current_user
from app.services.llm.base import LLMResponse, CHAT_SYSTEM_PROMPT, DEFAULT_SYSTEM_PROMPT

print("All imports successful!")

All imports successful!


In [None]:
# # Migration: Add result_type column if it doesn't exist
# # Run this once to update the database schema for chat mode support
# from sqlmodel import text

# with get_session() as session:
#     try:
#         session.exec(text("""
#             ALTER TABLE analysis_results 
#             ADD COLUMN IF NOT EXISTS result_type VARCHAR(50) DEFAULT 'analysis'
#         """))
#         session.commit()
#         print("✅ Migration complete: 'result_type' column added (or already exists)")
#     except Exception as e:
#         print(f"⚠️ Migration note: {e}")

In [11]:
# Drop and recreate all tables. WARNING: this deletes all existing data;
# run it only when you need to reset the database schema.
from app.database import get_engine
from sqlmodel import SQLModel

engine = get_engine()

# Drop and recreate all tables
SQLModel.metadata.drop_all(engine)
SQLModel.metadata.create_all(engine)
print("✅ Tables recreated!")

✅ Tables recreated!


In [12]:
# Initialize database
init_db()
print("Database initialized!")

Database initialized!


## 2. Test LLM Service in Chat Mode

Test the LLM service directly with `mode="chat"`.

In [13]:
# Get LLM service
llm_service = get_llm_service()
print(f"LLM Provider: {llm_service.provider_name}")
print(f"Model Version: {llm_service.get_model_version()}")

LLM Provider: openai
Model Version: openai/gpt-5.2


In [14]:
# Test simple chat mode (without tools)
chat_response = llm_service.analyze(
    input_text="What is the capital of France?",
    mode="chat"
)

print("=" * 50)
print("CHAT MODE RESPONSE (Simple)")
print("=" * 50)
print(f"Score: {chat_response.score}")  # Should be None
print(f"Categories: {chat_response.categories}")  # Should be []
print(f"Mode: {chat_response.mode}")
print(f"\nResponse:\n{chat_response.reasoning}")

CHAT MODE RESPONSE (Simple)
Score: None
Categories: []
Mode: chat

Response:
The capital of France is Paris.


In [15]:
# Test chat mode with tools (agent mode)
chat_with_tools = llm_service.analyze_with_tools(
    input_text="What time is it now? Also, what's the weather like today?",
    mode="chat"
)

print("=" * 50)
print("CHAT MODE RESPONSE (With Tools)")
print("=" * 50)
print(f"Score: {chat_with_tools.score}")  # Should be None
print(f"Categories: {chat_with_tools.categories}")  # Should be []
print(f"Mode: {chat_with_tools.mode}")
print(f"Tools Used: {chat_with_tools.tools_used}")
print(f"\nResponse:\n{chat_with_tools.reasoning}")

if chat_with_tools.trace:
    print(f"\nTrace Mode: {chat_with_tools.trace.get('mode')}")

CHAT MODE RESPONSE (With Tools)
Score: None
Categories: []
Mode: chat
Tools Used: ['get_current_time']

Response:
Current time (UTC): 20:54:03 on Friday, 2026-01-09.

Weather today: I can’t determine the weather because I don’t have your location and there’s no weather tool/API available in this environment. If you tell me your city (and country) or ZIP/postal code, I can help you figure out what to check and how (e.g., the quickest sources/apps).

Trace Mode: agent_chat


## 3. Test Processor in Chat Mode

Test the full pipeline through the Processor with `mode="chat"`.

In [16]:
# Get a test user
test_user = get_current_user("analyst_a")
print(f"Using user: {test_user.username} (role: {test_user.role.value})")

Using user: Carol Analyst (Group A) (role: analyst)


In [17]:
# Test chat mode through processor
with get_session() as session:
    processor = Processor(session, user=test_user)
    
    # Create a chat request
    request_data = RequestCreate(
        input_text="Hello! Can you explain what machine learning is in simple terms?",
        context="I'm a beginner with no technical background.",
        group="group_a"
    )
    
    # Process in CHAT mode
    request, result = processor.process_request(request_data, mode="chat")

print("=" * 50)
print("CHAT MODE - FULL PIPELINE RESULT")
print("=" * 50)
print(f"Request ID: {request.id}")
print(f"Result ID: {result.id}")
print(f"Result Type: {result.result_type}")
print(f"Score: {result.score}")  # Should be None
print(f"Categories: {result.categories}")  # Should be []
print(f"Validation Status: {result.validation_status}")
print(f"Validation Details: {result.validation_details}")
print(f"\nAI Response:\n{result.summary}")

CHAT MODE - FULL PIPELINE RESULT
Request ID: 1
Result ID: 1
Result Type: chat
Score: None
Categories: []
Validation Status: PASS
Validation Details: Chat mode - validation skipped

AI Response:
Machine learning is a way to make computers learn from examples instead of being explicitly programmed with every rule.

### A simple way to think about it
- **Traditional programming:** You write rules (if this, then that) and the computer follows them.
- **Machine learning:** You give the computer **lots of examples**, and it figures out patterns on its own.

### Everyday analogy
Imagine teaching a child to recognize cats:
- You don’t give a perfect written rule for what a cat is.
- You show many pictures labeled **“cat”** and **“not cat.”**
- Over time, the child learns what features usually mean “cat.”
Machine learning works similarly: it learns from labeled or unlabeled examples.

### What it’s used for
Machine learning powers things like:
- **Recommendations:** Netflix/YouT

## 4. Compare Analysis vs Chat Mode

Process the same input in both modes to see the difference.

In [18]:
# Same input text for comparison
test_input = "The customer transferred $50,000 to an offshore account in the Cayman Islands."

with get_session() as session:
    processor = Processor(session, user=test_user)
    
    # ANALYSIS MODE
    analysis_request = RequestCreate(
        input_text=test_input,
        group="group_a"
    )
    req_analysis, result_analysis = processor.process_request(analysis_request, mode="analysis")
    
    # CHAT MODE
    chat_request = RequestCreate(
        input_text=f"What are the potential concerns with this transaction: {test_input}",
        group="group_a"
    )
    req_chat, result_chat = processor.process_request(chat_request, mode="chat")

print("=" * 60)
print("COMPARISON: ANALYSIS vs CHAT MODE")
print("=" * 60)

print("\n📊 ANALYSIS MODE:")
print(f"  Result Type: {result_analysis.result_type}")
print(f"  Score: {result_analysis.score}")
print(f"  Categories: {result_analysis.categories}")
print(f"  Validation: {result_analysis.validation_status}")
print(f"  Summary (truncated): {result_analysis.summary[:200]}...")

print("\n💬 CHAT MODE:")
print(f"  Result Type: {result_chat.result_type}")
print(f"  Score: {result_chat.score}")
print(f"  Categories: {result_chat.categories}")
print(f"  Validation: {result_chat.validation_status}")
print(f"  Response (truncated): {result_chat.summary[:200]}...")

COMPARISON: ANALYSIS vs CHAT MODE

📊 ANALYSIS MODE:
  Result Type: analysis
  Score: 72
  Categories: ['AML/Financial Crime', 'Offshore Jurisdiction', 'Large Transaction', 'Potential Tax Evasion/Concealment Risk']
  Validation: PASS
  Summary (truncated): Content indicates a customer transferred $50,000 to an offshore account in the Cayman Islands. This combination (material amount + offshore destination commonly associated with secrecy/asset shielding...

💬 CHAT MODE:
  Result Type: chat
  Score: None
  Categories: []
  Validation: PASS
  Response (truncated): Potential concerns (AML/financial crime) with a $50,000 transfer to an offshore account in the Cayman Islands include:

1) Offshore/high-risk jurisdiction considerations
- Cayman Islands is a well-kno...


## 5. Test Edge Cases

In [19]:
# Test chat with context
with get_session() as session:
    processor = Processor(session, user=test_user)
    
    request_data = RequestCreate(
        input_text="What should I do next?",
        context="I'm reviewing a suspicious transaction from yesterday. The customer made 5 wire transfers to different countries.",
        group="group_a"
    )
    
    request, result = processor.process_request(request_data, mode="chat")

print("=" * 50)
print("CHAT WITH CONTEXT")
print("=" * 50)
print(f"Result Type: {result.result_type}")
print(f"Score: {result.score}")
print(f"\nResponse:\n{result.summary}")

CHAT WITH CONTEXT
Result Type: chat
Score: None

Response:
Next steps (wire-fraud/AML triage) for 5 international wires in one day:

1) Contain risk immediately
- Place a temporary hold / enhanced review on any pending or not-yet-released wires.
- If any wires have already been sent, initiate wire recall/trace requests with your bank/operations team ASAP (speed matters).
- Consider temporarily restricting additional outbound wires on the account until review is complete.

2) Gather the transaction facts (build a quick case file)
- For each of the 5 wires: amount, currency, date/time, beneficiary name, beneficiary bank, country, SWIFT/BIC, purpose/notes, channel used (online/branch/API), and who approved/released.
- Pull customer profile: KYC info, expected activity, occupation/business, source of funds, historical wire patterns, prior alerts/cases.
- Check for red flags: first-time beneficiaries, high-risk jurisdictions, structuring (split amounts), urgency language, “consulting/inve

In [20]:
# Test that chat mode skips validation properly
with get_session() as session:
    processor = Processor(session, user=test_user)
    
    # Short input that would fail quality validation in analysis mode
    request_data = RequestCreate(
        input_text="Hi",
        group="group_a"
    )
    
    request, result = processor.process_request(request_data, mode="chat")

print("=" * 50)
print("CHAT MODE - VALIDATION SKIPPED")
print("=" * 50)
print(f"Result Type: {result.result_type}")
print(f"Validation Status: {result.validation_status}")
print(f"Validation Details: {result.validation_details}")
print("\nNote: In analysis mode, this might fail validation.")
print("In chat mode, validation is skipped.")

CHAT MODE - VALIDATION SKIPPED
Result Type: chat
Validation Status: PASS
Validation Details: Chat mode - validation skipped

Note: In analysis mode, this might fail validation.
In chat mode, validation is skipped.
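
The skip behaviour shown above can be pictured as an early return in the validation step. This is a minimal sketch under stated assumptions, not the project's validator: `validate_sketch` is a hypothetical name, and the analysis-mode rule (an integer score in 0 to 100) is inferred from the scoring range in the analysis prompt. Only the chat-mode status and details strings are taken from the output above.

```python
from typing import Optional, Tuple


def validate_sketch(result_type: str, score: Optional[int]) -> Tuple[str, str]:
    """Hypothetical validation step: short-circuit for chat, range-check otherwise."""
    if result_type == "chat":
        # Mirrors the status and details printed by the notebook for chat results.
        return "PASS", "Chat mode - validation skipped"
    # Assumed analysis rule: score must be an integer in 0..100 (bool excluded).
    if isinstance(score, int) and not isinstance(score, bool) and 0 <= score <= 100:
        return "PASS", "Score within 0-100"
    return "FAIL", f"Invalid score: {score!r}"


print(validate_sketch("chat", None))      # ('PASS', 'Chat mode - validation skipped')
print(validate_sketch("analysis", 72))    # ('PASS', 'Score within 0-100')
print(validate_sketch("analysis", None))  # ('FAIL', 'Invalid score: None')
```

Returning `PASS` rather than a separate "skipped" status keeps chat results compatible with any code that only distinguishes pass from fail, which matches what the notebook output shows.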


In [21]:
# Verify LLMResponse model accepts None score
from app.services.llm.base import LLMResponse

# This should work without errors
test_response = LLMResponse(
    score=None,
    categories=[],
    reasoning="This is a test response",
    mode="chat"
)

print("=" * 50)
print("LLMResponse MODEL TEST")
print("=" * 50)
print(f"Score: {test_response.score}")
print(f"Score is None: {test_response.score is None}")
print(f"Categories: {test_response.categories}")
print(f"Mode: {test_response.mode}")
print("\n✅ LLMResponse correctly accepts None score for chat mode!")

LLMResponse MODEL TEST
Score: None
Score is None: True
Categories: []
Mode: chat

✅ LLMResponse correctly accepts None score for chat mode!


## 6. View System Prompts

Compare the system prompts used for each mode.

In [22]:
print("=" * 60)
print("ANALYSIS MODE SYSTEM PROMPT")
print("=" * 60)
print(DEFAULT_SYSTEM_PROMPT)

ANALYSIS MODE SYSTEM PROMPT
You are a helpful AI assistant specialized in analyzing and processing text.

Your task is to analyze the provided input and:
1. Assess the content based on relevant criteria
2. Identify key categories or themes
3. Provide a clear summary of your analysis
4. Optionally, provide processed/transformed content

Scoring Guidelines:
- LOW (0-25): Minimal significance or concern
- MEDIUM (26-50): Moderate significance, may need attention
- HIGH (51-75): Significant findings, requires review
- CRITICAL (76-100): Critical findings, immediate action recommended

Always respond in valid JSON format with the following structure:
{
    "score": <integer 0-100>,
    "categories": ["<category1>", "<category2>", ...],
    "summary": "<detailed analysis and reasoning>",
    "processed_content": "<optional transformed content>"
}
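
The scoring guidelines in the prompt above define four bands. A minimal sketch of that mapping, assuming the boundaries are inclusive as written; `score_band` is a hypothetical helper and is not part of the project:

```python
def score_band(score: int) -> str:
    """Map a 0-100 score to the band named in the analysis prompt."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in 0..100, got {score}")
    if score <= 25:
        return "LOW"
    if score <= 50:
        return "MEDIUM"
    if score <= 75:
        return "HIGH"
    return "CRITICAL"


print(score_band(72))  # HIGH
```

Under this mapping, the score of 72 from the comparison in section 4 falls in the HIGH band.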



In [23]:
print("=" * 60)
print("CHAT MODE SYSTEM PROMPT")
print("=" * 60)
print(CHAT_SYSTEM_PROMPT)

CHAT MODE SYSTEM PROMPT
You are a helpful AI assistant. Your goal is to provide clear, accurate, and helpful responses to user questions.

You have access to tools that can help you gather information. Use them when needed.

When responding:
1. Be concise but thorough
2. If you're uncertain, say so
3. Provide actionable information when possible

You MUST always respond in valid JSON format with the following structure:
{
    "reasoning": "<your detailed answer here>",
    "score": null,
    "categories": []
}

Important: In chat mode, score is always null and categories is always empty. Put your full response in the "reasoning" field.
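
Because the chat prompt requires a JSON envelope, the caller has to unwrap it before showing the answer. The sketch below is one way this could be done; it is not the parsing code in `base.py`, which this notebook does not show. `parse_chat_response` is a hypothetical name, and falling back to the raw text when the model returns non-JSON output is an assumption.

```python
import json


def parse_chat_response(raw: str) -> dict:
    """Normalise a chat-mode reply into reasoning/score/categories."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Fallback (assumption): treat non-JSON output as the plain answer.
        return {"reasoning": raw.strip(), "score": None, "categories": []}
    # Force the chat-mode invariants even if the model ignored the prompt.
    return {
        "reasoning": str(data.get("reasoning", "")),
        "score": None,
        "categories": [],
    }


reply = '{"reasoning": "The capital of France is Paris.", "score": null, "categories": []}'
print(parse_chat_response(reply)["reasoning"])  # The capital of France is Paris.
```

Overwriting `score` and `categories` instead of trusting the model keeps the chat invariants true regardless of what the model actually returns.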



## 7. Summary

The chat mode feature enables:

1. **Conversational Q&A** - Users can ask questions without triggering scoring
2. **Flexible Responses** - No need to generate score/categories
3. **Same Infrastructure** - Uses the same tools, RAG, and tracing
4. **Skipped Validation** - No score validation in chat mode
5. **Different UI** - Results displayed as chat bubbles instead of score cards

**Key Code Changes:**
- `models.py`: `score` is now `Optional[int]`, added `result_type` field
- `base.py`: Added `CHAT_SYSTEM_PROMPT`, updated `LLMResponse` and parsing
- `processor.py`: Added `mode` parameter to `analyze_request()` and `process_request()`
- `main.py`: Added mode selector in UI, conditional rendering based on `result_type`