# AI Business Intake & Decision System

This notebook demonstrates an Applied AI system that converts unstructured
customer messages into validated, automation-ready JSON.

The system includes:
- LLM-based extraction
- Schema validation
- Retry & normalization
- Business rules
- Confidence-based routing
- Responsible AI safeguards

See README.md for full documentation.


In [1]:
# Install libraries for:
# - running LLMs locally in Colab (transformers, accelerate, sentencepiece)
# - loading models in 4-bit to save GPU memory (bitsandbytes)
# - validating model outputs against a schema (pydantic)
!pip -q install transformers accelerate bitsandbytes sentencepiece pydantic


In [2]:
import json
import re
from typing import Dict, List

from pydantic import BaseModel, Field, ValidationError


In [3]:
# This class defines the ONLY acceptable output format from the AI.
# If any field is missing or wrong type, validation fails (good!).
class IntakeResult(BaseModel):
    intent: str = Field(..., description="Main request in snake_case (e.g. refund_failed_transfer)")
    priority: str = Field(..., description="low|medium|high")
    entities: Dict[str, str] = Field(default_factory=dict, description="Extracted fields (strings only)")
    missing_required_fields: List[str] = Field(default_factory=list, description="Required info missing")
    recommended_next_action: str = Field(..., description="What the system should do next")
    confidence: float = Field(..., ge=0, le=1, description="0..1 clarity score")


In [4]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline

# Instruction-tuned model that follows prompts and produces structured output well
MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"

# Load tokenizer (text -> tokens)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Load model in 4-bit to save GPU memory.
# Passing load_in_4bit directly to from_pretrained is deprecated;
# the quantization settings go in a BitsAndBytesConfig instead.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Pipeline to generate responses easily
gen = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=300,
    # Greedy decoding gives stable output (better for JSON).
    # temperature is ignored when do_sample=False, so it is not set here.
    do_sample=False,
)

print("✅ Model loaded successfully!")





Device set to use cuda:0


✅ Model loaded successfully!


In [5]:
def build_prompt(message: str) -> str:
    # This prompt is designed for reliability:
    # - JSON-only output
    # - fixed keys
    # - rules for priority
    # - handles multi-intent messages
    return f"""
You are an enterprise intake extraction engine.

Return ONLY valid JSON (no markdown, no explanation) with EXACT keys:
{{
  "intent": "snake_case_string",
  "priority": "low|medium|high",
  "entities": {{}},
  "missing_required_fields": [],
  "recommended_next_action": "string",
  "confidence": 0.0
}}

Rules:
- intent must be short and specific (e.g. "refund_failed_transfer", "reset_password", "update_phone_number").
- priority=high for: payment failure, money deducted, fraud, account lock, urgent complaint.
- Extract entities if present: name, phone, account_id, transaction_id, order_id, amount, date.
- If a key entity is needed but missing (e.g. transaction_id for refunds), put it in missing_required_fields.
- If multiple intents exist: choose the PRIMARY intent and mention the others in recommended_next_action.
- confidence must be between 0 and 1.
- Do NOT add extra keys.

Message:
\"\"\"{message}\"\"\"
""".strip()


In [6]:
def extract_json_block(text: str) -> str:
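    # Grabs everything from the first "{" to the LAST "}" (the regex is greedy),
    # so this assumes the model emits a single JSON object and no braces after it.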
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        raise ValueError("No JSON object found in model output.")
    return match.group(0)
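
Because the regex above is greedy, any trailing text that contains a brace ends up inside the match and `json.loads` then fails, which costs a retry. One possible alternative, shown here only as a sketch and not as the notebook's method, is to let the standard library's `json.JSONDecoder.raw_decode` find the first complete object. The function name `extract_first_json_object` is hypothetical.

```python
import json

def extract_first_json_object(text: str) -> str:
    """Return the first substring of `text` that parses as a JSON object."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # returns (value, index just past the end of that value).
            _, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this "{" does not open a valid object; keep scanning
        return text[start:end]
    raise ValueError("No JSON object found in model output.")

noisy = 'Sure! {"intent": "reset_password", "confidence": 0.9} Hope that helps {smile}'
print(extract_first_json_object(noisy))
```

One caveat with this approach: if the echoed prompt has not been stripped first, the template object inside the prompt is itself valid JSON and would be returned as the first match.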


In [7]:
def normalize_entities(entities: dict) -> dict:
    # Forces all entity values into strings to prevent validation failures.
    fixed = {}
    for k, v in entities.items():
        if isinstance(v, dict) or v is None:
            fixed[k] = ""   # if model gave {} or None, replace with empty string
        else:
            fixed[k] = str(v)
    return fixed


In [8]:
def run_intake(message: str, retries: int = 2) -> IntakeResult:
    prompt = build_prompt(message)
    last_error = None

    for attempt in range(retries + 1):
        # Generate model output
        out = gen(prompt)[0]["generated_text"]

        # The text-generation pipeline returns prompt + completion by default,
        # so strip the prompt to keep only the newly generated text
        candidate = out.replace(prompt, "").strip()

        try:
            # Extract and parse JSON
            json_str = extract_json_block(candidate)
            data = json.loads(json_str)

            # -------------------------
            # SAFETY DEFAULTS (important)
            # -------------------------
            data.setdefault("entities", {})
            data.setdefault("missing_required_fields", [])
            data.setdefault("recommended_next_action", "Route to support team for clarification")
            data.setdefault("confidence", 0.4)

            # Normalize entities so every value is a string (the schema requires Dict[str, str])
            data["entities"] = normalize_entities(data["entities"])

            # Crude multi-intent heuristic: if the message contains " and ", cap confidence at 0.5
            if " and " in message.lower():
                data["confidence"] = min(float(data["confidence"]), 0.5)

            # Validate against schema (this ensures structure is correct)
            result = IntakeResult(**data)
            return result

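        # TypeError/AttributeError cover malformed values such as "confidence": null
        # or "entities": [] that would otherwise crash instead of triggering a retry
        except (ValueError, TypeError, AttributeError, json.JSONDecodeError, ValidationError) as e: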
            last_error = str(e)

            # Retry with stricter instruction
            prompt = build_prompt(message) + "\n\nIMPORTANT: Output MUST be valid JSON ONLY. No extra text."

    raise RuntimeError(f"❌ Failed after retries. Last error: {last_error}")
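
Since `do_sample=False` makes decoding greedy, resending an identical prompt is likely to reproduce the same output. In the loop above the prompt changes after the first failure but is then rebuilt to the same string on every later failure, so the second retry may add little. A minimal sketch of one way to vary the prompt per attempt, assuming it is acceptable to feed the previous error back to the model; `build_retry_prompt` is a hypothetical helper, not part of the notebook.

```python
from typing import Optional

def build_retry_prompt(base_prompt: str, attempt: int, last_error: Optional[str]) -> str:
    """Return the prompt for a given attempt (0 = first try)."""
    if attempt == 0 or not last_error:
        return base_prompt
    # Keep only the first line of the error, truncated, so a long validation
    # message cannot crowd out the actual task.
    lines = last_error.strip().splitlines()
    hint = lines[0][:200] if lines else "invalid output"
    return (
        f"{base_prompt}\n\n"
        f"IMPORTANT (retry {attempt}): the previous output was rejected: {hint}\n"
        "Output MUST be valid JSON ONLY. No extra text."
    )

base = "Return ONLY valid JSON."
for attempt in range(3):
    print(repr(build_retry_prompt(base, attempt, "Expecting value: line 1 column 1")))
```

Inside `run_intake`, the call would replace the fixed string concatenation in the `except` branch. Whether the extra hint actually improves the model's second answer is something to measure, not assume.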


In [9]:
tests = [
    "Hi, my transfer failed today and the money was deducted. My name is Fady, phone 01220000000. Please help asap.",
    "Please close my account.",
    "I want to update my phone number and I forgot my password.",
    "I was charged twice for order 9912 yesterday. Need a refund."
]

for t in tests:
    r = run_intake(t)
    print("INPUT:", t)
    print(r.model_dump_json(indent=2))
    print("-"*90)


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


INPUT: Hi, my transfer failed today and the money was deducted. My name is Fady, phone 01220000000. Please help asap.
{
  "intent": "failed_transfer",
  "priority": "high",
  "entities": {
    "name": "Fady",
    "phone": "01220000000"
  },
  "missing_required_fields": [
    "transaction_id"
  ],
  "recommended_next_action": "Request the user to provide the transaction ID and retry the transfer.",
  "confidence": 0.5
}
------------------------------------------------------------------------------------------




INPUT: Please close my account.
{
  "intent": "close_account",
  "priority": "medium",
  "entities": {},
  "missing_required_fields": [],
  "recommended_next_action": "Confirm account closure and provide any necessary account information.",
  "confidence": 0.8
}
------------------------------------------------------------------------------------------




INPUT: I want to update my phone number and I forgot my password.
{
  "intent": "update_phone_number",
  "priority": "medium",
  "entities": {
    "phone": ""
  },
  "missing_required_fields": [
    "password"
  ],
  "recommended_next_action": "Send a verification code to the new phone number and ask the user to enter it to confirm the update.",
  "confidence": 0.5
}
------------------------------------------------------------------------------------------
INPUT: I was charged twice for order 9912 yesterday. Need a refund.
{
  "intent": "refund",
  "priority": "high",
  "entities": {
    "intent": "refund",
    "order_id": "9912"
  },
  "missing_required_fields": [],
  "recommended_next_action": "Provide the transaction IDs for the duplicate charges.",
  "confidence": 0.95
}
------------------------------------------------------------------------------------------


In [10]:
HIGH_RISK_INTENTS = {"fraud_report", "account_hacked", "money_deducted_no_service"}

def apply_policy(result: IntakeResult) -> IntakeResult:
    # If low confidence, do not automate. Route to human.
    if result.confidence < 0.55:
        result.recommended_next_action = "Route to human agent for review (low confidence)."

    # High risk intents always escalated
    if result.intent in HIGH_RISK_INTENTS:
        result.priority = "high"
        result.recommended_next_action = "Escalate to security/fraud team + require human confirmation."

    return result

sample = "Someone hacked my account and did transfers!"
r = apply_policy(run_intake(sample))
print(r.model_dump_json(indent=2))




{
  "intent": "account_hack",
  "priority": "high",
  "entities": {
    "account": ""
  },
  "missing_required_fields": [],
  "recommended_next_action": "Route to human agent for review (low confidence).",
  "confidence": 0.5
}
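
In the run above the model labelled the message `account_hack`, which is not a member of `HIGH_RISK_INTENTS` (the set contains `account_hacked`). The escalation rule therefore did not fire and only the low-confidence rule applied. Exact matching on free-form model labels is brittle. One possible mitigation is a keyword check alongside the exact set, sketched below under the assumption that substring matching is acceptable for this domain; `HIGH_RISK_KEYWORDS` and `is_high_risk` are hypothetical names.

```python
HIGH_RISK_INTENTS = {"fraud_report", "account_hacked", "money_deducted_no_service"}

# Hypothetical keyword stems; a real deployment would need to tune this list.
HIGH_RISK_KEYWORDS = ("fraud", "hack", "stolen", "unauthorized")

def is_high_risk(intent: str) -> bool:
    """True if the intent is a known high-risk label or contains a risk keyword."""
    label = intent.strip().lower()
    if label in HIGH_RISK_INTENTS:
        return True
    return any(keyword in label for keyword in HIGH_RISK_KEYWORDS)

for label in ["account_hack", "account_hacked", "close_account"]:
    print(label, is_high_risk(label))
```

In `apply_policy`, the condition `result.intent in HIGH_RISK_INTENTS` could then become `is_high_risk(result.intent)`. Substring checks can over-trigger, which is usually the safer direction for an escalation rule.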


In [11]:
# Business rules: what each intent needs to proceed
REQUIRED_FIELDS_BY_INTENT = {
    "payment_failure": ["transaction_id"],
    "refund_failed_transfer": ["transaction_id"],
    "charged_twice_refund": ["order_id"],
    "close_account": ["account_id"],
    "reset_password": ["username_or_phone"],
    "update_phone_number": ["account_id", "new_phone"]
}

def enforce_required_fields(result: IntakeResult) -> IntakeResult:
    required = REQUIRED_FIELDS_BY_INTENT.get(result.intent, [])

    # Only add missing fields that are truly missing from entities
    for field in required:
        if field not in result.entities or not result.entities[field].strip():
            if field not in result.missing_required_fields:
                result.missing_required_fields.append(field)

    return result
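
The lookup above only helps when the model emits exactly one of the keys in `REQUIRED_FIELDS_BY_INTENT`. In the earlier runs it produced `failed_transfer` and `refund`, neither of which is a key, so `enforce_required_fields` adds nothing for those messages. A minimal sketch of mapping model labels onto known intents before the lookup, assuming a small hand-maintained alias table plus a fuzzy fallback from the standard library's `difflib`. The names `KNOWN_INTENTS`, `INTENT_ALIASES`, and `canonical_intent`, and the cutoff value, are hypothetical.

```python
import difflib
from typing import Optional

# Keys of REQUIRED_FIELDS_BY_INTENT, repeated here so the sketch is self-contained.
KNOWN_INTENTS = [
    "payment_failure",
    "refund_failed_transfer",
    "charged_twice_refund",
    "close_account",
    "reset_password",
    "update_phone_number",
]

# Hypothetical alias table for labels the model has been observed to emit.
INTENT_ALIASES = {"failed_transfer": "refund_failed_transfer"}

def canonical_intent(intent: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a model-produced label onto a known intent, or None if nothing is close."""
    label = intent.strip().lower()
    if label in KNOWN_INTENTS:
        return label
    if label in INTENT_ALIASES:
        return INTENT_ALIASES[label]
    # Fuzzy fallback for near-misses such as typos; the cutoff is an untuned guess.
    matches = difflib.get_close_matches(label, KNOWN_INTENTS, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(canonical_intent("failed_transfer"))  # resolved through the alias table
print(canonical_intent(" Close_Account "))  # resolved after normalization
print(canonical_intent("refund"))           # too ambiguous to map, so None
```

Labels that resolve to `None` are a natural candidate for human review, since no business rule can be applied to them.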


In [12]:
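def detect_multi_intent(message: str) -> bool:
    # Simple heuristic (good enough for v1): look for conjunctions in English and Arabic.
    # " بالإضافة " means "in addition"; " كمان " is colloquial Arabic for "also".
    return any(x in message.lower() for x in [" and ", " also ", " بالإضافة ", " كمان "])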

def post_process(message: str, result: IntakeResult) -> IntakeResult:
    # Apply required fields rules
    result = enforce_required_fields(result)

    # If multi-intent, lower confidence and route carefully
    if detect_multi_intent(message):
        result.confidence = min(result.confidence, 0.55)
        result.recommended_next_action = (
            result.recommended_next_action
            + " | Note: message contains multiple requests; handle primary intent first, then follow up."
        )
    return result


In [13]:
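import uuid
from datetime import datetime, timezone

def log_event(trace_id: str, message: str, result: IntakeResult):
    # Simple console log for now (later we can write to a file or DB)
    # datetime.utcnow() is deprecated since Python 3.12; take an aware UTC time
    # and drop tzinfo so the printed format stays the same.
    now_utc = datetime.now(timezone.utc).replace(tzinfo=None)
    print("TRACE_ID:", trace_id)
    print("TIME:", now_utc.isoformat(), "UTC")
    print("INPUT:", message)
    print("OUTPUT:", result.model_dump())
    print("="*90)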
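
The comment in `log_event` mentions writing to a file or database later. As a rough sketch of that next step, the version below emits one JSON object per event through the standard `logging` module, so the destination (console, file, or another handler) can be changed by configuration. It takes the result as a plain dict, for example from `result.model_dump()`. The logger name `intake` and the function names `format_event` and `log_event_json` are hypothetical.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("intake")  # hypothetical logger name

def format_event(trace_id: str, message: str, result: dict) -> str:
    """Serialize one intake event as a single JSON line."""
    record = {
        "trace_id": trace_id,
        "time": datetime.now(timezone.utc).isoformat(),
        "input": message,
        "output": result,
    }
    # ensure_ascii=False keeps non-English customer text readable in the log.
    return json.dumps(record, ensure_ascii=False)

def log_event_json(trace_id: str, message: str, result: dict) -> None:
    logger.info(format_event(trace_id, message, result))

# Console output for the demo; passing filename=... to basicConfig would
# send the same lines to a file instead.
logging.basicConfig(level=logging.INFO, format="%(message)s")
log_event_json(
    "demo-trace",
    "Please close my account.",
    {"intent": "close_account", "confidence": 0.8},
)
```

One line per event keeps the log easy to filter by `trace_id` with ordinary text tools.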


In [14]:
def intake_pipeline(message: str) -> dict:
    trace_id = str(uuid.uuid4())

    # 1) Extract raw result from LLM
    result = run_intake(message)

    # 2) Apply post-processing rules (required fields + multi-intent)
    result = post_process(message, result)

    # 3) Apply responsible AI policy
    result = apply_policy(result)

    # 4) Log
    log_event(trace_id, message, result)

    # 5) Return final payload (trace_id included)
    payload = result.model_dump()
    payload["trace_id"] = trace_id
    return payload


In [15]:
tests = [
    "Hi, my transfer failed today and the money was deducted. My name is Fady, phone 01220000000. Please help asap.",
    "Please close my account.",
    "I want to update my phone number and I forgot my password.",
    "I was charged twice for order 9912 yesterday. Need a refund."
]

for t in tests:
    output = intake_pipeline(t)
    print(json.dumps(output, indent=2))
    print("-"*90)




TRACE_ID: a3b10f1e-1bcf-4700-92f9-b9556881b745
TIME: 2026-01-15T21:59:51.272906 UTC
INPUT: Hi, my transfer failed today and the money was deducted. My name is Fady, phone 01220000000. Please help asap.
OUTPUT: {'intent': 'failed_transfer', 'priority': 'high', 'entities': {'name': 'Fady', 'phone': '01220000000'}, 'missing_required_fields': ['transaction_id'], 'recommended_next_action': 'Route to human agent for review (low confidence).', 'confidence': 0.5}
{
  "intent": "failed_transfer",
  "priority": "high",
  "entities": {
    "name": "Fady",
    "phone": "01220000000"
  },
  "missing_required_fields": [
    "transaction_id"
  ],
  "recommended_next_action": "Route to human agent for review (low confidence).",
  "confidence": 0.5,
  "trace_id": "a3b10f1e-1bcf-4700-92f9-b9556881b745"
}
------------------------------------------------------------------------------------------




TRACE_ID: 1df8ff27-e1c8-4b04-90b7-66e6871ac307
TIME: 2026-01-15T22:00:01.308066 UTC
INPUT: Please close my account.
OUTPUT: {'intent': 'close_account', 'priority': 'medium', 'entities': {}, 'missing_required_fields': ['account_id'], 'recommended_next_action': 'Confirm account closure and provide any necessary account information.', 'confidence': 0.8}
{
  "intent": "close_account",
  "priority": "medium",
  "entities": {},
  "missing_required_fields": [
    "account_id"
  ],
  "recommended_next_action": "Confirm account closure and provide any necessary account information.",
  "confidence": 0.8,
  "trace_id": "1df8ff27-e1c8-4b04-90b7-66e6871ac307"
}
------------------------------------------------------------------------------------------




TRACE_ID: 93a4ac00-8dd8-44ba-bc88-681de4be61a9
TIME: 2026-01-15T22:00:11.688927 UTC
INPUT: I want to update my phone number and I forgot my password.
OUTPUT: {'intent': 'update_phone_number', 'priority': 'medium', 'entities': {'phone': ''}, 'missing_required_fields': ['password', 'account_id', 'new_phone'], 'recommended_next_action': 'Route to human agent for review (low confidence).', 'confidence': 0.5}
{
  "intent": "update_phone_number",
  "priority": "medium",
  "entities": {
    "phone": ""
  },
  "missing_required_fields": [
    "password",
    "account_id",
    "new_phone"
  ],
  "recommended_next_action": "Route to human agent for review (low confidence).",
  "confidence": 0.5,
  "trace_id": "93a4ac00-8dd8-44ba-bc88-681de4be61a9"
}
------------------------------------------------------------------------------------------
TRACE_ID: 623810a2-c760-4f1a-b574-f33f36aff00e
TIME: 2026-01-15T22:00:22.279492 UTC
INPUT: I was charged twice for order 9912 yesterday. Need a refund.
OUTPUT: