# CrewAI Goal-Oriented Multi-Agent Workflow  
## Use Case: Banking & Financial Services — Intelligent Customer Resolution

**Objective:** Build a goal-oriented multi-agent workflow using CrewAI with 3–4 collaborating agents, clear handoffs, and explicit escalation logic (no RAG required).

**Notebook contents**
1. Architecture overview (agents, tasks, handoffs)  
2. Static policy/rules (hardcoded, non-RAG)  
3. CrewAI implementation (Agents + Tasks + Crew)  
4. End-to-end tests with sample inputs and printed outputs  
5. Design rationale & escalation logic explanation

---

## 0) Environment Setup

This project uses **CrewAI**. In Vocareum, dependencies may already be installed.  
If not, run the install cell below.

In [None]:
# If needed, install dependencies (uncomment if your environment does not have them)
# !pip -q install crewai langchain-openai pydantic python-dotenv

### API Key / LLM Configuration

CrewAI needs an LLM. In Vocareum, an API key may already be configured.

If you need to set it manually, set:
- `OPENAI_API_KEY`

In [None]:
import os

# Option 1: Set your key in the environment (recommended)
# os.environ["OPENAI_API_KEY"] = "YOUR_KEY_HERE"

print("OPENAI_API_KEY set:", bool(os.environ.get("OPENAI_API_KEY")))

---

## 1) Requirements Checklist

- ✅ At least **3–4 collaborating agents**
- ✅ Each agent has **clear role + specific goal**
- ✅ Agents show **dependency via handoffs**
- ✅ Workflow includes **explicit escalation logic** (risk/urgency/uncertainty)
- ✅ Includes **sample inputs + outputs**

---

## 2) Static Banking Policies (Non-RAG)

We simulate bank knowledge using static policy text and hardcoded rules.

### Policy Summary (static)
- **Fraud / suspicious activity** → immediate escalation + security instructions  
- **Card stolen/lost** → advise immediate card block + escalate  
- **Failed payment / money deducted** → raise dispute ticket + reversal guidance  
- **Loan inquiry** → informational response only  
- **Low confidence** → escalate due to uncertainty

In [None]:
BANK_POLICY_TEXT = (
    "BANKING POLICY (STATIC, NON-RAG)\n"
    "1) Fraud/Suspicious Activity:\n"
    "   - Never ask for full PIN/OTP\n"
    "   - Advise immediate card block / security steps\n"
    "   - Escalate to human support for verification\n"
    "2) Transaction Disputes:\n"
    "   - If money deducted but transaction failed -> raise a dispute ticket\n"
    "   - Typical reversal window: 3-7 business days (informational)\n"
    "3) Card Lost/Stolen:\n"
    "   - Advise immediate card block\n"
    "   - Escalate to human support\n"
    "4) Loan Inquiries:\n"
    "   - Provide general informational guidance only\n"
    "   - Avoid personalized approval promises\n"
    "5) Uncertainty:\n"
    "   - If intent is unclear or confidence is low -> escalate\n"
)

print(BANK_POLICY_TEXT)
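
The same policy can also be written as a plain lookup table, which gives a deterministic baseline to compare the policy agent's `allowed_actions` against. The sketch below is illustrative and not part of the original workflow: `REQUIRED_ACTIONS` and `missing_actions` are hypothetical names, the action strings are the ones the policy prompt in section 5.1 uses, and mapping every `transaction` intent to a dispute ticket is a simplifying assumption.

```python
from typing import Dict, List, Set

# Hypothetical lookup table mirroring the static policy: the minimum set of
# actions each intent should trigger. Assumption: every "transaction" intent
# is treated as a dispute.
REQUIRED_ACTIONS: Dict[str, Set[str]] = {
    "fraud": {"block_card", "escalate_to_human"},
    "transaction": {"raise_dispute_ticket"},
    "loan": {"provide_info"},
    "general": set(),
}

def missing_actions(intent: str, allowed_actions: List[str]) -> List[str]:
    """Return the required actions absent from allowed_actions, sorted."""
    required = REQUIRED_ACTIONS.get(intent, set())  # unknown intent -> nothing required
    return sorted(required - set(allowed_actions))

print(missing_actions("fraud", ["block_card"]))    # ['escalate_to_human']
print(missing_actions("loan", ["provide_info"]))   # []
```

A non-empty result would be a reasonable trigger for escalation, since it means the LLM's policy output disagrees with the static rules.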

---

## 3) Handoff Schema (Explicit Contract)

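Each agent returns a JSON fragment, and the orchestrator merges these fragments into one structured `Handoff` object (a Pydantic model) that records the full decision trail.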

**Handoff object fields**
- customer_query  
- intent, confidence  
- allowed_actions, policy_notes  
- draft_response  
- risk_score (supporting deterministic signal)  
- escalate, escalation_reason

In [None]:
from pydantic import BaseModel, Field
from typing import List, Optional, Literal

IntentType = Literal["fraud", "transaction", "loan", "general"]
ConfidenceType = Literal["high", "medium", "low"]

class Handoff(BaseModel):
    customer_query: str

    intent: Optional[IntentType] = None
    confidence: Optional[ConfidenceType] = None

    allowed_actions: List[str] = Field(default_factory=list)
    policy_notes: Optional[str] = None

    draft_response: Optional[str] = None

    risk_score: Optional[int] = None
    escalate: Optional[bool] = None
    escalation_reason: Optional[str] = None

print("Handoff schema ready.")

---

## 4) Deterministic Safety Signals (Static Rules)

To strengthen consistency and reduce false negatives, we compute a simple risk score (0–100)
based on static keywords + intent + uncertainty. This supports (not replaces) the LLM decision.

In [None]:
import re

FRAUD_KEYWORDS = [
    "unauthorized", "not me", "never made", "unknown transaction", "fraud",
    "suspicious", "hacked", "someone tried", "phishing"
]
STOLEN_KEYWORDS = ["stolen", "lost my card", "card stolen", "card lost"]
URGENT_KEYWORDS = ["immediately", "urgent", "right now", "asap"]

def keyword_hits(text: str, keywords: list) -> int:
    t = text.lower()
    return sum(1 for k in keywords if k in t)

def simple_risk_score(query: str, intent: str, confidence: str) -> int:
    score = 0
    q = query.lower()

    if intent == "fraud":
        score += 70
    if keyword_hits(q, STOLEN_KEYWORDS) > 0:
        score += 60
    if keyword_hits(q, FRAUD_KEYWORDS) > 0:
        score += 30
    if keyword_hits(q, URGENT_KEYWORDS) > 0:
        score += 10

    if confidence == "low":
        score += 20
    elif confidence == "medium":
        score += 10

    return min(score, 100)

print("Rule helpers ready.")
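
To make the score auditable, it can help to see how much each rule contributes. The sketch below restates the same keyword lists and weights in a self-contained form and returns a per-rule breakdown; `risk_breakdown` is a hypothetical helper, not part of the original notebook.

```python
from typing import Dict

# Keyword lists and weights copied from the rule helpers above
FRAUD_KEYWORDS = [
    "unauthorized", "not me", "never made", "unknown transaction", "fraud",
    "suspicious", "hacked", "someone tried", "phishing"
]
STOLEN_KEYWORDS = ["stolen", "lost my card", "card stolen", "card lost"]
URGENT_KEYWORDS = ["immediately", "urgent", "right now", "asap"]

def risk_breakdown(query: str, intent: str, confidence: str) -> Dict[str, int]:
    """Return each rule's contribution plus the capped total (0-100)."""
    q = query.lower()
    parts = {
        "intent_fraud": 70 if intent == "fraud" else 0,
        "stolen_keywords": 60 if any(k in q for k in STOLEN_KEYWORDS) else 0,
        "fraud_keywords": 30 if any(k in q for k in FRAUD_KEYWORDS) else 0,
        "urgent_keywords": 10 if any(k in q for k in URGENT_KEYWORDS) else 0,
        "uncertainty": {"low": 20, "medium": 10}.get(confidence, 0),
    }
    parts["total"] = min(sum(parts.values()), 100)
    return parts

print(risk_breakdown("I lost my card", "general", "high")["total"])   # 60
print(risk_breakdown("There is an unauthorized charge on my account",
                     "fraud", "high")["total"])                       # 100
```

Note that a lost-card message classified as `general` with high confidence scores 60, so on its own it stays below a cutoff of 70 and relies on the escalation agent to flag it.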

---

## 5) CrewAI Implementation

In [None]:
from crewai import Agent, Task, Crew
import json

# Optional explicit LLM (uncomment if your environment supports it)
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)

llm = None  # Use CrewAI defaults unless you want to pass an explicit llm
print("CrewAI imported.")

### 5.1 Agent Prompts (Return ONLY JSON)

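These prompts request JSON-only output so that each handoff can be parsed. The model can still deviate from the format, so the orchestrator parses defensively and escalates when parsing fails.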

In [None]:
INTENT_AGENT_PROMPT = (
    "You are the Intent Classification Agent for a bank.\n"
    "Classify the customer's message into exactly one intent:\n"
    "- fraud\n- transaction\n- loan\n- general\n\n"
    "Also provide confidence: high / medium / low.\n\n"
    "Return ONLY valid JSON with keys: intent, confidence.\n\n"
    "Customer message: {customer_query}"
)

POLICY_AGENT_PROMPT = (
    "You are the Banking Rules & Policy Reasoning Agent.\n"
    "Apply the following static policy text:\n\n"
    f"{BANK_POLICY_TEXT}\n\n"
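    # Plain (non-f) string: CrewAI fills {customer_query} at kickoff
    "Customer message: {customer_query}\n"
    "The intent and confidence are in the previous task's output.\n\n"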
    "Return ONLY valid JSON with keys: allowed_actions (array), policy_notes (string).\n\n"
    "Rules:\n"
    "- If intent is fraud -> allowed_actions must include 'block_card', 'escalate_to_human'\n"
    "- If card is stolen/lost -> include 'block_card', 'escalate_to_human'\n"
    "- If transaction dispute -> include 'raise_dispute_ticket'\n"
    "- If loan -> include 'provide_info'\n"
    "- Never request PIN/OTP in notes."
)

RESPONSE_AGENT_PROMPT = (
    "You are the Response Drafting Agent for a bank.\n"
    "Draft a safe, cautious, customer-friendly response based on:\n"
    "- customer_query\n- intent\n- policy_notes\n- allowed_actions\n\n"
    "Customer message: {customer_query}\n\n"
    "Return ONLY valid JSON with key: draft_response\n\n"
    "Constraints:\n"
    "- Do NOT ask for PIN/OTP.\n"
    "- If fraud/stolen suspected: advise immediate card block and say a human agent will assist.\n"
    "- Keep response concise (5-9 lines max)."
)

ESCALATION_AGENT_PROMPT = (
    "You are the Risk & Escalation Agent.\n"
    "Decide whether to escalate to a human agent based on:\n"
    "- customer_query\n- intent\n- confidence\n- allowed_actions\n- draft_response\n\n"
    "Customer message: {customer_query}\n\n"
    "Escalate if:\n"
    "- intent is fraud OR 'escalate_to_human' in allowed_actions\n"
    "- confidence is low\n"
    "- the message suggests urgent risk (stolen card, unauthorized activity)\n\n"
    "Return ONLY valid JSON with keys: escalate (true/false), escalation_reason (string)."
)

print("Prompts ready.")
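
Because a prompt can only request a format, a small validation step is a useful complement. The sketch below checks an intent agent reply against the allowed label sets and falls back to a low-confidence `general` result, which the policy treats as a reason to escalate. `normalize_intent_output` is a hypothetical helper and the fallback choice is an assumption.

```python
import json
from typing import Dict

ALLOWED_INTENTS = {"fraud", "transaction", "loan", "general"}
ALLOWED_CONFIDENCE = {"high", "medium", "low"}

def normalize_intent_output(raw: str) -> Dict[str, str]:
    """Parse the intent agent's reply; on any problem return a low-confidence fallback."""
    fallback = {"intent": "general", "confidence": "low"}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    # Tolerate case and whitespace differences in otherwise valid labels
    intent = str(data.get("intent", "")).strip().lower()
    confidence = str(data.get("confidence", "")).strip().lower()
    if intent not in ALLOWED_INTENTS or confidence not in ALLOWED_CONFIDENCE:
        return fallback
    return {"intent": intent, "confidence": confidence}

print(normalize_intent_output('{"intent": "Fraud", "confidence": "HIGH"}'))
print(normalize_intent_output("I think this is fraud."))
```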

### 5.2 Create Agents

In [None]:
intent_agent = Agent(
    role="Intent Classification Agent",
    goal="Classify the customer's query into a banking intent with confidence.",
    backstory="Expert triage agent that labels banking requests correctly.",
    verbose=True,
    llm=llm
)

policy_agent = Agent(
    role="Banking Rules & Policy Reasoning Agent",
    goal="Apply static bank policies and decide allowed actions.",
    backstory="Compliance-aware agent trained on banking safety rules.",
    verbose=True,
    llm=llm
)

response_agent = Agent(
    role="Response Drafting Agent",
    goal="Draft a safe and helpful customer response based on policy and intent.",
    backstory="Customer communication expert focused on clarity and compliance.",
    verbose=True,
    llm=llm
)

escalation_agent = Agent(
    role="Risk & Escalation Agent",
    goal="Determine escalation based on risk, urgency, and uncertainty.",
    backstory="Risk specialist who prevents unsafe automation.",
    verbose=True,
    llm=llm
)

print("Agents created.")

### 5.3 Create Tasks + Crew (Sequential Collaboration)

In [None]:
import json
import re

def parse_json_strict(text: str) -> dict:
    """Parse agent output as JSON, falling back to the outermost {...} span."""
    text = str(text).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Models often wrap JSON in prose or code fences; try the outermost braces
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if not match:
            raise ValueError("No JSON object found in agent output.")
        return json.loads(match.group(0))

intent_task = Task(
    description=INTENT_AGENT_PROMPT,
    expected_output="JSON with keys: intent, confidence",
    agent=intent_agent
)

policy_task = Task(
    description=POLICY_AGENT_PROMPT,
    expected_output="JSON with keys: allowed_actions (array), policy_notes (string)",
    agent=policy_agent,
    context=[intent_task]  # explicit handoff: needs intent + confidence
)

response_task = Task(
    description=RESPONSE_AGENT_PROMPT,
    expected_output="JSON with key: draft_response",
    agent=response_agent,
    context=[intent_task, policy_task]  # needs intent + policy output
)

escalation_task = Task(
    description=ESCALATION_AGENT_PROMPT,
    expected_output="JSON with keys: escalate (true/false), escalation_reason (string)",
    agent=escalation_agent,
    context=[intent_task, policy_task, response_task]  # needs all prior outputs
)

crew = Crew(
    agents=[intent_agent, policy_agent, response_agent, escalation_agent],
    tasks=[intent_task, policy_task, response_task, escalation_task],
    verbose=True
)

print("Crew created.")
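
The regex fallback in `parse_json_strict` takes everything between the first `{` and the last `}`, which fails when an output contains several JSON objects or stray braces in prose. As a possible alternative, the sketch below uses the standard library's `json.JSONDecoder.raw_decode` to pull out each well-formed object in order. `extract_json_objects` is a hypothetical helper, not part of the original notebook.

```python
import json
from typing import Any, Dict, List

def extract_json_objects(text: str) -> List[Dict[str, Any]]:
    """Return every well-formed top-level JSON object found in text, in order."""
    decoder = json.JSONDecoder()
    objects: List[Dict[str, Any]] = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index where parsing stopped
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; keep scanning
            continue
        if isinstance(obj, dict):
            objects.append(obj)
        pos = end
    return objects

sample = 'Sure! {"a": {"b": 1}} and then {"c": [1, 2]} done {broken'
print(extract_json_objects(sample))  # [{'a': {'b': 1}}, {'c': [1, 2]}]
```

Unlike a non-greedy pattern such as `\{.*?\}`, this handles nested objects correctly because the JSON parser itself decides where each object ends.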


---

## 6) Orchestrator: Run the End-to-End Workflow

We run the crew and then build a final `Handoff` object by extracting the JSON outputs.
We also compute a deterministic risk score to support consistent escalation.

In [None]:
def run_banking_workflow(customer_query: str) -> Handoff:
    result = crew.kickoff(inputs={"customer_query": customer_query})

    def safe_fallback(reason: str) -> Handoff:
        # Safety-first fallback: escalate whenever structured output is unusable
        return Handoff(
            customer_query=customer_query,
            draft_response=str(result),
            escalate=True,
            escalation_reason=reason
        )

    # str(result) holds only the final task's output, so read each task's own
    # raw output instead (intent, policy, response, escalation).
    # Assumes a CrewAI version where Task.output exposes `.raw` after kickoff.
    tasks = [intent_task, policy_task, response_task, escalation_task]
    raw_outputs = [str(t.output.raw) for t in tasks if t.output is not None]

    if len(raw_outputs) < 4:
        return safe_fallback("Could not collect all four task outputs; escalating to be safe.")

    try:
        intent_out, policy_out, response_out, escalation_out = (
            parse_json_strict(raw) for raw in raw_outputs
        )
        handoff = Handoff(
            customer_query=customer_query,
            intent=intent_out.get("intent"),
            confidence=intent_out.get("confidence"),
            allowed_actions=policy_out.get("allowed_actions", []),
            policy_notes=policy_out.get("policy_notes"),
            draft_response=response_out.get("draft_response"),
            escalate=escalation_out.get("escalate"),
            escalation_reason=escalation_out.get("escalation_reason")
        )
    except (ValueError, AttributeError):
        # Malformed JSON, non-object JSON, or a schema violation (e.g. unknown intent)
        return safe_fallback("Could not parse structured outputs; escalating to be safe.")

    if handoff.intent and handoff.confidence:
        handoff.risk_score = simple_risk_score(customer_query, handoff.intent, handoff.confidence)

    # Deterministic escalation safeguard
    if handoff.risk_score is not None and handoff.risk_score >= 70:
        handoff.escalate = True
        if not handoff.escalation_reason:
            handoff.escalation_reason = "High risk score based on static rules."

    return handoff

print("Orchestrator ready.")

---

## 7) Tests: Sample Inputs & Outputs (End-to-End)

We run three representative queries: a failed payment, a loan inquiry, and an ambiguous balance complaint that may trigger escalation through low confidence.

In [None]:
TEST_QUERIES = [
    "My debit card payment failed but money was deducted.",
    "What is the interest rate on personal loans?",
    "My account balance looks incorrect.",
    
]

for q in TEST_QUERIES:
    print("\n" + "="*100)
    print("USER QUERY:", q)
    try:
        out = run_banking_workflow(q)
        print("\nFINAL HANDOFF OBJECT:")
        print(out.model_dump_json(indent=2))
    except Exception as e:
        print("ERROR:", e)

---

## 8) Architecture Explanation (1–2 pages of content)

You can copy-paste this section into your submission document.

### Agent Roles
1. **Intent Classification Agent** — identifies intent (fraud/transaction/loan/general) and confidence.  
2. **Banking Rules & Policy Reasoning Agent** — applies static policy text and rules to return allowed actions and policy notes.  
3. **Response Drafting Agent** — generates a safe, customer-friendly response using prior outputs and constraints.  
4. **Risk & Escalation Agent** — enforces a deliberate escalation decision gate using risk/urgency/uncertainty.

### Task Flow / Handoffs
The workflow is sequential with explicit dependencies:
- The intent classifier output (intent + confidence) is required by the policy agent.
- The policy output (allowed_actions + policy_notes) is required by the response agent.
- The draft response, together with the intent, confidence, and allowed actions, is required by the escalation agent.

The orchestrator then merges all four outputs into a final structured `Handoff` object that represents the end-to-end outcome.

### Escalation Logic
Escalation is based on:
- **Risk:** fraud/suspicious activity escalates automatically.
- **Urgency:** stolen card / hacked / unauthorized transactions increase risk.
- **Uncertainty:** low-confidence classification escalates to avoid unsafe automation.

Additionally, a deterministic `risk_score` (0–100) is computed from static rules for transparency and consistency. A score of 70 or higher forces escalation regardless of the escalation agent's decision.
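
The escalation triggers can be summarized as one pure function. This is a hypothetical consolidation for illustration: in the notebook, the intent, policy, and confidence checks are delegated to the escalation agent's prompt, and only the score threshold is enforced in code. The names `decide_escalation` and `RISK_THRESHOLD` are assumptions.

```python
from typing import List, Optional, Tuple

RISK_THRESHOLD = 70  # same cutoff the orchestrator applies to risk_score

def decide_escalation(
    intent: str,
    confidence: str,
    allowed_actions: List[str],
    risk_score: Optional[int],
    llm_escalate: bool,
) -> Tuple[bool, str]:
    """Return (escalate, reason), checking the triggers in priority order."""
    if intent == "fraud":
        return True, "Risk: fraud intent."
    if "escalate_to_human" in allowed_actions:
        return True, "Risk: policy requires human support."
    if confidence == "low":
        return True, "Uncertainty: low classification confidence."
    if risk_score is not None and risk_score >= RISK_THRESHOLD:
        return True, "Risk: static risk score at or above threshold."
    if llm_escalate:
        return True, "Escalation agent requested escalation."
    return False, "No escalation trigger matched."

print(decide_escalation("loan", "high", ["provide_info"], 0, False))
print(decide_escalation("general", "low", [], 20, False))
```

Every rule can only turn escalation on, never off, so a rule-based trigger cannot be overridden by an optimistic LLM decision.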

### Sample Inputs & Outputs
This notebook runs three representative queries (two routine, one ambiguous) and prints the final structured output for each, demonstrating the end-to-end flow.

### Design Rationale
Responsibilities are separated to reduce hallucination risk and improve auditability:
- Classification is separated from policy reasoning.
- Response drafting is separated from escalation gating.

This implements a goal-oriented multi-agent workflow with controlled decision-making.