# Rc2Rp: Receipt to Report Extraction & Analysis

## Project Overview

Rc2Rp (Receipt to Report) is a multi-agent AI system that extracts, categorizes, analyzes, and summarizes receipt data into structured expense reports.

**Members:** Yuan Ding  
**Track:** Enterprise Agents  
**Course:** 5-Day AI Agents Intensive with Google  
**Date:** November 2025

---

## System Components

| Component                | Purpose |
| ------------------------ | ------- |
| **Coordinator Agent**    | Orchestrates the entire “receipt → report” workflow |
| **Extraction Agent**     | Performs OCR/text parsing and extracts structured fields |
| **Categorization Agent** | Assigns each expense item to the correct accounting/expense category |
| **Analysis Agent**       | Aggregates spend by category and flags notable spending patterns |
| **Report Agent**         | Creates the final expense report (tables, JSON, summaries) |
| **Validation Agent**     | Performs quality checks and flags items requiring human review |

---

### ✔️ Key Concepts Demonstrated

1. Multi-Agent System – Coordinator orchestrates Extraction, Categorization, Analysis, Report, and Validation agents.
2. Custom Tools – Specialized tools for receipt extraction, categorization, analysis, and validation.
3. Sessions & Memory – Session-based workflow storing receipts, results, and processing context.
4. Observability – Tracks tool calls, processing time, errors, validation issues, and session logs.
5. Agent Reset & State Management – Full system reset function clearing sessions, metrics, memory, and configuration.

---

## **Environment Setup**

In [1]:
import sys
import os
import time
import json
from datetime import datetime
from typing import Dict, List, Any, Optional
from dataclasses import dataclass, field
import warnings

warnings.filterwarnings('ignore')

import google.generativeai as genai
from google.generativeai.types import FunctionDeclaration, Tool
from kaggle_secrets import UserSecretsClient
from IPython.display import display, HTML, clear_output

print("✓ Libraries Loaded")


✓ Libraries Loaded


## **API Configuration**

In [2]:
import os

# Try to get API key from environment variable first, then from Kaggle secrets
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")

if not GOOGLE_API_KEY:
    try:
        from kaggle_secrets import UserSecretsClient
        GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
        os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
    except Exception as e:
        print(f"⚠️ Warning: Could not load API key from Kaggle secrets: {e}")
        print("💡 Please set GOOGLE_API_KEY environment variable or configure Kaggle secrets")

if GOOGLE_API_KEY:
    os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
    genai.configure(api_key=GOOGLE_API_KEY)
    print("✅ Gemini API key setup complete.")
else:
    print("❌ Error: GOOGLE_API_KEY not found. Please set it as an environment variable.")

✅ Gemini API key setup complete.


In [3]:
from typing import Any, Dict

from google.adk.agents import Agent, LlmAgent
from google.adk.apps.app import App, EventsCompactionConfig
from google.adk.models.google_llm import Gemini
from google.adk.sessions import DatabaseSessionService
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.adk.tools.tool_context import ToolContext
from google.genai import types

print("✅ ADK components imported successfully.")

✅ ADK components imported successfully.


In [4]:
agent_config: Dict[str, Any] = {
    "model_name": "models/gemini-2.5-flash",
    "temperature": 0.3,
    "enable_validation_agent": True,
    "enable_analysis_agent": True,
}

CONFIG = {
    "team": "rc2rp",
    "model": agent_config["model_name"],
    "max_tokens": 20000,
    "temperature": agent_config["temperature"],
    "version": "2.0.0",
}

for k, v in CONFIG.items():
    print(f"{k:.<20} {v}")


GEMINI_MODEL_NAME = agent_config["model_name"]
gemini_model = Gemini(model=GEMINI_MODEL_NAME)



team................ rc2rp
model............... models/gemini-2.5-flash
max_tokens.......... 20000
temperature......... 0.3
version............. 2.0.0


# 1. Data models (for reasoning / documentation)


In [5]:
@dataclass
class ReceiptItem:
    description: str
    quantity: float
    unit_price: float
    total: float
    category: Optional[str] = None


@dataclass
class ParsedReceipt:
    raw_input: str
    merchant: str
    date: str
    currency: str
    subtotal: float
    tax: float
    total: float
    items: List[ReceiptItem] = field(default_factory=list)


@dataclass
class ExpenseReport:
    receipts: List[ParsedReceipt]
    summary_by_category: Dict[str, float]
    grand_total: float
    currency: str
    insights: str
    validation_issues: List[str] = field(default_factory=list)
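
To show how these records fit together, here is a minimal sketch that fills the dataclasses with the two sample receipts used later in this notebook and sums item totals per category. `summarize_by_category` is a hypothetical helper, not part of the notebook, and the two dataclasses are repeated only so the block runs on its own.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ReceiptItem:
    description: str
    quantity: float
    unit_price: float
    total: float
    category: Optional[str] = None


@dataclass
class ParsedReceipt:
    raw_input: str
    merchant: str
    date: str
    currency: str
    subtotal: float
    tax: float
    total: float
    items: List[ReceiptItem] = field(default_factory=list)


def summarize_by_category(receipts: List[ParsedReceipt]) -> Dict[str, float]:
    """Sum item totals per category; uncategorized items fall under 'Other'."""
    summary: Dict[str, float] = defaultdict(float)
    for receipt in receipts:
        for item in receipt.items:
            summary[item.category or "Other"] += item.total
    return {name: round(amount, 2) for name, amount in summary.items()}


starbucks = ParsedReceipt(
    raw_input="Starbucks NYC ...", merchant="Starbucks NYC", date="2025-11-20",
    currency="USD", subtotal=12.00, tax=0.96, total=12.96,
    items=[
        ReceiptItem("Latte", 1.0, 5.00, 5.00, "Meals & Entertainment"),
        ReceiptItem("Sandwich", 1.0, 7.00, 7.00, "Meals & Entertainment"),
    ],
)
uber = ParsedReceipt(
    raw_input="Uber ride ...", merchant="Uber", date="2025-11-20",
    currency="USD", subtotal=48.50, tax=0.0, total=48.50,
    items=[ReceiptItem("Ride from JFK to Manhattan", 1.0, 48.50, 48.50, "Transportation")],
)

summary = summarize_by_category([starbucks, uber])
grand_total = round(sum(r.total for r in (starbucks, uber)), 2)
print(summary, grand_total)
```

Note that this sketch sums item totals, so tax appears in `grand_total` but not in the per-category figures.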

# 2. Custom tools (function tools) for the agents


In [6]:
def ensure_api_key_configured():
    """Ensure genai is configured with API key"""
    api_key = os.environ.get("GOOGLE_API_KEY")
    if api_key:
        try:
            genai.configure(api_key=api_key)
        except Exception:
            pass  # Already configured
    else:
        raise ValueError("GOOGLE_API_KEY not found in environment variables")

In [7]:
def parse_receipt_text(raw_text: str) -> str:
    """
    Parse raw receipt text into a structured JSON string.

    Args:
        raw_text (str): OCR text or plain receipt text.
    Returns:
        str: JSON string with structured receipt fields:
            merchant, date, currency, subtotal, tax, total, items[list].
    """
    print("🔍 [Extraction Agent] Parsing receipt text...")
    ensure_api_key_configured()
    prompt = (
        f"Parse the following receipt text into a structured JSON format.\n\n"
        f"Receipt Text:\n{raw_text}\n\n"
        f"Extract and return a JSON object with the following structure:\n"
        f"- merchant: string (business name)\n"
        f"- date: string (YYYY-MM-DD format)\n"
        f"- currency: string (ISO format like USD, EUR, CAD)\n"
        f"- subtotal: float (amount before tax)\n"
        f"- tax: float (tax amount)\n"
        f"- total: float (final total amount)\n"
        f"- items: array of objects, each with:\n"
        f"  - description: string\n"
        f"  - quantity: float\n"
        f"  - unit_price: float\n"
        f"  - total: float\n\n"
        f"If information is missing, make reasonable estimates and mark estimated fields in a 'meta' object.\n"
        f"Return ONLY valid JSON, no additional text or explanation."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    result = model.generate_content(prompt).text
    return result
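
Despite the "Return ONLY valid JSON" instruction, model replies often arrive wrapped in a Markdown code fence, as the ParserAgent output later in this notebook shows. Below is a minimal sketch of a tolerant parser, assuming the reply is either bare JSON or a single fenced block. `extract_json` is a hypothetical helper, not part of the notebook or the Gemini SDK.

```python
import json
import re
from typing import Any

# The fence marker is built indirectly so this example can itself sit inside a fenced block.
FENCE = "`" * 3
_FENCED = re.compile(rf"^{FENCE}(?:json)?\s*\n(.*?)\n?{FENCE}$", re.DOTALL)


def extract_json(model_text: str) -> Any:
    """Parse JSON from a model reply, tolerating one surrounding code fence."""
    text = model_text.strip()
    match = _FENCED.match(text)
    if match:
        text = match.group(1)
    return json.loads(text)  # raises json.JSONDecodeError on malformed replies


reply = f"{FENCE}json\n" + '{"merchant": "Starbucks NYC", "total": 12.96}' + f"\n{FENCE}"
receipt = extract_json(reply)
print(receipt["merchant"], receipt["total"])
```

Letting `json.JSONDecodeError` propagate is a deliberate choice here: a malformed reply is better surfaced to the caller than silently replaced with an empty receipt.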


In [8]:
def categorize_line_items(
    items: List[Dict[str, Any]],
) -> str:
    """
    Assign an expense category to each line item using AI.

    Args:
        items: List of dicts with fields like description and total.
    Returns:
        str: JSON array of categories, one per item in the same order.
    """
    print(f"📂 [Categorization Agent] Categorizing {len(items)} items...")
    ensure_api_key_configured()
    
    items_str = json.dumps(items, indent=2)
    prompt = (
        f"Assign expense categories to each line item in the following receipt items.\n\n"
        f"Items:\n{items_str}\n\n"
        f"Use this corporate expense taxonomy:\n"
        f"- Transportation (rides, taxis, flights, parking)\n"
        f"- Lodging (hotels, inns, resorts)\n"
        f"- Meals & Entertainment (restaurants, cafes, meals, coffee)\n"
        f"- Office Supplies (stationery, equipment)\n"
        f"- Software & Cloud (subscriptions, software licenses)\n"
        f"- Other (anything that doesn't fit above categories)\n\n"
        f"Return a JSON array of category strings, one for each item in the same order.\n"
        f"Example: [\"Meals & Entertainment\", \"Transportation\", \"Other\"]\n"
        f"Return ONLY the JSON array, no additional text."
    )
    model = genai.GenerativeModel(CONFIG['model'])
    result = model.generate_content(prompt).text
    return result
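
Because `categorize_line_items` returns only a JSON array of labels, the caller still has to attach each label to its item. One possible way to do that is sketched below, assuming the array is already bare JSON. `apply_categories` and `ALLOWED_CATEGORIES` are hypothetical names introduced for illustration.

```python
import json
from typing import Any, Dict, List

# Mirrors the taxonomy listed in the prompt above.
ALLOWED_CATEGORIES = {
    "Transportation",
    "Lodging",
    "Meals & Entertainment",
    "Office Supplies",
    "Software & Cloud",
    "Other",
}


def apply_categories(
    items: List[Dict[str, Any]], categories_json: str
) -> List[Dict[str, Any]]:
    """Return copies of the items with a 'category' field taken from the JSON array."""
    categories = json.loads(categories_json)
    if len(categories) != len(items):
        raise ValueError(
            f"Expected {len(items)} categories, got {len(categories)}"
        )
    merged = []
    for item, category in zip(items, categories):
        # Labels outside the taxonomy are mapped to "Other" rather than trusted.
        label = category if category in ALLOWED_CATEGORIES else "Other"
        merged.append({**item, "category": label})
    return merged


items = [
    {"description": "Latte", "total": 5.00},
    {"description": "Airport ride", "total": 48.50},
    {"description": "Gift card", "total": 20.00},
]
labeled = apply_categories(
    items, '["Meals & Entertainment", "Transportation", "Gifts"]'
)
print([entry["category"] for entry in labeled])
```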

In [9]:
def validate_receipt_totals(
    subtotal: float,
    tax: float,
    total: float,
) -> str:
    """
    Check basic arithmetic consistency of a receipt and generate validation report.

    Returns:
        str: JSON string with validation results:
        - ok: bool
        - difference: float
        - analysis: str (AI-generated analysis if issues found)
    """
    print(f"✅ [Validation Agent] Validating receipt totals (Subtotal: ${subtotal:.2f}, Tax: ${tax:.2f}, Total: ${total:.2f})...")
    
    # Basic arithmetic check
    expected = subtotal + tax
    diff = float(round(total - expected, 2))
    is_valid = abs(diff) <= 0.01
    
    if is_valid:
        print("   ✓ Validation passed")
        return json.dumps({"ok": True, "difference": diff, "analysis": "All totals are consistent."})
    else:
        print(f"   ⚠ Validation failed, difference: ${diff:.2f}")
        # Use LLM to analyze the discrepancy
        ensure_api_key_configured()

        prompt = (
            f"Analyze this receipt validation issue:\n\n"
            f"Subtotal: ${subtotal:.2f}\n"
            f"Tax: ${tax:.2f}\n"
            f"Expected Total: ${expected:.2f}\n"
            f"Actual Total: ${total:.2f}\n"
            f"Difference: ${diff:.2f}\n\n"
            f"Provide a brief analysis of what might have caused this discrepancy. "
            f"Possible reasons: rounding errors, missing items, tax calculation errors, etc.\n"
            f"Return a concise explanation (1-2 sentences)."
        )
        model = genai.GenerativeModel(CONFIG['model'])
        analysis = model.generate_content(prompt).text
        
        result = {
            "ok": False,
            "difference": diff,
            "analysis": analysis.strip()
        }
        return json.dumps(result)
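
The arithmetic part of this check needs no model call. As a worked example, the sketch below applies the same rule (total minus subtotal plus tax, within one cent) to the Starbucks receipt used later in this notebook and to a deliberately wrong total. The function names are hypothetical.

```python
def totals_difference(subtotal: float, tax: float, total: float) -> float:
    """Difference between the stated total and subtotal + tax, rounded to cents."""
    return round(total - (subtotal + tax), 2)


def totals_consistent(
    subtotal: float, tax: float, total: float, tolerance: float = 0.01
) -> bool:
    """True when the stated total is within the tolerance of subtotal + tax."""
    return abs(totals_difference(subtotal, tax, total)) <= tolerance


# Starbucks receipt: 12.00 + 0.96 matches the stated total of 12.96.
print(totals_consistent(12.00, 0.96, 12.96))

# A total of 13.50 is 0.54 too high, so the check fails.
print(totals_consistent(12.00, 0.96, 13.50), totals_difference(12.00, 0.96, 13.50))
```

Rounding the difference to two decimals before comparing keeps binary floating-point noise from triggering a false failure.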

In [10]:
def save_receipt_artifact(
    artifact_name: str,
    receipt_json: Dict[str, Any],
    tool_context: ToolContext,
) -> str:
    """
    Save structured receipt JSON as an artifact for later retrieval.

    Args:
        artifact_name: Name/key under which to store the artifact.
        receipt_json: Parsed receipt structure.
        tool_context: ADK ToolContext for artifact operations.
    Returns:
        str: The artifact name for reference.
    """
    print(f"💾 [Extraction Agent] Saving receipt artifact: {artifact_name}")
    # Depending on ADK version there might be save_artifact / add_artifact; adapt as needed.
    # This is here mainly to demonstrate ToolContext usage in your capstone.
    tool_context.save_artifact(
        artifact_name,
        json.dumps(receipt_json, ensure_ascii=False, indent=2),
    )
    return artifact_name

In [11]:
# 2.5. Function Declarations for tools
function_declarations = [
    FunctionDeclaration(
        name="parse_receipt_text",
        description="Parse raw receipt text into structured JSON format with merchant, date, currency, totals, and line items",
        parameters={
            "type": "object",
            "properties": {
                "raw_text": {
                    "type": "string",
                    "description": "Raw receipt text from OCR or plain text input"
                }
            },
            "required": ["raw_text"]
        }
    ),
    FunctionDeclaration(
        name="categorize_line_items",
        description="Assign expense categories to each line item in a receipt using corporate expense taxonomy",
        parameters={
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "description": "List of receipt items, each with description, quantity, unit_price, and total",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "quantity": {"type": "number"},
                            "unit_price": {"type": "number"},
                            "total": {"type": "number"}
                        }
                    }
                }
            },
            "required": ["items"]
        }
    ),
    FunctionDeclaration(
        name="validate_receipt_totals",
        description="Validate arithmetic consistency of receipt totals and generate analysis if discrepancies are found",
        parameters={
            "type": "object",
            "properties": {
                "subtotal": {
                    "type": "number",
                    "description": "Subtotal amount before tax"
                },
                "tax": {
                    "type": "number",
                    "description": "Tax amount"
                },
                "total": {
                    "type": "number",
                    "description": "Final total amount"
                }
            },
            "required": ["subtotal", "tax", "total"]
        }
    ),
    FunctionDeclaration(
        name="save_receipt_artifact",
        description="Save structured receipt JSON as an artifact for later retrieval in the session",
        parameters={
            "type": "object",
            "properties": {
                "artifact_name": {
                    "type": "string",
                    "description": "Name/key under which to store the artifact"
                },
                "receipt_json": {
                    "type": "object",
                    "description": "Parsed receipt structure as JSON object"
                }
            },
            "required": ["artifact_name", "receipt_json"]
        }
    )
]

tools = Tool(function_declarations=function_declarations)
print(f"✓ Function Declarations Created ({len(function_declarations)} tools)")

✓ Function Declarations Created (4 tools)


# 3. Helper to build user content for the Runner


In [12]:
from google.genai import types

def build_user_content_from_text(text: str) -> types.UserContent:
    """
    Constructs a UserContent object for the Runner.
    
    Note: ADK Part uses 'text' parameter, not 'raw_text'.
    """
    try:
        return types.UserContent(
            parts=[types.Part(text=text)] 
        )
    except Exception:
        return types.UserContent(text)

# 4. Define specialized LlmAgents


In [13]:
retry_config = types.HttpRetryOptions(
    attempts=5,  # Maximum retry attempts
    exp_base=7,  # Delay multiplier
    initial_delay=1,
    http_status_codes=[429, 500, 503, 504], # Retry on these HTTP errors
)
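
To get a feel for what `exp_base=7` implies, the sketch below computes the waits between attempts under the assumption of plain exponential growth, `initial_delay * exp_base ** retry`. This is an illustration, not the library's implementation: the actual client may add jitter and cap each wait at a maximum delay, which the optional `max_delay` argument here (a hypothetical parameter of this helper) only approximates.

```python
from typing import List, Optional


def backoff_schedule(
    attempts: int,
    initial_delay: float,
    exp_base: float,
    max_delay: Optional[float] = None,
) -> List[float]:
    """Waits (in seconds) before each retry, assuming pure exponential growth."""
    delays = []
    # The first attempt runs immediately, so there are attempts - 1 waits.
    for retry in range(attempts - 1):
        delay = initial_delay * exp_base ** retry
        if max_delay is not None:
            delay = min(delay, max_delay)
        delays.append(delay)
    return delays


print(backoff_schedule(attempts=5, initial_delay=1, exp_base=7))
# [1, 7, 49, 343]
print(backoff_schedule(attempts=5, initial_delay=1, exp_base=7, max_delay=60))
# [1, 7, 49, 60]
```

Without a cap, the last wait alone would be almost six minutes, which is worth keeping in mind when choosing a base this large for an interactive notebook.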

In [14]:
parse_text_agent = Agent(
    name="ParserAgent",
    model=Gemini(
        model="gemini-2.5-flash",
        retry_options=retry_config
    ),
    instruction="""You are the Receipt Extraction Agent in the Rc2Rp pipeline.
    
    Your responsibilities:
    1. Take raw receipt text (often noisy, from OCR or email).
    2. When appropriate, call the `parse_receipt_text` tool to produce a FIRST PASS
       structured JSON (merchant, date, currency, subtotal, tax, total, items).
    3. Refine / correct that structure using your own reasoning.
    4. Return a CLEAN JSON object only, no extra prose.
    
    IMPORTANT: After you return the JSON, the Coordinator will automatically pass it to 
    the Categorization Agent. You do NOT need to call the next agent yourself.
    
    Output format - Return ONLY valid JSON:
    {
      "merchant": "string",
      "date": "YYYY-MM-DD",
      "currency": "ISO code (USD, EUR, etc.)",
      "subtotal": float,
      "tax": float,
      "total": float,
      "items": [
        {
          "description": "string",
          "quantity": float,
          "unit_price": float,
          "total": float
        }
      ]
    }
    
    Always:
    - Normalize currency codes to ISO format (e.g., USD, EUR, CAD).
    - Ensure numeric fields are floats.
    - For each item, include: description, quantity, unit_price, total.
    - If information is missing, make a reasonable best-effort guess and mark
      a field `estimated: true` in a separate `meta` block.
    - Return ONLY the JSON object, no explanation, no markdown, no code blocks.""",
    output_key="json_output",  # The result of this agent will be stored in the session state with this key.
)

print("✅ parse_text_agent created.")

✅ parse_text_agent created.


In [15]:
receipt_cat_agent = Agent(
    name="receipt_cate_agent",
    model=Gemini(
        model="gemini-2.5-flash",
        retry_options=retry_config
    ),
    # The `{json_output}` placeholder automatically injects the state value from the Extraction Agent's output.
    instruction="""You are the Categorization Agent in the Rc2Rp pipeline.
    
    Input:
    - You will receive a structured receipt JSON from the Extraction Agent: {json_output}
    - The JSON contains: merchant, date, currency, subtotal, tax, total, and an `items` array.
    
    Your tasks:
    1. For each item in the `items` array, decide the most appropriate category.
    2. Use this corporate expense taxonomy:
       - "Transportation" (rides, taxis, flights, parking, Uber, Lyft)
       - "Lodging" (hotels, inns, resorts, Airbnb)
       - "Meals & Entertainment" (restaurants, cafes, meals, coffee, food)
       - "Office Supplies" (stationery, equipment, office materials)
       - "Software & Cloud" (subscriptions, software licenses, cloud services)
       - "Other" (anything that doesn't fit above categories)
    3. You may call `categorize_line_items` tool to help classify, but refine using your reasoning.
    4. Add a "category" field to EACH item in the items array.
    5. Return the COMPLETE receipt JSON with categories added.
    
    Output format - Return ONLY valid JSON (same structure as input, with categories added):
    {
      "merchant": "string",
      "date": "YYYY-MM-DD",
      "currency": "ISO code",
      "subtotal": float,
      "tax": float,
      "total": float,
      "items": [
        {
          "description": "string",
          "quantity": float,
          "unit_price": float,
          "total": float,
          "category": "Transportation" | "Lodging" | "Meals & Entertainment" | "Office Supplies" | "Software & Cloud" | "Other"
        }
      ]
    }""",
    output_key="category_draft",  # The result of this agent will be stored with this key.
)

print("✅ receipt_cate_agent created.")

✅ receipt_cate_agent created.


In [16]:
expense_ana_agent = Agent(
    name="expense_anaAgent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config
    ),
    # This agent receives the `{category_draft}` from the Categorization Agent's output.
    instruction="""You are the Analysis Agent in the Rc2Rp pipeline.
    
    You receive:
    - An array of structured receipts, each with items and categories: {category_draft}
    
    Your tasks:
    1. Aggregate total spend by category.
    2. Compute overall grand total.
    3. Identify notable patterns:
       - unusually high categories
       - potential policy violations (e.g., too many Meals & Entertainment)
       - tax proportion vs subtotal.
    
    Return:
    - A compact JSON object:
      {
        "summary_by_category": {category: amount, ...},
        "grand_total": <float>,
        "currency": "<ISO>",
        "insights_bullets": [
          "bullet 1",
          "bullet 2",
          "bullet 3"
        ]
      }""",
    output_key="analysis_output",  # Stored in session state and consumed by the Report Agent.
)

print("✅ expense_ana_agent created.")

✅ expense_ana_agent created.


In [17]:
report_agent = Agent(
    name="report_agent",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config
    ),
    # This agent receives the `{analysis_output}` from the Analysis Agent's output.
    instruction="""You are the Reporting Agent in the Rc2Rp pipeline.
    
    Input: {analysis_output}
    - A JSON payload containing:
      - receipts: array of structured receipts
      - analysis: output from the Analysis Agent
    
    Output:
    - A FINAL JSON expense report:
      {
        "report_title": "Rc2Rp Expense Summary",
        "report_period": "<string or inferred>",
        "currency": "<ISO>",
        "grand_total": <float>,
        "summary_by_category": {...},
        "top_merchants": [
          {"merchant": "...", "total": <float>},
          ...
        ],
        "insights": "short paragraph",
        "insights_bullets": ["...", "..."],
        "raw_receipts_embedded": true
      }
    
    Keep the natural language short and business-like.""",
    output_key="final_output",  # This is the final output of the entire pipeline.
)

print("✅ report_agent created.")

✅ report_agent created.


In [18]:
from google.adk.agents import Agent, SequentialAgent, ParallelAgent, LoopAgent
from google.adk.runners import InMemoryRunner

root_agent = SequentialAgent(
    name="r2rPipeline",
    sub_agents=[parse_text_agent, receipt_cat_agent, expense_ana_agent, report_agent],
)

print("✅ Sequential Agent created.")

✅ Sequential Agent created.
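
The pipeline relies on each agent writing its result to session state under its `output_key`, and on the next agent's instruction reading it back through a `{placeholder}`. The sketch below simulates that hand-off in plain Python with fake models. It is a simplified illustration under stated assumptions, not ADK's implementation: `fill_placeholders`, `run_pipeline`, and the `user_input` key are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (instruction template, output_key, fake model that maps a prompt to a reply)
Step = Tuple[str, str, Callable[[str], str]]


def fill_placeholders(template: str, state: Dict[str, str]) -> str:
    """Replace each {key} in the template with the matching state value."""
    for key, value in state.items():
        template = template.replace("{" + key + "}", value)
    return template


def run_pipeline(steps: List[Step], user_input: str) -> Dict[str, str]:
    """Run the steps in order, storing each reply in state under its output key."""
    state = {"user_input": user_input}
    for instruction, output_key, respond in steps:
        prompt = fill_placeholders(instruction, state)
        state[output_key] = respond(prompt)
    return state


steps: List[Step] = [
    ("Parse this receipt: {user_input}", "json_output",
     lambda prompt: "parsed<" + prompt + ">"),
    ("Categorize: {json_output}", "category_draft",
     lambda prompt: "categorized<" + prompt + ">"),
    ("Analyze: {category_draft}", "analysis_output",
     lambda prompt: "analyzed<" + prompt + ">"),
]

state = run_pipeline(steps, "Latte $5.00")
print(state["analysis_output"])
```

Simple string replacement is used instead of `str.format` because the real instructions contain literal JSON braces, which `str.format` would try to interpret as fields.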


In [21]:
runner = InMemoryRunner(agent=root_agent)
response = await runner.run_debug(
    """Here are two receipts:

1) Starbucks NYC, 2025-11-20
   1x Latte $5.00
   1x Sandwich $7.00
   Tax $0.96
   Total $12.96 USD

2) Uber ride from JFK to Manhattan, 2025-11-20
   Total $48.50 USD"""
)


 ### Created new session: debug_session_id

User > Here are two receipts:

1) Starbucks NYC, 2025-11-20
   1x Latte $5.00
   1x Sandwich $7.00
   Tax $0.96
   Total $12.96 USD

2) Uber ride from JFK to Manhattan, 2025-11-20
   Total $48.50 USD
ParserAgent > ```json
{
  "merchant": "Starbucks NYC",
  "date": "2025-11-20",
  "currency": "USD",
  "subtotal": 12.00,
  "tax": 0.96,
  "total": 12.96,
  "items": [
    {
      "description": "Latte",
      "quantity": 1.0,
      "unit_price": 5.00,
      "total": 5.00
    },
    {
      "description": "Sandwich",
      "quantity": 1.0,
      "unit_price": 7.00,
      "total": 7.00
    }
  ]
}
```
receipt_cate_agent > ```json
{
  "merchant": "Starbucks NYC",
  "date": "2025-11-20",
  "currency": "USD",
  "subtotal": 12.00,
  "tax": 0.96,
  "total": 12.96,
  "items": [
    {
      "description": "Latte",
      "quantity": 1.0,
      "unit_price": 5.00,
      "total": 5.00,
      "category": "Meals & Entertainment"
    },
    {
      "descriptio