<a href="https://colab.research.google.com/github/mukul-mschauhan/GenerativeAI/blob/main/AI_Tax_Audit_MultiAgent_CrewAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Tax Research & Audit Support Assistant (Multi‑Agent Demo)
### Beginner-friendly CrewAI + Tavily + Gradio (No Web Scraping)

**What you’ll learn**
- What a multi-agent system is (specialized roles collaborating)
- How to wire CrewAI Agents + Tasks into a simple pipeline
- How to use TavilySearchTool for web search (no BeautifulSoup / no scraping)
- How to add Responsible AI guardrails so outputs remain safe & audit-friendly

> ⚠️ **Disclaimer:** This notebook is for audit support and learning only. It is **NOT** legal or tax advice. Always verify with official authority publications.


## Objectives (for the multi-agent system)
1. Accept a tax audit scenario/query + jurisdiction.
2. Use a Research Agent to search the web via Tavily and return relevant sources/snippets.
3. Use an Analysis Agent to extract audit-relevant points (obligations, exemptions, penalties).
4. Use a Writer Agent to produce a structured, citation-backed audit summary.
5. Enforce Responsible AI & guardrails:
   - No hallucinated laws/sections
   - No tax planning/avoidance or filing advice
   - Explicit uncertainty when sources are weak
   - Citations + “Last verified” timestamp
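
Objective 5 asks for citations and a “Last verified” timestamp. As a minimal sketch of what such a check could look like, the snippet below builds the timestamp line and lists inline markers such as [3] that have no matching entry under the citations heading. The helper names `last_verified_line` and `uncited_markers` are hypothetical and are not part of this notebook's cells; the heading text `G) Citations` is taken from the report structure this notebook uses.

```python
import re
import time

def last_verified_line() -> str:
    """Build a 'Last verified on ...' line with the current UTC time."""
    return "Last verified on " + time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime())

def uncited_markers(report: str, citations_heading: str = "G) Citations") -> list:
    """Return inline citation numbers used in the body that have no entry
    after the citations heading (hypothetical helper, for illustration)."""
    body, _, citations = report.partition(citations_heading)
    used = set(re.findall(r"\[(\d+)\]", body))
    listed = set(re.findall(r"\[(\d+)\]", citations))
    return sorted(used - listed, key=int)

sample_report = (
    "B) Key Provisions & Obligations\n"
    "The threshold applies [1] and a penalty may apply [3].\n"
    "G) Citations\n"
    "[1] https://example.gov/a\n"
    "[2] https://example.gov/b\n"
)
missing = uncited_markers(sample_report)
print(missing)
```

A non-empty result suggests the report cites a source it never lists, which is a reasonable trigger for a retry or a manual review.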


In [None]:
# Cell 1 — Install dependencies
# CrewAI tools are installed via extras. Tavily tool requires tavily-python.
!pip -q install -U "crewai[tools,openai]" tavily-python gradio langchain-tavily langchain-openai langchain-community

In [None]:
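# Cell 2 — Configure API keys (safe prompting)
import os, getpass

def _get_secret(name: str) -> str:
    """Read a key from Colab secrets, then the environment; otherwise prompt without echo."""
    try:
        from google.colab import userdata
        value = userdata.get(name)
    except Exception:  # not running in Colab, or the secret is missing / not shared
        value = None
    return value or os.environ.get(name) or getpass.getpass(f"{name}: ")

openai_api_key = _get_secret("OPENAI_API_KEY")

# OpenAI-compatible endpoint used in this demo; set to None for the default OpenAI endpoint.
OPENAI_BASE_URL = "https://aibe.mygreatlearning.com/openai/v1"

os.environ["OPENAI_API_KEY"] = openai_api_key
os.environ["TAVILY_API_KEY"] = _get_secret("TAVILY_API_KEY")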


## Responsible AI & Guardrails (what we enforce)
- A “NOT legal/tax advice” banner on every result
- No tax planning / avoidance (e.g., “how to reduce tax”, “loopholes”, “evade”)
- No filing instructions (step-by-step returns/payment guidance)
- No hallucinated sections/penalties: if not in sources/snippets, say “Not found in sources”
- Prefer official domains when possible (gov / tax authority)
- Always include Last verified timestamp + citations
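
Two of these rules are easy to illustrate in plain Python. The sketch below is an illustration under stated assumptions, not this notebook's actual guardrail code: `first_disallowed_match` screens a piece of text against a few example patterns, and `is_official_domain` tests whether a URL's host belongs to a list of official domains. Both helper names and the short sample pattern list are hypothetical.

```python
import re
from urllib.parse import urlparse

# Illustrative patterns only (assumption): a real deployment would tune this list.
EXAMPLE_PATTERNS = [
    r"how to (reduce|minimi[sz]e|avoid) tax",
    r"loophole",
    r"evad(e|ing)",
]

def first_disallowed_match(text: str, patterns=EXAMPLE_PATTERNS):
    """Return the first pattern that matches `text` (case-insensitive), else None."""
    lowered = text.lower()
    for pat in patterns:
        if re.search(pat, lowered):
            return pat
    return None

def is_official_domain(url: str, official_domains) -> bool:
    """True if the URL's host equals, or is a subdomain of, one of `official_domains`."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in official_domains)

print(first_disallowed_match("How to avoid tax on rental income?"))
print(is_official_domain("https://www.irs.gov/pub/p17", ["irs.gov"]))
```

Matching on the parsed host rather than on a substring of the URL avoids treating look-alike hosts such as `irs.gov.example.com` as official.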


In [None]:
# Cell 3 — Imports + shared helpers
import time, re
from typing import Any, Dict, Tuple

from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import TavilySearchTool

def now_utc_str() -> str:
    return time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime())

DISCLAIMER = (
    "This is NOT legal or tax advice. Use for audit support only and verify with official authority publications."
)

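# Simple domain preference hints (kept intentionally small for beginners).
# Reference data only: the tasks in Cell 7 ask for official domains in their prompt and do not read this mapping.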
PREFERRED_DOMAINS = {
    "UAE": ["tax.gov.ae", "u.ae", "mof.gov.ae", "uaecabinet.ae"],
    "India": ["incometax.gov.in", "cbic.gov.in", "gst.gov.in", "indiacode.nic.in", "egazette.nic.in"],
    "US": ["irs.gov", "treasury.gov", "govinfo.gov", "ecfr.gov"],
    "UK": ["gov.uk", "hmrc.gov.uk", "legislation.gov.uk"],
    "Other": []
}

# Basic disallowed patterns (anti-avoidance + anti-filing-instructions)
DISALLOWED_PATTERNS = [
    r"how to (reduce|minimi[sz]e|avoid) tax",
    r"tax saving",
    r"loophole",
    r"evad(e|ing)",
    r"step[- ]by[- ]step (fil(e|ing)|submit|pay)",
]


In [None]:
# Cell 4 — Tavily Search Tool (no scraping)
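# Advanced search + include_answer for quick synthesis. Snippets are used for citations.
# Cell 7 builds its own tool instance with the caller-selected max_results; this one shows the reference settings.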
tavily_tool = TavilySearchTool(
    search_depth="advanced",
    max_results=6,
    include_answer=True,
    include_raw_content=False,  # keep OFF for beginner simplicity (no raw HTML)
    timeout=60
)


In [None]:
# Cell 5 — LLM setup with optional OPENAI_BASE_URL

def make_llm() -> LLM:
    api_key = openai_api_key
    base_url = OPENAI_BASE_URL

    # Prefer a cheap/fast model for demos.
    candidates = ["openai/gpt-4o-mini", "gpt-4o-mini"]

    last_err = None
    for model_name in candidates:
        try:
            if base_url:
                return LLM(model=model_name, api_key=api_key, base_url=base_url,
                           temperature=0.1, timeout=60, max_retries=2)
            return LLM(model=model_name, api_key=api_key,
                       temperature=0.1, timeout=60, max_retries=2)
        except Exception as e:
            last_err = e
            continue
    raise RuntimeError(f"Could not initialize LLM. Last error: {last_err}")

llm = make_llm()
print("LLM ready ✅")


LLM ready ✅


In [None]:
# Cell 6 — Guardrail utilities (used by Tasks)
def _extract_text(task_output: Any) -> str:
    # CrewAI may pass TaskOutput objects; fall back to string.
    return getattr(task_output, "raw", None) or getattr(task_output, "result", None) or str(task_output)

def guardrail_no_disallowed_content(task_output: Any) -> Tuple[bool, Any]:
    """Blocks tax planning/avoidance + filing instructions."""
    text = _extract_text(task_output).lower()
    for pat in DISALLOWED_PATTERNS:
        if re.search(pat, text):
            return (False, f"Guardrail triggered: disallowed content matched pattern: {pat}")
    return (True, task_output)

def guardrail_report_format(task_output: Any) -> Tuple[bool, Any]:
    """Ensures the final report has required sections + disclaimer + timestamp/citations."""
    text = _extract_text(task_output)
    required_sections = [
        "A) Applicable Sources",
        "B) Key Provisions & Obligations",
        "C) Exemptions & Thresholds",
        "D) Penalties & Compliance Risks",
        "E) Audit Checklist",
        "F) Assumptions & Interpretation Limits",
        "G) Citations",
    ]
    missing = [s for s in required_sections if s not in text]
    if missing:
        return (False, f"Missing required sections: {missing}")

    if "This is NOT legal or tax advice" not in text:
        return (False, "Missing required disclaimer sentence.")

    if "Last verified on" not in text and "Last verified:" not in text:
        return (False, "Missing 'Last verified' timestamp.")

    # Soft check for citations: at least one numbered inline marker such as [1]
    if not re.search(r"\[\d+\]", text):
        return (False, "Citations section present but no inline citations like [1].")

    ok, msg = guardrail_no_disallowed_content(task_output)
    if not ok:
        return (False, msg)

    return (True, task_output)


## Multi-Agent Design (simple mental model)
- Agent 1 — Researcher: Uses Tavily to find relevant laws/guidance; returns compact JSON sources/snippets.
- Agent 2 — Analyst: Reads research JSON; extracts obligations/exemptions/penalties; flags uncertainty.
- Agent 3 — Writer: Produces final audit-ready report with citations + guardrails.
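
As a mental model only, the handoff between the three agents resembles function composition: each step receives the previous step's output. The sketch below uses plain functions as stand-ins for agents (no LLM, no search) and a hypothetical `run_sequential` helper; CrewAI's real orchestration does considerably more (prompting, tool calls, retries).

```python
from typing import Callable, List

def run_sequential(steps: List[Callable[[str], str]], initial_input: str) -> str:
    """Feed each step's output into the next step, in order."""
    data = initial_input
    for step in steps:
        data = step(data)
    return data

# Stand-in "agents": plain functions that only label what they received.
def researcher(query: str) -> str:
    return f"sources({query})"

def analyst(research: str) -> str:
    return f"evidence({research})"

def writer(evidence: str) -> str:
    return f"report({evidence})"

result = run_sequential([researcher, analyst, writer], "UAE VAT threshold")
print(result)  # report(evidence(sources(UAE VAT threshold)))
```

The nesting in the printed string mirrors the data flow: the writer only ever sees what the analyst produced, which in turn depends entirely on what the researcher found.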


In [None]:
# Cell 7 — Crew builder (agents + tasks)
def build_tax_audit_crew(jurisdiction: str, strictness: float, max_results: int) -> Crew:
    # Re-create tool with caller-selected max_results (simple)
    tool = TavilySearchTool(
        search_depth="advanced",
        max_results=max_results,
        include_answer=True,
        include_raw_content=False,
        timeout=60
    )

    researcher = Agent(
        role="Tax Law Researcher",
        goal="Find relevant tax laws, regulations, and official guidance for the given audit scenario.",
        backstory="You prioritize official tax authority sources and never fabricate legal details.",
        tools=[tool],
        llm=llm,
        verbose=False
    )

    analyst = Agent(
        role="Tax Audit Analyst",
        goal="Extract audit-relevant obligations, exemptions/thresholds, penalties, and key definitions from research.",
        backstory="You are conservative: if details are missing in sources, you say 'Not found in sources'.",
        llm=llm,
        verbose=False
    )

    writer = Agent(
        role="Audit Report Writer",
        goal="Write a structured, citation-backed audit support summary with strong Responsible AI guardrails.",
        backstory="You produce clear, defensible audit notes with citations and explicit limitations.",
        llm=llm,
        verbose=False
    )

    strict_mode = "Conservative" if strictness < 0.5 else "Broad (still evidence-based)"

    research_task = Task(
        description=(
            "Use Tavily search to find authoritative sources for this tax audit scenario.\n"
            "Inputs: jurisdiction={jurisdiction}, query={query}\n\n"
            "Rules:\n"
            "- Prefer official/gov/tax authority domains when possible.\n"
            "- Return JSON with fields: answer, results[].title, results[].url, results[].content_snippet.\n"
            "- Do NOT invent sections/penalties/dates; only use what appears in snippets.\n"
        ),
        expected_output="A valid JSON string with answer + results list (title/url/snippet).",
        agent=researcher,
        guardrail=guardrail_no_disallowed_content,
        guardrail_max_retries=2
    )

    analysis_task = Task(
        description=(
            "You receive the Research JSON from the previous task.\n"
            "Extract a compact evidence table for auditors:\n"
            "- Obligations & who they apply to\n"
            "- Exemptions/thresholds (if present)\n"
            "- Penalties/risks (if present)\n"
            "- Effective dates/amendments (if present)\n\n"
            "Rules:\n"
            "- If a detail is missing in snippets, write 'Not found in sources'.\n"
            "- No tax planning/avoidance or filing instructions.\n"
            f"- Interpretation mode: {strict_mode}\n"
            "Return Markdown with inline citations like [1], [2] matching sources.\n"
        ),
        expected_output="Markdown evidence summary with cautious language + inline citations.",
        agent=analyst,
        guardrail=guardrail_no_disallowed_content,
        guardrail_max_retries=2
    )

    report_task = Task(
        description=(
            "Using the evidence summary and sources, write the final audit support report.\n"
            "It MUST follow EXACT structure:\n"
            "A) Applicable Sources\n"
            "B) Key Provisions & Obligations\n"
            "C) Exemptions & Thresholds\n"
            "D) Penalties & Compliance Risks\n"
            "E) Audit Checklist\n"
            "F) Assumptions & Interpretation Limits\n"
            "G) Citations\n\n"
            "Rules:\n"
            f"- First line must be: {DISCLAIMER}\n"
            f"- Include: Last verified on {now_utc_str()}\n"
            "- Use inline citations [1], [2] referencing G) Citations.\n"
            "- If sources are weak/non-official, explicitly say: 'Insufficient authoritative guidance found.'\n"
            "- Do NOT invent law sections, penalties, or thresholds.\n"
            "- Do NOT provide filing instructions or tax planning/avoidance.\n"
        ),
        expected_output="A complete markdown report with sections A–G, citations, disclaimer, and timestamp.",
        agent=writer,
        guardrail=guardrail_report_format,
        guardrail_max_retries=2
    )

    return Crew(
        agents=[researcher, analyst, writer],
        tasks=[research_task, analysis_task, report_task],
        process=Process.sequential,
        verbose=False
    )


In [None]:
# Cell 8 — Run function (single entry point for Gradio)
def run_tax_audit_multi_agent(jurisdiction: str, query: str, strictness: float, max_results: int) -> str:
    if not query or len(query.strip()) < 10:
        return "❌ Please enter a more detailed audit scenario (at least 10 characters)."

    crew = build_tax_audit_crew(jurisdiction=jurisdiction, strictness=strictness, max_results=max_results)

    # Inputs get interpolated into task descriptions via {jurisdiction} and {query}
    result = crew.kickoff(inputs={"jurisdiction": jurisdiction, "query": query})

    return str(result)


## Gradio UI (minimal)
A tiny UI to let beginners try the multi-agent crew:
- Choose jurisdiction
- Enter a query/audit scenario
- Select strictness + max results
- Click Run Multi‑Agent Crew


In [None]:
# Cell 9 — Gradio app (beginner-friendly)
import gradio as gr

def gradio_run(jurisdiction, query, strictness, max_results):
    try:
        report = run_tax_audit_multi_agent(jurisdiction, query, float(strictness), int(max_results))
        header = f"### ✅ Multi-Agent Output\n\n**Last verified:** {now_utc_str()}\n\n"
        return header + report
    except Exception as e:
        return f"❌ Error: {e}"

with gr.Blocks(title="Multi-Agent Tax Audit Assistant (CrewAI + Tavily)") as demo:
    gr.Markdown("# Multi-Agent Tax Audit Assistant (CrewAI + Tavily)")
    gr.Markdown(f"⚠️ **Disclaimer:** {DISCLAIMER}")

    with gr.Row():
        jurisdiction = gr.Dropdown(["UAE", "India", "US", "UK", "Other"], value="UAE", label="Jurisdiction")
        max_results = gr.Slider(3, 10, value=6, step=1, label="Max search results (Tavily)")

    query = gr.Textbox(
        label="Tax audit scenario / question",
        lines=3,
        placeholder="Example: VAT registration threshold and penalties for late registration for a UAE services business..."
    )
    strictness = gr.Slider(0.0, 1.0, value=0.2, step=0.1, label="Strictness (Conservative ↔ Broad but evidence-based)")

    run_btn = gr.Button("Run Multi-Agent Crew", variant="primary")
    output = gr.Markdown(label="Audit Support Report")

    run_btn.click(gradio_run, inputs=[jurisdiction, query, strictness, max_results], outputs=[output])

demo.launch(share=True, debug=False)





## Business Outcome / Impact (what this demo shows)
- **Speed:** Converts “search + read + summarize” into a repeatable multi-agent workflow.
- **Consistency:** Output structure A–G is standardized for audit notes.
- **Traceability:** Citations + “Last verified” support defensible audit documentation.
- **Risk reduction:** Guardrails reduce hallucinations and policy-unsafe guidance.
- **Scalability:** Same pattern can expand to more agents (e.g., jurisdiction filter agent, policy QA agent, evidence extractor).
