COORDINATOR LOGIC

In [1]:
import json
import os

OUTPUT_DIR = "outputs"

with open(os.path.join(OUTPUT_DIR, "legal_agent_output.json"), "r", encoding="utf-8") as f:
    legal_output = json.load(f)

with open(os.path.join(OUTPUT_DIR, "compliance_agent_output.json"), "r", encoding="utf-8") as f:
    compliance_output = json.load(f)

with open(os.path.join(OUTPUT_DIR, "finance_agent_output.json"), "r", encoding="utf-8") as f:
    finance_output = json.load(f)

with open(os.path.join(OUTPUT_DIR, "operations_agent_output.json"), "r", encoding="utf-8") as f:
    operations_output = json.load(f)

print("Agent outputs loaded successfully")


Agent outputs loaded successfully


All agent output files have been loaded. The next cells implement the coordinator's routing logic.

In [2]:
# Routing rules
ROUTING_RULES = {
    "legal": ["termination", "governing law", "jurisdiction", "indemnity"],
    "compliance": ["gdpr", "audit", "regulatory", "data protection"],
    "finance": ["payment", "fee", "penalty", "invoice"],
    "operations": ["deliverable", "timeline", "sla", "milestone"]
}

ROUTING_RULES

{'legal': ['termination', 'governing law', 'jurisdiction', 'indemnity'],
 'compliance': ['gdpr', 'audit', 'regulatory', 'data protection'],
 'finance': ['payment', 'fee', 'penalty', 'invoice'],
 'operations': ['deliverable', 'timeline', 'sla', 'milestone']}

Defined keyword-based routing rules to map user queries to relevant domain-specific agents. 

In [3]:
# Routing function
def route_query(query: str):
    query = query.lower()
    selected_agents = []

    for agent, keywords in ROUTING_RULES.items():
        for kw in keywords:
            if kw in query:
                if agent not in selected_agents:
                    selected_agents.append(agent)
                break  # stop checking more keywords for this agent

    return selected_agents


Implemented a rule-based router that maps a user query to every agent with at least one keyword occurring in the lowercased query (substring match).
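
Because `kw in query` is a substring test, short keywords can match inside unrelated words, for example `sla` inside "translation" or `fee` inside "coffee". The sketch below is one possible variant, not part of the original notebook, that matches whole words with a regular expression. The function name `route_query_whole_word` and the optional plural `s` are assumptions made for illustration.

```python
import re

ROUTING_RULES = {
    "legal": ["termination", "governing law", "jurisdiction", "indemnity"],
    "compliance": ["gdpr", "audit", "regulatory", "data protection"],
    "finance": ["payment", "fee", "penalty", "invoice"],
    "operations": ["deliverable", "timeline", "sla", "milestone"],
}


def route_query_whole_word(query: str) -> list:
    """Return agents whose keywords appear as whole words (optionally with a plural 's')."""
    query = query.lower()
    selected = []
    for agent, keywords in ROUTING_RULES.items():
        for kw in keywords:
            # \b anchors the keyword at word boundaries; "s?" also accepts a simple plural.
            if re.search(rf"\b{re.escape(kw)}s?\b", query):
                selected.append(agent)
                break  # one hit per agent is enough
    return selected


print(route_query_whole_word("Is there a translation of the coffee policy?"))
print(route_query_whole_word("What are the SLA and milestone requirements?"))
```

The trade-off is that irregular plurals such as "penalties" no longer match `penalty`, so the keyword lists would need to be extended if this variant were adopted.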

In [4]:
# Test Routing Logic 
test_queries = [
    "What are the termination and indemnity clauses?",
    "Are there any GDPR or data protection obligations?",
    "What payment penalties are mentioned?",
    "What are the SLA and milestone requirements?"
]

for q in test_queries:
    print(f"\nQuery: {q}")
    print("Routed Agents:", route_query(q))


Query: What are the termination and indemnity clauses?
Routed Agents: ['legal']

Query: Are there any GDPR or data protection obligations?
Routed Agents: ['compliance']

Query: What payment penalties are mentioned?
Routed Agents: ['finance']

Query: What are the SLA and milestone requirements?
Routed Agents: ['operations']


Tested the routing function with sample queries to verify correct agent selection.

In [5]:
# Coordinator execution logic
def coordinator_execute(query: str):
    agents = route_query(query)
    results = {}

    for agent in agents:
        if agent == "legal":
            results["legal"] = legal_output
        elif agent == "compliance":
            results["compliance"] = compliance_output
        elif agent == "finance":
            results["finance"] = finance_output
        elif agent == "operations":
            results["operations"] = operations_output

    return results

Implemented coordinator logic to aggregate structured outputs from relevant agents based on routing decisions.
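
The if/elif chain above grows by one branch per agent. As a hedged alternative, the same aggregation can be written as a dictionary lookup. In this sketch `coordinator_execute_lookup`, `demo_router`, and `demo_outputs` are hypothetical names with placeholder data; they stand in for the router and the JSON outputs loaded earlier.

```python
def coordinator_execute_lookup(query, router, agent_outputs):
    """Collect the outputs of every routed agent using a dictionary lookup."""
    return {
        agent: agent_outputs[agent]
        for agent in router(query)
        if agent in agent_outputs  # skip agents with no loaded output
    }


# Placeholder data standing in for the JSON files loaded earlier.
demo_outputs = {
    "legal": {"clause_type": "Legal Analysis"},
    "finance": {"clause_type": "Finance"},
}


def demo_router(query):
    """Toy router used only for this demonstration."""
    return [name for name in ("legal", "finance", "operations") if name in query.lower()]


print(coordinator_execute_lookup("legal and finance review", demo_router, demo_outputs))
```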

In [6]:
# Run Coordinator for a legal query
query = "Explain termination and indemnity risks in this contract"

coordinator_result = coordinator_execute(query)
coordinator_result

{'legal': {'clause_type': 'Legal Analysis',
  'extracted_clauses': [{'clause_id': '6.01',
    'text': 'This Agreement may be terminated prior to the end of the Offer Period by Acquiror if a condition for withdrawal of the Offer has occurred. This Agreement shall be automatically terminated if the Offer has been withdrawn or the Offer is not successful due to the failure of obtaining the minimum threshold. This Agreement may not be terminated after the end of the Offer Period if the Offer is successful.',
    'risk_level': 'high',
    'confidence': 0.95,
    'evidence': ['This Agreement may be terminated prior to the end of the Offer Period by Acquiror if a condition for withdrawal of the Offer has occurred.',
     'This Agreement shall be automatically terminated if the Offer has been withdrawn or the Offer is not successful due to the failure of obtaining the minimum threshold.',
     'This Agreement may not be terminated after the end of the Offer Period if the Offer is successful.']

Executed the coordinator on a sample query to collect multi-agent outputs.

In [7]:
# Conversion to readable output
from pprint import pprint

pprint(coordinator_result)

{'legal': {'clause_type': 'Legal Analysis',
           'confidence': 0.88,
           'evidence': [],
           'extracted_clauses': [{'clause_id': '6.01',
                                  'confidence': 0.95,
                                  'evidence': ['This Agreement may be '
                                               'terminated prior to the end of '
                                               'the Offer Period by Acquiror '
                                               'if a condition for withdrawal '
                                               'of the Offer has occurred.',
                                               'This Agreement shall be '
                                               'automatically terminated if '
                                               'the Offer has been withdrawn '
                                               'or the Offer is not successful '
                                               'due to the failure of '
                

Pretty-printed the coordinator result so that the nested structure is easier to read.

In [8]:
# Test Query 1 — Finance Agent
query_finance = "What are the payment terms, fees, and penalties mentioned in the contract?"
coordinator_result_finance = coordinator_execute(query_finance)
pprint(coordinator_result_finance)

{'finance': {'clause_type': 'Finance',
             'confidence': 0.85,
             'evidence': [],
             'extracted_clauses': [{'clause_text': 'litigation to collect the '
                                                   'amount owed and Seller '
                                                   'prevails in the '
                                                   'litigation, Buyer will '
                                                   'reimburse Seller for '
                                                   'actual, reasonable, '
                                                   'substantiated '
                                                   'out-of-pocket expenses '
                                                   'incurred by Seller in '
                                                   'collecting the delinquent '
                                                   'amount and accrued late '
                                                   'payment fees on

In [9]:
# Test Query 2 — Compliance Agent
query_compliance = "Are there any GDPR or data protection obligations in this contract?"
coordinator_result_compliance = coordinator_execute(query_compliance)
pprint(coordinator_result_compliance)

{'compliance': {'clause_type': 'Compliance',
                'confidence': 0.95,
                'evidence': [],
                'extracted_clauses': ['16.1 Privacy and Security Matters. '
                                      'Concurrently with the execution of this '
                                      'Agreement, the Parties are executing a '
                                      'HIPAA Business Associate Agreement (the '
                                      '"BAA") in the form attached hereto as '
                                      'Exhibit E.',
                                      '16.2 Technical Standards. The Company '
                                      'will provide Allscripts with Updates so '
                                      'that the Subscription Software Services '
                                      'can be implemented and configured to '
                                      'comply in all material respects with '
                                      'ap

In [10]:
# Test Query 3 — Operations Agent
query_operations = "What are the SLA and milestone requirements in this contract?"
coordinator_result_operations = coordinator_execute(query_operations)   
pprint(coordinator_result_operations)

{'operations': {'clause_type': 'Operations',
                'confidence': 0.0,
                'error': "Invalid JSON from model: Expecting ',' delimiter: "
                         'line 32 column 45 (char 2097)',
                'evidence': [],
                'extracted_clauses': [],
                'risk_level': 'unknown'}}


In [11]:
# Multi-agent query
query_multi = "What are the termination clauses, payment terms, and SLA requirements in this contract?"
coordinator_result_multi = coordinator_execute(query_multi)

print("\nLEGAL OUTPUT")
pprint(coordinator_result_multi["legal"])

print("\nFINANCE OUTPUT")
pprint(coordinator_result_multi["finance"])

print("\nOPERATIONS OUTPUT")
pprint(coordinator_result_multi["operations"])



LEGAL OUTPUT
{'clause_type': 'Legal Analysis',
 'confidence': 0.88,
 'evidence': [],
 'extracted_clauses': [{'clause_id': '6.01',
                        'confidence': 0.95,
                        'evidence': ['This Agreement may be terminated prior '
                                     'to the end of the Offer Period by '
                                     'Acquiror if a condition for withdrawal '
                                     'of the Offer has occurred.',
                                     'This Agreement shall be automatically '
                                     'terminated if the Offer has been '
                                     'withdrawn or the Offer is not successful '
                                     'due to the failure of obtaining the '
                                     'minimum threshold.',
                                     'This Agreement may not be terminated '
                                     'after the end of the Offer Period if the '
 

In [12]:
print(coordinator_result_multi.keys())

dict_keys(['legal', 'finance', 'operations'])


In [13]:
import json

with open("coordinator_output_multi.json", "w", encoding="utf-8") as f:
    json.dump(coordinator_result_multi, f, indent=2, ensure_ascii=False)

print("Coordinator output saved to coordinator_output_multi.json")

Coordinator output saved to coordinator_output_multi.json


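The coordinator routed a multi-domain query to the Legal, Finance, and Operations agents and aggregated their structured outputs. Note that the stored Operations output is an error record (invalid JSON from the model), so it contains no extracted clauses.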
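
An aggregated result can contain error records, as the Operations test output earlier in this section shows. A minimal sketch of a check that lists such records before the result is saved or passed on follows. The helper name `find_failed_agents` is an assumption; the sample data copies the shape of the Operations output and uses a placeholder for the legal entry.

```python
def find_failed_agents(results: dict) -> dict:
    """Map each agent whose output carries an 'error' key to its error message."""
    return {
        agent: output["error"]
        for agent, output in results.items()
        if isinstance(output, dict) and "error" in output
    }


# Sample shaped like the coordinator result; the legal entry is a placeholder.
sample_result = {
    "legal": {"clause_type": "Legal Analysis", "confidence": 0.88, "evidence": []},
    "operations": {
        "clause_type": "Operations",
        "confidence": 0.0,
        "error": "Invalid JSON from model: Expecting ',' delimiter: "
                 "line 32 column 45 (char 2097)",
        "evidence": [],
        "extracted_clauses": [],
        "risk_level": "unknown",
    },
}

for agent, message in find_failed_agents(sample_result).items():
    print(f"{agent}: {message}")
```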

LANGGRAPH BASICS

In [40]:
# Define Shared Graph State
from langgraph.graph import StateGraph, END
import json
from typing import TypedDict, Optional

class GraphState(TypedDict):
    query: str
    combined_compliance_text: str
    combined_legal_text: str
    compliance_output: Optional[dict]
    legal_output: Optional[dict]


Defined a shared state object to hold query context and structured outputs from all agents.

In [19]:
def validate_agent_output(output_str, clause_type=""):
    # Remove Markdown code fences if present
    cleaned = output_str.strip()
    cleaned = cleaned.replace("```json", "").replace("```", "").strip()

    # Try parsing JSON
    try:
        output = json.loads(cleaned)
    except Exception as e:
        return {
            "clause_type": clause_type,
            "extracted_clauses": [],
            "risk_level": "unknown",
            "confidence": 0.0,
            "evidence": [],
            "error": f"Invalid JSON from model: {str(e)}"
        }

    # Build validated result
    validated = {
        "clause_type": clause_type,
        "extracted_clauses": output.get("extracted_clauses", []),
        "risk_level": output.get("risk_level", "unknown"),
        "confidence": output.get("confidence", 0.0),
        "evidence": output.get("evidence", [])
    }

    return validated


Added a validator to ensure all agents output structurally correct JSON. This prevents downstream failures during analysis or graph execution.
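
Models sometimes wrap the JSON in explanatory text as well as code fences. The following sketch shows one possible pre-processing step, which is an assumption and not part of the validator above: it keeps only the text between the first `{` and the last `}` before parsing. The function name `extract_json_object` is hypothetical.

```python
import json


def extract_json_object(text: str):
    """Parse the outermost {...} span in text; return None if nothing parses."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None


reply = 'Here is the analysis:\n```json\n{"risk_level": "low", "confidence": 0.7}\n```\nDone.'
print(extract_json_object(reply))
print(extract_json_object("no json here"))
```

This only strips surrounding text. It does not repair malformed JSON such as a missing delimiter inside the object.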

In [20]:
import requests
import json

class BaseAgent:
    def __init__(self, agent_name, system_prompt, model="gemma3:4b"):
        self.agent_name = agent_name
        self.system_prompt = system_prompt
        self.model = model

    def run(self, context_text):
        payload = {
            "model": self.model,
            "prompt": f"{self.system_prompt}\n\nUser Input:\n{context_text}",
            # Ollama reads sampling parameters from the "options" object
            "options": {"temperature": 0}
        }

        # Stream=True → handle incremental JSON
        response = requests.post(
            "http://localhost:11434/api/generate",
            json=payload,
            stream=True
        )

        full_response = ""

        # Read streaming chunks
        for line in response.iter_lines():
            if not line:
                continue
            try:
                data = json.loads(line.decode("utf-8"))
            except (UnicodeDecodeError, json.JSONDecodeError):
                # Skip lines that are not valid UTF-8 JSON
                continue

            # Ollama sends chunks as {"response": "..."}
            if "response" in data:
                full_response += data["response"]

        return full_response.strip()


Implemented a reusable base agent class that communicates with the locally hosted Gemma 3 (4B) model through Ollama's HTTP API. This design ensures the agent framework is fully local, cost-free, and independent of external APIs.
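
The streaming loop in `run` can be exercised without a running Ollama server. The sketch below reimplements only the chunk-joining step on hand-written sample lines, assuming the newline-delimited JSON format with `response` and `done` fields. The helper name `join_stream_chunks` and the sample bytes are illustrative, not taken from the notebook.

```python
import json


def join_stream_chunks(lines) -> str:
    """Concatenate the 'response' fields of newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if not line:
            continue  # keep-alive or blank line
        try:
            data = json.loads(line.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # ignore lines that are not valid JSON
        if not isinstance(data, dict):
            continue
        parts.append(data.get("response", ""))
        if data.get("done"):
            break  # final chunk reached
    return "".join(parts).strip()


# Hand-written sample chunks, including a blank line and a malformed line.
sample = [
    b'{"response": "{\\"risk_level\\": ", "done": false}',
    b"",
    b"not json",
    b'{"response": "\\"low\\"}", "done": true}',
]
print(join_stream_chunks(sample))
```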

In [21]:
COMPLIANCE_AGENT_PROMPT = """
You are a Compliance Risk Analysis Agent.

Your tasks:
1. Identify compliance-related clauses in the contract, including:
   - Data protection obligations
   - Regulatory requirements
   - Audit and reporting obligations
2. Extract exact compliance-related sentences.
3. Assess compliance risk as: low, medium, or high.
4. Provide a confidence score between 0 and 1.

Return ONLY valid JSON in this format:
{
  "extracted_clauses": [],
  "risk_level": "",
  "confidence": 0.0,
  "evidence": []
}
"""

In [22]:
# initialize Compliance Agent
compliance_agent = BaseAgent(
    agent_name="Compliance Agent",
    system_prompt=COMPLIANCE_AGENT_PROMPT,
    model="gemma3:4b"
)

In [41]:
# Define agent nodes
def compliance_node(state: GraphState):
    print("\n[Compliance Node Running]")

    combined_text = state["combined_compliance_text"]

    raw = compliance_agent.run(combined_text)
    validated = validate_agent_output(raw, clause_type="Compliance")

    state["compliance_output"] = validated
    return state


In [25]:
LEGAL_AGENT_PROMPT = """
You are a Legal Contract Analysis Agent.

Your tasks:
1. Identify legal clauses (Termination, Governing Law, Jurisdiction).
2. Extract exact clause text from the provided contract section.
3. Assess legal risk: low, medium, or high.
4. Provide a confidence score between 0 and 1.
5. Include evidence (exact sentences that justify your conclusion).

Return ONLY valid JSON in this format:
{
  "extracted_clauses": [],
  "risk_level": "",
  "confidence": 0.0,
  "evidence": []
}
"""
# Initialize legal agent
legal_agent = BaseAgent(
    agent_name="Legal Agent",
    system_prompt=LEGAL_AGENT_PROMPT,
    model="gemma3:4b"
)

In [42]:
def legal_node(state: GraphState):
    print("\n[Legal Node Running]")

    combined_text = state["combined_legal_text"]

    raw = legal_agent.run(combined_text)
    validated = validate_agent_output(raw, clause_type="Legal")

    state["legal_output"] = validated
    return state


Defined LangGraph nodes for Compliance and Legal agents with logging to trace execution order.

In [43]:
# Build Graph Skeleton & add nodes to graph
workflow = StateGraph(GraphState)
workflow.add_node("compliance_node", compliance_node)
workflow.add_node("legal_node", legal_node)


<langgraph.graph.state.StateGraph at 0x2250936fec0>

Initialized the LangGraph state graph using the shared graph state. Registered Compliance and Legal agent nodes in the graph.

In [44]:
# Define edges
workflow.set_entry_point("compliance_node")
workflow.add_edge("compliance_node", "legal_node")
workflow.add_edge("legal_node", END)


<langgraph.graph.state.StateGraph at 0x2250936fec0>

Configured a simple linear flow where the Compliance agent executes before the Legal agent.
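
To make the idea of a linear flow concrete, here is a plain-Python analogue. It is not how LangGraph is implemented: each node receives the state, returns an updated state, and the next node continues from there. The stub nodes and the `run_linear` helper are hypothetical.

```python
def run_linear(nodes, state):
    """Call each (name, function) pair in order, threading the state through."""
    trace = []
    for name, fn in nodes:
        state = fn(state)
        trace.append(name)
    return state, trace


def compliance_stub(state):
    """Stand-in for the compliance node; writes a placeholder output."""
    return {**state, "compliance_output": {"risk_level": "unknown"}}


def legal_stub(state):
    """Stand-in for the legal node; writes a placeholder output."""
    return {**state, "legal_output": {"risk_level": "unknown"}}


final_state, order = run_linear(
    [("compliance_node", compliance_stub), ("legal_node", legal_stub)],
    {"query": "demo", "compliance_output": None, "legal_output": None},
)
print(order)
print(sorted(final_state))
```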

In [45]:
# Compile graph
graph = workflow.compile()
print("Graph compiled successfully")

Graph compiled successfully


Compiled the LangGraph to prepare it for execution.

In [46]:
# Pinecone Setup

import os
import json
import numpy as np
from tqdm.auto import tqdm
import matplotlib.pyplot as plt

from pinecone import Pinecone, ServerlessSpec
from sentence_transformers import SentenceTransformer
from dotenv import load_dotenv

load_dotenv()

assert "PINECONE_API_KEY" in os.environ, "PINECONE_API_KEY not found"

print("Pinecone API key loaded successfully")

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

model = SentenceTransformer("all-MiniLM-L6-v2")

print("Sentence-Transformer model loaded successfully")

Pinecone API key loaded successfully
Sentence-Transformer model loaded successfully


In [33]:
INDEX_NAME = "cuad-index-minilm"   
DIMENSION = 384                  

existing_indexes = [idx["name"] for idx in pc.list_indexes()]

if INDEX_NAME not in existing_indexes:
    pc.create_index(
        name=INDEX_NAME,
        dimension=DIMENSION,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(INDEX_NAME)

print(f"Connected to Pinecone index: {INDEX_NAME}")


Connected to Pinecone index: cuad-index-minilm


In [34]:
# Building RAG Search Wrapper
import re
import json
import matplotlib.pyplot as plt
from typing import List, Dict

# Sentence-Transformers embedding function for Pinecone
def embed_batch(texts):
    return model.encode(
        texts,
        show_progress_bar=False,
        convert_to_numpy=True
    )

# Embed a Query
def embed_query(query: str):
    return embed_batch([query])[0].tolist()


In [35]:
# Core RAG architecture
def rag_search(
    query: str,
    index,
    top_k: int = 5
) -> List[Dict]:
   
    query_vector = embed_query(query)

    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True
    )

    retrieved_chunks = []

    for match in results["matches"]:
        retrieved_chunks.append({
            "score": match["score"],
            "contract_id": match["metadata"]["contract_id"],
            "chunk_id": match["metadata"]["chunk_id"],
            "text": match["metadata"]["text"]
        })

    return retrieved_chunks


In [37]:
# Retrieve compliance-focused context 
compliance_query = (
    "data protection gdpr hipaa audit regulatory compliance security privacy"
)

compliance_rag_results = rag_search(
    compliance_query,
    index,
    top_k=5
)

combined_compliance_text = "\n\n".join(
    [c["text"] for c in compliance_rag_results]
)

print(combined_compliance_text[:400])

# Retrieve legal-focused context 
legal_query = (
    "termination clause termination rights governing law jurisdiction legal risk"
)

legal_rag_results = rag_search(
    legal_query,
    index,
    top_k=5
)

combined_legal_text = "\n\n".join(
    [c["text"] for c in legal_rag_results]
)

print(combined_legal_text[:400])



15.2 [***].

16. Regulatory Matters.

16.1 Privacy and Security Matters. Concurrently with the execution of this Agreement, the Parties are executing a HIPAA Business Associate Agreement (the "BAA") in the form attached hereto as Exhibit E.

16.2 Technical Standards. The Company will provide Allscripts with Updates so that the Subscription Software Services can be implemented and configured to com
16. TERMINATION

16.1 Termination events: without prejudice to any other rights under this Agreement and/or at Law, either Party shall be entitled to terminate all or part of this Agreement by Notice of termination, as per Clauses 16.4 ("Termination procedure") and 16.6 ("Consequences of termination"), in the following events:

16. TERMINATION

16.1 Termination events: without prejudice to any oth
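
The legal context preview repeats the "16. TERMINATION" heading, which suggests that duplicate or overlapping chunks were retrieved. A minimal sketch of removing exact duplicates (after whitespace normalisation) before joining the texts follows. The helper `dedupe_chunks` and the sample chunks are assumptions for illustration; overlapping but non-identical chunks would still pass through.

```python
def dedupe_chunks(chunks):
    """Keep the first occurrence of each chunk text, ignoring whitespace differences."""
    seen = set()
    unique = []
    for chunk in chunks:
        key = " ".join(chunk["text"].split())  # collapse runs of whitespace
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


# Invented sample in the same shape as the rag_search results.
sample_chunks = [
    {"score": 0.61, "contract_id": "A", "chunk_id": 1, "text": "16. TERMINATION\n\n16.1 Termination events"},
    {"score": 0.60, "contract_id": "A", "chunk_id": 2, "text": "16. TERMINATION\n16.1  Termination events"},
    {"score": 0.55, "contract_id": "B", "chunk_id": 7, "text": "Governing law clause"},
]

unique_chunks = dedupe_chunks(sample_chunks)
combined_text = "\n\n".join(c["text"] for c in unique_chunks)
print(len(unique_chunks))
```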


In [47]:
initial_state = {
    "query": "Find termination and compliance risks",
    "combined_compliance_text": combined_compliance_text,
    "combined_legal_text": combined_legal_text,
    "compliance_output": None,
    "legal_output": None
}

result = graph.invoke(initial_state)



[Compliance Node Running]

[Legal Node Running]


Executed the LangGraph with an initial query and empty output state. The log shows the execution order compliance -> legal, matching the edges defined above.

In [48]:
# Inspect output
import pprint
pprint.pprint(result)

{'combined_compliance_text': '15.2 [***].\n'
                             '\n'
                             '16. Regulatory Matters.\n'
                             '\n'
                             '16.1 Privacy and Security Matters. Concurrently '
                             'with the execution of this Agreement, the '
                             'Parties are executing a HIPAA Business Associate '
                             'Agreement (the "BAA") in the form attached '
                             'hereto as Exhibit E.\n'
                             '\n'
                             '16.2 Technical Standards. The Company will '
                             'provide Allscripts with Updates so that the '
                             'Subscription Software Services can be '
                             'implemented and configured to comply in all '
                             'material respects with applicable privacy and '
                             'security standards (e.g., H

Reviewed the aggregated outputs to confirm correct execution order and state propagation.

TASK - Change execution order (Legal → Compliance)

In [49]:
workflow = StateGraph(GraphState)
workflow.add_node("legal_node", legal_node)
workflow.add_node("compliance_node", compliance_node)

workflow.set_entry_point("legal_node")
workflow.add_edge("legal_node", "compliance_node")
workflow.add_edge("compliance_node", END)

<langgraph.graph.state.StateGraph at 0x22509395970>

In [50]:
result = graph.invoke(initial_state)


[Compliance Node Running]

[Legal Node Running]


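The new workflow defines the order Legal → Compliance, but the log above still shows Compliance before Legal because `graph` is the graph compiled from the earlier workflow. The new workflow must be compiled (`graph = workflow.compile()`) before it is invoked for the changed order to take effect.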

MULTI AGENT LANGGRAPH

In [51]:
# Define agent nodes

import json
import os

OUTPUT_DIR = "outputs"

with open(os.path.join(OUTPUT_DIR, "compliance_agent_output.json"), "r", encoding="utf-8") as f:
    compliance_output_new = json.load(f)

with open(os.path.join(OUTPUT_DIR, "legal_agent_output.json"), "r", encoding="utf-8") as f:
    legal_output_new = json.load(f)

with open(os.path.join(OUTPUT_DIR, "finance_agent_output.json"), "r", encoding="utf-8") as f:
    finance_output_new = json.load(f)

with open(os.path.join(OUTPUT_DIR, "operations_agent_output.json"), "r", encoding="utf-8") as f:
    operations_output_new = json.load(f)

print("Agent outputs loaded")

Agent outputs loaded


In [None]:
# Define expanded graph state
from typing import TypedDict, Optional

class GraphState(TypedDict):
    query: str

    combined_legal_text: str
    combined_compliance_text: str
    combined_finance_text: str
    combined_operations_text: str

    legal_output: Optional[dict]
    compliance_output: Optional[dict]
    finance_output: Optional[dict]
    operations_output: Optional[dict]


In [55]:
FINANCE_AGENT_PROMPT = """
You are a Finance Risk Analysis Agent.

Your tasks:
1. Identify finance-related clauses in the contract, including:
   - Payment terms
   - Billing, fees, and invoicing
   - Penalties and late fees
   - Financial liabilities and monetary obligations
2. Extract the exact clause text.
3. Assess financial risk as: low, medium, or high.
4. Provide a confidence score between 0 and 1.
5. Include evidence sentences.

Return ONLY valid JSON:
{
  "extracted_clauses": [],
  "risk_level": "",
  "confidence": 0.0,
  "evidence": []
}
"""

# Initialize Finance Agent
finance_agent = BaseAgent(
    agent_name="Finance Agent",
    system_prompt=FINANCE_AGENT_PROMPT,
    model="gemma3:4b"
)

In [56]:
OPERATIONS_AGENT_PROMPT = """
You are an Operations Risk Analysis Agent.

Your tasks:
1. Identify operational clauses related to:
   - Deliverables
   - Timelines and milestones
   - Service obligations
   - Performance standards or SLAs
2. Extract the exact clause text.
3. Assess execution risk as: low, medium, or high.
4. Provide a confidence score between 0 and 1.
5. Include evidence sentences.

Return ONLY valid JSON:
{
  "extracted_clauses": [],
  "risk_level": "",
  "confidence": 0.0,
  "evidence": []
}
"""

# Initialize Operations Agent
operations_agent = BaseAgent(
    agent_name="Operations Agent",
    system_prompt=OPERATIONS_AGENT_PROMPT,
    model="gemma3:4b"
)

In [57]:
# Define agent nodes
def legal_node(state: GraphState):
    print("\n[Legal Agent Running]")
    text = state["combined_legal_text"]

    raw = legal_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Legal")
    state["legal_output"] = validated
    return state

def compliance_node(state: GraphState):
    print("\n[Compliance Agent Running]")
    text = state["combined_compliance_text"]

    raw = compliance_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Compliance")
    state["compliance_output"] = validated
    return state

def finance_node(state: GraphState):
    print("\n[Finance Agent Running]")
    text = state["combined_finance_text"]

    raw = finance_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Finance")
    state["finance_output"] = validated
    return state

def operations_node(state: GraphState):
    print("\n[Operations Agent Running]")
    text = state["combined_operations_text"]

    raw = operations_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Operations")
    state["operations_output"] = validated
    return state


Defined LangGraph nodes for each agent with logging to observe execution order.

In [58]:
# Build Graph skeleton
from langgraph.graph import StateGraph, END

graph = StateGraph(GraphState)


Initialized a LangGraph state graph using the shared graph state. Here every step receives and returns GraphState.

In [59]:
# Add nodes to graph
graph.add_node("legal_agent", legal_node)
graph.add_node("compliance_agent", compliance_node)
graph.add_node("finance_agent", finance_node)
graph.add_node("operations_agent", operations_node)

<langgraph.graph.state.StateGraph at 0x2251a70d520>

Registered Legal, Compliance, Finance, and Operations agents as graph nodes.

In [60]:
# Define edges
graph.set_entry_point("legal_agent")
graph.add_edge("legal_agent", "compliance_agent")
graph.add_edge("compliance_agent", "finance_agent")
graph.add_edge("finance_agent", "operations_agent")
graph.add_edge("operations_agent", END)

<langgraph.graph.state.StateGraph at 0x2251a70d520>

Defined a sequential execution flow across all agents.

In [61]:
# compile graph
app = graph.compile()

Compiled the LangGraph into an executable application.

In [63]:
finance_query = (
    "payment terms invoicing fees penalties late fee financial liability "
    "compensation reimbursement billing charges"
)

finance_rag_results = rag_search(
    finance_query,
    index,
    top_k=5
)

combined_finance_text = "\n\n".join(
    [c["text"] for c in finance_rag_results]
)

print("FINANCE CONTEXT PREVIEW:\n", combined_finance_text[:400])

operations_query = (
    "deliverables obligations timelines milestones service level agreement SLA "
    "performance standards responsibilities duties implementation execution"
)

operations_rag_results = rag_search(
    operations_query,
    index,
    top_k=5
)

combined_operations_text = "\n\n".join(
    [c["text"] for c in operations_rag_results]
)

print("OPERATIONS CONTEXT PREVIEW:\n", combined_operations_text[:400])



FINANCE CONTEXT PREVIEW:
 Source: REYNOLDS CONSUMER PRODUCTS INC., S-1, 11/15/2019

litigation to collect the amount owed and Seller prevails in the litigation, Buyer will reimburse Seller for actual, reasonable, substantiated out-of-pocket expenses incurred by Seller in collecting the delinquent amount and accrued late payment fees on the delinquent amount. Under no circumstance will the late payment fee payable to Seller
OPERATIONS CONTEXT PREVIEW:
 H. Combinational impacts (i.e., how one Service Level affects another);

 I. System implications; and

 J. Issues relating to Applicable Law.

3. SLA TEAM REVIEW.

 A. A joint Metavante-Customer team (the "SLA Team") shall review, evaluate and potentially modify the Service Level Changes and associated Business Case Assessments.

 B. At a minimum, the SLA Team shall consist of personnel designated


In [64]:
# Execute graph
input_state = {
    "query": "Review termination, GDPR compliance, payment terms, and SLAs",

    "combined_legal_text": combined_legal_text,
    "combined_compliance_text": combined_compliance_text,
    "combined_finance_text": combined_finance_text,
    "combined_operations_text": combined_operations_text,

    "legal_output": None,
    "compliance_output": None,
    "finance_output": None,
    "operations_output": None
}

result = app.invoke(input_state)



[Legal Agent Running]

[Compliance Agent Running]

[Finance Agent Running]

[Operations Agent Running]


The log confirms the execution order: legal -> compliance -> finance -> operations.

TASK - CHANGE EXECUTION ORDER

In [97]:
graph = StateGraph(MultiAgentState)

graph.add_node("legal_agent_new", legal_node)
graph.add_node("compliance_agent_new", compliance_node)
graph.add_node("finance_agent_new", finance_node)
graph.add_node("operations_agent_new", operations_node)

graph.set_entry_point("compliance_agent_new")
graph.add_edge("compliance_agent_new", "legal_agent_new")
graph.add_edge("legal_agent_new", "finance_agent_new")
graph.add_edge("finance_agent_new", "operations_agent_new")
graph.add_edge("operations_agent_new", END)

app = graph.compile()
result = app.invoke(input_state)



Compliance agent running
Legal agent running
Finance agent running
Operations agent running


The order has been changed: the graph now starts with the Compliance agent, followed by Legal, Finance, and Operations.

CONDITIONAL ROUTING IN LANGGRAPH

In [72]:
# Routing Function
from langgraph.graph import END

def route_query(state: dict) -> str:
    q = state["query"].lower()

    if any(k in q for k in ["termination", "governing law", "jurisdiction", "indemnity"]):
        return "legal_agent"

    if any(k in q for k in ["gdpr", "audit", "regulatory", "data protection", "hipaa"]):
        return "compliance_agent"

    if any(k in q for k in ["payment", "fee", "penalty", "invoice", "billing"]):
        return "finance_agent"

    if any(k in q for k in ["deliverable", "timeline", "sla", "milestone"]):
        return "operations_agent"

    return END


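Defined a keyword-based routing function that selects a single agent for the query. The checks run in a fixed order (legal, compliance, finance, operations), so the first matching group wins, and END is returned when nothing matches.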
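
A query that mentions several domains still yields one agent under this kind of first-match routing. The sketch below uses a hypothetical helper `matching_agents`, with the same keywords arranged as a list of pairs. It lists every group that matches, so the first entry corresponds to the agent a first-match router would select.

```python
ROUTES = [
    ("legal_agent", ["termination", "governing law", "jurisdiction", "indemnity"]),
    ("compliance_agent", ["gdpr", "audit", "regulatory", "data protection", "hipaa"]),
    ("finance_agent", ["payment", "fee", "penalty", "invoice", "billing"]),
    ("operations_agent", ["deliverable", "timeline", "sla", "milestone"]),
]


def matching_agents(query: str) -> list:
    """Return every agent whose keyword group matches, in priority order."""
    q = query.lower()
    return [agent for agent, keywords in ROUTES if any(k in q for k in keywords)]


matches = matching_agents("Check GDPR compliance and payment terms")
print("All matches:", matches)
print("Selected by first-match routing:", matches[0] if matches else None)
```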

In [73]:
# Rebuild graph with conditional entry
from langgraph.graph import StateGraph, END

graph = StateGraph(GraphState)


In [74]:
# add agent nodes
graph.add_node("legal_agent", legal_node)
graph.add_node("compliance_agent", compliance_node)
graph.add_node("finance_agent", finance_node)
graph.add_node("operations_agent", operations_node)

<langgraph.graph.state.StateGraph at 0x22509390b30>

Defined nodes for each agent.

In [75]:
# Conditional entry point
graph.set_conditional_entry_point(
    route_query,
    {
        "legal_agent": "legal_agent",
        "compliance_agent": "compliance_agent",
        "finance_agent": "finance_agent",
        "operations_agent": "operations_agent",
        END: END
    }
)

<langgraph.graph.state.StateGraph at 0x22509390b30>

Configured LangGraph to dynamically select the appropriate agent at runtime using a conditional entry point.

In [76]:
# agent to end edges
graph.add_edge("legal_agent", END)
graph.add_edge("compliance_agent", END)
graph.add_edge("finance_agent", END)
graph.add_edge("operations_agent", END)

<langgraph.graph.state.StateGraph at 0x22509390b30>

In [77]:
app = graph.compile()

In [78]:
# Test Case 1 (Legal Query)
state = {
    "query": "Review termination clause",
    "combined_legal_text": combined_legal_text,
    "combined_compliance_text": "",
    "combined_finance_text": "",
    "combined_operations_text": "",
    "legal_output": None,
    "compliance_output": None,
    "finance_output": None,
    "operations_output": None
}

result = app.invoke(state)
result.keys()



[Legal Agent Running]


dict_keys(['query', 'combined_legal_text', 'combined_compliance_text', 'combined_finance_text', 'combined_operations_text', 'legal_output', 'compliance_output', 'finance_output', 'operations_output'])

The conditional routing logic correctly identified a legal intent in the query and executed only the Legal agent. This confirms that keyword-based routing is functioning as expected for legal clause analysis.

In [80]:
# Test Case 2 (Finance Query)
state = {
    "query": "Check late payment penalties",
    "combined_legal_text": "",
    "combined_compliance_text": "",
    "combined_finance_text": combined_finance_text,
    "combined_operations_text": "",
    "legal_output": None,
    "compliance_output": None,
    "finance_output": None,
    "operations_output": None
}

result = app.invoke(state)
result.keys()


[Finance Agent Running]


dict_keys(['query', 'combined_legal_text', 'combined_compliance_text', 'combined_finance_text', 'combined_operations_text', 'legal_output', 'compliance_output', 'finance_output', 'operations_output'])

Similarly, the router correctly identified the finance intent in the query and executed only the Finance agent.

In [81]:
# Test Case 3 (Multi agent)
state = {
    "query": "Check GDPR compliance and payment terms",
    "combined_legal_text": "",
    "combined_compliance_text": combined_compliance_text,
    "combined_finance_text": combined_finance_text,
    "combined_operations_text": "",
    "legal_output": None,
    "compliance_output": None,
    "finance_output": None,
    "operations_output": None
}

result = app.invoke(state)
result.keys()


[Compliance Agent Running]


dict_keys(['query', 'combined_legal_text', 'combined_compliance_text', 'combined_finance_text', 'combined_operations_text', 'legal_output', 'compliance_output', 'finance_output', 'operations_output'])
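
In Test Case 3 only the Compliance agent ran, even though the query also mentions payment terms. `route_query` returns on the first matching branch, so "gdpr" wins and the finance check is never reached. A minimal sketch of a router that collects every matching agent follows. `KEYWORDS` and `route_query_multi` are illustrative names, and wiring a multi-agent result into the graph is not shown here.

```python
from typing import Dict, List

# Same keyword lists as the single-agent router (copied so the sketch is self-contained).
KEYWORDS: Dict[str, List[str]] = {
    "legal_agent": ["termination", "governing law", "jurisdiction", "indemnity"],
    "compliance_agent": ["gdpr", "audit", "regulatory", "data protection", "hipaa"],
    "finance_agent": ["payment", "fee", "penalty", "invoice", "billing"],
    "operations_agent": ["deliverable", "timeline", "sla", "milestone"],
}

def route_query_multi(query: str) -> List[str]:
    """Return every agent whose keyword list matches the query, in dictionary order."""
    q = query.lower()
    return [agent for agent, kws in KEYWORDS.items() if any(k in q for k in kws)]

print(route_query_multi("Check GDPR compliance and payment terms"))
# ['compliance_agent', 'finance_agent']
```

An empty list plays the role that `END` plays in the single-agent router: no agent is selected.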

Conversation Memory & State Persistence (Agent Memory)

In [82]:
from typing import TypedDict, List, Optional, Dict

class GraphState(TypedDict):
    query: str
    memory: List[dict]

    combined_legal_text: str
    combined_compliance_text: str
    combined_finance_text: str
    combined_operations_text: str

    legal: Optional[dict]
    compliance: Optional[dict]
    finance: Optional[dict]
    operations: Optional[dict]

Extended the shared graph state to include a memory field for storing intermediate agent outputs.

In [83]:
# Initialize Memory in Input State
initial_state = {
    "query": "Review contract for termination, GDPR compliance, payment terms, and SLAs",

    "combined_legal_text": combined_legal_text,
    "combined_compliance_text": combined_compliance_text,
    "combined_finance_text": combined_finance_text,
    "combined_operations_text": combined_operations_text,

    "legal": None,
    "compliance": None,
    "finance": None,
    "operations": None,

    "memory": []   
}

Initialized memory as an empty list to accumulate agent outputs during execution.

In [84]:
# Modify Agent Nodes to Write to Memory
def legal_node(state: GraphState):
    print("\n[Legal Agent Running]")
    text = state["combined_legal_text"]

    raw = legal_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Legal")

    # store in state
    state["legal"] = validated

    # WRITE TO MEMORY
    state["memory"].append({
        "agent": "legal",
        "result": validated
    })

    return state

def compliance_node(state: GraphState):
    print("\n[Compliance Agent Running]")
    text = state["combined_compliance_text"]

    raw = compliance_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Compliance")

    state["compliance"] = validated

    state["memory"].append({
        "agent": "compliance",
        "result": validated
    })

    return state

def finance_node(state: GraphState):
    print("\n[Finance Agent Running]")
    text = state["combined_finance_text"]

    raw = finance_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Finance")

    state["finance"] = validated

    state["memory"].append({
        "agent": "finance",
        "result": validated
    })

    return state

def operations_node(state: GraphState):
    print("\n[Operations Agent Running]")
    text = state["combined_operations_text"]

    raw = operations_agent.run(text)
    validated = validate_agent_output(raw, clause_type="Operations")

    state["operations"] = validated

    state["memory"].append({
        "agent": "operations",
        "result": validated
    })

    return state


Modified agent nodes to append their outputs into shared memory, enabling state persistence across steps.
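
The nodes in this cell mutate `state["memory"]` in place and return the whole state. LangGraph also lets a state key carry a reducer through `typing.Annotated`, so that each node returns only its new entries and the framework merges them. The sketch below shows that shape under assumptions: `ReducerState` and `legal_node_partial` are hypothetical names, the agent result is a placeholder, and the merge is emulated with `operator.add` rather than by running a graph.

```python
import operator
from typing import Annotated, List, Optional, TypedDict

class ReducerState(TypedDict):
    query: str
    # The second Annotated argument is the reducer applied to updates of this key.
    memory: Annotated[List[dict], operator.add]
    legal: Optional[dict]

def legal_node_partial(state: ReducerState) -> dict:
    """Return only the keys this node changes instead of mutating `state`."""
    validated = {"clause_type": "Legal", "extracted_clauses": []}  # placeholder result
    return {
        "legal": validated,
        "memory": [{"agent": "legal", "result": validated}],
    }

# Emulate the merge step by hand: old list + new entries -> new list.
state: ReducerState = {"query": "demo", "memory": [], "legal": None}
update = legal_node_partial(state)
merged_memory = operator.add(state["memory"], update["memory"])

print(len(state["memory"]), len(merged_memory))
```

Because the node never touches the incoming list, the original state stays unchanged, which makes individual nodes easier to test in isolation.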

In [85]:
# Build Graph with Memory Support
graph = StateGraph(GraphState)

graph.add_node("legal_agent", legal_node)
graph.add_node("compliance_agent", compliance_node)
graph.add_node("finance_agent", finance_node)
graph.add_node("operations_agent", operations_node)

graph.set_entry_point("legal_agent")
graph.add_edge("legal_agent", "compliance_agent")
graph.add_edge("compliance_agent", "finance_agent")
graph.add_edge("finance_agent", "operations_agent")
graph.add_edge("operations_agent", END)

<langgraph.graph.state.StateGraph at 0x22509375d30>

Built a sequential multi-agent LangGraph with memory persistence enabled.

In [86]:
app = graph.compile()
result = app.invoke(initial_state)


[Legal Agent Running]

[Compliance Agent Running]

[Finance Agent Running]

[Operations Agent Running]


Executed the LangGraph while preserving and updating memory across agent nodes.

In [87]:
from pprint import pprint

print("\nFinal Memory Contents:")
pprint(result["memory"])


Final Memory Contents:
[{'agent': 'legal',
  'result': {'clause_type': 'Legal',
             'confidence': 0.85,
             'evidence': [],
             'extracted_clauses': [{'clause_text': 'Each party shall have the '
                                                   'right at any time to '
                                                   'terminate this Agreement '
                                                   'without prejudice to any '
                                                   'rights which it may have, '
                                                   'whether pursuant to the '
                                                   'provisions of this '
                                                   'Agreement or otherwise in '
                                                   'law or in equity or '
                                                   'otherwise, upon the '
                                                   'occurrence of any one or '
      

AGENT TO AGENT COMMUNICATION AND VALIDATION LOGIC

In [88]:
# Updated graph state
from typing import TypedDict, List, Optional, Dict

class GraphState(TypedDict):
    query: str

    # Combined context
    combined_legal_text: str
    combined_compliance_text: str
    combined_finance_text: str
    combined_operations_text: str

    # Agent outputs
    compliance: Optional[dict]
    finance: Optional[dict]
    legal: Optional[dict]

    # Shared knowledge
    memory: List[dict]                 # agent findings
    validation_notes: List[str]        # cross-agent validation


Enhanced shared state to support inter-agent memory and validation notes for collaborative reasoning.

In [89]:
# Initialize state
initial_state = {
    "query": "Evaluate GDPR clauses, financial penalties, and legal risks",

    "combined_compliance_text": combined_compliance_text,
    "combined_finance_text": combined_finance_text,
    "combined_legal_text": combined_legal_text,
    "combined_operations_text": "",     # not used in this flow

    "compliance": None,
    "finance": None,
    "legal": None,

    "memory": [],
    "validation_notes": []
}


Initialized shared memory and validation notes to support agent-to-agent communication.

In [None]:
# Compliance Agent (Writes Memory)
def compliance_node(state: GraphState):
    print("\n[Compliance Agent Running]")

    text = state["combined_compliance_text"]
    raw = compliance_agent.run(text)
    output = validate_agent_output(raw, clause_type="Compliance")

    state["compliance"] = output

    # Write to memory
    state["memory"].append({
        "agent": "compliance",
        "findings": output.get("extracted_clauses", [])
    })

    return state

# Finance Agent (Reads Compliance Memory + Adds Validation)
def finance_node(state: GraphState):
    print("\n[Finance Agent Running]")

    # Read compliance findings
    compliance_findings = [
        m for m in state["memory"] if m["agent"] == "compliance"
    ]

    text = state["combined_finance_text"]
    raw = finance_agent.run(text)
    output = validate_agent_output(raw, clause_type="Finance")

    state["finance"] = output

    # Add validation note if compliance findings exist
    if compliance_findings:
        state["validation_notes"].append(
            "Finance reviewed compliance findings for penalty or fee conflicts."
        )

    # Write finance findings to memory
    state["memory"].append({
        "agent": "finance",
        "findings": output.get("extracted_clauses", [])
    })

    return state


In [91]:
# Legal Agent (Final Validation & Check Both)
def legal_node(state: GraphState):
    print("\n[Legal Agent Running]")

    text = state["combined_legal_text"]
    raw = legal_agent.run(text)
    output = validate_agent_output(raw, clause_type="Legal")

    state["legal"] = output

    # Read all earlier findings
    compliance_findings = [m for m in state["memory"] if m["agent"] == "compliance"]
    finance_findings = [m for m in state["memory"] if m["agent"] == "finance"]

    # Add validation notes
    if compliance_findings:
        state["validation_notes"].append(
            "Legal cross-checked compliance obligations for contradictions."
        )

    if finance_findings:
        state["validation_notes"].append(
            "Legal evaluated financial liability and legal enforceability."
        )

    # Write to memory
    state["memory"].append({
        "agent": "legal",
        "findings": output.get("extracted_clauses", [])
    })

    return state


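Defined the compliance, finance, and legal nodes. Each node stores its extracted clauses in shared memory; the finance and legal nodes also read the earlier findings and append validation notes.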

In [92]:
from langgraph.graph import StateGraph, END

graph = StateGraph(GraphState)

graph.add_node("compliance_agent", compliance_node)
graph.add_node("finance_agent", finance_node)
graph.add_node("legal_agent", legal_node)

graph.set_entry_point("compliance_agent")
graph.add_edge("compliance_agent", "finance_agent")
graph.add_edge("finance_agent", "legal_agent")
graph.add_edge("legal_agent", END)

<langgraph.graph.state.StateGraph at 0x2251a95c920>

Constructed a sequential LangGraph enabling agents to read from and write to shared memory.
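
The validation notes in this flow are fixed strings, recorded whenever earlier findings exist. As a sketch of what a content-based cross-check could look like, the hypothetical helper below reports clauses that more than one agent wrote to memory. It assumes the memory layout used in this section (`agent` and `findings` keys) and accepts findings given either as strings or as dicts with a `clause_text` field. The demo data is made up.

```python
from collections import defaultdict
from typing import Dict, List

def clause_text(finding) -> str:
    """Findings may be plain strings or dicts carrying a 'clause_text' field."""
    if isinstance(finding, dict):
        return str(finding.get("clause_text", ""))
    return str(finding)

def shared_clauses(memory: List[dict]) -> Dict[str, List[str]]:
    """Map each normalized clause to the agents that reported it, keeping only overlaps."""
    seen = defaultdict(list)
    for entry in memory:
        for finding in entry.get("findings", []):
            # Lowercase and collapse whitespace so trivial differences do not hide a match.
            key = " ".join(clause_text(finding).lower().split())
            if key and entry["agent"] not in seen[key]:
                seen[key].append(entry["agent"])
    return {clause: agents for clause, agents in seen.items() if len(agents) > 1}

demo_memory = [
    {"agent": "compliance", "findings": ["Both parties agree to indemnify each other."]},
    {"agent": "finance", "findings": [{"clause_text": "Late fee of 2% per month."}]},
    {"agent": "legal", "findings": ["Both parties agree to  indemnify each other."]},
]
print(shared_clauses(demo_memory))
```

A node could append one validation note per overlapping clause instead of a generic sentence, so the notes point at the clauses that need attention.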

In [93]:
app = graph.compile()
result = app.invoke(initial_state)


[Compliance Agent Running]

[Finance Agent Running]

[Legal Agent Running]


In [94]:
import pprint

print("\n=== FINAL MEMORY ===")
pprint.pprint(result["memory"])

print("\n=== VALIDATION NOTES ===")
pprint.pprint(result["validation_notes"])


=== FINAL MEMORY ===
[{'agent': 'compliance',
  'findings': ['16.1 Privacy and Security Matters. Concurrently with the '
               'execution of this Agreement, the Parties are executing a HIPAA '
               'Business Associate Agreement (the "BAA") in the form attached '
               'hereto as Exhibit E.',
               '16.2 Technical Standards. The Company will provide Allscripts '
               'with Updates so that the Subscription Software Services can be '
               'implemented and configured to comply in all material respects '
               'with applicable privacy and security standards (e.g., HITECH, '
               'HIPAA, and Omnibus rule) within a reasonably practicable '
               'timeframe (based on the scope of required enhancements and '
               'other factors) after their final, formal adoption and '
               'publication by the Secretary of the U.S. Department of Health '
               'and Human Services.',
               

COMPLIANCE PIPELINE

In [95]:
# Compliance query template
COMPLIANCE_QUERY = """
Identify clauses related to:
- Regulatory compliance
- Data protection
- Audits and reporting
- Privacy and security obligations
"""

Defined a structured query to retrieve compliance-related contract clauses.

In [97]:
# Pinecone Setup

import os
import json
import numpy as np
from tqdm.auto import tqdm
import matplotlib.pyplot as plt

from pinecone import Pinecone, ServerlessSpec
from sentence_transformers import SentenceTransformer
from dotenv import load_dotenv

load_dotenv()

assert "PINECONE_API_KEY" in os.environ, "PINECONE_API_KEY not found"

print("Pinecone API key loaded successfully")

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

model = SentenceTransformer("all-MiniLM-L6-v2")

print("Sentence-Transformer model loaded successfully")

Pinecone API key loaded successfully



Sentence-Transformer model loaded successfully


In [98]:
INDEX_NAME = "cuad-index-minilm"   
DIMENSION = 384                  

existing_indexes = [idx["name"] for idx in pc.list_indexes()]

if INDEX_NAME not in existing_indexes:
    pc.create_index(
        name=INDEX_NAME,
        dimension=DIMENSION,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(INDEX_NAME)

print(f"Connected to Pinecone index: {INDEX_NAME}")


Connected to Pinecone index: cuad-index-minilm


In [99]:
# Building RAG Search Wrapper
import re
import json
import matplotlib.pyplot as plt
from typing import List, Dict

# Sentence-Transformers embedding function for Pinecone
def embed_batch(texts):
    return model.encode(
        texts,
        show_progress_bar=False,
        convert_to_numpy=True
    )

# Embed a Query
def embed_query(query: str):
    return embed_batch([query])[0].tolist()

In [100]:
# Core RAG architecture
def rag_search(
    query: str,
    index,
    top_k: int = 5
) -> List[Dict]:
   
    query_vector = embed_query(query)

    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True
    )

    retrieved_chunks = []

    for match in results["matches"]:
        retrieved_chunks.append({
            "score": match["score"],
            "contract_id": match["metadata"]["contract_id"],
            "chunk_id": match["metadata"]["chunk_id"],
            "text": match["metadata"]["text"]
        })

    return retrieved_chunks

Loaded the required prerequisites from Milestone 1: the embedding helpers and the RAG search function.

In [102]:
debug = index.query(
    vector=embed_query("test"),
    top_k=1,
    include_metadata=True
)

debug

QueryResponse(matches=[{'id': 'XENCORINC_10_25_2013-EX-10_24-COLLABORATION_AGREEMENT_3_cleaned_chunk_65',
 'metadata': {'chunk_id': 65,
              'contract_id': 'XENCORINC_10_25_2013-EX-10.24-COLLABORATION '
                             'AGREEMENT (3)_cleaned',
              'text': ". The testing laboratory's results shall be in writing "
                      'and shall be final and binding save for manifest error '
                      'on the face of its report. Unless otherwise agreed to '
                      'by the Parties in writing, the costs associated with '
                      'such testing and review shall be borne by the Party '
                      'against whom the testing laboratory result finally '
                      'rules. The testing laboratory shall be required to '
                      'enter into written undertakings of confidentiality no '
                      'less burdensome than set forth or referred to by this '
                      'Agreeme

In [104]:
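# Redefine rag_search with defensive metadata access (.get with defaults),
# so a match that lacks a metadata field does not raise a KeyError.
def rag_search(query, index, top_k=5):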
    query_vec = embed_query(query)

    results = index.query(
        vector=query_vec,
        top_k=top_k,
        include_metadata=True
    )

    retrieved_chunks = []

    for match in results["matches"]:
        metadata = match.get("metadata", {})

        retrieved_chunks.append({
            "score": match.get("score"),
            "contract_id": metadata.get("contract_id", "unknown"),
            "chunk_id": metadata.get("chunk_id", None),   # safe
            "text": metadata.get("text", "")
        })

    return retrieved_chunks


In [105]:
# Retrieve Compliance Context (RAG)
compliance_rag_results = rag_search(
    query=COMPLIANCE_QUERY,
    index=index,
    top_k=5
)

print(f"Retrieved {len(compliance_rag_results)} compliance-related chunks")

Retrieved 5 compliance-related chunks


Used vector-based semantic search to retrieve the most relevant contract chunks related to compliance obligations.
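
The index was created with `metric="cosine"`, so each match score reflects the cosine similarity between the query embedding and a chunk embedding. Below is a small, dependency-free sketch of that computation on toy three-dimensional vectors (the real embeddings have 384 dimensions). The vector values and chunk names are made up for illustration.

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

query_vec = [1.0, 0.0, 1.0]
chunk_vecs = {
    "chunk_a": [1.0, 0.0, 1.0],  # same direction as the query
    "chunk_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "chunk_c": [1.0, 1.0, 0.0],  # partially aligned
}

# Rank chunks from most to least similar, as a top_k query would.
ranked = sorted(
    chunk_vecs,
    key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
    reverse=True,
)
print(ranked)
```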

In [106]:
# Combine Retrieved Chunks
combined_compliance_text = "\n\n".join(
    [c["text"] for c in compliance_rag_results]
)

print(combined_compliance_text[:500])

13. Confidential Information; Audit Rights 13.1. Confidentiality Obligation. It is contemplated that in the course of the performance of this Agreement each Party may, from time to time, disclose proprietary and confidential information to the other ("Confidential Information")

The foregoing provisions shall not prevent the disclosure or use by either party of any information which is or hereafter, through no fault of the other party, become public knowledge or to the extent permitted by law. N


Merged retrieved contract chunks into a single text block for downstream compliance analysis.
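
Joining chunks verbatim can repeat text when retrieval returns identical chunks, and the combined string can grow beyond what an agent prompt comfortably holds. Here is a hedged sketch of a merge step that skips exact duplicates and stops at a character budget. `build_context` and `max_chars` are illustrative and not part of the notebook.

```python
from typing import Dict, List

def build_context(chunks: List[Dict], max_chars: int = 4000) -> str:
    """Join chunk texts with blank lines, skipping exact duplicates and respecting a size cap."""
    seen = set()
    parts: List[str] = []
    used = 0
    for chunk in chunks:
        text = chunk.get("text", "").strip()
        if not text or text in seen:
            continue
        cost = len(text) + (2 if parts else 0)  # account for the "\n\n" separator
        if used + cost > max_chars:
            break  # chunks arrive ranked by score, so the least relevant are dropped
        seen.add(text)
        parts.append(text)
        used += cost
    return "\n\n".join(parts)

demo = [{"text": "Clause A."}, {"text": "Clause A."}, {"text": "Clause B."}]
print(build_context(demo))
```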

In [107]:
raw_compliance_output = compliance_agent.run(combined_compliance_text)
print(raw_compliance_output)

```json
{
  "extracted_clauses": [
    "It is contemplated that in the course of the performance of this Agreement each Party may, from time to time, disclose proprietary and confidential information to the other (\"Confidential Information\")",
    "The parties agree to ensure that they will at all times comply with the provisions and obligations imposed by the Data Protection Act 1984, the EU Data Protection Directive 95/46 and any implementing legislation in the United Kingdom.",
    "Both parties agree to indemnify each other in respect of any unauthorised disclosure of data by them.",
    "You agree to abide by all applicable laws pertaining to the privacy of consumer, employee, and transactional information (\"Privacy Laws\")",
    "You agree to comply with our standards and policies that we may issue (without any obligation to do so) pertaining to the privacy of consumer, employee, and transactional information. If there is a conflict between our standards and policies and Priva

Executed the Compliance Agent on the retrieved contract context to extract compliance obligations and assess risk.

In [108]:
validated_compliance_output = validate_agent_output(
    raw_compliance_output,
    clause_type="Compliance"
)

validated_compliance_output

{'clause_type': 'Compliance',
 'extracted_clauses': ['It is contemplated that in the course of the performance of this Agreement each Party may, from time to time, disclose proprietary and confidential information to the other ("Confidential Information")',
  'The parties agree to ensure that they will at all times comply with the provisions and obligations imposed by the Data Protection Act 1984, the EU Data Protection Directive 95/46 and any implementing legislation in the United Kingdom.',
  'Both parties agree to indemnify each other in respect of any unauthorised disclosure of data by them.',
  'You agree to abide by all applicable laws pertaining to the privacy of consumer, employee, and transactional information ("Privacy Laws")',
  'You agree to comply with our standards and policies that we may issue (without any obligation to do so) pertaining to the privacy of consumer, employee, and transactional information. If there is a conflict between our standards and policies and Pri

Validated the Compliance Agent’s output against the standard JSON schema to ensure structured and reliable results.
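
The raw agent output is wrapped in a Markdown JSON fence, so it has to be unwrapped before it can be parsed. `validate_agent_output` is defined outside this section and is not reproduced here. The sketch below is a hypothetical stand-in (`parse_fenced_json`, with an assumed set of required keys) that shows one way such a step could work.

```python
import json
import re

FENCE = "`" * 3  # built programmatically to avoid a literal fence inside this example

def parse_fenced_json(raw: str, required=("extracted_clauses",)) -> dict:
    """Strip an optional Markdown code fence, parse the JSON, and check required keys."""
    text = raw.strip()
    # Drop an opening fence (optionally tagged "json") and a closing fence.
    text = re.sub(r"^`{3}(?:json)?\s*", "", text)
    text = re.sub(r"\s*`{3}$", "", text)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data

raw = FENCE + 'json\n{"extracted_clauses": ["Clause 1"], "risk_level": "high"}\n' + FENCE
print(parse_fenced_json(raw)["risk_level"])
```

Raising on missing keys, instead of filling in defaults, makes a malformed agent response visible at this step rather than later in the summary cells.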

In [110]:
compliance_risk_summary = {
    "risk_level": validated_compliance_output["risk_level"],
    "confidence": validated_compliance_output["confidence"],
    "num_clauses": len(validated_compliance_output["extracted_clauses"])
}

compliance_risk_summary

{'risk_level': 'high', 'confidence': 0.95, 'num_clauses': 6}

Generated a concise compliance risk summary including overall risk level, confidence score, and number of identified compliance clauses.

In [111]:
# Final Compliance Pipeline Output
compliance_pipeline_output = {
    "query": COMPLIANCE_QUERY.strip(),
    "retrieved_chunks": len(compliance_rag_results),
    "analysis": validated_compliance_output,
    "summary": compliance_risk_summary
}

# Optional: save to file
with open("compliance_pipeline_output.json", "w", encoding="utf-8") as f:
    json.dump(compliance_pipeline_output, f, indent=2)

compliance_pipeline_output


{'query': 'Identify clauses related to:\n- Regulatory compliance\n- Data protection\n- Audits and reporting\n- Privacy and security obligations',
 'retrieved_chunks': 5,
 'analysis': {'clause_type': 'Compliance',
  'extracted_clauses': ['It is contemplated that in the course of the performance of this Agreement each Party may, from time to time, disclose proprietary and confidential information to the other ("Confidential Information")',
   'The parties agree to ensure that they will at all times comply with the provisions and obligations imposed by the Data Protection Act 1984, the EU Data Protection Directive 95/46 and any implementing legislation in the United Kingdom.',
   'Both parties agree to indemnify each other in respect of any unauthorised disclosure of data by them.',
   'You agree to abide by all applicable laws pertaining to the privacy of consumer, employee, and transactional information ("Privacy Laws")',
   'You agree to comply with our standards and policies that we m

In [112]:
import json

with open("compliance_pipeline_output.json", "w", encoding="utf-8") as f:
    json.dump(compliance_pipeline_output, f, indent=2)

print("Saved → compliance_pipeline_output.json")


Saved → compliance_pipeline_output.json


FINANCE PIPELINE

In [113]:
FINANCE_QUERY = """
Identify clauses related to:
- Payment terms
- Fees and invoicing
- Penalties or late fees
- Financial liability
"""

Defined a focused query template to retrieve clauses related to payment terms, fees and invoicing, penalties, and financial liability.

In [None]:
# Retrieve Finance Context (RAG)
finance_rag_results = rag_search(
    query=FINANCE_QUERY,
    index=index,
    top_k=5
)

print(f"Retrieved {len(finance_rag_results)} finance-related chunks")

Retrieved 5 finance-related chunks


Used vector-based semantic search to retrieve the most relevant contract chunks related to financial obligations.

In [115]:
combined_finance_text = "\n\n".join(
    [c["text"] for c in finance_rag_results]
)

print(combined_finance_text[:500])

. e. Late Fees and Collection Costs. If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid. In addition, if Seller has to file

(b) Service Provider shall invoice Owner within thirty (30) days of completion of any Non-Cove


Merged retrieved contract chunks into a single context input for downstream finance analysis.

In [116]:
raw_finance_output = finance_agent.run(combined_finance_text)
print(raw_finance_output)

```json
{
  "extracted_clauses": [
    {
      "clause_text": "If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid.",
      "risk_level": "high",
      "confidence": 0.95,
      "evidence": [
        "Late Fees and Collection Costs. If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid."
      ]
    },
    {
      "clause_text": "Owner shall pay Service Provider within thirty (30) days after the invoice date. Fees are conditioned upon tim

Executed the Finance Agent on the retrieved context to extract finance-related clauses and assess associated risks.

In [117]:
validated_finance_output = validate_agent_output(
    raw_finance_output,
    clause_type="Finance"
)

validated_finance_output

{'clause_type': 'Finance',
 'extracted_clauses': [{'clause_text': 'If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid.',
   'risk_level': 'high',
   'confidence': 0.95,
   'evidence': ['Late Fees and Collection Costs. If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid.']},
  {'clause_text': 'Owner shall pay Service Provider within thirty (30) days after the invoice date. Fees are conditioned upon timely payment and any past due balanc

Validated the Finance Agent’s output to ensure schema consistency and reliable structured results.

In [118]:
finance_risk_summary = {
    "risk_level": validated_finance_output["risk_level"],
    "confidence": validated_finance_output["confidence"],
    "num_clauses": len(validated_finance_output["extracted_clauses"])
}

finance_risk_summary

{'risk_level': 'high', 'confidence': 0.88, 'num_clauses': 4}

Generated a concise finance risk summary including risk level, confidence score, and number of identified clauses.

In [119]:
finance_pipeline_output = {
    "query": FINANCE_QUERY.strip(),
    "retrieved_chunks": len(finance_rag_results),
    "analysis": validated_finance_output,
    "summary": finance_risk_summary
}

finance_pipeline_output

{'query': 'Identify clauses related to:\n- Payment terms\n- Fees and invoicing\n- Penalties or late fees\n- Financial liability',
 'retrieved_chunks': 5,
 'analysis': {'clause_type': 'Finance',
  'extracted_clauses': [{'clause_text': 'If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpaid.',
    'risk_level': 'high',
    'confidence': 0.95,
    'evidence': ['Late Fees and Collection Costs. If Buyer fails to pay Seller an amount owed under this Agreement by the invoice due date, then Buyer will owe Seller: (i) the delinquent amount; and (ii) a late payment fee equal to two percent (2%) of the delinquent amount for each full or partial calendar month past the invoice due date that the delinquent amount remains unpai

Packaged retrieval metadata, validated financial analysis, and summarized risk into a single structured output.
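
The compliance, finance, and legal pipelines follow the same steps: retrieve, merge, run the agent, validate, summarize. Below is a minimal sketch of factoring those steps into one function, assuming the retrieval, agent, and validation steps are passed in as callables. `run_pipeline` and the stub callables are hypothetical.

```python
from typing import Callable, Dict, List

def run_pipeline(
    query: str,
    search: Callable[[str], List[Dict]],
    run_agent: Callable[[str], object],
    validate: Callable[[object], dict],
) -> dict:
    """Retrieve chunks, merge them, run the agent, validate, and summarize."""
    chunks = search(query)
    context = "\n\n".join(c["text"] for c in chunks)
    analysis = validate(run_agent(context))
    summary = {
        "risk_level": analysis.get("risk_level"),
        "confidence": analysis.get("confidence"),
        "num_clauses": len(analysis.get("extracted_clauses", [])),
    }
    return {
        "query": query.strip(),
        "retrieved_chunks": len(chunks),
        "analysis": analysis,
        "summary": summary,
    }

# Stubs standing in for the retrieval function, an agent's run method, and the validator.
def fake_search(query: str) -> List[Dict]:
    return [{"text": "Clause A."}, {"text": "Clause B."}]

def fake_agent(context: str) -> dict:
    return {"extracted_clauses": context.split("\n\n"), "risk_level": "high", "confidence": 0.9}

def fake_validate(raw: dict) -> dict:
    return raw

result = run_pipeline("  demo query ", fake_search, fake_agent, fake_validate)
print(result["summary"])
```

With the notebook's objects, the finance call might look like `run_pipeline(FINANCE_QUERY, lambda q: rag_search(q, index, top_k=5), finance_agent.run, lambda raw: validate_agent_output(raw, clause_type="Finance"))`.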

In [120]:
import json

with open("finance_pipeline_output.json", "w", encoding="utf-8") as f:
    json.dump(finance_pipeline_output, f, indent=2)

print("Saved → finance_pipeline_output.json")

Saved → finance_pipeline_output.json


LEGAL PIPELINE

In [121]:
LEGAL_QUERY = """
Identify clauses related to:
- Termination
- Governing law
- Jurisdiction
- Indemnity
- Legal obligations and liabilities
"""

Defined a focused query template to retrieve clauses related to termination, governing law, jurisdiction, indemnity, and legal liabilities.

In [122]:
legal_rag_results = rag_search(
    query=LEGAL_QUERY,
    index=index,
    top_k=5
)

print(f"Retrieved {len(legal_rag_results)} legal chunks")

Retrieved 5 legal chunks


Used vector-based semantic search to retrieve the most relevant contract chunks related to legal obligations.

In [123]:
combined_legal_text = "\n\n".join(
    [c["text"] for c in legal_rag_results]
)

print(combined_legal_text[:500])

. Clauses 5.6 (Effect of Termination), 6 (Confidential Information), 7 (Proprietary Rights), 9 (Limitation of Liability), 10 (Indemnification) and 11 (General) shall survive the termination or expiration of this Agreement.

. (c) Termination. Either Party may terminate this Agreement prior to its expiration upon the occurrence of either of the following: (i) the other Party becomes insolvent, or institutes (or there is instituted against it) proceedings in bankruptcy, insolvency, reorganization 


Merged retrieved contract chunks into a single context input for downstream legal analysis.

In [124]:
raw_legal_output = legal_agent.run(combined_legal_text)
print(raw_legal_output)

```json
{
  "extracted_clauses": [
    "Either Party may terminate this Agreement prior to its expiration upon the occurrence of either of the following: (i) the other Party becomes insolvent, or institutes (or there is instituted against it) proceedings in bankruptcy, insolvency, reorganization or dissolution, makes an assignment for the benefit of creditors or becomes nationalized or has any of its material assets confiscated or expropriated; or (ii) the other Party (in this case, the \"breaching Party\") fails to perform any of its obligations hereunder and fails to correct such failure within Ninety (90) calendar days after receiving written demand therefore from the non-breaching Party, specifying the failure in sufficient detail for the breaching Party to correct such failure; provided, however, that upon a second breach of the same obligation by such Party, the other Party may forthwith terminate this Agreement upon notice to the breaching Party.",
    "Each party shall have the

Executed the Legal Agent on the retrieved context to extract legal clauses and assess associated risks.
The next cell validates the agent's output against a predefined schema to ensure structured and reliable legal analysis.

In [125]:
validated_legal_output = validate_agent_output(
    raw_legal_output,
    clause_type="Legal"
)

validated_legal_output

{'clause_type': 'Legal',
 'extracted_clauses': ['Either Party may terminate this Agreement prior to its expiration upon the occurrence of either of the following: (i) the other Party becomes insolvent, or institutes (or there is instituted against it) proceedings in bankruptcy, insolvency, reorganization or dissolution, makes an assignment for the benefit of creditors or becomes nationalized or has any of its material assets confiscated or expropriated; or (ii) the other Party (in this case, the "breaching Party") fails to perform any of its obligations hereunder and fails to correct such failure within Ninety (90) calendar days after receiving written demand therefore from the non-breaching Party, specifying the failure in sufficient detail for the breaching Party to correct such failure; provided, however, that upon a second breach of the same obligation by such Party, the other Party may forthwith terminate this Agreement upon notice to the breaching Party.',
  "Each party shall hav

Generated a concise legal risk summary including risk level, confidence score, and number of identified clauses.

In [126]:
# legal risk summary
legal_risk_summary = {
    "risk_level": validated_legal_output["risk_level"],
    "confidence": validated_legal_output["confidence"],
    "num_clauses": len(validated_legal_output["extracted_clauses"])
}

legal_risk_summary

{'risk_level': 'Medium', 'confidence': 0.95, 'num_clauses': 2}

Generated a concise summary of legal risks, including total extracted legal clauses, risk rating, and confidence score.

In [127]:
legal_pipeline_output = {
    "query": LEGAL_QUERY.strip(),
    "retrieved_chunks": len(legal_rag_results),
    "analysis": validated_legal_output,
    "summary": legal_risk_summary
}

legal_pipeline_output

{'query': 'Identify clauses related to:\n- Termination\n- Governing law\n- Jurisdiction\n- Indemnity\n- Legal obligations and liabilities',
 'retrieved_chunks': 5,
 'analysis': {'clause_type': 'Legal',
  'extracted_clauses': ['Either Party may terminate this Agreement prior to its expiration upon the occurrence of either of the following: (i) the other Party becomes insolvent, or institutes (or there is instituted against it) proceedings in bankruptcy, insolvency, reorganization or dissolution, makes an assignment for the benefit of creditors or becomes nationalized or has any of its material assets confiscated or expropriated; or (ii) the other Party (in this case, the "breaching Party") fails to perform any of its obligations hereunder and fails to correct such failure within Ninety (90) calendar days after receiving written demand therefore from the non-breaching Party, specifying the failure in sufficient detail for the breaching Party to correct such failure; provided, however, th

Prepared a structured package containing the original legal query, retrieval metadata, validated analysis, and risk summary.

In [128]:
import json

with open("legal_pipeline_output.json", "w", encoding="utf-8") as f:
    json.dump(legal_pipeline_output, f, indent=2)

print("Saved → legal_pipeline_output.json")

Saved → legal_pipeline_output.json


OPERATIONS PIPELINE

In [129]:
OPERATIONS_QUERY = """
Identify clauses related to:
- Operational deliverables
- Timelines and milestones
- Service obligations
- Performance standards (SLAs)
"""

Defined a focused query template to retrieve clauses related to operational deliverables, timelines, service obligations, and performance standards.

In [130]:
operations_rag_results = rag_search(
    query=OPERATIONS_QUERY,
    index=index,
    top_k=5
)

print(f"Retrieved {len(operations_rag_results)} operations-related chunks")

Retrieved 5 operations-related chunks


Used vector-based semantic search to retrieve the most relevant contract chunks related to operational obligations.
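The internals of `rag_search` are not shown in this section. As a rough illustration of what top-k vector retrieval does, the sketch below ranks chunks by cosine similarity to a query vector. The function names, the `embedding` field, and the two-dimensional toy vectors are all assumptions made for the example; only the `text` field matches what the notebook reads from each result.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunks, top_k=5):
    """Return the top_k chunks with the highest similarity to query_vec."""
    scored = [
        {"text": chunk["text"], "score": cosine(query_vec, chunk["embedding"])}
        for chunk in chunks
    ]
    return sorted(scored, key=lambda c: c["score"], reverse=True)[:top_k]


# Toy data: real embeddings would come from an embedding model.
toy_chunks = [
    {"text": "Performance Objectives", "embedding": [0.9, 0.1]},
    {"text": "Payment terms", "embedding": [0.0, 1.0]},
    {"text": "Service levels", "embedding": [0.7, 0.7]},
]

results = top_k_chunks([1.0, 0.0], toy_chunks, top_k=2)
print([r["text"] for r in results])
```

A production index would normally delegate the ranking to a vector store rather than scoring every chunk in Python, but the ordering principle is the same.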

In [131]:
combined_operations_text = "\n\n".join(
    [c["text"] for c in operations_rag_results]
)

print(combined_operations_text[:500])

1.3 Performance Objectives. Service Provider shall perform the Services and its obligations under this Agreement in a manner that (a) insures the operation of the SEF within all required operational parameters and requirements, (b) preserves all warranties provided by manufacturers, suppliers, or Service Providers who provided materials or labor under the EPC Agreement relating to the SEF, subject to Force Majeure, (c) maintains the SEF, and

(d) seeks to minimize the variable operating costs of


Merged retrieved contract chunks into a single context input for downstream operations analysis.

In [132]:
raw_operations_output = operations_agent.run(combined_operations_text)
print(raw_operations_output)

```json
{
  "extracted_clauses": [
    {
      "clause_text": "Service Provider shall perform the Services and its obligations under this Agreement in a manner that (a) insures the operation of the SEF within all required operational parameters and requirements, (b) preserves all warranties provided by manufacturers, suppliers, or Service Providers who provided materials or labor under the EPC Agreement relating to the SEF, subject to Force Majeure, (c) maintains the SEF, and (d) seeks to minimize the variable operating costs of and wear and tear on the SEF, including using commercially reasonable efforts to achieve industry standard levels of SEF availability.",
      "risk_level": "medium",
      "confidence": 0.95,
      "evidence": [
        "Service Provider shall perform the Services and its obligations under this Agreement in a manner that (a) insures the operation of the SEF within all required operational parameters and requirements"
      ]
    },
    {
      "clause_text": "

Executed the operations agent on the retrieved context to extract operations-related clauses and assess associated risks. 
Validated the agent’s output against a predefined schema to ensure structured and reliable operations analysis.

In [133]:
validated_operations_output = validate_agent_output(
    raw_operations_output,
    clause_type="Operations"
)

validated_operations_output

{'clause_type': 'Operations',
 'extracted_clauses': [{'clause_text': 'Service Provider shall perform the Services and its obligations under this Agreement in a manner that (a) insures the operation of the SEF within all required operational parameters and requirements, (b) preserves all warranties provided by manufacturers, suppliers, or Service Providers who provided materials or labor under the EPC Agreement relating to the SEF, subject to Force Majeure, (c) maintains the SEF, and (d) seeks to minimize the variable operating costs of and wear and tear on the SEF, including using commercially reasonable efforts to achieve industry standard levels of SEF availability.',
   'risk_level': 'medium',
   'confidence': 0.95,
   'evidence': ['Service Provider shall perform the Services and its obligations under this Agreement in a manner that (a) insures the operation of the SEF within all required operational parameters and requirements']},
  {'clause_text': 'C. Dependencies - Obligations, t

Generated a concise operations risk summary including risk level, confidence score, and number of identified clauses.

In [134]:
# risk summary
operations_risk_summary = {
    "risk_level": validated_operations_output["risk_level"],
    "confidence": validated_operations_output["confidence"],
    "num_clauses": len(validated_operations_output["extracted_clauses"])
}

operations_risk_summary

{'risk_level': 'medium', 'confidence': 0.85, 'num_clauses': 3}

Generated a concise summary showing operational risk level, model confidence, and number of extracted operational clauses.

In [135]:
# package operations pipeline output
operations_pipeline_output = {
    "query": OPERATIONS_QUERY.strip(),
    "retrieved_chunks": len(operations_rag_results),
    "analysis": validated_operations_output,
    "summary": operations_risk_summary
}

operations_pipeline_output

{'query': 'Identify clauses related to:\n- Operational deliverables\n- Timelines and milestones\n- Service obligations\n- Performance standards (SLAs)',
 'retrieved_chunks': 5,
 'analysis': {'clause_type': 'Operations',
  'extracted_clauses': [{'clause_text': 'Service Provider shall perform the Services and its obligations under this Agreement in a manner that (a) insures the operation of the SEF within all required operational parameters and requirements, (b) preserves all warranties provided by manufacturers, suppliers, or Service Providers who provided materials or labor under the EPC Agreement relating to the SEF, subject to Force Majeure, (c) maintains the SEF, and (d) seeks to minimize the variable operating costs of and wear and tear on the SEF, including using commercially reasonable efforts to achieve industry standard levels of SEF availability.',
    'risk_level': 'medium',
    'confidence': 0.95,
    'evidence': ['Service Provider shall perform the Services and its obligati

Packaged the operations analysis, retrieved chunk metadata, and risk summary into a single structured output dictionary.

In [136]:
# save output
import json

with open("operations_pipeline_output.json", "w", encoding="utf-8") as f:
    json.dump(operations_pipeline_output, f, indent=2)

print("Saved → operations_pipeline_output.json")

Saved → operations_pipeline_output.json


COORDINATOR: MERGING PIPELINE OUTPUTS

In [138]:
legal_output = legal_pipeline_output
compliance_output = compliance_pipeline_output
finance_output = finance_pipeline_output
operations_output = operations_pipeline_output

Consumed validated outputs from domain-specific pipelines as inputs for coordinated analysis.

In [140]:
FINAL_SCHEMA = {
    "legal": {},
    "compliance": {},
    "finance": {},
    "operations": {},
    "overall_risk": "",
    "confidence_scores": {},
    "total_clauses": 0
}

Defined a common output template (a dictionary of expected keys with default values) to standardize merged outputs across all agent pipelines.
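`FINAL_SCHEMA` is only declared in this section and is not referenced by the later cells. One way it could be put to use is as a template to check a finished output against. The sketch below is an assumption about how such a check might look; `check_against_schema` is an illustrative name and is not part of the notebook.

```python
FINAL_SCHEMA = {
    "legal": {},
    "compliance": {},
    "finance": {},
    "operations": {},
    "overall_risk": "",
    "confidence_scores": {},
    "total_clauses": 0,
}


def check_against_schema(output: dict, schema: dict) -> list:
    """Return a list of problems: missing keys, wrong types, unexpected keys."""
    problems = []
    for key, default in schema.items():
        if key not in output:
            problems.append(f"missing key: {key}")
        elif not isinstance(output[key], type(default)):
            expected = type(default).__name__
            actual = type(output[key]).__name__
            problems.append(f"{key}: expected {expected}, got {actual}")
    for key in output:
        if key not in schema:
            problems.append(f"unexpected key: {key}")
    return problems


# Illustrative candidates, not real pipeline results.
complete = {
    "legal": {}, "compliance": {}, "finance": {}, "operations": {},
    "overall_risk": "high", "confidence_scores": {}, "total_clauses": 5,
}
incomplete = {"legal": {}, "total_clauses": "5"}

print(check_against_schema(complete, FINAL_SCHEMA))
print(check_against_schema(incomplete, FINAL_SCHEMA))
```

The check compares only top-level keys and value types; it does not look inside the per-domain analyses.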

In [141]:
# merge pipeline outputs
def coordinator_merge(legal, compliance, finance, operations):
    return {
        "legal": legal["analysis"],
        "compliance": compliance["analysis"],
        "finance": finance["analysis"],
        "operations": operations["analysis"]
    }

merged_output = coordinator_merge(
    legal_output,
    compliance_output,
    finance_output,
    operations_output
)

merged_output.keys()

dict_keys(['legal', 'compliance', 'finance', 'operations'])

Merged domain-specific analyses into a single structured response using a coordinator function.

In [146]:
# Compute overall risk
risk_map = {"low": 1, "medium": 2, "high": 3}

def compute_overall_risk(merged):
    risks = [
        merged["legal"]["risk_level"],
        merged["compliance"]["risk_level"],
        merged["finance"]["risk_level"],
        merged["operations"]["risk_level"]
    ]

    # normalize to lowercase
    risks = [r.lower() for r in risks]

    # map to numbers
    scores = [risk_map[r] for r in risks]

    avg_score = sum(scores) / len(scores)

    if avg_score <= 1.5:
        return "low"
    elif avg_score <= 2.2:
        return "medium"
    else:
        return "high"

In [147]:
overall_risk = compute_overall_risk(merged_output)
overall_risk

'high'

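Calculated a unified contract-level risk rating by mapping each agent's risk level to a number (low = 1, medium = 2, high = 3), averaging the four scores with equal weight, and thresholding the mean (at most 1.5 is low, at most 2.2 is medium, anything higher is high).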
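The averaging step is easier to follow with concrete numbers. The sketch below reuses the same mapping and thresholds as `compute_overall_risk`, but as a standalone function over a plain list of labels so that the average is visible. `overall_risk_from_labels` is an illustrative name, and the sample label combinations are assumed, since the compliance and finance ratings are not shown in this section.

```python
RISK_MAP = {"low": 1, "medium": 2, "high": 3}


def overall_risk_from_labels(labels):
    """Return (average score, overall label) for a list of risk labels."""
    scores = [RISK_MAP[label.strip().lower()] for label in labels]
    avg = sum(scores) / len(scores)
    if avg <= 1.5:
        overall = "low"
    elif avg <= 2.2:
        overall = "medium"
    else:
        overall = "high"
    return avg, overall


# Two medium and two high ratings: (2 + 3 + 3 + 2) / 4 = 2.5
print(overall_risk_from_labels(["Medium", "high", "high", "medium"]))

# Three medium and one high: 9 / 4 = 2.25, which is above the 2.2 cutoff
print(overall_risk_from_labels(["medium", "medium", "medium", "high"]))

# One low and three medium: 7 / 4 = 1.75
print(overall_risk_from_labels(["low", "medium", "medium", "medium"]))
```

With four agents the average moves in steps of 0.25, so the 2.2 cutoff means a single "high" rating alongside three "medium" ratings is already enough to report the contract as "high". An unrecognized label raises a `KeyError`, both here and in the original function.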

In [148]:
# Build final JSON Output
def build_final_output(merged):
    return {
        "legal": merged["legal"],
        "compliance": merged["compliance"],
        "finance": merged["finance"],
        "operations": merged["operations"],

        "confidence_scores": {
            "legal": merged["legal"]["confidence"],
            "compliance": merged["compliance"]["confidence"],
            "finance": merged["finance"]["confidence"],
            "operations": merged["operations"]["confidence"]
        },

        "total_clauses": (
            len(merged["legal"]["extracted_clauses"]) +
            len(merged["compliance"]["extracted_clauses"]) +
            len(merged["finance"]["extracted_clauses"]) +
            len(merged["operations"]["extracted_clauses"])
        ),

        "overall_risk": compute_overall_risk(merged)
    }

In [149]:
final_json_output = build_final_output(merged_output)
final_json_output

{'legal': {'clause_type': 'Legal',
  'extracted_clauses': ['Either Party may terminate this Agreement prior to its expiration upon the occurrence of either of the following: (i) the other Party becomes insolvent, or institutes (or there is instituted against it) proceedings in bankruptcy, insolvency, reorganization or dissolution, makes an assignment for the benefit of creditors or becomes nationalized or has any of its material assets confiscated or expropriated; or (ii) the other Party (in this case, the "breaching Party") fails to perform any of its obligations hereunder and fails to correct such failure within Ninety (90) calendar days after receiving written demand therefore from the non-breaching Party, specifying the failure in sufficient detail for the breaching Party to correct such failure; provided, however, that upon a second breach of the same obligation by such Party, the other Party may forthwith terminate this Agreement upon notice to the breaching Party.',
   "Each par

Constructed the final unified compliance–finance–legal–operations JSON output including per-agent confidence scores, total extracted clauses, and overall risk rating.

In [150]:
# save final output
import json

with open("final_merged_output.json", "w", encoding="utf-8") as f:
    json.dump(final_json_output, f, indent=2)

print("Saved → final_merged_output.json")

Saved → final_merged_output.json
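A quick way to confirm that the saved file is usable downstream is to reload it and compare it with the in-memory dictionary. The sketch below does this with a small stand-in dictionary and a temporary directory, so it does not touch the real `final_merged_output.json`; the sample values are illustrative only.

```python
import json
import os
import tempfile

# Stand-in for final_json_output; the values here are illustrative only.
sample_output = {
    "overall_risk": "high",
    "confidence_scores": {"legal": 0.95, "operations": 0.85},
    "total_clauses": 5,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "final_merged_output.json")

    # Write with the same settings the notebook uses.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample_output, f, indent=2)

    # Read it back before the temporary directory is removed.
    with open(path, "r", encoding="utf-8") as f:
        reloaded = json.load(f)

round_trip_ok = reloaded == sample_output
print(round_trip_ok)
```

The comparison holds here because the dictionary contains only JSON-native types; tuples, sets, or non-string keys would not survive the round trip unchanged.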
