In [None]:
# Import Azure OpenAI
from langchain_openai import AzureChatOpenAI

# Initialize LLM with explicit endpoint + key
llm = AzureChatOpenAI(
    deployment_name="neostats_hackathon_api_v1",   # must match the deployment name in your Azure resource
    model="gpt-4.1",
    temperature=0,
    api_version="2024-05-01-preview",
    api_key="Azure_Open_API",                      # Replace with your actual key
    # Pass only the resource base URL; the client adds the deployment path
    # and the api-version query parameter itself.
    azure_endpoint="https://neoaihackathon.cognitiveservices.azure.com/"
)

# Simple call
response = llm.invoke("Hi")
print(response.content)
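
The cell above hardcodes the key and endpoint. A minimal sketch of collecting the same settings from environment variables instead; the variable names `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, and `AZURE_OPENAI_API_VERSION` and the helper `azure_settings` are assumptions made for this example, not something the notebook defines.

```python
import os

def azure_settings(env=os.environ):
    """Collect Azure OpenAI settings from environment variables.

    The variable names used here are assumptions for this sketch.
    """
    required = ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "api_key": env["AZURE_OPENAI_API_KEY"],
        # Normalize to a base URL with exactly one trailing slash
        "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"].rstrip("/") + "/",
        "api_version": env.get("AZURE_OPENAI_API_VERSION", "2024-05-01-preview"),
    }
```

The returned dictionary could then be unpacked into the constructor, for example `AzureChatOpenAI(deployment_name=..., **azure_settings())`, which keeps the secret out of the notebook file.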

In [1]:
contract_document = """Statement of Work (SOW)
This Statement of Work (&quot;SOW&quot;) is entered into by and between Patrick and Fredrick
Associates Inc. (&quot;Client&quot;) and Hari and Winston Associates LLC (&quot;Service Provider&quot; or
“Company”), effective as of March 1, 2025. This SOW is governed by the terms and
conditions outlined in the Master Services Agreement (&quot;MSA&quot;) signed between the parties.
1. Scope of Services
Neostats Analytics LLC agrees to provide the following services to the Client:
 Data analytics consulting
 Dashboard development and visualization
 Machine Learning model implementation
 Data pipeline optimization
 Ongoing technical support and system maintenance
2. Deliverables
The Service Provider will deliver the following:
 Custom Power BI dashboards tailored to Client’s specific needs
 Monthly reports detailing system performance and insights
 Technical documentation and user guides
 Regular updates on ML model improvements
3. Project Timeline
 Project Kickoff: March 5, 2025
 Milestone 1: Initial assessment and requirements gathering (March 15, 2025)
 Milestone 2: Prototype delivery (April 10, 2025)
 Milestone 3: Final dashboard deployment (May 20, 2025)
 Completion Date: June 15, 2025 (Subject to extension based on mutual agreement,
with priority to Neostats Analytics LLC&#39;s scheduling constraints)
4. Fees and Payment Terms
The client will pay the service provider the following amount:
 Total Fee: $150,000 USD
 Payment Schedule:
o $75,000 USD upon signing this SOW
o $45,000 USD upon delivery of the prototype
o $30,000 USD upon final acceptance
 Late payments beyond 15 days from the due date will incur a late fee of 5% per
month.

5. Intellectual Property Rights
All deliverables, including but not limited to software, models, and documentation, shall
remain the exclusive property of Neostats Analytics LLC unless otherwise agreed in writing.
6. Confidentiality
Both parties agree to maintain the confidentiality of proprietary information. However,
Neostats Analytics LLC reserves the right to use anonymized data insights for future
research and development purposes.
7. Termination Clause
 The Client may terminate this agreement with 60 days&#39; written notice.
 Neostats Analytics LLC reserves the right to terminate with 30 days&#39; written notice,
with all outstanding payments due immediately upon termination.
8. Limitation of Liability
The total liability of Neostats Analytics LLC under this agreement shall not exceed $150,000
USD. Neostats shall not be liable for indirect, incidental, or consequential damages.
9. Force Majeure
The Service Provider shall not be excused from the performance of any obligations under
this Agreement on account of Force Majeure events. Notwithstanding the occurrence of any
event beyond the control of the Service Provider, including but not limited to acts of God,
governmental restrictions, strikes, lockouts, riots, epidemics, or natural disasters, the Service
Provider shall remain fully liable for timely and complete performance of all its obligations
under this Agreement. The Client shall, however, be excused from performance to the extent
affected by Force Majeure events.
10. Indemnity
The Company agrees to indemnify and hold harmless the Client, its directors, officers, and
employees against any third-party claims, damages, or expenses (including reasonable legal
fees) directly arising out of (i) a material breach of this Agreement by the Company, or (ii)
willful misconduct or gross negligence of the Company or its personnel in the performance of
obligations under this Agreement. The Company’s liability under this indemnity shall,
however, not extend to any claims resulting from (a) the Client’s negligence, misconduct, or
breach of this Agreement, or (b) indirect, consequential, or punitive damages, except to the
extent such exclusion is prohibited by law.

11. Dispute Resolution Mechanism
Any dispute, controversy, or claim arising out of or relating to this Agreement, or the breach,
termination, or validity thereof, shall be resolved exclusively in the state or federal courts
located in New York, United States. Each party irrevocably submits to the personal
jurisdiction of such courts and waives any objection based on forum non conveniens or any
other objection to venue.
This Agreement shall be governed by and construed in accordance with the laws of the
State of New York, without regard to its conflict of law provisions.

In the event of any action or proceeding to enforce rights under this Agreement, the
prevailing party shall be entitled to recover its reasonable attorneys’ fees and costs.
12. Amendments
Any amendments to this SOW must be made in writing and signed by authorized
representatives of both parties. Neostats Analytics LLC reserves the right to adjust the
project scope and timelines due to unforeseen technical complexities, with prior notice.
13. Signatures
For Patrick and Fredrick Associates Inc.
Name: ___________________________
Title: ____________________________
Date: ____________________________
For Hari and Winston Associates LLC
Name: ___________________________
Title: ____________________________
Date: ____________________________"""
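
The contract string contains HTML entities such as `&quot;` and `&#39;`, which look like copy-paste residue. A small sketch of decoding them before chunking, using the standard library; `clean_contract_text` is a hypothetical helper name. Note that the character counts printed by later cells were produced on the uncleaned text, so they would change slightly after cleaning.

```python
import html

def clean_contract_text(text: str) -> str:
    """Decode HTML entities such as &quot; and &#39; into plain characters."""
    return html.unescape(text)

sample = "This Statement of Work (&quot;SOW&quot;) ... 60 days&#39; written notice."
print(clean_contract_text(sample))
# prints: This Statement of Work ("SOW") ... 60 days' written notice.
```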


In [3]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings

# Load all-MiniLM model
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
def dynamic_chunker(text: str, C_min=400, C_max=2000, alpha=30, beta=0.20):
    N = len(text)

    # Formula-based chunk size
    chunk_size = min(C_max, max(C_min, int(alpha * (N ** 0.6))))
    overlap = int(beta * chunk_size)

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=overlap,
        separators=["\n\n", "\n", ". ", " ", ""]
    )

    chunks = splitter.split_text(text)
    return chunks, chunk_size, overlap

# Example





In [4]:
chunks, size, overlap = dynamic_chunker(contract_document)

print(f"Doc length: {len(contract_document)}")
print(f"Chunk size: {size}, Overlap: {overlap}, Total chunks: {len(chunks)}")
print(chunks[0][:200])

Doc length: 5224
Chunk size: 2000, Overlap: 400, Total chunks: 4
Statement of Work (SOW)
This Statement of Work (&quot;SOW&quot;) is entered into by and between Patrick and Fredrick
Associates Inc. (&quot;Client&quot;) and Hari and Winston Associates LLC (&quot;Ser
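
The output above shows the chunk size sitting at the `C_max` cap. A sketch that isolates the sizing formula from `dynamic_chunker` makes this visible: for the 5224-character contract, `alpha * N ** 0.6` is roughly 5100, so it is clamped to 2000 and the overlap becomes 400. With the default parameters the formula only decides the size for documents between roughly 75 and 1,100 characters; outside that range the floor or the cap applies. `chunk_params` is a hypothetical helper name.

```python
def chunk_params(n_chars, c_min=400, c_max=2000, alpha=30, beta=0.20):
    """Return (chunk_size, overlap) using the same formula as dynamic_chunker."""
    raw = int(alpha * (n_chars ** 0.6))        # unclamped, length-dependent size
    chunk_size = min(c_max, max(c_min, raw))   # clamp into [c_min, c_max]
    return chunk_size, int(beta * chunk_size)

# A short, a medium, and the contract-sized document
for n in (50, 500, 5224):
    print(n, chunk_params(n))
```

If the goal is for chunk size to keep varying on contract-length documents, `alpha` or `C_max` would need to be tuned; that choice is left open here.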


In [8]:
from langchain.vectorstores import Chroma

# Create vectorstore from text chunks
vectordb = Chroma.from_texts(
    texts=chunks,
    embedding=embeddings,
    collection_name="contract_db"
)

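# Optional: write the collection to disk. This only has an effect when
# persist_directory=... was passed to from_texts; since Chroma 0.4.x such
# collections are saved automatically and persist() is deprecated.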
vectordb.persist()




In [14]:
# Create a retriever
retriever = vectordb.as_retriever(
    search_type="similarity",   # could also use "mmr"
    search_kwargs={"k": 5}      # return up to 5 chunks (fewer if the store holds fewer)
)

# Example query
query = "payment terms and deadlines"
results = retriever.get_relevant_documents(query)

# Inspect results
for i, doc in enumerate(results, 1):
    print(f"Result {i}:")
    print("Text:", doc.page_content[:300])
    if hasattr(doc, "metadata"):
        print("Metadata:", doc.metadata)
    print("---")

Result 1:
Text: employees against any third-party claims, damages, or expenses (including reasonable legal
fees) directly arising out of (i) a material breach of this Agreement by the Company, or (ii)
willful misconduct or gross negligence of the Company or its personnel in the performance of
obligations under this
Metadata: {}
---
Result 2:
Text: 11. Dispute Resolution Mechanism
Any dispute, controversy, or claim arising out of or relating to this Agreement, or the breach,
termination, or validity thereof, shall be resolved exclusively in the state or federal courts
located in New York, United States. Each party irrevocably submits to the pe
Metadata: {}
---
Result 3:
Text: 5. Intellectual Property Rights
All deliverables, including but not limited to software, models, and documentation, shall
remain the exclusive property of Neostats Analytics LLC unless otherwise agreed in writing.
6. Confidentiality
Both parties agree to maintain the confidentiality of proprietary i
Metadata: {}
--

In [52]:
import re
from langchain_openai import AzureChatOpenAI

# --- Assume you already initialized your LLM ---
# llm = AzureChatOpenAI(...)

# --- Agent 1: Extract Sections ---
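def section_extractor(documents):
    """Extract sections whose heading line starts with a number (1., 2., 3., ...)."""
    combined_text = "\n\n".join(doc.page_content for doc in documents)
    # Anchor headings to the start of a line so that mid-sentence text such as
    # "2025. This SOW ..." is not mistaken for a section heading.
    sections = re.findall(r'^\d+\.\s.*?(?=^\d+\.\s|\Z)', combined_text, re.S | re.M)
    return [sec.strip() for sec in sections]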

# --- Agent 2: Analyze Sections ---
def section_analyzer(sections, llm):
    """Analyze each section individually with the LLM"""
    analyses = []
    for idx, sec in enumerate(sections, 1):
        prompt = f"""
You are a contract section analyzer. 
Analyze the following contract section carefully and extract key points:
Section:
{sec}

Focus on deadlines, payment terms, and obligations.
Provide a structured analysis.
"""
        response = llm.invoke(prompt)
        analyses.append(f"Section {idx} Analysis:\n{response.content}\n")
    return analyses

# --- Agent 3: Compose Final Answer ---
def answer_composer(analyses, query, llm):
    """Combine all section analyses into a final detailed answer"""
    combined_analysis = "\n".join(analyses)
    prompt = f"""
You are a contract analysis assistant. 
You have been given analyses of individual contract sections:

{combined_analysis}

Now, based on the query:
{query}

Please provide a comprehensive final answer that clearly lists:
- Payment terms
- Deadlines
- Obligations
Organize it neatly as bullet points.
"""
    response = llm.invoke(prompt)
    return response.content

# --- Main function ---
def analyze_documents_multi_agent(query: str, retriever, llm):
    # Step 1: Retrieve relevant docs
    documents = retriever.get_relevant_documents(query)

    # Step 2: Agent 1 - Extract sections
    sections = section_extractor(documents)

    # Step 3: Agent 2 - Analyze sections
    analyses = section_analyzer(sections, llm)

    # Step 4: Agent 3 - Compose final structured answer
    final_answer = answer_composer(analyses, query, llm)

    return final_answer


# --- Example Run ---
query = "Analyze payment terms and deadlines in the documents."
output = analyze_documents_multi_agent(query, retriever, llm)
print(output)


Certainly! Here is a comprehensive summary of **Payment Terms**, **Deadlines**, and **Obligations** based on the analyses of all contract sections:

---

## **Payment Terms**

- **Total Fee:**  
  - The client will pay the service provider a total of **$150,000 USD** for the services described in the Statement of Work (SOW).

- **Payment Schedule (Milestone-Based):**
  - **First Payment:** $75,000 USD due **upon signing** of the SOW.
  - **Second Payment:** $45,000 USD due **upon delivery of the prototype**.
  - **Third Payment:** $30,000 USD due **upon final acceptance** of the deliverables.

- **Payment Deadlines:**
  - Each payment is due **immediately** upon completion of the corresponding milestone (signing, prototype delivery, final acceptance).
  - Payments must be made **within 15 days** of the due date for each milestone.

- **Late Payment Terms:**
  - If any payment is **more than 15 days late**, a **late fee of 5% per month** will be applied to the overdue amount.

- **Gover
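
The section extractor depends on a regex over numbered headings. A sketch comparing an unanchored pattern with a line-anchored one on a short sample: the unanchored pattern also fires on a year followed by a period, while the anchored one only accepts a number at the start of a line. The sample text and the names `UNANCHORED`, `ANCHORED`, and `extract` are made up for illustration.

```python
import re

# Matches "digits + dot" anywhere, including inside a sentence
UNANCHORED = re.compile(r'(\d+\..*?)(?=\n\d+\.|\Z)', re.S)
# Requires the number to open a line and be followed by whitespace
ANCHORED = re.compile(r'^\d+\.\s.*?(?=^\d+\.\s|\Z)', re.S | re.M)

sample = (
    "This SOW is effective as of March 1, 2025. It is governed by the MSA.\n"
    "1. Scope of Services\n"
    "Data analytics consulting\n"
    "2. Deliverables\n"
    "Monthly reports\n"
)

def extract(pattern, text):
    """Return the stripped matches of pattern in text."""
    return [s.strip() for s in pattern.findall(text)]

print(extract(UNANCHORED, sample))  # three items, the first starts at "2025."
print(extract(ANCHORED, sample))    # two items, one per numbered heading
```

Because the retrieved chunks overlap, the same section can still appear more than once after extraction; deduplicating the list before sending sections to the LLM would avoid repeated calls.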

# ChatBot


In [None]:
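while True:
    query = input("Enter your analysis query (or 'exit' to quit): ")
    # Leave the loop when the user types 'exit', as the prompt promises
    if query.strip().lower() == "exit":
        break
    output = analyze_documents_multi_agent(query, retriever, llm)
    print(output)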

In [46]:
# ------------------- Imports -------------------
import asyncio
import json
from dataclasses import dataclass, asdict
from enum import Enum
from datetime import datetime
from typing import List, Dict, Any
import numpy as np

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.schema import Document, HumanMessage, SystemMessage
from langchain_openai import AzureChatOpenAI  # same package the llm object was created from

# ------------------- Risk Levels -------------------
class RiskLevel(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

# ------------------- Data Classes -------------------
@dataclass
class ClauseAnalysis:
    clause_id: str
    section: str
    content: str
    risk_level: RiskLevel
    risk_score: float
    ambiguity_flags: List[str]
    compliance_issues: List[str]
    recommendations: List[str]

@dataclass
class ContractInsights:
    contract_type: str
    key_themes: List[str]
    critical_dates: List[str]
    financial_terms: List[str]
    jurisdiction: str
    termination_clauses: List[str]

@dataclass
class PartyInfo:
    name: str
    role: str
    stakes: List[str]
    obligations: List[str]
    rights: List[str]

@dataclass
class RiskAssessment:
    overall_risk_score: float
    risk_distribution: Dict[str, int]
    high_risk_areas: List[str]
    compliance_gaps: List[str]
    mitigation_strategies: List[str]

@dataclass
class FinalAnalysis:
    document_id: str
    timestamp: str
    is_contract: bool
    confidence_score: float
    risk_assessment: RiskAssessment
    insights: ContractInsights
    parties: List[PartyInfo]
    clause_analyses: List[ClauseAnalysis]
    executive_summary: str
    recommendations: List[str]

# ------------------- Dynamic Chunker -------------------
def dynamic_chunker(text: str, C_min=400, C_max=2000, alpha=30, beta=0.2):
    N = len(text)
    chunk_size = min(C_max, max(C_min, int(alpha * (N ** 0.6))))
    overlap = int(beta * chunk_size)
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=overlap,
        separators=["\n\n", "\n", ". ", " ", ""]
    )
    chunks = splitter.split_text(text)
    return chunks, chunk_size, overlap

# ------------------- Multi-Agent Classes -------------------
class DocumentClassificationAgent:
    """Agent 1: Determines if document is a contract"""
    def __init__(self, llm: AzureChatOpenAI):
        self.llm = llm

    async def analyze(self, text: str):
        prompt = f"""
        Determine if the following document is a contract or legal agreement:
        {text[:2000]}...

        Respond JSON:
        {{
            "is_contract": true/false,
            "confidence": 0.0-1.0,
            "reasoning": "why or why not"
        }}
        """
        messages = [SystemMessage(content="You are a legal document classifier."),
                    HumanMessage(content=prompt)]
        response = await self.llm.ainvoke(messages)
        return json.loads(response.content)

class ClauseAnalysisAgent:
    """Agent 2.1: Risk analysis per clause"""
    def __init__(self, llm: AzureChatOpenAI, retriever):
        self.llm = llm
        self.retriever = retriever

    async def analyze_clause(self, section_title: str, content: str) -> ClauseAnalysis:
        # Retrieve relevant context
        context_docs = self.retriever.get_relevant_documents(content)
        context_text = "\n".join([doc.page_content for doc in context_docs])

        prompt = f"""
        Analyze this clause for risk using context:
        Section: {section_title}
        Content: {content}
        Context: {context_text[:2000]}...

        Respond JSON:
        {{
            "risk_level": "HIGH/MEDIUM/LOW",
            "risk_score": 0.0-1.0,
            "ambiguity_flags": [],
            "compliance_issues": [],
            "recommendations": []
        }}
        """
        messages = [SystemMessage(content="You are a contract risk analyst."),
                    HumanMessage(content=prompt)]
        response = await self.llm.ainvoke(messages)
        result = json.loads(response.content)
        return ClauseAnalysis(
            clause_id=f"{section_title[:10]}_{np.random.randint(0,1000)}",
            section=section_title,
            content=content[:500]+"..." if len(content)>500 else content,
            risk_level=RiskLevel(result["risk_level"]),
            risk_score=result["risk_score"],
            ambiguity_flags=result["ambiguity_flags"],
            compliance_issues=result["compliance_issues"],
            recommendations=result["recommendations"]
        )

class InsightsAgent:
    """Agent 2.2: Extracts contract insights"""
    def __init__(self, llm: AzureChatOpenAI, retriever):
        self.llm = llm
        self.retriever = retriever

    async def analyze(self, full_text: str) -> ContractInsights:
        context_docs = self.retriever.get_relevant_documents(full_text)
        context_text = "\n".join([doc.page_content for doc in context_docs])

        prompt = f"""
        Extract insights from this contract using context:
        Contract: {full_text[:2000]}
        Context: {context_text[:2000]}

        Respond JSON with contract_type, key_themes[], critical_dates[], financial_terms[], jurisdiction, termination_clauses[]
        """
        messages = [SystemMessage(content="You are a contract insights expert."),
                    HumanMessage(content=prompt)]
        response = await self.llm.ainvoke(messages)
        result = json.loads(response.content)
        return ContractInsights(**result)

class PartiesAgent:
    """Agent 2.3: Extract parties info"""
    def __init__(self, llm: AzureChatOpenAI, retriever):
        self.llm = llm
        self.retriever = retriever

    async def analyze(self, full_text: str):
        context_docs = self.retriever.get_relevant_documents(full_text)
        context_text = "\n".join([doc.page_content for doc in context_docs])
        prompt = f"""
        Identify parties and their stakes from contract:
        Contract: {full_text[:2000]}
        Context: {context_text[:2000]}

        Respond JSON as list of parties with name, role, stakes[], obligations[], rights[]
        """
        messages = [SystemMessage(content="You are an expert in contract parties."),
                    HumanMessage(content=prompt)]
        response = await self.llm.ainvoke(messages)
        result = json.loads(response.content)
        parties = result
        return parties

class MetricsAgent:
    """Agent 3.1: Calculates metrics"""
    def __init__(self, llm: AzureChatOpenAI):
        self.llm = llm

    async def analyze(self, clause_analyses: List[ClauseAnalysis]) -> RiskAssessment:
        scores = [c.risk_score for c in clause_analyses]
        overall_score = np.mean(scores) if scores else 0.0
        risk_dist = {
            "HIGH": len([c for c in clause_analyses if c.risk_level==RiskLevel.HIGH]),
            "MEDIUM": len([c for c in clause_analyses if c.risk_level==RiskLevel.MEDIUM]),
            "LOW": len([c for c in clause_analyses if c.risk_level==RiskLevel.LOW])
        }
        high_risk_areas = [c.section for c in clause_analyses if c.risk_level==RiskLevel.HIGH]
        compliance_gaps = [issue for c in clause_analyses for issue in c.compliance_issues]
        # Example static mitigation
        mitigation = ["Review high-risk clauses", "Consult legal team", "Update terms", "Monitor compliance", "Add clarifications"]
        return RiskAssessment(
            overall_risk_score=overall_score,
            risk_distribution=risk_dist,
            high_risk_areas=high_risk_areas,
            compliance_gaps=compliance_gaps[:10],
            mitigation_strategies=mitigation
        )

class FinalReportAgent:
    """Agent 4: Generate JSON final report"""
    def __init__(self, llm: AzureChatOpenAI):
        self.llm = llm

    async def generate(self, analysis_components: Dict[str, Any]) -> FinalAnalysis:
        # Simple executive summary
        summary = f"Contract type: {analysis_components['insights'].contract_type}, Overall risk: {analysis_components['risk_assessment'].overall_risk_score}, High risk areas: {analysis_components['risk_assessment'].high_risk_areas}"
        return FinalAnalysis(
            document_id=f"contract_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            timestamp=datetime.now().isoformat(),
            is_contract=analysis_components['is_contract'],
            confidence_score=analysis_components['confidence'],
            risk_assessment=analysis_components['risk_assessment'],
            insights=analysis_components['insights'],
            parties=analysis_components['parties'],
            clause_analyses=analysis_components['clause_analyses'],
            executive_summary=summary,
            recommendations=analysis_components['risk_assessment'].mitigation_strategies
        )

# ------------------- Multi-Agent Orchestrator -------------------
class MultiAgentContractAnalyzer:
    def __init__(self, llm: AzureChatOpenAI, retriever):
        self.llm = llm
        self.retriever = retriever
        self.classifier = DocumentClassificationAgent(llm)
        self.clause_agent = ClauseAnalysisAgent(llm, retriever)
        self.insights_agent = InsightsAgent(llm, retriever)
        self.parties_agent = PartiesAgent(llm, retriever)
        self.metrics_agent = MetricsAgent(llm)
        self.reporter = FinalReportAgent(llm)

    async def analyze_contract(self, contract_text: str):
        # Step 1: classify
        classification = await self.classifier.analyze(contract_text)
        if not classification["is_contract"] or classification["confidence"] < 0.7:
            return {"error": "Document not recognized as contract", "confidence": classification["confidence"]}

        # Step 2: split into sections (simple split on blank lines, not on headings)
        sections = {}
        for i, chunk in enumerate(contract_text.split("\n\n")):
            title = f"Section_{i}"
            sections[title] = chunk

        # Step 3: parallel analysis
        clause_tasks = [self.clause_agent.analyze_clause(title, content) for title, content in sections.items()]
        insights_task = self.insights_agent.analyze(contract_text)
        parties_task = self.parties_agent.analyze(contract_text)

        clause_results, insights, parties = await asyncio.gather(
            asyncio.gather(*clause_tasks),
            insights_task,
            parties_task
        )

        # Step 4: metrics
        risk_assessment = await self.metrics_agent.analyze(clause_results)

        # Step 5: final report
        analysis_components = {
            'is_contract': True,
            'confidence': classification["confidence"],
            'clause_analyses': clause_results,
            'insights': insights,
            'parties': parties,
            'risk_assessment': risk_assessment
        }
        final_report = await self.reporter.generate(analysis_components)
        return asdict(final_report)

# ------------------- Usage -------------------
# llm = AzureChatOpenAI(...)  # initialize with your Azure details
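
Each agent above passes `response.content` straight to `json.loads`, which fails if the model wraps its JSON in a Markdown code fence. A minimal sketch of a more tolerant parser, assuming the reply is either bare JSON or a single fenced block; `parse_llm_json` is a hypothetical helper, not part of LangChain.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence marker
_FENCED = re.compile(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", re.S)

def parse_llm_json(raw: str):
    """Parse JSON from a model reply, tolerating a surrounding code fence."""
    text = raw.strip()
    fenced = _FENCED.match(text)
    if fenced:
        text = fenced.group(1)  # keep only what sits between the fences
    return json.loads(text)     # still raises json.JSONDecodeError on bad JSON

reply = FENCE + 'json\n{"is_contract": true, "confidence": 0.95}\n' + FENCE
print(parse_llm_json(reply))
print(parse_llm_json('{"risk_level": "LOW"}'))
```

Swapping this in for the direct `json.loads(response.content)` calls would be one way to make the agents less sensitive to formatting drift; replies with extra prose around the JSON would still need separate handling.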



In [47]:
chunks, _, _ = dynamic_chunker(contract_document)
vectordb = Chroma.from_texts(chunks, embedding=embeddings, collection_name="contract_db")
retriever = vectordb.as_retriever(search_type="similarity", search_kwargs={"k":5})


# Instead of asyncio.run(), do this in a notebook:
analyzer = MultiAgentContractAnalyzer(llm, retriever)
result = await analyzer.analyze_contract(contract_document)

# Inspect result
# import json
# print(json.dumps(result, indent=2))
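
The cell above uses top-level `await`, which works because Jupyter already runs an event loop. A sketch of the script equivalent, where `asyncio.run` starts the loop; `analyze_stub` is a made-up stand-in for `analyzer.analyze_contract` so the example runs without an LLM.

```python
import asyncio

async def analyze_stub(text: str) -> dict:
    """Stand-in for analyzer.analyze_contract; it only simulates async work."""
    await asyncio.sleep(0)
    return {"is_contract": True, "chars": len(text)}

async def main() -> dict:
    # Independent coroutines can be awaited together, as the orchestrator does
    first, second = await asyncio.gather(analyze_stub("abc"), analyze_stub("defgh"))
    return {"first": first, "second": second}

if __name__ == "__main__":
    # In a plain script no event loop is running yet, so asyncio.run creates one
    print(asyncio.run(main()))
```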


In [49]:
result

{'document_id': 'contract_20250927_141009',
 'timestamp': '2025-09-27T14:10:09.382514',
 'is_contract': True,
 'confidence_score': 0.95,
 'risk_assessment': {'overall_risk_score': np.float64(0.6125),
  'risk_distribution': {'HIGH': 1, 'MEDIUM': 3, 'LOW': 0},
  'high_risk_areas': ['Section_1'],
  'compliance_gaps': ['Late fee of 5% per month may exceed statutory limits in some jurisdictions',
   'Ambiguity in the relationship between the SOW and the MSA (MSA terms not referenced in detail)',
   'Force Majeure clause may conflict with local laws that mandate excusal of obligations in certain uncontrollable events.',
   'Limitation of Liability and Indemnity exclusions may not be enforceable in all jurisdictions, especially regarding gross negligence or willful misconduct.',
   "Data usage for 'future research and development' may raise data privacy and GDPR/CCPA compliance concerns if personal data is involved.",
   'Exclusive jurisdiction in New York may conflict with mandatory local la
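
The dictionary returned by `asdict` still holds `RiskLevel` enum members inside `clause_analyses`, and `json.dumps` raises `TypeError` on those. A sketch of a `default=` hook that converts them, shown on a trimmed-down dataclass; `Clause` and `to_jsonable` are hypothetical names for this example.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum

class RiskLevel(Enum):
    HIGH = "HIGH"
    LOW = "LOW"

@dataclass
class Clause:  # hypothetical, trimmed-down stand-in for ClauseAnalysis
    section: str
    risk_level: RiskLevel
    risk_score: float

def to_jsonable(obj):
    """Fallback for json.dumps: turn Enum members into their values."""
    if isinstance(obj, Enum):
        return obj.value
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

report = asdict(Clause("Section_1", RiskLevel.HIGH, 0.8))
print(json.dumps(report, default=to_jsonable, indent=2))
```

The same hook could be passed when dumping the full `result` dictionary, for example `json.dumps(result, default=to_jsonable, indent=2)`.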

In [51]:
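while True:
    query = input("Enter your analysis query (or 'exit' to quit): ")
    # Leave the loop when the user types 'exit', as the prompt promises
    if query.strip().lower() == "exit":
        break
    # analyze_documents is not defined in this notebook; use the
    # multi-agent entry point defined earlier.
    output = analyze_documents_multi_agent(query, retriever, llm)
    print(output)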

Certainly! Although the provided excerpts focus on **indemnity** and do not contain any termination provisions, I can suggest **improvements for a typical contract termination clause** based on best practices. These improvements will address **deadlines, payment terms, and obligations** to ensure clarity and fairness for both parties.

---

## Suggested Improvements for the Termination Clause

### 1. **Notice Period (Deadlines)**
- **Current Issue:** Many contracts lack a clear notice period or have ambiguous language.
- **Improvement:** Specify a minimum written notice period for termination by either party (e.g., "Either party may terminate this Agreement by providing thirty (30) days’ prior written notice to the other party").
- **Example Language:**  
  > "Either party may terminate this Agreement for convenience by providing at least thirty (30) days’ prior written notice to the other party."

### 2. **Termination for Cause**
- **Improvement:** Clearly define what constitutes "cau

KeyboardInterrupt: Interrupted by user