
starman69/contract-intelligence


Contract Intelligence POC

Azure-native legal-contract intelligence platform. POC scaffold.

This repo holds the POC scope: a minimum viable Azure stack to validate metadata extraction, RAG retrieval, clause comparison against a gold standard, and human-in-the-loop review, on a 500-document corpus.

View the interactive overview — a visual tour of the query router, extraction, HITL review, and cross-domain reuse.


Repo Layout

docs/         reference architecture + POC docs + ADRs
infra/
  bicep/      Azure IaC (subscription-scoped main.bicep + 12 modules)
  local/      docker-compose stack (mssql, azurite, qdrant, ollama, unstructured)
scripts/      SQL DDL, AI Search index definitions, data-prep, function packaging
samples/      gold-clause templates + synthetic contracts (PDFs built on demand)
src/
  shared/     profile, config, clients, router, sql_builder, api, prompts,
              auth, openapi, layout, vector_search, coercions, embedding_text
  functions/
    ingestion/  Event Grid → process_blob_event (azure profile)
    api/        HTTP query/contracts/compare endpoints (azure profile)
  local/      FastAPI wrapper (api_server.py) + Azurite poll watcher
              (ingest_watcher.py) for the docker-compose runtime
  web/        React + Vite + TypeScript SPA with light/dark theming
              (Tailwind v4) — Static Web App ready
site/         static GitHub Pages overview site (index.html + screenshots)
tests/
  unit/       fast tests, no Azure deps
  eval/       integration eval runner (RUN_INTEGRATION_EVAL=1)

Getting Started

  1. Read docs/poc/00-overview.md for scope.
  2. Run the local stack: docs/poc/12-local-runtime.md.
  3. When ready for cloud: docs/poc/13-tenant-setup.md, then docs/poc/14-deployment-guide.md.
  4. Source contracts: docs/poc/07-sample-documents.md.

Synthetic data

Counterparties in samples/contracts-synthetic/ and tests/golden_qa.jsonl are fictional. Build the PDFs with bash scripts/data-prep/build-synthetic-pdfs.sh. Real corpora (CUAD, SEC EDGAR) are not redistributed — see docs/poc/07-sample-documents.md.

Status

The full POC stack runs end-to-end in two profiles selected by RUNTIME_PROFILE:

  • azure (default): Functions on Event Grid + Document Intelligence + Azure OpenAI + Azure SQL + Azure AI Search + Static Web App. Bicep is idempotent and zero-warning; the Bicep-↔-app contract is enforced by tests/unit/test_bicep_app_contract.py. Deployment to a real subscription has not been performed; see docs/poc/13-tenant-setup.md for prerequisites.
  • local (docker-compose, no cloud): FastAPI wrapper + Azurite-poll watcher driving the same pipeline.process_blob_event and shared.api.query codepaths, with mssql / Azurite / Qdrant / Ollama / unstructured.io as drop-in service replacements. See docs/poc/12-local-runtime.md.
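As a rough illustration of how a single RUNTIME_PROFILE switch can select between the two stacks, here is a minimal sketch. It is not the code in src/shared/profile.py or src/shared/clients.py: `ServiceMap`, `SERVICES`, and `resolve_profile` are hypothetical names, and the role labels are assumptions. Only the service pairings come from the list above, plus the fact that Azurite emulates Azure Storage.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceMap:
    """Which backing service fills each role under one profile (hypothetical)."""
    storage: str
    layout: str
    llm: str
    sql: str
    vectors: str


# Pairings follow the Status list; the role names are illustrative only.
SERVICES = {
    "azure": ServiceMap(
        storage="Azure Blob Storage",
        layout="Document Intelligence",
        llm="Azure OpenAI",
        sql="Azure SQL",
        vectors="Azure AI Search",
    ),
    "local": ServiceMap(
        storage="Azurite",
        layout="unstructured.io",
        llm="Ollama",
        sql="mssql",
        vectors="Qdrant",
    ),
}


def resolve_profile(env: Optional[Mapping[str, str]] = None) -> str:
    """Read RUNTIME_PROFILE, defaulting to 'azure' when unset or empty."""
    env = os.environ if env is None else env
    profile = (env.get("RUNTIME_PROFILE") or "azure").strip().lower()
    if profile not in SERVICES:
        raise ValueError(
            f"unknown RUNTIME_PROFILE {profile!r}; expected one of {sorted(SERVICES)}"
        )
    return profile


if __name__ == "__main__":
    active = resolve_profile()
    print(active, SERVICES[active])
```

Rejecting unknown values up front, rather than silently falling back to the default, keeps a typo in the environment from pointing a local run at cloud services.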

src/ highlights (all checked in and exercised by the local stack):

| Area | Where |
| --- | --- |
| Ingestion pipeline (DI/unstructured → LLM extraction → SQL + vectors + audit) | src/functions/ingestion/pipeline.py (flow walkthrough in docs/poc/11-ingestion-pipeline.md) |
| Query API (router → reporting / search / clause-comparison / mixed handlers) | src/shared/api.py, src/shared/router.py, src/shared/sql_builder.py (design narrative in docs/poc/18-llm-orchestration.md) |
| HITL review (queue + per-field write-through correction, append-only lineage, three-axis state, reviewer auth) | src/shared/api.py, src/shared/auth.py (workflow in docs/poc/22-hitl-review.md) |
| Profile-aware client factories (Azure SDK ↔ local equivalents) | src/shared/clients.py, src/shared/layout.py, src/shared/vector_search.py |
| Web frontend (4 tabs: Chat, Contracts, Review, Gold Clauses; HITL per-field review + lineage, shared drawer + compare modal, light/dark theme via Tailwind v4) | src/web/ |
| OpenAPI spec + Swagger UI served by the API | src/shared/openapi.py |
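To make "per-field write-through correction, append-only lineage" concrete, the sketch below shows one way the idea could work in memory. It is an assumption-laden illustration, not the schema or API in src/shared/api.py: `LineageEntry`, `ContractRecord`, `correct`, and the field name `governing_law` are all hypothetical, and the three-axis state and reviewer auth are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LineageEntry:
    """One immutable record of a field changing value (hypothetical shape)."""
    field_name: str
    old_value: Optional[str]
    new_value: Optional[str]
    changed_by: str
    changed_at: datetime


@dataclass
class ContractRecord:
    """Current field values plus the history of how they got there."""
    fields: Dict[str, Optional[str]] = field(default_factory=dict)
    lineage: List[LineageEntry] = field(default_factory=list)

    def correct(self, name: str, new_value: Optional[str], reviewer: str) -> LineageEntry:
        entry = LineageEntry(
            field_name=name,
            old_value=self.fields.get(name),
            new_value=new_value,
            changed_by=reviewer,
            changed_at=datetime.now(timezone.utc),
        )
        # Append-only: earlier entries are never edited or removed.
        self.lineage.append(entry)
        # Write-through: the live value is updated in the same step.
        self.fields[name] = new_value
        return entry


if __name__ == "__main__":
    record = ContractRecord(fields={"governing_law": "Delaware"})
    record.correct("governing_law", "New York", reviewer="reviewer-1")
    for item in record.lineage:
        print(item.field_name, item.old_value, "->", item.new_value, "by", item.changed_by)
```

Because each entry stores both the old and the new value, the full history of a field can be replayed from the lineage list alone, without reading the current record.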

Tests: run the unit suite with PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 python3 -m pytest tests/unit -q. The golden-question eval set lives in tests/golden_qa.jsonl; tests/eval/ runs it against the live API when RUN_INTEGRATION_EVAL=1 is set.
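The opt-in gate for the integration eval can be pictured as follows. This is a minimal sketch, not the contents of tests/eval/: `integration_eval_enabled` and `load_jsonl` are hypothetical helpers, and nothing is assumed about the fields inside tests/golden_qa.jsonl beyond one JSON object per line.

```python
import json
import os
from typing import Any, Dict, List, Mapping, Optional


def integration_eval_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when RUN_INTEGRATION_EVAL is exactly '1'."""
    env = os.environ if env is None else env
    return env.get("RUN_INTEGRATION_EVAL") == "1"


def load_jsonl(text: str) -> List[Dict[str, Any]]:
    """Parse JSON Lines text: one JSON object per non-blank line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    if not integration_eval_enabled():
        print("integration eval skipped; set RUN_INTEGRATION_EVAL=1 to run it")
    else:
        with open("tests/golden_qa.jsonl", encoding="utf-8") as handle:
            cases = load_jsonl(handle.read())
        print(f"loaded {len(cases)} golden questions")
```

In a pytest module, the same check could feed `pytest.mark.skipif(not integration_eval_enabled(), reason="set RUN_INTEGRATION_EVAL=1")`, so the default unit run never reaches the live API.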
