Anandb71/FinShield

Aegis (FinShield) 🛡️

AI-Powered Financial Document Forensics Platform, built at DevSoc '26

Aegis is an autonomous forensic auditing system built during the DevSoc '26 hackathon for the Hotfoot sponsor track (with Backboard as the AI/RAG sponsor). It ingests financial documents (bank statements, invoices, payslips, images), detects fraud using forensic mathematics (Benford's Law, balance integrity verification), and visualizes entity relationships through an interactive knowledge graph.

Hotfoot provided the challenge track and the financial document datasets. When the official dataset changed mid-hackathon from images to Excel spreadsheets, we adapted our pipeline to handle both: the system processes PDFs, images (via OCR), and Excel files through the same unified analysis engine.


🚀 Key Features

Fraud Detection Engine

  • Metadata Integrity Check: Compares header/summary balances against calculated transaction balances. Catches "lying headers" where the closing balance has been tampered with (e.g. injecting a -61M balance into a statement that actually closes at ₹61).
  • Benford's Law Analysis: Flags unnatural leading-digit distributions in transaction amounts.
  • Structuring / Smurfing Detection: Identifies clusters of transactions just below reporting thresholds.
  • Round-Number Syndrome: Detects suspiciously high ratios of round-number transactions.
  • Date-Sequence Violations: Catches out-of-order or impossible date progressions.
  • Balance Continuity Checks: Verifies running balance consistency across every row.
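
As a rough illustration of three of the checks above (Benford's Law, structuring, and balance continuity), here is a minimal Python sketch. It is not the project's implementation: the function names, the 10% band below the threshold, and the half-cent tolerance are all assumptions chosen for the example.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple

# Benford's Law: the expected share of leading digit d is log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(amount: float) -> Optional[int]:
    """First non-zero digit of |amount| at 2-decimal precision, or None if it rounds to zero."""
    digits = f"{abs(amount):.2f}".replace(".", "").lstrip("0")
    return int(digits[0]) if digits else None


def benford_chi_square(amounts: List[float]) -> float:
    """Chi-square statistic of observed leading digits against Benford's Law."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    n = len(digits)
    if n == 0:
        return 0.0
    counts = Counter(digits)  # missing digits count as 0
    return sum((counts[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())


def near_threshold(amounts: List[float], threshold: float, band: float = 0.10) -> List[float]:
    """Amounts in [threshold * (1 - band), threshold): structuring candidates.

    Both `threshold` and `band` are hypothetical parameters for this sketch.
    """
    lower = threshold * (1 - band)
    return [a for a in amounts if lower <= abs(a) < threshold]


def balance_breaks(opening: float, rows: List[Tuple[float, float]],
                   tolerance: float = 0.005) -> List[int]:
    """Indices of rows where reported balance != previous balance + signed amount.

    Each row is (signed_amount, reported_running_balance).
    """
    breaks, running = [], opening
    for i, (amount, reported) in enumerate(rows):
        if abs(round(running + amount, 2) - reported) > tolerance:
            breaks.append(i)
        running = reported  # resync so a single bad row is flagged once
    return breaks
```

A larger chi-square value means the leading digits sit further from the Benford distribution. With 8 degrees of freedom, values above roughly 15.5 are commonly treated as significant at the 5% level, though any cutoff would need tuning against real statements.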

Lie Detector Panel

  • Dynamic integrity display on the Review page: shows INTEGRITY FAILURE (red) or INTEGRITY VERIFIED (green) based on real-time comparison of reported vs calculated balances.
  • 3-priority fallback: backend metadata_discrepancy → anomaly structured fields → local transaction comparison.
  • Zero hardcoded values: all thresholds computed from actual data.
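
The 3-priority fallback can be sketched as below. This is a Python illustration only; the Review page itself lives in the React webapp. Apart from `metadata_discrepancy`, every field name (`reported`, `calculated`, `reported_balance`, `calculated_balance`, `opening_balance`, `closing_balance`, `amount`) is a hypothetical stand-in, and the `tolerance` parameter is a placeholder for a threshold that would be derived from the data.

```python
from typing import Dict, List, Optional


def resolve_integrity(document: Dict, anomalies: List[Dict], transactions: List[Dict],
                      tolerance: float = 0.01) -> Optional[Dict]:
    """Return reported/calculated closing balances from the best available source."""
    reported = calculated = source = None

    # Priority 1: discrepancy already computed by the backend.
    md = document.get("metadata_discrepancy")
    if md and "reported" in md and "calculated" in md:
        reported, calculated, source = md["reported"], md["calculated"], "backend"

    # Priority 2: structured fields attached to an anomaly record.
    if source is None:
        for anomaly in anomalies:
            if "reported_balance" in anomaly and "calculated_balance" in anomaly:
                reported = anomaly["reported_balance"]
                calculated = anomaly["calculated_balance"]
                source = "anomaly"
                break

    # Priority 3: recompute locally from the opening balance and transactions.
    if source is None:
        opening = document.get("opening_balance")
        closing = document.get("closing_balance")
        if opening is not None and closing is not None and transactions:
            reported = closing
            calculated = round(opening + sum(t["amount"] for t in transactions), 2)
            source = "local"

    if source is None:
        return None  # nothing to compare against
    ok = abs(reported - calculated) <= tolerance
    return {
        "reported": reported,
        "calculated": calculated,
        "source": source,
        "verdict": "INTEGRITY VERIFIED" if ok else "INTEGRITY FAILURE",
    }
```

Returning the `source` alongside the verdict lets the panel show which of the three paths produced the comparison.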

Excel Normalization ("The Repair Shop")

  • Parses messy bank statement spreadsheets from any bank layout.
  • Auto-detects header rows, column mappings, date formats.
  • Repairs OCR artifacts, skips junk rows, handles merged cells.
  • Extracts opening/closing balances from summary sections.
  • Infers transaction types (debit/credit) from signed amounts.
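
To make the header detection and debit/credit inference concrete, here is a small Python sketch that works on rows already read from a sheet as lists of cell values. The keyword set, the `min_hits` cutoff, and the function names are assumptions for illustration. The real normalizer handles far more cases (merged cells, OCR artifacts, summary sections) that are left out here.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical keywords that tend to appear in a bank statement header row.
HEADER_HINTS = {"date", "description", "narration", "amount", "debit", "credit", "balance"}


def find_header_row(rows: Sequence[Sequence], min_hits: int = 3) -> Optional[int]:
    """Index of the first row containing at least `min_hits` header keywords."""
    for i, row in enumerate(rows):
        cells = {str(c).strip().lower() for c in row if c is not None}
        if len(cells & HEADER_HINTS) >= min_hits:
            return i
    return None


def cell(row: Sequence, idx: Optional[int]):
    """Safe cell access: None when the column is unknown or the row is short."""
    return row[idx] if idx is not None and idx < len(row) else None


def normalize_rows(rows: Sequence[Sequence]) -> List[Dict]:
    """Extract transactions below the detected header, skipping junk rows."""
    header = find_header_row(rows)
    if header is None:
        return []
    cols = {str(c).strip().lower(): j for j, c in enumerate(rows[header]) if c is not None}
    out = []
    for row in rows[header + 1:]:
        raw = cell(row, cols.get("amount"))
        try:
            amount = float(str(raw).replace(",", ""))
        except ValueError:
            continue  # junk row: no parseable amount
        out.append({
            "date": cell(row, cols.get("date")),
            "amount": amount,
            # Sign convention assumed here: negative = debit, otherwise credit.
            "type": "debit" if amount < 0 else "credit",
        })
    return out
```

For example, a sheet with two banner rows, a header row, and a stray "Page 1 of 3" line between transactions would yield only the two real transactions.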

Investigation Board (Knowledge Graph)

  • Interactive 3D Graph: Navigate documents, entities, and risk nodes.
  • Conflict Hunter: Detects shared addresses/phones between vendors and employees.
  • Entity Resolution: Links accounts, names, and references across documents.
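
One way the Conflict Hunter idea could work is to normalize contact details and look for collisions between vendor and employee records. The sketch below assumes each record is a dict with `name`, `phone`, and `address` keys. That schema and the normalization rules (last 10 digits for phones, lowercased and whitespace-collapsed addresses) are illustrative assumptions, not the project's actual graph model.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def normalize_phone(phone: str) -> str:
    """Keep digits only and compare on the last 10 (drops country-code prefixes)."""
    return "".join(ch for ch in phone if ch.isdigit())[-10:]


def normalize_address(address: str) -> str:
    """Lowercase, strip commas, collapse whitespace."""
    return " ".join(address.lower().replace(",", " ").split())


def contact_keys(record: Dict) -> List[Tuple[str, str]]:
    """(kind, normalized value) pairs for a record, skipping empty values."""
    keys = [
        ("phone", normalize_phone(record.get("phone", ""))),
        ("address", normalize_address(record.get("address", ""))),
    ]
    return [k for k in keys if k[1]]


def find_conflicts(vendors: List[Dict], employees: List[Dict]) -> List[Tuple[str, str, str]]:
    """Return (vendor, employee, kind) for every shared phone or address."""
    index = defaultdict(list)
    for emp in employees:
        for key in contact_keys(emp):
            index[key].append(emp["name"])
    conflicts = []
    for vendor in vendors:
        for key in contact_keys(vendor):
            for emp_name in index.get(key, []):
                conflicts.append((vendor["name"], emp_name, key[0]))
    return conflicts
```

Each conflict could then become an edge between a vendor node and an employee node in the graph.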

X-Ray Reconciliation

  • Smart Match: Auto-links invoices to bank transactions (exact & fuzzy).
  • Ghost Detection: Flags invoices with no matching payment.
  • Human-in-the-Loop: Force-match decisions feed the learning loop.
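
A minimal sketch of exact and fuzzy matching with ghost detection follows. It uses the standard library's `difflib.SequenceMatcher` for the fuzzy step. The record fields (`number`, `vendor`, `amount`, `description`), the amount tolerance, and the 0.6 similarity cutoff are assumptions made for the example, and the project's matcher may work differently.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(invoices: List[Dict], transactions: List[Dict],
              amount_tol: float = 0.01, min_similarity: float = 0.6
              ) -> Tuple[List[Tuple[str, int, str]], List[str]]:
    """Return (matches, ghosts).

    matches: (invoice number, transaction index, "exact" | "fuzzy")
    ghosts:  invoice numbers with no matching payment
    """
    matches, ghosts, used = [], [], set()
    for inv in invoices:
        best_idx, best_score, best_kind = None, 0.0, ""
        for i, txn in enumerate(transactions):
            if i in used or abs(txn["amount"] - inv["amount"]) > amount_tol:
                continue  # already matched, or amounts differ
            if inv["number"] in txn["description"]:
                score, kind = 1.0, "exact"  # invoice number quoted in the payment
            else:
                score, kind = similarity(inv["vendor"], txn["description"]), "fuzzy"
            if score > best_score:
                best_idx, best_score, best_kind = i, score, kind
        if best_idx is not None and best_score >= min_similarity:
            used.add(best_idx)
            matches.append((inv["number"], best_idx, best_kind))
        else:
            ghosts.append(inv["number"])
    return matches, ghosts
```

A human force-match would simply add an entry to `matches` by hand. Recording those decisions is what would give a learning loop its training signal.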

Project Evolution

Before Review 1 (Feb 8, 2026)

Original Vision: FinShield was initially built as "The Financial Flight Recorder", an autonomous, real-time defense system for mobile users. We attempted to solve two Hotfoot sponsor track challenges simultaneously at DevSoc '26:

  1. Call Shield (Hotfoot Audio track): Real-time analysis of phone calls to detect urgency manipulation, fear tactics, and pressure techniques used by scammers. Used WebSocket streaming with a live risk meter.
  2. Contract Scanner (Hotfoot Docs + Backboard track): Instant detection of predatory clauses, hidden fees, and exploitative terms in financial documents, with cross-referencing via Backboard RAG.

⚠️ Note: Hotfoot's rules required teams to pick only one track. We initially tried both but pivoted to document intelligence after Review 1 (see below).

Tech at this stage:

  • Frontend: Flutter with Riverpod state management, cybersecurity dark theme with neon accents, glassmorphism cards, animated shield logo, and a call_shield_screen.dart with a real-time risk meter
  • Backend: FastAPI with mock service implementations for Hotfoot Audio (audio_analyzer.py), Hotfoot Docs (document_scanner.py), and Backboard RAG (context_engine.py)
  • Real-time pipeline: WebSocket-based streaming (sockets.py + ConnectionManager) with audio_processor.py stub and document_engine.py stub
  • Architecture: Monorepo with /frontend (Flutter) and /backend (FastAPI), Docker Compose for full-stack dev

Key commits:

| Commit | Date | Description |
| --- | --- | --- |
| 8ef22ce | Feb 8 | Initial project setup: Flutter + FastAPI + mock Hotfoot Audio/Docs/Backboard |
| c1e1a27 | Feb 8 | Add Flutter lib source files (home screen, glassmorphism widgets) |
| f3616fb | Feb 8 | Phase 2: WebSocket Pipeline: audio_processor stub, call_shield_screen with risk meter |

After Review 1 → Before Review 2 (Feb 9–10, 2026)

The Pivot: After Review 1 feedback and Hotfoot's requirement to choose only one sponsor track, we dropped the audio/call analysis track entirely and went all-in on document forensic intelligence. The project evolved from a dual-challenge attempt into a focused financial document auditing platform.

Dataset curveball: Midway through, Hotfoot changed the official dataset from images (scanned documents) to Excel spreadsheets. Rather than just switching, we built a pipeline that handles both: PDFs and images go through OCR + LLM extraction, while Excel files go through our custom normalization engine. This dual-format capability became one of our strongest differentiators.
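
The dual-format routing described above can be pictured as a small dispatcher keyed on file extension. The sketch below is an assumption about the shape of that step, not the actual ingestion code: the route names and suffix sets are hypothetical, and the Excel set is limited to formats openpyxl can read.

```python
from pathlib import Path

# Hypothetical suffix sets; openpyxl reads .xlsx/.xlsm but not legacy .xls.
EXCEL_SUFFIXES = {".xlsx", ".xlsm"}
OCR_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg", ".tiff"}


def choose_route(filename: str) -> str:
    """Pick the extraction path for an uploaded file based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in EXCEL_SUFFIXES:
        return "excel_normalizer"    # custom normalization engine
    if suffix in OCR_SUFFIXES:
        return "ocr_llm_extraction"  # OCR + LLM field extraction
    raise ValueError(f"Unsupported file type: {suffix or filename!r}")
```

Both routes would then hand their extracted transactions to the same forensic checks, which is what makes the analysis engine format-agnostic.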

What changed and why:

| What Changed | Why |
| --- | --- |
| Dropped Hotfoot Audio (deleted audio_analyzer.py, audio_processor.py, call_shield_screen.dart) | Hotfoot required picking one track. Document intelligence had deeper forensic potential. |
| Flutter → React webapp (new /webapp with React 18 + TypeScript + Vite + Chakra UI) | Flutter was great for mobile but the use case shifted to a desktop-first investigator dashboard. React + Chakra UI enabled faster iteration on complex data-heavy UIs (tables, charts, graphs). |
| Mock services → Real AI pipeline (Backboard Assistant API integration with GPT-4o) | Replaced mock implementations with actual LLM-powered document classification, field extraction, and validation. |
| Added forensic analysis suite | Benford's Law, balance integrity checks, structuring detection, metadata fraud detection: none of this existed pre-review. |
| Added Knowledge Graph | Cross-document entity resolution with interactive force-directed visualization. Links accounts, vendors, and counterparties across uploaded documents. |
| Added Excel Normalization Engine | Parses messy bank statement spreadsheets from any bank layout: auto-detects headers, columns, date formats. |
| Added Self-Learning Loop | Human corrections feed back into the system. Error clustering, retraining triggers, continuous improvement. |
| Rebranded to Aegis | Reflected the evolved mission: from a "shield" to a full forensic auditing platform. |

Key commits in this phase:

| Commit | Date | Description |
| --- | --- | --- |
| 05fffb1 | Feb 9 | THE PIVOT: Finsight Knowledge Graph & Reconciliation |
| e89bb45 | Feb 9 | Backboard-only document intelligence system (classification, validation, entity resolution, learning loop) |
| d03a5f5 | Feb 9 | Universal Pipeline Dashboard with Bulk Ingestion and Review Queue |
| 7b5b646 | Feb 9 | Complete frontend rebuild: audio services deleted, industry-grade Flutter redesign |
| 0e5c3e2 | Feb 10 | Fix 10 diagnosed bugs + Excel viewer |
| 2852da7 | Feb 10 | Metadata integrity fraud detection (Lie Detector panel) |
| ecf29f1 | Feb 10 | Enrich knowledge graph with accounts, balances & counterparties |
| 089f663 | Feb 10 | Rebrand: Finsight/FinShield → Aegis everywhere |
| 6b59238 | Feb 10 | UI: Dramatically improve DocumentReviewPage inspector UI |

The Flutter frontend still exists in /frontend as a legacy reference, but the primary interface is now the React webapp in /webapp.


🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python 3.13, FastAPI, SQLModel, SQLAlchemy, SQLite |
| AI/LLM | OpenAI GPT-4o (via Backboard client) |
| Excel Parsing | openpyxl |
| Frontend (Web) | React 18, TypeScript, Vite, Chakra UI |
| Visualization | react-force-graph-3d, Recharts, react-pdf |
| Frontend (Mobile) | Flutter (legacy, in /frontend) |

πŸƒβ€β™‚οΈ Quick Start

Backend

cd backend
python -m venv ../.venv
../.venv/Scripts/activate   # Windows
# source ../.venv/bin/activate   # macOS/Linux
pip install -e ".[dev]"
uvicorn app.main:app --reload --port 8000

Web Frontend

cd webapp
npm install
npm run dev   # → http://localhost:5173

Flutter Frontend (legacy)

cd frontend
flutter run -d chrome

Project Structure

FinShield/
├── backend/           # FastAPI server + AI pipeline
│   ├── app/
│   │   ├── api/       # REST endpoints (ingestion, documents, forensics, review, dashboard)
│   │   ├── services/  # Excel normalizer, validation engine, backboard client
│   │   ├── db/        # SQLModel models + session management
│   │   └── core/      # Settings, knowledge graph store
│   └── storage/       # Uploaded document files
├── webapp/            # React + Vite frontend
│   └── src/
│       ├── pages/     # Dashboard, DocumentReview, Upload
│       └── components/
├── frontend/          # Flutter frontend (legacy)
└── docker-compose.yml

Roadmap

  • Document ingestion pipeline with AI classification
  • Excel bank statement normalization engine
  • PDF + image ingestion with OCR support
  • Forensic validation suite (Benford, structuring, balance checks)
  • Metadata integrity fraud detection (Lie Detector)
  • React web dashboard with document review
  • Knowledge graph entity resolution
  • Cross-document entity linking
  • Self-learning loop with human corrections
  • Multi-currency support (dynamic round-number thresholds)
  • Automated report generation
  • Batch re-analysis on rule updates

πŸ† Hackathon

DevSoc '26 β€” Hotfoot Sponsor Track (Document Intelligence)

Event DevSoc '26
Sponsors Hotfoot (challenge track + datasets), Backboard (AI/RAG platform)
Track Financial Document Intelligence
Dataset Changed mid-hackathon from images β†’ Excel; we support both
Team Built in ~48 hours

About

FinShield is an autonomous, real-time financial defense system designed to proactively detect fraud and predatory practices. It leverages Hotfoot's audio and document intelligence to analyze live interactions and contracts, while Backboard.io provides cross-modal verification to flag contradictions and hidden risks before transactions occur.
