6tring/proofstack

ProofStack — Self-Healing Evidence-Backed Research Agent

A research pipeline that fetches evidence from public APIs, generates structured JSON reports via LLM, and validates them through a three-phase audit with automatic self-healing repair.

How It Works

User enters topic
    ↓
Fetch evidence (Wikipedia + arXiv)
    ↓
LLM generates JSON report
    ↓
Three-phase audit:
  1. Schema validation (AJV + JSON Schema)
  2. Grounding validation (source URLs match API responses)
  3. Consistency validation (cross-field integrity)
    ↓
Pass? → Done
Fail? → Minimal-diff repair → Re-audit (up to 3 attempts)
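
The loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the project's orchestrator: `generate`, `audit`, and `repair` are hypothetical injected functions, and the sketch assumes that "up to 3 attempts" counts audits, including the first one.

```javascript
// Hypothetical sketch: `generate`, `audit`, and `repair` are stand-ins for the
// producer, auditor, and repairer services, not their real signatures.
const MAX_ATTEMPTS = 3; // assumption: counts every audit, including the first

async function runPipeline({ generate, audit, repair }, topic) {
  let report = await generate(topic);
  const attempts = [];

  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt += 1) {
    // `audit` is assumed to return an array of errors (empty means pass).
    const errors = await audit(report);
    attempts.push({ attempt, errors });

    if (errors.length === 0) {
      return { status: "passed", report, attempts };
    }
    // Only repair if another audit is still allowed.
    if (attempt < MAX_ATTEMPTS) {
      report = await repair(report, errors);
    }
  }
  return { status: "failed", report, attempts };
}
```

Passing the three steps in as arguments keeps the loop testable with plain stub functions.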

Self-Healing Strategy

The repairer uses minimal-diff repair — it patches only the specific failing fields rather than regenerating the entire report:

  • Grounding errors: Deterministic fix — replaces fabricated URLs with actual API-returned URLs
  • Consistency errors: Deterministic fix — clamps out-of-bounds indices, corrects type mismatches
  • Schema errors: Two-phase fix. Deterministic repairs first handle string length violations (truncating at sentence boundaries for maxLength, padding for minLength). An LLM-assisted fix then handles any remaining errors, using a targeted prompt that includes the full schema.
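
The deterministic repairs listed above might look something like the helpers below. The function names, the `{ url, type }` data shape, and the rule for choosing a replacement URL are assumptions made for illustration; the README does not say how the repairer matches a fabricated URL to a real one. Padding for minLength and the LLM-assisted step are not shown.

```javascript
// Hypothetical helpers; names and data shapes are assumptions for illustration.

// Cut `text` to at most `maxLength` characters, preferring the last sentence end.
function truncateAtSentence(text, maxLength) {
  if (text.length <= maxLength) return text;
  const slice = text.slice(0, maxLength);
  // Position of the last '.', '!' or '?' inside the allowed window.
  const lastEnd = Math.max(
    slice.lastIndexOf("."),
    slice.lastIndexOf("!"),
    slice.lastIndexOf("?")
  );
  // Fall back to a hard cut when the window holds no sentence terminator.
  return lastEnd > 0 ? slice.slice(0, lastEnd + 1) : slice.trimEnd();
}

// Replace any source URL that the APIs never returned with a returned URL.
// Assumption: prefer evidence of the same type, else use the first item.
function repairSourceUrls(sources, evidence) {
  const known = new Set(evidence.map((item) => item.url));
  return sources.map((source) => {
    if (known.has(source.url)) return source;
    const match =
      evidence.find((item) => item.type === source.type) || evidence[0];
    return match ? { ...source, url: match.url } : source;
  });
}

// Force an index into the range [0, length - 1]; assumes length >= 1.
function clampIndex(index, length) {
  return Math.min(Math.max(index, 0), length - 1);
}
```

Because these fixes are pure functions of the report and the fetched evidence, they give the same result on every run and need no LLM call.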

Quick Start

Prerequisites

| Dependency | Version | Purpose |
|---|---|---|
| Node.js | 18+ | Runtime for both server and client |
| MongoDB | 6+ | Stores research sessions and audit trails |
| Gemini API key | Free tier | LLM for report generation and repair |

MongoDB must be running before the application starts. On macOS with Homebrew:

brew services start mongodb-community

On Linux (systemd):

sudo systemctl start mongod

A Gemini API key can be obtained for free at https://aistudio.google.com/apikey. The free tier allows 15 requests per minute, which is sufficient for this application.

Install Dependencies

Three npm install commands are required — one for the root (which provides the concurrently launcher), one for the server, and one for the client:

npm install
cd server && npm install && cd ..
cd client && npm install && cd ..

Configure Environment

The example environment file should be copied and updated with the Gemini API key:

cp .env.example .env

Then open .env and set the following values:

| Variable | Required | Default | Description |
|---|---|---|---|
| GEMINI_API_KEY | Yes | | API key from Google AI Studio |
| MONGODB_URI | No | mongodb://localhost:27017/research_agent_db | MongoDB connection string |
| PORT | No | 3000 | Backend server port |
| CLIENT_URL | No | http://localhost:5173 | Frontend origin for CORS |
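
As an illustration of how these variables and defaults could be resolved at startup, here is a small sketch. `loadConfig` is a hypothetical helper, not a function from this repository, and the server's actual configuration code may differ.

```javascript
// Hypothetical sketch: resolves the documented variables with their defaults.
function loadConfig(env = process.env) {
  // GEMINI_API_KEY is the only required variable and has no default.
  if (!env.GEMINI_API_KEY) {
    throw new Error("GEMINI_API_KEY is required");
  }
  return {
    geminiApiKey: env.GEMINI_API_KEY,
    mongodbUri: env.MONGODB_URI || "mongodb://localhost:27017/research_agent_db",
    // Environment values are strings, so PORT is converted to a number.
    port: Number(env.PORT) || 3000,
    clientUrl: env.CLIENT_URL || "http://localhost:5173",
  };
}
```

Failing fast on a missing key surfaces the problem at startup rather than on the first Gemini request.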

Start the Application

npm run dev

This single command starts both the Express backend (port 3000) and the Vite development server (port 5173) concurrently. The terminal displays logs from both processes.

The dashboard is accessible at http://localhost:5173.

Verify the Setup

  1. Open http://localhost:5173 in a browser — the dashboard should load
  2. Enter a research topic (e.g., "quantum computing") and click Research
  3. The sidebar navigation provides access to six views: Pipeline, Attempts, Report, Audit Log, JSON Data, and Diff View
  4. A successful run produces a validated report accessible from the Report tab in the sidebar

If the pipeline fails with a Gemini API error, verify that the GEMINI_API_KEY in .env is correct and that the free tier quota has not been exceeded.

Run Tests

cd server && npm test

The test suite includes 41 tests covering schema validation, the auditor service, the repairer service, and pipeline integration. Tests use mocked dependencies and do not require MongoDB or a Gemini API key.

Architecture

The backend uses a layered architecture with factory-function-based dependency injection:

Controller → Service → Repository
  • Controllers handle HTTP requests and SSE streaming
  • Services implement business logic (producer, auditor, repairer, orchestrator)
  • Repositories wrap external APIs (Wikipedia, arXiv, Gemini) and database (MongoDB)

Dependencies are wired in a single composition root (routes/researchRoutes.js) using factory functions — no classes, no new keyword.
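
The wiring style described here can be sketched as below. The factory names echo the services in this README, but the bodies, method names, and the `overrides` parameter are placeholders invented for illustration, not the contents of routes/researchRoutes.js.

```javascript
// Hypothetical factories: each takes its dependencies as an argument and
// returns a plain object, so no classes or `new` are involved.
function createAuditorService({ validateSchema }) {
  return {
    audit(report) {
      const errors = validateSchema(report);
      return { passed: errors.length === 0, errors };
    },
  };
}

function createResearchService({ producer, auditor }) {
  return {
    async run(topic) {
      const report = await producer.generate(topic);
      return { report, audit: auditor.audit(report) };
    },
  };
}

// Composition root: all dependencies are created and connected in one place.
// The placeholder defaults stand in for real repositories and validators.
function composeResearchService(overrides = {}) {
  const producer = overrides.producer || {
    generate: async (topic) => ({ topic, summary: "" }),
  };
  const auditor = createAuditorService({
    validateSchema: overrides.validateSchema || (() => []),
  });
  return createResearchService({ producer, auditor });
}
```

Since every collaborator arrives as an argument, a test can pass in fakes instead of real Gemini or MongoDB repositories.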

Pipeline Services

| Service | Role |
|---|---|
| producerService | Builds prompts from evidence, calls Gemini for JSON generation |
| auditorService | Three-phase validation: schema (AJV), grounding, consistency |
| repairerService | Minimal-diff repair using AJV error paths |
| researchService | Orchestrates the loop, emits SSE events, persists results |

External APIs

| API | Purpose | Format |
|---|---|---|
| Wikipedia REST API | Page summaries and search | JSON |
| arXiv API | Academic paper abstracts | Atom XML (parsed with xml2js) |
| Gemini (gemini-2.5-flash) | JSON report generation and repair | JSON |

API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | /api/research | Start research pipeline (SSE stream) |
| GET | /api/research/sessions | List recent sessions |
| GET | /api/research/sessions/:id | Get session with full audit trail |
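
Because the stream is started with a POST, the browser's EventSource (which only issues GET requests) cannot be used, so a client has to read the response body itself. The sketch below shows one way to do that. It is not the code in client/src/api/researchApi.js: the `{ topic }` request body and the function names are assumptions.

```javascript
// Split buffered SSE text into complete events plus any trailing partial event.
// Simplification: assumes LF line endings and ignores `id:` and `retry:` fields.
function parseSseBuffer(buffer) {
  const frames = buffer.split("\n\n");
  const rest = frames.pop(); // the last piece may be an incomplete event
  const events = frames
    .filter((frame) => frame.trim() !== "")
    .map((frame) => {
      let name = "message"; // default event name when no `event:` line is sent
      const dataLines = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith("event:")) name = line.slice(6).trim();
        else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
      }
      return { event: name, data: dataLines.join("\n") };
    });
  return { events, rest };
}

// Assumption: the endpoint accepts a JSON body of the form { topic }.
async function streamResearch(topic, onEvent, baseUrl = "http://localhost:3000") {
  const response = await fetch(`${baseUrl}/api/research`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ topic }),
  });
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Network chunks can split an event, so keep the remainder for next time.
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSseBuffer(buffer);
    buffer = parsed.rest;
    parsed.events.forEach(onEvent);
  }
}
```

A caller could log each event with `streamResearch("quantum computing", (e) => console.log(e.event, e.data))` while the server is running.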

Tech Stack

  • Backend: Express, MongoDB (Mongoose), AJV, xml2js, @google/generative-ai
  • Frontend: React, Vite
  • Testing: Vitest
  • Validation: AJV + JSON Schema (draft-07)

JSON Schema

The report schema (server/schemas/report.schema.json) enforces intentionally tight constraints:

  • Summary: 100-500 characters
  • Key findings: 3-7 items, each 20-300 characters
  • Sources: 2-10 items, each with validated URL format and type enum
  • Confidence scores: 0-1 range
  • Additional properties: not allowed

These constraints are deliberately strict so that LLM outputs occasionally fail validation, which exercises the self-healing loop.
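
For orientation, the constraints listed above could be expressed in draft-07 roughly as follows. This is not a copy of server/schemas/report.schema.json: the property names, the `required` lists, the enum values, and the placement of the confidence score are all assumptions. Only the numeric limits come from the list.

```javascript
// Hypothetical schema fragment. Property names, `required` lists, and enum
// values are assumed; only the numeric limits come from this README.
const reportSchemaSketch = {
  $schema: "http://json-schema.org/draft-07/schema#",
  type: "object",
  additionalProperties: false,
  required: ["summary", "keyFindings", "sources"],
  properties: {
    summary: { type: "string", minLength: 100, maxLength: 500 },
    keyFindings: {
      type: "array",
      minItems: 3,
      maxItems: 7,
      items: { type: "string", minLength: 20, maxLength: 300 },
    },
    sources: {
      type: "array",
      minItems: 2,
      maxItems: 10,
      items: {
        type: "object",
        additionalProperties: false,
        required: ["url", "type"],
        properties: {
          url: { type: "string", format: "uri" },
          type: { enum: ["wikipedia", "arxiv"] },
          confidence: { type: "number", minimum: 0, maximum: 1 },
        },
      },
    },
  },
};
```

With AJV 8, the `format: "uri"` keyword is only enforced when the separate ajv-formats package is registered.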

Example Output

The file examples/sample-output.json contains a captured pipeline run where:

  1. Attempt 1 fails — summary exceeds 500 characters, source URL is fabricated
  2. Attempt 2 self-heals — grounding fix replaces the URL, LLM shortens the summary
  3. Report passes all three audit phases

Project Structure

research-agent-app/
├── server/
│   ├── server.js                 # Entry point
│   ├── db/index.js               # MongoDB connection
│   ├── schemas/report.schema.json
│   ├── entities/Report.js        # Report factory functions
│   ├── repositories/             # Wikipedia, arXiv, Gemini, MongoDB
│   ├── services/                 # Producer, Auditor, Repairer, Orchestrator
│   ├── controllers/              # HTTP handlers + SSE streaming
│   ├── routes/researchRoutes.js  # Composition root (DI wiring)
│   ├── errors/index.js           # Custom error classes
│   ├── middleware/               # Error handler
│   └── __tests__/                # 41 unit + integration tests
├── client/
│   └── src/
│       ├── App.jsx               # Sidebar navigation + main content
│       ├── api/researchApi.js    # SSE stream + fetch helpers
│       └── components/           # SearchForm, PipelineView, etc.
├── examples/sample-output.json
├── .env.example
└── package.json                  # One-command run: npm run dev
