ProofStack — Self-Healing Evidence-Backed Research Agent

A research pipeline that fetches evidence from public APIs, generates structured JSON reports with an LLM, and validates them through a three-phase audit with automatic self-healing repair.

How It Works

User enters topic
    ↓
Fetch evidence (Wikipedia + arXiv)
    ↓
LLM generates JSON report
    ↓
Three-phase audit:
  1. Schema validation (AJV + JSON Schema)
  2. Grounding validation (source URLs match API responses)
  3. Consistency validation (cross-field integrity)
    ↓
Pass? → Done
Fail? → Minimal-diff repair → Re-audit (up to 3 attempts)
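
The loop in the diagram can be sketched in a few lines. This is a minimal sketch, not the project's orchestrator: `runPipeline`, `audit`, and `repair` are hypothetical names, the functions are synchronous for brevity (the real services call an LLM and would be asynchronous), and "up to 3 attempts" is assumed to mean three audits in total.

```javascript
// Hypothetical sketch of the audit/repair loop; all names are illustrative.
const MAX_ATTEMPTS = 3; // assumption: three audits in total

// audit(report) is assumed to return { passed: boolean, errors: Array }.
// repair(report, errors) is assumed to return a patched copy of the report.
function runPipeline(initialReport, audit, repair) {
  let report = initialReport;
  const attempts = [];

  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt += 1) {
    const result = audit(report);
    attempts.push({ attempt, passed: result.passed, errors: result.errors });

    if (result.passed) {
      return { status: 'passed', report, attempts };
    }
    // Repair only when another audit will follow.
    if (attempt < MAX_ATTEMPTS) {
      report = repair(report, result.errors);
    }
  }
  return { status: 'failed', report, attempts };
}
```

Keeping every attempt in the returned `attempts` array mirrors the idea of an audit trail: each failed audit and its errors stay visible after the run ends.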

Self-Healing Strategy

The repairer uses minimal-diff repair — it patches only the specific failing fields rather than regenerating the entire report:

Grounding errors: Deterministic fix — replaces fabricated URLs with actual API-returned URLs
Consistency errors: Deterministic fix — clamps out-of-bounds indices, corrects type mismatches
Schema errors: Two-phase fix — deterministic repairs handle string length violations (truncating at sentence boundaries for maxLength, padding for minLength), then LLM-assisted fix handles remaining errors with a targeted prompt that includes the full schema
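
The deterministic repairs described above might look roughly like the helpers below. This is a sketch under stated assumptions: the function names, the `sources[].url` field, and the rule for choosing a replacement URL (the first evidence URL not already cited) are all hypothetical, since the README does not specify them.

```javascript
// Hypothetical helpers; names and exact rules are assumptions, not the
// project's repairerService.

// Shorten text to maxLength, preferring to cut at the last sentence end.
function truncateAtSentence(text, maxLength) {
  if (text.length <= maxLength) return text;
  for (let i = maxLength - 1; i >= 0; i -= 1) {
    // A sentence ends at '.', '!' or '?' followed by whitespace.
    const endsSentence = '.!?'.includes(text[i]) && /\s/.test(text[i + 1]);
    if (endsSentence) return text.slice(0, i + 1);
  }
  // No sentence boundary found: fall back to a hard cut.
  return text.slice(0, maxLength);
}

// Force an index into the valid range of a non-empty array of given length.
function clampIndex(index, length) {
  return Math.min(Math.max(index, 0), length - 1);
}

// Replace URLs that the evidence APIs never returned.
function repairSourceUrls(sources, evidenceUrls) {
  const known = new Set(evidenceUrls);
  const used = new Set(sources.map((s) => s.url).filter((u) => known.has(u)));

  return sources.map((source) => {
    if (known.has(source.url)) return source; // grounded: leave untouched
    // Assumption: prefer an evidence URL that no other source cites yet.
    const replacement = evidenceUrls.find((u) => !used.has(u)) ?? evidenceUrls[0];
    if (replacement === undefined) return source; // nothing to ground against
    used.add(replacement);
    return { ...source, url: replacement };
  });
}
```

Each helper touches only the failing value and returns everything else unchanged, which is the point of a minimal-diff repair.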

Quick Start

Prerequisites

| Dependency | Version | Purpose |
| --- | --- | --- |
| Node.js | 18+ | Runtime for both server and client |
| MongoDB | 6+ | Stores research sessions and audit trails |
| Gemini API key | Free tier | LLM for report generation and repair |

MongoDB must be running before the application starts. On macOS with Homebrew:

brew services start mongodb-community

On Linux (systemd):

sudo systemctl start mongod

A Gemini API key can be obtained for free at https://aistudio.google.com/apikey. The free tier allows 15 requests per minute, which is sufficient for this application.

Install Dependencies

Three npm install commands are required — one for the root (which provides the concurrently launcher), one for the server, and one for the client:

npm install
cd server && npm install && cd ..
cd client && npm install && cd ..

Configure Environment

Copy the example environment file to .env:

cp .env.example .env

Then open .env and set the following values:

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `GEMINI_API_KEY` | Yes | (none) | API key from Google AI Studio |
| `MONGODB_URI` | No | `mongodb://localhost:27017/research_agent_db` | MongoDB connection string |
| `PORT` | No | `3000` | Backend server port |
| `CLIENT_URL` | No | `http://localhost:5173` | Frontend origin for CORS |
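
To make the defaults concrete, here is a minimal sketch of how a server might read these variables. The `loadConfig` function and the shape of the object it returns are hypothetical; only the variable names and default values come from the table above.

```javascript
// Hypothetical config loader; the project's actual startup code may differ.
function loadConfig(env = process.env) {
  // GEMINI_API_KEY is the only required variable, so fail fast without it.
  if (!env.GEMINI_API_KEY) {
    throw new Error('GEMINI_API_KEY is required');
  }
  return {
    geminiApiKey: env.GEMINI_API_KEY,
    mongodbUri: env.MONGODB_URI || 'mongodb://localhost:27017/research_agent_db',
    // Environment values are strings; fall back to 3000 if unset or invalid.
    port: Number(env.PORT) || 3000,
    clientUrl: env.CLIENT_URL || 'http://localhost:5173',
  };
}
```

Passing `env` as a parameter keeps the function easy to test without touching the real process environment.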

Start the Application

npm run dev

This single command starts both the Express backend (port 3000) and the Vite development server (port 5173) concurrently. The terminal displays logs from both processes.

The dashboard is accessible at http://localhost:5173.

Verify the Setup

Open http://localhost:5173 in a browser — the dashboard should load
Enter a research topic (e.g., "quantum computing") and click Research
The sidebar navigation provides access to six views: Pipeline, Attempts, Report, Audit Log, JSON Data, and Diff View
A successful run produces a validated report accessible from the Report tab in the sidebar

If the pipeline fails with a Gemini API error, verify that the GEMINI_API_KEY in .env is correct and that the free tier quota has not been exceeded.

Run Tests

cd server && npm test

The test suite includes 41 tests covering schema validation, the auditor service, the repairer service, and pipeline integration. Tests use mocked dependencies and do not require MongoDB or a Gemini API key.

Architecture

The backend uses a layered architecture with factory-function-based dependency injection:

Controller → Service → Repository

Controllers handle HTTP requests and SSE streaming
Services implement business logic (producer, auditor, repairer, orchestrator)
Repositories wrap external APIs (Wikipedia, arXiv, Gemini) and the database (MongoDB)

Dependencies are wired in a single composition root (routes/researchRoutes.js) using factory functions — no classes, no new keyword.
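
As an illustration of factory-function dependency injection, the sketch below wires one repository, one service, and one controller. Every name here (`createEvidenceRepository`, `createResearchService`, `createResearchController`, `composeApp`, `fetchJson`) is hypothetical and chosen for the example; the real modules may be shaped differently.

```javascript
// Hypothetical factories illustrating Controller -> Service -> Repository.

// Repository: wraps an external API behind a small interface.
function createEvidenceRepository({ fetchJson }) {
  return {
    async search(topic) {
      return fetchJson(`/search?q=${encodeURIComponent(topic)}`);
    },
  };
}

// Service: business logic, unaware of how evidence is fetched.
function createResearchService({ evidenceRepository }) {
  return {
    async research(topic) {
      const evidence = await evidenceRepository.search(topic);
      return { topic, evidenceCount: evidence.length };
    },
  };
}

// Controller: translates a request into a service call.
function createResearchController({ researchService }) {
  return {
    async handle(req) {
      return researchService.research(req.body.topic);
    },
  };
}

// Composition root: the only place that knows the concrete dependencies.
function composeApp({ fetchJson }) {
  const evidenceRepository = createEvidenceRepository({ fetchJson });
  const researchService = createResearchService({ evidenceRepository });
  return createResearchController({ researchService });
}
```

Because each factory receives its collaborators as arguments, a test can pass a fake `fetchJson` and exercise the whole chain without network access, which fits the README's note that the tests run against mocked dependencies.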

Pipeline Services

| Service | Role |
| --- | --- |
| `producerService` | Builds prompts from evidence, calls Gemini for JSON generation |
| `auditorService` | Three-phase validation: schema (AJV), grounding, consistency |
| `repairerService` | Minimal-diff repair using AJV error paths |
| `researchService` | Orchestrates the loop, emits SSE events, persists results |

External APIs

| API | Purpose | Format |
| --- | --- | --- |
| Wikipedia REST API | Page summaries and search | JSON |
| arXiv API | Academic paper abstracts | Atom XML (parsed with xml2js) |
| Gemini (gemini-2.5-flash) | JSON report generation and repair | JSON |

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/research | Start research pipeline (SSE stream) |
| GET | /api/research/sessions | List recent sessions |
| GET | /api/research/sessions/:id | Get session with full audit trail |
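
Because the stream is started with a POST, a client cannot use the browser's `EventSource` (which only issues GET requests) and has to parse the SSE text itself. The sketch below shows one way to do that. `parseSseBuffer` is a hypothetical helper, it assumes `\n` line endings, and the event names in the usage comment are invented; the README does not document the actual event names or payloads.

```javascript
// Hypothetical SSE parser: splits buffered text into complete events and
// returns any trailing partial frame so it can be prepended to the next chunk.
function parseSseBuffer(buffer) {
  const frames = buffer.split('\n\n'); // a blank line ends each SSE event
  const rest = frames.pop(); // the last piece may be an incomplete frame

  const events = frames
    .filter((frame) => frame.trim() !== '')
    .map((frame) => {
      let event = 'message'; // default event type in the SSE format
      const dataLines = [];
      for (const line of frame.split('\n')) {
        if (line.startsWith('event:')) {
          event = line.slice('event:'.length).trim();
        } else if (line.startsWith('data:')) {
          dataLines.push(line.slice('data:'.length).trim());
        }
      }
      // Multiple data lines in one frame are joined with newlines.
      return { event, data: dataLines.join('\n') };
    });

  return { events, rest };
}

// Usage idea (event names are made up):
//   const { events, rest } = parseSseBuffer(pending + decodedChunk);
//   events.forEach((e) => console.log(e.event, JSON.parse(e.data)));
//   pending = rest;
```

In a browser, the chunks would typically come from `response.body.getReader()` on the `fetch` response, decoded with a `TextDecoder` before being passed to the parser.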

Tech Stack

Backend: Express, MongoDB (Mongoose), AJV, xml2js, @google/generative-ai
Frontend: React, Vite
Testing: Vitest
Validation: AJV + JSON Schema (draft-07)

JSON Schema

The report schema (server/schemas/report.schema.json) enforces intentionally tight constraints:

Summary: 100-500 characters
Key findings: 3-7 items, each 20-300 characters
Sources: 2-10 items, each with validated URL format and type enum
Confidence scores: 0-1 range
Additional properties: not allowed

These constraints are tight enough that LLM output sometimes fails validation, which demonstrates the self-healing loop in action.
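
For orientation, the listed constraints could be expressed in JSON Schema draft-07 roughly as follows. This is an illustrative sketch, not the contents of report.schema.json: the property names (`summary`, `keyFindings`, `sources`, `confidence`), the `type` enum values, and the placement of the confidence score are all assumptions.

```javascript
// Hypothetical draft-07 schema mirroring the constraints listed above.
const reportSchema = {
  $schema: 'http://json-schema.org/draft-07/schema#',
  type: 'object',
  additionalProperties: false, // unknown fields fail validation
  required: ['summary', 'keyFindings', 'sources', 'confidence'],
  properties: {
    summary: { type: 'string', minLength: 100, maxLength: 500 },
    keyFindings: {
      type: 'array',
      minItems: 3,
      maxItems: 7,
      items: { type: 'string', minLength: 20, maxLength: 300 },
    },
    sources: {
      type: 'array',
      minItems: 2,
      maxItems: 10,
      items: {
        type: 'object',
        additionalProperties: false,
        required: ['url', 'type'],
        properties: {
          url: { type: 'string', format: 'uri' },
          type: { enum: ['wikipedia', 'arxiv'] }, // assumed enum values
        },
      },
    },
    confidence: { type: 'number', minimum: 0, maximum: 1 },
  },
};
```

A schema like this would be compiled once with AJV and reused for every audit. Note that in AJV v8 the `format` keyword is only checked when the separate ajv-formats plugin is registered.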

Example Output

The file examples/sample-output.json contains a captured pipeline run where:

Attempt 1 fails — summary exceeds 500 characters, source URL is fabricated
Attempt 2 self-heals — grounding fix replaces the URL, LLM shortens the summary
Report passes all three audit phases

Project Structure

research-agent-app/
├── server/
│   ├── server.js                 # Entry point
│   ├── db/index.js               # MongoDB connection
│   ├── schemas/report.schema.json
│   ├── entities/Report.js        # Report factory functions
│   ├── repositories/             # Wikipedia, arXiv, Gemini, MongoDB
│   ├── services/                 # Producer, Auditor, Repairer, Orchestrator
│   ├── controllers/              # HTTP handlers + SSE streaming
│   ├── routes/researchRoutes.js  # Composition root (DI wiring)
│   ├── errors/index.js           # Custom error classes
│   ├── middleware/               # Error handler
│   └── __tests__/                # 41 unit + integration tests
├── client/
│   └── src/
│       ├── App.jsx               # Sidebar navigation + main content
│       ├── api/researchApi.js    # SSE stream + fetch helpers
│       └── components/           # SearchForm, PipelineView, etc.
├── examples/sample-output.json
├── .env.example
└── package.json                  # One-command run: npm run dev
