Maxwell: Verified Search Agent

Next.js · TypeScript · AI SDK

Maxwell is an adaptive, self-correcting search engine designed to solve the Hallucination Problem in Enterprise AI. Unlike standard RAG systems that simply summarize search results, Maxwell audits its own answers using a multi-signal verification pipeline.


🧠 The Architecture: "The Trust Layer"

Maxwell introduces a 5-phase pipeline that treats LLM generation as an "Untrusted Draft" until verified.

graph LR
    A[User Query] --> B(Phase 1: Adaptive Plan);
    B --> C(Phase 2: Surgical Search);
    C --> D(Phase 3: Synthesize);
    D --> E(Phase 4: Temporal Verify);
    E --> F(Phase 5: Reconstruct);
    F --> G[Verified Answer];

Key Innovations

| Feature | Description |
| --- | --- |
| Multi-Endpoint Architecture | Pipeline split into 5 serverless functions for Vercel. Each phase stays under the 60s timeout. |
| Vercel Blob for Large Payloads | Embeddings (~12MB) are stored in Blob Storage and passed as a URL. Bypasses the 4.5MB payload limit. |
| Pre-Embedding Optimization | Embeddings are computed during search, not verification. Reduces verify from ~45s to ~8s. |
| Adaptive Compute | Analyzes query complexity. Simple → Gemini Flash (fast). Complex → Claude Sonnet (precise). |
| Temporal Verification | NLI enforces "Recency Superiority": old evidence cannot contradict current status. |
| Reasoning Bridge | Uses hedging language for unverified data instead of deleting it. |
| Glass Box UI | Visualizes the "thinking" process with per-claim confidence scores. |
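
As a rough illustration of the Adaptive Compute row, the sketch below routes a query to a model tier based on a crude complexity estimate. Everything here is an assumption made for illustration: `classifyQuery`, `createConfig`, the cue list, the thresholds, the `maxSubQueries` knob, and the model labels are hypothetical and are not taken from `configFactory.ts`. Only the three complexity levels (simple/standard/deep_research) and the "simple goes to Gemini Flash, complex goes to Claude Sonnet" split come from this README.

```typescript
// Hypothetical sketch of adaptive compute routing; not the project's actual code.
type Complexity = "simple" | "standard" | "deep_research";

interface PipelineConfig {
  complexity: Complexity;
  model: string;         // placeholder label, not a real model ID
  maxSubQueries: number; // assumed knob for how far to decompose the query
}

// Crude stand-in for query analysis: count words and multi-part cue words.
function classifyQuery(query: string): Complexity {
  const words = query.trim().split(/\s+/).filter(Boolean).length;
  const cues = (query.match(/\b(compare|versus|vs|why|impact|trend|and)\b/gi) ?? []).length;
  if (words <= 8 && cues === 0) return "simple";
  if (words > 25 || cues >= 3) return "deep_research";
  return "standard";
}

// Map the complexity level to a configuration (mapping for "standard" is assumed).
function createConfig(query: string): PipelineConfig {
  const complexity = classifyQuery(query);
  switch (complexity) {
    case "simple":
      return { complexity, model: "gemini-flash", maxSubQueries: 2 };
    case "standard":
      return { complexity, model: "claude-sonnet", maxSubQueries: 4 };
    case "deep_research":
      return { complexity, model: "claude-sonnet", maxSubQueries: 7 };
  }
}
```

A production classifier would more likely ask a fast LLM to rate the query than rely on keyword counts; the point of the sketch is only that the rest of the pipeline reads a single configuration object chosen up front.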

📁 Project Structure

app/
├── api/
│   ├── chat/route.ts              # Standard chat endpoint
│   └── maxwell/                   # Maxwell Multi-Endpoint API
│       ├── route.ts               # Legacy monolithic (local dev)
│       ├── decompose/route.ts     # Phase 1: Query decomposition
│       ├── search/route.ts        # Phase 2: Search + pre-embedding
│       ├── synthesize/route.ts    # Phase 3: SSE synthesis
│       ├── verify/route.ts        # Phase 4: SSE verification
│       └── adjudicate/route.ts    # Phase 5: SSE adjudication
├── components/
│   ├── maxwell/                   # Maxwell Canvas UI components
│   └── InputInterface.tsx         # Main input with mode toggle
├── hooks/
│   └── use-maxwell.ts             # Client orchestrator for multi-endpoint
└── lib/
    └── maxwell/
        ├── index.ts               # 5-phase orchestrator (local dev)
        ├── api-types.ts           # Multi-endpoint request/response types
        ├── configFactory.ts       # Adaptive compute configuration
        ├── decomposer.ts          # Phase 1: Query → Sub-queries
        ├── searcher.ts            # Phase 2: Surgical search
        ├── synthesizer.ts         # Phase 3: Draft synthesis
        ├── verifier.ts            # Phase 4: Multi-signal verification
        ├── adjudicator.ts         # Phase 5: Reconstruction
        ├── embeddings.ts          # Saturated pipeline embeddings
        ├── blob-storage.ts        # Vercel Blob utilities
        └── prompts.ts             # All LLM prompts

🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Next.js 16 (App Router, Turbopack) |
| Architecture | Multi-endpoint serverless (5 functions) |
| Orchestration | Client-side hook + Vercel AI SDK 5.0 |
| Search | Tavily API (Context-Aware w/ Raw Content) |
| Models | Google Gemini 3 Flash (Speed) / Claude Sonnet 4.5 (Reasoning) |
| Embeddings | Google Gemini Embedding 001 (Primary) / Qwen 3 (Fallback) |
| Large Payloads | Vercel Blob Storage (bypasses 4.5MB limit) |
| Streaming | Server-Sent Events (SSE) for real-time UI |
| State | Zustand + IndexedDB (idb-keyval) |

🚀 Getting Started

Prerequisites

API keys required:

  • OpenRouter (access to Gemini/Claude)
  • Tavily (search API)

Installation

# Clone the repository
git clone https://github.com/dmbernaal/maxwell.git
cd maxwell

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env.local
# Add OPENROUTER_API_KEY and TAVILY_API_KEY

# Run development server
npm run dev

🧪 Testing

Maxwell includes comprehensive tests organized by type:

Quick Start

# Run all unit tests (no API keys required)
npm run test:unit

# Run with coverage report
npm run test:coverage

Test Structure

__tests__/
├── unit/                              # No external dependencies (176 tests)
│   ├── blob-storage.test.ts           # Embedding encoding/decoding
│   ├── embeddings-math.test.ts        # Cosine similarity, top matches
│   ├── verifier-signals.test.ts       # Numeric extraction, normalization
│   ├── config-factory.test.ts         # Adaptive compute configuration
│   ├── constants.test.ts              # Threshold & model validation
│   ├── decomposer-validation.test.ts  # Query decomposition validation
│   ├── passage-chunking.test.ts       # Sentence segmentation
│   ├── evidence-retrieval.test.ts     # Best-match finding logic
│   ├── api-types.test.ts              # API contract validation
│   └── error-handling.test.ts         # Edge cases & defensive coding
└── integration/                       # Requires API keys
    └── api-endpoints.test.ts          # Full pipeline E2E

Test Commands

| Command | Description | API Keys? |
| --- | --- | --- |
| npm run test:unit | Unit tests only | ❌ No |
| npm run test:integration | Integration tests | ✅ Yes |
| npm test | All tests | ✅ Yes |
| npm run test:coverage | Unit tests with coverage | ❌ No |

What's Tested

Unit Tests (176 tests across 10 files):

  • Blob Storage: Base64 encoding/decoding, round-trip precision for embeddings
  • Embeddings Math: Cosine similarity, orthogonal/opposite vectors, top-N matching
  • Verifier Signals: Number extraction, normalization (billions/millions/%), confidence aggregation
  • Config Factory: Adaptive compute for all complexity levels (simple/standard/deep_research)
  • Constants: Model IDs, thresholds, multipliers, entailment scores validation
  • Decomposer Validation: Sub-query validation, duplicate ID detection, bounds checking
  • Passage Chunking: Sentence segmentation, window sizes, edge cases
  • Evidence Retrieval: Best-match finding, citation mismatch detection
  • API Types: Request/response contract validation, Blob URL formats
  • Error Handling: Malformed inputs, edge cases, defensive coding patterns
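
To make the "Verifier Signals" item more concrete, here is a minimal sketch of numeric extraction and normalization, so that a claim such as "4.2 billion" can be compared with evidence that says "4,200 million". The function names (`extractNumbers`, `numbersAgree`), the multiplier table, the regular expression, and the 1% tolerance are all illustrative assumptions; the actual logic in `verifier.ts` may differ.

```typescript
// Hypothetical sketch of numeric extraction/normalization; not the project's actual code.
const MULTIPLIERS: Record<string, number> = {
  thousand: 1e3, k: 1e3,
  million: 1e6, m: 1e6,
  billion: 1e9, b: 1e9,
  trillion: 1e12, t: 1e12,
};

interface ExtractedNumber {
  value: number;      // normalized value; percentages stay on a 0-100 scale
  isPercent: boolean; // true when the number carried "%" or "percent"
}

// Find numbers with an optional magnitude word, suffix, or percent sign.
function extractNumbers(text: string): ExtractedNumber[] {
  const pattern =
    /(\d[\d,]*(?:\.\d+)?)\s*(%|percent\b|(?:thousand|million|billion|trillion)\b|[kmbt]\b)?/gi;
  const results: ExtractedNumber[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const base = parseFloat(match[1].replace(/,/g, "")); // drop thousands separators
    const unit = (match[2] ?? "").toLowerCase();
    const isPercent = unit === "%" || unit === "percent";
    const multiplier = isPercent ? 1 : (MULTIPLIERS[unit] ?? 1);
    results.push({ value: base * multiplier, isPercent });
  }
  return results;
}

// Two numbers "agree" when their relative difference is within the tolerance.
function numbersAgree(a: number, b: number, tolerance = 0.01): boolean {
  if (a === b) return true;
  const scale = Math.max(Math.abs(a), Math.abs(b));
  return Math.abs(a - b) / scale <= tolerance;
}
```

Single-letter suffixes are ambiguous in real text ("5 m" could mean meters), so a stricter implementation would probably require a currency symbol or other context before applying them.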

Integration Tests (requires OPENROUTER_API_KEY + TAVILY_API_KEY):

  • Full 5-phase pipeline: Decompose → Search → Synthesize → Verify → Adjudicate
  • Real API calls to Tavily and OpenRouter

Legacy Tests

Original phase-by-phase tests from initial development are preserved in tests/legacy/ for reference.


📦 Deployment

Optimized for Vercel with multi-endpoint architecture:

  1. Push to GitHub
  2. Import project in Vercel
  3. Add environment variables in Vercel Dashboard:
    • OPENROUTER_API_KEY
    • TAVILY_API_KEY
    • BLOB_READ_WRITE_TOKEN (from Vercel Blob Storage)
  4. Deploy

Blob Storage Setup: In Vercel Dashboard → Storage → Create Blob Store → Copy token to env vars.

Why Multi-Endpoint?

The Maxwell pipeline is split into 5 serverless functions to stay within Vercel's 60-second timeout:

| Endpoint | Purpose | Timeout |
| --- | --- | --- |
| /api/maxwell/decompose | Query analysis | 30s |
| /api/maxwell/search | Search + pre-embed | 60s |
| /api/maxwell/synthesize | Answer generation | 30s |
| /api/maxwell/verify | Claim verification | 60s |
| /api/maxwell/adjudicate | Final verdict | 30s |
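
The client-side chaining of these endpoints can be sketched as below. This is an assumption-heavy illustration, not the code in `use-maxwell.ts`: the `CallPhase` transport type, the `runPipeline` function, and the idea of merging each phase's output into a shared state object are all hypothetical. Only the endpoint paths and timeouts are taken from the table. The transport is injected so the sketch runs without a server; in a real client it would wrap `fetch` and, for the streaming phases, an SSE reader.

```typescript
// Hypothetical sketch of a client orchestrator for the five endpoints.
type Phase = "decompose" | "search" | "synthesize" | "verify" | "adjudicate";

// Per-phase timeouts in seconds, mirroring the endpoint table.
const PHASE_TIMEOUT_S: Record<Phase, number> = {
  decompose: 30,
  search: 60,
  synthesize: 30,
  verify: 60,
  adjudicate: 30,
};

const PHASE_ORDER: Phase[] = ["decompose", "search", "synthesize", "verify", "adjudicate"];

// Injected transport (assumed shape): POSTs `body` to `path` and resolves with the phase output.
type CallPhase = (
  path: string,
  body: Record<string, unknown>,
  timeoutMs: number
) => Promise<Record<string, unknown>>;

// Run the phases in order, feeding each one everything produced so far.
async function runPipeline(
  query: string,
  callPhase: CallPhase
): Promise<Record<string, unknown>> {
  let state: Record<string, unknown> = { query };
  for (const phase of PHASE_ORDER) {
    const output = await callPhase(`/api/maxwell/${phase}`, state, PHASE_TIMEOUT_S[phase] * 1000);
    state = { ...state, ...output };
  }
  return state;
}
```

Because each request carries the accumulated state, large artifacts such as embeddings are better passed by reference (for example, a Blob URL) than inline, which is consistent with the 4.5MB payload limit mentioned earlier.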

The key optimization: pre-embedding passages during search so verification only embeds claims (~5-30 texts, not ~3000).
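
A minimal sketch of why this helps: once passage embeddings exist, matching a claim to evidence is just one embedding call for the claim plus cheap vector math. The names below (`EmbeddedPassage`, `cosineSimilarity`, `topMatches`) and the default of three matches are assumptions for illustration and may not match `embeddings.ts`.

```typescript
// Hypothetical sketch of claim-to-passage matching over pre-computed embeddings.
interface EmbeddedPassage {
  text: string;
  embedding: number[]; // assumed to be computed once, during the search phase
}

// Cosine similarity: 1 for parallel vectors, 0 for orthogonal, -1 for opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Return the n passages most similar to the claim, best first.
function topMatches(
  claimEmbedding: number[],
  passages: EmbeddedPassage[],
  n = 3
): { text: string; score: number }[] {
  return passages
    .map((p) => ({ text: p.text, score: cosineSimilarity(claimEmbedding, p.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}
```

With this split, the verify phase only needs to embed the handful of claims in the draft answer; the thousands of passage vectors are loaded from storage rather than recomputed.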


🤖 AI-Assisted Development Process

This project was built using a structured AI collaboration workflow:

Architecture Design

  • Consulted Claude Opus 4.5 and Gemini to design the 5-phase pipeline architecture
  • Iteratively refined the verification strategy through architectural discussions

Implementation Workflow

  1. Planning Phase: Before any code is written, the AI constructs an implementation plan (.md file) that I must approve
  2. Review & Edit: I review the plan, make corrections, and provide feedback
  3. Execution: Only after approval does the AI write code using Cursor or Antigravity
  4. Verification:
    • For logic: Run unit tests
    • For UI: Visual inspection via browser + console logs
  5. Iteration: Back-and-forth refinement with AI assistants

Model Specialization

| Task Type | Model Used |
| --- | --- |
| Logic & Architecture | Claude Opus 4.5 |
| Frontend & Design | Gemini 3.0 |
| Code Generation | Cursor Agent / Antigravity |

This approach ensures:

  • ✅ Human oversight at every decision point
  • ✅ Structured, reviewable implementation plans
  • ✅ Clear separation between design and execution
  • ✅ Iterative refinement based on testing feedback

📄 Documentation

Detailed technical documentation is available in /documentation.


Author

Diego I. Medina-Bernal
📧 dmbernaal@gmail.com
