Personal Research Agent

An agentic AI research assistant powered by Claude Haiku and Tavily Search. Give it a topic — it autonomously decides what to search, runs multiple live web queries, synthesises the findings, and delivers a structured markdown report in real time. Available as both a streaming web app and an interactive CLI.

How It Works

The pipeline runs in two distinct phases, shared between both entry points:

Phase 1 — Agent Loop

An agentic loop drives Claude Haiku with a web_search tool. Claude autonomously decides what queries to run, inspects the results, and issues further searches until it determines it has enough information — signalled by stop_reason == "end_turn". The loop is hard-capped at 10 iterations to bound cost and latency.

Phase 2 — Report Generation

A separate, non-agentic call takes the raw findings text and formats it into a structured markdown report with five fixed sections: Executive Summary, Key Findings, Current Trends, Implications, and Sources & Further Reading. Keeping these phases separate prevents the formatting step from interfering with search behaviour and avoids burning tokens on structure during research.

Browser / CLI
     │
     │  POST /research (topic)
     ▼
Flask Server  ──────────────────────────────────────────────────────────┐
     │                                                                   │
     │  Phase 1: Agent Loop (run_agent_streaming)                        │
     │      ├─ messages.create() ──────────────► Claude Haiku            │
     │      │          ◄── tool_use / end_turn ─────────────────────     │
     │      └─ tavily.search(query) ──────────► Tavily Search API        │
     │                 ◄── results ───────────────────────────────────   │
     │                                                                   │
     │  Phase 2: Report Generation (generate_report)                     │
     │      └─ messages.create() ──────────────► Claude Haiku            │
     │                 ◄── structured markdown ───────────────────────   │
     │                                                                   │
     │  SSE stream: status events → report event → browser              │
     ▼                                                                   │
Browser (ReadableStream → inline markdown renderer → downloadable .md) ─┘

Features

Autonomous research loop — Claude decides what to search and when to stop, with zero hardcoded query logic
Real-time streaming — Server-Sent Events push live progress updates to the browser as the agent works (Searching: X…, Research complete, etc.)
Structured reports — Consistent five-section markdown format, downloadable as .md
Duplicate query detection — An in-memory seen_queries set prevents Claude from re-issuing the same search within a session
Dual entry points — Full-featured web app and a standalone CLI that saves timestamped reports to disk
Rate limiting — 3 research requests per IP per day via flask-limiter, with a polished in-app modal when the limit is hit
Structured logging — Dual-handler logging to console and daily rotating log files with per-module context and token usage tracking
Secure by design — API keys live server-side only; the frontend never sees or sends credentials

Tech Stack

Layer	Technology
AI Model	Claude Haiku (`claude-haiku-4-5`) via Anthropic SDK
Web Search	Tavily Search API
Backend	Python 3.11 · Flask 3 · flask-limiter · flask-cors
Streaming	Server-Sent Events (`text/event-stream`)
Frontend	Vanilla JS · Fetch API · `ReadableStream`
Config	`python-dotenv` — keys server-side only, never exposed to client
Logging	Python `logging` — structured format, dual handlers (console + daily file)

Error Handling & Resilience

Every failure path is handled explicitly. A single failed search or transient API error never brings down the whole pipeline.

Agent Loop

Scenario	Handling
Claude API call fails	Exception caught and logged with full traceback. CLI exits with code 1. Web app emits an SSE `error` event to the browser — the stream closes cleanly and the UI displays the error message.
Individual Tavily search fails	Exception caught per query. A `"Search failed: <reason>"` string is returned as the tool result so Claude can continue with remaining queries instead of aborting the entire loop.
Duplicate search query	Detected via `seen_queries` set before the network call is made. Skipped with an informational tool result and a `WARNING` log entry. Prevents redundant API calls and infinite search loops.
Max iterations exceeded (10)	Loop terminates gracefully, warning logged. Returns a failed `AgentResult` with `error="Max iterations reached"`.
Agent ends turn with no text	Explicit post-`end_turn` check. If no text block is found in the response, an SSE `error` event is emitted rather than silently calling report generation with empty content.

Report Generation

Scenario	Handling
Claude API call fails	Exception caught and logged. CLI exits with code 1. Web app emits SSE `error` event.
File write fails (CLI)	Exception caught and logged. The report is still printed to stdout, so no work is lost even if the filesystem write fails.

HTTP Layer

Scenario	Handling
Rate limit exceeded (3/day/IP)	`flask-limiter` returns HTTP 429. The frontend checks `response.status` before opening the SSE stream and shows a modal dialog explaining the limit and reset time — no broken stream, no silent failure.
Missing server API keys	Returns HTTP 500 with `{"error": "Server not configured"}`. Keys are never accepted from the request body.
Empty topic submitted	Returns HTTP 400 with `{"error": "No topic provided"}`.
Partial SSE frame received	Each JSON parse in the frontend stream reader is wrapped in `try/catch`. Malformed partial frames are silently discarded without breaking the stream.

Logging

Every event — success or failure — is written to both stdout and a daily log file (logs/app_YYYYMMDD.log). Third-party loggers (httpx, httpcore, anthropic) are silenced to WARNING to keep the signal-to-noise ratio high.

2026-05-25 11:43:01 | INFO     | app                  | Research request received — topic='quantum computing'
2026-05-25 11:43:02 | INFO     | core                 | Searching — query='quantum computing breakthroughs 2025'
2026-05-25 11:43:03 | INFO     | core                 | Searching — query='quantum hardware IBM Google 2025'
2026-05-25 11:43:05 | WARNING  | core                 | Duplicate query skipped — query='quantum computing'
2026-05-25 11:43:09 | INFO     | core                 | Research complete — iterations=4
2026-05-25 11:43:11 | INFO     | core                 | Report generated — tokens used: input=3821, output=612

Project Structure

research-agent/
├── app.py                    # Flask web server — SSE streaming endpoint, rate limiting
├── research_agent.py         # CLI entry point — synchronous pipeline, saves .md to disk
├── core.py                   # Shared pipeline: call_claude, run_search, generate_report
├── logging_config.py         # Dual-handler logging setup (console + daily rotating file)
├── static/
│   ├── index.html            # Single-page app
│   ├── app.js                # SSE consumer, markdown renderer, rate-limit modal, UI logic
│   ├── style.css             # Styling
│   └── systemdesign.png      # Architecture diagram
├── developer-info.html       # Developer contact page (loaded in modal iframe)
├── testing_scripts/          # Standalone incremental build-up scripts for debugging
│   ├── step1_raw_responses.py
│   ├── step2_tool_execution.py
│   ├── step3_agent_loop.py
│   └── step4_report.py
├── logs/                     # Daily rotating log files (auto-created at runtime)
├── .env                      # API keys — never committed
└── requirements.txt

Getting Started

1. Clone and install

git clone <repo-url>
cd research-agent
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

2. Add API keys

# .env
ANTHROPIC_API_KEY=sk-ant-...
TAVILY_API_KEY=tvly-...

Get yours at:

Anthropic — console.anthropic.com
Tavily — app.tavily.com

3. Run

# Web app — serves at http://localhost:5500
python app.py

# CLI — interactive prompt
python research_agent.py

# CLI — topic as argument
python research_agent.py "large language model scaling laws"

# CLI — verbose debug logging
python research_agent.py "fusion energy" --debug

Report Format

Every report follows the same five-section structure:

# {Topic}: Research Report

## Executive Summary
## Key Findings
## Current Trends
## Implications
## Sources & Further Reading

---
*Report generated on {date}*

Key Constants

Constant	Default	Description
`MODEL`	`claude-haiku-4-5`	Anthropic model used for both phases
`MAX_ITERATIONS`	`10`	Hard cap on agent loop cycles
`MAX_SEARCH_RESULTS`	`3`	Tavily results returned per query
Content truncation	`300 chars`	Per-result content limit passed to Claude
Rate limit	`3 / day / IP`	Enforced server-side by `flask-limiter`

Developer

Saif Ahmed

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Personal Research Agent

How It Works

Features

Tech Stack

Error Handling & Resilience

Agent Loop

Report Generation

HTTP Layer

Logging

Project Structure

Getting Started

Report Format

Key Constants

Developer

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.vscode		.vscode
static		static
testing_scripts		testing_scripts
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
README.md		README.md
app.py		app.py
core.py		core.py
developer-info.html		developer-info.html
logging_config.py		logging_config.py
requirements.txt		requirements.txt
research_agent.py		research_agent.py
systemdesign.png		systemdesign.png
vercel.json		vercel.json

Folders and files

Latest commit

History

Repository files navigation

Personal Research Agent

How It Works

Features

Tech Stack

Error Handling & Resilience

Agent Loop

Report Generation

HTTP Layer

Logging

Project Structure

Getting Started

Report Format

Key Constants

Developer

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages