One command, full institutional-quality equity research report.

```sh
./research.py NVDA
# → work/NVDA_20260317/artifacts/final_report.md
```

A steerable general research and report-writing pipeline powered by Claude Code.
The report-generation steps and their dependencies are defined as a directed acyclic graph (DAG); research.py then runs every data-gathering and information-processing step to produce a structured report. The defaults target equity research, but the inputs, outputs, and pipeline steps are all user-definable. Because the DAG pins the steps down explicitly, runs are far more repeatable than a skill built around a bullet list of things to put in the report.
Builds on the Claude for Financial Services plugins by splitting the single prompt into one per section, adding hard quality gates, LanceDB-backed RAG, and critic-optimizer loops, and assembling the final report with Jinja templates instead of prompts.
The pipeline runs ~33 tasks in dependency order: gather data, index it, research it, write sections, assemble a report.
- Gather data — Python scripts fetch company profile, financials, SEC filings, technical indicators, Wikipedia, and custom research questions
- Chunk & index — text artifacts are split into chunks, embedded, and stored in a LanceDB hybrid (vector + BM25) index, tagged by report section
- Research — 7 agents run in parallel, each querying the index and using MCP tools to dig deeper. Findings flow back into the index so researchers can build on each other's work
- Write — 7 section writers run in parallel, each querying the unified index. Each goes through a critic-rewrite loop
- Assemble — sections are concatenated, conclusion and intro written, a final critique-polish pass applied, and the report rendered to markdown/HTML/PDF
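The hybrid index in the chunk-and-index step merges a vector ranking with a BM25 ranking. A common way to fuse the two result lists is reciprocal rank fusion (RRF); whether LanceDB uses this exact formula is not asserted here, but the sketch below illustrates the idea:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(d) = sum over rankings of 1/(k + rank).
    Documents that rank well in either list float to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy result lists from a vector search and a BM25 search.
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]
bm25_hits = ["chunk_b", "chunk_d", "chunk_a"]
fused = rrf([vector_hits, bm25_hits])
# chunk_b appears near the top of both lists, so it leads the fused ranking.
```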
```mermaid
flowchart TD
profile["profile + peers"]
fetch["data gathering\n(technical, fundamental,\nedgar, wikipedia, custom)"]
chunk["chunk → tag → index\n(LanceDB hybrid search)"]
research["7 research agents\n(parallel, MCP-enabled)"]
index_research["index_research\n(append MCP + findings)"]
writers["7 section writers\n(parallel, query index)"]
assemble_body["assemble_body"]
write_conclusion["write_conclusion"]
write_intro["write_intro"]
assemble["assemble_text"]
critique["critique_body_final"]
polish["polish_body_final"]
final["final_assembly\n(Jinja2 + pandoc)"]
profile --> fetch
fetch --> chunk
chunk --> research
research --> index_research
index_research --> writers
writers --> assemble_body
assemble_body --> write_conclusion
write_conclusion --> write_intro
write_intro --> assemble
assemble --> critique
critique --> polish
polish --> final
style profile fill:#e1f5fe
style fetch fill:#e1f5fe
style chunk fill:#e1f5fe
style research fill:#f3e5f5
style index_research fill:#f3e5f5
style writers fill:#fff3e0
style assemble_body fill:#fff3e0
style write_conclusion fill:#fff3e0
style write_intro fill:#fff3e0
style critique fill:#fff3e0
style polish fill:#fff3e0
style assemble fill:#e8f5e9
style final fill:#e8f5e9
```
Colors: blue = data gathering & indexing, purple = research, orange = writing & editing, green = final assembly.
DAG + Claude Code hybrid. The orchestrator (research.py) runs a dependency graph where each node is either a Python script or a claude -p subprocess with full tool access. This combines the reliability of a state machine (retries, resume, parallel dispatch) with the flexibility of autonomous Claude agents (web search, code execution, MCP tools).
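The dispatch loop described above can be sketched as follows. The task names, the `run_task` body, and the retry policy are illustrative stand-ins, not the project's actual API; the real pipeline's ~33 tasks live in dags/sra.yaml:

```python
import asyncio

# Illustrative task graph: task -> set of prerequisite tasks.
DAG = {
    "fetch": set(),
    "chunk_index": {"fetch"},
    "research": {"chunk_index"},
    "write_sections": {"research"},
    "assemble": {"write_sections"},
}

async def run_task(name: str, attempts: int = 3) -> str:
    """Stand-in for one node: a Python script or a `claude -p` subprocess,
    retried with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            await asyncio.sleep(0)  # real work would happen here
            return name
        except Exception:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(2 ** attempt)

async def run_dag(dag: dict[str, set[str]]) -> list[str]:
    """Dispatch every task whose dependencies are satisfied; independent
    tasks (e.g. the 7 parallel researchers) run concurrently."""
    done: set[str] = set()
    order: list[str] = []
    pending = dict(dag)
    while pending:
        ready = [t for t, deps in pending.items() if deps <= done]
        if not ready:
            raise RuntimeError("cycle or missing dependency in DAG")
        for name in await asyncio.gather(*(run_task(t) for t in ready)):
            done.add(name)
            order.append(name)
            del pending[name]
    return order

order = asyncio.run(run_dag(DAG))
# Every task completes only after all of its prerequisites.
```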
Steerable. The report outline, research questions, writing prompts, style guide, and data sources are all configuration — not code. Change what gets researched and how it's written without touching the pipeline.
Extensible. Add data sources by writing a Python script or adding MCP tools. Add report sections by editing the DAG. The pipeline doesn't know or care what "equity research" means — it just runs tasks in order.
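For a sense of what a DAG entry might look like, here is a hypothetical task definition. The field names below are illustrative assumptions, not the actual v2 schema; consult dags/sra.yaml for the real format:

```yaml
# Hypothetical task entry; the real sra.yaml schema may differ.
tasks:
  - id: write_risks
    type: claude
    depends_on: [index_research]
    n_iterations: 2          # critic-rewrite passes for this section
    system_prompt: |
      You are an equity research writer. Draft the risks section...
    output: drafts/risks.md
```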
| What | Where | How |
|---|---|---|
| Research questions | `custom_prompts.json` in workdir | Add specific investigation questions/prompts at the start of the workflow |
| Report sections | `dags/sra.yaml` | Add, remove, or reorder sections; adjust dependencies |
| Writing prompts | `dags/sra.yaml` task configs | Edit the system/user prompts for each writer and critic |
| Style guide | `STYLE.md` | Set tone, source hierarchy, formatting rules |
| Critic iterations | `dags/sra.yaml` `n_iterations` | Control how many critic-rewrite passes each section gets |
| Data sources | `.env` + MCP config + Python scripts | Add API keys, MCP servers, or new fetch scripts |
| Report templates | `templates/*.md.j2` | Modify the Jinja2 templates used for final assembly |
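As an illustration, `custom_prompts.json` might hold investigation prompts like these. The exact shape of the file (a plain list versus a keyed object) is an assumption here:

```json
[
  "How exposed is the company to export controls on advanced chips?",
  "What do recent 8-K filings say about executive turnover?"
]
```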
- Python 3.10+
- uv package manager
- Claude Code CLI
- System libraries: `pandoc`, `ta-lib`
```sh
# System dependencies (macOS)
brew install pandoc ta-lib
export TA_INCLUDE_PATH="$(brew --prefix ta-lib)/include"
export TA_LIBRARY_PATH="$(brew --prefix ta-lib)/lib"

# Python dependencies
uv sync
```

Create a `.env` file in the project root:
```sh
SEC_FIRM=...             # SEC EDGAR identity (firm name)
SEC_USER=...             # SEC EDGAR identity (email)
OPENAI_API_KEY=...       # Chunk embeddings (text-embedding-3-small)
OPENBB_PAT=...           # OpenBB Platform access token
FMP_API_KEY=...          # Financial Modeling Prep API key
FINNHUB_API_KEY=...      # Finnhub API key (peer detection)
BRAVE_API_KEY=...        # Brave Search (MCP research agents)
ALPHAVANTAGE_API_KEY=... # Alpha Vantage (MCP research agents)
PERPLEXITY_API_KEY=...   # Perplexity AI (optional, MCP research)
```
No ANTHROPIC_API_KEY is needed when running on a MAX subscription: all Claude tasks run through the Claude Code CLI subprocess. In fact, make sure the variable is unset on a MAX subscription, or you will consume a lot of API tokens.
```sh
./research.py NVDA
./research.py SYMBOL [--dag dags/sra.yaml] [--date YYYYMMDD] [--clean]
./research.py SYMBOL --resume [--retry-failed]
```

`--resume` picks up where a previous run left off: tasks stuck in `running` are reset to `pending`, and completed tasks are skipped. Add `--retry-failed` to also retry tasks that previously failed.
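The resume semantics can be sketched against the task-state table that research.db presumably holds. The table and column names below are assumptions for illustration, not the project's actual schema:

```python
import os
import sqlite3
import tempfile

def reset_for_resume(db_path: str, retry_failed: bool = False) -> None:
    """Reset interrupted tasks to pending so a resumed run re-dispatches
    them; completed tasks are left alone and will be skipped."""
    con = sqlite3.connect(db_path)
    con.execute("UPDATE tasks SET state='pending' WHERE state='running'")
    if retry_failed:
        con.execute("UPDATE tasks SET state='pending' WHERE state='failed'")
    con.commit()
    con.close()

# Demo on a scratch database.
path = os.path.join(tempfile.mkdtemp(), "research.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, state TEXT)")
con.executemany("INSERT INTO tasks VALUES (?, ?)", [
    ("fetch", "completed"), ("research", "running"), ("write", "failed"),
])
con.commit()
con.close()

reset_for_resume(path, retry_failed=True)
states = dict(sqlite3.connect(path).execute("SELECT id, state FROM tasks"))
```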
```sh
./research.py SYMBOL --task TASK_ID
```

Runs one task by ID (the workdir must already exist). Useful for re-running a specific step after fixing an issue.
```sh
uvicorn web:app --reload
```

Serves the FastAPI interface at http://localhost:8000 with live log streaming over WebSocket.
| Source | What it provides |
|---|---|
| yfinance | Price history, fundamentals, analyst recommendations |
| TA-Lib | Technical indicators (SMA, RSI, MACD, ATR, Bollinger Bands) |
| OpenBB / FMP | Financial statements, key ratios, peer comparisons |
| Finnhub | Peer company detection |
| SEC EDGAR | 10-K, 10-Q, 8-K filings via edgartools |
| Wikipedia | Company history and background |
| Brave / Perplexity | Web search for research agents (via MCP) |
| Claude subagents | Report writing, critique, and revision |
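The critique-and-revision work in the last row follows the critic-rewrite loop mentioned earlier. A minimal sketch, where the `critique` and `rewrite` callables stand in for `claude -p` subprocess calls:

```python
def critic_rewrite(draft: str, critique, rewrite, n_iterations: int = 2) -> str:
    """Alternate critique and rewrite passes; stop early if the critic
    has no feedback (the hard quality gate passes)."""
    for _ in range(n_iterations):
        feedback = critique(draft)
        if not feedback:
            break
        draft = rewrite(draft, feedback)
    return draft

# Toy stand-ins for Claude subagent calls.
critique = lambda d: "too short" if len(d) < 20 else ""
rewrite = lambda d, fb: d + " Expanded with more detail."
final = critic_rewrite("Initial draft.", critique, rewrite)
```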
Each run produces `work/{SYMBOL}_{DATE}/artifacts/` containing 40+ files:

- `final_report.md` — the complete formatted report
- `chart.png` — stock price chart with technical overlays
- `profile.json`, `technical_analysis.json` — structured data
- `income_statement.csv`, `balance_sheet.csv`, `cash_flow.csv`, `key_ratios.csv` — financials
- Section drafts, critic feedback, and revision history in `drafts/`
```
├── research.py            # Async DAG orchestrator (entry point)
├── web.py                 # FastAPI web runner + WebSocket logs
├── dags/
│   └── sra.yaml           # DAG definition (33 tasks, v2 schema)
├── skills/
│   ├── db.py              # SQLite state management CLI
│   ├── schema.py          # Pydantic DAG validation models
│   ├── config.py          # Centralized constants
│   ├── fetch_profile/     # Company profile + peers
│   ├── fetch_technical/   # Chart + technical indicators
│   ├── fetch_fundamental/ # Financials, ratios, analyst data
│   ├── fetch_edgar/       # SEC filings
│   ├── fetch_wikipedia/   # Wikipedia summary
│   ├── custom_research/   # User-provided investigation prompts
│   ├── chunk_index/       # Chunk, tag, build LanceDB index
│   ├── search_index/      # Hybrid vector + BM25 search
│   └── mcp_proxy/         # MCP caching proxy
├── templates/             # Jinja2 report assembly templates
├── tests/                 # pytest suite (210+ tests)
└── work/                  # Output (one dir per run)
    └── {SYMBOL}_{DATE}/
        ├── research.db    # Task state
        ├── artifacts/     # Final outputs
        └── drafts/        # Iteration history
```
