Stock Research Agent

An async Python-orchestrated equity research pipeline that generates comprehensive analyst-style reports. A single script drives a DAG of data-gathering tasks and Claude writing agents, producing a polished report from one command.

How It Works

./research.py AMD --date 20260225

This triggers a 14-task pipeline:

flowchart TD
    profile["profile\n(company identity & peers)"]
    fetch["fetch data\n(technical, fundamental,\nperplexity, edgar,\nwikipedia, perplexity_analysis)"]
    write_body["write_body\n(Claude: 7-section report)"]
    write_conclusion["write_conclusion\n(Claude: concluding analysis)"]
    write_intro["write_intro\n(Claude: intro paragraph)"]
    assemble["assemble_text\n(Jinja2: combine sections)"]
    critique["critique_body_final\n(Claude: editorial review)"]
    polish["polish_body_final\n(Claude: revise per critique)"]
    final["final_assembly\n(Jinja2 + pandoc → md/html/pdf)"]

    profile --> fetch
    fetch --> write_body
    write_body --> write_conclusion
    write_body --> write_intro
    write_conclusion --> write_intro
    write_intro --> assemble
    assemble --> critique
    critique --> polish
    polish --> final

    style profile fill:#e1f5fe
    style fetch fill:#e1f5fe
    style write_body fill:#fff3e0
    style write_conclusion fill:#fff3e0
    style write_intro fill:#fff3e0
    style critique fill:#fff3e0
    style polish fill:#fff3e0
    style assemble fill:#e8f5e9
    style final fill:#e8f5e9

Phase 1, data gathering (parallel, blue): the profile task fetches company identity and peers, then six data tasks run concurrently: technicals, fundamentals, Perplexity research, SEC filings, Wikipedia, and competitive analysis.

Phase 2, writing (sequential, orange): Claude subagents synthesize the gathered data into a 7-section report body, then write the conclusion and the intro. Once the sections have been combined, an editor agent critiques the draft and a revision agent polishes it.

Phase 3, assembly (green): Jinja2 concatenates the drafted sections ahead of the critique step. After polishing, the final report is assembled with charts and tables and converted to Markdown, HTML, and PDF via pandoc.
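The dependency edges in the flowchart can be grouped into waves of tasks that are safe to run together. The following is a minimal sketch, assuming the graph is held in a plain dict rather than read from dags/sra.yaml. The helper name compute_waves and the six fetch task names (taken from the flowchart label) are illustrative, not confirmed identifiers from the project.

```python
from typing import Dict, List, Set

# Six data tasks named after the flowchart's "fetch data" label (assumed names).
FETCH = ["technical", "fundamental", "perplexity", "edgar",
         "wikipedia", "perplexity_analysis"]

# Task -> set of tasks it depends on, following the flowchart edges.
DEPS: Dict[str, Set[str]] = {
    "profile": set(),
    **{name: {"profile"} for name in FETCH},
    "write_body": set(FETCH),
    "write_conclusion": {"write_body"},
    "write_intro": {"write_body", "write_conclusion"},
    "assemble_text": {"write_intro"},
    "critique_body_final": {"assemble_text"},
    "polish_body_final": {"critique_body_final"},
    "final_assembly": {"polish_body_final"},
}


def compute_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; every task in a wave has all dependencies met."""
    done: Set[str] = set()
    remaining = dict(deps)
    waves: List[List[str]] = []
    while remaining:
        ready = sorted(name for name, needs in remaining.items() if needs <= done)
        if not ready:
            # Nothing can start: the graph has a cycle or an undeclared task.
            raise ValueError(f"cycle or unknown dependency among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves
```

With this graph only the second wave holds more than one task, which matches the description of Phase 1 as parallel and Phase 2 as sequential.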

Architecture

Orchestrator: research.py is a single async Python script that reads the DAG, initializes the database, and runs waves of tasks as parallel subprocesses. Python data-gathering tasks run via uv run python; Claude writing tasks run via claude --dangerously-skip-permissions -p. All database writes are centralized in the orchestrator.
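Running one wave as parallel subprocesses might look roughly like the sketch below. It is not the code in research.py: the function names run_task and run_wave are hypothetical, and the demo launches the current Python interpreter instead of uv run python or claude so that it runs anywhere.

```python
import asyncio
import sys
from typing import Dict, List, Tuple


async def run_task(name: str, cmd: List[str]) -> Tuple[str, int, str]:
    """Run one task as a subprocess and capture its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _stderr = await proc.communicate()
    return name, proc.returncode, stdout.decode()


async def run_wave(tasks: Dict[str, List[str]]) -> List[Tuple[str, int, str]]:
    """Start every task in the wave at once and wait for all of them."""
    return await asyncio.gather(*(run_task(n, c) for n, c in tasks.items()))


def demo() -> List[Tuple[str, int, str]]:
    # Stand-in commands; a real wave would use the uv / claude command lines.
    wave = {
        "a": [sys.executable, "-c", "print('a done')"],
        "b": [sys.executable, "-c", "print('b done')"],
    }
    return asyncio.run(run_wave(wave))
```

Because the subprocesses only return results and never touch the database, a design like this keeps all writes in the parent process, which is consistent with the centralized-writes rule above.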

State management: One SQLite database per run (work/{SYMBOL}_{DATE}/research.db) tracks task status, dependencies, artifacts, and runtime variables. All components access state through skills/db.py — no direct SQL elsewhere.
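To make the idea of tracking status and dependencies in SQLite concrete, here is a small sketch of a "which tasks are ready" query. The table names, column names, and status values are assumptions made for illustration; the actual schema lives in skills/db.py and is not shown in this README.

```python
import sqlite3
from typing import List

# Hypothetical two-table schema: one row per task, one row per dependency edge.
SCHEMA = """
CREATE TABLE tasks (
    id     TEXT PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE task_deps (
    task_id    TEXT NOT NULL,
    depends_on TEXT NOT NULL
);
"""


def ready_tasks(conn: sqlite3.Connection) -> List[str]:
    """Return pending tasks whose dependencies are all complete."""
    rows = conn.execute(
        """
        SELECT t.id FROM tasks t
        WHERE t.status = 'pending'
          AND NOT EXISTS (
              SELECT 1 FROM task_deps d
              JOIN tasks dep ON dep.id = d.depends_on
              WHERE d.task_id = t.id AND dep.status != 'complete'
          )
        ORDER BY t.id
        """
    ).fetchall()
    return [row[0] for row in rows]


def demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a three-task chain."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO tasks (id) VALUES (?)",
        [("profile",), ("technical",), ("write_body",)],
    )
    conn.executemany(
        "INSERT INTO task_deps VALUES (?, ?)",
        [("technical", "profile"), ("write_body", "technical")],
    )
    return conn
```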

Artifact context: A manifest.json file is refreshed before each wave and lists every artifact produced so far. Claude tasks read this file to discover available research data.
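One possible way to produce such a manifest is sketched below. The README does not specify the layout of manifest.json or where it is stored, so the "artifacts" key, the per-file fields, and the choice to write it inside the artifacts directory are all assumptions.

```python
import json
from pathlib import Path
from typing import Any, Dict


def write_manifest(artifacts_dir: Path) -> Dict[str, Any]:
    """List the files currently in artifacts_dir and save them as manifest.json."""
    entries = [
        {"name": path.name, "bytes": path.stat().st_size}
        for path in sorted(artifacts_dir.iterdir())
        # Skip the manifest itself so it never lists its own previous version.
        if path.is_file() and path.name != "manifest.json"
    ]
    manifest = {"artifacts": entries}
    (artifacts_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Regenerating the file from the directory contents, rather than appending to it, keeps the manifest correct even when a task was skipped or re-run.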

DAG definition: dags/sra.yaml declares tasks, types, dependencies, configs, and expected outputs in a version-2 schema validated by Pydantic.
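The kind of checks a DAG schema performs can be illustrated with the standard library alone. The project validates with Pydantic models in skills/schema.py; the sketch below deliberately swaps in dataclasses so it runs without extra dependencies, and the field names (id, type, depends_on, config, outputs) are guesses based on the list above, not the real v2 schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TaskSpec:
    id: str
    type: str  # e.g. "python" or "claude" (assumed values)
    depends_on: List[str] = field(default_factory=list)
    config: Dict[str, Any] = field(default_factory=dict)
    outputs: List[str] = field(default_factory=list)


def validate_dag(doc: Dict[str, Any]) -> List[TaskSpec]:
    """Check the version, task id uniqueness, and that dependencies exist."""
    if doc.get("version") != 2:
        raise ValueError("expected a version-2 DAG")
    tasks = [TaskSpec(**raw) for raw in doc.get("tasks", [])]
    ids = [task.id for task in tasks]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate task id")
    known = set(ids)
    for task in tasks:
        missing = [dep for dep in task.depends_on if dep not in known]
        if missing:
            raise ValueError(f"{task.id}: unknown dependencies {missing}")
    return tasks
```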

Data Sources

Source	What it provides
yfinance	Price history, fundamentals, analyst recommendations
TA-Lib	Technical indicators (SMA, RSI, MACD, ATR, Bollinger Bands)
OpenBB / FMP	Financial statements, key ratios, peer comparisons
Finnhub	Peer company detection
Perplexity AI	News, business profiles, executive bios, competitive/risk analysis
SEC EDGAR	10-K, 10-Q, 8-K filings via edgartools
Wikipedia	Company history and background
Claude subagents	Report writing, critique, and revision

Output

Each run produces work/{SYMBOL}_{DATE}/artifacts/ containing 40+ files:

final_report.md — the complete formatted report
chart.png — stock price chart with technical overlays
profile.json, technical_analysis.json — structured data
income_statement.csv, balance_sheet.csv, cash_flow.csv, key_ratios.csv — financials
draft_report_body.md, draft_report_conclusion.md, draft_intro.md — draft sections
report_body.md, report_critique.md, report_body_final.md — critique/revise cycle
Perplexity research, SEC filing extracts, Wikipedia summaries

Setup

Prerequisites

Python 3.10+
uv package manager
Claude Code CLI
System libraries: pandoc, ta-lib

Install

# Install system dependencies (macOS)
brew install pandoc ta-lib
export TA_INCLUDE_PATH="$(brew --prefix ta-lib)/include"
export TA_LIBRARY_PATH="$(brew --prefix ta-lib)/lib"

# Install Python dependencies
uv sync

Environment

Create a .env file in the project root:

ANTHROPIC_API_KEY=...
PERPLEXITY_API_KEY=...
SEC_FIRM=...
SEC_USER=...
OPENBB_PAT=...
FINNHUB_API_KEY=...
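Before a long run it can help to confirm that the .env file defines every key listed above. The check below is a hypothetical helper, not part of the project, and it assumes all six keys are required, which the README does not state explicitly.

```python
from pathlib import Path
from typing import Dict, List, Mapping

# Keys taken from the .env example above; treating all as required is an assumption.
REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "PERPLEXITY_API_KEY",
    "SEC_FIRM",
    "SEC_USER",
    "OPENBB_PAT",
    "FINNHUB_API_KEY",
]


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```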

Usage

Full pipeline

./research.py SYMBOL [--dag dags/sra.yaml] [--date YYYYMMDD]

The orchestrator validates the DAG, initializes the database, then executes waves of tasks in dependency order, dispatching the tasks within each wave in parallel. Failed tasks are skipped automatically and the run continues.
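The continue-on-failure behaviour can be sketched with asyncio.gather and return_exceptions=True. This is an illustration only: the names run_waves, ok, and boom are hypothetical, tasks are modelled as coroutines rather than subprocesses, and the README does not say how the real orchestrator treats tasks that depend on a failed one, so the sketch simply moves on to the next wave.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

TaskFn = Callable[[], Awaitable[None]]


async def run_waves(waves: List[Dict[str, TaskFn]]) -> Dict[str, str]:
    """Run each wave in parallel, recording failures instead of aborting."""
    status: Dict[str, str] = {}
    for wave in waves:
        names = list(wave)
        # return_exceptions=True hands exceptions back as results,
        # so one failing task does not cancel its siblings.
        results = await asyncio.gather(
            *(wave[name]() for name in names), return_exceptions=True
        )
        for name, result in zip(names, results):
            failed = isinstance(result, BaseException)
            status[name] = "failed" if failed else "complete"
    return status


async def ok() -> None:
    await asyncio.sleep(0)


async def boom() -> None:
    raise RuntimeError("simulated failure")


def demo_run() -> Dict[str, str]:
    return asyncio.run(run_waves([{"a": ok, "b": boom}, {"c": ok}]))
```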

Individual data scripts

Each data-gathering script runs standalone:

uv run ./skills/fetch_profile/fetch_profile.py AMD --workdir work/AMD_20260225
uv run ./skills/fetch_technical/fetch_technical.py AMD --workdir work/AMD_20260225
uv run ./skills/fetch_fundamental/fetch_fundamental.py AMD --workdir work/AMD_20260225
uv run ./skills/fetch_perplexity/fetch_perplexity.py AMD --workdir work/AMD_20260225
uv run ./skills/fetch_edgar/fetch_edgar.py AMD --workdir work/AMD_20260225
uv run ./skills/fetch_wikipedia/fetch_wikipedia.py AMD --workdir work/AMD_20260225
uv run ./skills/fetch_perplexity_analysis/fetch_perplexity_analysis.py AMD --workdir work/AMD_20260225

Database CLI

uv run ./skills/db.py init --workdir work/AMD_20260225 --dag dags/sra.yaml --ticker AMD
uv run ./skills/db.py task-ready --workdir work/AMD_20260225
uv run ./skills/db.py status --workdir work/AMD_20260225

Template rendering

# Generic template renderer
./skills/render_template.py \
  --template templates/assemble_report.md.j2 \
  --output work/AMD_20260225/artifacts/report_body.md \
  --json work/AMD_20260225/artifacts/profile.json \
  --file intro=work/AMD_20260225/artifacts/draft_intro.md \
  --file body=work/AMD_20260225/artifacts/draft_report_body.md

# Final report assembly (loads all artifacts automatically)
./skills/render_final.py --workdir work/AMD_20260225

Project Structure

├── dags/
│   └── sra.yaml                    # DAG definition (14 tasks, v2 schema)
├── skills/
│   ├── db.py                       # SQLite state management CLI
│   ├── schema.py                   # Pydantic DAG validation models
│   ├── config.py                   # Centralized constants
│   ├── utils.py                    # Shared utilities
│   ├── render_template.py          # Generic Jinja2 renderer
│   ├── render_final.py             # Final report assembly
│   ├── fetch_profile/              # Company profile + peers
│   ├── fetch_technical/            # Chart + technical indicators
│   ├── fetch_fundamental/          # Financials, ratios, analyst data
│   ├── fetch_perplexity/           # News, profiles, executives
│   ├── fetch_perplexity_analysis/  # Business model, competitive, risk
│   ├── fetch_edgar/                # SEC filings
│   └── fetch_wikipedia/            # Wikipedia summary
├── templates/
│   ├── assemble_report.md.j2       # Section concatenation
│   └── final_report.md.j2          # Final formatted report
├── research.py                     # Async DAG orchestrator (entry point)
├── tests/
│   ├── test_db.py
│   └── test_schema.py
└── work/                           # Output (one dir per run)
    └── {SYMBOL}_{DATE}/
        ├── research.db
        └── artifacts/

Script Conventions

All Python scripts follow a consistent pattern:

#!/usr/bin/env python3 shebang
Import constants from config.py, utilities from utils.py
pathlib.Path for all path operations
logger = setup_logging(__name__) for output (stderr only)
JSON manifest to stdout: {"status": "complete", "artifacts": [...], "error": null}
Exit codes: 0 = success, 1 = partial, 2 = failure
Type hints on all functions, specific exception handling
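Put together, the conventions above suggest a script skeleton along the following lines. It is a sketch under assumptions: setup_logging here is a local stand-in for the helper in utils.py, the status strings "partial" and "failed" are guesses (only "complete" appears above), and artifacts are represented as path strings.

```python
#!/usr/bin/env python3
import json
import logging
import sys
from pathlib import Path
from typing import Any, Dict, List, Optional


def setup_logging(name: str) -> logging.Logger:
    """Stand-in for utils.setup_logging: send log output to stderr only."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stderr))
    logger.setLevel(logging.INFO)
    return logger


logger = setup_logging(__name__)

# Exit codes from the conventions: 0 = success, 1 = partial, 2 = failure.
EXIT_CODES = {"complete": 0, "partial": 1, "failed": 2}


def build_manifest(artifacts: List[Path], error: Optional[str]) -> Dict[str, Any]:
    """Derive the status from what was produced and whether an error occurred."""
    if error is None:
        status = "complete"
    elif artifacts:
        status = "partial"  # some output was written before the error
    else:
        status = "failed"
    return {
        "status": status,
        "artifacts": [str(path) for path in artifacts],
        "error": error,
    }


def emit(manifest: Dict[str, Any]) -> int:
    """Print the JSON manifest to stdout and return the matching exit code."""
    logger.info("finished with status %s", manifest["status"])
    print(json.dumps(manifest))
    return EXIT_CODES[manifest["status"]]
```

A real script would end by passing the result of emit(...) to sys.exit, so the orchestrator can read the manifest from stdout and the outcome from the exit code while log lines stay on stderr.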
