AI-powered equity intelligence for buy-side analysts.
AIphaWatch is a multi-tenant SaaS platform that ingests SEC EDGAR filings, financial data, and news to generate structured AI analyst briefs and power a conversational RAG chat interface — reducing company research from hours to minutes.
- Analyst Briefs — 8-section AI-generated briefs (snapshot, what changed, risk flags, sentiment, executive summary) via LangGraph + AWS Bedrock. Fan-out parallelism keeps P95 generation under 15 s.
- RAG Chat — Multi-turn conversational interface grounded in real EDGAR source data. Responses stream sentence-by-sentence via Server-Sent Events with inline citations.
- Chunk Cache — `retrieved_chunk_ids` persisted in `ChatSession` avoids re-embedding on follow-up questions. Target >70% cache hit rate per session.
- Rolling Context Summary — Conversations beyond 20 messages are compressed into a rolling summary (~300 tokens) so the LLM context window stays bounded regardless of session length.
- Watchlist Dashboard — Monday-morning digest sorted by most material changes across your tracked companies.
- Automated Ingestion — SEC EDGAR filings, financial snapshots (Alpha Vantage), and news (NewsAPI) ingested on schedule via Celery + Redis.
- Multi-Tenant — Tenant isolation enforced at the repository layer with PostgreSQL RLS as defence-in-depth.
- Citation-Backed — Every claim in the executive summary traces to a retrieved source chunk. Hallucinated citations are dropped post-generation.
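The post-generation citation guard can be sketched roughly as follows. This is an illustrative stand-in, not the actual implementation: the `[chunk:<id>]` marker format, function name, and cleanup behaviour are assumptions.

```python
# Illustrative sketch of a post-generation citation guard (hypothetical
# marker format and names): any citation marker whose chunk ID was never
# actually retrieved for this brief is dropped from the summary text.
import re

def drop_hallucinated_citations(summary: str, retrieved_chunk_ids: set[str]) -> str:
    """Remove [chunk:<id>] markers whose id was never retrieved."""
    def keep_or_drop(match: re.Match) -> str:
        chunk_id = match.group(1)
        # Keep the marker only if the model cited a real retrieved chunk.
        return match.group(0) if chunk_id in retrieved_chunk_ids else ""
    return re.sub(r"\[chunk:([\w-]+)\]", keep_or_drop, summary)

text = "Revenue grew 12% [chunk:abc123]. Margins expanded [chunk:zzz999]."
print(drop_hallucinated_citations(text, {"abc123"}))
# -> "Revenue grew 12% [chunk:abc123]. Margins expanded ."
```

The key property is that the guard is a pure post-processing step: generation is never retried, invalid citations simply disappear.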
| Layer | Technology |
|---|---|
| Frontend | React 18 + TypeScript + shadcn/ui + Tailwind CSS |
| API | Python / FastAPI + Uvicorn |
| Agent Orchestration | LangGraph + AWS Bedrock (Claude 3.5 Sonnet / Haiku) |
| Scheduler | Celery + Redis |
| Database | PostgreSQL 16 (RDS) + pgvector |
| Embeddings | Amazon Titan Embeddings v2 (1536-dim) |
| Auth | AWS Cognito (JWT, multi-tenant) |
| Storage | S3 |
| IaC | Terraform |
| CI/CD | GitHub Actions |
LangGraph owns all stateful, multi-step workflows. Celery owns the clock. They meet at a well-defined boundary: Celery enqueues a task with an input payload; LangGraph executes the graph and writes results to Postgres.
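That boundary can be illustrated with a stdlib-only sketch (all names hypothetical; in the real code the task function would carry a Celery task decorator and the graph call would be a compiled LangGraph invocation):

```python
# Sketch of the Celery -> LangGraph boundary. Celery supplies only the
# schedule and the input payload; the graph owns all multi-step logic and
# persists its own results, so the task body stays trivially thin.

def run_brief_graph(payload: dict) -> dict:
    """Stand-in for compiled_graph.invoke(payload) — the LangGraph side."""
    return {"company_id": payload["company_id"], "status": "stored"}

def generate_brief_task(company_id: str) -> dict:
    """The Celery side: build the payload, invoke the graph, return.

    No business logic lives here — results are written to Postgres by the
    graph's own store node, not by the task.
    """
    payload = {"company_id": company_id, "trigger": "scheduled"}
    return run_brief_graph(payload)

print(generate_brief_task("AAPL"))  # -> {'company_id': 'AAPL', 'status': 'stored'}
```

Keeping the task body this thin means retries, scheduling, and backoff stay purely a Celery concern, while graph changes never touch the queue layer.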
Source file: `docs/api-bedrock-pgvector-redis-ecs.mmd`
How to read this diagram: follow solid arrows for primary request/task flow, dashed arrows for returned context or asynchronous feedback, and use the "What it shows" notes as quick summaries of each core subsystem's role.
```mermaid
flowchart LR
    classDef api fill:#0b3954,color:#ffffff,stroke:#082a3f,stroke-width:1px
    classDef compute fill:#bfd7ea,color:#102a43,stroke:#5c7c8a,stroke-width:1px
    classDef data fill:#c7f9cc,color:#1b4332,stroke:#52b788,stroke-width:1px
    classDef ai fill:#ffe8a3,color:#5f3b00,stroke:#d4a017,stroke-width:1px
    classDef edge fill:#f8f9fa,color:#343a40,stroke:#adb5bd,stroke-width:1px
    classDef note fill:#fff3bf,color:#5f3b00,stroke:#e9c46a,stroke-width:1px

    Analyst["Analyst / Frontend"]:::edge
    ALB["ALB / HTTPS Entry"]:::edge

    subgraph ECSCluster["AWS ECS Cluster"]
        API["FastAPI Service\nREST + SSE endpoints"]:::api
        Worker["Celery Worker Service\nasync ingestion + generation jobs"]:::compute
    end

    subgraph DataPlane["State + Retrieval"]
        PG[("PostgreSQL + pgvector\napp data + vector search")]:::data
        Redis[("Redis\nbroker + cache + transient state")]:::data
    end

    subgraph AIPlane["Model Runtime"]
        Bedrock["AWS Bedrock\nClaude + Titan Embeddings"]:::ai
    end

    subgraph Notes["What it shows"]
        N1["What it shows: The API service in ECS is the central integration point for user traffic and system orchestration."]:::note
        N2["What it shows: Redis brokers async work to Celery and also supports cache/state coordination for low-latency API paths."]:::note
        N3["What it shows: PostgreSQL + pgvector stores operational data and powers vector retrieval returned to API and chat flows."]:::note
        N4["What it shows: Bedrock handles both embeddings and LLM generation for chat, briefs, and sentiment tasks."]:::note
    end

    Analyst -->|"UI calls /api/*"| ALB
    ALB -->|"HTTP / SSE"| API
    API -->|"CRUD, briefs, chat sessions"| PG
    API -->|"chunk similarity search"| PG
    API -->|"session cache, broker coordination"| Redis
    API -->|"chat, brief, sentiment, embeddings"| Bedrock
    API -->|"enqueue background work"| Redis
    Redis -->|"task queue"| Worker
    Worker -->|"ingestion writes, brief storage, vectors"| PG
    Worker -->|"embeddings, scoring, summarization"| Bedrock
    PG -.->|"retrieved chunks + company state"| API
    Redis -.->|"cache hits / async signals"| API
    Worker -.->|"briefs and chat context become queryable"| API
    N1 -.-> API
    N2 -.-> Redis
    N3 -.-> PG
    N4 -.-> Bedrock
```
Five section builders run in parallel via LangGraph Send after chunk retrieval. The executive summary runs last, after fan-in, so it can only synthesise information that was surfaced in the parallel sections.
```
retrieve_chunks (Titan Embeddings v2 → pgvector cosine search, top-8 EDGAR chunks)
  └─ Send (parallel fan-out)
       ├─ build_snapshot (data-driven — latest FinancialSnapshot, no LLM)
       ├─ build_what_changed (data-driven — snapshot diff, 0.5% threshold, no LLM)
       ├─ build_risk_flags (Claude Sonnet — up to 5 flags, sorted by severity)
       ├─ build_sentiment (data-driven — pre-computed SentimentRecord aggregates)
       └─ build_sources (data-driven — deduplicated citation list)
  └─ assemble_sections (fan-in — collects and sorts all parallel outputs)
  └─ build_executive_summary (Claude Sonnet — synthesises sections, citation guard)
  └─ build_suggested_followups (Claude Haiku — 4 follow-up chips)
  └─ store_brief (persists AnalystBrief + BriefSection rows)
  └─ handle_errors → END
```
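The fan-out/fan-in shape above can be sketched with the standard library (the real pipeline uses LangGraph's Send API; the node names below mirror the graph but the bodies are stand-ins):

```python
# Conceptual stdlib sketch of the Send fan-out and assemble_sections fan-in.
from concurrent.futures import ThreadPoolExecutor

def build_snapshot(chunks):      return ("snapshot", "latest financials")
def build_what_changed(chunks):  return ("what_changed", "diff vs prior snapshot")
def build_risk_flags(chunks):    return ("risk_flags", "LLM-ranked flags")
def build_sentiment(chunks):     return ("sentiment", "aggregated records")
def build_sources(chunks):       return ("sources", "deduplicated citations")

BUILDERS = [build_snapshot, build_what_changed, build_risk_flags,
            build_sentiment, build_sources]
SECTION_ORDER = ["snapshot", "what_changed", "risk_flags", "sentiment", "sources"]

def assemble_sections(chunks: list[str]) -> dict[str, str]:
    # Fan-out: all five builders run concurrently over the same chunks.
    with ThreadPoolExecutor(max_workers=5) as pool:
        results = dict(pool.map(lambda build: build(chunks), BUILDERS))
    # Fan-in: collect and sort into a stable section order.
    return {name: results[name] for name in SECTION_ORDER}

sections = assemble_sections(["chunk-1", "chunk-2"])
# Only after fan-in does the executive summary run, so it can only cite
# material that was surfaced by the parallel sections.
assert list(sections) == SECTION_ORDER
```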
```
prepare_context (loads ChatSession from DB; builds context window:
                 rolling summary + last 10 raw messages)
  └─ detect_intent (Claude Haiku — 'rag' | 'comparison' | 'general';
                    falls back to 'rag' on any error)
       ├─ (rag / comparison) → check_chunk_cache
       │    ├─ (cache hit, >70% target) ──────────────────→ generate_response
       │    └─ (cache miss) → retrieve_chunks
       │         ├─ (comparison) → competitor_lookup → generate_response
       │         └─ (no comparison) → generate_response
       └─ (general) ─────────────────────────────────────→ generate_response
  └─ generate_response (Claude Sonnet — context + chunks + competitor data + question)
  └─ generate_followups (Claude Haiku — 3 follow-up chips)
  └─ persist_turn (appends messages; merges new chunk IDs into cache)
  └─ maybe_summarize (Claude Haiku — triggers at >20 msgs)
  └─ handle_errors → END
```
`POST /api/chat/sessions/{session_id}/messages` streams a sequence of Server-Sent Events:

```
data: {"type": "token", "token": "Apple's revenue grew..."}   ← one per sentence
data: {"type": "token", "token": "...12% year-over-year. "}
data: {"type": "citations", "citations": [{"chunk_id": "...", "title": "Apple 10-K 2025", "source_type": "edgar_10k", "source_url": "...", "excerpt": "..."}]}
data: {"type": "followups", "questions": ["What drove the growth?", "How do margins compare YoY?", "What is the capex outlook?"]}
data: {"type": "done", "session_id": "<uuid>"}

# On error:
data: {"type": "error", "message": "An unexpected error occurred. Please try again."}
```

Response headers: `Content-Type: text/event-stream`, `Cache-Control: no-cache`, `X-Accel-Buffering: no`
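Consuming this stream is straightforward on any client. A minimal stdlib parser for the frames above (the real frontend uses a fetch-based reader in `src/hooks/useSSE.ts`; this Python sketch just shows the wire format):

```python
# Minimal parser for the SSE frames shown above: each frame is a
# "data: <json>" line terminated by a blank line.
import json

def parse_sse(stream_text: str) -> list[dict]:
    events = []
    for line in stream_text.splitlines():
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events

raw = (
    'data: {"type": "token", "token": "Apple\'s revenue grew..."}\n\n'
    'data: {"type": "done", "session_id": "abc"}\n\n'
)
events = parse_sse(raw)
assert [e["type"] for e in events] == ["token", "done"]
# Concatenating token events in order reconstructs the assistant answer.
answer = "".join(e["token"] for e in events if e["type"] == "token")
```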
```
fetch_news → parse_articles → store_articles → score_sentiments → store_sentiments → handle_errors → END
  └─ (no articles found) ─────────────────────────────────────────────────────→ handle_errors
```

```
fetch_filings → parse_documents → chunk_documents → embed_chunks → store_chunks → handle_errors → END
  └─ (no new filings) ────────────────────────────────────────────────────────→ handle_errors
```
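The `chunk_documents` step can be sketched as a simple overlapping-window splitter (the size and overlap values here are illustrative assumptions, not the real chunker's parameters):

```python
# Hedged sketch of a chunk_documents-style splitter: filing text is cut
# into fixed-size windows with overlap, so content at a boundary appears
# in two consecutive chunks and survives retrieval.

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # advance by the non-overlapping stride
    return chunks

chunks = chunk_text("x" * 2000, size=800, overlap=100)
assert len(chunks) == 3
assert chunks[0][-100:] == chunks[1][:100]  # consecutive chunks share overlap
```

Each chunk would then be embedded (Titan Embeddings v2) and stored alongside its vector in pgvector by the `embed_chunks` → `store_chunks` nodes.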
| Method | Path | Description | Auth |
|---|---|---|---|
| GET | `/health` | ALB health check | none |
| GET | `/api/companies/resolve?q={query}` | Resolve ticker/name to canonical company | analyst |
| GET | `/api/companies/{company_id}` | Get company by UUID | analyst |
| GET | `/api/watchlist` | List user watchlist with company data | analyst |
| POST | `/api/watchlist` | Add company by ticker | analyst |
| DELETE | `/api/watchlist/{company_id}` | Remove from watchlist | analyst |
| POST | `/api/ingestion/trigger` | Manually trigger EDGAR ingestion | admin |
| GET | `/api/companies/{company_id}/brief` | Get latest brief with all sections | analyst |
| POST | `/api/companies/{company_id}/brief/generate` | Force-generate a new brief | analyst |
| GET | `/api/companies/{company_id}/briefs` | List recent briefs (metadata only) | analyst |
| GET | `/api/companies/{company_id}/brief/{brief_id}/sections` | Get sections for a specific brief | analyst |
| POST | `/api/chat/sessions` | Create new chat session | analyst |
| GET | `/api/chat/sessions?company_id={id}` | List sessions for a company | analyst |
| GET | `/api/chat/sessions/{session_id}` | Get session metadata | analyst |
| DELETE | `/api/chat/sessions/{session_id}` | Delete session (ownership-enforced, 204) | analyst |
| GET | `/api/chat/sessions/{session_id}/messages` | Get full message history | analyst |
| POST | `/api/chat/sessions/{session_id}/messages` | Send message — returns SSE stream | analyst |
| GET | `/api/dashboard` | Watchlist digest sorted by most changed | analyst |
The React chat streaming UI is implemented under `src/` with a container-driven architecture, Zustand state, and SSE parsing utilities:

- `src/components/chat/ChatContainer.tsx` — session lifecycle + send flow orchestration
- `src/components/chat/MessageList.tsx` — scrollable message thread with stable message keys
- `src/components/chat/MessageBubble.tsx` — role-based rendering, citations, follow-ups, streaming indicator
- `src/components/chat/InlineCitation.tsx` — inline source link rendering
- `src/components/chat/FollowUpChips.tsx` — clickable follow-up prompts
- `src/components/chat/CompanyContextBanner.tsx` — active company context header
- `src/components/chat/StreamingIndicator.tsx` — typing/streaming visual state
- `src/components/chat/ChatInput.tsx` — composer with send controls
- `src/hooks/useSSE.ts` — SSE fetch stream reader + event dispatch (`token`, `citations`, `followups`, `done`, `error`)
- `src/stores/chatStore.ts` — Zustand store for messages/session/streaming and stream lifecycle actions (`startAssistantStream`, `appendToken`, `finishStream`, `failStream`)
- `src/lib/sse.ts` — typed SSE event parsing
- `src/lib/api.ts` — typed API wrappers for session and message calls
PR hardening fixes included in Step 11:
- Empty placeholder assistant bubble on fetch failure is handled via `failStream(errorText)` (no blank bubble remains)
- Message list uses stable generated `id` keys (no array-index keys)
- Cross-company stale messages are cleared by resetting store state when navigating to a company without a pre-existing session
- Python 3.12+
- uv (recommended for environment management)
- Node.js 20+
- Docker Engine + Docker Compose plugin
- AWS CLI configured with appropriate credentials
- Terraform 1.6+
Start PostgreSQL (with pgvector) and Redis:
```shell
docker compose up -d postgres redis
```

Validate both are healthy:

```shell
docker compose ps
```

The local compose defaults match application defaults in `alphawatch/config.py`:

- Postgres: `localhost:5432`, database `alphawatch`, user `alphawatch`
- Redis: `localhost:6379`

Stop services when done:

```shell
docker compose down
```

```shell
# Install dependencies
uv sync

# Run database migrations
uv run alembic upgrade head

# Run the API server
uv run uvicorn alphawatch.api.main:app --reload

# Verify health endpoint
curl http://localhost:8000/health
# {"status": "ok"}
```

```shell
npm install
npm run dev
```

The `infra/` directory contains 8 Terraform modules and 2 environment configurations:
```shell
# Initialize and deploy staging
cd infra/environments/staging
terraform init
terraform plan
terraform apply

# Initialize and deploy production
cd infra/environments/production
terraform init
terraform plan
terraform apply
```

Modules: `vpc`, `rds`, `elasticache`, `ecs`, `s3`, `cognito`, `cloudfront`, `secrets`
Environments: staging (cost-optimized), production (Multi-AZ, auto-scaling)
GitHub Actions workflows are defined under `.github/workflows/`:

- `ci.yml` — PR + main quality gates (pytest, mypy, ruff, frontend tests, TS typecheck)
- `build-artifacts.yml` — reusable image build/push + frontend build artifact
- `deploy-staging.yml` — main branch deploy to staging via Terraform + ECS stability waits + smoke tests
- `release-prod.yml` — production release via tag/manual trigger (no approval gate for solo operation)
The deploy workflows read the following repository configuration values:

- `AWS_REGION` — AWS region (e.g. `us-east-1`)
- `AWS_ROLE_ARN_STAGING` — OIDC-assumable role for staging deploy
- `AWS_ROLE_ARN_PRODUCTION` — OIDC-assumable role for production deploy
- `ECR_API_REPOSITORY` — ECR repo name for API image
- `ECR_WORKER_REPOSITORY` — ECR repo name for worker image
- `DEPLOY_FRONTEND_STATIC` — optional (`true`/`false`); static S3 sync only when enabled
```shell
# 1) Re-run Terraform apply with previous known-good image tags
cd infra/environments/staging
terraform init
terraform apply \
  -var "api_image=<known-good-api-image-uri>" \
  -var "worker_image=<known-good-worker-image-uri>"

# 2) Wait for ECS services to stabilize
aws ecs wait services-stable --cluster <cluster-name> --services <api-service-name>
aws ecs wait services-stable --cluster <cluster-name> --services <worker-service-name>

# 3) Verify health endpoint
curl --fail http://<alb-dns-name>/health
```

```shell
# Backend tests
uv run pytest tests/ -v

# Backend with coverage
uv run pytest tests/ --cov=alphawatch --cov-report=term-missing

# Type checking
mypy alphawatch/

# Linting
ruff check alphawatch/

# Frontend tests
npm test

# Frontend type checking
npx tsc --noEmit
```

| Test File | Scope | Tests |
|---|---|---|
| `test_models.py` | ORM model registration + structure | 30 |
| `test_config.py` | Settings defaults, computed URLs, env overrides | 14 |
| `test_auth.py` | Bearer token extraction, `AuthError` | 9 |
| `test_dependencies.py` | `get_current_user`, `require_role` RBAC | 8 |
| `test_api.py` | Health endpoint, middleware auth, schemas | 10 |
| `test_companies.py` | Company schemas, auth enforcement, routing | 10 |
| `test_watchlist.py` | Watchlist schemas, auth enforcement, routing | 12 |
| `test_ingestion.py` | State types, chunker, EDGAR mapping, graph, endpoint | 29 |
| `test_financial.py` | Safe parsing, data classes, schemas, client, config | 25 |
| `test_sentiment.py` | NewsClient, BedrockClient, SentimentGraph, repository | 34 |
| `test_brief.py` | BriefState types, all nodes, BriefGraph, repositories | 71 |
| `test_chat.py` | ChatState types, all nodes, routing, repository, schemas, API | 95 |
| `test_briefs_api.py` | Brief API schemas and routing | 13 |
| `test_dashboard.py` | Dashboard schemas, routing, repository coverage | 26 |
| Total | | 401 |
| Phase | Status | Description |
|---|---|---|
| Phase 1 — MVP | ✅ Complete | Auth, watchlist, EDGAR ingestion, financial API, news, briefs, chat, dashboard, infra |
| Phase 2 — Intelligence | 🔧 In Progress | Full news depth, sentiment enrichment, risk flags, document upload, comparative intelligence |
| Phase 3 — SaaS Hardening | ⏳ Planned | Tenant branding, alerts, admin panel, bulk import, brief export, usage tracking |
| Phase 4 — Scale & Polish | ⏳ Planned | Earnings transcripts, watchlist sharing, scheduled briefs, comparison views, audit log |
- Step 1: Terraform — VPC, RDS, Redis, Cognito, ECS, S3, CloudFront, Secrets
- Step 2: Database schema — 12 ORM models, Alembic migration, HNSW index, RLS policies
- Step 3: FastAPI skeleton — Cognito JWT auth, TenantMiddleware, health endpoint
- Step 4: Company resolution + Watchlist CRUD endpoints
- Step 5: EDGAR ingestion — `IngestionGraph`, EDGAR client, chunker, embeddings
- Step 6: Financial API — Alpha Vantage client, `FinancialSnapshot` repository
- Step 7: News ingestion — NewsAPI client, BedrockClient, `SentimentGraph`
- Step 8: `BriefGraph` — all 8 sections with parallel fan-out
- Step 9: Brief API endpoints + Pydantic schemas
- Step 10: `ChatGraph` + SSE streaming endpoint
- Step 11: React `ChatContainer` + streaming UI
- Step 12: Dashboard endpoint + React `WatchlistGrid`
- Step 13: `PeersChips` + competitor detection in chat
- Step 14: CI/CD pipeline + staging deployment
- Step 1: Platform hardening baseline — IAM least-privilege cleanup, reproducible build locks, deployment guardrails
- Step 2: Runtime foundations rollout — provider factory wiring, config-driven selection, and chat summarization off the hot path
- Step 3: Full news ingestion depth — multi-source adapters, stronger deduplication, per-source quotas
- Step 4: Sentiment enrichment v2 — entity/aspect tagging, confidence scoring, trend normalization
- Step 5: Risk flag pipeline v2 — richer categories, deterministic severity calibration, persistence upgrades
- Step 6: Document upload ingestion — tenant-scoped upload API, parser/chunker/embed/store workflow
- Step 7: Hybrid retrieval policy — blend EDGAR and uploaded sources with source weighting and controls
- Step 8: Comparative intelligence expansion — competitor benchmark cards and comparison-aware prompt routing
- Step 9: Brief delta intelligence — cross-brief section diffs and "what materially changed" attribution
- Step 10: Alerts and delivery channels — threshold/rule engine plus email/slack delivery
- Step 11: Evaluation harness and quality gates — golden datasets, response scoring, regression checks in CI
- Step 12: Phase 2 release hardening — staging soak, runbooks, rollback drills, production cutover
- `docs/PRD-AIphaWatch-2026-03-25.md` — Product requirements
- `docs/AIphaWatch-TechnicalSpec.md` — Technical specification
- `docs/project-status.md` — Detailed phase and step tracking
- `AGENTS.md` — AI agent guidance, graph shapes, and architecture reference
- `developer/developer-journal.md` — Development log
- `docs/phase2-step1-hardening.md` — Step 1 execution checklist (IAM, CI gates, deploy guardrails, migration drill)
- `docs/phase2-step2-runtime-foundations-spec.md` — Step 2 architecture and execution spec
- `docs/phase2-step3-news-ingestion-spec.md` — Step 3 multi-source ingestion spec
- `docs/phase2-operations-runbook-news-sources.md` — News source operations runbook
- Phase 2 Epic (GitHub issue) — Phase 2 epic and issue breakdown
- Phase 2 Project (GitHub) — Phase 2 project board