A multi-agent AI system for stock analysis using 5 specialized agents, multi-round debate, ML ensemble predictions, and reinforcement learning paper trading validation.
Single-Perspective Bias — Traditional stock analysis tools provide a single viewpoint. A technical analyst ignores fundamentals. A sentiment analyst ignores chart patterns. Real investment decisions require synthesizing multiple perspectives—much as investment committees at hedge funds do.
Black Box Predictions — ML models that predict "BUY" or "SELL" without explanation are useless for real decisions. You need to understand why the system recommends something.
Unvalidated Recommendations — Most systems tell you what to do but never test whether their advice actually works.
Data Overload — Investors face information overload: SEC filings, news, social media, charts, economic indicators—scattered across dozens of sources.
MAIS solves all four problems with a multi-layered AI pipeline:
- 5 Specialized AI Agents — each fetches real data and analyzes independently using Claude AI
- 4-Round Structured Debate — agents challenge each other's reasoning, just like a real investment committee
- 5-Model ML Ensemble — LSTM, Transformer, XGBoost, Random Forest, and Prophet predict price direction
- RL Paper Trading Validator — a PPO reinforcement learning agent paper-trades the recommendation to verify it works
- Professional Reports — PDF reports with charts, full reasoning, and audit trails
Input: A stock ticker (e.g., AAPL)
Output: A BUY/HOLD/SELL recommendation with confidence score, 5 agent analyses, debate transcript, ML predictions, RL paper trading results, risk metrics, and a PDF report with charts.
Disclaimer: MAIS is for educational and research purposes only. It does not constitute financial advice. Always consult a licensed financial advisor before making investment decisions.
┌─────────────────────────────────────────────────────────────────────────┐
│ FRONTEND (Next.js 15) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐ │
│ │ Analysis │ │ Paper │ │ Training │ │ Reports + Risk │ │
│ │ Dashboard │ │ Trading │ │ Console │ │ Dashboard │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────────┘ │
├─────────────────────────────────────────────────────────────────────────┤
│ API LAYER (FastAPI) │
│ /analysis /paper-trading /training /reports /risk │
├─────────────────────────────────────────────────────────────────────────┤
│ ORCHESTRATION LAYER │
│ Master Orchestrator: Agents → Debate → ML → RL → Blending → Report │
├──────────┬──────────┬──────────┬──────────┬──────────┬─────────────────┤
│ AGENT │ DEBATE │ ML │ RL │ RISK │ OUTPUT │
│ LAYER │ ENGINE │ ENSEMBLE │ VALIDATOR│ ENGINE │ LAYER │
├──────────┼──────────┼──────────┼──────────┼──────────┼─────────────────┤
│ Trend │ Round 1: │ LSTM │ PPO │ VaR/CVaR │ PDF Reports │
│ Sentiment│ Positions│ Trans- │ Agent │ Stress │ Charts │
│ Risk │ Round 2: │ former │ │ Testing │ Memos │
│ Fundmtls │ Counters │ XGBoost │ Gym Env │ Beta │ Audit Logs │
│ Technical│ Round 3: │ R.Forest │ │ Drawdown │ │
│ │ Evidence │ Prophet │ │ │ │
│ │ Round 4: │ │ │ │ │
│ │ Voting │ │ │ │ │
├──────────┴──────────┴──────────┴──────────┴──────────┴─────────────────┤
│ DATA LAYER │
│ yfinance │ SEC EDGAR │ News APIs │ Reddit/StockTwits │ FRED/World Bank│
├─────────────────────────────────────────────────────────────────────────┤
│ LLM LAYER │
│ Claude (Primary) ──→ Gemini (Fallback) ──→ Ollama (Local Fallback) │
└─────────────────────────────────────────────────────────────────────────┘
- User enters a stock ticker (e.g., AAPL) in the frontend
- The API receives the request and passes it to the Master Orchestrator
- Orchestrator spawns 5 agents in parallel — each fetches relevant data using specialized tools
- Each agent uses Claude AI to analyze its data and form a position (bullish/bearish)
- Debate Engine runs 4 rounds of structured argument between agents
- Consensus Builder aggregates votes with confidence weighting
- ML Ensemble generates price direction predictions from 5 models
- RL Validator paper-trades the recommendation to validate it
- Blending Layer combines debate consensus + ML prediction + RL validation
- Report Generator creates audit-ready output with checksums
Each agent follows the ReAct (Reasoning + Acting) pattern: collect data → analyze with LLM → form position → debate.
| Agent | Expertise | Data Sources |
|---|---|---|
| Trend Agent | Price momentum, moving averages, trend strength | yfinance price history |
| Sentiment Agent | News sentiment, social media mood, fear/greed | News APIs, Reddit, StockTwits |
| Risk Agent | Volatility, VaR, correlation, drawdown risk | Price history, benchmark data |
| Fundamentals Agent | Financial statements, valuations, earnings quality | SEC EDGAR, yfinance fundamentals |
| Technical Agent | Chart patterns, RSI, MACD, Bollinger Bands | OHLCV data, technical indicators |
All 5 agents run in parallel — reducing total analysis time from ~2.5 minutes (sequential) to ~30 seconds.
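Concretely, the parallel fan-out can be sketched with `asyncio.gather` (a sketch, not the project's actual orchestrator code — the coroutine body here is a placeholder for the real data-fetch and LLM calls):

```python
import asyncio

# Hypothetical stand-in for a real agent, which would fetch data and call an LLM.
async def run_agent(name: str, ticker: str) -> dict:
    await asyncio.sleep(0.01)  # placeholder for I/O-bound data fetching + analysis
    return {"agent": name, "ticker": ticker, "stance": "bullish"}

async def analyze(ticker: str) -> list[dict]:
    agents = ["trend", "sentiment", "risk", "fundamentals", "technical"]
    # gather() runs all five coroutines concurrently, so total wall time is
    # close to the slowest single agent rather than the sum of all five.
    return await asyncio.gather(*(run_agent(a, ticker) for a in agents))

results = asyncio.run(analyze("AAPL"))
```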
Real investment committees have specialists who challenge each other's reasoning. MAIS simulates this:
| Round | Purpose | Example |
|---|---|---|
| Round 1 — Position Statements | Each agent presents its analysis and takes a stance (strongly bullish → strongly bearish) with confidence and cited data | "RSI at 72 suggests overbought conditions" |
| Round 2 — Counter-Arguments | Agents challenge each other, targeting positions most different from their own | "The Technical Agent ignores that social sentiment just turned positive" |
| Round 3 — Evidence & Rebuttals | Agents respond to challenges with strongest evidence, may update confidence | "While social sentiment is positive, my RSI concern stands because..." |
| Round 4 — Final Voting | Each agent casts a final vote (BUY/HOLD/SELL) with updated confidence. Dissent notes capture remaining concerns | "I vote BUY but worry about the earnings date risk" |
Consensus Building uses confidence-weighted voting — an agent 90% sure of BUY counts more than one 51% sure:
Votes: [BUY(0.8), BUY(0.6), HOLD(0.5), SELL(0.7), BUY(0.9)]
BUY weight: 0.8 + 0.6 + 0.9 = 2.3
HOLD weight: 0.5
SELL weight: 0.7
→ BUY with 65.7% consensus strength
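In code, confidence-weighted voting comes down to a few lines (a sketch, not the project's exact consensus builder):

```python
from collections import defaultdict

def weighted_consensus(votes: list[tuple[str, float]]) -> tuple[str, float]:
    """Aggregate (action, confidence) votes; each vote counts by its confidence."""
    weights: dict[str, float] = defaultdict(float)
    for action, confidence in votes:
        weights[action] += confidence
    winner = max(weights, key=weights.get)
    # Consensus strength = winning weight as a share of all weight cast.
    strength = weights[winner] / sum(weights.values())
    return winner, strength

# The example from above: 2.3 / (2.3 + 0.5 + 0.7) ≈ 0.657
action, strength = weighted_consensus(
    [("BUY", 0.8), ("BUY", 0.6), ("HOLD", 0.5), ("SELL", 0.7), ("BUY", 0.9)]
)
```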
Five different ML models predict whether a stock's price will go UP, DOWN, or stay FLAT over the next 5 days:
| Model | Architecture | Why It's Good for Stocks |
|---|---|---|
| LSTM | 3 layers, 128 hidden units, bidirectional + attention | Captures temporal dependencies — today's price depends on yesterday's |
| Transformer | 6 encoder layers, 8 attention heads, 256-dim embeddings | Self-attention finds relationships between any two time points |
| XGBoost | 500 trees, max depth 8, learning rate 0.05 | Excels at non-linear relationships: "RSI > 70 AND volume declining → reversal" |
| Random Forest | 500 trees, max depth 12 | Robust to outliers and noise, provides feature importance rankings |
| Prophet | Facebook's time series model | Captures weekly patterns (Monday dips, Friday rallies) and seasonality |
The models are combined with a Ridge regression meta-learner that automatically learns optimal weights — not just simple majority voting:
1. Each model predicts: [-0.02, +0.01, +0.03, +0.01, +0.02]
2. Meta-learner weights: [0.25, 0.30, 0.20, 0.15, 0.10]
3. Weighted average: +0.0075 (+0.75%)
4. Direction: UP (> 0.5% threshold)
5. Confidence: 1 / (1 + std_deviation × 100) — high agreement = high confidence
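The blending steps above can be sketched directly (threshold and confidence formula taken from the list; `blend` is a hypothetical name, and the weighted average of these example numbers works out to +0.0075):

```python
import statistics

def blend(predictions: list[float], weights: list[float],
          threshold: float = 0.005) -> tuple[float, str, float]:
    """Combine per-model return predictions using learned meta-weights."""
    avg = sum(p * w for p, w in zip(predictions, weights))
    if avg > threshold:
        direction = "UP"
    elif avg < -threshold:
        direction = "DOWN"
    else:
        direction = "FLAT"
    # Models that disagree widely lower the confidence score.
    confidence = 1 / (1 + statistics.pstdev(predictions) * 100)
    return avg, direction, confidence

avg, direction, conf = blend(
    [-0.02, 0.01, 0.03, 0.01, 0.02],  # per-model 5-day return predictions
    [0.25, 0.30, 0.20, 0.15, 0.10],   # meta-learner weights
)
```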
Feature Engineering: 60 engineered features from raw OHLCV data — price returns (6 timeframes), moving average ratios, volatility measures, RSI, MACD, Bollinger Bands, volume features, and lagged variants.
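A hypothetical slice of that feature set, sketched with pandas (column names are illustrative, not the project's actual ones):

```python
import pandas as pd

def engineer_features(close: pd.Series, volume: pd.Series) -> pd.DataFrame:
    """A small sample of the 60 features: returns, SMA ratio, volatility, RSI."""
    feats = pd.DataFrame(index=close.index)
    for n in (1, 5, 20):                       # a few of the six timeframes
        feats[f"ret_{n}d"] = close.pct_change(n)
    feats["sma20_ratio"] = close / close.rolling(20).mean()
    feats["vol_20d"] = close.pct_change().rolling(20).std()
    # Simple-moving-average RSI over 14 periods.
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    feats["rsi_14"] = 100 - 100 / (1 + gain / loss)
    # Volume z-score relative to its 20-day window.
    feats["vol_z"] = (volume - volume.rolling(20).mean()) / volume.rolling(20).std()
    return feats.dropna()
```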
A PPO (Proximal Policy Optimization) reinforcement learning agent validates recommendations by paper-trading with simulated money:
Trading Environment (Custom Gymnasium)
- State: 35 features (30 market features + 5 portfolio features)
- Action: Continuous position from -1 (100% short) to +1 (100% long)
- Initial Balance: $100,000 (simulated)
- Transaction Cost: 0.1% per trade (realistic broker fee)
- Slippage: 0.05% (simulates market impact)
Multi-Objective Reward Function:
Reward = 100 × daily_return ← main signal: make money
+ 0.1 × sharpe_component ← reward consistency
- 0.5 × excess_drawdown ← penalize big losses (>5% from peak)
- 10 × transaction_cost ← penalize overtrading
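As a sketch, with the weights shown above (treating "excess drawdown" as the portion of drawdown beyond 5% from peak, which is an assumption about the implementation):

```python
def reward(daily_return: float, sharpe_component: float,
           drawdown_from_peak: float, transaction_cost: float) -> float:
    """Multi-objective reward as a weighted sum; weights mirror rl_config.yaml."""
    # Assumed: only drawdown beyond the 5% threshold is penalized.
    excess_drawdown = max(0.0, drawdown_from_peak - 0.05)
    return (100 * daily_return          # main signal: make money
            + 0.1 * sharpe_component    # reward consistency
            - 0.5 * excess_drawdown     # penalize big losses
            - 10 * transaction_cost)    # penalize overtrading

r = reward(daily_return=0.004, sharpe_component=1.2,
           drawdown_from_peak=0.08, transaction_cost=0.001)
```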
PPO Neural Network:
- Actor (policy): 256 → 256 → 128 neurons with GELU activation
- Critic (value): 256 → 256 → 128 neurons with GELU activation
- Training: 500,000 timesteps (~2,000 years of simulated trading) with orthogonal initialization
Validation Logic: If the recommendation is BUY and the RL agent's average position is > 0.3 (long-biased), the RL agent agrees. Validation passes if total return > -5%, Sharpe ratio > -0.5, and agreement score > 0.6.
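That check can be sketched as follows (the SELL/HOLD agreement thresholds are assumed symmetric — the text above only specifies the BUY case):

```python
def validate(recommendation: str, avg_position: float,
             total_return: float, sharpe: float, agreement: float) -> bool:
    """Does the RL paper-trading run support the recommendation?"""
    if recommendation == "BUY":
        agrees = avg_position > 0.3       # long-biased, per the text above
    elif recommendation == "SELL":
        agrees = avg_position < -0.3      # assumed symmetric short bias
    else:
        agrees = abs(avg_position) <= 0.3 # assumed near-neutral for HOLD
    return (agrees
            and total_return > -0.05
            and sharpe > -0.5
            and agreement > 0.6)

ok = validate("BUY", avg_position=0.45, total_return=0.02,
              sharpe=0.8, agreement=0.7)
```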
The DecisionReportGenerator creates point-in-time, checksummed reports:
| Section | Contents |
|---|---|
| Header | Ticker, date, recommendation, confidence |
| Executive Summary | One-paragraph recommendation with key reasons |
| Agent Analyses | Each agent's stance, reasoning, evidence |
| Debate Transcript | All 4 rounds with arguments and counters |
| ML Predictions | 5-model predictions with confidence |
| RL Validation | Paper trading results, trade log |
| Risk Metrics | VaR, volatility, drawdown, beta |
| Counterfactuals | What-if scenarios (market crash, rate hike, sector rotation, earnings miss) |
| Charts | Price history, prediction vs. actual, agent positions |
| Audit Trail | Timestamps, SHA-256 checksums, model versions |
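The checksum idea in miniature (a sketch, not the actual DecisionReportGenerator API):

```python
import hashlib
import json
from datetime import datetime, timezone

def stamp_report(report: dict) -> dict:
    """Attach a point-in-time timestamp and a SHA-256 checksum of the content."""
    # Canonical serialization (sorted keys) so the hash is reproducible.
    body = json.dumps(report, sort_keys=True).encode()
    return {
        **report,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(body).hexdigest(),
    }

stamped = stamp_report({"ticker": "AAPL", "recommendation": "BUY", "confidence": 0.72})
# Any later edit to the report content will no longer match the stored checksum.
```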
| Technology | Version | Purpose |
|---|---|---|
| Python | 3.11+ | Primary backend language |
| FastAPI | Latest | High-performance async API framework |
| Pydantic v2 | Latest | Data validation and settings management |
| asyncio | Built-in | Parallel agent execution |
| uvicorn | Latest | ASGI server |
| Technology | Purpose |
|---|---|
| Anthropic Claude | Primary LLM for agent reasoning |
| Google Gemini | Fallback LLM provider |
| Ollama | Local LLM fallback (no API needed) |
| LangChain + LangGraph | LLM tool orchestration and agent framework |
| PyTorch | Deep learning (LSTM, Transformer) |
| scikit-learn | Traditional ML (Random Forest) + meta-learner |
| XGBoost | Gradient boosting |
| Prophet | Time series forecasting |
| Stable-Baselines3 | Reinforcement learning (PPO) |
| Gymnasium | RL environment framework |
| Technology | Purpose |
|---|---|
| Next.js 15 | React framework with App Router |
| React 19 | UI component library |
| Tailwind CSS | Utility-first styling |
| Plotly.js | Interactive charts |
| TypeScript | Type-safe JavaScript |
| Source | Data Type | Cost |
|---|---|---|
| yfinance | Price/volume history, fundamentals | Free |
| SEC EDGAR | 10-K, 10-Q, 8-K filings | Free |
| Finnhub | Real-time quotes, news | Free tier |
| GDELT | Global news sentiment | Free |
| StockTwits | Social sentiment | Free |
| Reddit API | r/wallstreetbets sentiment | Free tier |
| World Bank | Macro indicators | Free |
| FRED | Economic data | Free |
MAIS uses a fault-tolerant fallback chain to ensure reliability:
Claude (Primary, highest quality)
↓ on failure (rate limit, downtime, error)
Gemini (Secondary, different failure modes)
↓ on failure
Ollama (Local, always available, no cost)
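At its core the chain is a try-in-order loop; a sketch with hypothetical stand-in callables in place of the real API clients:

```python
def with_fallback(providers, prompt: str) -> str:
    """Try each provider in order; fall through when one raises."""
    last_error = None
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:   # rate limit, downtime, parse error, ...
            last_error = exc
    raise RuntimeError("all LLM providers failed") from last_error

# Hypothetical providers: the primary fails, so the chain falls through.
def claude(prompt):  raise TimeoutError("rate limited")
def gemini(prompt):  return f"gemini: {prompt[:20]}"
def ollama(prompt):  return f"ollama: {prompt[:20]}"

answer = with_fallback(
    [("claude", claude), ("gemini", gemini), ("ollama", ollama)],
    "Analyze AAPL momentum",
)
```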
Prompt Engineering Techniques:
- Structured Output: All prompts request JSON with specific schemas for parseable, consistent responses
- Role Assignment: Each agent has a detailed system prompt defining its expertise
- Few-Shot Examples: Critical prompts include examples of desired output format
- Temperature Control: Analysis uses temperature 0.3 (focused), debate uses 0.5 (more creative)
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | 3.12 or 3.13 work fine |
| Node.js | 18+ | For the frontend |
| Anthropic API Key | Required | Get at console.anthropic.com |
| 8GB+ RAM | Recommended | For ML training |
Optional (not required): PostgreSQL, Redis, Qdrant — the system works without them using in-memory fallbacks.
# 1. Clone the repository
git clone https://github.com/Dev-Sirbhaiya/FinBot.git
cd FinBot
# 2. Create and activate a virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# 3. Install Python dependencies (~40 packages including PyTorch)
pip install -e .
# 4. Create your environment file
cp .env.example .env
# Edit .env and add: ANTHROPIC_API_KEY=sk-ant-your-key-here
# 5. Install frontend dependencies
cd frontend
npm install
cd ..

Start the backend (Terminal 1):

python -m uvicorn src.api.main:app --reload --port 8000

Start the frontend (Terminal 2):

cd frontend
npm run dev

Open http://localhost:3000 in your browser.
With the backend running:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Health Check: http://localhost:8000/health
- Enter a stock ticker (e.g., AAPL, NVDA, TSLA) and click Analyze
- The pipeline runs: 5 agents analyze in parallel → 4-round debate → ML predictions → RL validation
- Results show agent reasoning, debate transcript, ML predictions, and RL validation
Test a specific investment hypothesis. Enter a ticker, choose a recommendation (BUY/SELL/HOLD) and confidence level. The RL agent simulates trades.
Manually train or retrain ML and RL models. View trained models, ensemble weights, and accuracy statistics.
View risk profiles (VaR, CVaR, beta, drawdown), run stress tests (market crash, flash crash, interest rate shock, sector rotation, etc.).
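For reference, historical-simulation VaR/CVaR can be computed in a few lines (a sketch under simple assumptions; the project's risk engine may use a different estimator, e.g. parametric VaR):

```python
import statistics

def historical_var_cvar(returns: list[float], level: float = 0.95):
    """Historical-simulation VaR and CVaR at the given confidence level.

    VaR is the loss at the (1 - level) quantile; CVaR is the average loss
    beyond that quantile. Both are returned as positive loss magnitudes.
    """
    ordered = sorted(returns)                   # worst returns first
    cutoff = max(int((1 - level) * len(ordered)), 1)
    tail = ordered[:cutoff]                     # the worst (1 - level) share
    var = -ordered[cutoff - 1]
    cvar = -statistics.mean(tail)
    return var, cvar

var, cvar = historical_var_cvar(
    [0.01, -0.02, 0.004, -0.015, 0.007, -0.03, 0.012, 0.002, -0.008, 0.005] * 10
)
```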
Generate PDF reports with charts, agent analyses, debate summaries, ML/RL results, and audit metadata with SHA-256 checksums.
On the first analysis for a new ticker, the system auto-trains ML and RL models:
| Stage | First Run (CPU) | Subsequent Runs |
|---|---|---|
| Agent analysis (5 parallel) | 15–30s | 15–30s |
| Debate (4 rounds) | 30–60s | 30–60s |
| ML training (5 models) | 5–20 min | 1–3s (cached) |
| RL training (500K steps) | 10–30 min | 1–3s (cached) |
| Total | ~20–55 min | ~1–2 min |
Trained models are saved to models/<TICKER>/ and reused on subsequent runs.
Speed up first run: Reduce `epochs` in `config/ml_models.yaml` and `total_timesteps` in `config/rl_config.yaml`. If you have a CUDA GPU, set `device: "cuda"` in `config/settings.py`.
mais/
├── config/ # Configuration (YAML + Python)
│ ├── settings.py # Pydantic settings hub (loads from .env + YAML)
│ ├── agents.yaml # Agent prompts, tools, timeouts
│ ├── ml_models.yaml # ML hyperparameters (LSTM, XGBoost, etc.)
│ ├── rl_config.yaml # RL training config (PPO, environment, reward)
│ ├── data_sources.yaml # Data source URLs and rate limits
│ └── agent_weights.json # Initial blending weights
│
├── src/ # Python backend
│ ├── agents/ # 5 financial agents + base class + tool definitions
│ ├── api/ # FastAPI routes, middleware, auth
│ ├── core/ # Shared utilities (caching, validation, audit)
│ ├── data/ # Data fetching, caching, rate limiting
│ ├── db/ # Database models (optional, in-memory fallback)
│ ├── debate/ # Debate engine + consensus builder
│ ├── llm/ # LLM providers (Claude, Gemini, Ollama) + fallback chain
│ ├── ml/ # ML models, ensemble, training, feature engineering
│ ├── orchestrator/ # Master orchestrator + pipeline coordination
│ ├── output/ # PDF generation, charts, memo rendering
│ ├── reports/ # Decision report system with checksums
│ ├── risk/ # Risk metrics, VaR/CVaR, stress testing, governance
│ ├── rl/ # RL agent (PPO), Gym environment, validator
│ ├── simulation/ # Backtesting engine, walk-forward testing
│ ├── tasks/ # Async task definitions
│ └── vision/ # Chart image processing
│
├── frontend/ # Next.js 15 + React 19 + Tailwind + TypeScript
│ └── src/
│ ├── app/ # Pages (analysis, paper-trading, training, risk, reports)
│ ├── components/ # Reusable components (AgentCard, DebateTimeline, etc.)
│ ├── hooks/ # Custom React hooks
│ └── lib/ # API client + TypeScript types
│
├── models/ # Saved ML/RL weights (auto-generated at runtime)
├── tests/ # Test suite (agents, API, debate, ML, RL)
├── scripts/ # Training scripts (train_ml_models.py, train_rl_agent.py)
├── data/ # Runtime data (reports, simulations, predictions)
├── .env.example # Environment variable template
├── pyproject.toml # Python dependencies and project metadata
├── Dockerfile # Production container (non-root, multi-layer cache)
└── docker-compose.yml # Full stack: API + frontend + PostgreSQL + Redis + Qdrant
| Endpoint | Method | Description |
|---|---|---|
| `/api/analysis/start` | POST | Start full analysis pipeline for a ticker |
| `/api/analysis/{task_id}/status` | GET | Check analysis progress |
| `/api/analysis/{task_id}/result` | GET | Get completed analysis results |
| `/api/paper-trading/run` | POST | Run paper trading simulation |
| `/api/training/ml/{ticker}` | POST | Train ML models for a ticker |
| `/api/training/rl/{ticker}` | POST | Train RL agent for a ticker |
| `/api/reports/generate/{task_id}` | POST | Generate PDF report |
| `/api/risk/profile/{ticker}` | GET | Get risk metrics |
| `/ws/analysis/{task_id}` | WebSocket | Real-time analysis progress updates |
| Decision | Chosen | Why |
|---|---|---|
| Multi-Agent vs. Single Agent | Multi-Agent | Investment decisions are multi-faceted. Single perspectives have blind spots. Debate forces consideration of all angles. |
| Ensemble vs. Single ML Model | Ensemble | Each model captures different patterns (LSTM: sequences, Prophet: seasonality, XGBoost: non-linear). Ensemble is more robust than any individual. |
| PPO vs. DQN/SAC | PPO | PPO's clipping mechanism prevents catastrophic policy updates — crucial for financial applications where stability > marginal performance. |
| Fallback Chain vs. Single LLM | Fallback Chain | API rate limits and outages are common. Fallback ensures the system works even when Claude's API is down. |
| Confidence-Weighted Voting | Over simple majority | An agent 90% sure of BUY should count more than one 51% sure. Captures nuance in consensus. |
# Run all tests
pytest
# Run specific category
pytest tests/test_ml/
pytest tests/test_debate/
pytest tests/test_rl/
# Run with coverage
pytest --cov=src --cov-report=html

Test suite covers:
- Agent logic — weight calculation, circuit breaker, JSON parsing
- API endpoints — analysis routes, error sanitization
- Debate engine — consensus building, voting mechanics
- ML pipeline — ensemble predictions, data validation, market data caching
- RL environment — trading environment step/reset, reward calculation
# Build and run all services
docker-compose up -d
# Services started:
# - mais-api: FastAPI backend on :8000
# - mais-frontend: Next.js on :3000
# - postgres: Database on :5432
# - redis: Cache on :6379
# - qdrant: Vector DB on :6333
# View logs
docker-compose logs -f mais-api
# Stop all
docker-compose down

LLMSettings:
anthropic_api_key # From ANTHROPIC_API_KEY env var
claude_model # Default: claude-sonnet-4-20250514
fallback_chain # [CLAUDE, GEMINI, OLLAMA]
MLSettings:
lookback_window: 60 # Days of history for features
prediction_horizon: 5 # Days ahead to predict
model_dir: "models" # Where to save trained models
RLSettings:
total_timesteps: 500000
learning_rate: 0.0003
initial_balance: 100000
orchestrator:
max_parallel_agents: 5 # Run all 5 simultaneously
debate_rounds: 4
analysis_timeout: 300 # 5 min max per analysis
agents:
trend:
indicators: [sma_20, sma_50, rsi, macd, ...]
sentiment:
news_lookback_days: 7
subreddits: [wallstreetbets, stocks, investing]
use_finbert: true

lstm:
hidden_size: 128
num_layers: 3
bidirectional: true
attention_heads: 8
epochs: 100
xgboost:
n_estimators: 500
max_depth: 8
ensemble:
meta_learner: "ridge"
ridge_alpha: 1.0

environment:
initial_balance: 100000
transaction_cost: 0.001
slippage: 0.0005
reward:
return_weight: 100.0
sharpe_weight: 0.1
drawdown_penalty: 0.5
turnover_penalty: 10.0
ppo:
learning_rate: 0.0003
n_steps: 2048
batch_size: 64
gamma: 0.99
clip_range: 0.2

Add any of these to `.env` for additional data sources:
| Key | Source | Purpose |
|---|---|---|
| `GOOGLE_API_KEY` | Google Gemini | LLM fallback provider |
| `NEWSAPI_KEY` | NewsAPI | News articles |
| `FINNHUB_API_KEY` | Finnhub | Market data + news |
| `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET` | Reddit API | Social sentiment |
| `FRED_API_KEY` | FRED | Economic indicators |
| `ALPHA_VANTAGE_API_KEY` | Alpha Vantage | Market data |
None required — free sources (yfinance, SEC EDGAR, GDELT, StockTwits, World Bank) work by default.
| Problem | Solution |
|---|---|
| `ANTHROPIC_API_KEY` error | Make sure `.env` exists and has your key |
| Database connection errors | Comment out `DATABASE_URL`, `REDIS_URL`, `QDRANT_URL` in `.env` |
| Port 8000 in use | Kill existing process or use `--port 8001` |
| Port 3000 in use | Frontend auto-detects and uses 3001 |
| ML training is slow | Reduce epochs in `config/ml_models.yaml`, timesteps in `config/rl_config.yaml` |
| Module not found errors | Run `pip install -e .` from the `mais/` directory |
| Frontend can't reach API | Ensure backend is running on port 8000 first |
Built end-to-end:
- Complete multi-agent architecture with 5 specialized agents, each with different data sources and tools
- 4-round debate engine with structured argumentation and confidence-weighted consensus building
- ML ensemble pipeline with feature engineering (60 features), model training, and Ridge regression meta-learner
- Custom Gymnasium trading environment with multi-objective reward shaping (returns, Sharpe, drawdown, turnover)
- PPO trading agent with custom neural network architecture (256→256→128, GELU, orthogonal init)
- Walk-forward backtesting engine with performance metrics (Sharpe, max drawdown, win rate, profit factor)
- Decision report system with point-in-time snapshots, counterfactual scenarios, and SHA-256 checksums
- FastAPI backend with RESTful API, WebSocket real-time updates, and API key authentication
- LLM fallback chain with automatic failover (Claude → Gemini → Ollama) and robust JSON parsing
- Data layer with intelligent caching (market-hours-aware TTL), token-bucket rate limiting, and graceful degradation
- Next.js frontend with interactive dashboard, debate visualization, paper trading interface, and risk dashboard
Libraries and services integrated:
- Claude AI (Anthropic) — Primary LLM for agent reasoning
- Stable-Baselines3 — RL algorithm implementation (PPO)
- PyTorch — Deep learning framework for LSTM/Transformer models
- yfinance, SEC EDGAR — Market data and filings
- LangChain / LangGraph — Agent tool orchestration
This project is licensed under the MIT License — see the LICENSE file for details.