AI-powered trading signal generator. Five specialist agents collaborate on a single AMD Instinct MI300X to produce structured BUY / SELL / HOLD calls with confidence, entry, stop loss, target, and per-agent reasoning.
Built for the AMD Developer Hackathon · Track: AI Agents & Agentic Workflows · May 2026.
| | |
|---|---|
| Live demo | https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/finagent |
| Track | AI Agents & Agentic Workflows |
| Model | Qwen/Qwen3-14B via vLLM 0.17 on ROCm 7.2 |
| Hardware | AMD Instinct MI300X (AMD Developer Cloud) |
| License | MIT |
If you're watching a list of stocks, you hit the same question every day: which of these 10 tickers deserves real attention today? Most AI tools answer this with a single-shot prompt into a general-purpose model — the output is confident, shallow, and indistinguishable across tickers.
Real analysts don't work that way. They split the work: one person reads the news, another digs into financials, another reads charts, another sizes the trade. Then a head of strategy synthesizes.
FinAgent reproduces that division of labour with five CrewAI agents running against a single locally-hosted Qwen3-14B instance on an AMD MI300X GPU. Each agent is tool-equipped for its niche; they run in a dependency-correct topology (scanner + fundamental + technical first, then risk, then strategy), and the final output conforms to a strict schema that the Gradio frontend parses into interactive signal cards.
Everything is self-hosted inference — no OpenAI, no Claude, no API bills. The $100 AMD Developer Cloud credit is the entire compute budget.
```mermaid
flowchart LR
    UI[Gradio Space<br/>HF frontend]
    R[WatchlistRunner]
    C[FinAgentCrew]
    MS[Market Scanner]
    FA[Fundamental Analyst]
    TA[Technical Analyst]
    RM[Risk Manager]
    CS[Chief Strategist]
    V[vLLM<br/>Qwen3-14B<br/>MI300X]
    T[(Tool APIs<br/>yfinance / ddgs)]

    UI -->|ticker watchlist| R
    R --> C
    C --> MS & FA & TA
    TA --> RM
    MS --> CS
    FA --> CS
    TA --> CS
    RM --> CS
    CS -->|TradingSignal| R
    R -->|streaming updates| UI
    MS <--> T
    FA <--> T
    TA <--> T
    RM <--> T
    MS & FA & TA & RM & CS <-->|OpenAI-compatible HTTPS| V
```
| Layer | Role |
|---|---|
| `inference/` | ROCm + vLLM + Qwen3 deployment scripts for MI300X |
| `tools/` | 10 keyless tool functions (yfinance, ddgs, pandas-ta) agents can call |
| `crew/` | CrewAI agent + task + crew + runner orchestration, with callbacks |
| `gradio-frontend/` | Dark financial-terminal Gradio UI, deployable as a Hugging Face Space |
| # | Agent | Goal | Tools |
|---|---|---|---|
| 1 | Market Scanner | Detect news and price/volume anomalies | search_news, get_price_change, get_volume |
| 2 | Fundamental Analyst | Determine intrinsic value | get_financials, get_earnings, get_peers |
| 3 | Technical Analyst | Identify entry/exit via indicators | get_price_history, calculate_indicators |
| 4 | Risk Manager | Size position, place ATR-based stop-loss | calculate_position_size, set_stop_loss |
| 5 | Chief Strategist | Synthesize 1–4 into a final BUY/SELL/HOLD | — pure reasoning |
Agents 1, 2, 3 run in parallel. Risk Manager waits for the Technical Analyst's entry price. Chief Strategist waits for all four.
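The wave ordering above can be sketched in pure Python with the standard library's `graphlib` — an illustration of the dependency topology, not the actual CrewAI wiring (in the real crew, `Task(context=[...])` expresses these edges):

```python
from graphlib import TopologicalSorter

# Each agent maps to the set of agents whose output it waits on.
deps = {
    "market_scanner": set(),
    "fundamental_analyst": set(),
    "technical_analyst": set(),
    "risk_manager": {"technical_analyst"},  # needs the entry price first
    "chief_strategist": {"market_scanner", "fundamental_analyst",
                         "technical_analyst", "risk_manager"},
}

ts = TopologicalSorter(deps)
ts.prepare()
waves = []
while ts.is_active():
    ready = sorted(ts.get_ready())  # everything in a wave can run in parallel
    waves.append(ready)
    ts.done(*ready)

print(waves)
# → [['fundamental_analyst', 'market_scanner', 'technical_analyst'],
#    ['risk_manager'], ['chief_strategist']]
```

Three waves: the three analysts together, then risk, then strategy — exactly the schedule described above.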
For a watchlist of AAPL, NVDA, BTC-USD, each ticker renders a card like:

```
AAPL — BUY (Confidence: 75%)
Entry: $293.32
Stop Loss: $284.52
Target: $307.99

Reasoning:
- Market: Uptrend confirmed by rising prices and 20-day SMA, but
  overbought RSI suggests short-term consolidation risk
- Fundamental: Strong earnings momentum (+3.6–9.8% surprises), robust
  margins (27%), but elevated debt and high P/E vs peers
- Technical: Price above Bollinger Upper ($291.39), RSI 73 overbought,
  MACD neutral
- Risk: 1:2 risk-reward ratio with 5% stop-loss and target;
  position size 110 shares (2% of a $100K portfolio)
```
The entry price is grounded in the live yfinance quote. Stop-loss and target are synthesised from the Risk Manager's ATR band, with an automatic sanity check that replaces any LLM-emitted price that drifts too far from the live quote.
Small LLMs confidently invent prices from their training data. A 14B model will cheerfully emit $10.00 as the entry price for NVDA on a day it traded at $215. FinAgent doesn't allow it. Every signal has its entry price anchored to the live yfinance quote; stop-loss and target are rescaled around that live price so the model's risk/reward geometry is preserved. When the model output is too malformed to parse, a deterministic fallback builds a clean signal from the live price directly. The card always reflects real market data.
Most CrewAI examples give every agent its own LLM client. FinAgent constructs one crewai.LLM (backed by litellm's hosted_vllm/ provider) pointing at the vLLM endpoint and shares it across the five agents. vLLM's prefix caching means repeated backstory prefixes hit the same KV cache, giving a measurable throughput win on the MI300X.
If one ticker blows up (bad symbol, network hiccup, parser failure), the runner captures the error in a CrewResult and keeps going. The Gradio frontend renders an error card for that ticker and normal signal cards for the rest. No silent failure, no broken dashboard.
Every agent lifecycle event (task_start, task_complete, agent_output, task_failed, crew_error) fires a structured ActivityEvent through a single callback. The Gradio frontend consumes those events to drive the live activity feed — so the user watches the pipeline think, not just its final output.
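The event/callback pattern can be sketched like this — a minimal stand-in for the real `ActivityEvent`, which carries more fields:

```python
from dataclasses import dataclass, field
from typing import Callable
import time

@dataclass
class ActivityEvent:
    kind: str          # e.g. "task_start", "task_complete", "crew_error"
    agent: str
    detail: str = ""
    ts: float = field(default_factory=time.time)

def make_emitter(callback: Callable[[ActivityEvent], None]):
    """All lifecycle hooks funnel through one callback."""
    def emit(kind: str, agent: str, detail: str = "") -> None:
        callback(ActivityEvent(kind, agent, detail))
    return emit

feed: list[ActivityEvent] = []   # a frontend would stream these to the UI
emit = make_emitter(feed.append)
emit("task_start", "market_scanner")
emit("task_complete", "market_scanner", "3 headlines found")
print([(e.kind, e.agent) for e in feed])
# → [('task_start', 'market_scanner'), ('task_complete', 'market_scanner')]
```

Because every event passes through the single callback, the Gradio generator can yield a UI update per event with no polling.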
The whole codebase is guarded by 309 tests, including Hypothesis property tests that exercise key invariants across thousands of randomised inputs:
- `LLMConfig.base_url` propagates through every agent to the `crewai.LLM` client.
- A formatted `TradingSignal` survives a round-trip through the parser for any valid ticker, action, confidence, and price.
- The watchlist parser normalises arbitrary whitespace, casing, and empty segments deterministically.
- `WatchlistResult.successful + failed == total_tickers` always holds — fault isolation is verified, not hoped-for.
- The vLLM health check reports failure for any non-200 response, for arbitrary status code and body content.
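To make the parser invariant concrete, here is a plausible sketch of a watchlist parser satisfying it — illustrative only, not the project's implementation:

```python
def parse_watchlist(raw: str) -> list[str]:
    """Normalise a free-form watchlist: split on commas/newlines, strip
    whitespace, uppercase, drop empty segments and duplicates."""
    seen: set[str] = set()
    out: list[str] = []
    for part in raw.replace("\n", ",").split(","):
        ticker = part.strip().upper()
        if ticker and ticker not in seen:
            seen.add(ticker)
            out.append(ticker)
    return out

print(parse_watchlist("  aapl ,NVDA,, btc-usd \n nvda "))
# → ['AAPL', 'NVDA', 'BTC-USD']
```

A Hypothesis test for a function like this would generate random casing, whitespace, and empty segments and assert the output is always uppercase, duplicate-free, and independent of the noise.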
```
FinAgent/
├── crew/                 # Agent orchestration (FinAgentCrew, WatchlistRunner)
├── tools/                # 10 tool functions for agents (yfinance, ddgs, pandas-ta)
├── inference/            # ROCm + vLLM + Qwen deployment scripts
├── gradio-frontend/
│   ├── app.py            # Gradio UI + event handler
│   ├── validation.py     # Input validation (tickers, portfolio)
│   ├── rendering.py      # HTML rendering (cards, feed, CSS)
│   └── space/            # Ready-to-push HF Space package (app+deps+crew+tools)
├── tests/                # Agent-orchestration tests
├── _crewai_mocks.py      # Shared crewai MagicMock classes for testing
├── conftest.py           # Root pytest config — installs mocks at session start
├── requirements.txt      # Pinned runtime deps
└── requirements-dev.txt  # Test + hypothesis deps
```
- Python 3.11 or later (tested on 3.13)
- Git, pip
- A vLLM endpoint. Either run one yourself via `inference/setup.sh` on an AMD GPU, or point at any OpenAI-compatible endpoint for local experiments.
```bash
git clone https://github.com/emmanuelakbi/FinAgent.git
cd FinAgent
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt

pytest tests/ tools/tests/ inference/tests/ gradio-frontend/tests/ -m "not integration"
# → 309 passed

export VLLM_ENDPOINT_URL=http://localhost:8000/v1  # or wherever vLLM is
python gradio-frontend/app.py
# → http://127.0.0.1:7860
```

The repo ships a ready-to-push Space directory at `gradio-frontend/space/`. It contains everything Hugging Face needs — `app.py`, `requirements.txt`, the `crew/` package, the `tools/` package, and a Space-flavoured `README.md` with the SDK metadata header. Create a new Space on Hugging Face (Gradio SDK, CPU basic), clone it, copy the contents of `gradio-frontend/space/` in, then `git push`. Set a `VLLM_ENDPOINT_URL` repository secret pointing at your vLLM instance and the Space will build and serve.
If the live Space at huggingface.co/spaces/lablab-ai-amd-developer-hackathon/finagent is offline, here is the complete self-host recipe. Everything is OpenAI-compatible, so the project runs against any vLLM instance, not just AMD silicon — a rented A100 or H100 works identically.
```bash
# 1. Clone the repo
git clone https://github.com/emmanuelakbi/FinAgent.git
cd FinAgent

# 2. Boot vLLM with Qwen3-14B and tool-calling enabled
cd inference
./setup.sh --host 0.0.0.0 --port 8000
# Wait ~30s for "Application startup complete", then verify:
./health_check.sh --host 0.0.0.0 --port 8000

# 3. In a second terminal, launch the Gradio frontend
cd ..
pip install -r requirements.txt
export VLLM_ENDPOINT_URL=http://localhost:8000/v1
python gradio-frontend/app.py
# Open http://127.0.0.1:7860 and enter a watchlist
```

Qwen3-14B requires ~28 GB VRAM. On non-AMD hardware, skip `inference/setup.sh` and run vanilla vLLM:
```bash
pip install "vllm>=0.17"
vllm serve Qwen/Qwen3-14B \
  --host 0.0.0.0 --port 8000 \
  --enable-auto-tool-choice --tool-call-parser hermes

# Then, from the repo root:
pip install -r requirements.txt
export VLLM_ENDPOINT_URL=http://localhost:8000/v1
python gradio-frontend/app.py
```

To verify the agents run end-to-end without provisioning hardware, point `VLLM_ENDPOINT_URL` at any OpenAI-compatible endpoint that serves Qwen3-14B (Together AI, Fireworks, OpenRouter all work). The code uses `crewai.LLM` with the `hosted_vllm/` litellm provider, so any OpenAI-compatible base URL drops in:
```bash
export VLLM_ENDPOINT_URL=https://your-provider/v1
export OPENAI_API_KEY=your_key  # optional; any value works against vLLM itself
python gradio-frontend/app.py
```

```bash
# Runs the full 5-agent pipeline for AAPL and prints the signal:
python -c "
from crew import LLMConfig, OrchestratorConfig, WatchlistRunner
from tools import (search_news, get_price_change, get_volume,
                   get_financials, get_earnings, get_peers,
                   get_price_history, calculate_indicators,
                   calculate_position_size, set_stop_loss)

runner = WatchlistRunner(
    config=OrchestratorConfig(llm=LLMConfig(base_url='http://localhost:8000/v1')),
    tools={
        'market_scanner': [search_news, get_price_change, get_volume],
        'fundamental_analyst': [get_financials, get_earnings, get_peers],
        'technical_analyst': [get_price_history, calculate_indicators],
        'risk_manager': [calculate_position_size, set_stop_loss],
    },
)
print(runner.run('AAPL'))
"
```

```bash
pip install -r requirements-dev.txt
pytest tests/ tools/tests/ inference/tests/ gradio-frontend/tests/ -m "not integration"
# → 309 passed
```

| Concern | Choice | Why |
|---|---|---|
| Agent framework | CrewAI 1.14 | Built-in role/task/dependency model; context=[...] handles wait-for-predecessor cleanly |
| Inference server | vLLM 0.17 | Continuous batching + prefix caching; first-class ROCm 7.x support for MI300X |
| Model | Qwen/Qwen3-14B | Strong instruction following + tool-calling at ~28 GB VRAM; open-weights, commercial-friendly |
| GPU runtime | ROCm 7.2 | MI300X's native stack; PyTorch 2.7 ROCm wheels ship with matching support |
| LLM client | crewai.LLM (litellm hosted_vllm) | Native CrewAI integration; points at any OpenAI-compatible base_url with tool-calling |
| Frontend | Gradio 5 | HF Spaces' native SDK; generator-based streaming maps directly to agent activity events |
| Tools | yfinance · ddgs · pandas-ta-remake | All keyless — no API tokens needed in the demo |
| Testing | pytest 8 + hypothesis 6 | 309 passing; every key correctness guarantee has a test that tries to break it |
The $100 AMD Developer Cloud credit is the entire compute budget. MI300X pricing runs roughly $2–5/hour → 20–50 hours of runtime available on a single credit. The endpoint only needs to be live while the Gradio frontend is in active use; shutting the instance down between sessions preserves the remaining credit.
The architecture started from a single constraint: every signal a user sees must be grounded in real market data, not LLM imagination. That constraint shaped the rest of the design.
The agent topology mirrors how an actual investment desk splits the work. Scanner, Fundamental Analyst, and Technical Analyst run in parallel against the same ticker — each reads its own slice of the world. Risk Manager waits on Technical's entry price because you can't size a position without one. Chief Strategist waits on all four, then produces the final BUY/SELL/HOLD. CrewAI's context=[...] plumbing handles the wait-for-predecessor dependencies cleanly without a bespoke scheduler.
The grounding layer was the hardest engineering problem. A 14B model at temperature 0.7 will cheerfully emit $10.00 as the entry price for NVDA on a day it traded at $215. Asking nicely in the prompt didn't work. A deterministic post-processor runs after the model: it anchors every parsed signal's entry price to the live yfinance quote and rescales stop/target proportionally to preserve the model's risk/reward geometry. When the output is too malformed to parse at all, a fallback synthesiser reads the live price directly and constructs a clean signal from scratch. That turned the pipeline from "usually works" into "always returns a grounded card."
The test suite was written property-first. Every invariant that matters — base_url propagation, signal round-trip parseability, watchlist parser determinism, fault isolation across tickers, vLLM health-check behaviour under any HTTP status code — is exercised by a Hypothesis property test across thousands of randomised inputs. 309 unit + property tests pass on every commit.
The UI preferences (Risk Tolerance, Trading Style, Portfolio Value) aren't cosmetic dropdowns. They thread end-to-end into the Strategist's task prompt and the grounding clamps, so the same ticker at the same moment produces a tight 1.5% / 2% band on Conservative / Day Trading and a wide 5% / 10% band on Aggressive / Position Trading.
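A hypothetical illustration of how those preferences could map to band clamps — the percentages mirror the prose above, but the lookup structure and names are a sketch, not the project's code:

```python
# (risk tolerance, trading style) → (stop fraction, target fraction)
BANDS = {
    ("Conservative", "Day Trading"):      (0.015, 0.02),  # tight 1.5% / 2%
    ("Aggressive",   "Position Trading"): (0.05,  0.10),  # wide 5% / 10%
}

def clamp_band(live_price: float, risk: str, style: str) -> tuple[float, float]:
    """Derive stop-loss and target prices from the preference band."""
    stop_pct, target_pct = BANDS[(risk, style)]
    return (round(live_price * (1 - stop_pct), 2),
            round(live_price * (1 + target_pct), 2))

print(clamp_band(200.0, "Conservative", "Day Trading"))     # → (197.0, 204.0)
print(clamp_band(200.0, "Aggressive", "Position Trading"))  # → (190.0, 220.0)
```

The same $200 quote yields materially different geometry depending on the selected profile, which is the end-to-end threading the paragraph describes.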
Inference is self-hosted. No OpenAI, no Claude, no API bills. vLLM on ROCm exposes an OpenAI-compatible endpoint on the MI300X; all five agents share the same crewai.LLM instance, so vLLM's prefix caching kicks in on repeated prompt prefixes and pushes throughput up.
- AMD Developer Cloud — MI300X compute credit
- Qwen team at Alibaba Cloud — open-weights models
- CrewAI — agent orchestration framework
- vLLM project — high-throughput LLM serving with ROCm support
- lablab.ai — hackathon platform