AI-powered trading signal generator. Five specialist agents collaborate on a single AMD Instinct MI300X to produce structured BUY / SELL / HOLD calls with confidence, entry, stop loss, target, and per-agent reasoning.
Built for the AMD Developer Hackathon · Track: AI Agents & Agentic Workflows · May 2026.
| | |
|---|---|
| Live demo | https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/finagent |
| Track | AI Agents & Agentic Workflows |
| Model | Qwen/Qwen3-14B via vLLM 0.17 on ROCm 7.2 |
| Hardware | AMD Instinct MI300X (AMD Developer Cloud) |
| License | MIT |
If you're watching a list of stocks, you hit the same question every day: which of these 10 tickers deserves real attention today? Most AI tools answer this with a single-shot prompt into a general-purpose model — the output is confident, shallow, and indistinguishable across tickers.
Real analysts don't work that way. They split the work: one person reads the news, another digs into financials, another reads charts, another sizes the trade. Then a head of strategy synthesizes.
FinAgent reproduces that division of labour with five CrewAI agents running against a single locally-hosted Qwen3-14B instance on an AMD MI300X GPU. Each agent is tool-equipped for its niche; they run in a dependency-correct topology (scanner + fundamental + technical first, then risk, then strategy), and the final output conforms to a strict schema that the Gradio frontend parses into interactive signal cards.
Everything is self-hosted inference — no OpenAI, no Claude, no API bills. The $100 AMD Developer Cloud credit is the entire compute budget.
```mermaid
flowchart LR
    UI[Gradio Space<br/>HF frontend]
    R[WatchlistRunner]
    C[FinAgentCrew]
    MS[Market Scanner]
    FA[Fundamental Analyst]
    TA[Technical Analyst]
    RM[Risk Manager]
    CS[Chief Strategist]
    V[vLLM<br/>Qwen3-14B<br/>MI300X]
    T[(Tool APIs<br/>yfinance / ddgs)]

    UI -->|ticker watchlist| R
    R --> C
    C --> MS & FA & TA
    TA --> RM
    MS --> CS
    FA --> CS
    TA --> CS
    RM --> CS
    CS -->|TradingSignal| R
    R -->|streaming updates| UI
    MS <--> T
    FA <--> T
    TA <--> T
    RM <--> T
    MS & FA & TA & RM & CS <-->|OpenAI-compatible HTTPS| V
```
| Layer | Role |
|---|---|
| `inference/` | ROCm + vLLM + Qwen3 deployment scripts for MI300X |
| `tools/` | 10 keyless tool functions (yfinance, ddgs, pandas-ta) agents can call |
| `crew/` | CrewAI agent + task + crew + runner orchestration, with callbacks |
| `gradio-frontend/` | Dark financial-terminal Gradio UI, deployable as a Hugging Face Space |
| # | Agent | Goal | Tools |
|---|---|---|---|
| 1 | Market Scanner | Detect news and price/volume anomalies | search_news, get_price_change, get_volume |
| 2 | Fundamental Analyst | Determine intrinsic value | get_financials, get_earnings, get_peers |
| 3 | Technical Analyst | Identify entry/exit via indicators | get_price_history, calculate_indicators |
| 4 | Risk Manager | Size position, place ATR-based stop-loss | calculate_position_size, set_stop_loss |
| 5 | Chief Strategist | Synthesize 1–4 into a final BUY/SELL/HOLD | — pure reasoning |
Agents 1, 2, 3 run in parallel. Risk Manager waits for the Technical Analyst's entry price. Chief Strategist waits for all four.
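The wave ordering above can be sketched in pure Python with the standard library's `graphlib` — an illustration of the dependency topology, not the actual CrewAI wiring (in the real crew, `Task(context=[...])` expresses these edges):

```python
from graphlib import TopologicalSorter

# Each agent maps to the set of agents whose output it waits on.
deps = {
    "market_scanner": set(),
    "fundamental_analyst": set(),
    "technical_analyst": set(),
    "risk_manager": {"technical_analyst"},  # needs the entry price first
    "chief_strategist": {"market_scanner", "fundamental_analyst",
                         "technical_analyst", "risk_manager"},
}

ts = TopologicalSorter(deps)
ts.prepare()
waves = []
while ts.is_active():
    ready = sorted(ts.get_ready())  # everything in a wave can run in parallel
    waves.append(ready)
    ts.done(*ready)

print(waves)
# → [['fundamental_analyst', 'market_scanner', 'technical_analyst'],
#    ['risk_manager'], ['chief_strategist']]
```

Three waves: the three analysts together, then risk, then strategy — exactly the schedule described above.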
For a watchlist of AAPL, NVDA, BTC-USD, each ticker renders a card like:

```
AAPL — BUY (Confidence: 75%)
Entry: $293.32
Stop Loss: $284.52
Target: $307.99

Reasoning:
- Market: Uptrend confirmed by rising prices and 20-day SMA, but
  overbought RSI suggests short-term consolidation risk
- Fundamental: Strong earnings momentum (+3.6–9.8% surprises), robust
  margins (27%), but elevated debt and high P/E vs peers
- Technical: Price above Bollinger Upper ($291.39), RSI 73 overbought,
  MACD neutral
- Risk: 1:2 risk-reward ratio with 5% stop-loss and target;
  position size 110 shares (2% of a $100K portfolio)
```
The entry price is grounded in the live yfinance quote. Stop-loss and target are synthesised from the Risk Manager's ATR band, with an automatic sanity check that replaces any LLM-emitted price that drifts too far from the live quote.
Small LLMs confidently invent prices from their training data. A 14B model will cheerfully emit $10.00 as the entry price for NVDA on a day it traded at $215. FinAgent doesn't allow it. Every signal has its entry price anchored to the live yfinance quote; stop-loss and target are rescaled around that live price so the model's risk/reward geometry is preserved. When the model output is too malformed to parse, a deterministic fallback builds a clean signal from the live price directly. The card always reflects real market data.
Most CrewAI examples give every agent its own LLM client. FinAgent constructs one crewai.LLM (backed by litellm's hosted_vllm/ provider) pointing at the vLLM endpoint and shares it across the five agents. vLLM's prefix caching means repeated backstory prefixes hit the same KV cache, giving a measurable throughput win on the MI300X.
If one ticker blows up (bad symbol, network hiccup, parser failure), the runner captures the error in a CrewResult and keeps going. The Gradio frontend renders an error card for that ticker and normal signal cards for the rest. No silent failure, no broken dashboard.
Every agent lifecycle event (task_start, task_complete, agent_output, task_failed, crew_error) fires a structured ActivityEvent through a single callback. The Gradio frontend consumes those events to drive the live activity feed — so the user watches the pipeline think, not just its final output.
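The event/callback pattern can be sketched like this — a minimal stand-in for the real `ActivityEvent`, which carries more fields:

```python
from dataclasses import dataclass, field
from typing import Callable
import time

@dataclass
class ActivityEvent:
    kind: str          # e.g. "task_start", "task_complete", "crew_error"
    agent: str
    detail: str = ""
    ts: float = field(default_factory=time.time)

def make_emitter(callback: Callable[[ActivityEvent], None]):
    """All lifecycle hooks funnel through one callback."""
    def emit(kind: str, agent: str, detail: str = "") -> None:
        callback(ActivityEvent(kind, agent, detail))
    return emit

feed: list[ActivityEvent] = []   # a frontend would stream these to the UI
emit = make_emitter(feed.append)
emit("task_start", "market_scanner")
emit("task_complete", "market_scanner", "3 headlines found")
print([(e.kind, e.agent) for e in feed])
# → [('task_start', 'market_scanner'), ('task_complete', 'market_scanner')]
```

Because every event passes through the single callback, the Gradio generator can yield a UI update per event with no polling.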
The whole codebase is guarded by 309 tests, including Hypothesis property tests that exercise key invariants across thousands of randomised inputs:
- `LLMConfig.base_url` propagates through every agent to the `crewai.LLM` client.
- A formatted `TradingSignal` survives a round-trip through the parser for any valid ticker, action, confidence, and price.
- The watchlist parser normalises arbitrary whitespace, casing, and empty segments deterministically.
- `WatchlistResult.successful + failed == total_tickers` always holds — fault isolation is verified, not hoped-for.
- The vLLM health check reports failure for any non-200 response, for arbitrary status code and body content.
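To make the parser invariant concrete, here is a plausible sketch of a watchlist parser satisfying it — illustrative only, not the project's implementation:

```python
def parse_watchlist(raw: str) -> list[str]:
    """Normalise a free-form watchlist: split on commas/newlines, strip
    whitespace, uppercase, drop empty segments and duplicates."""
    seen: set[str] = set()
    out: list[str] = []
    for part in raw.replace("\n", ",").split(","):
        ticker = part.strip().upper()
        if ticker and ticker not in seen:
            seen.add(ticker)
            out.append(ticker)
    return out

print(parse_watchlist("  aapl ,NVDA,, btc-usd \n nvda "))
# → ['AAPL', 'NVDA', 'BTC-USD']
```

A Hypothesis test for a function like this would generate random casing, whitespace, and empty segments and assert the output is always uppercase, duplicate-free, and independent of the noise.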
```
FinAgent/
├── crew/                 # Agent orchestration (FinAgentCrew, WatchlistRunner)
├── tools/                # 10 tool functions for agents (yfinance, ddgs, pandas-ta)
├── inference/            # ROCm + vLLM + Qwen deployment scripts
├── gradio-frontend/
│   ├── app.py            # Gradio UI + event handler
│   ├── validation.py     # Input validation (tickers, portfolio)
│   ├── rendering.py      # HTML rendering (cards, feed, CSS)
│   └── space/            # Ready-to-push HF Space package (app+deps+crew+tools)
├── tests/                # Agent-orchestration tests
├── _crewai_mocks.py      # Shared crewai MagicMock classes for testing
├── conftest.py           # Root pytest config — installs mocks at session start
├── requirements.txt      # Pinned runtime deps
└── requirements-dev.txt  # Test + hypothesis deps
```
- Python 3.11 or later (tested on 3.13)
- Git, pip
- A vLLM endpoint. Either run one yourself via `inference/setup.sh` on an AMD GPU, or point at any OpenAI-compatible endpoint for local experiments.
```bash
git clone https://github.com/emmanuelakbi/FinAgent.git
cd FinAgent
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt

pytest tests/ tools/tests/ inference/tests/ gradio-frontend/tests/ -m "not integration"
# → 309 passed

export VLLM_ENDPOINT_URL=http://localhost:8000/v1  # or wherever vLLM is
python gradio-frontend/app.py
# → http://127.0.0.1:7860
```

The repo ships a ready-to-push Space directory at `gradio-frontend/space/`. It contains everything Hugging Face needs — `app.py`, `requirements.txt`, the `crew/` package, the `tools/` package, and a Space-flavoured `README.md` with the SDK metadata header. Create a new Space on Hugging Face (Gradio SDK, CPU basic), clone it, copy the contents of `gradio-frontend/space/` in, then `git push`. Set a `VLLM_ENDPOINT_URL` repository secret pointing at your vLLM instance and the Space will build and serve.
If the live Space at huggingface.co/spaces/lablab-ai-amd-developer-hackathon/finagent is offline, here is the complete self-host recipe. Everything is OpenAI-compatible, so the project runs against any vLLM instance, not just AMD silicon — a rented A100 or H100 works identically.
```bash
# 1. Clone the repo
git clone https://github.com/emmanuelakbi/FinAgent.git
cd FinAgent

# 2. Boot vLLM with Qwen3-14B and tool-calling enabled
cd inference
./setup.sh --host 0.0.0.0 --port 8000
# Wait ~30s for "Application startup complete", then verify:
./health_check.sh --host 0.0.0.0 --port 8000

# 3. In a second terminal, launch the Gradio frontend
cd ..
pip install -r requirements.txt
export VLLM_ENDPOINT_URL=http://localhost:8000/v1
python gradio-frontend/app.py
# Open http://127.0.0.1:7860 and enter a watchlist
```

Qwen3-14B requires ~28 GB VRAM. On non-AMD hardware, skip `inference/setup.sh` and run vanilla vLLM:
```bash
pip install "vllm>=0.17"
vllm serve Qwen/Qwen3-14B \
  --host 0.0.0.0 --port 8000 \
  --enable-auto-tool-choice --tool-call-parser hermes

# Then, from the repo root:
pip install -r requirements.txt
export VLLM_ENDPOINT_URL=http://localhost:8000/v1
python gradio-frontend/app.py
```

To verify the agents run end-to-end without provisioning hardware, point `VLLM_ENDPOINT_URL` at any OpenAI-compatible endpoint that serves Qwen3-14B (Together AI, Fireworks, OpenRouter all work). The code uses `crewai.LLM` with the `hosted_vllm/` litellm provider, so any OpenAI-compatible base URL drops in:
```bash
export VLLM_ENDPOINT_URL=https://your-provider/v1
export OPENAI_API_KEY=your_key  # optional; any value works against vLLM itself
python gradio-frontend/app.py
```

```bash
# Runs the full 5-agent pipeline for AAPL and prints the signal:
python -c "
from crew import LLMConfig, OrchestratorConfig, WatchlistRunner
from tools import (search_news, get_price_change, get_volume,
                   get_financials, get_earnings, get_peers,
                   get_price_history, calculate_indicators,
                   calculate_position_size, set_stop_loss)

runner = WatchlistRunner(
    config=OrchestratorConfig(llm=LLMConfig(base_url='http://localhost:8000/v1')),
    tools={
        'market_scanner': [search_news, get_price_change, get_volume],
        'fundamental_analyst': [get_financials, get_earnings, get_peers],
        'technical_analyst': [get_price_history, calculate_indicators],
        'risk_manager': [calculate_position_size, set_stop_loss],
    },
)
print(runner.run('AAPL'))
"
```

```bash
pip install -r requirements-dev.txt
pytest tests/ tools/tests/ inference/tests/ gradio-frontend/tests/ -m "not integration"
# → 309 passed
```

| Concern | Choice | Why |
|---|---|---|
| Agent framework | CrewAI 1.14 | Built-in role/task/dependency model; context=[...] handles wait-for-predecessor cleanly |
| Inference server | vLLM 0.17 | Continuous batching + prefix caching; first-class ROCm 7.x support for MI300X |
| Model | Qwen/Qwen3-14B | Strong instruction following + tool-calling at ~28 GB VRAM; open-weights, commercial-friendly |
| GPU runtime | ROCm 7.2 | MI300X's native stack; PyTorch 2.7 ROCm wheels ship with matching support |
| LLM client | crewai.LLM (litellm hosted_vllm) | Native CrewAI integration; points at any OpenAI-compatible base_url with tool-calling |
| Frontend | Gradio 5 | HF Spaces' native SDK; generator-based streaming maps directly to agent activity events |
| Tools | yfinance · ddgs · pandas-ta-remake | All keyless — no API tokens needed in the demo |
| Testing | pytest 8 + hypothesis 6 | 309 passing; every key correctness guarantee has a test that tries to break it |
The $100 AMD Developer Cloud credit is the entire compute budget. MI300X pricing runs roughly $2–5/hour → 20–50 hours of runtime available on a single credit. The endpoint only needs to be live while the Gradio frontend is in active use; shutting the instance down between sessions preserves the remaining credit.
The architecture started from a single constraint: every signal a user sees must be grounded in real market data, not LLM imagination. That constraint shaped the rest of the design.
The agent topology mirrors how an actual investment desk splits the work. Scanner, Fundamental Analyst, and Technical Analyst run in parallel against the same ticker — each reads its own slice of the world. Risk Manager waits on Technical's entry price because you can't size a position without one. Chief Strategist waits on all four, then produces the final BUY/SELL/HOLD. CrewAI's context=[...] plumbing handles the wait-for-predecessor dependencies cleanly without a bespoke scheduler.
The grounding layer was the hardest engineering problem. A 14B model at temperature 0.7 will cheerfully emit $10.00 as the entry price for NVDA on a day it traded at $215. Asking nicely in the prompt didn't work. A deterministic post-processor runs after the model: it anchors every parsed signal's entry price to the live yfinance quote and rescales stop/target proportionally to preserve the model's risk/reward geometry. When the output is too malformed to parse at all, a fallback synthesiser reads the live price directly and constructs a clean signal from scratch. That turned the pipeline from "usually works" into "always returns a grounded card."
The test suite was written property-first. Every invariant that matters — base_url propagation, signal round-trip parseability, watchlist parser determinism, fault isolation across tickers, vLLM health-check behaviour under any HTTP status code — is exercised by a Hypothesis property test across thousands of randomised inputs. 309 unit + property tests pass on every commit.
The UI preferences (Risk Tolerance, Trading Style, Portfolio Value) aren't cosmetic dropdowns. They thread end-to-end into the Strategist's task prompt and the grounding clamps, so the same ticker at the same moment produces a tight 1.5% / 2% band on Conservative / Day Trading and a wide 5% / 10% band on Aggressive / Position Trading.
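A hypothetical illustration of how those preferences could map to band clamps — the percentages mirror the prose above, but the lookup structure and names are a sketch, not the project's code:

```python
# (risk tolerance, trading style) → (stop fraction, target fraction)
BANDS = {
    ("Conservative", "Day Trading"):      (0.015, 0.02),  # tight 1.5% / 2%
    ("Aggressive",   "Position Trading"): (0.05,  0.10),  # wide 5% / 10%
}

def clamp_band(live_price: float, risk: str, style: str) -> tuple[float, float]:
    """Derive stop-loss and target prices from the preference band."""
    stop_pct, target_pct = BANDS[(risk, style)]
    return (round(live_price * (1 - stop_pct), 2),
            round(live_price * (1 + target_pct), 2))

print(clamp_band(200.0, "Conservative", "Day Trading"))     # → (197.0, 204.0)
print(clamp_band(200.0, "Aggressive", "Position Trading"))  # → (190.0, 220.0)
```

The same $200 quote yields materially different geometry depending on the selected profile, which is the end-to-end threading the paragraph describes.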
Inference is self-hosted. No OpenAI, no Claude, no API bills. vLLM on ROCm exposes an OpenAI-compatible endpoint on the MI300X; all five agents share the same crewai.LLM instance, so vLLM's prefix caching kicks in on repeated prompt prefixes and pushes throughput up.
- AMD Developer Cloud — MI300X compute credit
- Qwen team at Alibaba Cloud — open-weights models
- CrewAI — agent orchestration framework
- vLLM project — high-throughput LLM serving with ROCm support
- lablab.ai — hackathon platform