Smart Opportunity Finder

A self-hosted, multi-agent system that scans public MedTech signals (patents, clinical trials, regulatory filings, funding rounds, academic literature, grants) and surfaces a short, ranked list of growth opportunities matched to a company's manufacturing competencies.

The output answers two questions per opportunity:

  • Why us? — deterministic match against the company's competency model
  • Why now? — derived from timing proxies in the public record (trial phase transitions, patent grant windows, regulatory clearance events, funding rounds)

The hard invariant: no claim without traceable evidence. A trust gate in Python forces any claim with no retrieved evidence to unsupported, regardless of what the LLM said.


Quick start

cp .env.example .env          # fill in at least one provider key (or set DRY_RUN=true)
docker compose up -d
# UI:    http://localhost:3000
# API:   http://localhost:3000/api/v1
# Docs:  http://localhost:3000/api/docs   (nginx-proxied OpenAPI)

DRY_RUN=true runs the full pipeline without any LLM call (deterministic placeholder JSON). Useful for CI and offline demos.


Architecture

flowchart TD
    User([Browser SPA]) -->|HTTPS| API[FastAPI]
    API --> Pipe{{OpportunityOrchestrator}}

    Pipe --> S1[1. PLAN<br/>LLM]
    S1 --> S2[2. RESEARCH<br/>deterministic fan-out]
    S2 -->|parallel| A1[ClinicalTrials]
    S2 -->|parallel| A2[Patents — Google & EPO]
    S2 -->|parallel| A3[Regulatory — openFDA]
    S2 -->|parallel| A4[Funding — SEC EDGAR]
    S2 -->|parallel| A5[Academic — OpenAlex & PubMed]
    S2 -->|parallel| A6[Grants — CORDIS & NIH]
    A1 & A2 & A3 & A4 & A5 & A6 --> S3[3. CHECK<br/>LLM + Trust Gate]
    S3 --> S4[4. MATCH<br/>Competency Matcher]
    S4 --> S5[5. TIME<br/>rules + LLM rationale]
    S5 --> S6[6. RANK<br/>weighted blend]
    S6 --> S7[7. BRIEF<br/>LLM]
    S7 --> S8[8. CRITIQUE<br/>LLM quality tier]
    S8 --> Out[Ranked Briefing]

    API -.->|persistence| DB[(SQLite / Postgres)]
    API -.->|embeddings| QD[(Qdrant)]
    API -.-> LLM[UnifiedLLM<br/>Anthropic / OpenAI / DeepSeek / Ollama]

The 8-step pipeline

Entry point: core/orchestrator/opportunity_flow.py::OpportunityOrchestrator.run().

| Step | Function | Notes |
| --- | --- | --- |
| 1. PLAN | step_plan | LLM decomposes the focus area into atomic claims, each with a competency_hint and search_terms. |
| 2. RESEARCH | _research | Parallel fan-out across all registered SourceAdapter instances (20 s per-adapter timeout). |
| 3. CHECK | step_check | LLM verifies each claim against retrieved evidence. The Python trust gate (enforce_trust_gate) overwrites any verified verdict that has no evidence. A coherence pass halves confidence when invented identifiers (NCT IDs, DOIs, dollar amounts) are detected. |
| 4. MATCH | step_match | Deterministic competency match using competencies/schott.yaml. Candidates scoring below MATCH_THRESHOLD=0.20 are dropped. |
| 5. TIME | step_time | Rules + LLM rationale. Maps timing signals to {NOW, 2_YEARS, 5_YEARS, EMERGING} with a [0,1] urgency score. Rules live in scoring/timing_rules.yaml. |
| 6. RANK | core/scoring/ranking.rank | Weighted blend (see below). The top 5 are returned. |
| 7. BRIEF | step_brief | LLM writes the final card; a coherence pass downgrades confidence if invented references are spotted. |
| 8. CRITIQUE | step_critique | LLM quality grade. A card is dropped if its critique score is < 0.4 or it has blocking issues. |
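
As an illustration of the RESEARCH step, here is a minimal sketch of a parallel fan-out with a per-adapter timeout. It is not the project's `_research` code: the function names, the dict-of-callables adapter shape, and the choice to treat a timed-out or failing adapter as an empty result are all assumptions made for the example.

```python
import asyncio

ADAPTER_TIMEOUT_S = 20.0  # per-adapter timeout quoted in the RESEARCH row


async def fetch_with_timeout(name, fetch, terms, timeout=ADAPTER_TIMEOUT_S):
    """Run one adapter; a timeout or error yields an empty evidence list (assumed policy)."""
    try:
        return name, await asyncio.wait_for(fetch(terms), timeout)
    except Exception:  # asyncio.TimeoutError is a subclass of Exception
        return name, []


async def research(adapters, terms, timeout=ADAPTER_TIMEOUT_S):
    """Query every adapter concurrently and collect results keyed by adapter name."""
    tasks = [fetch_with_timeout(n, f, terms, timeout) for n, f in adapters.items()]
    return dict(await asyncio.gather(*tasks))


# Hypothetical stand-in adapters, for demonstration only.
async def fast_adapter(terms):
    return [f"hit for {t}" for t in terms]


async def slow_adapter(terms):
    await asyncio.sleep(1.0)  # longer than the demo timeout used below
    return ["never returned"]


results = asyncio.run(
    research({"fast": fast_adapter, "slow": slow_adapter}, ["glass syringe"], timeout=0.05)
)
print(results)
```

With this policy one slow source cannot stall the whole scan: the slow adapter simply contributes no evidence, and the trust gate downstream handles claims that end up unsupported.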

Trust gate (the invariant)

def enforce_trust_gate(verdict: str, evidence: list[EvidenceItem]) -> str:
    if not evidence and verdict == "verified":
        return "unsupported"
    return verdict

Source: core/orchestrator/opportunity_flow.py.
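
A short usage sketch of the gate follows. The `EvidenceItem` dataclass here is a hypothetical stand-in with made-up fields (the real class is defined in the project), and the `"refuted"` verdict is likewise only an illustrative value; the function body is copied from the snippet above.

```python
from dataclasses import dataclass


@dataclass
class EvidenceItem:
    """Hypothetical stand-in; field names are assumptions."""
    source: str
    url: str


def enforce_trust_gate(verdict: str, evidence: list[EvidenceItem]) -> str:
    # Same logic as the README snippet: only an unsupported "verified" is overridden.
    if not evidence and verdict == "verified":
        return "unsupported"
    return verdict


item = EvidenceItem(source="openFDA", url="https://example.org/record")

gated_without_evidence = enforce_trust_gate("verified", [])   # downgraded
gated_with_evidence = enforce_trust_gate("verified", [item])  # kept
gated_other = enforce_trust_gate("refuted", [])               # passed through unchanged
```

Note that the gate only rewrites a `verified` verdict; any other verdict is returned as-is, with or without evidence.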

Scoring formula

$$\text{score} = \frac{\sum_i w_i \cdot s_i}{\sum_i w_i}$$

| Axis | Default weight | Source |
| --- | --- | --- |
| competency_fit | 0.40 | matcher score |
| timing | 0.30 | TIME step urgency |
| signal_strength | 0.15 | min(len(evidence)/10, 1.0) × coherence |
| market_potential | 0.10 | CHECK confidence × coherence |
| competitive_window | 0.05 | default 0.6, tunable per opportunity |

Weights renormalise if they don't sum to 1.0. All five RANK_W_* overrides come from environment variables — no code change needed.
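
The formula and default weights can be sketched as follows. This is an illustration rather than the code in core/scoring/ranking.py: the helper names are invented, and the `RANK_W_<AXIS>` variable naming is an assumption, since only the `RANK_W_*` prefix is documented here.

```python
import os

DEFAULT_WEIGHTS = {
    "competency_fit": 0.40,
    "timing": 0.30,
    "signal_strength": 0.15,
    "market_potential": 0.10,
    "competitive_window": 0.05,
}


def load_weights(env=os.environ):
    """Read per-axis overrides; the RANK_W_<AXIS> naming is an assumption."""
    return {
        axis: float(env.get(f"RANK_W_{axis.upper()}", default))
        for axis, default in DEFAULT_WEIGHTS.items()
    }


def signal_strength(evidence_count, coherence):
    """min(len(evidence)/10, 1.0) x coherence, as in the table."""
    return min(evidence_count / 10, 1.0) * coherence


def blend(scores, weights):
    """Weighted mean; dividing by the weight total is the renormalisation."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[axis] * scores[axis] for axis in weights) / total


scores = {
    "competency_fit": 0.8,
    "timing": 0.6,
    "signal_strength": signal_strength(evidence_count=5, coherence=1.0),  # 0.5
    "market_potential": 0.7,
    "competitive_window": 0.6,
}
score = blend(scores, DEFAULT_WEIGHTS)
print(round(score, 3))  # 0.32 + 0.18 + 0.075 + 0.07 + 0.03 = 0.675
```

Because the blend divides by the weight total, scaling every weight by the same factor leaves the score unchanged.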

Competency matching

core/scoring/matcher.py. Fully deterministic — no embeddings, no LLM. Four signals blended into one [0,1] score against each competency in competencies/schott.yaml:

| Signal | Weight | How |
| --- | --- | --- |
| Keyword overlap | 50% | Fraction present; substring match for multi-word terms |
| Application overlap | 30% | Phrase hit = 1.0; partial token overlap ≤ 0.5 |
| Material match | 20% | Exact / prefix / order-agnostic token-set; generic tokens excluded |
| Negative-keyword penalty | subtractive | Capped at 30% |

Results are memoised in a 1,024-entry LRU cache keyed on (text_hash, competencies_id).
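
A rough sketch of how the blend could be computed, under stated assumptions: the function names are invented, single-word keywords are matched as whole tokens (the table only specifies substring matching for multi-word terms), and the 0.10 penalty per negative-keyword hit is a placeholder, since only the 30% cap is documented. The application and material sub-scores are taken as inputs rather than computed.

```python
import re


def keyword_overlap(text, keywords):
    """Fraction of keywords present: substring for multi-word terms, token match otherwise."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z0-9]+", lowered))
    hits = 0
    for kw in keywords:
        kw = kw.lower()
        hits += (kw in lowered) if " " in kw else (kw in tokens)
    return hits / len(keywords)


def blend_match(keyword, application, material, negative_hits, penalty_per_hit=0.10):
    """50/30/20 blend minus a penalty capped at 0.30, clamped to [0, 1]."""
    penalty = min(negative_hits * penalty_per_hit, 0.30)  # per-hit size is an assumption
    raw = 0.50 * keyword + 0.30 * application + 0.20 * material - penalty
    return max(0.0, min(1.0, raw))


kw_score = keyword_overlap(
    "Prefilled glass syringe for biologics",
    ["glass syringe", "vial", "biologics"],
)  # two of three keywords present
match_score = blend_match(kw_score, application=0.5, material=1.0, negative_hits=1)
```

A score produced this way would then be compared against MATCH_THRESHOLD (0.20 by default) in the MATCH step.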


Source adapters

All registered in core/orchestrator/adapter_factory.py::build_default_adapters. Adapters that need a key skip themselves if their env var is empty.

| Adapter | Source | Notes |
| --- | --- | --- |
| ClinicalTrialsAdapter | clinicaltrials.gov v2 | Graceful token-limit degradation |
| GooglePatentsAdapter | Google Patents (scraper) | Requires pip install -e .[patents] |
| EpoOpsAdapter | EPO Open Patent Services | Skipped without EPO_OPS_API_KEY |
| OpenFDADevicesAdapter | openFDA devices | FDA 510(k) / PMA clearances |
| OpenFDADrugAdapter | openFDA drug | NDA / ANDA approvals |
| SecEdgarAdapter | SEC EDGAR full-text | Funding / M&A signals |
| CordisAdapter | EU CORDIS | Horizon grants |
| NihReporterAdapter | NIH Reporter | US NIH grants |
| OpenAlexAdapter | OpenAlex | 250M+ academic works |
| PubmedAdapter | PubMed / MEDLINE | Biomedical literature |
| EuClinicalTrialsAdapter | EudraCT | EU trial registrations |

To add a new source, subclass core.sources.base.SourceAdapter, return EvidenceItems, and register the adapter in build_default_adapters.
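
The shape of such an adapter might look like the sketch below. The real base class and EvidenceItem are not shown in this README, so the `SourceAdapter` and `EvidenceItem` definitions here, the `search` method name, and its signature are hypothetical stand-ins; check core/sources/base.py for the actual interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class EvidenceItem:
    """Hypothetical stand-in; field names are assumptions."""
    source: str
    title: str
    url: str


class SourceAdapter(ABC):
    """Hypothetical stand-in for core.sources.base.SourceAdapter."""
    name = "base"

    @abstractmethod
    def search(self, terms: list[str]) -> list[EvidenceItem]:
        """Return evidence matching any of the search terms."""


class StaticCatalogAdapter(SourceAdapter):
    """Toy adapter that searches an in-memory list instead of a remote API."""
    name = "static_catalog"

    def __init__(self, records):
        self._records = records

    def search(self, terms):
        lowered = [t.lower() for t in terms]
        return [
            EvidenceItem(self.name, r["title"], r["url"])
            for r in self._records
            if any(t in r["title"].lower() for t in lowered)
        ]


adapter = StaticCatalogAdapter([
    {"title": "Glass syringe coating study", "url": "https://example.org/1"},
    {"title": "Polymer vial stability", "url": "https://example.org/2"},
])
found = adapter.search(["syringe"])
```

Registration would then be a matter of adding an instance to whatever collection build_default_adapters returns.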


LLM routing

providers/unified.py::UnifiedLLM. Three cost tiers configured independently:

| Tier | Default model | Used by |
| --- | --- | --- |
| quality | claude-sonnet-4-20250514 | PLAN, BRIEF, CRITIQUE |
| balanced | gpt-4o-mini | CHECK |
| cheap | deepseek-chat | TIME rationale, fallback |

Runtime guarantees:

  • Circuit breakers per provider (pybreaker): a failing provider trips for the rest of the request.
  • Health cache with 60 s TTL: unhealthy providers are skipped without a probe call.
  • Budget guard (providers/budget.py): pre-call cost estimate; the call is skipped if it would exceed MAX_BUDGET_USD.
  • Fallback chain: UnifiedLLM tries every configured provider in priority order before raising.
  • Ollama passthrough: QUALITY_PROVIDER=ollama + OLLAMA_HOST point at any on-prem instance.
  • DRY_RUN=true: returns deterministic placeholder JSON, no API call.
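
To make the fallback chain and budget guard concrete, here is a simplified sketch. It is not the UnifiedLLM implementation: the function, the exception class, and the flat cost estimate are invented for illustration, and circuit breaking and health caching are left out.

```python
class AllProvidersFailed(RuntimeError):
    """Raised when every provider was skipped or errored (hypothetical name)."""


def complete_with_fallback(prompt, providers, estimate_cost, spent_usd, max_budget_usd):
    """Try providers in priority order; skip any whose estimated cost would bust the budget."""
    errors = []
    for name, call in providers:
        cost = estimate_cost(name, prompt)
        if spent_usd + cost > max_budget_usd:
            errors.append(f"{name}: over budget")
            continue
        try:
            return name, call(prompt)
        except Exception as exc:  # fall through to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical provider callables, for demonstration only.
def flaky(prompt):
    raise ConnectionError("provider down")


def echo(prompt):
    return f"echo: {prompt}"


providers = [("anthropic", flaky), ("openai", echo)]
used, reply = complete_with_fallback(
    prompt="hello",
    providers=providers,
    estimate_cost=lambda name, prompt: 0.01,  # flat placeholder estimate
    spent_usd=0.0,
    max_budget_usd=1.0,
)
```

Here the first provider fails, so the second one answers; if every provider is skipped or fails, the caller gets a single exception listing each reason.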

API reference

apps/api/main.py. Base path served by nginx at /api/v1/.

| Method | Route | Description |
| --- | --- | --- |
| GET | /health | Liveness check |
| POST | /api/v1/runs | Start a new opportunity scan (returns immediately, async) |
| GET | /api/v1/runs/{run_id} | Fetch run status + result |
| GET | /api/v1/runs/ | List recent runs |
| GET | /api/v1/events/{run_id} | SSE stream of pipeline stage transitions |
| POST | /api/v1/schedules | Create a cron schedule |
| GET | /api/v1/schedules | List schedules |
| DELETE | /api/v1/schedules/{id} | Delete a schedule |
| POST | /api/v1/schedules/{id}/trigger | Fire a schedule once, outside cron cadence |

The React SPA renders the SSE events as live scan progress. Pages that only read from /api/v1/runs (Dashboard, OpportunityDeepDive, StrategicBets, RadarFeed) show an empty state until a scan completes.
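
For clients other than the SPA, the SSE stream can be consumed with a small parser like the one sketched below. The wire format handling follows the standard Server-Sent Events rules, but the event payload shape is an assumption: this README does not document it, so the JSON bodies with a `stage` key are purely illustrative.

```python
import json


def parse_sse(lines):
    """Yield the decoded JSON payload of each complete SSE event from an iterable of lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield json.loads("\n".join(data))
                data = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # One optional leading space after the colon is stripped.
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream contents; real events may carry different fields.
stream = [
    "event: stage\n",
    'data: {"stage": "PLAN"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"stage": "RESEARCH"}\n',
    "\n",
]
stages = [event["stage"] for event in parse_sse(stream)]
```

The same generator works on any iterable of text lines, for example the line iterator of a streaming HTTP response from GET /api/v1/events/{run_id}.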


Database schema

core/persistence.py. SQLAlchemy 2.x. SQLite by default; Postgres via DATABASE_URL.

runs — one row per pipeline execution:

| Column | Type | Notes |
| --- | --- | --- |
| run_id | UUID | PK |
| status | enum | pending / running / done / error |
| focus_area | text | User-supplied query |
| created_at / completed_at | timestamp | |
| current_stage | text | Last pipeline stage name |
| result | JSON | Full RankedBrief payload |
| error | text | Traceback on failure |
| triggered_by | text | manual or schedule |
| schedule_id | FK → schedules | Nullable |

schedules — cron-triggered scans:

| Column | Type |
| --- | --- |
| schedule_id | UUID PK |
| name, focus_area, cron, enabled | text / text / text / bool |
| created_at, last_run_at, last_run_id | timestamps + FK |
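
The two tables can be approximated with plain SQLite DDL, as in the sketch below. This is not the project's SQLAlchemy model code: the storage types (UUIDs and JSON as TEXT, booleans as INTEGER), the NOT NULL choices, and the CHECK constraints are assumptions layered on the column lists above.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE schedules (
    schedule_id TEXT PRIMARY KEY,               -- UUID stored as text
    name        TEXT NOT NULL,
    focus_area  TEXT NOT NULL,
    cron        TEXT NOT NULL,
    enabled     INTEGER NOT NULL DEFAULT 1,     -- bool
    created_at  TEXT DEFAULT CURRENT_TIMESTAMP,
    last_run_at TEXT,
    last_run_id TEXT                            -- points at runs.run_id
);
CREATE TABLE runs (
    run_id        TEXT PRIMARY KEY,
    status        TEXT NOT NULL
                  CHECK (status IN ('pending', 'running', 'done', 'error')),
    focus_area    TEXT NOT NULL,
    created_at    TEXT DEFAULT CURRENT_TIMESTAMP,
    completed_at  TEXT,
    current_stage TEXT,
    result        TEXT,                         -- JSON payload serialised as text
    error         TEXT,
    triggered_by  TEXT NOT NULL
                  CHECK (triggered_by IN ('manual', 'schedule')),
    schedule_id   TEXT REFERENCES schedules(schedule_id)  -- nullable
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

run_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO runs (run_id, status, focus_area, triggered_by) "
    "VALUES (?, 'pending', ?, 'manual')",
    (run_id, "drug delivery glass"),
)
row = conn.execute(
    "SELECT status, schedule_id FROM runs WHERE run_id = ?", (run_id,)
).fetchone()
```

A manually triggered run has no schedule, which is why schedule_id is nullable.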

Frontend

frontend/ — React 18 + Vite + TypeScript + Tailwind, served as a static SPA by the nginx layer in the API container.

| Path | Page | Backed by |
| --- | --- | --- |
| / | Dashboard | GET /runs |
| /scan/:runId | ScanMonitor | GET /runs/:id + SSE /events/:id |
| /opportunity/:id | OpportunityDeepDive | GET /runs (client-side lookup) |
| /bets | StrategicBets | GET /runs |
| /radar | RadarFeed | GET /runs |
| /schedules | Schedules | GET/POST/DELETE /schedules |
| /research | ResearchAgent | Ad-hoc manual query (client-side) |
| /competencies | Competencies | YAML editor (client-side) |
| /admin | Admin | Settings UI (client-side) |
| /login | Login | sessionStorage gate using VITE_AUTH_USERS |

The auth gate is a static credential check (set VITE_AUTH_USERS=user:pass,... in frontend/.env) — there is no backend auth endpoint yet.


Install (without Docker)

git clone https://github.com/W00DSRULES/launch.git
cd launch
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env

# backend
uvicorn apps.api.main:app --reload
# frontend (separate terminal)
cd frontend && npm install && npm run dev

Python 3.11+ required. SQLite + an embedded Qdrant are used by default — zero external services needed for a first run.


Develop

make test         # pytest
make lint         # ruff + black
make typecheck    # mypy on core/ + providers/
make format       # ruff --fix + black

Pre-commit hooks (.pre-commit-config.yaml) run ruff + black on every commit.


License

MIT — see LICENSE.
