TrialThread

An AI agent that finds clinical trials people would never find on their own — and explains, in plain English, why each one may or may not fit.

Live instance: trialthread.org (also reachable via trialthread.com). Free, no account, and nothing you type is stored.

Why

Every recruiting trial in the United States is publicly listed on clinicaltrials.gov, and almost nobody can find the one that fits. The registry speaks in eligibility criteria; patients speak in plain language. The translation layer between them has mostly lived inside the heads of research nurses and well-connected oncologists. This project makes that layer free. The longer version is on the About page.

How it works

Patient description → structured profile → live registry search → LLM eligibility screening → adaptive broadening → grounded explanations. No database, no accounts, no stored patient data.

Extraction (lib/extract.ts): Claude parses a free-text description into a structured profile: condition, stage, biomarkers, prior treatments, age, location (LLM city-level geocode), and honest red flags.
Search (lib/ctgov.ts): live queries against the clinicaltrials.gov v2 API (condition + RECRUITING + geo radius), normalized with haversine site distances. There is no ingest pipeline, so results reflect the registry at query time.
Triage (lib/score.ts): a fast model screens every candidate in parallel batches and labels each one strong, possible, or unlikely.
The loop (lib/loop.ts): if strong candidates are thin, the search widens deliberately: geography (100 mi → 300 mi → national), then parent condition terms, then a biomarker basket hunt across solid-tumor trials. Every pass streams to the UI as a visible ledger.
Deep parse (lib/score.ts): a stronger model reads the full criteria of the top candidates and produces criterion-grounded output: "Requires X; you reported Y," concerns phrased as things to check, and questions to bring to your doctor.
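
To make the Search and loop steps concrete, here is a minimal sketch of a haversine distance and a broadening ladder. It is an illustration under stated assumptions, not the code in lib/ctgov.ts or lib/loop.ts: every type and function name below is hypothetical, the stopping threshold of three strong candidates is an assumption, and the search function is injected so the sketch makes no network calls.

```typescript
// Hypothetical names throughout; not taken from lib/ctgov.ts or lib/loop.ts.
type Verdict = "strong" | "possible" | "unlikely";
interface LatLon { lat: number; lon: number }
interface Candidate { nctId: string; verdict: Verdict }
interface Pass { label: string; radiusMiles: number | null; term: string }
interface LedgerEntry { pass: string; found: number; strongSoFar: number }

const EARTH_RADIUS_MILES = 3958.8; // mean Earth radius

// Great-circle distance between two points, using the haversine formula.
function haversineMiles(a: LatLon, b: LatLon): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.pow(Math.sin(dLat / 2), 2) +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.pow(Math.sin(dLon / 2), 2);
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h));
}

// The widening order described above: geography first, then parent
// condition terms, then the biomarker basket hunt. A null radius means national.
function buildLadder(condition: string, parentTerm: string, biomarker: string): Pass[] {
  return [
    { label: "100 mi", radiusMiles: 100, term: condition },
    { label: "300 mi", radiusMiles: 300, term: condition },
    { label: "national", radiusMiles: null, term: condition },
    { label: "parent condition", radiusMiles: null, term: parentTerm },
    { label: "biomarker basket", radiusMiles: null, term: biomarker },
  ];
}

// Run passes in order, de-duplicating by trial ID, and stop as soon as
// enough strong candidates exist. minStrong = 3 is an assumed threshold.
async function broaden(
  ladder: Pass[],
  search: (pass: Pass) => Promise<Candidate[]>,
  minStrong = 3,
): Promise<{ candidates: Candidate[]; ledger: LedgerEntry[] }> {
  const seen = new Map<string, Candidate>();
  const ledger: LedgerEntry[] = [];
  for (const pass of ladder) {
    const found = await search(pass);
    for (const c of found) {
      if (!seen.has(c.nctId)) seen.set(c.nctId, c);
    }
    const strongSoFar = Array.from(seen.values()).filter((c) => c.verdict === "strong").length;
    ledger.push({ pass: pass.label, found: found.length, strongSoFar });
    if (strongSoFar >= minStrong) break; // enough strong matches: stop widening
  }
  return { candidates: Array.from(seen.values()), ledger };
}
```

In this sketch the ledger array plays the role of the visible ledger: one entry per pass, in the order the passes ran. A caller would supply a search function that queries the registry and triages the results, for example one that keeps only sites where haversineMiles(patient, site) is within pass.radiusMiles.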

Eligibility is a constraint-satisfaction problem, not a similarity problem — "HER2-positive required" and "prior HER2 therapy excluded" embed almost identically, which is why the ranking authority here is criteria reading, not vector distance. A semantic recall arm (embedding LLM-synthesized "who this trial wants" archetypes) is the planned v2 addition for trials whose criteria are phrased unlike the diagnosis.
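
A toy sketch of why polarity matters: the two HER2 criteria above mention the same biomarker, yet one is a requirement and the other an exclusion, so the same patient fact can satisfy one and violate the other. In the pipeline described above, an LLM reads free-text criteria; the hand-structured Criterion type and the screen function below are hypothetical and exist only to illustrate the constraint check.

```typescript
// Hypothetical structures for illustration; the real pipeline has an LLM
// read free-text criteria rather than pre-structured ones.
type Polarity = "required" | "excluded";
interface Criterion { fact: string; polarity: Polarity; text: string }
interface Finding { text: string; status: "met" | "violated" | "check"; note: string }

// Evaluate each criterion against what the patient reported.
// An unreported fact is never guessed: it becomes something to check.
function screen(criteria: Criterion[], reported: Map<string, boolean>): Finding[] {
  return criteria.map((c): Finding => {
    const value = reported.get(c.fact);
    if (value === undefined) {
      return { text: c.text, status: "check", note: `Not reported: ask your doctor about ${c.fact}` };
    }
    // A required fact must be true; an excluded fact must be false.
    const ok = c.polarity === "required" ? value : !value;
    const verb = c.polarity === "required" ? "Requires" : "Excludes";
    return {
      text: c.text,
      status: ok ? "met" : "violated",
      note: `${verb} ${c.fact}; you reported ${value ? "yes" : "no"}`,
    };
  });
}

const exampleCriteria: Criterion[] = [
  { fact: "HER2-positive", polarity: "required", text: "HER2-positive required" },
  { fact: "prior HER2 therapy", polarity: "excluded", text: "prior HER2 therapy excluded" },
  { fact: "brain metastases", polarity: "excluded", text: "active brain metastases excluded" },
];

const examplePatient = new Map<string, boolean>([
  ["HER2-positive", true],
  ["prior HER2 therapy", true],
]);

const exampleFindings = screen(exampleCriteria, examplePatient);
```

For this patient the first criterion is met, the second is violated, and the third is flagged as something to check, which is a distinction a similarity score alone would not surface.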

Run it yourself

npm install
cp .env.example .env   # add your ANTHROPIC_API_KEY
npm run dev

TRIALTHREAD_MOCK=1 runs the full pipeline with canned LLM responses (no key needed) — useful for testing the search loop and UI. Model choices are env-configurable (TT_EXTRACT_MODEL, TT_TRIAGE_MODEL, TT_DEEP_MODEL).
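
As a rough sketch of how these variables could be read, assuming nothing about the repo's actual config code: the loadConfig and needsApiKey functions and the placeholder default model IDs below are hypothetical. Only the environment variable names come from this README.

```typescript
// Hypothetical config loader; only the variable names come from the README.
interface PipelineConfig {
  mock: boolean;
  extractModel: string;
  triageModel: string;
  deepModel: string;
}

type Env = Record<string, string | undefined>;

// Placeholder defaults, NOT real model IDs: substitute the values you use.
const PLACEHOLDER_MODELS = {
  extract: "placeholder-extract-model",
  triage: "placeholder-fast-model",
  deep: "placeholder-strong-model",
};

function loadConfig(env: Env): PipelineConfig {
  return {
    mock: env.TRIALTHREAD_MOCK === "1",
    // `||` rather than `??` so an empty string also falls back to the default.
    extractModel: env.TT_EXTRACT_MODEL || PLACEHOLDER_MODELS.extract,
    triageModel: env.TT_TRIAGE_MODEL || PLACEHOLDER_MODELS.triage,
    deepModel: env.TT_DEEP_MODEL || PLACEHOLDER_MODELS.deep,
  };
}

// Mock mode uses canned LLM responses, so a key is only needed otherwise.
function needsApiKey(env: Env): boolean {
  return env.TRIALTHREAD_MOCK !== "1" && !env.ANTHROPIC_API_KEY;
}
```

In a Node process this would be called as loadConfig(process.env). Taking the environment as a parameter keeps the sketch testable without touching real environment variables.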

What this is not

Not medical advice. Not a determination of eligibility — only a trial team can make that call. Summaries are AI-generated and can contain errors; the official clinicaltrials.gov listing is always authoritative. The hosted instance stores no patient data, and this repo contains no tracking of any kind. Keep it that way in forks that serve real patients.

Project boundary

The matching engine, which is everything in this repo, is open source under Apache-2.0. If a referral layer (patient-consented warm handoffs to trial sites, under standard fair-market-value (FMV) recruitment agreements) is ever built, it will live in a separate private service, fees will come from sites and sponsors, and patients will never pay. That boundary is documented here on purpose.

Contributing

The most valuable contribution is not code — it is clinician-reviewed synthetic test vignettes that let us measure match quality. See CONTRIBUTING.md. Never include real patient information in issues or pull requests.

Support

Hosting and inference for the free instance cost real money (Vercel + Claude API) — the running bill is public in COSTS.md, so "every dollar goes to servers and tokens" is checkable, not asserted. A GitHub Sponsors link is coming; until then, the best support is a test vignette, a bug report, or telling one oncology social worker this exists.

Prior art and lineage

TrialGPT (NIH) validated LLM criterion-level matching. TrialMatchAI (Nature Communications, 2026) published the RAG variant. ClinTrialFinder open-sourced a comparable pipeline. The iterative search pattern follows the Karpathy AutoResearch loop. TrialThread's contribution is the product shape: stateless, patient-first, visible reasoning, honest verdicts.

License

Apache-2.0 — Eric Porres, 2026.
