mutavax

DISCLAIMER: This software is provided for research and educational purposes only. Not intended for clinical or veterinary use. No warranty of fitness for any particular purpose.

Design your own mRNA cancer vaccine.

mutavax is an open studio for designing personalized mRNA cancer vaccines — for dogs, cats, and humans. Sequence a tumor and a healthy sample, run the studio on your own machine, hand the design to a manufacturer.

Site: https://mutavax.straehhuber.com

Pick a species	Stage the samples	Run alignment	Find the mutations	Read what they mean

Score the neoantigens	Curate the cassette	Design the construct	Hand off the vaccine

Sample. Compute. Design.

Sample. Sequence a tumor and a matched healthy sample at any standard lab. Two sequencing files — that's the whole input.

Compute. Run mutavax on your machine. Eight guided stages compare tumor vs. healthy, find the cancer-specific mutations, and design the vaccine. ≈12 hours on a workstation.

Design. Send the finished design to a GMP manufacturer. A vial arrives roughly ten days later.

Eight stages. Twelve hours. One vaccine.

#	Stage	State	Tools
1	Ingestion	Live	samtools, pigz, fastp
2	Alignment	Live — chunked stop-and-resume on commodity hardware	strobealign, samtools
3	Variant Calling	Live — karyogram + plain-English filter buckets, Broad 1000G panel-of-normals on human runs	GATK Mutect2 (GPU via NVIDIA Parabricks when available)
4	Annotation	Live — cancer-gene cards + lollipop plot	Ensembl VEP 111
5	Neoantigen Prediction	Live — binding buckets + peptide × allele heatmap + antigen funnel	pVACseq 5.4.0, MHCflurry 2.0 (default, license-free) or NetMHCpan 4.2, NetMHCIIpan 4.3
6	Epitope Selection	Live — 8-slot cassette curation UI	pVACview + custom scoring
7	mRNA Construct Design	Live — molecule hero + λ slider trading CAI vs. MFE + codon swap preview + 7/7 manufacturability checks	LinearDesign, DNAchisel, ViennaRNA
8	Construct Output	Live — color-coded FASTA with FASTA/GenBank/JSON downloads, CMO release flow, vet dosing, audit trail	pVACvector, Biopython

Every live stage can be paused and resumed. Progress is reported honestly, and tool names live in the expert drawer.

What you'll need

Inputs

Tumor + matched-normal sequencing for one patient. FASTQ, BAM, or CRAM. ≥30× coverage for confident somatic variant calling.
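The 30× threshold can be sanity-checked before a run with the usual depth estimate: mean coverage ≈ total sequenced bases / genome size. The sketch below is illustrative only. The function names are hypothetical, and the 3.1 Gb default is an approximate human genome size, so pass your own value for dog or cat.

```python
import math


def estimate_coverage(read_pairs: int, read_length: int,
                      genome_size_bp: float = 3.1e9) -> float:
    """Mean depth = total sequenced bases / genome size (two reads per pair)."""
    total_bases = read_pairs * 2 * read_length
    return total_bases / genome_size_bp


def read_pairs_needed(target_coverage: float, read_length: int,
                      genome_size_bp: float = 3.1e9) -> int:
    """Smallest number of read pairs that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size_bp / (2 * read_length))
```

For example, `read_pairs_needed(30, 150)` gives the number of 2 × 150 bp read pairs to ask the sequencing lab for when targeting 30× on a human sample. Run the estimate separately for the tumor and the normal sample.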

Hardware

Component	Recommended
RAM	64 GB (strobealign indexing of the human reference needs around 31 GB free at its peak)
CPU	16 cores
Disk	1 TB SSD — a 30× human WGS costs ~400 GB (deduped BAMs + FASTQs); multiple cases share the ~55 GB reference + VEP cache + PON footprint
GPU	NVIDIA Ampere+ (RTX 3090 / 4090 / A-series / H-series) — Parabricks accelerates stage 3 Mutect2 ~10× (opt-in)
OS	Linux
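The figures in the table can be turned into a quick host check. The sketch below is not part of mutavax; the function names are hypothetical, and the thresholds are simply the numbers quoted above (about 31 GB of free RAM for indexing, 16 cores, roughly 55 GB shared plus 400 GB per 30× human case). Reading `/proc/meminfo` assumes a Linux host.

```python
import os
import shutil

# Figures quoted in the hardware table; treat them as rough planning numbers.
INDEX_PEAK_RAM_GB = 31      # free RAM needed while strobealign builds its index
RECOMMENDED_CORES = 16
SHARED_DISK_GB = 55         # reference + VEP cache + PON, shared across cases
PER_CASE_DISK_GB = 400      # one 30x human WGS case


def disk_needed_gb(n_cases: int) -> int:
    """Disk budget: shared footprint plus about 400 GB per 30x human case."""
    return SHARED_DISK_GB + PER_CASE_DISK_GB * n_cases


def available_ram_gb(meminfo_path: str = "/proc/meminfo"):
    """Available RAM in GiB from Linux /proc/meminfo, or None if unreadable."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemAvailable:"):
                    return int(line.split()[1]) / 1024**2  # value is in kiB
    except OSError:
        pass
    return None


def check_hardware(ram_gb, cpu_cores, disk_free_gb, n_cases=1):
    """Return a list of warnings; an empty list means the host looks adequate."""
    warnings = []
    if ram_gb is None or ram_gb < INDEX_PEAK_RAM_GB:
        warnings.append(f"RAM: {ram_gb} GB free, about {INDEX_PEAK_RAM_GB} GB needed for indexing")
    if cpu_cores is None or cpu_cores < RECOMMENDED_CORES:
        warnings.append(f"CPU: {cpu_cores} cores, {RECOMMENDED_CORES} recommended")
    needed = disk_needed_gb(n_cases)
    if disk_free_gb < needed:
        warnings.append(f"Disk: {disk_free_gb:.0f} GB free, about {needed} GB needed for {n_cases} case(s)")
    return warnings


def check_this_host(data_root: str = ".", n_cases: int = 1):
    """Measure the current machine and run the checks above."""
    free_gb = shutil.disk_usage(data_root).free / 1e9
    return check_hardware(available_ram_gb(), os.cpu_count(), free_gb, n_cases)
```

The disk arithmetic shows where the 1 TB recommendation comes from: `disk_needed_gb(2)` is 855 GB, so two human cases already fill most of the drive.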

Everything runs in a single Docker container: FastAPI backend + Next.js frontend + samtools + strobealign + GATK + VEP + pVACtools + MHCflurry + Parabricks base in one ~10 GB image. No cloud, no object storage.

Getting started

You don't need to clone this repo. Download the compose file below, run docker compose up -d, and open the browser.

1. Install Docker

Ubuntu / Debian / Linux Mint:

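# Install Docker Engine with Docker's convenience script
curl -fsSL https://get.docker.com | sudo bash
# Let your user run docker without sudo (log out and back in for this to take effect)
sudo usermod -aG docker "$USER"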

macOS / Windows: install Docker Desktop. For GPU-accelerated stage 3 variant calling on Linux, also install the NVIDIA Container Toolkit.

2. Create the compose file

mkdir ~/mutavax && cd ~/mutavax
curl -fsSL https://raw.githubusercontent.com/niach/mutavax/main/docker-compose.yml -o docker-compose.yml

The file pulls the pre-built image from GHCR (ghcr.io/niach/mutavax) — no build step on your machine.

3. Create a `.env` (optional)

Most users don't need one. Add it if you want to customize anything:

cat > .env <<'EOF'
# Where workspace artifacts, references, and the SQLite DB live. Default: ./data
# MUTAVAX_DATA_ROOT=./data

# Stage 9 AI review — only needed if you want the LLM review feature.
# ANTHROPIC_API_KEY=

# Switch the class-I predictor back to DTU NetMHCpan (default is MHCflurry).
# MUTAVAX_CLASS_I_PREDICTOR=NetMHCpan
EOF

See .env.example for the full list of overrides.
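Because every line in the template above is commented out, it changes nothing until you uncomment a setting. A small helper like the following can list which overrides are actually active. It is a simplified sketch with a hypothetical function name: it only handles plain `KEY=value` lines, whereas Docker Compose's own `.env` parsing supports more syntax (quoting rules, interpolation).

```python
def active_overrides(env_text: str) -> dict[str, str]:
    """Return KEY -> value for every uncommented KEY=value line."""
    overrides = {}
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop one layer of surrounding quotes, if present.
        overrides[key.strip()] = value.strip().strip('"').strip("'")
    return overrides
```

Calling `active_overrides(open(".env").read())` on the untouched template returns an empty dict; after uncommenting the predictor line it returns `{"MUTAVAX_CLASS_I_PREDICTOR": "NetMHCpan"}`.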

4. NetMHC binaries (optional for humans)

The compose file ships with MHCflurry as the default class-I predictor. It is a license-free alternative to NetMHCpan, validated to match NetMHCpan (AUC = 1.000) on the canonical tumor-antigen benchmark. Human users who only need class-I prediction in stages 1–5 need nothing else.

Opt in to the DTU NetMHC stack if you want:

non-human species (dog DLA / cat FLA — MHCflurry has no canine or feline training data), or
class-II neoantigen scoring (NetMHCIIpan has no license-free equivalent).

Both are free for academic use; commercial use needs a separate DTU license. Fill in the forms and download the Linux tarballs:

NetMHCpan 4.2 — https://services.healthtech.dtu.dk/services/NetMHCpan-4.2/
NetMHCIIpan 4.3 — https://services.healthtech.dtu.dk/services/NetMHCIIpan-4.3/

Extract them so the layout is:

./data/netmhc/
├── netMHCpan-4.2/
└── netMHCIIpan-4.3/

That directory is mounted read-only at /tools/src inside the backend container, which matches the NMHOME hardcoded in the stock DTU wrapper scripts, so no script edits are needed.
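A common mistake is leaving the tarballs unextracted. The layout above can be verified with a short check like the one below. This is a sketch, not mutavax's own preflight: the function name is hypothetical, and matching leftover tarballs by filename prefix is an assumption about how the downloads are named.

```python
from pathlib import Path

# Directory names from the expected layout under ./data/netmhc/.
EXPECTED_DIRS = ("netMHCpan-4.2", "netMHCIIpan-4.3")


def check_netmhc_layout(netmhc_root="./data/netmhc") -> list[str]:
    """Return a list of problems; an empty list means both tools are extracted."""
    root = Path(netmhc_root)
    problems = []
    for name in EXPECTED_DIRS:
        if (root / name).is_dir():
            continue
        # Look for a file with the same prefix, e.g. a tarball left in place.
        leftovers = []
        if root.is_dir():
            leftovers = sorted(p.name for p in root.glob(f"{name}*") if p.is_file())
        if leftovers:
            problems.append(f"{name}: found {leftovers[0]}, but it is not extracted")
        else:
            problems.append(f"{name}: directory missing under {root}")
    return problems
```

If you only need one of the two tools, ignore the warning for the other.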

5. Drop in your DNA

The backend auto-creates ./data/inbox/, ./data/workspaces/, ./data/references/, and ./data/vep-cache/ on first start. Drop your tumor + normal FASTQ / BAM / CRAM pair into ./data/inbox/ and the app registers them into a workspace.
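Before starting, it can help to confirm that the inbox holds what you expect. The sketch below groups inbox files by format. The function names and the suffix list are assumptions based on common naming conventions; the README does not spell out which filenames the app itself recognizes.

```python
from pathlib import Path

# Common filename suffixes for the three accepted formats (an assumption).
SUFFIXES = {
    "FASTQ": (".fastq", ".fq", ".fastq.gz", ".fq.gz"),
    "BAM": (".bam",),
    "CRAM": (".cram",),
}


def classify(filename: str):
    """Return 'FASTQ', 'BAM', 'CRAM', or None for an unrecognized name."""
    lower = filename.lower()
    for fmt, suffixes in SUFFIXES.items():
        if lower.endswith(suffixes):
            return fmt
    return None


def summarize_inbox(inbox="./data/inbox") -> dict[str, list[str]]:
    """Group the regular files in the inbox by sequencing format."""
    summary: dict[str, list[str]] = {}
    for path in sorted(Path(inbox).iterdir()):
        fmt = classify(path.name) if path.is_file() else None
        if fmt:
            summary.setdefault(fmt, []).append(path.name)
    return summary
```

An empty result, or only one sample's files, is a sign that something did not finish copying.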

6. Start

docker compose up -d

Open http://localhost:3000. Create a workspace, pick a species, follow the stages.
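The container may take a moment to start serving after `docker compose up -d` returns. If you script the setup, a small polling loop avoids opening the browser too early. This is a generic sketch with hypothetical function names; it only assumes that the UI answers plain HTTP on port 3000 as stated above.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout_s: float = 2.0) -> bool:
    """True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe, attempts: int = 30, delay_s: float = 2.0,
                     sleep=time.sleep) -> bool:
    """Call probe() up to `attempts` times, pausing between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Usage: `wait_until_ready(lambda: http_ok("http://localhost:3000"))` returns `True` once the UI responds, or `False` after about a minute of failed attempts.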

LAN access: The web UI binds to 0.0.0.0:3000, so any other machine on your network can hit http://<server-ip>:3000. Put it behind Caddy/Traefik if you want TLS.

GPU-accelerated stage 3 (opt-in):

curl -fsSL https://raw.githubusercontent.com/niach/mutavax/main/docker-compose.gpu.yml -o docker-compose.gpu.yml
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

Requires NVIDIA drivers + the NVIDIA Container Toolkit.

Reference genomes

Species reference — GRCh38 (human), UU_Cfam_GSD_1.0 (dog), or Felis_catus_9.0 (cat) — is auto-downloaded on the first alignment run. Cached under ./data/references/, shared across workspaces.

Panel-of-normals (human only)

Human workspaces apply the Broad's 1000 Genomes panel-of-normals to Mutect2 to filter recurrent artefacts and low-frequency germline variants. The VCF is auto-downloaded, renamed from UCSC to Ensembl contigs, and indexed on first variant-calling run. Lives under ./data/references/pon/grch38/. Set MUTAVAX_PON_GRCH38_VCF="" in .env to disable.
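The contig renaming matters because UCSC-style names (`chr1`, `chrM`) and Ensembl-style names (`1`, `MT`) do not match, and a PON whose names differ from the reference cannot be applied. The sketch below builds the mapping for the primary human chromosomes only. It illustrates the idea and is not mutavax's actual code; alternate and unplaced contigs follow different naming rules and are deliberately left out.

```python
# Primary human chromosomes in Ensembl naming (without the "chr" prefix).
PRIMARY = [str(n) for n in range(1, 23)] + ["X", "Y"]


def ucsc_to_ensembl_map() -> dict[str, str]:
    """Map UCSC-style contig names to Ensembl-style names."""
    mapping = {f"chr{name}": name for name in PRIMARY}
    mapping["chrM"] = "MT"  # the mitochondrial contig is renamed, not just stripped
    return mapping


def rename_table() -> str:
    """Two-column 'old<TAB>new' table, one contig per line."""
    return "".join(f"{old}\t{new}\n" for old, new in ucsc_to_ensembl_map().items())
```

A two-column table like this is the kind of input that `bcftools annotate --rename-chrs` accepts, which is one common way to apply such a rename. Whether mutavax uses that tool is not stated here.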

Dog and cat workspaces skip the PON (no curated canine / feline panel exists yet).

Troubleshooting

Alignment refuses to start with "insufficient memory." Indexing the human reference peaks around 31 GB of RAM. Either free some up, or drop a prebuilt index into ./data/references/ (see the contributors section below).

Stage 5 preflight says a NetMHC binary is missing. You set MUTAVAX_CLASS_I_PREDICTOR=NetMHCpan but didn't drop the tarballs in. Check ls ./data/netmhc/ — should contain netMHCpan-4.2/ and netMHCIIpan-4.3/ as directories, not tarballs.

Stage 5 finishes with zero peptides. The most likely cause is that pvacseq did not recognize your patient's alleles. The Patient MHC panel marks unrecognized alleles with a strikethrough and a SKIPPED pill. For dog, pvacseq only recognizes a handful of DLA-88 alleles and no class-II alleles.

Annotation complains about missing TSL fields. Rerun stage 4 on the workspace — older annotations predate the --tsl flag and need refreshing.

For contributors

Clone the repo for source-level work:

git clone https://github.com/niach/mutavax.git
cd mutavax
npm install

Frontend: Next.js 15, React 19, TypeScript, Tailwind. Backend: FastAPI + SQLAlchemy, all bioinformatics tools in one Docker image, SQLite under ./data/.

Dev workflow — hot-reload the backend from the cloned source, run the Next.js dev server on the host:

docker compose -f docker-compose.yml -f docker-compose.dev.yml up    # backend with --reload on :8000
npm run dev                                                          # next dev on :3000

Set NEXT_PUBLIC_API_URL=http://localhost:8000 in your .env for this workflow so the browser hits the native uvicorn instead of the same-origin /backend proxy.

Fast tests (lint + TS + backend non-integration):

npm run test:fast

Browser and live real-data paths:

npx playwright install chromium
npm run test:integration
npm run sample-data:smoke
npm run test:backend:real-data
npm run test:browser:real-data

Sample datasets for smoke and full validation runs:

npm run sample-data:smoke                 # COLO829 smoke (~50k read pairs per lane)
npm run sample-data:full                  # COLO829 full 100x WGS (~174 GB)
npm run sample-data:alignment             # BAM/CRAM normalization fixture
python3 scripts/fetch_canine_dlbcl_sample_data.py         # canine DLBCL smoke
python3 scripts/fetch_canine_dlbcl_sample_data.py --mode full  # full DLBCL1 pair (~45 GB)

Regenerate the screenshots in this README (frontend + backend must be running):

# Stages 1–5 need a real completed pipeline run; point the script at that workspace.
node scripts/take-screenshots.mjs <workspace-id>

# Stages 6–8 can be captured from a synthetic demo workspace that skips the heavy
# bioinformatics (inserts minimum DB stubs only — not suitable for any real run).
docker cp scripts/seed_demo_workspace.py mutavax:/tmp/seed.py
WORKSPACE_ID=$(docker exec mutavax python /tmp/seed.py)
node scripts/take-screenshots.mjs --stages=6,7,8 "$WORKSPACE_ID"

Alignment compute knobs (chunk size, per-chunk aligner threads, samtools sort memory, parallel chunks) are tunable from the UI's Compute Settings drawer on the alignment stage — no env file edit needed. They persist to ./data/settings.json.

Full list of env overrides lives in .env.example.

Credits

mutavax is inspired by Paul Conyngham's 2025 personalized mRNA vaccine for his dog Rosie (mast cell cancer, 75% tumor shrinkage). His pipeline — BWA-MEM2 → Mutect2 → VEP → pVACseq with NetMHCpan — proved the approach works on a single-patient, single-desktop scale. mutavax is an attempt to make that pipeline accessible as a guided workspace, species-flexible by default.

Built on the shoulders of:

pVACtools (Griffith Lab)
MHCflurry (openvax) — license-free class-I binding predictor
NetMHCpan / NetMHCIIpan (DTU Health Tech)
Ensembl VEP + its pVACseq-ready plugins (Frameshift, Wildtype, Downstream)
GATK Mutect2 and NVIDIA Parabricks
strobealign, samtools, pigz
