matbanik/agentic-genomics

Agentic Genomics

Connect your whole-genome sequencing data to AI coding assistants via the Model Context Protocol (MCP).

Query personal VCF files — variants, genes, pharmacogenomics, GWAS traits, polygenic risk scores, and ClinVar pathogenicity — directly from your IDE using natural language. Supports trio analysis (patient + parents) for inheritance determination.
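
To make "inheritance determination" concrete, here is a minimal Python sketch of the reasoning a trio comparison relies on at a single site. It is an illustration only, not GeneChat's implementation: the function names and result labels are hypothetical, genotypes are assumed to be VCF-style `GT` strings such as `0/1`, and every non-reference allele is treated alike, so multi-allelic sites would need allele-level matching.

```python
import re


def _alleles(gt):
    """Split a VCF GT string such as '0/1' or '1|0' into its allele codes."""
    return re.split(r"[/|]", gt.strip())


def _is_missing(gt):
    """True when every allele is '.', i.e. the genotype was not called."""
    return all(a == "." for a in _alleles(gt))


def _carries_alt(gt):
    """True when at least one called allele is non-reference."""
    return any(a not in ("0", ".") for a in _alleles(gt))


def classify_origin(child_gt, father_gt, mother_gt):
    """Label the likely origin of a child's non-reference allele (hypothetical labels)."""
    if _is_missing(child_gt) or not _carries_alt(child_gt):
        return "not carried by child"
    if _is_missing(father_gt) or _is_missing(mother_gt):
        return "undetermined"
    father, mother = _carries_alt(father_gt), _carries_alt(mother_gt)
    if father and mother:
        return "carried by both parents"
    if father:
        return "paternal"
    if mother:
        return "maternal"
    # Neither parent carries the allele: a candidate only, since parental
    # coverage gaps and sequencing errors produce the same pattern.
    return "candidate de novo"
```

For example, `classify_origin("0/1", "0/0", "0/1")` returns `"maternal"`. A "candidate de novo" result should be checked against read depth in both parents before it is trusted.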

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│  IDE (Antigravity / Codex / Claude Code)     Windows 11              │
│  ┌──────────────────────┐  ┌─────────────────────┐  ┌────────────┐ │
│  │  genechat-mcp        │  │  opencravat-mcp     │  │  pomera    │ │
│  │  WSL2 → Ubuntu       │  │  cloud SSE          │  │  Windows   │ │
│  │  LOCAL VCF queries   │  │  REMOTE variant     │  │  Python    │ │
│  │  ClinVar, SnpEff,    │  │  annotation (VEST,  │  │  Notes,    │ │
│  │  PGx, GWAS, PRS      │  │  REVEL, gnomAD…)    │  │  Search,   │ │
│  │                      │  │                     │  │  AI Tools  │ │
│  └──────┬───────────────┘  └─────────────────────┘  └────────────┘ │
│         │                                                          │
│    ┌────▼────────────────────────┐                                 │
│    │  C:\Users\<user>\           │                                 │
│    │    genechat-data\           │                                 │
│    │      Mat\     *.vcf.gz      │                                 │
│    │      Father\  *.vcf.gz      │                                 │
│    │      Mother\  *.vcf.gz      │                                 │
│    └─────────────────────────────┘                                 │
└──────────────────────────────────────────────────────────────────────┘
| Server | Runs | Data Stays | Purpose |
|---|---|---|---|
| genechat-mcp | WSL2 (local) | On your machine | Query VCFs: variants, genes, PGx, GWAS, PRS, ClinVar |
| opencravat-mcp | Cloud SSE | Variant coordinates sent to mcp.opencravat.org | Deep annotation: VEST, REVEL, CADD, gnomAD |
| pomera | Windows (local) | On your machine | Cross-session notes, web search, AI research tools |
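
Because genechat-mcp runs inside WSL2 while the VCFs sit under a Windows folder, paths have to be expressed in WSL form when they are handed to the server. The Python sketch below shows the translation, assuming WSL's default behaviour of mounting Windows drives under `/mnt/<drive letter>`. The helper name is hypothetical, and `alice` in the usage note stands in for `<user>`.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(win_path):
    """Translate an absolute Windows path to its default WSL2 mount path."""
    path = PureWindowsPath(win_path)
    # Require a plain drive-letter path such as C:\...; UNC shares and
    # relative paths have no /mnt/<letter> equivalent.
    if len(path.drive) != 2 or not path.root:
        raise ValueError(f"not an absolute drive-letter path: {win_path!r}")
    drive_letter = path.drive[0].lower()
    # parts[0] is the drive plus root ('C:\\'); the rest are folder names.
    return str(PurePosixPath("/mnt", drive_letter, *path.parts[1:]))
```

For example, `windows_to_wsl_path(r"C:\Users\alice\genechat-data\Mat")` gives `/mnt/c/Users/alice/genechat-data/Mat`. If automounting has been customised in `/etc/wsl.conf`, the `/mnt` prefix may differ.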

Supported IDEs

Antigravity, Codex, and Claude Code. Per-IDE setup is covered in IDE Configuration.

Quick Start

Full instructions with copy-paste commands are in the linked guides below.

  1. Install WSL2 + Ubuntu + Miniforge + bio tools + GeneChat CLI → Installation
  2. Prepare VCFs from Nebula / Sequencing.com (bgzip, index, validate) → VCF Preparation
  3. Configure IDE — wire MCP servers, register genomes, install databases → IDE Configuration
  4. Run workflows — trio analysis, variant annotation, cross-session persistence → Workflows
  5. Troubleshoot → Troubleshooting
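
Step 2 calls for bgzip-compressed, indexed VCFs. As a rough pre-flight check before registering a genome, the Python sketch below tests whether a file begins with a BGZF block (the format bgzip writes, as opposed to plain gzip) and whether a `.tbi` or `.csi` index sits next to it. The function names are hypothetical, and the check reads only the first block header, so it does not replace the validation described in VCF Preparation.

```python
import struct
from pathlib import Path


def is_bgzf(path):
    """Return True if the file starts with a BGZF block (bgzip output)."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes, deflate method, and the FEXTRA flag (bit 2).
        if header[0:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
            return False
        (extra_len,) = struct.unpack("<H", header[10:12])
        extra = handle.read(extra_len)
    # Walk the extra subfields looking for the 'BC' subfield BGZF requires.
    pos = 0
    while pos + 4 <= len(extra):
        (sub_len,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if extra[pos:pos + 2] == b"BC" and sub_len == 2:
            return True
        pos += 4 + sub_len
    return False


def has_index(path):
    """Return True if a tabix (.tbi) or CSI (.csi) index sits beside the file."""
    return any(Path(f"{path}{suffix}").exists() for suffix in (".tbi", ".csi"))
```

A file compressed with ordinary `gzip` fails the first check. It needs to be decompressed and recompressed with `bgzip` before it can be indexed.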

Documentation

| # | Guide | Description |
|---|---|---|
| 0 | Setup Guide | Start here: overview, prerequisites, reading order |
| 1 | Installation | WSL2, Miniforge, bcftools, SnpEff, GeneChat CLI |
| 2 | VCF Preparation | Nebula / Sequencing.com formats, bgzip, contig handling |
| 3 | IDE Configuration | serve.sh, Antigravity / Codex / Claude Code configs, Pomera, genome registration |
| 4 | Workflows | Trio analysis skill, gene investigation, variant annotation, documentation patterns |
| 5 | Troubleshooting | Common errors, verification checklist, recovery procedures |

Prerequisites

| Component | Minimum | Purpose |
|---|---|---|
| Windows | 11 with WSL2 | Host OS |
| Python | 3.10+ | Pomera MCP server |
| Node.js | v18+ | mcp-remote proxy, Pomera npm wrapper |
| RAM | 16 GB | SnpEff needs 8 GB heap |
| Disk | ~10 GB | VCFs + ClinVar/SnpEff/GWAS databases |
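
The Windows-side version requirements can be checked with a short script. The Python sketch below takes the Python 3.10 and Node.js 18 minimums from the table. The function names are hypothetical, and it deliberately leaves out RAM, disk, and WSL2, which need platform-specific tooling.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)   # from the prerequisites table
MIN_NODE_MAJOR = 18    # from the prerequisites table


def parse_node_major(version_text):
    """Extract the major version from `node --version` output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def check_prerequisites():
    """Return a dict mapping each checked component to True or False."""
    results = {"python": sys.version_info[:2] >= MIN_PYTHON}
    node = shutil.which("node")
    if node is None:
        results["node"] = False  # node is not on PATH
    else:
        output = subprocess.run(
            [node, "--version"], capture_output=True, text=True
        ).stdout
        major = parse_node_major(output)
        results["node"] = major is not None and major >= MIN_NODE_MAJOR
    return results
```

Run it with the same interpreter that will host Pomera, since `sys.version_info` reports only the Python executing the script.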

Privacy & Security

  • genechat-mcp is fully local — your VCF data never leaves your machine
  • opencravat-mcp sends variant coordinates (rsIDs, chr/pos) to mcp.opencravat.org (KarchinLab, Johns Hopkins) — a research service with no HIPAA/BAA guarantees
  • pomera stores notes locally; web search and AI tools route through configured external providers
  • Never include patient names, DOB, or identifiers in any external-facing tool call

See Privacy Considerations for full details.
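
One way to follow the last rule is to reduce each VCF record to bare coordinates before anything is sent to an external service. The Python sketch below keeps only the first five fixed VCF columns (CHROM, POS, ID, REF, ALT) and discards INFO, FORMAT, and all sample columns. The function name and the returned dictionary layout are hypothetical and are not the input format of opencravat-mcp.

```python
def variant_coordinates(vcf_line):
    """Reduce one VCF data line to coordinates only, or None for headers."""
    if vcf_line.startswith("#"):
        return None  # meta-information and header lines carry no variant
    fields = vcf_line.rstrip("\r\n").split("\t")
    if len(fields) < 5:
        return None  # not a complete VCF record
    chrom, pos, variant_id, ref, alt = fields[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),
        # Keep the ID only when it looks like a dbSNP rsID; other IDs may be
        # lab-specific and are dropped here as a precaution.
        "rsid": variant_id if variant_id.startswith("rs") else None,
        "ref": ref,
        "alt": alt.split(","),
    }
```

Sample names live in the `#CHROM` header line and genotypes in the columns after FORMAT, so neither survives this reduction. The coordinates themselves are still personal genetic data, which is why the opencravat-mcp caveat above applies.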

Upstream Projects

This guide orchestrates three open-source tools (genechat-mcp, opencravat-mcp, and pomera); it does not contain their source code.

Contributing

Found an issue or have a suggestion? Open an issue or submit a pull request.

License

This project is licensed under the MIT License.
