Open-source (AGPL-3.0), self-hostable scientific discovery engine. Ehrlich uses Claude as a hypothesis-driven scientific reasoning engine that works across multiple scientific domains. The core engine is domain-agnostic: a pluggable DomainConfig system lets any domain bring its own tools, scoring, and visualization. Run it on your own infrastructure with your own API keys, or use the hosted instance where credits cover the Anthropic API costs.
Named after Paul Ehrlich, the father of the "magic bullet" -- finding the right answer for any question.
Built for the Claude Code Hackathon (Feb 10-16, 2026).
Ehrlich is COSS (Commercial Open-Source Software) -- the same model used by Supabase, PostHog, Cal.com, and GitLab. The entire codebase is open source under AGPL-3.0. There is no proprietary version.
Two paths, same product:
| Path | How It Works | Cost |
|---|---|---|
| Self-host | Clone the repo, bring your own Anthropic API key | Free. No limits, no credits, no account needed |
| Hosted instance | Use app.ehrlich.dev | Credits cover Anthropic API costs (Opus is expensive) |
Credits exist because Claude Opus costs real money per investigation. They make scientific reasoning accessible -- not monetized. A student in Mexico and a pharma company in Boston get the same 91 tools, the same 28 data sources, the same methodology. The model quality is the only variable.
The AI is the scientist. The platform is the laboratory.
Ehrlich is domain-agnostic. The hypothesis-driven engine works for any scientific domain:
- Antimicrobial resistance -- "Find novel compounds active against MRSA"
- Drug discovery -- "What compounds in Lycopodium selago could treat Alzheimer's?"
- Environmental toxicology -- "Which microplastic degradation products are most toxic?"
- Agricultural biocontrol -- "Find biocontrol molecules from endophytic bacteria in Mammillaria"
- Any molecular target -- dynamic protein target discovery from 200K+ PDB structures
- Training optimization -- "What are the most effective protocols for improving VO2max?"
- Injury prevention -- "Evaluate ACL injury prevention programs for female soccer players"
- Supplement evidence -- "What is the evidence for creatine on strength performance?"
- Nutrient profiling -- "Compare protein content and amino acid profiles of whey vs plant-based supplements"
- Social program analysis -- "Evaluate the effectiveness of Sonora's sports scholarship program"
- Causal inference -- "Does the conditional cash transfer reduce school dropout rates?" (4 methods: DiD, PSM, RDD, Synthetic Control)
- Cross-country benchmarking -- "Compare cost-effectiveness of state sports programs in Mexico"
- Economic indicators from World Bank, WHO GHO, FRED, Census Bureau, BLS, INEGI, and Banxico
- US federal data: USAspending grants, College Scorecard education outcomes, HUD housing data, CDC WONDER mortality/natality, data.gov open datasets
- Mexico data: INEGI economic/demographic series, Banxico SIE financial series, datos.gob.mx open datasets
- CONEVAL/CREMAA indicator quality analysis (MIR framework validation)
- Cross-program comparison and international benchmarking
- Domain-agnostic causal inference tools (in analysis/ context) usable by any domain
Ehrlich follows Domain-Driven Design with eleven bounded contexts:
| Context | Purpose |
|---|---|
| kernel | Shared primitives: SMILES, Molecule, exceptions |
| shared | Cross-cutting ports and value objects (ChemistryPort, Fingerprint, Conformer3D) |
| literature | Scientific paper search (Semantic Scholar) and reference management |
| chemistry | Cheminformatics: molecular descriptors, fingerprints, 3D conformers |
| analysis | Dataset exploration (ChEMBL, PubChem), substructure enrichment, domain-agnostic causal inference (DiD, PSM, RDD, Synthetic Control) |
| prediction | ML modeling: train, predict, ensemble, cluster |
| simulation | Molecular docking, ADMET, resistance, target discovery (RCSB PDB), toxicity (EPA CompTox) |
| training | Exercise physiology: evidence analysis, protocol comparison, injury risk, training metrics, clinical trials |
| nutrition | Nutrition science: supplement evidence, labels, nutrients, safety, interactions, adequacy, inflammatory index |
| investigation | Multi-model agent orchestration with Director-Worker-Summarizer pattern + domain registry + MCP bridge |
| impact | Social program evaluation: economic indicators (World Bank, WHO GHO, FRED, Census, BLS, INEGI, Banxico), health (CDC WONDER), spending (USAspending), education (College Scorecard), housing (HUD), open data (data.gov, datos.gob.mx), CONEVAL/CREMAA MIR analysis. Causal estimators live in analysis/ (domain-agnostic) |
Ehrlich uses a three-tier Claude model architecture for cost-efficient investigations:
Opus 4.6 (Director) -- Formulates hypotheses, evaluates evidence, synthesizes (3-5 calls)
Sonnet 4.5 (Researcher) -- Executes experiments with 91 tools (10-20 calls)
Haiku 4.5 (Summarizer) -- Compresses large outputs, classifies domains (5-10 calls)
Cost: ~$3-4 per investigation (vs ~$11 with all-Opus).
| Source | What It Provides | Coverage |
|---|---|---|
| ChEMBL | Bioactivity data (MIC, IC50, Ki, EC50, Kd) | 2.5M+ compounds, 15K+ targets |
| RCSB PDB | Protein target discovery by organism/function | 200K+ structures |
| PubChem | Compound search by target/activity/similarity | 100M+ compounds |
| EPA CompTox | Environmental toxicity, bioaccumulation, fate | 1M+ chemicals |
| Semantic Scholar | Scientific literature search | 200M+ papers |
| UniProt | Protein function, disease associations, GO terms | 250M+ sequences |
| Open Targets | Disease-target associations (scored evidence) | 60K+ targets |
| GtoPdb | Expert-curated pharmacology (pKi, pIC50) | 11K+ ligands |
| ClinicalTrials.gov | Registered exercise/training RCTs | 500K+ studies |
| PubMed | Biomedical literature with MeSH terms | 37M+ articles |
| wger | Exercise database (muscles, equipment, categories) | 300+ exercises |
| NIH DSLD | Dietary supplement label database | 120K+ products |
| USDA FoodData | Nutrient profiles for foods and supplements | 1.1M+ foods |
| OpenFDA CAERS | Supplement adverse event reports | 200K+ reports |
| RxNav | Drug interaction screening (RxCUI resolution) | 100K+ drugs |
| World Bank | Development indicators by country (GDP, poverty, education) | 190+ countries |
| WHO GHO | Global health statistics (life expectancy, mortality, disease) | 190+ countries |
| FRED | US economic time series (GDP, unemployment, CPI) | 800K+ series |
| Census Bureau | US demographics, poverty, education (ACS 5-year) | 50 states + territories |
| BLS | US labor statistics (unemployment, CPI, wages) | 130K+ series |
| USAspending | Federal spending awards and grants | All federal agencies |
| College Scorecard | US higher education outcomes (completion, earnings) | 6K+ institutions |
| HUD | Fair Market Rents, income limits, housing data | All US counties |
| CDC WONDER | US mortality, natality, public health statistics | National-level |
| data.gov | US federal open dataset discovery (CKAN) | 300K+ datasets |
| INEGI Indicadores | Mexico economic/demographic time series | 400K+ series |
| Banxico SIE | Mexico central bank financial series | Exchange rates, inflation, reserves |
| datos.gob.mx | Mexico federal open dataset discovery (CKAN) | 1000+ datasets |
All data sources are free and open-access.
| Context | Tool | Description |
|---|---|---|
| Chemistry | validate_smiles |
Validate SMILES string |
| Chemistry | compute_descriptors |
MW, LogP, TPSA, HBD, HBA, QED, rings |
| Chemistry | compute_fingerprint |
Morgan (2048-bit) or MACCS (166-bit) |
| Chemistry | tanimoto_similarity |
Similarity between two molecules (0-1) |
| Chemistry | generate_3d |
3D conformer with MMFF94 optimization |
| Chemistry | substructure_match |
SMARTS/SMILES substructure search |
| Literature | search_literature |
Semantic Scholar paper search |
| Literature | search_citations |
Citation chasing (snowballing) via references/citing |
| Literature | get_reference |
Curated reference lookup |
| Analysis | explore_dataset |
Load ChEMBL bioactivity data for any target |
| Analysis | search_bioactivity |
Flexible ChEMBL query (any assay type) |
| Analysis | search_compounds |
PubChem compound search |
| Analysis | analyze_substructures |
Chi-squared enrichment analysis |
| Analysis | compute_properties |
Property distributions (active vs inactive) |
| Analysis | search_pharmacology |
GtoPdb curated receptor/ligand interactions |
| Causal | estimate_did |
Difference-in-differences causal estimation |
| Causal | estimate_psm |
Propensity score matching with balance diagnostics |
| Causal | estimate_rdd |
Regression discontinuity design (sharp/fuzzy) |
| Causal | estimate_synthetic_control |
Synthetic control method |
| Causal | assess_threats |
Validity threat assessment for causal methods |
| Causal | compute_cost_effectiveness |
Cost per unit outcome, ICER |
| Prediction | train_model |
Train XGBoost on SMILES+activity data |
| Prediction | predict_candidates |
Score compounds with trained model |
| Prediction | cluster_compounds |
Butina structural clustering |
| ML | train_classifier |
Train binary classifier on tabular feature data (any domain) |
| ML | predict_scores |
Score samples with trained classifier (any domain) |
| ML | cluster_data |
Hierarchical clustering on tabular features (any domain) |
| Simulation | search_protein_targets |
RCSB PDB target discovery by organism/function |
| Simulation | dock_against_target |
Descriptor-based binding energy estimation |
| Simulation | predict_admet |
Drug-likeness profiling |
| Simulation | fetch_toxicity_profile |
EPA CompTox environmental toxicity |
| Simulation | assess_resistance |
Resistance mutation scoring |
| Simulation | get_protein_annotation |
UniProt protein function and disease links |
| Simulation | search_disease_targets |
Open Targets disease-target associations |
| Training | search_training_literature |
Training science literature with MeSH expansion, study type ranking, non-human filtering |
| Training | analyze_training_evidence |
Pooled effect sizes, heterogeneity, evidence grading |
| Training | compare_protocols |
Evidence-weighted protocol comparison |
| Training | assess_injury_risk |
Knowledge-based injury risk scoring |
| Training | compute_training_metrics |
ACWR, monotony, strain, session RPE load |
| Training | search_clinical_trials |
ClinicalTrials.gov exercise/training RCT search |
| Training | search_pubmed_training |
PubMed literature search with MeSH terms |
| Training | search_exercise_database |
Exercise database by muscle/equipment/category |
| Training | compute_performance_model |
Banister fitness-fatigue model (CTL/ATL/TSB) |
| Training | compute_dose_response |
Dose-response curve from dose-effect data |
| Training | plan_periodization |
Evidence-based periodization planning (linear/undulating/block) |
| Nutrition | search_supplement_evidence |
Supplement evidence with GRADE-style ranking, retracted paper exclusion |
| Nutrition | search_supplement_labels |
NIH DSLD supplement product ingredient lookup |
| Nutrition | search_nutrient_data |
USDA FoodData Central nutrient profiles |
| Nutrition | search_supplement_safety |
OpenFDA CAERS adverse event reports |
| Nutrition | compare_nutrients |
Side-by-side nutrient comparison with per-nutrient deltas, winner, MAR score |
| Nutrition | assess_nutrient_adequacy |
DRI-based nutrient adequacy assessment |
| Nutrition | check_intake_safety |
Tolerable Upper Intake Level safety screening |
| Nutrition | check_interactions |
Drug-supplement interaction screening via RxNav |
| Nutrition | analyze_nutrient_ratios |
Key nutrient ratio analysis (omega-6:3, Ca:Mg, etc.) |
| Nutrition | compute_inflammatory_index |
Simplified Dietary Inflammatory Index scoring |
| Impact | search_economic_indicators |
Query economic time series from FRED, BLS, Census, World Bank, WHO GHO, INEGI, or Banxico |
| Impact | search_health_indicators |
Search WHO GHO or CDC WONDER for health indicators |
| Impact | fetch_benchmark |
Get comparison values from international or US data sources |
| Impact | compare_programs |
Cross-program comparison using statistical tests |
| Impact | search_spending_data |
Search USAspending.gov for federal spending awards and grants |
| Impact | search_education_data |
Search College Scorecard for US higher education outcomes |
| Impact | search_housing_data |
Search HUD for Fair Market Rents and income limits |
| Impact | search_open_data |
Search data.gov or datos.gob.mx CKAN catalog for open datasets |
| Impact | analyze_program_indicators |
Validate MIR indicators against CREMAA quality criteria (CONEVAL methodology) |
| Visualization | render_binding_scatter |
Scatter plot of compound binding affinities |
| Visualization | render_admet_radar |
Radar chart of ADMET/drug-likeness properties |
| Visualization | render_training_timeline |
Training load timeline with ACWR danger zones |
| Visualization | render_muscle_heatmap |
Anatomical body diagram with muscle activation |
| Visualization | render_forest_plot |
Forest plot for meta-analysis results |
| Visualization | render_evidence_matrix |
Hypothesis-by-evidence support/contradiction heatmap |
| Visualization | render_performance_chart |
Banister fitness-fatigue performance chart |
| Visualization | render_funnel_plot |
Funnel plot for publication bias assessment |
| Visualization | render_dose_response |
Dose-response curve with confidence intervals |
| Visualization | render_nutrient_comparison |
Grouped bar chart comparing nutrient profiles |
| Visualization | render_nutrient_adequacy |
Horizontal bar chart with DRI adequacy + MAR score |
| Visualization | render_therapeutic_window |
Therapeutic window chart (EAR/RDA/AI/UL zones) |
| Visualization | render_program_dashboard |
Multi-indicator KPI dashboard with target tracking |
| Visualization | render_geographic_comparison |
Region bar chart with benchmark reference line |
| Visualization | render_parallel_trends |
DiD parallel trends chart (treatment vs control) |
| Visualization | render_rdd_plot |
Regression discontinuity scatter with cutoff line |
| Visualization | render_causal_diagram |
DAG showing treatment, outcome, confounders |
| Statistics | run_statistical_test |
Compare two numeric groups (auto-selects t-test/Welch/Mann-Whitney) |
| Statistics | run_categorical_test |
Test contingency tables (auto-selects Fisher's exact/chi-squared) |
| Investigation | propose_hypothesis |
Register testable hypothesis |
| Investigation | design_experiment |
Plan experiment with tool sequence |
| Investigation | evaluate_hypothesis |
Assess outcome with confidence score |
| Investigation | record_finding |
Record finding linked to hypothesis |
| Investigation | record_negative_control |
Validate model with known-inactive compounds |
| Investigation | search_prior_research |
Search past investigation findings via full-text search |
| Investigation | query_uploaded_data |
Query user-uploaded files (CSV/Excel filtering, PDF text) |
| Investigation | conclude_investigation |
Final summary with ranked candidates |
- Server: Python 3.12, FastAPI, uv, PostgreSQL (asyncpg), httpx
- Console: React 19, TypeScript 5.6+, Bun, Vite 7+, TanStack Router, 3Dmol.js, WorkOS AuthKit
- AI: Claude Opus 4.6 + Sonnet 4.5 + Haiku 4.5 (Anthropic API) with tool use
- Science: RDKit, Chemprop, XGBoost, PyArrow
- Visualization: 3Dmol.js (live 3D molecular scene), React Flow (investigation diagrams)
- Data: ChEMBL, RCSB PDB, PubChem, EPA CompTox, Semantic Scholar (all free APIs)
Self-hosting? You use your own Anthropic API key directly -- no credits, no account, no hosted dependency. The credit system only applies to the managed hosted instance.
- uv (Python package manager)
- Bun (JavaScript runtime)
- Python 3.12
- PostgreSQL 16+
- An Anthropic API key
| Variable | Required | Description |
|---|---|---|
EHRLICH_ANTHROPIC_API_KEY |
Yes | Anthropic API key for Claude |
ANTHROPIC_API_KEY |
Alt | Falls back to this if EHRLICH_ not set |
EHRLICH_DATABASE_URL |
Yes | PostgreSQL connection URL (default: postgresql://postgres:postgres@localhost:5432/ehrlich) |
EHRLICH_WORKOS_CLIENT_ID |
Yes | WorkOS AuthKit client ID (authentication) |
EHRLICH_WORKOS_API_KEY |
Yes | WorkOS API key (authentication) |
EHRLICH_DIRECTOR_MODEL |
No | Director model (default: claude-opus-4-6) |
EHRLICH_RESEARCHER_MODEL |
No | Researcher model (default: claude-sonnet-4-5-20250929) |
EHRLICH_SUMMARIZER_MODEL |
No | Summarizer model (default: claude-haiku-4-5-20251001) |
EHRLICH_SUMMARIZER_THRESHOLD |
No | Chars before Haiku summarizes output (default: 2000) |
EHRLICH_ANTHROPIC_MODEL |
No | Single-model fallback (overrides all three) |
EHRLICH_MAX_ITERATIONS |
No | Max agent loop iterations (default: 50) |
EHRLICH_MAX_ITERATIONS_PER_PHASE |
No | Max iterations per experiment in multi-model mode (default: 10) |
EHRLICH_LOG_LEVEL |
No | Logging level (default: INFO) |
EHRLICH_COMPTOX_API_KEY |
No | EPA CompTox API key (free, for toxicity data) |
INEGI_API_TOKEN |
No | INEGI Indicadores API token (Mexico economic/demographic data) |
BANXICO_API_TOKEN |
No | Banxico SIE API token (Mexico central bank financial series) |
DATOSGOB_API_TOKEN |
No | datos.gob.mx API token (Mexico open datasets, optional) |
# Create the PostgreSQL database
createdb ehrlich
# Or via psql:
# psql -c "CREATE DATABASE ehrlich;"cd server
uv sync --extra dev # Core + dev dependencies
# uv sync --extra all --extra dev # All deps including deep learning
export EHRLICH_ANTHROPIC_API_KEY=sk-ant-...
export EHRLICH_DATABASE_URL=postgresql://postgres:postgres@localhost:5432/ehrlich
export EHRLICH_WORKOS_CLIENT_ID=client_...
export EHRLICH_WORKOS_API_KEY=sk_...
uv run uvicorn ehrlich.api.app:create_app --factory --reload --port 8000Optional dependency groups:
deeplearning-- Chemprop D-MPNN (pulls PyTorch)all-- everything
cd console
bun install
bun devOpen http://localhost:5173 in your browser.
Pre-download ChEMBL bioactivity data to avoid first-run delays:
# Download bioactivity data for all target organisms
uv run python data/scripts/prepare_data.py --chembl
# Download protein structures from RCSB PDB
uv run python data/scripts/prepare_data.py --proteins
# Download everything
uv run python data/scripts/prepare_data.py --allCached datasets are stored in data/datasets/ as parquet files.
docker compose upServer at :8000, Console at :3000.
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/health |
Health check |
| GET | /api/v1/methodology |
Methodology: phases, domains, tools, data sources, models |
| GET | /api/v1/stats |
Aggregate counts (tools, domains, phases, data sources, events) |
| GET | /api/v1/molecule/depict?smiles=&w=&h= |
2D SVG depiction (image/svg+xml, cached 24h). SMILES max 500 chars |
| GET | /api/v1/molecule/conformer?smiles= |
3D conformer (JSON: mol_block, energy, num_atoms). SMILES max 500 chars |
| GET | /api/v1/molecule/descriptors?smiles= |
Molecular descriptors + Lipinski pass/fail. SMILES max 500 chars |
| GET | /api/v1/targets |
List protein targets (pdb_id, name, organism) |
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/investigate |
List user's investigations (most recent first) |
| POST | /api/v1/investigate |
Create new investigation (director_tier: haiku/sonnet/opus) |
| GET | /api/v1/investigate/{id} |
Full investigation detail (owner only) |
| GET | /api/v1/investigate/{id}/stream |
SSE stream of investigation events (owner only, supports ?token=) |
| GET | /api/v1/investigate/{id}/paper |
Structured scientific paper + visualizations from completed investigation (owner only). PDF via /paper/:id route + browser print |
| POST | /api/v1/investigate/{id}/approve |
Approve/reject formulated hypotheses (owner only) |
| POST | /api/v1/upload |
Upload file (CSV/XLSX/PDF) for investigation data, returns preview |
| GET | /api/v1/credits/balance |
Current credit balance + BYOK status |
| Event | Description |
|---|---|
hypothesis_formulated |
Hypothesis with prediction, criteria, scope, prior confidence |
experiment_started |
Experiment execution begins |
experiment_completed |
Experiment finished with results |
hypothesis_evaluated |
Outcome assessed against pre-defined criteria |
negative_control |
Model validation result (known-inactive compound) |
positive_control |
Model validation result (known-active compound) |
validation_metrics |
Z'-factor, control separation stats |
tool_called |
Tool invocation with inputs |
tool_result |
Tool output preview |
finding_recorded |
Scientific finding with evidence level |
thinking |
Model reasoning text |
output_summarized |
Haiku compressed a large output |
literature_survey_completed |
PICO, search stats, evidence grade |
phase_changed |
Investigation phase transition (1-6) |
cost_update |
Progressive cost/token snapshot |
hypothesis_approval_requested |
Awaiting user approval of hypotheses |
domain_detected |
Domain identified with display config |
visualization |
Chart/diagram visualization rendered |
completed |
Investigation finished with candidates and cost |
error |
Error occurred |
- Start the server:
cd server && uv run uvicorn ehrlich.api.app:create_app --factory --port 8000 - Start the console:
cd console && bun dev - Open http://localhost:5173
- Type any research question or pick a template, for example:
- "Find novel antimicrobial candidates against MRSA"
- "What compounds in Lycopodium selago could treat Alzheimer's?"
- "Which microplastic degradation products are most toxic?"
- "What are the most effective training protocols for improving VO2max?"
- "Evaluate ACL injury prevention programs for female soccer players"
- Watch the investigation unfold in real-time via SSE streaming
- See domain-specific visualizations: molecular investigations show a 3D Live Lab (proteins load, ligands dock, candidates glow by score); training investigations show timeline charts and body diagrams; all domains render forest plots and evidence matrices
- View ranked candidates with domain-specific score columns in the results table
- Click any candidate row to expand the detail panel with domain-specific views (molecular: 3D conformer, properties, Lipinski; training/nutrition: scores, attributes)
- Browse the hypothesis board showing formulated, tested, supported, and refuted hypotheses
- Explore the auto-generated investigation diagram: SVG hypothesis map with status-colored nodes and evidence-chain arrows
- After completion, view the structured investigation report: research question, executive summary, hypotheses, methodology, findings, candidates, model validation, and cost breakdown
- Reload a completed investigation and see the full timeline replayed from stored events
- View past investigations in the history section on the home page
uv run pytest --cov=ehrlich --cov-report=term-missing # Tests + coverage
uv run ruff check src/ tests/ # Lint
uv run ruff format src/ tests/ # Format
uv run mypy src/ehrlich/ # Type checkbunx vitest run # Vitest
bun run build # vite build (generates routes + bundles)
bun run typecheck # tsc --noEmit (run after build to ensure route types exist)| Gate | Tool | Threshold |
|---|---|---|
| Linting | ruff | Zero violations |
| Formatting | ruff format | Zero violations |
| Type checking | mypy (strict) | Zero errors |
| Test coverage | pytest-cov | 80% minimum |
| TypeScript | tsc --noEmit | Zero errors |
See CONTRIBUTING.md for development setup, architecture overview, and code conventions.
AGPL-3.0 - see LICENSE.
Ricardo Armenta / Sequel