A single-file, browser-based chemical data platform for medicinal chemistry. SAR, ADME, MPO, kinase selectivity, and AI/ML — all in one HTML file. No install, no server, no account.
Jump to: Live demo · Case study walkthrough · Architecture · Roadmap · Changelog
→ Launch the live demo (hosted on GitHub Pages)
The demo is pre-loaded with 25 public FDA-approved drugs organized in 7 project series so you can exercise every analysis feature without uploading anything:
- EGFR (5): Gefitinib, Erlotinib, Lapatinib, Osimertinib, Afatinib
- BCR-ABL (5): Imatinib, Dasatinib, Nilotinib, Bosutinib, Ponatinib
- VEGFR2 (5): Sorafenib, Sunitinib, Regorafenib, Pazopanib, Axitinib
- JAK (3): Ruxolitinib, Tofacitinib, Baricitinib
- BTK (2): Ibrutinib, Acalabrutinib
- CDK4/6 + ALK (3): Palbociclib, Ribociclib, Crizotinib
- COX-1 (2): Aspirin, Ibuprofen
For the Kinase Selectivity module, a 300-kinase × 3-compound dummy panel (docs/sample_panels/demo_kinase_panel.xlsx) is included — upload it inside the Kinase Selectivity tool to immediately see the heatmap, selectivity overlap, and KinMap visualization in action. Use the demo data to explore SAR, MMPA, ADME, MPO, AutoQSAR, Bioisosteres, Kinase Selectivity, and every other feature. → See the full walkthrough in CASE_STUDY.md.
To use MolForge with your own data, download molforge_database.html and open it in Chrome, Edge, Firefox, or Safari.
| | MolForge | Dotmatics / LiveDesign | DataWarrior | RDKit + Jupyter |
|---|---|---|---|---|
| Cost | Free | $5–50K / seat / yr | Free | Free |
| Install required | None | Yes | Java | Python + conda |
| Single-file deploy | Yes | No | No | No |
| SAR / MMPA / MPO | Yes | Yes | Partial | DIY |
| Kinase panel upload | Yes (with session persistence) | Yes | No | DIY |
| ADME prediction (25 endpoints) | Yes | Yes | Partial | DIY |
| AutoQSAR (Ridge, RF, DNN, MPNN) | Yes | Yes | Partial | DIY |
| Bioisosteric replacement | Yes | Yes | No | DIY |
| Structure editor | Yes | Yes | Yes | No |
| Offline mode | Yes (Portable Version) | No | Yes | Yes |
| Privacy (client-side only) | Yes | Cloud | Yes | Yes |
| Multi-user collaboration | No | Yes | No | No |
| 3D / docking | No | Yes | Partial | DIY |
| GxP / audit compliance | No | Yes | No | No |
MolForge is designed for the 80% of medicinal-chemistry workflows that do not need 3D, multi-user, or regulatory validation, and for which enterprise pricing is the blocker.
- File-embedded database — compound data is embedded directly in the HTML file. Move it, rename it, email it: the data travels with it.
- IndexedDB session cache — every change is auto-saved in the browser for instant recovery on refresh.
- Save / Save As — save in place via the File System Access API (Chrome/Edge) or download a new copy.
- Import / Export — CSV, Excel (.xlsx), JSON, and SDF with full round-trip fidelity for custom columns.
- Grid and Card views with sort, search, substructure match, similarity search, and column filters.
- Custom columns with per-cell colors and file attachments (used by the Biological Data Tracker).
- RDKit.js — SMILES parsing, 2D rendering, descriptors (MW, LogP, HBA/HBD, TPSA, rotatable bonds), substructure and similarity search, Morgan fingerprints.
- OpenChemLib structure editor — draw compounds in the browser with paste-from-ChemDraw support, undo/redo, zoom, periodic table element picker, and text annotations.
- SAR Analysis
  - Scaffold and R-groups with automatic MCS detection (multiple fallbacks: Murcko consensus, substructure-match loop, ring-pattern library)
  - SAR Matrix
  - Activity Landscape (pIC50 potency ranking and LogP vs pIC50 bubble plot, both with manual compound selection)
  - MMPA (matched molecular pair analysis via Bemis-Murcko grouping and Tanimoto similarity)
  - Chemical Space (2D projection)
- ADME Prediction: 25 endpoints, including ESOL, Lobell PPB, Lombardo Vdss, Clark/Ertl BBB, Wager CNS MPO, Bickerton QED, IVIVE clearance, and fragment-based CYP models. These are rule-based QSAR estimates intended for early screening.
- MPO Scoring — QED, CNS MPO, Rule of 5, Rule of 3, pMPO with exportable summary table.
- AI / ML
  - AutoQSAR (Ridge, Random Forest, DNN, MPNN/GNN) with automated descriptor selection and cross-validation
  - Bioisosteric Replacements (SMARTS-matched groups ranked by ADME relevance)
  - MolGen (SyntheMol / REINVENT4-inspired analog generation)
- Kinase Selectivity
  - Selectivity Analysis: upload KINOMEscan / Eurofins / DiscoverX panels (Excel, CSV, TSV). Uploads persist across sessions via IndexedDB: reopen the tool later and click "Open Saved" to resume.
  - Selectivity Prediction: multi-level gatekeeper analysis, ChEMBL lookup, kinome heatmap, KLIFS-inspired interaction fingerprints.
  - KinMap-style visualization.
- Compound Analysis Hub — per-compound profile with descriptors, ADME, MPO, bioisosteres, and predictions.
- Shipment Tracker and Biological Data Tracker — lightweight tracking tools with per-cell file attachments.
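Several of the features above (similarity search, MMPA) rest on Tanimoto similarity between fingerprints. The following is a minimal sketch of that coefficient, not MolForge's actual code: the function names and the representation of a fingerprint as an array of "on" bit indices are assumptions made for illustration.

```javascript
// Tanimoto coefficient between two fingerprints, each given as an array of
// "on" bit indices. The input shape and function names are hypothetical.
function tanimoto(bitsA, bitsB) {
  const a = new Set(bitsA);
  const b = new Set(bitsB);
  let shared = 0;
  for (const bit of a) {
    if (b.has(bit)) shared += 1;
  }
  const union = a.size + b.size - shared;
  // Two empty fingerprints share nothing; report 0 rather than dividing by 0.
  return union === 0 ? 0 : shared / union;
}

// Rank library entries ({ id, bits }) by similarity to a query fingerprint,
// most similar first.
function rankBySimilarity(query, library) {
  return library
    .map((entry) => ({ id: entry.id, score: tanimoto(query, entry.bits) }))
    .sort((x, y) => y.score - x.score);
}
```

A similarity search would then keep only the entries whose score exceeds a user-chosen threshold.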
React 18 + Babel standalone, OpenChemLib 7.5, RDKit.js, TensorFlow.js, SheetJS (XLSX), jsPDF. All libraries load from CDN. No build step required.
Clone or download this repo, then double-click molforge_database.html to open the app in your browser. An internet connection is required on first load; the CDN libraries are cached afterward.
If your browser blocks file:// access to worker scripts, serve the file over HTTP:
# Node (any version)
npx serve . -l 8095
# Python 3
python -m http.server 8095

Then open http://localhost:8095/molforge_database.html.
Open the app and go to Import / Export → Share / Save → Create Portable Version. This downloads every CDN library and bundles them into a ~35 MB self-contained HTML file that works completely offline.
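To illustrate the bundling idea, here is a rough sketch of how external script tags could be swapped for inline copies. It is not the app's actual implementation: the function name, the `Map` of already-downloaded sources, and the simple tag pattern (no extra attributes such as `crossorigin`) are all assumptions.

```javascript
// Hypothetical sketch: replace <script src="..."></script> tags with inline
// copies. `sources` is a Map from URL to library text downloaded beforehand.
function inlineScripts(html, sources) {
  const tag = /<script\s+src="([^"]+)"\s*><\/script>/g;
  return html.replace(tag, (match, url) => {
    if (!sources.has(url)) return match; // leave unknown scripts untouched
    // A literal "</script" inside the library text would end the inline tag
    // early, so it is rewritten to the equivalent "<\/script".
    const safe = sources.get(url).replace(/<\/script/gi, "<\\/script");
    return "<script>" + safe + "</script>";
  });
}
```

Any script whose URL is missing from the map is left as an external reference, so a partial download degrades to the online behavior instead of breaking the page.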
- Primary storage: compound data is embedded as JSON inside the HTML file (between `<!-- MOLFORGE_DATA_START -->` and `<!-- MOLFORGE_DATA_END -->`). Click 💾 Save or 💾 Save As to rewrite the file.
- Session cache: IndexedDB mirrors your working set so a refresh or crash doesn't lose work.
- Cross-browser: since data is in the file itself, the database works identically in Chrome, Firefox, Safari, and Edge.
- Offline: no cloud required. Everything you upload stays on your machine.
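The storage model above can be sketched as a pair of string operations on the file's text. Only the two marker comments come from this README; the function names, the array-of-records JSON shape, and the escaping step are assumptions made for illustration.

```javascript
// Marker strings as documented above; everything else here is hypothetical.
const START = "<!-- MOLFORGE_DATA_START -->";
const END = "<!-- MOLFORGE_DATA_END -->";

// Locate the span between the markers, or throw if either is missing.
function findBlock(html) {
  const from = html.indexOf(START);
  const to = html.indexOf(END, from + START.length);
  if (from === -1 || to === -1) throw new Error("data markers not found");
  return { begin: from + START.length, end: to };
}

// Return the parsed JSON stored between the markers.
function readEmbeddedData(html) {
  const { begin, end } = findBlock(html);
  return JSON.parse(html.slice(begin, end));
}

// Return a copy of the HTML with the block between the markers replaced.
function writeEmbeddedData(html, data) {
  const { begin, end } = findBlock(html);
  // Escape "<" so text in the data can never spell out a marker or a tag;
  // JSON.parse turns \u003c back into "<" when the block is read.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return html.slice(0, begin) + json + html.slice(end);
}
```

Because everything outside the markers is copied through unchanged, saving rewrites only the data block and leaves the application code intact.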
| Feature | Chrome/Edge | Firefox | Safari |
|---|---|---|---|
| Core app | ✅ | ✅ | ✅ |
| Save in place (File System Access) | ✅ | ⬇ falls back to download | ⬇ falls back to download |
| RDKit.js / TensorFlow.js | ✅ | ✅ | ✅ |
The project ships with only 25 public FDA-approved drugs as demo data (EGFR, BCR-ABL, VEGFR2, JAK, BTK, CDK4/6 + ALK, and COX-1 series), plus a synthetic dummy kinase panel. All SMILES and IC50 values come from public literature and databases such as ChEMBL; nothing is confidential or proprietary.
Your own compounds live in your own copy of the file. MolForge makes no network requests except to load CDN libraries on startup, and never transmits your data.
A GitHub Actions workflow (verify-no-compound-data.yml) runs on every push and PR to enforce that the embedded data block contains only the 25 whitelisted public FDA-approved drug compounds — this prevents any accidental commit of confidential data.
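The kind of check such a workflow might run can be sketched as follows. This is not the contents of verify-no-compound-data.yml: the function name, the shortened allowlist, and the assumption that each embedded record carries a `name` field are all hypothetical.

```javascript
// Excerpt of an allowlist for illustration; a real check would list all 25.
const ALLOWED = new Set(["Gefitinib", "Erlotinib", "Imatinib"]);

// Return the names found in the embedded data block that are not allowed.
// Assumes the block holds a JSON array of records with a `name` field.
function findUnlistedCompounds(html, allowed) {
  const start = "<!-- MOLFORGE_DATA_START -->";
  const end = "<!-- MOLFORGE_DATA_END -->";
  const from = html.indexOf(start);
  const to = html.indexOf(end, from + start.length);
  if (from === -1 || to === -1) throw new Error("data markers not found");
  const records = JSON.parse(html.slice(from + start.length, to));
  return records.map((r) => r.name).filter((name) => !allowed.has(name));
}
```

In CI, a non-empty result would fail the job, for example by printing the offending names and exiting with a non-zero status.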
Place screenshots in docs/screenshots/ and reference them here.
Open source: GNU Affero General Public License v3.0 (LICENSE). You can use, modify, and redistribute MolForge freely under AGPL's terms, including the network-copyleft clause — any modified version you run as a service must share its source under AGPL too.
Commercial: If AGPL's copyleft obligations conflict with your organization's policies, commercial licenses are available. See COMMERCIAL.md.
Contributing: External contributions require signing a lightweight Contributor License Agreement so the project can continue offering both open-source and commercial options.
Issues and pull requests welcome. See CONTRIBUTING.md for guidelines, good-first-issues for ideas, and CODE_OF_CONDUCT.md for the community standards.
- CASE_STUDY.md: a five-minute walkthrough using the 25 demo compounds
- ARCHITECTURE.md — how an 800 KB HTML file replaces a typical pharma cheminformatics stack
- ROADMAP.md — what's next
- CHANGELOG.md — release notes
- SECURITY.md — privacy model and security reporting
- NOTICE.md — third-party library licenses and attribution (full audit)
- docs/sample_panels/ — 300-kinase demo selectivity panel (XLSX / CSV) for the Kinase Selectivity module
If you use MolForge in academic work, please cite it — see CITATION.cff or click "Cite this repository" on GitHub.
- RDKit.js — cheminformatics library
- OpenChemLib — structure editor
- SheetJS — Excel import/export
- TensorFlow.js — in-browser ML
- ChEMBL — bioactivity reference database
- Published QSAR models: ESOL (Delaney), PPB (Lobell), Vdss (Lombardo), BBB (Clark/Ertl), CNS MPO (Wager), QED (Bickerton)