lwang68/molforge

MolForge

Live demo · License: AGPL v3 · Commercial license available · Single file · No install · RDKit.js · Deploy Pages

A single-file, browser-based chemical data platform for medicinal chemistry. SAR, ADME, MPO, kinase selectivity, and AI/ML — all in one HTML file. No install, no server, no account.

Jump to: Live demo · Case study walkthrough · Architecture · Roadmap · Changelog

🚀 Try it now — no install required

→ Launch the live demo (hosted on GitHub Pages)

The demo is pre-loaded with 25 public FDA-approved drugs organized in 7 project series so you can exercise every analysis feature without uploading anything:

  • EGFR (5): Gefitinib, Erlotinib, Lapatinib, Osimertinib, Afatinib
  • BCR-ABL (5): Imatinib, Dasatinib, Nilotinib, Bosutinib, Ponatinib
  • VEGFR2 (5): Sorafenib, Sunitinib, Regorafenib, Pazopanib, Axitinib
  • JAK (3): Ruxolitinib, Tofacitinib, Baricitinib
  • BTK (2): Ibrutinib, Acalabrutinib
  • CDK4/6 + ALK (3): Palbociclib, Ribociclib, Crizotinib
  • COX-1 (2): Aspirin, Ibuprofen

For the Kinase Selectivity module, a 300-kinase × 3-compound dummy panel (docs/sample_panels/demo_kinase_panel.xlsx) is included — upload it inside the Kinase Selectivity tool to immediately see the heatmap, selectivity overlap, and KinMap visualization in action. Use the demo data to explore SAR, MMPA, ADME, MPO, AutoQSAR, Bioisosteres, Kinase Selectivity, and every other feature. → See the full walkthrough in CASE_STUDY.md.

To use MolForge with your own data, download molforge_database.html and open it in Chrome, Edge, Firefox, or Safari.


How MolForge compares

| Feature | MolForge | Dotmatics / LiveDesign | DataWarrior | RDKit + Jupyter |
|---|---|---|---|---|
| Cost | Free | $5–50K / seat / yr | Free | Free |
| Install required | None | Yes | Java | Python + conda |
| Single-file deploy | Yes | No | No | No |
| SAR / MMPA / MPO | Yes | Yes | Partial | DIY |
| Kinase panel upload | Yes (with session persistence) | Yes | No | DIY |
| ADME prediction (25 endpoints) | Yes | Yes | Partial | DIY |
| AutoQSAR (Ridge, RF, DNN, MPNN) | Yes | Yes | Partial | DIY |
| Bioisosteric replacement | Yes | Yes | No | DIY |
| Structure editor | Yes | Yes | Yes | No |
| Offline mode | Yes (Portable Version) | No | Yes | Yes |
| Privacy (client-side only) | Yes | Cloud | Yes | Yes |
| Multi-user collaboration | No | Yes | No | No |
| 3D / docking | No | Yes | Partial | DIY |
| GxP / audit compliance | No | Yes | No | No |

MolForge is designed for the 80% of medicinal-chemistry workflows that do not need 3D, multi-user collaboration, or regulatory validation, and for which enterprise pricing is the blocker.


Features

Data management

  • File-embedded database — compound data is embedded directly in the HTML file. Move it, rename it, email it: the data travels with it.
  • IndexedDB session cache — every change is auto-saved in the browser for instant recovery on refresh.
  • Save / Save As — save in place via the File System Access API (Chrome/Edge) or download a new copy.
  • Import / Export — CSV, Excel (.xlsx), JSON, and SDF with full round-trip fidelity for custom columns.
  • Grid and Card views with sort, search, substructure match, similarity search, and column filters.
  • Custom columns with per-cell colors and file attachments (used by the Biological Data Tracker).

Chemistry

  • RDKit.js — SMILES parsing, 2D rendering, descriptors (MW, LogP, HBA/HBD, TPSA, rotatable bonds), substructure and similarity search, Morgan fingerprints.
  • OpenChemLib structure editor — draw compounds in the browser with paste-from-ChemDraw support, undo/redo, zoom, periodic table element picker, and text annotations.
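
Similarity search over fingerprints is commonly scored with the Tanimoto coefficient. The sketch below shows that calculation in Python, assuming each fingerprint is represented as a set of "on" bit indices. In MolForge the bits would come from RDKit.js Morgan fingerprints; the compound names and bit sets here are made up for illustration, and the function names are hypothetical rather than part of MolForge.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient: |A & B| / |A | B| for two sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def rank_by_similarity(query, library, threshold=0.0):
    """Return (name, score) pairs at or above threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical on-bit sets, not real Morgan fingerprints.
library = {
    "cmpd_A": {1, 4, 9, 16, 25},
    "cmpd_B": {1, 4, 9, 36},
    "cmpd_C": {2, 3, 5},
}
query = {1, 4, 9, 16}
hits = rank_by_similarity(query, library, threshold=0.3)
print(hits)
```

With these toy inputs, cmpd_A shares 4 of 5 combined bits with the query and cmpd_B shares 3 of 5, so both pass the 0.3 threshold while cmpd_C, with no shared bits, is dropped.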

Analysis tools (open in separate tabs)

  • SAR Analysis
    • Scaffold and R-groups with automatic MCS detection (multiple fallbacks: Murcko consensus, substructure-match loop, ring-pattern library)
    • SAR Matrix
    • Activity Landscape (pIC50 potency ranking and LogP vs pIC50 bubble plot, both with manual compound selection)
    • MMPA (matched molecular pair analysis via Bemis-Murcko grouping and Tanimoto similarity)
    • Chemical Space (2D projection)
  • ADME Prediction: 25 endpoints, including ESOL, Lobell PPB, Lombardo Vdss, Clark/Ertl BBB, Wager CNS MPO, Bickerton QED, IVIVE clearance, and fragment-based CYP models. All are rule-based QSAR estimates intended for early screening.
  • MPO Scoring — QED, CNS MPO, Rule of 5, Rule of 3, pMPO with exportable summary table.
  • AI / ML
    • AutoQSAR (Ridge, Random Forest, DNN, MPNN/GNN) with automated descriptor selection and cross-validation
    • Bioisosteric Replacements (SMARTS-matched groups ranked by ADME relevance)
    • MolGen (SyntheMol / REINVENT4-inspired analog generation)
  • Kinase Selectivity
    • Selectivity Analysis: upload KINOMEscan / Eurofins / DiscoverX panels (Excel, CSV, TSV). Uploads persist across sessions via IndexedDB — re-open the tool later and click "Open Saved" to resume.
    • Selectivity Prediction: multi-level gatekeeper analysis, ChEMBL lookup, kinome heatmap, KLIFS-inspired interaction fingerprints.
    • KinMap-style visualization.
  • Compound Analysis Hub — per-compound profile with descriptors, ADME, MPO, bioisosteres, and predictions.
  • Shipment Tracker and Biological Data Tracker — lightweight tracking tools with per-cell file attachments.
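
Two of the quantities named in the list above reduce to simple arithmetic, sketched here in Python. pIC50 is the negative base-10 logarithm of the IC50 in mol/L, and Lipinski's Rule of 5 counts violations of MW ≤ 500, LogP ≤ 5, HBD ≤ 5, and HBA ≤ 10. The function and class names are hypothetical, the descriptor values are assumed to be computed elsewhere (for example by RDKit.js), and the example numbers are illustrative rather than taken from the demo data.

```python
import math
from dataclasses import dataclass


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


@dataclass
class Descriptors:
    mw: float    # molecular weight, g/mol
    logp: float  # calculated logP
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count Lipinski Rule of 5 violations (0 means fully compliant)."""
    checks = [d.mw > 500, d.logp > 5, d.hbd > 5, d.hba > 10]
    return sum(checks)


# Illustrative values only, not a real compound.
example = Descriptors(mw=520.3, logp=4.2, hbd=2, hba=7)
print(pic50_from_nm(10.0))      # 8.0
print(ro5_violations(example))  # 1
```

An IC50 of 10 nM gives a pIC50 of 8.0, and the example descriptor set has a single violation (molecular weight above 500).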

Tech stack

React 18 + Babel standalone, OpenChemLib 7.5, RDKit.js, TensorFlow.js, SheetJS (XLSX), jsPDF. All libraries load from CDN. No build step required.


Getting started

Option 1: open directly

Clone or download this repo, then double-click molforge_database.html to open the app in your browser. The first load requires an internet connection; the CDN libraries are cached by the browser afterward.

Option 2: local static server

If your browser blocks file:// access to worker scripts, serve the file over HTTP:

# Node (any version)
npx serve . -l 8095

# Python 3
python -m http.server 8095

Then open http://localhost:8095/molforge_database.html.

Option 3: portable build

Open the app and go to Import / Export → Share / Save → Create Portable Version. This downloads every CDN library and bundles them into a ~35 MB self-contained HTML file that works completely offline.


How data persistence works

  • Primary storage — compound data is embedded as JSON inside the HTML file (between <!-- MOLFORGE_DATA_START --> and <!-- MOLFORGE_DATA_END -->). Click 💾 Save or 💾 Save As to rewrite the file.
  • Session cache — IndexedDB mirrors your working set so a refresh or crash doesn't lose work.
  • Cross-browser — since data is in the file itself, the database works identically in Chrome, Firefox, Safari, and Edge.
  • Offline — no cloud required. Everything you upload stays on your machine.
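
The file-embedded storage described above can be sketched outside the browser as a read and rewrite of the text between the two marker comments. This Python example assumes the payload is bare JSON sitting directly between the markers. The real file may wrap it differently (for example inside a script tag), and the helper names and record fields below are hypothetical.

```python
import json
import re

START = "<!-- MOLFORGE_DATA_START -->"
END = "<!-- MOLFORGE_DATA_END -->"
_BLOCK = re.compile(re.escape(START) + r"(.*?)" + re.escape(END), re.DOTALL)


def read_embedded(html: str):
    """Parse the JSON payload found between the two marker comments."""
    match = _BLOCK.search(html)
    if match is None:
        raise ValueError("data markers not found")
    return json.loads(match.group(1))


def write_embedded(html: str, data) -> str:
    """Return a copy of html with the payload replaced by data."""
    if _BLOCK.search(html) is None:
        raise ValueError("data markers not found")
    payload = START + json.dumps(data) + END
    # A function replacement avoids backslash-escape handling in re.sub.
    return _BLOCK.sub(lambda _m: payload, html, count=1)


# Toy page with an empty data block; the record schema is an assumption.
page = f"<html><body>{START}[]{END}</body></html>"
updated = write_embedded(page, [{"name": "Gefitinib", "series": "EGFR"}])
print(read_embedded(updated))
```

Because everything outside the markers is left untouched, the rewritten file remains a complete copy of the app with the new data in place, which is what lets the data travel with the file.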

Browser support

| Feature | Chrome/Edge | Firefox | Safari |
|---|---|---|---|
| Core app | Yes | Yes | Yes |
| Save in place (File System Access) | Yes | Falls back to download | Falls back to download |
| RDKit.js / TensorFlow.js | Yes | Yes | Yes |

Privacy

The project ships with only 25 public FDA-approved drugs as demo data (EGFR, BCR-ABL, VEGFR2, JAK, BTK, CDK4/6 + ALK, and COX-1 series), plus a synthesized dummy kinase panel. All SMILES and IC50 values come from public literature and databases such as ChEMBL; nothing is confidential or proprietary.

Your own compounds live in your own copy of the file. MolForge makes no network requests except to load CDN libraries on startup, and never transmits your data.

A GitHub Actions workflow (verify-no-compound-data.yml) runs on every push and PR to enforce that the embedded data block contains only the 25 whitelisted public FDA-approved drug compounds — this prevents any accidental commit of confidential data.
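
A check of this kind could look roughly like the Python sketch below. It is not the contents of verify-no-compound-data.yml. It assumes the embedded block is a JSON list of records with a "name" key, which is a hypothetical schema, and it abbreviates the whitelist to a handful of names for brevity.

```python
import json
import re

# Whitelist abbreviated for illustration; a real check would list all 25 names.
ALLOWED = {"gefitinib", "erlotinib", "imatinib", "aspirin", "ibuprofen"}

MARKERS = re.compile(
    r"<!-- MOLFORGE_DATA_START -->(.*?)<!-- MOLFORGE_DATA_END -->", re.DOTALL
)


def unexpected_compounds(html: str, allowed: set[str]) -> list[str]:
    """Return names in the embedded block that are not on the whitelist."""
    match = MARKERS.search(html)
    if match is None:
        raise ValueError("embedded data block not found")
    records = json.loads(match.group(1))
    # Assumes each record is a dict with a "name" key (hypothetical schema).
    # A record without a name is reported as an empty string.
    names = [str(rec.get("name", "")) for rec in records]
    return sorted(n for n in names if n.strip().lower() not in allowed)


clean = '<!-- MOLFORGE_DATA_START -->[{"name": "Aspirin"}]<!-- MOLFORGE_DATA_END -->'
leaky = clean.replace('"Aspirin"}', '"Aspirin"}, {"name": "ACME-0042"}')
print(unexpected_compounds(clean, ALLOWED))  # []
print(unexpected_compounds(leaky, ALLOWED))  # ['ACME-0042']
```

In a CI job, a non-empty result would typically fail the build with a non-zero exit status. Matching on names is the simplest option; a stricter check might compare canonical SMILES instead.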


Screenshots

Place screenshots in docs/screenshots/ and reference them here.


License

Open source: GNU Affero General Public License v3.0 (LICENSE). You can use, modify, and redistribute MolForge freely under AGPL's terms, including the network-copyleft clause — any modified version you run as a service must share its source under AGPL too.

Commercial: If AGPL's copyleft obligations conflict with your organization's policies, commercial licenses are available. See COMMERCIAL.md.

Contributing: External contributions require signing a lightweight Contributor License Agreement so the project can continue offering both open-source and commercial options.


Contributing

Issues and pull requests are welcome. See CONTRIBUTING.md for guidelines, good-first-issues for ideas, and CODE_OF_CONDUCT.md for community standards.

Documentation

  • CASE_STUDY.md: a five-minute walkthrough using the 25 demo compounds
  • ARCHITECTURE.md: how an 800 KB HTML file replaces a typical pharma cheminformatics stack
  • ROADMAP.md: what's next
  • CHANGELOG.md: release notes
  • SECURITY.md: privacy model and security reporting
  • NOTICE.md: third-party library licenses and attribution (full audit)
  • docs/sample_panels/: 300-kinase demo selectivity panel (XLSX / CSV) for the Kinase Selectivity module

Citing MolForge

If you use MolForge in academic work, please cite it — see CITATION.cff or click "Cite this repository" on GitHub.

Acknowledgments

  • RDKit.js — cheminformatics library
  • OpenChemLib — structure editor
  • SheetJS — Excel import/export
  • TensorFlow.js — in-browser ML
  • ChEMBL — bioactivity reference database
  • Published QSAR models: ESOL (Delaney), PPB (Lobell), Vdss (Lombardo), BBB (Clark/Ertl), CNS MPO (Wager), QED (Bickerton)
