GitHub - wleejess/orbIC: orbIC: A Multidimensional Chemical Space Navigation Platform. An open-source tool to navigate the orbit of IC50s in given chemical space.

orbIC: A Multidimensional Chemical Space Navigation Platform

Browser-native hit-to-lead visualization. No installation. No license. No data leaves your browser.

orbIC translates Structure-Activity Relationship (SAR) data into an interactive force-directed graph — where IC₅₀ or EC₅₀ drives the coordinate system. Potent compounds are rendered as primary light sources; structural clusters emerge as constellations. Built for medicinal chemists and computational researchers who need to move faster than spreadsheets allow.

Renders 4,147 compounds with sub-second interaction via TMAP · MST layout.

Demo

orbic_demo.mov

Why orbIC

Traditional SAR analysis tools rely on 2D heatmaps and static tables. These are adequate for small datasets but break down at scale — structural trends disappear into rows, activity cliffs go unnoticed, and there's no intuitive way to navigate chemical neighborhoods.

orbIC treats potency as a spatial property:

IC₅₀ / EC₅₀ maps to luminance and node scale
Tanimoto similarity (ECFP4 fingerprints) maps to edge proximity
Scaffold density and diversity are immediately legible as visual clusters

The result is a navigable map of a compound library — not a filtered spreadsheet.
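
As a rough illustration of the potency mapping listed above, the sketch below converts an IC₅₀ in nanomolar to pIC₅₀ and scales it into a luminance and a node radius. The function names, the pIC₅₀ window of 4 to 10, and the radius range are assumptions made for this example, not values taken from the orbIC source.

```typescript
// Hypothetical helpers: names, thresholds, and radius range are illustrative only.

/** Convert an IC50/EC50 given in nanomolar to pIC50 = -log10(molar concentration). */
function toPotency(ic50Nm: number): number {
  if (!(ic50Nm > 0)) throw new RangeError("IC50 must be a positive number");
  return 9 - Math.log10(ic50Nm); // 1 nM -> 9, 1 µM -> 6
}

/** Scale pIC50 linearly into [0, 1] between an assumed floor and ceiling, clamping outside. */
function normalizePotency(pIC50: number, floor = 4, ceiling = 10): number {
  const t = (pIC50 - floor) / (ceiling - floor);
  return Math.min(1, Math.max(0, t));
}

interface NodeStyle {
  luminance: number; // 0 = dimmest, 1 = brightest
  radius: number;    // in pixels
}

/** Lower IC50 (higher potency) gives a brighter, larger node. */
function nodeStyle(ic50Nm: number, minRadius = 2, maxRadius = 12): NodeStyle {
  const t = normalizePotency(toPotency(ic50Nm));
  return { luminance: t, radius: minRadius + t * (maxRadius - minRadius) };
}
```

Working on the log scale keeps a 10 nM and a 100 nM compound visually distinguishable, which a linear mapping of raw IC₅₀ values would not.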

Core Capabilities

Potency-driven layout: node luminance and scale increase with potency (that is, with lower IC₅₀ / EC₅₀); potent hits act as primary light sources
Structural clustering — automated grouping via Tanimoto coefficients and ECFP4 fingerprints; visualizes scaffold density and diversity
Lead-orbit view — centers a parent molecule and shows all derivatives as orbiting satellites, with activity delta relative to structural changes
Zero-dependency runtime — OpenChemLib compiled to WebAssembly; all chemistry runs client-side, no server calls, no data egress
Precomputed sidecar system — bundled datasets ship with .precomputed.json files (scaffold, fragment index, MW, LogP, PSA); near-instant load for standard libraries
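
The Tanimoto coefficient behind the structural clustering above is the number of bits set in both fingerprints divided by the number set in either. A minimal sketch follows, assuming each fingerprint is already available as an array of 32-bit words. Fingerprint generation is not shown, and the function names are hypothetical rather than the ones exported by similarity.ts.

```typescript
/** Count the set bits in a 32-bit integer (standard parallel bit-count). */
function popcount32(x: number): number {
  x = x >>> 0;
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

/**
 * Tanimoto coefficient of two bit-vector fingerprints packed into 32-bit words.
 * Returns a value in [0, 1]; two all-zero fingerprints are defined here as 0.
 */
function tanimoto(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error("Fingerprints must have equal length");
  }
  let both = 0;   // bits set in a AND b
  let either = 0; // bits set in a OR b
  for (let i = 0; i < a.length; i++) {
    both += popcount32(a[i] & b[i]);
    either += popcount32(a[i] | b[i]);
  }
  return either === 0 ? 0 : both / either;
}
```

For example, the single-word fingerprints 0b1100 and 0b1010 share one set bit out of three distinct set bits, so their similarity is 1/3.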

Architecture

orbIC/
├── src/
│   ├── components/         # React UI — panels, controls, node renderers
│   ├── lib/
│   │   ├── datasetService.ts   # CSV ingestion, chunked async OCL processing
│   │   ├── TMap.tsx            # Force-directed graph, MST construction, scaffold clustering
│   │   └── similarity.ts       # Tanimoto / ECFP4 fingerprint utilities
│   └── App.tsx
├── scripts/
│   └── precompute-datasets.ts  # Offline sidecar generation for bundled datasets
├── public/data/                # Bundled datasets + .precomputed.json sidecars
└── components/ui/              # Shadcn/Radix primitives

Key design decisions:

WebAssembly chemistry (OpenChemLib) — RDKit in the browser historically required a Python backend or heavy WASM bundle. OpenChemLib provides the critical subset needed (SMILES parsing, ECFP4 fingerprints, substructure search, property calculation) at ~3MB WASM, keeping the tool installable and offline-capable.
Chunked async OCL processing — initial implementation processed all compounds synchronously in Papa.parse callbacks, freezing the browser on datasets >500 compounds. Refactored to 50-row chunks with setTimeout(0) yielding between chunks; shows live progress during import. (See Performance Notes below.)
Precomputed sidecars — for bundled datasets, all OCL work (scaffold extraction, fragment index, property calculation) runs offline at build time via npm run precompute, writing .precomputed.json alongside each CSV. At runtime, the loader checks for the sidecar first — no OCL, near-instant load.
MST over substructure graph — eliminated the O(n²) OCL.SSSearcher pass from TMap construction (22,350 calls at 150 scaffolds). kNN Tanimoto edges alone produce an equivalent MST; structurally related scaffolds score high similarity and remain adjacent. Saves 2–10s of main-thread freeze per import.
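
The kNN-edges-to-MST step in the last point can be sketched with Kruskal's algorithm and a union-find structure. This is an illustrative outline under stated assumptions, not the code in TMap.tsx: the Edge type, both function names, and the brute-force neighbour search are hypothetical. The neighbour search here still compares all pairs, but each comparison is a cheap numeric similarity rather than a substructure search.

```typescript
interface Edge {
  a: number;          // index of the first node
  b: number;          // index of the second node
  similarity: number; // e.g. a Tanimoto coefficient in [0, 1]
}

/** For every node, keep edges to its k most similar neighbours. */
function knnEdges(n: number, k: number, sim: (i: number, j: number) => number): Edge[] {
  const edges: Edge[] = [];
  for (let i = 0; i < n; i++) {
    const neighbours: Edge[] = [];
    for (let j = 0; j < n; j++) {
      if (j !== i) neighbours.push({ a: i, b: j, similarity: sim(i, j) });
    }
    neighbours.sort((x, y) => y.similarity - x.similarity);
    edges.push(...neighbours.slice(0, k));
  }
  return edges;
}

/**
 * Kruskal's algorithm over the candidate edges, taking the most similar edge first.
 * Returns a maximum-similarity spanning tree, or a spanning forest if the
 * kNN graph is not connected.
 */
function spanningForest(n: number, edges: Edge[]): Edge[] {
  const parent = Array.from({ length: n }, (_, i) => i);
  const find = (x: number): number => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]]; // path halving
      x = parent[x];
    }
    return x;
  };
  const kept: Edge[] = [];
  const sorted = [...edges].sort((x, y) => y.similarity - x.similarity);
  for (const e of sorted) {
    const rootA = find(e.a);
    const rootB = find(e.b);
    if (rootA !== rootB) {
      parent[rootA] = rootB; // merge the two components
      kept.push(e);
    }
  }
  return kept;
}
```

Maximising total similarity is the same as minimising total distance when distance is defined as 1 minus similarity, so the result corresponds to a minimum spanning tree in distance terms.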

Tech Stack

Layer	Choice
UI	React + TypeScript
Visualization	D3.js (force simulation, MST layout)
Chemistry	OpenChemLib (WebAssembly)
Build	Vite
CI	GitHub Actions

Scientific Inspiration & Methodology

orbIC's layout approach is grounded in two areas of cheminformatics research:

Chemical Space Networks (CSNs) — Heikamp & Bajorath (2019), Frontiers in Chemistry. Molecules as nodes, structural similarity as edges; enables rapid identification of SAR transfer and activity cliffs.
Dimensionality reduction for molecular descriptors — Czarnecki et al. (2020), Journal of Cheminformatics. t-SNE and UMAP preserve global and local structure when projecting high-dimensional fingerprint space.

Getting Started

git clone https://github.com/wleejess/orbIC.git
cd orbIC
npm install
npm run start

To regenerate precomputed sidecars after adding a new bundled dataset:

npm run precompute

Add the new dataset filename to the DATASETS array in scripts/precompute-datasets.ts first.
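
For context, the runtime side of this arrangement (check for the sidecar first, fall back to full processing) might look roughly like the sketch below. The naming rule, the function names, and the injected fetcher are assumptions made for illustration; the actual loader in datasetService.ts may differ.

```typescript
/** Assumed naming rule: "library.csv" -> "library.precomputed.json". */
function sidecarPath(csvPath: string): string {
  return csvPath.replace(/\.csv$/i, "") + ".precomputed.json";
}

/** Minimal shape of the fetch response this sketch relies on. */
type Fetcher = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

/**
 * Try the precomputed sidecar first; if it is missing or unreadable,
 * fall back to the slow path that parses the CSV and runs the chemistry.
 */
async function loadDataset<T>(
  csvPath: string,
  fetcher: Fetcher,
  computeFromCsv: (path: string) => Promise<T>,
): Promise<T> {
  try {
    const res = await fetcher(sidecarPath(csvPath));
    if (res.ok) return (await res.json()) as T;
  } catch {
    // Network or JSON parse failure: fall through to the slow path.
  }
  return computeFromCsv(csvPath);
}
```

In a browser, the global fetch function can be passed as the fetcher argument; injecting it keeps the loader easy to exercise without a network.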

Performance Notes

The large-dataset freeze problem and how it was fixed.

Loading ~4,000 compounds initially froze the browser ("Wait or terminate"). Three compounding causes:

Synchronous OCL on the main thread — ~25,000 blocking calls per import (SMILES parse, fingerprint, scaffold, properties for each compound)
O(n²) scaffold graph in TMap — substructure check + kNN pass for every scaffold pair immediately after the first freeze
No row cap on bundled dataset loader — the remote loader used limit: 200; the local one didn't

Fixes applied:

Fix	File	Effect
Chunked async OCL	`datasetService.ts`	50 rows/tick with `setTimeout(0)`; progress shown during import
Precomputed sidecar	`datasetService.ts` + `scripts/precompute-datasets.ts`	Zero OCL at runtime for bundled datasets
Scaffold cap (150)	`TMap.tsx`	Prevents O(n²) lock-up during graph construction
Remove substructure pass	`TMap.tsx`	Eliminates 22k synchronous OCL calls; kNN edges sufficient for MST
O(1) compound lookup	`TMap.tsx`	Pre-built `Map<id, Compound>` replaces `compounds.find()` — eliminates ~17M string comparisons per import
Query fingerprint hoisted	`App.tsx`	`processSmiles(query)` moved to `useMemo`; was running 4,139× per slider move
Dialog close before import	`ImportDialog.tsx`	Closes dialog visually first, defers `onImport` to `setTimeout(0)` to prevent frozen-dialog UX
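
The chunked approach in the first row can be sketched as a generic helper that processes a fixed number of rows, reports progress, and then yields to the event loop with setTimeout(0). This is a minimal sketch of the technique; the function name and signature are hypothetical, and the default of 50 rows per chunk is taken from the table above.

```typescript
/**
 * Process rows in fixed-size chunks, yielding to the event loop between chunks
 * so the browser can repaint and handle input during a long import.
 */
async function processInChunks<T, R>(
  rows: readonly T[],
  processRow: (row: T) => R,
  onProgress: (done: number, total: number) => void = () => {},
  chunkSize = 50,
): Promise<R[]> {
  const results: R[] = [];
  for (let start = 0; start < rows.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, rows.length);
    for (let i = start; i < end; i++) {
      results.push(processRow(rows[i]));
    }
    onProgress(end, rows.length);
    // Yield so rendering and event handlers can run before the next chunk.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return results;
}
```

Each chunk still blocks the main thread while it runs, so chunk size trades total import time against responsiveness. Moving the work to a Web Worker, as listed in the roadmap, removes that trade-off.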

Known issue: detectNumericCols still scans all compounds on the main thread (~0.5s pause before the filter picker). Fix requires a Web Worker; tracked as next step.

Roadmap

Web Worker for all OCL — move chemistry off the main thread entirely; UI stays interactive during large imports
Incremental TMap updates — compute MST once on import, recolor on filter changes rather than re-running the full simulation
LSH fingerprint index — sub-linear similarity search for libraries >10k compounds

License

MIT — intended for academic research and commercial pharmaceutical development.
