2 changes: 2 additions & 0 deletions .erpaval/INDEX.md
@@ -9,6 +9,8 @@ development sessions. Solutions are reusable; specs are per-feature.

## Solutions (architecture patterns + conventions)

- [Collapse a publish-many TS monorepo into one bundled CLI with tsup](solutions/architecture-patterns/tsup-collapse-monorepo-to-single-cli.md) — `noExternal:[/^@scope//]` + `external:[/^[^.]/]`; workers as named entries (esbuild won't follow `new URL(...,import.meta.url)`); copy import.meta.url assets in onSuccess; tsconfig.test.json → dist-test/ because tsup drops *.test.ts; convert hidden string-imports to static. Kills the pack-all-publishables bug class.
- [Make a heavy native dep optional + lazy so a default install can prune it](solutions/architecture-patterns/optional-native-dep-lazy-import.md) — onnxruntime-node 254MB: deps→optionalDependencies, top-level value-import→`import type`, dynamic `import()` at use site threading the runtime constructor in; bundler must keep it `external`.
- [SCIP replaces LSP for code-graph oracle edges](solutions/architecture-patterns/scip-replaces-lsp.md) — one-shot indexers beat stateful LSP clients for compiler-grade graph edges.
- [Repomix --compress is output-side only](solutions/architecture-patterns/repomix-is-output-side.md) — don't substitute it for a tree-sitter chunker; use it for repo snapshots.
- [Starlight in a pnpm monorepo — minimal scaffold + GH Pages](solutions/architecture-patterns/starlight-in-pnpm-monorepo.md) — 9 files + 1 workflow give you a buildable docs site; gotchas captured.
@@ -0,0 +1,67 @@
---
title: Make a heavy native dep optional + lazy so a default install can prune it
tags: [onnxruntime, optionalDependencies, dynamic-import, native, install-size, embedder, type-only-import]
modules:
- packages/embedder/package.json
- packages/embedder/src/onnx-embedder.ts
first_applied: 2026-06-04
session: session-a99b0c
track: knowledge
category: architecture-patterns
---

# Make a heavy native dep optional + lazy so a default install can prune it

## Context

`onnxruntime-node` (~254 MB native binary) was a hard `dependency` of
`@opencodehub/embedder`, eagerly imported at module top-level — so it resolved
at install AND loaded on import, even though embeddings are OFF by default and
most users run BM25-only. Goal: a default install can omit it; it loads only
when embeddings are actually opened.

## The pattern (three coordinated moves)

1. **`dependencies` → `optionalDependencies`** in `package.json`. (Keep it OUT
of `devDependencies` too — pnpm installs optional deps by default, so type
resolution and tests still work in the workspace.)

2. **Top-level value import → top-level TYPE-only import.** Types are erased at
compile, so this never triggers a runtime resolution:
```ts
import type { InferenceSession, Tensor } from "onnxruntime-node";
```

3. **Dynamic `import()` at the use site**, threading any runtime *constructor*
(here `Tensor`, used as `new Tensor(...)`) into the consumer:
```ts
// At the use site (inside the async open path), never at module top level.
let InferenceSession, Tensor;
try {
  ({ InferenceSession, Tensor } = await import("onnxruntime-node"));
} catch (cause) {
  throw new EmbedderNotSetupError("onnxruntime-node is not installed …", { cause });
}
```
A class that previously closed over the imported `Tensor` value must now
receive it via constructor param (`readonly #Tensor: typeof Tensor`) — the
type-only import gives you the *type*, the dynamic import gives you the
*value*.
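
The three moves can be sketched end to end without the real binding. This is a minimal illustration, not the embedder's actual code: `loadOptional`, `TensorCtor`, and `OnnxLikeEmbedder` are hypothetical names, the `EmbedderNotSetupError` class is a stand-in for the one named above, and the constructor shape is simplified rather than onnxruntime's real `Tensor` signature.

```ts
// Stand-in for the EmbedderNotSetupError named above (its real definition is not shown here).
class EmbedderNotSetupError extends Error {
  readonly cause: unknown;
  constructor(message: string, options?: { cause?: unknown }) {
    super(message);
    this.name = "EmbedderNotSetupError";
    this.cause = options?.cause;
  }
}

// Hypothetical helper: import an optional package lazily and map a failed
// resolution to a typed, actionable error instead of a raw ERR_MODULE_NOT_FOUND.
async function loadOptional<T>(specifier: string, hint: string): Promise<T> {
  try {
    return (await import(specifier)) as T;
  } catch (cause) {
    throw new EmbedderNotSetupError(`${specifier} is not installed. ${hint}`, { cause });
  }
}

// Simplified constructor shape (an assumption for illustration only).
type TensorCtor = new (data: Float32Array, dims: number[]) => { dims: number[] };

// The consumer no longer closes over an imported value; the runtime
// constructor is threaded in after the dynamic import has succeeded.
class OnnxLikeEmbedder {
  private readonly TensorImpl: TensorCtor;
  constructor(TensorImpl: TensorCtor) {
    this.TensorImpl = TensorImpl;
  }
  makeInput(values: number[]): { dims: number[] } {
    return new this.TensorImpl(Float32Array.from(values), [1, values.length]);
  }
}
```

In real code the specifier would stay a string literal (`await import("onnxruntime-node")`) so tooling can still see it; the variable specifier here only keeps the sketch runnable without the package installed.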

## Gotchas

- **A bundler must mark it `external`.** If the consuming CLI is bundled
(tsup/esbuild), add the optional dep to `external` so the bundler doesn't try
to inline a `.node` binary. See [[tsup-collapse-monorepo-to-single-cli]].
- **`optionalDependencies` still install by default.** The real prune requires
the END USER to pass `npm i --omit=optional` (or use a remote embedder). The
lazy import guarantees it's never LOADED without embeddings, but "pruned on
every install" is not automatic — document the flag.
- **Throw a typed, actionable error on the dynamic-import catch**, not a raw
`ERR_MODULE_NOT_FOUND`. By the time the import runs, the user has already
reached weight loading (the weights are present), so the binding is expected
to be there; the error message should name the remediation.

## Verification

80/80 embedder tests pass; `dist/onnx-embedder.js` shows `await
import("onnxruntime-node")` with zero top-level require; BM25-only path runs
with the binding absent.
@@ -0,0 +1,119 @@
---
title: Collapse a publish-many TS monorepo into one bundled CLI with tsup
tags: [tsup, esbuild, monorepo, npm, publish, bundling, workers, piscina, wasm, release-please, collapse]
modules:
- packages/cli/tsup.config.ts
- packages/cli/package.json
- packages/cli/tsconfig.test.json
- packages/cli/src/commands/doctor.ts
- .release-please-config.json
first_applied: 2026-06-04
session: session-a99b0c
track: knowledge
category: architecture-patterns
---

# Collapse a publish-many TS monorepo into one bundled CLI with tsup

## Context

OpenCodeHub published **17 npm packages** (one CLI + 16 libraries), all plain
`tsc -b`, no bundler. Goal: publish only `@opencodehub/cli`, inlining the 14
internal libs into its tarball. Motivation was operational, not cosmetic — see
the "why this matters" section. The collapse went green end-to-end (9/9
global-install gates) but only after solving five coupled problems esbuild does
NOT handle for you.

## The recipe that works

`packages/cli/tsup.config.ts`:

```ts
import { defineConfig } from "tsup";

export default defineConfig({
  entry: {
    index: "src/index.ts",
    "parse-worker": "../ingestion/src/parse/parse-worker.ts", // worker → own chunk
    "embedder-worker": "../ingestion/src/pipeline/phases/embedder-worker.ts",
  },
  format: ["esm"],
  platform: "node",
  target: "node20",
  splitting: true,
  clean: true,
  dts: false,
  // NO shims: true (see gotcha 2)
  external: [/^[^.]/], // externalize EVERY bare import …
  noExternal: [/^@opencodehub\//], // … except our own workspace libs (inline them)
  async onSuccess() {
    /* cp vendor/wasms, plugin-assets, ci-templates, config, java → dist/ */
  },
});
```
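
To make the effect of the two patterns concrete, the sketch below classifies a few specifiers the way the config above intends. It is an illustration only: `classify` is a hypothetical helper, the absolute path is made up, and the precedence (a `noExternal` match wins over `external`) is an assumption taken from the config comments rather than from tsup's source.

```ts
const external = [/^[^.]/]; // every specifier that does not start with "."
const noExternal = [/^@opencodehub\//]; // our own workspace libs

type Verdict = "bundled" | "external";

// Assumed precedence: noExternal is consulted first, then external.
function classify(specifier: string): Verdict {
  if (noExternal.some((re) => re.test(specifier))) return "bundled";
  if (external.some((re) => re.test(specifier))) return "external";
  return "bundled"; // relative imports fall through and are inlined
}

const samples = [
  "@opencodehub/mcp",
  "commander",
  "node:fs",
  "./commands/doctor.js",
  "/abs/path/to/esm_shims.js", // illustrative absolute path
];
for (const s of samples) console.log(s, classify(s));
```

The last sample shows why `shims: true` collides with this setup: an absolute path also starts with a non-dot character, so `/^[^.]/` marks it external.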

## The five things esbuild will NOT do for you

1. **Workers are not followed.** esbuild does not rewrite
`new Worker(new URL("./w.js", import.meta.url))` or piscina `filename:` — it
leaves the string verbatim, resolved at runtime against the EMITTED file. So
every worker must be a **named `entry`** that emits a sibling chunk
(`dist/parse-worker.js`) at the path the pool's `import.meta.url` expects.
`splitting: true` keeps shared code in `chunk-*.js` instead of duplicating it
into each worker.

2. **`external: [/^[^.]/]` beats an explicit allowlist** — and you must drop
`shims: true`. Externalize every bare import (anything not starting with `.`)
and bundle only `@opencodehub/*` via `noExternal`. An explicit native-only
`external` list let esbuild wander into a transitive dep's optional-plugin
`require()` graph (`@cyclonedx/cyclonedx-library` → `require("xmlbuilder2")` /
`require("libxmljs2")`) and hard-fail. But `/^[^.]/` also matches tsup's own
injected `esm_shims.js` absolute path → "cannot be marked as external". Fix:
drop `shims: true` (native ESM uses `import.meta.url` directly).

3. **Assets that load via `import.meta.url` are not copied.** esbuild's
file/copy loaders only fire on `import`-ed assets. The WASM grammars,
plugin-assets, ci-templates, scanner config TOML, and the COBOL JVM bridge
are walk-up-resolved at runtime, so copy them in `onSuccess` and make the
resolvers **walk up looking for a sentinel** (e.g. `vendor/wasms/manifest.json`)
rather than a fixed `../../` offset — the offset shifts when code is inlined.

4. **Tests don't ship in the bundle.** tsup emits only the entrypoints, so the
38 `*.test.ts` files vanish from `dist/` and `node --test` silently finds
zero tests (a green-looking regression). Add a `tsconfig.test.json` that
`tsc`-compiles the full `src` tree to a **gitignored `dist-test/`**, and point
the `test` script there. Asset-dependent tests (`init`, `ci-init`) must
resolve assets from the source-of-truth (`plugins/opencodehub`,
`src/commands/ci-templates`) since `dist-test/` has no copied assets.

5. **Deliberately-hidden dynamic imports must become static.** Code that wrote
`const s = "@opencodehub/mcp"; await import(s)` to dodge the build-time graph
now points at a package that won't exist post-collapse. Convert to a static
`import`. Same for `import.meta.resolve("@opencodehub/sarif")` probes in
`doctor.ts` — replace with a liveness check on a statically-imported symbol
(`typeof mergeSarif === "function"`). See [[doctor-probe-drift-after-rip-and-replace]].
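
The sentinel walk-up from item 3 can be sketched as follows. This is a minimal version under stated assumptions: `findUp` is a hypothetical name, and the temp-directory layout only mimics a `dist/` tree with the `vendor/wasms/manifest.json` sentinel mentioned above.

```ts
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Walk up from startDir until a directory containing `sentinel` is found.
// Returns that directory, or undefined once the filesystem root is reached.
function findUp(startDir: string, sentinel: string): string | undefined {
  let dir = startDir;
  for (;;) {
    if (existsSync(join(dir, sentinel))) return dir;
    const parent = dirname(dir);
    if (parent === dir) return undefined; // dirname of the root is the root itself
    dir = parent;
  }
}

// Demo layout: <tmp>/dist/vendor/wasms/manifest.json plus a nested code directory.
const root = mkdtempSync(join(tmpdir(), "walkup-"));
const dist = join(root, "dist");
const sentinel = join("vendor", "wasms", "manifest.json");
mkdirSync(join(dist, "vendor", "wasms"), { recursive: true });
mkdirSync(join(dist, "deep", "nested"), { recursive: true });
writeFileSync(join(dist, sentinel), "{}");

// The same answer comes back whether the caller sits at the top of dist/
// (inlined code) or several levels down (pre-collapse layout).
const fromTop = findUp(dist, sentinel);
const fromNested = findUp(join(dist, "deep", "nested"), sentinel);
console.log(fromTop === dist, fromNested === dist);
```

A fixed `../../` offset would have been correct for only one of those two call sites, which is the failure the walk-up avoids.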

## Package wiring

- The 14 internal libs → `private: true` (not published) and moved to the CLI's
**devDependencies** (tsup needs them at build time to inline from their `dist`).
- The CLI's runtime `dependencies` = exactly the third-party set the bundle
imports (derive it: `cat dist/*.js | grep -oE '(from |import\()"[^"]+"'` →
filter bare specifiers), PLUS any subprocess-spawned bins
(`@sourcegraph/scip-*`) that won't appear in the import scan but are resolved
via `createRequire` at runtime.
- `release-please`: drop the 16 private packages from `packages` + manifest;
remove the `node-workspace` plugin (no inter-package version sync needed).
- Add every newly-static workspace import to the CLI's `tsconfig.json`
`references` (e.g. `../mcp`) or composite incremental builds break.
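
The "derive it" step for runtime `dependencies` could also be done in TypeScript instead of `grep`. The function below is a rough sketch, not the project's tooling: `runtimePackages` is a hypothetical name, the regex only catches `from "x"` and `import("x")` forms (side-effect imports and the subprocess-spawned bins are missed, as the text notes), and the sample input is invented.

```ts
import { builtinModules } from "node:module";

// Extract third-party package names from bundled ESM output.
function runtimePackages(bundleSource: string): string[] {
  const specifierRe = /(?:from\s*|import\s*\(\s*)["']([^"']+)["']/g;
  const names = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = specifierRe.exec(bundleSource)) !== null) {
    const spec = match[1];
    if (!spec) continue;
    // Skip relative/absolute paths and Node builtins; only bare package specifiers remain.
    if (spec.startsWith(".") || spec.startsWith("/") || spec.startsWith("node:")) continue;
    if (builtinModules.includes(spec)) continue;
    // Reduce "pkg/sub/path" to "pkg" and "@scope/pkg/sub" to "@scope/pkg".
    const parts = spec.split("/");
    const name = spec.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0] ?? spec;
    names.add(name);
  }
  return Array.from(names).sort();
}

// Invented sample standing in for the contents of dist/*.js.
const sampleBundle = [
  'import { Command } from "commander";',
  'import { helper } from "./chunk-ABC123.js";',
  'import fs from "node:fs";',
  'import path from "path";',
  'import { Client } from "@modelcontextprotocol/sdk/client/index.js";',
  'const ort = await import("onnxruntime-node");',
].join("\n");

console.log(runtimePackages(sampleBundle));
```

The output is a candidate list to compare against `package.json`; anything resolved through `createRequire` still has to be added by hand.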

## Why this matters

The collapse is not cosmetic. It eliminates the entire
[[workspace-tarball-pack-all-publishables]] bug class (published-graph-vs-local
divergence is impossible with one package), and cuts the npm trusted-publisher
toil from 17 manual passkey-gated web-UI saves to 1 (see
[[npm-trusted-publisher-matches-entry-workflow-not-reusable]]). The shipped
tarball was 2.7 MB compressed / 27 MB unpacked (25 MB is the required vendored
WASM grammars, unchanged), with **0 nested `@opencodehub` dirs** — full inlining
confirmed.

## Related

- [[doctor-probe-drift-after-rip-and-replace]] — doctor's resolve-by-package
probes are the canonical thing that breaks on any rip/collapse.
- [[workspace-tarball-pack-all-publishables]] — the bug class this collapse kills.
- [[exclude-heavy-build-from-pnpm-recursive]] — sibling concern: docs/Astro is
still excluded from `-r build`.
6 changes: 6 additions & 0 deletions .github/workflows/ci.yml
@@ -41,6 +41,11 @@ jobs:
# single install path across the matrix. The remaining native deps
# (@duckdb/node-api, @ladybugdb/core, onnxruntime-node) ship prebuilds, so
# storage/embedder tests pass without running postinstall.
#
# Build before test: every package's `test` runs `node --test` against its
# built `dist/` (and the cli compiles `src` → `dist-test/`), so the dist
# graph must exist first. Without this step a package's test glob silently
# matches zero files and reports a vacuous pass.
strategy:
fail-fast: false
matrix:
@@ -53,6 +58,7 @@ jobs:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: jdx/mise-action@1648a7812b9aeae629881980618f079932869151 # v4
- run: pnpm install --frozen-lockfile --ignore-scripts
- run: pnpm --filter '!@opencodehub/docs' -r build
- run: pnpm --filter '!@opencodehub/docs' -r test

sarif-validate:
20 changes: 11 additions & 9 deletions .github/workflows/verify-global-install.yml
@@ -2,11 +2,13 @@
#
# planning/bulletproof-npm-install/plan.md §Verification Criteria.
#
# Per cell: pack `@opencodehub/cli` + `@opencodehub/ingestion` with
# `pnpm pack`, install both globally with `npm install -g`, run the 5 hard
# gates plus the 4 smoke commands. The matrix exercises Linux/macOS x
# Node 20/22/24 x mise/nvm/Homebrew/Volta installers so a regression in
# any one of those tool managers cannot land silently.
# Per cell: pack `@opencodehub/cli` with `pnpm pack`, install it globally with
# `npm install -g`, run the 5 hard gates plus the 4 smoke commands. The CLI is
# the only published package — the 14 internal libraries are bundled into its
# tarball at build time (tsup noExternal), so a single tarball is the entire
# install graph. The matrix exercises Linux/macOS x Node 20/22/24 x
# mise/nvm/Homebrew/Volta installers so a regression in any one of those tool
# managers cannot land silently.
#
# This workflow does NOT publish anything. RC publishes remain
# release-please's responsibility (release-please.yml). Each cell is fully
@@ -205,10 +207,10 @@ jobs:
run: pnpm --filter '!@opencodehub/docs' -r build

# ------------------------------------------------------------------
# The single-cell verifier. Packs cli + ingestion, installs them
# globally with npm, applies the 5 hard gates and runs the 4 smoke
# commands. Local mode is what runs in CI today; rc mode is
# available for future post-publish smokes.
# The single-cell verifier. Packs the cli (the only published package;
# internal libs are bundled in), installs it globally with npm, applies
# the 5 hard gates and runs the 4 smoke commands. Local mode is what runs
# in CI today; rc mode is available for future post-publish smokes.
# ------------------------------------------------------------------
- name: Verify global install (single cell)
env:
3 changes: 3 additions & 0 deletions .gitignore
@@ -46,3 +46,6 @@ examples/fixtures/**/.codehub/
# Release artifact — regenerated by cdxgen in release.yml; never committed.
# A stale committed copy poisons the local OSV scan (scans the whole tree).
SBOM.cdx.json

# tsc test-only output (CLI), never published
dist-test/
26 changes: 2 additions & 24 deletions .release-please-config.json
@@ -24,28 +24,6 @@
"package-name": "opencodehub",
"component": "root"
},
"packages/analysis": { "package-name": "@opencodehub/analysis" },
"packages/cli": { "package-name": "@opencodehub/cli" },
"packages/cobol-proleap": { "package-name": "@opencodehub/cobol-proleap" },
"packages/core-types": { "package-name": "@opencodehub/core-types" },
"packages/embedder": { "package-name": "@opencodehub/embedder" },
"packages/frameworks": { "package-name": "@opencodehub/frameworks" },
"packages/ingestion": { "package-name": "@opencodehub/ingestion" },
"packages/mcp": { "package-name": "@opencodehub/mcp" },
"packages/pack": { "package-name": "@opencodehub/pack" },
"packages/policy": { "package-name": "@opencodehub/policy" },
"packages/sarif": { "package-name": "@opencodehub/sarif" },
"packages/scanners": { "package-name": "@opencodehub/scanners" },
"packages/scip-ingest": { "package-name": "@opencodehub/scip-ingest" },
"packages/search": { "package-name": "@opencodehub/search" },
"packages/storage": { "package-name": "@opencodehub/storage" },
"packages/summarizer": { "package-name": "@opencodehub/summarizer" },
"packages/wiki": { "package-name": "@opencodehub/wiki" }
},
"plugins": [
{
"type": "node-workspace",
"updatePeerDependencies": true
}
]
"packages/cli": { "package-name": "@opencodehub/cli" }
}
}
18 changes: 1 addition & 17 deletions .release-please-manifest.json
@@ -1,20 +1,4 @@
{
".": "0.7.0",
"packages/analysis": "0.4.0",
"packages/cli": "0.6.0",
"packages/cobol-proleap": "0.2.0",
"packages/core-types": "0.4.0",
"packages/embedder": "0.1.3",
"packages/frameworks": "0.2.0",
"packages/ingestion": "0.5.0",
"packages/mcp": "0.5.0",
"packages/pack": "0.3.0",
"packages/policy": "0.2.0",
"packages/sarif": "0.2.0",
"packages/scanners": "0.2.4",
"packages/scip-ingest": "0.3.0",
"packages/search": "0.3.0",
"packages/storage": "0.3.0",
"packages/summarizer": "0.2.0",
"packages/wiki": "0.3.0"
"packages/cli": "0.6.0"
}
3 changes: 2 additions & 1 deletion packages/analysis/package.json
@@ -1,6 +1,7 @@
{
"name": "@opencodehub/analysis",
"version": "0.4.0",
"private": true,
"description": "OpenCodeHub — impact, detect_changes, staleness",
"license": "Apache-2.0",
"repository": {
@@ -33,7 +34,7 @@
],
"scripts": {
"build": "tsc -b",
"test": "node --test ./dist/*.test.js ./dist/**/*.test.js",
"test": "node --test \"./dist/**/*.test.js\"",
"clean": "rm -rf dist *.tsbuildinfo"
},
"dependencies": {
49 changes: 37 additions & 12 deletions packages/cli/package.json
@@ -27,16 +27,48 @@
"!dist/**/*.test.js.map",
"dist/**/*.d.ts.map",
"!dist/**/*.test.d.ts.map",
"dist/vendor/wasms/**",
"dist/plugin-assets/**",
"dist/commands/ci-templates/**"
"dist/commands/ci-templates/**",
"dist/config/**",
"dist/java/**"
],
"scripts": {
"build": "tsc -b && node scripts/copy-ci-templates.mjs && node scripts/copy-plugin-assets.mjs",
"test": "node --test './dist/**/*.test.js'",
"clean": "rm -rf dist *.tsbuildinfo"
"build": "tsup",
"build:test": "tsc -p tsconfig.test.json",
"test": "pnpm run build:test && node --test \"./dist-test/**/*.test.js\"",
"clean": "rm -rf dist dist-test *.tsbuildinfo"
},
"//deps": "The 14 @opencodehub/* workspace libs are INLINED into the bundle at build time (tsup noExternal) — they are devDependencies, not runtime deps. `dependencies` below is exactly the third-party set the bundle imports at runtime (kept `external`), plus the two @sourcegraph/scip-* indexers the parse pipeline spawns as subprocesses. onnxruntime-node is optional (lazy-loaded only when embeddings are enabled).",
"dependencies": {
"@apidevtools/swagger-parser": "12.1.0",
"@aws-sdk/client-bedrock-runtime": "3.1054.0",
"@aws-sdk/client-sagemaker-runtime": "3.1054.0",
"@chonkiejs/core": "^0.0.10",
"@cyclonedx/cyclonedx-library": "10.0.0",
"@duckdb/node-api": "1.5.2-r.2",
"@huggingface/tokenizers": "0.1.3",
"@iarna/toml": "2.2.5",
"@ladybugdb/core": "^0.16.1",
"@modelcontextprotocol/sdk": "1.29.0",
"@sourcegraph/scip-python": "0.6.6",
"@sourcegraph/scip-typescript": "0.4.0",
"cli-table3": "0.6.5",
"commander": "14.0.3",
"fast-xml-parser": "5.8.0",
"listr2": "10.2.1",
"lru-cache": "11.5.0",
"piscina": "5.1.4",
"snyk-nodejs-lockfile-parser": "2.7.1",
"web-tree-sitter": "0.26.9",
"write-file-atomic": "7.0.1",
"yaml": "2.9.0",
"zod": "4.4.3"
},
"optionalDependencies": {
"onnxruntime-node": "1.26.0"
},
"devDependencies": {
"@opencodehub/analysis": "workspace:*",
"@opencodehub/core-types": "workspace:*",
"@opencodehub/embedder": "workspace:*",
@@ -50,16 +82,9 @@
"@opencodehub/search": "workspace:*",
"@opencodehub/storage": "workspace:*",
"@opencodehub/wiki": "workspace:*",
"cli-table3": "0.6.5",
"commander": "14.0.3",
"envinfo": "7.21.0",
"listr2": "10.2.1",
"write-file-atomic": "7.0.1",
"yaml": "2.9.0"
},
"devDependencies": {
"@types/node": "25.9.1",
"@types/write-file-atomic": "4.0.3",
"tsup": "^8.5.1",
"typescript": "6.0.3"
},
"publishConfig": {