generate-cli produces non-deterministic bundles: tmp paths and schema key order leak into artifact #180

@imroc

Summary

When running mcporter generate-cli --bundle ... repeatedly with no upstream changes, the produced *.bundle.js differs significantly between runs. There are two independent sources of non-determinism, both of which leak build-time state into the artifact:

  1. Per-run random tmp/mcporter-cli-XXXXXX/ and /var/folders/.../T/tmp.XXXXXXXXXX paths get embedded as JS strings.
  2. The order of tools in the embeddedSchemas object is not stable.

For projects that commit generated bundles to git (so teammates can git clone && run without installing mcporter), every regeneration produces hundreds of lines of meaningless diff that buries any real schema change. We hit this trying to share a bundled CLI across our team — re-generating after no upstream change still produced ~825-line diffs per bundle, almost all noise.

Repro

# any HTTP MCP server with a stable schema
cat > /tmp/cfg.json <<'EOF'
{ "mcpServers": { "demo": { "url": "https://example.com/mcp", "headers": {"Authorization": "Bearer ${TOKEN}"} } } }
EOF

TOKEN=xxx mcporter --config /tmp/cfg.json generate-cli --server demo --bundle /tmp/a.js --runtime node --bundler rolldown
TOKEN=xxx mcporter --config /tmp/cfg.json generate-cli --server demo --bundle /tmp/b.js --runtime node --bundler rolldown

diff /tmp/a.js /tmp/b.js | wc -l   # non-zero, often hundreds of lines

Root causes (located in source)

mcporter @ ae3b83c (current main).

1. tmp paths leak into artifact

src/generate-cli.ts:97-101:

const tmpPrefix = path.join(process.cwd(), "tmp", "mcporter-cli-");
await fs.mkdir(path.dirname(tmpPrefix), { recursive: true });
templateTmpDir = await fs.mkdtemp(tmpPrefix);
templateOutputPath = path.join(templateTmpDir, `${name}.ts`);

fs.mkdtemp adds a 6-char random suffix, and the resulting path becomes the templateOutputPath. Build-time paths (this template path and any temporary --config path) then surface in three places inside the bundle:

  • a rolldown region marker: //#region tmp/mcporter-cli-XXXXXX/<server>.ts
  • embeddedServer.source.path and embeddedServer.sources[].path (set when the user passes --config <path>, e.g. a mktemp config file used in CI/scripts; see src/cli/generate/definition.ts:47)
  • embeddedMetadata.invocation.configPath

These paths are pure metadata. As far as I can tell from reading src/cli/generate/template.ts:246 (normalizeEmbeddedServer) and the runtime path, none of the source / sources / configPath fields are read at runtime — only embeddedServer.command (url/headers for HTTP, args/cwd for stdio) is consumed. So embedding tmp paths is purely build-state leakage.
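To make the source of the randomness concrete, here is a minimal sketch that calls mkdtemp twice with the same prefix. It uses the synchronous Node.js API and the OS temp directory rather than mcporter's own tmp/ folder; both choices are assumptions made only to keep the example self-contained.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Same prefix on both calls, as in generate-cli.ts; only the suffix differs.
const prefix = path.join(os.tmpdir(), "mcporter-cli-");
const first = fs.mkdtempSync(prefix);
const second = fs.mkdtempSync(prefix);

// Any string that embeds these paths will differ from run to run.
console.log(path.basename(first), path.basename(second));

// Clean up the directories created for the demonstration.
fs.rmSync(first, { recursive: true, force: true });
fs.rmSync(second, { recursive: true, force: true });
```

Each call returns the prefix plus a different random suffix, so a bundle that records the path verbatim cannot be byte-identical across runs.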

2. embeddedSchemas tool order is non-stable

src/cli/generate/tools.ts:53-61:

export function buildEmbeddedSchemaMap(
  tools: ToolMetadata[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const entry of tools) {
    if (entry.tool.inputSchema && typeof entry.tool.inputSchema === "object") {
      result[entry.tool.name] = entry.tool.inputSchema;
    }
  }
  return result;
}

Iteration order is whatever order the upstream MCP server returned from list_tools. Because the result is then passed through JSON.stringify and embedded in the bundle as a literal, any server-side reordering becomes a noisy client-side diff, even when no schema actually changed.
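The effect can be reproduced in isolation. The sketch below copies the function body from above and feeds it the same two tools in different orders; the ToolMetadata interface and the tool names are simplified, hypothetical stand-ins, not mcporter's real types.

```typescript
// Hypothetical, simplified stand-in for mcporter's ToolMetadata type.
interface ToolMetadata {
  tool: { name: string; inputSchema?: unknown };
}

// Same logic as buildEmbeddedSchemaMap above: key order follows input order.
function buildEmbeddedSchemaMap(tools: ToolMetadata[]): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const entry of tools) {
    if (entry.tool.inputSchema && typeof entry.tool.inputSchema === "object") {
      result[entry.tool.name] = entry.tool.inputSchema;
    }
  }
  return result;
}

const search: ToolMetadata = { tool: { name: "search", inputSchema: { type: "object" } } };
const fetchDoc: ToolMetadata = { tool: { name: "fetch", inputSchema: { type: "object" } } };

// Two listings with identical tools, differing only in order.
const runA = JSON.stringify(buildEmbeddedSchemaMap([search, fetchDoc]));
const runB = JSON.stringify(buildEmbeddedSchemaMap([fetchDoc, search]));

console.log(runA === runB); // false: same schemas, different serialized text
```

JSON.stringify emits string keys in insertion order, which is why the two outputs differ even though they describe the same set of schemas.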

Suggested fixes

Both fixes are small and isolated:

Fix 1 — stable tmp / strip build paths from artifact

Either:

  • (a) Use a deterministic, non-random templateTmpDir (e.g. tmp/mcporter-cli/<hashOfServerName>/) that gets cleaned up on success — the mkdtemp only seems necessary to avoid concurrent runs colliding, which a per-server-name dir solves equally well.
  • (b) Keep mkdtemp but scrub the temporary path out of the embedded metadata before serializing (replace with a placeholder like <tmpdir> or simply null). Since the runtime doesn't read these fields, this is safe.
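A rough sketch of both options follows, as a starting point rather than a finished patch. The helper names (stableTemplateDir, scrubBuildPaths), the 12-character SHA-256 prefix, and the EmbeddedServer shape are all assumptions for illustration; the field names are taken from the description above and the real types in mcporter may differ.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Option (a): derive the template directory from the server name, not from randomness.
// The 12-character digest prefix is an arbitrary choice for this sketch.
function stableTemplateDir(cwd: string, serverName: string): string {
  const digest = createHash("sha256").update(serverName).digest("hex").slice(0, 12);
  return path.join(cwd, "tmp", "mcporter-cli", digest);
}

// Option (b): replace build-time paths with a placeholder before serializing.
// Hypothetical shape; only the fields named in this issue are modeled.
interface EmbeddedServer {
  source?: { path?: string };
  sources?: Array<{ path?: string }>;
  [key: string]: unknown;
}

const PLACEHOLDER = "<tmpdir>";

function scrubBuildPaths(server: EmbeddedServer): EmbeddedServer {
  return {
    ...server,
    source: server.source ? { ...server.source, path: PLACEHOLDER } : server.source,
    sources: server.sources?.map((s) => ({ ...s, path: PLACEHOLDER })),
  };
}

// Illustrative input only; the path and command values are made up.
const scrubbed = scrubBuildPaths({
  source: { path: "/tmp/example-run/cfg.json" },
  sources: [{ path: "/tmp/example-run/cfg.json" }],
  command: { url: "https://example.com/mcp" },
});
console.log(JSON.stringify(scrubbed));
console.log(stableTemplateDir("/repo", "demo") === stableTemplateDir("/repo", "demo")); // true
```

Two caveats on this sketch: with option (a), two concurrent runs for the same server would share one directory, so some locking or cleanup discipline may still be needed; and option (b) on its own would presumably not change the rolldown region marker, since that comment is derived from the template file's path rather than from the embedded metadata.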

Fix 2 — stable schema key order

Sort tools by name before building the map:

export function buildEmbeddedSchemaMap(
  tools: ToolMetadata[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  const sorted = [...tools].sort((a, b) =>
    a.tool.name.localeCompare(b.tool.name),
  );
  for (const entry of sorted) {
    if (entry.tool.inputSchema && typeof entry.tool.inputSchema === "object") {
      result[entry.tool.name] = entry.tool.inputSchema;
    }
  }
  return result;
}
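One possible refinement: localeCompare called without an explicit locale uses the host's default locale, so the resulting order could in principle differ between machines. If byte-for-byte reproducibility is the goal, a plain code-unit comparison may be the safer comparator. A minimal sketch, with a hypothetical helper name and made-up tool names:

```typescript
// Compare by UTF-16 code units: independent of the host locale and ICU data.
function compareByCodeUnit(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

const names = ["search", "Fetch", "archive", "fetch_all"];
const sortedNames = [...names].sort(compareByCodeUnit);

console.log(sortedNames); // ["Fetch", "archive", "fetch_all", "search"]
```

Uppercase letters sort before lowercase ones under this comparator, which looks less natural than a locale-aware order but is the same on every machine. In the function above it would be used as sort((a, b) => compareByCodeUnit(a.tool.name, b.tool.name)).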

The same probably applies anywhere else that tool order influences serialized output (Commander subcommand registration, help text). Those cases would be cosmetic, but sorting there too keeps the output consistent.

Why this matters (use case)

Sharing bundled CLIs across a team via git: clone + node bundle.js ... for users (no mcporter install needed), regenerate.sh for maintainers when upstream schema changes. With current behavior, every re-generate commit looks like a 1000-line schema rewrite even when only temporary paths or tool ordering changed, making code review and git log analysis impractical.

Environment

  • mcporter 0.11.1 (also reproduced on main @ ae3b83c)
  • Node.js v26.0.0
  • macOS 25.4.0 (Darwin arm64)
  • runtime=node, bundler=rolldown

Happy to provide a PR if the maintainers think the suggested approach is correct.
