Skip to content

security: harden credential handling, pipeline JS execution, and the HTTP/stream layer#5

Open
lhfer wants to merge 5 commits into
modelstudioai:mainfrom
lhfer:claude/busy-noether-Rjbaz
Open

security: harden credential handling, pipeline JS execution, and the HTTP/stream layer#5
lhfer wants to merge 5 commits into
modelstudioai:mainfrom
lhfer:claude/busy-noether-Rjbaz

Conversation

@lhfer
Copy link
Copy Markdown

@lhfer lhfer commented May 29, 2026

Summary

A security review surfaced several issues across credential handling, the pipeline
engine, and the HTTP/stream layer. This PR fixes the cleanly-fixable ones with
regression tests
; a few items that need a product/contract decision are listed at
the end rather than changed unilaterally.

Source-only (no dependency changes). vp check is green; unit suites pass
(core 9, cli 81; e2e skipped without an API key).

Fixes

Credential handling

  • config set no longer echoes secrets in cleartext. bl config set --key api_key …
    (and access_token/access_key_id/access_key_secret) printed the stored value
    verbatim to stdout — unlike config show/auth status, which mask. The confirmation
    echo now masks secret-valued keys.
  • --verbose request logs use maskToken() instead of printing the first 8 chars of
    the bearer token / AccessKey id (http.ts, knowledge retrieve).
  • telemetry.jsonl is written 0600 (was default/world-readable).
  • ensureConfigDir enforces 0700 via an explicit chmod (mkdir's mode is ignored for
    a pre-existing ~/.bailian that may already hold cleartext credentials).

Pipeline engine

  • script/js code must be a literal string. A step's code is a resolved input, so
    it could be written as { $from: <chat-step> } — turning model/API output into the body
    of new Function (untrusted data → arbitrary host code execution). Validation now rejects
    $from/expression-sourced code; authoring a literal script/js step is unchanged.
  • --dry-run never executes $js. Planning previously ran no-arg $js via new Function;
    a preview of an untrusted pipeline must not run code. Planning now returns the placeholder.
  • getByJsonPointer blocks __proto__/constructor/prototype and only follows own
    properties.
  • --concurrency is clamped to 64 to bound fan-out.

HTTP / stream layer

  • URL path segments are encodeURIComponent-d: task_id (from the server's async-submit
    response, fetched back with the bearer token attached) plus app_id/node_id/schema_id.
  • SSE parser buffer is bounded (16 MiB) against streams with no newline / one giant event.
  • base_url / console_gateway_url are validated as real http(s) URLs via new URL,
    replacing a startsWith("http") check that also accepted e.g. httpfoo://….

Not changed here (maintainer decision)

  • script/js / $js runtime still uses new Function (full host authority). With the
    literal-code rule, untrusted data can no longer become the code body; remaining exposure
    is authored code (the file is the trust boundary). Stronger isolation is a product call.
  • base_url host allow-listing — would break proxy/self-host users.
  • --console loopback callback (wildcard CORS, token via GET query, no Host check, state
    embedded in the notice param) — hardening depends on the console server contract.
  • Credential precedence: a config.json key silently overrides DASHSCOPE_API_KEY.
  • Telemetry derives a persistent device_id from the MAC address and forwards raw error
    messages to gm.mmstat.com (default-on; DO_NOT_TRACK=1 opts out).

Testing

  • vp check: pass · core 9 tests pass · cli 81 tests pass (48 e2e skipped)
  • New regression tests cover every fix above.

claude added 5 commits May 29, 2026 12:34
- config set: mask api_key/access_token/access_key_id/access_key_secret in the
  confirmation echo. It previously printed the stored secret verbatim to stdout
  (CI logs, pipes, screen shares), unlike `config show` / `auth status` which
  already maskToken().
- http / knowledge retrieve: use maskToken() in --verbose request logs instead
  of printing the first 8 chars of the bearer token / AccessKey id.
- telemetry: write telemetry.jsonl with mode 0600 (was created world-readable
  by default), matching the other credential-area writers.
- ensureConfigDir: chmod 0700 after mkdir, so a pre-existing ~/.bailian created
  by an older build/another tool (where mkdir's mode is ignored) holding
  cleartext credentials gets locked down too. Best-effort; never fatal.

https://claude.ai/code/session_017ZGQCjwNQF5Pz96gLUnnG1
- endpoints: encodeURIComponent the id segments (task_id, app_id, node_id,
  schema_id) interpolated into request URLs. task_id in particular comes from
  the server's async-submit response and is fetched back with the bearer token
  attached, so an unencoded value could steer the authenticated follow-up
  request to a different path on the host.
- stream (SSE parser): cap the in-memory buffer (16 MiB). A stream that never
  emits a newline, or that builds one enormous event from many data: lines,
  could otherwise grow the buffer without bound and exhaust process memory.

https://claude.ai/code/session_017ZGQCjwNQF5Pz96gLUnnG1
…rrency

- expressions: never execute $js during planning/dry-run. `pipeline run --dry-run`
  is the command a cautious user runs to preview an unfamiliar pipeline; it must
  not run embedded JavaScript. Planning now returns the expression placeholder
  instead of calling new Function.
- schema (getByJsonPointer): block __proto__/constructor/prototype and require
  own properties, so a crafted $from/$input path cannot pull object internals
  (e.g. constructor) out of step output and feed them downstream.
- scheduler: clamp --concurrency to a maximum (64) to bound fan-out so a single
  run cannot launch an unbounded number of concurrent API calls / downloads.

Note: the runtime new Function sinks in script/js and $js (arbitrary host code
execution) are intentionally left unchanged here — remediating them is a design
decision (sandbox vs. literal-only code) for the maintainers; see PR notes.

https://claude.ai/code/session_017ZGQCjwNQF5Pz96gLUnnG1
…ted-code RCE)

script/js executes its `code` as host JavaScript (via new Function), and a step's
`code` is a *resolved* input — so it could be written as `{ $from: <chat-step> }`,
turning model/API output into the body of the executed function (untrusted data
-> arbitrary host code execution). Pipeline validation now requires script/js
`code` to be a literal string: any $from/expression-sourced code is rejected.

Authoring a literal script/js step remains supported (the pipeline file is the
trust boundary, like a shell/npm script). Combined with "dry-run never executes
$js", this closes the path where untrusted text reaches the JS sink.

Adds regression tests: $from-sourced code rejected, literal code accepted,
dry-run does not execute $js, getByJsonPointer blocks prototype/inherited keys,
and concurrency clamps to the maximum.

https://claude.ai/code/session_017ZGQCjwNQF5Pz96gLUnnG1
…) URLs

The config file accepted any value that merely starts with "http" (so even
"httpfoo://evil" passed) for base_url and console_gateway_url — origins the
client sends the Bearer token to. Validate them with `new URL()` and an
http:/https: protocol check instead, rejecting malformed values. Valid http(s)
URLs (including custom proxies and local http) are unaffected.

https://claude.ai/code/session_017ZGQCjwNQF5Pz96gLUnnG1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants