fix(0.30.5): close issue #269 -- 9 verified findings from security audit by padak · Pull Request #270 · keboola/cli

padak · 2026-05-07T18:22:27Z

Closes #269.

Fixes nine verified security findings surfaced by a three-stage automated audit (use-case map → state-machine → security review). Bugs A–E from #267 were intentionally out of scope. Each finding was verified by reading code; the eight non-trivial ones have explicit regression tests.

Branch base: This PR branches from fix/267-sync-git-branching-bugs (PR #268), because sec-20 modifies _coerce_keboola_id() which was added there. Merge order: #268 first, then this PR.

Summary

ID	Severity	What	File:line
sec-01 + sec-07	Critical	Path traversal via API-controlled `component_id` / `component_type` in `sync pull`	`sync/naming.py:25-28` + `services/sync_service.py:434`
sec-02 + sec-08	High / Med	MCP HTTP subprocess inherits `KBC_MASTER_TOKEN*` and `KBC_MANAGE_API_TOKEN`	`services/mcp_transport.py:115`
sec-04	High	REPL history file 0644 → 0600	`commands/repl.py:63`
sec-05	High	XSS in `lineage show --format html` (entity / config names)	`services/deep_lineage_service.py:1156, 1177, 1204, 1217-1218, 1227`
sec-06	High	`encrypt values --output-file` race (world-readable window before chmod)	`commands/encrypt.py:107-108`
sec-11	Medium	`max_parallel_workers: 0` crashes ThreadPoolExecutor (no `ge=1`)	`models.py:76-79`
sec-19	Low	`permissions check` ignores `--deny-writes` / `--deny-destructive` flags	`commands/permissions.py:367`
sec-20	Low	`_coerce_keboola_id` raises raw `ValueError` on non-numeric ID	`sync/branch_mapping.py:24`

Changes

sec-01 + sec-07 — path traversal

naming.config_path() previously interpolated component_type and component_id raw, while only config_name was sanitized. With template {component_type}/{component_id}/{config_name}, an API-controlled component_id = "../../../etc" would route sync pull writes outside the workspace.

New sync/naming.py:sanitize_path_segment() rejects /, \, and parent-directory references (..) while preserving dots, hyphens, underscores. Legitimate component IDs (keboola.ex-db-mysql, kds-team.app-custom-python, keboola.python-transformation-v2) pass through unchanged.
Both component_type and component_id now go through it in config_path().
Defense in depth: new _ensure_within_branch() in services/sync_service.py resolves branch_dir and config_dir and raises ConfigError if the resolved config dir is not contained in the branch dir. Called before any file write.

sec-02 + sec-08 — MCP HTTP env scrubbing

The HTTP transport at services/mcp_transport.py:_start() used subprocess.Popen(cmd, ...) with no env= argument → MCP server inherited the full kbagent environment. The stdio transport at mcp_service.py:_build_server_params() was already safe (explicit minimal env via MCP SDK's get_default_environment()); only the HTTP transport leaked.

New _build_minimal_env() allow-lists only what the MCP binary needs: PATH, HOME, USER, LOGNAME, TERM, SHELL, TMPDIR/TEMP/TMP, LANG, LC_ALL, LC_CTYPE, PYTHONPATH/PYTHONHOME/VIRTUAL_ENV, UV_CACHE_DIR/UV_PYTHON, XDG_*. Every KBC_* is omitted.
Per-project Storage tokens still flow via HTTP request headers (existing design).

sec-04 — REPL history 0600

commands/repl.py:_get_history_path() now atomically creates the file with os.open(O_WRONLY|O_CREAT|O_EXCL, 0o600) and tightens existing files via chmod(0o600) (silently no-op on filesystems that reject chmod, e.g. some network mounts).

sec-05 — HTML XSS

services/deep_lineage_service.py:render_er_diagram() previously did name.replace('"', "'") for entity/config names, leaving <, >, and & untouched. A Keboola table named </div><script>alert(1)</script> would inject the script into the generated HTML body where the browser parses it before Mermaid runs.

Fix: html.escape(s, quote=True) for every API-derived string in render_er_diagram(). Mermaid renders entities back to their characters in SVG text so visible output is unchanged.

sec-06 — atomic encrypt output

commands/encrypt.py replaced output_file.write_text(...) + output_file.chmod(0o600) with os.open(path, O_WRONLY|O_CREAT|O_TRUNC, 0o600) + os.write(). No race window between create and chmod.

sec-11 — max_parallel_workers ge=1

Added ge=1 to the Pydantic Field in models.py. BaseService._resolve_max_workers() also clamps to >= 1 defensively (so a legacy on-disk config with 0 does not break startup before the validator runs).

sec-19 — permissions check applies session flags

commands/permissions.py:permissions_check() now uses apply_firewall_flags() the same way permissions list does, so kbagent --deny-writes permissions check branch.create correctly returns DENIED.

sec-20 — descriptive ValueError

sync/branch_mapping.py:_coerce_keboola_id() now wraps int(raw) in try/except and raises a descriptive ValueError naming the bad value. load_branch_mapping() re-wraps with the file path so users see Failed to parse /path/to/branch-mapping.json: Invalid branch ID 'not-a-number'.

Tests

33 new regression tests:

File	Tests
`tests/test_sync_naming.py`	`TestSanitizePathSegment` (4) + `TestConfigPathTraversalRegression` (2)
`tests/test_mcp_transport.py`	`TestBuildMinimalEnv` (6)
`tests/test_repl.py`	`TestHistoryPathPermissions` (2)
`tests/test_deep_lineage_service.py`	`TestRenderErDiagramXssRegression` (2)
`tests/test_models.py`	extends `TestMaxParallelWorkersValidation` (2)
`tests/test_sync_branch_mapping.py`	extends `TestBranchMapping` and `TestBranchMappingIO` (2)
`tests/test_permissions_cli.py`	extends `TestPermissionsCheck` (2)

Total suite: 2830 passed (was 2797).

Test plan

make lint format-check skill-check changelog-check check-error-codes
uv run pytest tests/ --ignore=tests/test_e2e.py --ignore=tests/test_sync_e2e.py --ignore=tests/test_e2e_lineage_deep.py → 2830 passed
Each fix has a regression test that fails against main before the fix
Manual verification of sanitize_path_segment() on adversarial inputs (../../etc, foo/../bar, .., etc.)
Manual verification that _build_minimal_env() does not include any KBC_* even when those vars are set
Manual verification that html.escape renders Mermaid output as expected
e2e — not strictly needed; all fixes are at unit boundaries and have unit tests

Out of scope (tracked as follow-ups in #269)

sec-03 — Token in --token argv visible to ps. Design change (deprecation cycle for the flag).
sec-09 — permissions reset/set PTY bypass. Design change (second-factor / parent-process check).
sec-10 — Silent branch-name collisions in sanitize_name. UX warning, not security.
sec-12 — version_cache.json non-atomic write race.
sec-13 — --input @file reads arbitrary path. Design discussion needed.
sec-14 — KBC_MASTER_TOKEN_{ALIAS} from unvalidated alias.
sec-16 — Lineage HTML loads Mermaid from CDN without SRI.
sec-17 — KBAGENT_AUTO_UPDATE toggle stale cache.
sec-18 — doctor --fix uses PATH-resolved uv.

Each will get its own follow-up issue if/when prioritized.

Audit artifacts

The audit produced three structured documents (kept locally for the test design phase):

/tmp/kbagent-audit/use-cases.md — 28 KB — every command + 20 life situations + cross-cutting concerns
/tmp/kbagent-audit/state-machine.md — 43 KB — every persistent / in-memory state with file:line, command-by-command read/write/assert table, life-situation traces, 9 forbidden state combinations, concurrency model, 6 Mermaid state diagrams
/tmp/kbagent-audit/issues.md — 15 KB — the 20 findings with file:line, repro, exploitability, suggested fix, confidence

This release ships fixes for nine security findings surfaced by an audit driven by three sub-agents in sequence (use-case map -> state-machine -> security review). Findings A-E from issue #267 (merged in #268) are intentionally out of scope for this audit. Critical - sec-01 / sec-07: path traversal via API-controlled component_id and component_type. naming.config_path() interpolated those tokens raw, so a malicious or compromised stack returning component_id = "../../etc" would direct sync pull writes outside the workspace. Fix: new sanitize_path_segment() that rejects /, \, and parent references while preserving dots/hyphens/underscores in legitimate IDs (keboola.ex-db-mysql, kds-team.app-custom-python). Plus a defense-in-depth confinement check in sync_service.pull() that raises ConfigError if the resolved config dir is not contained in the branch dir. High - sec-02 / sec-08: MCP HTTP transport subprocess no longer inherits any KBC_* token from the kbagent process env. Pre-fix, Popen(cmd, ...) had no env= arg, so KBC_MASTER_TOKEN, KBC_MASTER_TOKEN_<ALIAS>, KBC_MANAGE_API_TOKEN, and KBC_TOKEN all leaked to the MCP server when KBAGENT_MCP_TRANSPORT=http. New _build_minimal_env() allow-lists only PATH, HOME, locale, and uv/python cache vars. Per-project Storage tokens still flow via HTTP request headers as before. Closes the v0.29.0 manage-token default-deny gap on the HTTP transport path. - sec-04: REPL history file (~/.config/keboola-agent-cli/repl_history) is now created with mode 0600. Pre-fix, FileHistory created the file with the user's umask (typically 0644), persisting any token typed at the prompt in plaintext readable by group/world. - sec-05: kbagent lineage show --format html no longer emits XSS- vulnerable HTML. render_er_diagram() previously did name.replace('"', "'") which left <, >, and & untouched -- a Keboola entity named </div><script>...</script> would inject the script. Fix uses html.escape(s, quote=True) consistently for every API-derived string. - sec-06: kbagent encrypt values --output-file now atomically creates the file with mode 0600 via os.open(..., 0o600). Replaces the previous Path.write_text() + chmod(0o600) which left a race window where the file was world-readable. Medium - sec-11: max_parallel_workers Pydantic field now requires ge=1 in addition to le=100. Pre-fix, max_parallel_workers: 0 in config.json passed validation and crashed every multi-project op with ValueError from ThreadPoolExecutor. _resolve_max_workers() also clamps defensively for legacy on-disk configs. Low - sec-19: kbagent permissions check OPERATION now reflects the EFFECTIVE policy for the invocation -- persisted policy MERGED with --deny-writes / --deny-destructive session flags -- matching permissions list semantics. Pre-fix, an AI agent inspecting its own self-imposed firewall got a misleading "allowed" answer. - sec-20: _coerce_keboola_id() and load_branch_mapping() now raise descriptive errors for malformed branch IDs in branch-mapping.json. Pre-fix, "id": "not-a-number" produced raw "invalid literal for int()" from deep inside the parser. Tests - 33 new regression tests across 7 test files. Total suite: 2830 passed (was 2797). Out of scope (tracked as follow-up in #269) - sec-03 token-in-argv deprecation - sec-09 PTY-bypass design - sec-10 silent name collisions - sec-12 version_cache atomicity - sec-13 @file path restriction - sec-14 alias validation - sec-16 SRI on Mermaid CDN - sec-17 KBAGENT_AUTO_UPDATE toggle staleness - sec-18 doctor --fix uv via PATH Audit artifacts kept locally at /tmp/kbagent-audit/use-cases.md (28K) + state-machine.md (43K) + issues.md (15K) for the test design phase.

This was referenced May 7, 2026

Security audit: 9 verified findings (path traversal, env leak, file perms, XSS, etc.) #269

Closed

Security audit follow-ups: 10 deferred findings (sec-03, sec-09, sec-10, sec-12, sec-13, sec-14, sec-16, sec-17, sec-18) #271

Open

padak deleted the branch fix/267-sync-git-branching-bugs May 7, 2026 18:35

padak closed this May 7, 2026

padak mentioned this pull request May 7, 2026

fix(0.30.5): close issue #269 -- 9 verified findings from security audit #272

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(0.30.5): close issue #269 -- 9 verified findings from security audit#270

fix(0.30.5): close issue #269 -- 9 verified findings from security audit#270
padak wants to merge 1 commit into
fix/267-sync-git-branching-bugsfrom
fix/269-security-audit-9-findings

padak commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

padak commented May 7, 2026

Summary

Changes

sec-01 + sec-07 — path traversal

sec-02 + sec-08 — MCP HTTP env scrubbing

sec-04 — REPL history 0600

sec-05 — HTML XSS

sec-06 — atomic encrypt output

sec-11 — max_parallel_workers ge=1

sec-19 — permissions check applies session flags

sec-20 — descriptive ValueError

Tests

Test plan

Out of scope (tracked as follow-ups in #269)

Audit artifacts

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant