Security audit: 9 verified findings (path traversal, env leak, file perms, XSS, etc.)

## Summary

Comprehensive security audit of `kbagent` CLI surface, performed across three sub-agents (use case map → state machine → security review). Result: **20 findings**, of which **9 are verified bugs** that warrant immediate fix in a single PR. The rest are either suspected (need design discussion), low-impact hardening, or false positives.

This is a follow-up to #267 (which fixed chained sync git-branching bugs). The audit specifically excluded those bugs and looked for OTHER issues across the entire CLI surface — auth, secrets, races, broken invariants, footguns, input validation, subprocess safety, fail-open paths, multi-project corruption, AI agent attack surface, doc drift.

The methodology and per-finding artefacts are kept locally:
- Use case map: 3000+ words, all command groups + 20 life situations + cross-cutting concerns
- State machine: every persistent + in-memory state, command-by-command transitions, 9 forbidden state combinations, concurrency model, 6 Mermaid diagrams
- Security findings doc: 20 findings with file:line, repro, exploitability, suggested fix, confidence

---

## In-scope for this PR (9 verified findings)

### Critical
| ID | Title | File:line |
|---|---|---|
| sec-01 | Path traversal via API-controlled `component_id` in `sync pull` | `sync/naming.py:25-28` |

### High
| ID | Title | File:line |
|---|---|---|
| sec-02 + sec-08 | `KBC_MASTER_TOKEN_*` and `KBC_MANAGE_API_TOKEN` inherited by MCP HTTP transport | `services/mcp_transport.py:115-119` |
| sec-04 | REPL history file stores tokens with default umask (typically 0644) | `commands/repl.py:63-67` |
| sec-05 | XSS in lineage HTML output via unescaped Keboola entity names | `services/deep_lineage_service.py:1156, 1177, 1204, 1217-1218, 1227` |
| sec-06 | `encrypt values --output-file` race: file is world-readable between `write_text` and `chmod(0o600)` | `commands/encrypt.py:107-108` |

### Medium
| ID | Title | File:line |
|---|---|---|
| sec-07 | `component_type` from API also unsanitized (companion to sec-01) | `sync/naming.py:26-27` |
| sec-11 | `max_parallel_workers` accepts `0` (no `ge=1`) → `ThreadPoolExecutor` crash | `models.py:76-79` |

### Low
| ID | Title | File:line |
|---|---|---|
| sec-19 | `permissions check` ignores `--deny-writes`/`--deny-destructive` session flags | `commands/permissions.py:367` |
| sec-20 | `_coerce_keboola_id` raises raw `ValueError` on non-numeric ID | `sync/branch_mapping.py:24` |

---

## Detailed findings (in-scope)

### sec-01 — Path traversal via API-controlled `component_id` in `sync pull` (CRITICAL)

`naming.config_path()` interpolates `component_type` and `component_id` directly from the Keboola API response into the filesystem path template:

```python
return naming_template.format(
    component_type=component_type,        # NOT sanitized
    component_id=component_id,            # NOT sanitized
    config_name=sanitize_name(config_name),
)
```

Default template `{component_type}/{component_id}/{config_name}` exposes both first two segments to whatever the API returns. A compromised stack or supply-chain attack injecting `component_id = "../../../.ssh"` would direct file writes outside the sync workspace.

**Repro:**
```python
>>> from pathlib import Path
>>> Path('/tmp/proj/main') / 'other/../../../etc/passwd/test'
# resolves to /private/tmp/etc/passwd/test — outside the workspace
```

No `Path.is_relative_to()` confinement check exists anywhere in `sync_service.py` before file writes.

**Fix:** Apply `sanitize_name()` to `component_type` and `component_id` in `naming.config_path()`, AND add a defensive confinement check (`config_dir.resolve().is_relative_to(branch_dir.resolve())`) before any file write.

---

### sec-02 + sec-08 — Master / Manage tokens inherited by MCP HTTP transport (HIGH/MED)

`services/mcp_transport.py:115-119`:

```python
self._process = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
```

No `env=` argument → MCP server subprocess inherits the **full kbagent environment** including `KBC_MASTER_TOKEN`, `KBC_MASTER_TOKEN_*`, and (when `--allow-env-manage-token` is set) `KBC_MANAGE_API_TOKEN`. The default-deny in v0.29.0 protected against passing manage token to API calls within kbagent itself, but the MCP HTTP transport bypasses that protection by inheriting all env vars.

The stdio transport at `services/mcp_service.py:228-239` correctly builds a minimal explicit env (KBC_STORAGE_TOKEN/URL/BRANCH_ID only). The HTTP transport does not.

**Fix:** Build explicit minimal env dict in `_start()`, mirroring the stdio approach. Block-list all `KBC_*` env vars except the per-project ones explicitly required.

---

### sec-04 — REPL history file stores tokens with default umask (HIGH)

`commands/repl.py:63-67`:

```python
def _get_history_path() -> Path:
    config_dir = Path(platformdirs.user_config_dir("keboola-agent-cli"))
    config_dir.mkdir(parents=True, exist_ok=True)
    return config_dir / "repl_history"
```

`prompt_toolkit.FileHistory` then creates this file with the user's default umask (typically 0644). Every line typed at the REPL — including `project add --token 901-xxx-Secret` — is persisted in plaintext.

**Fix:** Pre-create the history file with `os.open(path, O_WRONLY|O_CREAT, 0o600)` before passing to FileHistory; or `chmod(0o600)` the existing file at startup if it exists.

---

### sec-05 — XSS in lineage HTML output (HIGH)

`services/deep_lineage_service.py:1156` and similar:

```python
safe_name = entity_name.replace('"', "'")
```

This replaces double-quotes with single-quotes but does NOT escape `<`, `>`, or `&`. A Keboola table or config named `</div><script>alert(1)</script>` survives sanitization and is embedded into the generated HTML.

**Fix:** Use `html.escape()` consistently for all API-derived strings (`entity_name`, `cfg_label`, `inp_name`, `out_name`) before embedding into the Mermaid body or the surrounding HTML scaffold.

---

### sec-06 — `encrypt values --output-file` race (HIGH)

`commands/encrypt.py:107-108`:

```python
output_file.write_text(json.dumps(result, indent=2), encoding="utf-8")
output_file.chmod(0o600)
```

`write_text` creates the file with the default umask (typically 0644). On systems with shared multi-user filesystems, the race window between create and chmod allows other users to read the encrypted secrets file.

**Fix:** Use `os.open(path, O_WRONLY|O_CREAT|O_TRUNC, 0o600)` then `os.write()` — atomic permission set at creation time.

---

### sec-07 — `component_type` from API also unsanitized (MEDIUM)

Same root cause as sec-01 but for `component_type`. `component_type = "other/../transformations"` would misdirect file writes. Fix together with sec-01.

---

### sec-11 — `max_parallel_workers` accepts 0 (MEDIUM)

`models.py:76-79`:

```python
max_parallel_workers: int = Field(default=10, le=100)
```

No `ge=1` lower bound. Setting to 0 in config.json passes Pydantic validation, then `ThreadPoolExecutor(max_workers=0)` raises `ValueError`, crashing every multi-project operation.

**Fix:** Add `ge=1` to the `Field`. Add a defensive clamp in `_resolve_max_workers()` so an old config.json with `0` doesn't break startup.

---

### sec-19 — `permissions check` ignores session firewall flags (LOW)

`commands/permissions.py:367`:

```python
engine = PermissionEngine(config.permissions)
allowed = engine.is_allowed(operation)
```

The companion `permissions list` command at line 115 correctly applies session flags via `apply_firewall_flags()`. `permissions check` does not — so `kbagent --deny-writes permissions check branch.create` returns ALLOWED even though actual enforcement would deny.

**Fix:** Apply `apply_firewall_flags(config.permissions, deny_writes=ctx.obj.get('deny_writes'), deny_destructive=ctx.obj.get('deny_destructive'))` consistently with `permissions list`.

---

### sec-20 — `_coerce_keboola_id` raises raw `ValueError` (LOW)

`sync/branch_mapping.py:24`:

```python
return int(raw)
```

A hand-edited `branch-mapping.json` with `{"id": "not-a-number"}` raises raw `ValueError`, crashing all sync operations. Should raise a descriptive `ConfigError`, or treat as `None` (production) with a warning.

**Fix:** Wrap in `try/except ValueError` → either descriptive error or warning + None.

---

## Out of scope for this PR (deferred / suspected / discussion needed)

| ID | Severity | Title | Why deferred |
|---|---|---|---|
| sec-03 | high | Token in `--token` argv visible to `ps` | Design change: deprecation cycle for the flag |
| sec-09 | medium | `permissions reset` PTY bypass | Design change: requires second-factor (sudo-pattern) |
| sec-10 | medium | Branch name silent collisions | UX improvement, not security |
| sec-12 | medium | `version_cache.json` non-atomic write race | Low impact; needs atomic-write helper |
| sec-13 | medium | `--input @file` reads arbitrary path | Design discussion: restrict to relative paths or whitelist? |
| sec-14 | medium | `KBC_MASTER_TOKEN_{ALIAS}` from unvalidated alias | Design: validate alias on `project add` |
| sec-16 | low | Lineage HTML loads Mermaid from CDN (no SRI) | Hardening; needs Mermaid version pin policy |
| sec-17 | low | `KBAGENT_AUTO_UPDATE` toggle stale cache | Document or invalidate on transition |
| sec-18 | low | `doctor --fix` uses PATH for `uv` | Low risk in practice; needs design |
| sec-15 | n/a | manifest `apiHost` not used (false positive) | Confirmed safe — no SSRF |

Each deferred finding will get its own follow-up issue if/when prioritized.

---

## Methodology

This audit was generated by three specialized sub-agents in sequence:

1. **kbagent expert** read the entire command surface (`commands/`, `AGENT_CONTEXT`, workflow docs) and produced a complete use case map: every command's pre-conditions, happy paths, flag combinations that materially change behavior, plus 20 multi-command "life situations" representing real user goals (new dev onboarding, dev-branch workflow, CI/CD token rotation, AI agent flow, etc.). Plus cross-cutting concerns (auth model, branch context, multi-project semantics, local state, env vars).

2. **state-machine engineer** read the use case map and traced actual code paths. Output: every persistent and in-memory state with type, owner, persistence, lifetime, invariants; per-command read/write/assert table; per-life-situation transition trace with `file:line` references; 9 forbidden state combinations and their guards (or absence); concurrency / process model; 6 Mermaid state diagrams.

3. **security engineer** read both prior outputs and audited 11 categories: auth/authz, secrets, races/TOCTOU, broken invariants, footguns/UX traps, input validation, subprocess safety, fail-open vs fail-closed, multi-project state corruption, AI agent attack surface, doc drift. Each finding has severity, category, file:line, repro, cost, suggested fix, confidence.

This is the kind of audit that surfaces issues the original developers and reviewers don't see because they're too close to the code. Bugs A–E from #267 were intentionally excluded — those are already fixed in #268.

The PR will follow the same protocol as #267 / #268: e2e verification (where applicable), regression tests, version bump, changelog, ruff lint/format, and CI checks.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security audit: 9 verified findings (path traversal, env leak, file perms, XSS, etc.) #269

Summary

In-scope for this PR (9 verified findings)

Critical

High

Medium

Low

Detailed findings (in-scope)

sec-01 — Path traversal via API-controlled `component_id` in `sync pull` (CRITICAL)

sec-02 + sec-08 — Master / Manage tokens inherited by MCP HTTP transport (HIGH/MED)

sec-04 — REPL history file stores tokens with default umask (HIGH)

sec-05 — XSS in lineage HTML output (HIGH)

sec-06 — `encrypt values --output-file` race (HIGH)

sec-07 — `component_type` from API also unsanitized (MEDIUM)

sec-11 — `max_parallel_workers` accepts 0 (MEDIUM)

sec-19 — `permissions check` ignores session firewall flags (LOW)

sec-20 — `_coerce_keboola_id` raises raw `ValueError` (LOW)

Out of scope for this PR (deferred / suspected / discussion needed)

Methodology

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

ID	Title	File:line
sec-02 + sec-08	`KBC_MASTER_TOKEN_*` and `KBC_MANAGE_API_TOKEN` inherited by MCP HTTP transport	`services/mcp_transport.py:115-119`
sec-04	REPL history file stores tokens with default umask (typically 0644)	`commands/repl.py:63-67`
sec-05	XSS in lineage HTML output via unescaped Keboola entity names	`services/deep_lineage_service.py:1156, 1177, 1204, 1217-1218, 1227`
sec-06	`encrypt values --output-file` race: file is world-readable between `write_text` and `chmod(0o600)`	`commands/encrypt.py:107-108`

ID	Title	File:line
sec-07	`component_type` from API also unsanitized (companion to sec-01)	`sync/naming.py:26-27`
sec-11	`max_parallel_workers` accepts `0` (no `ge=1`) → `ThreadPoolExecutor` crash	`models.py:76-79`

ID	Title	File:line
sec-19	`permissions check` ignores `--deny-writes`/`--deny-destructive` session flags	`commands/permissions.py:367`
sec-20	`_coerce_keboola_id` raises raw `ValueError` on non-numeric ID	`sync/branch_mapping.py:24`

ID	Severity	Title	Why deferred
sec-03	high	Token in `--token` argv visible to `ps`	Design change: deprecation cycle for the flag
sec-09	medium	`permissions reset` PTY bypass	Design change: requires second-factor (sudo-pattern)
sec-10	medium	Branch name silent collisions	UX improvement, not security
sec-12	medium	`version_cache.json` non-atomic write race	Low impact; needs atomic-write helper
sec-13	medium	`--input @file` reads arbitrary path	Design discussion: restrict to relative paths or whitelist?
sec-14	medium	`KBC_MASTER_TOKEN_{ALIAS}` from unvalidated alias	Design: validate alias on `project add`
sec-16	low	Lineage HTML loads Mermaid from CDN (no SRI)	Hardening; needs Mermaid version pin policy
sec-17	low	`KBAGENT_AUTO_UPDATE` toggle stale cache	Document or invalidate on transition
sec-18	low	`doctor --fix` uses PATH for `uv`	Low risk in practice; needs design
sec-15	n/a	manifest `apiHost` not used (false positive)	Confirmed safe — no SSRF

Security audit: 9 verified findings (path traversal, env leak, file perms, XSS, etc.) #269

Description

Summary

In-scope for this PR (9 verified findings)

Critical

High

Medium

Low

Detailed findings (in-scope)

sec-01 — Path traversal via API-controlled component_id in sync pull (CRITICAL)

sec-02 + sec-08 — Master / Manage tokens inherited by MCP HTTP transport (HIGH/MED)

sec-04 — REPL history file stores tokens with default umask (HIGH)

sec-05 — XSS in lineage HTML output (HIGH)

sec-06 — encrypt values --output-file race (HIGH)

sec-07 — component_type from API also unsanitized (MEDIUM)

sec-11 — max_parallel_workers accepts 0 (MEDIUM)

sec-19 — permissions check ignores session firewall flags (LOW)

sec-20 — _coerce_keboola_id raises raw ValueError (LOW)

Out of scope for this PR (deferred / suspected / discussion needed)

Methodology

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

sec-01 — Path traversal via API-controlled `component_id` in `sync pull` (CRITICAL)

sec-06 — `encrypt values --output-file` race (HIGH)

sec-07 — `component_type` from API also unsanitized (MEDIUM)

sec-11 — `max_parallel_workers` accepts 0 (MEDIUM)

sec-19 — `permissions check` ignores session firewall flags (LOW)

sec-20 — `_coerce_keboola_id` raises raw `ValueError` (LOW)