Skip to content

fix(0.30.5): close issue #269 -- 9 verified findings from security audit#272

Merged
padak merged 3 commits into
mainfrom
fix/269-security-audit-9-findings
May 7, 2026
Merged

fix(0.30.5): close issue #269 -- 9 verified findings from security audit#272
padak merged 3 commits into
mainfrom
fix/269-security-audit-9-findings

Conversation

@padak
Copy link
Copy Markdown
Member

@padak padak commented May 7, 2026

Closes #269. (Replaces auto-closed PR #270 — original PR was closed when its base fix/267-sync-git-branching-bugs was deleted upon merge of #268.)

Fixes nine verified security findings surfaced by a three-stage automated audit (use-case map → state-machine → security review). Bugs A–E from #267 were intentionally out of scope. Each finding was verified by reading code; the eight non-trivial ones have explicit regression tests. Branch is now rebased onto current main (#268 already merged).

Summary

ID Severity What File:line
sec-01 + sec-07 Critical Path traversal via API-controlled component_id / component_type in sync pull sync/naming.py:25-28 + services/sync_service.py:434
sec-02 + sec-08 High / Med MCP HTTP subprocess inherits KBC_MASTER_TOKEN* and KBC_MANAGE_API_TOKEN services/mcp_transport.py:115
sec-04 High REPL history file 0644 → 0600 commands/repl.py:63
sec-05 High XSS in lineage show --format html (entity / config names) services/deep_lineage_service.py:1156, 1177, 1204, 1217-1218, 1227
sec-06 High encrypt values --output-file race (world-readable window before chmod) commands/encrypt.py:107-108
sec-11 Medium max_parallel_workers: 0 crashes ThreadPoolExecutor (no ge=1) models.py:76-79
sec-19 Low permissions check ignores --deny-writes / --deny-destructive flags commands/permissions.py:367
sec-20 Low _coerce_keboola_id raises raw ValueError on non-numeric ID sync/branch_mapping.py:24

Tests

33 new regression tests across 7 test files. Each fix has a dedicated test that fails against main before the fix.

Total suite: 2830 passed (was 2797).

Test plan

  • make lint format-check skill-check changelog-check check-error-codes
  • uv run pytest tests/ --ignore=tests/test_e2e.py --ignore=tests/test_sync_e2e.py --ignore=tests/test_e2e_lineage_deep.py → 2830 passed
  • Each fix has a regression test that fails against main before the fix
  • Manual verification of sanitize_path_segment() on adversarial inputs
  • Manual verification that _build_minimal_env() does not include any KBC_*
  • e2e — not strictly needed; all fixes are at unit boundaries

Related

This release ships fixes for nine security findings surfaced by an
audit driven by three sub-agents in sequence (use-case map ->
state-machine -> security review). Findings A-E from issue #267
(merged in #268) are intentionally out of scope for this audit.

Critical
- sec-01 / sec-07: path traversal via API-controlled component_id and
  component_type. naming.config_path() interpolated those tokens raw,
  so a malicious or compromised stack returning component_id =
  "../../etc" would direct sync pull writes outside the workspace.
  Fix: new sanitize_path_segment() that rejects /, \, and parent
  references while preserving dots/hyphens/underscores in legitimate
  IDs (keboola.ex-db-mysql, kds-team.app-custom-python). Plus a
  defense-in-depth confinement check in sync_service.pull() that
  raises ConfigError if the resolved config dir is not contained in
  the branch dir.

High
- sec-02 / sec-08: MCP HTTP transport subprocess no longer inherits
  any KBC_* token from the kbagent process env. Pre-fix,
  Popen(cmd, ...) had no env= arg, so KBC_MASTER_TOKEN,
  KBC_MASTER_TOKEN_<ALIAS>, KBC_MANAGE_API_TOKEN, and KBC_TOKEN all
  leaked to the MCP server when KBAGENT_MCP_TRANSPORT=http. New
  _build_minimal_env() allow-lists only PATH, HOME, locale, and
  uv/python cache vars. Per-project Storage tokens still flow via
  HTTP request headers as before. Closes the v0.29.0 manage-token
  default-deny gap on the HTTP transport path.
- sec-04: REPL history file (~/.config/keboola-agent-cli/repl_history)
  is now created with mode 0600. Pre-fix, FileHistory created the
  file with the user's umask (typically 0644), persisting any token
  typed at the prompt in plaintext readable by group/world.
- sec-05: kbagent lineage show --format html no longer emits XSS-
  vulnerable HTML. render_er_diagram() previously did
  name.replace('"', "'") which left <, >, and & untouched -- a
  Keboola entity named </div><script>...</script> would inject the
  script. Fix uses html.escape(s, quote=True) consistently for every
  API-derived string.
- sec-06: kbagent encrypt values --output-file now atomically creates
  the file with mode 0600 via os.open(..., 0o600). Replaces the
  previous Path.write_text() + chmod(0o600) which left a race window
  where the file was world-readable.

Medium
- sec-11: max_parallel_workers Pydantic field now requires ge=1 in
  addition to le=100. Pre-fix, max_parallel_workers: 0 in config.json
  passed validation and crashed every multi-project op with
  ValueError from ThreadPoolExecutor. _resolve_max_workers() also
  clamps defensively for legacy on-disk configs.

Low
- sec-19: kbagent permissions check OPERATION now reflects the
  EFFECTIVE policy for the invocation -- persisted policy MERGED
  with --deny-writes / --deny-destructive session flags -- matching
  permissions list semantics. Pre-fix, an AI agent inspecting its
  own self-imposed firewall got a misleading "allowed" answer.
- sec-20: _coerce_keboola_id() and load_branch_mapping() now raise
  descriptive errors for malformed branch IDs in branch-mapping.json.
  Pre-fix, "id": "not-a-number" produced raw "invalid literal for
  int()" from deep inside the parser.

Tests
- 33 new regression tests across 7 test files. Total suite: 2830
  passed (was 2797).

Out of scope (tracked as follow-up in #269)
- sec-03 token-in-argv deprecation
- sec-09 PTY-bypass design
- sec-10 silent name collisions
- sec-12 version_cache atomicity
- sec-13 @file path restriction
- sec-14 alias validation
- sec-16 SRI on Mermaid CDN
- sec-17 KBAGENT_AUTO_UPDATE toggle staleness
- sec-18 doctor --fix uv via PATH

Audit artifacts kept locally at /tmp/kbagent-audit/use-cases.md (28K)
+ state-machine.md (43K) + issues.md (15K) for the test design phase.
Copy link
Copy Markdown
Member Author

@padak padak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #272 — fix(0.30.5): close issue #269 -- 9 verified findings from security audit

Generated by kbagent-pr-reviewer subagent. Verdict and findings below
are advisory; the human author retains every veto. CI-coverable issues
(lint, format, tests) are confirmed via make check, not duplicated here.

Summary

PR #272 fixes nine verified security findings from a three-stage automated audit of kbagent.
The fixes are technically sound: path traversal in sync pull (sec-01/sec-07), token
inheritance by the MCP subprocess (sec-02/sec-08), REPL history world-readable (sec-04),
XSS in lineage ER output (sec-05), race on encrypted secrets file (sec-06),
max_parallel_workers=0 crash (sec-11), permissions check ignoring session flags (sec-19),
and descriptive errors for malformed branch IDs (sec-20). All 33 regression tests pass;
make check exits 0 (2830 passed).

One BLOCKING finding: gotchas.md is missing a (since v0.30.5) entry for the
permissions check behavior change (sec-19). This is mandatory per CONTRIBUTING.md for
any behavior change that would give AI agents different answers on old vs. new versions.
Two NON-BLOCKING findings are noted below.

Verdict: REQUEST CHANGES

Verdict

  • Verdict: REQUEST CHANGES
  • Blocking findings: 1
  • Non-blocking findings: 2
  • Nits: 1

Blocking findings

[B-1] plugins/kbagent/skills/kbagent/references/gotchas.mdpermissions check behavior change missing (since v0.30.5) entry

sec-19 changes kbagent permissions check OPERATION to reflect the EFFECTIVE policy
(persisted policy merged with --deny-writes / --deny-destructive session flags), matching
permissions list semantics. Pre-fix, running kbagent --deny-writes --json permissions check branch.create returned "allowed": true even though the operation was blocked.

keboola-expert.md §5 error recovery directs AI agents to use kbagent permissions show /
kbagent permissions check to audit their own session firewall. An agent running on v0.30.4
that calls permissions check with --deny-writes in effect will get a misleading "allowed"
answer and may proceed with an operation that will then exit 6. An agent reading the output
on v0.30.5 will correctly see "denied". The version boundary matters.

Per CONTRIBUTING.md §"Documentation changes (mandatory!)" and the Plugin synchronization map:

"gotchas.md -- New non-obvious behavior -- always tag with (since vX.Y.Z)"

The existing --deny-writes / --deny-destructive firewall (since 0.22.0) entry at
gotchas.md:501 should gain a bullet: **permissions check OPERATIONnow reflects the effective (merged) policy** (since 0.30.5): before 0.30.5,permissions checkconsulted only the persisted policy and returned "allowed" even when--deny-writesor--deny-destructivewas set for the current invocation. Upgrade to 0.30.5+ before relying onpermissions check to pre-flight session-flag-blocked operations.

Non-blocking findings

[NB-1] src/keboola_agent_cli/services/sync_service.py:351_ensure_within_branch() has no direct unit test

The new defense-in-depth guard _ensure_within_branch() raises ConfigError if a resolved
config path escapes the branch directory. It is exercised only when sanitize_path_segment()
fails to neutralize a traversal AND the combined template path still escapes the root.
The primary sanitizer layer (sanitize_path_segment) has comprehensive tests in
test_sync_naming.py, but the guard itself (_ensure_within_branch) has no test that
directly verifies it fires when given a path outside the branch dir.

Per CONTRIBUTING.md: service-layer tests should cover error paths. A test like
_ensure_within_branch(branch_dir=tmp_path/"branch", config_dir=tmp_path/"escape") -> raises ConfigError
would pin this regression guard against future refactoring.

[NB-2] src/keboola_agent_cli/services/deep_lineage_service.py:1161html.escape(quote=True) renders &quot; literally in the lineage server ER diagram

html.escape(entity_name, quote=True) replaces " with &quot;. When the lineage server
serves the ER Mermaid code via /api/mermaid as text/plain, the browser JS receives it
without HTML-decoding (.text() does not decode HTML entities). The raw &quot; string is
then passed to mermaid.render(id, code), which renders it as the literal seven characters
&quot; in the SVG text node — not as a double-quote.

The old name.replace('"', "'") was also lossy (turned " into ') but at least produced
readable output. The new behavior is both correct for XSS prevention and subtly wrong for
display when Keboola config or table names contain a double-quote character.

Real-world impact is low (Keboola Storage API validation discourages " in names), but the
tests at tests/test_deep_lineage_service.py:558 do not exercise a name containing a literal
" — adding one would document the expected rendering clearly for future contributors.

A simple fix: use html.escape(entity_name, quote=False) (does not escape ") and rely on
the existing "..." surrounding the entity name in the Mermaid syntax to delimit the string.
The double-quote escaping is only necessary in HTML attribute contexts, not in Mermaid code
served as text/plain and processed by the Mermaid JS library.

Nits

  • [NIT-1] src/keboola_agent_cli/changelog.py:50 — sec-05 description says "XSS in
    lineage show --format html" but the vulnerable code path is --format er and the
    lineage server ER view (/api/mermaid?...&view=er). The --format html path uses
    render_mermaid() (flowchart), which was already protected via _escape_mermaid_label().
    The description is correct about the affected file and fix; only the command flag name in
    the prose is slightly misleading.

Verification log

  • gh auth status → authenticated as padak
  • git rev-parse --abbrev-ref HEADfix/269-security-audit-9-findings ✓ (matches PR branch)
  • gh pr view 272 --json title,body,files,additions,deletions,baseRefName,headRefName,labels,state
    → state=OPEN, 22 files, +539/-25, fix(0.30.5): conventional prefix ✓
  • gh pr diff 272 → 920 lines, read in full ✓
  • Layer violation checks (typer in services, httpx in commands, formatter in clients) → empty ✓
  • Magic numbers / bare except / print() in production code → none found ✓
  • Token in new logged output without mask_token → none found ✓
  • make check → 2830 passed, 7 skipped, 15 warnings, exit 0 ✓
  • plugins/kbagent/agents/keboola-expert.md VERSION GATE §1 Rule 6 → no new commands needing
    min-version gate; all changes are fixes to existing commands ✓
  • plugins/kbagent/agents/keboola-expert.md §2 Tool Selection Matrix → no new write/destructive
    commands; no new rows needed ✓
  • plugins/kbagent/skills/kbagent/references/commands-reference.md → no new commands; no
    updates needed ✓
  • CLAUDE.md ## All CLI Commands → no signature changes; no updates needed ✓
  • src/keboola_agent_cli/commands/context.py AGENT_CONTEXT → no new commands or flag changes;
    permissions check behavior change is a bug fix, not a flag addition ✓
  • src/keboola_agent_cli/permissions.py OPERATION_REGISTRY → no new commands; not applicable ✓
  • plugins/kbagent/skills/kbagent/references/gotchas.md → NO entry for sec-19 behavior change
    with (since v0.30.5) tag → BLOCKING [B-1]
  • _ensure_within_branch() unit test coverage → no direct test found → NON-BLOCKING [NB-1]
  • html.escape(quote=True) on ER diagram names → &quot; display artifact in lineage server
    ER mode → NON-BLOCKING [NB-2]
  • Behavior verification (sec-06 atomic file create): verified
    os.open(str(output_file), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) mode is
    correctly passed to os.open as the third positional argument (file creation mode,
    applied before umask) ✓
  • Behavior verification (sec-19 permissions check): diff at
    commands/permissions.py:358-385 confirmed — lazy import of apply_firewall_flags from
    cli.py inside the function body (same pattern already used by permissions_list at
    line 109, which has been in main since before this PR — no new circular-import risk) ✓
  • Behavior verification (sec-02/sec-08 env scrubbing): _build_minimal_env() is an allowlist
    of 18 keys; any KBC_* variable is absent from the allowlist → correctly excluded ✓
  • E2E tests: not run (no E2E_API_TOKEN/E2E_URL). All fixes are at unit boundaries;
    the PR description correctly notes E2E is not strictly needed for this change set.

Open questions for the author

  • [NB-2] is worth a quick author comment: was the html.escape(quote=False) alternative
    considered? The XSS fix does not need quote=True in the Mermaid code context (names
    are served as text/plain, not embedded in HTML attributes). Using quote=False would
    close the XSS vector for < > & while avoiding the &quot; display artifact for
    double-quote names.

B-1 (blocking): gotchas.md missing (since v0.30.5) entry for the sec-19
behavior change. permissions check OPERATION now reflects session
firewall flags as of v0.30.5; pre-fix it consulted only the persisted
policy. Added the version-tagged bullet to the firewall section so an
AI agent on pre-0.30.5 install knows the dry probe lies under
--deny-writes.

NB-1: added 4 direct unit tests for _ensure_within_branch
defense-in-depth guard in tests/test_sync_service.py
(TestEnsureWithinBranch). Covers the pass-through case, dotdot escape,
error message naming the offending component / config, and resolved
absolute paths.

NIT-1: fixed changelog wording for sec-05. The vulnerable code path
was render_er_diagram() (used by --format er and the lineage server
ER view), not the Mermaid flowchart at --format html. Clarified in
the entry; also noted the flowchart route was already safe.

NB-2 (not addressed): see PR comment. html.escape(quote=True) was
deliberate -- " is a Mermaid ER syntax delimiter, so leaving it
literal would break parsing on entity names containing it. The
&quot; display artifact in rare cases is acceptable; XSS closure is
the priority.
@padak
Copy link
Copy Markdown
Member Author

padak commented May 7, 2026

Thanks for the review. Pushed 4119793 addressing B-1, NB-1, NIT-1.

B-1 — fixed (4119793)

Added (since v0.30.5) bullet to the existing --deny-writes / --deny-destructive firewall (since 0.22.0) section in gotchas.md:

permissions check OPERATION reflects the EFFECTIVE policy (persisted policy MERGED with session flags) (since v0.30.5). Pre-0.30.5 it consulted only the persisted policy, so an agent doing self-introspection ... got allowed: true despite the session flag denying that op at execution time. If your agent uses permissions check to gate destructive actions and may run against pre-0.30.5 installs, also re-check at execution-time exit codes (6 = denied) rather than trusting the dry probe alone.

NB-1 — fixed (4119793)

Added TestEnsureWithinBranch to tests/test_sync_service.py with 4 cases:

  • Pass-through for normal in-branch path
  • ConfigError "escapes sync workspace" for dotdot escape
  • Error message names the offending component_id and config_id
  • Pass-through for explicitly resolved absolute path inside branch

NIT-1 — fixed (4119793)

Changelog entry for sec-05 corrected: --format er (and lineage server ER view) was the vulnerable path, not --format html. Also noted --format html (Mermaid flowchart via render_mermaid()) was already safe.

NB-2 — won't fix; rationale below

You suggested html.escape(quote=False) to avoid &quot; display artifacts in entity names containing literal double quotes.

The trade-off goes the other way:

  • In Mermaid ER syntax, "name" is the delimiter for entity names: "prod:in.c-bucket.users" { ... }. If we leave a literal " inside entity_name, Mermaid's parser truncates the name at that quote and the rest becomes a syntax error. Production today uses replace('"', "'") precisely because of this.
  • Switching to quote=False while preserving safety would require manually replacing "&quot; (or ') after html.escape() — re-introducing the same code path complexity, just split between two lines.
  • The &quot; display artifact only shows up when an API-supplied entity name actually contains a literal ". That is rare in practice (Keboola UI doesn't encourage it) and harmless when it does happen — the entity is still visible and identifiable.
  • The XSS vector closure is the primary goal and quote=True covers it without escape-sequence gymnastics.

I'd rather keep the simpler quote=True and accept the rare display artifact than carry the dual-pass logic. If real-user reports show entity names with literal " causing confusion, that's a small follow-up to add a custom _html_escape_for_mermaid() helper that does both jobs explicitly. Filing as a low-priority follow-up under #271 for now.


Other than NB-2, all reviewer findings are addressed. Test suite at 2834 passed, ruff lint+format clean. Ready for re-review when convenient.

Copy link
Copy Markdown
Member Author

@padak padak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of #272 — fix(0.30.5): close issue #269 -- 9 verified findings from security audit

Generated by kbagent-pr-reviewer subagent (re-review pass, focused on fix-up commit 4119793).
Verdict and findings below are advisory; the human author retains every veto.
CI-coverable issues (lint, format, tests) confirmed via make check.

Summary

This PR fixes nine security findings from a three-stage automated audit of the kbagent codebase: a critical path-traversal in sync pull, token inheritance in the MCP HTTP subprocess, REPL history file permissions, XSS in the lineage ER diagram renderer, an atomic-write race on encrypt --output-file, a max_parallel_workers: 0 crash, and three lower-severity hardening fixes. The fix-up commit 4119793 addresses all four issues from the previous review cycle: B-1 (gotchas.md missing (since v0.30.5) entry for the permissions check behavior change), NB-1 (4 direct unit tests for _ensure_within_branch), NIT-1 (changelog wording corrected to --format er). NB-2 (html.escape quote=True display artifact) is explicitly accepted as won't-fix with a sound technical rationale. make check passes with 2834 tests (net +37 over main). Verdict: APPROVE.

Verdict

  • Verdict: APPROVE
  • Blocking findings: 0
  • Non-blocking findings: 1
  • Nits: 1

Blocking findings

(none)

Non-blocking findings

[NB-1] src/keboola_agent_cli/commands/context.py:810permissions check description does not mention effective-policy semantics

The AGENT_CONTEXT string at line 810 says "Check if operation is allowed. Exit 0=allowed, 6=denied." It does not note that as of 0.30.5 the command reflects the effective (session-merged) policy rather than only the persisted policy. The session-start context is what AI agents read to understand available commands; an agent that skims the context but not gotchas.md will not know about the sec-19 behavior change. The gotchas.md entry at line 507 covers this fully; the context.py description could add "Reflects effective policy including --deny-writes / --deny-destructive session flags (since 0.30.5)." in parentheses. Non-blocking because gotchas.md is the authoritative source per CONTRIBUTING.md and agents that follow the documented workflow will see it.

Nits

  • [NIT-1] 4119793 commit message — review: is not an allowed conventional-commit prefix per CONTRIBUTING.md:282 (allowed: feat:, fix:, chore:, docs:, test:, refactor:). Since the commit squashes review feedback the closest match is chore:. No functional impact; only relevant if the project parses commit history programmatically.

Verification log

  • gh auth status → authenticated as padak on github.com ✓
  • git rev-parse --abbrev-ref HEADfix/269-security-audit-9-findings (matches <branch>) ✓
  • gh pr view 272 --json state"OPEN"
  • Read CONTRIBUTING.md → loaded Plugin synchronization map, per-command checklist, release checklist ✓
  • Read CLAUDE.md → loaded convention #17 (silent-drift surfaces), All CLI Commands block ✓
  • Read plugins/kbagent/agents/keboola-expert.md → loaded §1 rules, §2 Tool Selection Matrix, §3 Inline Gotchas ✓
  • gh pr diff 272 → 1005 diff lines, 24 changed files ✓
  • git show 4119793 --stat → 3 files changed (+67/-1): gotchas.md, changelog.py, tests/test_sync_service.py
  • B-1 closure verified: grep "since v0.30.5" plugins/kbagent/skills/kbagent/references/gotchas.md:507 → entry present with full version-tagged bullet ✓
  • NB-1 closure verified: TestEnsureWithinBranch at tests/test_sync_service.py:2742 — 4 test methods covering pass-through, dotdot escape, error message naming offending component/config, and absolute path inside branch ✓
  • NIT-1 closure verified: changelog.py entry now reads --format er (and the lineage server's ER view) + notes --format html flowchart was already safe via render_mermaid()
  • NB-2 won't-fix rationale assessed: html.escape(s, quote=True) produces &quot; in entity names; Mermaid ER uses " as the entity name delimiter so a literal " inside would break parsing. The display artifact (&quot; showing in rare entity names with embedded double-quotes) is the correct trade-off for closing the XSS vector. Rationale accepted ✓
  • Layer violation check (typer/click in services, httpx in commands, formatter in clients) → grep on diff additions → empty ✓
  • Token discipline (new logged output without mask_token) → all token references in diff are in changelog strings, docstrings, test names, or variable names (_MCP_ENV_ALLOWLIST, sanitize_path_segment) → no exposure ✓
  • Convention checks (magic numbers, raw error_code strings, bare except, print() in production) → all empty ✓
  • Plugin synchronization map (no new commands added) → CLAUDE.md, context.py, commands-reference.md, OPERATION_REGISTRY, hints/definitions/ do not need updates for a security-only patch ✓
  • keboola-expert.md VERSION GATE → no new commands that need version-gating; permissions check effective-policy behavior is covered in gotchas.md at line 507; §1 Rule 6 does not require a new minimum-version entry for a behavior fix (as opposed to a new command) ✓
  • make check2834 passed, 7 skipped, 73 deselected, 14 warnings in 40.49s ✓
  • Commit prefix check: commit 63f8f11 uses fix(0.30.5): ✓; commit 4119793 uses review: which is not in the allowed prefix list (NIT-1) ✓

Open questions for the author

(none)

Mirror the gotchas.md (since v0.30.5) entry into AGENT_CONTEXT in
commands/context.py so AI agents that load context via 'kbagent
context' (and not via gotchas.md) also know that 'permissions
check' reflects session firewall flags as of 0.30.5.

Reviewer noted this in the second review pass; same silent-drift
class as the original B-1 fix in 4119793. Treating both
referential surfaces as one update.
@padak padak merged commit f70b633 into main May 7, 2026
1 check passed
@padak padak deleted the fix/269-security-audit-9-findings branch May 7, 2026 19:05
ottomansky added a commit to ottomansky/keboola-agent-cli that referenced this pull request May 8, 2026
Adds `--new-alias NEW` to `kbagent project edit` so users can rename a
project alias without going through `project remove` + `project add`
(which forces token re-entry). Adds `--dry-run` for read-only preview.
Mirrors the `kbagent config rename` precedent for the on-disk part of
the cascade.

Cascading scope:
- config.json `projects` dict key (`pop(old)` + insert under `new`)
- config.json `default_project` field (when it matched the old alias)
- nested-layout sync directory `<cwd>/<old-alias>/` -> `<cwd>/<new-alias>/`
  (with -2 collision suffix, git-mv-with-shutil-fallback)
- WARNS on `*.lineage.json` -- caches embed alias FQNs and are NOT
  auto-rewritten (partial rewrites are worse than no rewrite)

`--dry-run` (PR keboola#266 review NIT, addressed) previews collision detection,
planned disk-rename method (`git_mv` vs `shutil_move`), and the lineage-
cache warning without mutating any state. Validation errors raise the
same `ConfigError` exit-5 codes as the live path -- callers can rely on
`--dry-run` as a 1:1 pre-flight.

Combined with `--url` and/or `--token` in one call, those mutations
target the new alias post-rename: `kbagent project edit --project foo
--new-alias bar --token NEW` is one atomic operation.

Service / Command:
- `commands/project.py` -- new `--new-alias` and `--dry-run` Typer
  options; human formatter branches on `dry_run` / `old_alias`
- `services/project_service.py` -- `edit_project` accepts `new_alias`,
  `search_root`, `dry_run`; new helpers `_rename_project_alias`,
  `_validate_alias_format`, `_rename_nested_sync_dir`, `_move_directory`,
  `_detect_lineage_cache_warning` for the live path; `_plan_project_alias_rename`
  and `_plan_nested_sync_dir` for the read-only dry-run path

ConfigStore:
- `config_store.py` -- new `rename_project(old, new)` method (atomic
  dict-key swap + `default_project` cascade in one save() call)

Security hardening (from review iter 2):
- Validator regex `[A-Za-z0-9_][A-Za-z0-9_.-]*` plus explicit `..`
  rejection; rejects path traversal, NUL bytes, leading dot/dash,
  whitespace, and characters outside the slug alphabet. Stricter than
  `project add`'s no-op check; rationale is the rename's filesystem
  interaction (alias becomes a directory name).
- `search_root` resolved via `Path.resolve()` once before the disk
  rename to collapse symlinks; closes a malicious-cwd vector.
- Disk rename failures (`OSError`) trigger a config rollback so config
  and disk never end up out of sync; rollback's own failure is
  suppressed via `contextlib.suppress` so the original error wins.
- Lineage cache scan depth-capped at 2 levels (top + `*/` + `*/*/`)
  to bound cost when search_root is a deep tree.

Tests: 38 new (32 service + 6 CLI). Pin alias-key swap, collision
rejection, default_project cascade, sync-dir disk rename, no-sync-dir
no-op, sync-dir collision -2 suffix, combined edit-and-rename, no-op
on same-alias-only, parametrized 9-input path-traversal validator,
legal slug shapes accepted, OS failure rolls config back, rollback
failure surfaces original error, symlink target collision triggers
suffix bump, dry-run no-mutation happy path, dry-run collision still
raises, dry-run format validation still raises, dry-run predicts disk
method without touching disk, dry-run human DRY-RUN label, dry-run
JSON planned-block shape.

E2E: `tests/test_e2e.py::_test_project_edit_and_remove` extended with
a `--dry-run` preview (planned-block assertion) followed by a live
`--new-alias` round-trip (rename + reverse-rename to baseline) before
the existing `--url` step. Pinned by Padak's PR keboola#266 review BLOCKING --
every CLI command must have E2E coverage per CONTRIBUTING.md /
convention keboola#16.

Sync map updates:
- AGENT_CONTEXT (commands/context.py) -- new flags mentioned
- CLAUDE.md `## All CLI Commands` -- same wording
- commands-reference.md -- expanded with cascade scope + dry-run
- gotchas.md -- new `(since v0.30.7)` entry on lineage cache rebuild
- keboola-expert.md -- VERSION GATE clause + tool selection matrix row

Live-validated against project 1143 (`99_Playground_Max`,
europe-west3.gcp.keboola.com): rename to `playground` + reverse rename
to baseline; nested sync dir moved on disk; default_project cascaded;
all error paths produce correct ConfigError exit-5 messages.

Three review iterations on the original PR: self -> independent ->
convergence; zero material findings on the convergence pass. Padak's
post-merge review found 1 BLOCKING (E2E coverage) + 1 NIT (`--dry-run`);
both addressed in this iteration.

Rebased onto upstream/main after Padak shipped 0.30.4 (keboola#268),
0.30.5 (keboola#272), and 0.30.6 (keboola#273) since the original PR opened.
This PR ships as 0.30.7.
padak pushed a commit that referenced this pull request May 11, 2026
…un (#266)

Adds `--new-alias NEW` to `kbagent project edit` so users can rename a
project alias without going through `project remove` + `project add`
(which forces token re-entry). Adds `--dry-run` for read-only preview.
Mirrors the `kbagent config rename` precedent for the on-disk part of
the cascade.

Cascading scope:
- config.json `projects` dict key (`pop(old)` + insert under `new`)
- config.json `default_project` field (when it matched the old alias)
- nested-layout sync directory `<cwd>/<old-alias>/` -> `<cwd>/<new-alias>/`
  (with -2 collision suffix, git-mv-with-shutil-fallback)
- WARNS on `*.lineage.json` -- caches embed alias FQNs and are NOT
  auto-rewritten (partial rewrites are worse than no rewrite)

`--dry-run` (PR #266 review NIT, addressed) previews collision detection,
planned disk-rename method (`git_mv` vs `shutil_move`), and the lineage-
cache warning without mutating any state. Validation errors raise the
same `ConfigError` exit-5 codes as the live path -- callers can rely on
`--dry-run` as a 1:1 pre-flight.

Combined with `--url` and/or `--token` in one call, those mutations
target the new alias post-rename: `kbagent project edit --project foo
--new-alias bar --token NEW` is one atomic operation.

Service / Command:
- `commands/project.py` -- new `--new-alias` and `--dry-run` Typer
  options; human formatter branches on `dry_run` / `old_alias`
- `services/project_service.py` -- `edit_project` accepts `new_alias`,
  `search_root`, `dry_run`; new helpers `_rename_project_alias`,
  `_validate_alias_format`, `_rename_nested_sync_dir`, `_move_directory`,
  `_detect_lineage_cache_warning` for the live path; `_plan_project_alias_rename`
  and `_plan_nested_sync_dir` for the read-only dry-run path

ConfigStore:
- `config_store.py` -- new `rename_project(old, new)` method (atomic
  dict-key swap + `default_project` cascade in one save() call)

Security hardening (from review iter 2):
- Validator regex `[A-Za-z0-9_][A-Za-z0-9_.-]*` plus explicit `..`
  rejection; rejects path traversal, NUL bytes, leading dot/dash,
  whitespace, and characters outside the slug alphabet. Stricter than
  `project add`'s no-op check; rationale is the rename's filesystem
  interaction (alias becomes a directory name).
- `search_root` resolved via `Path.resolve()` once before the disk
  rename to collapse symlinks; closes a malicious-cwd vector.
- Disk rename failures (`OSError`) trigger a config rollback so config
  and disk never end up out of sync; rollback's own failure is
  suppressed via `contextlib.suppress` so the original error wins.
- Lineage cache scan depth-capped at 2 levels (top + `*/` + `*/*/`)
  to bound cost when search_root is a deep tree.

Tests: 38 new (32 service + 6 CLI). Pin alias-key swap, collision
rejection, default_project cascade, sync-dir disk rename, no-sync-dir
no-op, sync-dir collision -2 suffix, combined edit-and-rename, no-op
on same-alias-only, parametrized 9-input path-traversal validator,
legal slug shapes accepted, OS failure rolls config back, rollback
failure surfaces original error, symlink target collision triggers
suffix bump, dry-run no-mutation happy path, dry-run collision still
raises, dry-run format validation still raises, dry-run predicts disk
method without touching disk, dry-run human DRY-RUN label, dry-run
JSON planned-block shape.

E2E: `tests/test_e2e.py::_test_project_edit_and_remove` extended with
a `--dry-run` preview (planned-block assertion) followed by a live
`--new-alias` round-trip (rename + reverse-rename to baseline) before
the existing `--url` step. Pinned by Padak's PR #266 review BLOCKING --
every CLI command must have E2E coverage per CONTRIBUTING.md /
convention #16.

Sync map updates:
- AGENT_CONTEXT (commands/context.py) -- new flags mentioned
- CLAUDE.md `## All CLI Commands` -- same wording
- commands-reference.md -- expanded with cascade scope + dry-run
- gotchas.md -- new `(since v0.30.7)` entry on lineage cache rebuild
- keboola-expert.md -- VERSION GATE clause + tool selection matrix row

Live-validated against project 1143 (`99_Playground_Max`,
europe-west3.gcp.keboola.com): rename to `playground` + reverse rename
to baseline; nested sync dir moved on disk; default_project cascaded;
all error paths produce correct ConfigError exit-5 messages.

Three review iterations on the original PR: self -> independent ->
convergence; zero material findings on the convergence pass. Padak's
post-merge review found 1 BLOCKING (E2E coverage) + 1 NIT (`--dry-run`);
both addressed in this iteration.

Rebased onto upstream/main after Padak shipped 0.30.4 (#268),
0.30.5 (#272), and 0.30.6 (#273) since the original PR opened.
This PR ships as 0.30.7.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Security audit: 9 verified findings (path traversal, env leak, file perms, XSS, etc.)

1 participant