feat(headless): token-only invocation via __env__ project (#359) #363

Merged: padak merged 5 commits into main from claude/silly-banzai-796844, May 29, 2026

@padak padak commented May 29, 2026

Summary

Closes #359. Lets a daemon / container / CI (e.g. the jasnost bridge) run kbagent with only a token in the environment — no kbagent project add, no config.json on disk.

Set three env vars:

export KBAGENT_PROJECT_FROM_ENV=1
export KBC_TOKEN=<storage-api-token>
export KBC_STORAGE_API_URL=https://connection.<region>.keboola.com

…and ConfigStore synthesizes an in-memory project under the reserved alias __env__:

kbagent --json storage file-upload --project __env__ --file screenshot.png   # CLI
kbagent serve                                                                 # serve: POST project=__env__

Design: one chokepoint, both consumption paths

Both the CLI subprocess and kbagent serve resolve a project through the same ConfigStore.load(). Injecting the ephemeral project there means a single ~30-line change covers both styles with zero edits to the 50+ commands and routers. Verified live: CLI project list and serve GET /projects both return __env__ (token masked).
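
The chokepoint idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `Project`, `ConfigError`, and the function `inject_env_project` are hypothetical stand-ins (the real change lives in `ConfigStore._inject_env_project` and uses Pydantic models), and the order of the checks is a guess. Only the env-var names, the `__env__` alias, and the truthy values come from the description.

```python
import os
from dataclasses import dataclass

ENV_ALIAS = "__env__"                  # reserved alias named in the PR
TRUTHY = {"1", "true", "yes", "on"}    # truthy flag values named in the PR


class ConfigError(Exception):
    """Hypothetical stand-in; the CLI reports config errors as exit 5."""


@dataclass
class Project:
    """Hypothetical stand-in for the real Pydantic project model."""
    token: str
    stack_url: str
    ephemeral: bool = False


def inject_env_project(projects, env=None):
    """Return projects plus an in-memory __env__ entry when opted in."""
    env = os.environ if env is None else env
    flag = env.get("KBAGENT_PROJECT_FROM_ENV", "").strip().lower()
    if flag not in TRUTHY:
        return projects                # KBC_TOKEN alone never triggers injection
    if ENV_ALIAS in projects:
        return projects                # a real project under that alias wins
    missing = [k for k in ("KBC_TOKEN", "KBC_STORAGE_API_URL") if not env.get(k)]
    if missing:
        raise ConfigError(
            f"KBAGENT_PROJECT_FROM_ENV is set but missing: {', '.join(missing)}"
        )
    merged = dict(projects)
    merged[ENV_ALIAS] = Project(
        env["KBC_TOKEN"], env["KBC_STORAGE_API_URL"], ephemeral=True
    )
    return merged
```

Because every caller would go through the one load path, neither the commands nor the routers need to know the project came from the environment.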

Security

  • Token never persisted. __env__ is marked ephemeral=True (Pydantic exclude=True) and stripped by ConfigStore.save(). Even a write op that triggers a config.json write cannot leak the env token to disk. Covered by test_ephemeral_never_persisted.
  • Opt-in is the flag, not the token. Only KBAGENT_PROJECT_FROM_ENV (truthy: 1/true/yes/on) triggers injection — KBC_TOKEN alone stays a project add fallback. Avoids a phantom project surprising a dev who exported KBC_TOKEN for an unrelated project add.
  • Fail fast. Flag set but KBC_TOKEN / KBC_STORAGE_API_URL missing → exit 5 (config error), not a silent skip.
  • No collision. The alias is literally __env__ (double underscore); a real project already registered under that alias wins, no injection.
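
The "never persisted" guarantee could look something like the sketch below. It uses plain dicts and hypothetical helpers (`strip_ephemeral`, `save`) in place of the real `ConfigStore.save()` / `_strip_ephemeral_projects()` pair, and the `config.json` layout shown is an assumption made only for illustration.

```python
import json
import tempfile
from pathlib import Path


def strip_ephemeral(projects):
    """Drop every project flagged ephemeral before it can reach disk."""
    return {alias: p for alias, p in projects.items() if not p.get("ephemeral")}


def save(config_dir, projects):
    """Write config.json containing only persistent projects."""
    persistent = {
        # The ephemeral flag itself is also left out of the serialized form.
        alias: {k: v for k, v in p.items() if k != "ephemeral"}
        for alias, p in strip_ephemeral(projects).items()
    }
    path = Path(config_dir) / "config.json"
    path.write_text(json.dumps({"projects": persistent}, indent=2))
    return path


projects = {
    "prod": {"token": "persisted-token", "stack_url": "https://connection.keboola.com"},
    "__env__": {
        "token": "env-only-token",
        "stack_url": "https://connection.keboola.com",
        "ephemeral": True,
    },
}
with tempfile.TemporaryDirectory() as tmp:
    written = save(tmp, projects).read_text()
```

Filtering inside `save()` rather than at each call site means a write triggered by any command takes the same path, so the env token has no route to disk.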

Changes

Core: constants.py (2 constants), models.py (ephemeral field), config_store.py (_inject_env_project + _strip_ephemeral_projects).

Tests: 7 unit (TestEnvProjectInjection in test_config_store.py) + 3 E2E (TestHeadlessEnvProject in test_e2e.py). Full non-e2e suite: 3764 passed; lint / format / ty clean.

Docs / agent-sync (convention #17): version 0.49.0 → 0.50.0 + changelog, keboola-expert.md (tool matrix row), gotchas.md (new (since v0.50.0) entry), commands-reference.md (env-var table), context.py (AGENT_CONTEXT), CLAUDE.md (global-options note).

jasnost integration (separate repo, not in this PR)

The bridge shells out to kbagent ... --project <alias>. To go headless: set the three env vars in the daemon environment and KBAGENT_PROJECT_ALIAS=__env__. No bridge code change.
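
For a daemon written in Python, the headless call might be assembled as in this sketch. The helper `headless_invocation` is hypothetical and is not part of the bridge or of kbagent; the argument order mirrors the CLI example in the summary, and the alias is hard-coded here where the bridge reads it from `KBAGENT_PROJECT_ALIAS`.

```python
import os


def headless_invocation(token, stack_url, command, options=(), base_env=None):
    """Build (argv, env) for a kbagent subprocess that needs no config.json."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(
        {
            "KBAGENT_PROJECT_FROM_ENV": "1",   # explicit opt-in flag
            "KBC_TOKEN": token,
            "KBC_STORAGE_API_URL": stack_url,
        }
    )
    argv = ["kbagent", "--json", *command, "--project", "__env__", *options]
    return argv, env


argv, env = headless_invocation(
    "<storage-api-token>",
    "https://connection.keboola.com",
    ["storage", "file-upload"],
    ["--file", "screenshot.png"],
)
# A daemon would then run something like:
#   subprocess.run(argv, env=env, capture_output=True, text=True, check=False)
```

Passing the variables through `env=` keeps the token scoped to the child process instead of exporting it for the whole daemon.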

Manual testing

TMPD=$(mktemp -d)
# without opt-in -> empty (no phantom project)
KBC_TOKEN=<tok> KBC_STORAGE_API_URL=https://connection.keboola.com \
  kbagent --config-dir "$TMPD" --json project list
# with opt-in -> __env__ listed, nothing written to disk
KBAGENT_PROJECT_FROM_ENV=1 KBC_TOKEN=<tok> KBC_STORAGE_API_URL=https://connection.keboola.com \
  kbagent --config-dir "$TMPD" --json project list
ls "$TMPD"   # no config.json

Let a daemon / container / CI run kbagent with only a token in the
environment -- no `kbagent project add`, no config.json on disk.

Setting KBAGENT_PROJECT_FROM_ENV=1 together with KBC_TOKEN +
KBC_STORAGE_API_URL makes ConfigStore synthesize an in-memory project
under the reserved alias `__env__`. Because both the CLI and `kbagent
serve` resolve projects through the same ConfigStore.load() chokepoint,
a single env-injection covers both consumption styles:

    kbagent --json storage file-upload --project __env__ --file X
    kbagent serve   # POST endpoints take project=__env__

Security:
- The `__env__` project is marked `ephemeral` and stripped by
  ConfigStore.save(), so the env token is never persisted, even when a
  write op triggers a config.json write.
- Opt-in is the explicit flag, not the mere presence of KBC_TOKEN, to
  avoid a phantom project on a dev machine that exported KBC_TOKEN only
  for `project add`.
- Flag set but credentials missing -> fail fast (exit 5), not a silent
  skip.

Tests: 7 unit (test_config_store.py) + 3 E2E (test_e2e.py).
Docs: changelog, keboola-expert.md, gotchas.md, commands-reference.md,
context.py AGENT_CONTEXT, CLAUDE.md. Version 0.49.0 -> 0.50.0.

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.

Comment thread plugins/kbagent/agents/keboola-expert.md
padak added 2 commits May 29, 2026 17:37
…lash)

UX follow-up on the headless mode. `KBC_STORAGE_API_URL` (and
`project add --url` / `project edit --url`) previously rejected anything
that was not already a clean `https://<host>` base -- a bare host like
`connection.keboola.com` raised a pydantic ValidationError traceback.

Add `normalize_stack_url()` as the single source of truth, used by the
ProjectConfig field validator (safety net + clean stored value) and by
ProjectService.add_project / edit_project (so token verification hits
the right host). It accepts:

  - bare host                connection.keboola.com
  - trailing slash           https://connection.keboola.com/
  - surrounding whitespace   (paste artifact)
  - full project deep-link   https://connection.keboola.com/admin/projects/10105/dashboard

and reduces every form to https://<host>. Explicit non-https schemes
(http://, file://, ftp://) are still rejected (SSRF / protocol-abuse
guard). An unusable URL in the headless `__env__` injection now raises a
clean ConfigError (exit 5) instead of a raw ValidationError traceback.
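
A rough sketch of the normalization described above, built on the standard library's `urllib.parse.urlsplit`. It shares the name `normalize_stack_url` with the PR's helper but is an independent illustration: the error type, the messages, and the choice to drop any port are assumptions, not the shipped behavior.

```python
from urllib.parse import urlsplit


def normalize_stack_url(raw: str) -> str:
    """Reduce a pasted stack URL to https://<host>; reject non-https schemes."""
    value = raw.strip()                    # paste artifacts: surrounding whitespace
    if not value:
        raise ValueError("stack URL is empty")
    if "://" not in value:
        value = "https://" + value         # bare host: assume https
    parts = urlsplit(value)
    if parts.scheme != "https":            # urlsplit lower-cases the scheme
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"no host in {raw!r}")
    # Path, query, fragment and (in this sketch) any port are discarded.
    return f"https://{parts.hostname}"
```

Only an input with no scheme at all is upgraded to https; an explicit `http://`, `file://`, or `ftp://` is refused rather than silently rewritten, which is what keeps the guard meaningful.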

Tests: 6 new model tests + 2 new env-injection tests; updated the old
"reject no scheme" test to assert normalization. Full non-e2e suite:
3771 passed.

`project list` showed `project_name="env (headless)"` and a null
Project ID for the env-injected project -- the fake name was misleading
and the ID was simply missing.

ConfigStore.load() must stay offline (it runs many times per command and
per serve request), so it cannot call verify_token to fetch the real
project name. But Keboola Storage tokens are `{projectId}-{tokenId}-
{secret}`, so the project_id is recovered offline from the token prefix.
The project_name is left blank (honest) instead of a fake placeholder;
`project status` / `project info` verify against the API and show the
real name when a command actually needs it.

Tests: assert project_id is parsed (901-...) and name is blank; a
non-numeric token prefix leaves project_id unset without crashing.
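
The offline recovery of the project ID can be illustrated with a few lines. The function name `project_id_from_token` and the `int` return type are assumptions made for this sketch; only the `{projectId}-{tokenId}-{secret}` token shape comes from the commit message.

```python
from typing import Optional


def project_id_from_token(token: str) -> Optional[int]:
    """Return the numeric project-ID prefix of a Storage token, or None."""
    prefix, sep, _rest = token.partition("-")
    # Require a separator and an ASCII-digit prefix; anything else is
    # treated as "unknown" instead of raising.
    if sep and prefix.isascii() and prefix.isdigit():
        return int(prefix)
    return None
```

Returning `None` for an unparseable prefix matches the behavior the tests describe: the project ID stays unset and nothing crashes.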

@padak padak left a comment

Review of #363 — feat(headless): token-only invocation via __env__ project

Generated by kbagent-pr-reviewer subagent. Verdict and findings below
are advisory; the human author retains every veto. CI-coverable issues
(lint, format, tests) are confirmed via make check, not duplicated here.

Summary

This PR implements headless / token-only invocation for kbagent (issue #359): when
KBAGENT_PROJECT_FROM_ENV=1 is set alongside KBC_TOKEN and KBC_STORAGE_API_URL,
ConfigStore.load() synthesizes an in-memory project under the reserved alias __env__
with no config.json on disk and no project add required. The implementation is clean,
secure (the ephemeral field + _strip_ephemeral_projects() guarantee the env token is never
persisted), and covers both the CLI and kbagent serve through a single chokepoint.
The PR also ships a URL-normalization improvement that accepts bare hosts and project deep-links
everywhere a stack URL is accepted. Plugin sync surfaces are well-covered: context.py, CLAUDE.md,
keboola-expert.md §2 Tool Matrix, gotchas.md (with (since v0.50.0) tag), and
commands-reference.md are all updated. The one gap is that §1 Rule 6 VERSION GATE in
keboola-expert.md does not mention the new 0.50.0 minimum-version requirement for the
headless feature; and kbagent project remove --project __env__ returns a misleading success
message that is immediately reversed on the next load(). Both are NON-BLOCKING.
make check → 3772 passed, 8 skipped. Typecheck clean (pre-existing hatchling warning only).

Verdict: COMMENT (no blocking findings; two non-blocking items worth addressing).

Verdict

  • Verdict: COMMENT
  • Blocking findings: 0
  • Non-blocking findings: 2
  • Nits: 1

Blocking findings

(none)

Non-blocking findings

[NB-1] plugins/kbagent/agents/keboola-expert.md:115 — §1 Rule 6 VERSION GATE missing 0.50.0 entry for headless mode

The PR adds the __env__ env-var feature under §2 Tool Selection Matrix (with a (0.50.0+) tag), but does not add a corresponding entry to the §1 Rule 6 VERSION GATE list. An agent running on 0.49.x that receives instructions to use KBAGENT_PROJECT_FROM_ENV will silently get a raw ConfigError ('KBAGENT_PROJECT_FROM_ENV' is set but KBC_TOKEN...) rather than the "missing command" handoff the VERSION GATE is designed to produce. Per CONTRIBUTING.md, the VERSION GATE should be updated whenever a new minimum-version requirement is introduced.

Fix: after the dev-portal command group = 0.49.0+, line, add:

   headless token-only invocation (`KBAGENT_PROJECT_FROM_ENV` + `KBC_TOKEN` + `KBC_STORAGE_API_URL` synthesizing `--project __env__`) = 0.50.0+,

[NB-2] src/keboola_agent_cli/config_store.py:376 — project remove --project __env__ silently succeeds but is immediately reversed

Reproduced live: kbagent project remove --project __env__ returns {"status":"ok","data":{"alias":"__env__","message":"Project '__env__' removed."}} (exit 0), but because the env vars are still set, the very next load() call re-injects __env__. The command appears idempotent when it is actually a no-op — a user who runs project remove __env__ expecting the project to disappear will be confused when it reappears on the next invocation.

Fix (two complementary approaches): (a) Guard config_store.remove_project() or project_service.remove_project() against ephemeral aliases and raise a ConfigError("Cannot remove the synthetic '__env__' project. Unset KBAGENT_PROJECT_FROM_ENV to stop using it."), OR (b) document this behavior as a gotcha in gotchas.md under the headless entry (cost: a user who searches the docs will find the explanation, but the CLI gives no hint). Option (a) is significantly cleaner UX.
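
Option (a) might be sketched along these lines. The dict-based store, the `ConfigError` stand-in, and the "not found" message are hypothetical; the refusal message is the one proposed in the finding.

```python
class ConfigError(Exception):
    """Hypothetical stand-in for the CLI's config error (exit 5)."""


def remove_project(projects, alias):
    """Remove a persisted project; refuse to 'remove' an env-synthesized one."""
    project = projects.get(alias)
    if project is None:
        raise ConfigError(f"Project '{alias}' not found.")
    if project.get("ephemeral"):
        # Deleting it would be undone by the next load(), so fail loudly.
        raise ConfigError(
            f"Cannot remove the synthetic '{alias}' project. "
            "Unset KBAGENT_PROJECT_FROM_ENV to stop using it."
        )
    del projects[alias]
```

Keying the guard on the `ephemeral` flag rather than on the alias string leaves a real, persisted project that happens to be named `__env__` fully removable.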

Nits

  • [NIT-1] tests/test_services.py — service-layer tests for ProjectService.add_project() and edit_project() only use pre-normalized https://connection.keboola.com URLs. Adding one test for a bare-host input (e.g. stack_url="connection.keboola.com") would confirm that normalize_stack_url() is wired correctly at the service boundary (the model tests in test_models.py cover the validator, but they do not exercise the project_service.add_project call path that passes the URL through normalize_stack_url() before model_validate()).

Verification log

  • gh pr view 363 --json title,body,files,additions,deletions,state → 17 files, +499/-25, feat(headless): prefix, state OPEN. PR description accurately describes all changes. ✓
  • git rev-parse --abbrev-ref HEAD (worktree) → claude/silly-banzai-796844 matches PR branch ✓
  • grep '^\+' diff | grep -E 'from typer|import typer|formatter\.' → empty ✓ (no layer violations in services or clients)
  • grep '^\+' diff | grep -E 'from httpx|import httpx' | grep commands/ → empty ✓
  • grep '^\+.*error_code\s*=\s*"[A-Z_]+"' diff → empty ✓ (no raw error_code strings)
  • grep '^\+\s*except\s*:' diff → empty ✓ (no bare except)
  • grep '^\+\s*print\(' diff | grep src/ → empty ✓
  • Plugin sync map: context.py updated ✓, CLAUDE.md updated ✓, keboola-expert.md §2 Tool Matrix row added ✓, gotchas.md (since v0.50.0) entry added ✓, commands-reference.md env-var table extended ✓, permissions.py N/A (no new command surface) ✓, hints/definitions/ N/A (deprecated per CONTRIBUTING.md) ✓
  • keboola-expert.md §1 Rule 6 VERSION GATE: 0.50.0 headless feature absent → [NB-1]
  • No new @*_app.command() decorators in diff → OPERATION_REGISTRY and server/routers/ changes not required ✓
  • make check → ruff ✓, format ✓, skill-check ✓, version-sync ✓, changelog-check ✓, error-codes ✓, 3772 passed, 8 skipped
  • make typecheck-warn → 1 pre-existing unresolved-import for hatchling (not from this PR); exit 0 ✓
  • Behavior reproduction (headless project list, no config.json on disk):
    KBAGENT_PROJECT_FROM_ENV=1 KBC_TOKEN=not-a-real-token KBC_STORAGE_API_URL=https://connection.keboola.com kbagent --config-dir $TMPD --json project list → {"status":"ok","data":[{"alias":"__env__","project_id":null,...}]}, ls $TMPD → empty ✓
  • Fail-fast with flag set but missing creds:
    KBAGENT_PROJECT_FROM_ENV=1 kbagent --json project list (no KBC_TOKEN) → exit 5, clear message citing both missing vars ✓
  • Bare-host URL normalization:
    KBC_STORAGE_API_URL=connection.keboola.com → stack_url: "https://connection.keboola.com"
  • Silent-no-op reproduction (NB-2):
    kbagent project remove --project __env__ → exit 0 "removed"; next project list → __env__ reappears ✓ (confirmed as confusing UX)
  • E2E tests added in tests/test_e2e.py::TestHeadlessEnvProject (3 tests) ✓. Cannot run make test-e2e (no live credentials per reviewer policy).
  • Security: _strip_ephemeral_projects() called in save() before writing; exclude=True on ephemeral field prevents JSON serialization; no token in logs (only mask_token() in list_projects); no real-looking token found in diff ✓

Open questions for the author

(none)

Address the kbagent-pr-reviewer findings on #363:

- NB-1: add the 0.50.0 headless / URL-normalization entry to the Rule 6
  VERSION GATE in keboola-expert.md (highest silent-drift surface).
- NB-2: reject remove/edit/rename/set-branch on the env-synthesized
  __env__ project with a clear ConfigError instead of reporting a
  success that silently vanishes on the next load(). A real persisted
  project under the same alias (ephemeral=False) stays mutable.
- NIT-1: add a service-layer test asserting add_project() normalizes a
  bare-host / deep-link URL through normalize_stack_url() before the
  verification client and before persisting.

Tests: +5 (guard x2, service normalization x1, project_id parse x2 from
earlier). Full non-e2e suite: 3775 passed.

padak commented May 29, 2026

Addressed the review findings in 209d33b:

  • [NB-1] Added the 0.50.0 headless + URL-normalization entry to the Rule 6 VERSION GATE in keboola-expert.md.
  • [NB-2] remove / edit / rename / branch-switch on the env-synthesized __env__ project now raise a clear ConfigError (instead of reporting a success that vanishes on the next load()). A real persisted project under the same alias (ephemeral=False) stays mutable.
  • [NIT-1] Added a service-layer test asserting add_project() normalizes a bare-host / deep-link URL through normalize_stack_url() before the verification client and before persisting.

Full non-e2e suite: 3775 passed.

Devin review flagged that `project status` in headless mode could write
a config.json to disk via `_backfill_org_info`: the __env__ project
always has empty org_id/org_name, so the backfill kept trying to persist
it. `save()` strips the ephemeral entry (so no token leaked), but the
file was still created -- breaking the "no config.json on disk" promise
-- and the futile backfill re-ran on every `project status`.

Skip ephemeral projects when building the backfill update set. When
__env__ is the only candidate, the update set stays empty and no file is
written at all.

Test: get_status() under env-injection leaves the config dir file-free.
Full non-e2e suite: 3776 passed.
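
The backfill fix can be pictured with this sketch. `backfill_candidates`, `backfill_org_info`, and the injected `fetch_org` / `save` callables are hypothetical names used for illustration; the real logic lives in `_backfill_org_info` in project_service.py.

```python
def backfill_candidates(projects):
    """Aliases that lack org info and can actually be persisted."""
    return [
        alias
        for alias, p in projects.items()
        if not p.get("ephemeral") and not (p.get("org_id") and p.get("org_name"))
    ]


def backfill_org_info(projects, fetch_org, save):
    """Fill in missing org info; call save() only when something changed."""
    updates = {alias: fetch_org(alias) for alias in backfill_candidates(projects)}
    if not updates:
        return False       # nothing to persist, so no config.json is created
    for alias, org in updates.items():
        projects[alias].update(org)
    save(projects)
    return True
```

When `__env__` is the only project, the candidate list is empty and `save()` is never reached, which is what keeps the config directory file-free.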

padak commented May 29, 2026

Reviewed the remaining Devin dashboard findings (they're not posted as inline threads, only in the Devin review UI):

  • 🚩 _backfill_org_info may create config.json in headless mode (project_service.py) — valid, fixed in 67f2810. The __env__ project always has empty org_id/org_name, so the org-info backfill that project status runs kept trying to persist it. save() strips the ephemeral entry (no token leaked), but the file was still created — breaking the "no config.json on disk" guarantee — and the futile backfill re-ran on every project status. Now ephemeral projects are skipped when building the backfill set, so headless project status writes nothing to disk. Test added.
  • The other dashboard items (no permission-registry entry needed; mutation methods on __env__ are no-ops; double-defense token-persistence design; add_project('__env__') blocked under the flag) are informational and already reflect the intended design — the "no-op mutation" concern is the same as the reviewer's NB-2, now guarded with an explicit ConfigError (209d33b).
  • The "3 Potential bugs / 2 Flags" panel (handler.ts, store.ts, App.tsx, parser.ts) is Devin's locked on-demand-credits demo — those files don't exist in this repo, not real findings.

Full non-e2e suite: 3776 passed.

@padak padak merged commit cd3b19f into main May 29, 2026
2 checks passed
@padak padak deleted the claude/silly-banzai-796844 branch May 29, 2026 19:46

Development

Successfully merging this pull request may close these issues.

feat: headless token-only invocation (no configured --project alias)