
Add lazy mode to thv llm setup #5427

Merged
ChrisJBurns merged 4 commits into main from chris/llm-setup-lazy
Jun 3, 2026

Collaborator

@ChrisJBurns ChrisJBurns commented Jun 3, 2026

Summary

thv llm setup runs the interactive OIDC browser login inline, which makes it
unusable in unattended provisioning — e.g. an MDM profile running it at first
login to pre-configure Claude Code, Cursor, etc. across a fleet of laptops. The
browser flow cannot complete without a human present, so setup hangs or fails.
Everything else setup does (flag validation, tool detection, config patching,
persistence) is non-interactive and safe to run unattended.

This PR adds an opt-in --lazy flag that skips only the inline login and defers
it to first gateway access, where the same browser flow fires transparently.

Closes #5386

Medium level
  • pkg/llm.Setup gains a lazy parameter: when set it skips the login()
    call and prints a deferred-login message, while still detecting tools, patching
    config files, and persisting ConfiguredTools — the on-disk result is identical
    to a normal setup except no token is obtained yet.
  • thv llm setup --lazy flag wires that parameter through the CLI.
  • thv llm token is now interactive: on a genuine cache miss it launches the
    same OIDC browser flow setup would have run, so a prior --lazy setup signs the
    user in on first use with no extra command. A cached/refreshable token is still
    printed without prompting.
  • LLM proxy per-request token-fetch timeout widened from 10s to a 3-minute
    tokenFetchTimeout constant, so the first request after a lazy setup can drive
    an interactive browser login without being cut off mid-login.
  • E2E coverage for the lazy setup path and the deferred token-login path, plus
    regenerated CLI docs.
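The lazy behavior described above can be sketched roughly as follows. This is a simplified stand-in, not the actual pkg/llm.Setup: the `setup` function, the `Result` type, and the `LoginFunc` signature shown here are assumptions made for illustration, and real tool detection and config patching are reduced to copying a slice.

```go
package main

import "fmt"

// LoginFunc stands in for the interactive OIDC browser login (signature assumed).
type LoginFunc func() error

// Result is a hypothetical summary of what a setup run did.
type Result struct {
	ConfiguredTools []string
	LoggedIn        bool
}

// setup is an illustrative stand-in for pkg/llm.Setup. When lazy is true it
// skips only the login call; tool persistence happens in both modes.
func setup(detected []string, login LoginFunc, lazy bool) (Result, error) {
	var res Result
	if lazy {
		fmt.Println("Lazy mode: skipping OIDC login.")
	} else {
		if err := login(); err != nil {
			return Result{}, fmt.Errorf("login failed: %w", err)
		}
		res.LoggedIn = true
	}
	// Stand-in for patching config files and persisting ConfiguredTools.
	res.ConfiguredTools = append(res.ConfiguredTools, detected...)
	return res, nil
}

func main() {
	calls := 0
	login := func() error { calls++; return nil }

	res, err := setup([]string{"claude-code"}, login, true)
	fmt.Println(res.ConfiguredTools, res.LoggedIn, calls, err)
}
```

Injecting the login function is what lets a unit test assert that it was never called in lazy mode while the tools were still persisted.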
Low level
| File | Change |
| --- | --- |
| pkg/llm/setup.go | Add lazy bool param to Setup; skip login + print deferred-login message when set |
| cmd/thv/app/llm.go | Add --lazy flag; make runLLMToken build an interactive token source; update token help text |
| pkg/llm/proxy/proxy.go | Introduce tokenFetchTimeout = 3 * time.Minute, replacing the inline 10s per-request timeout |
| docs/cli/thv_llm_setup.md, docs/cli/thv_llm_token.md | Regenerated via task docs |
| pkg/llm/setup_test.go, cmd/thv/app/llm_test.go | Unit tests for lazy skip-login + tool persistence |
| test/e2e/cli_llm_setup_test.go | Generalize runSetupWithOIDCCompletion → runWithOIDCCompletion; add lazy setup + deferred token-login specs |
| test/e2e/cli_llm_all_clients_test.go | Drive OIDC completion in the expired-token test (token is now interactive) |

Type of change

  • New feature

Test plan

  • Unit tests (task test) — pkg/llm, pkg/llm/proxy, cmd/thv/app pass
  • E2E tests — lazy setup, deferred token-login, expired-token, and cached-token specs pass (green on CI across kind v1.33/1.34/1.35)
  • Linting (task lint-fix) — could not run locally (golangci-lint/go toolchain mismatch); confirmed passing on CI
  • Manual testing (see below)

E2E specifically exercises the real flow end to end: thv llm setup --lazy completes with no browser (guarded by a 30s timeout), patches tool config identically to a normal setup, and the subsequent thv llm token launches the deferred OIDC browser flow (fake browser + mock OIDC) and prints a fresh token.

Manual test results (built binary, isolated HOME/XDG_CONFIG_HOME sandbox)

Sandbox setup: a fake-browser stub for open/xdg-open that records any invocation to a log file (so an opened browser is detectable), a fake claude binary + ~/.claude dir for tool detection, and the environment secrets provider (no keychain).

1. thv llm setup --lazy — must configure tools without logging in

$ thv llm setup --lazy
Lazy mode: skipping OIDC login. You'll be signed in automatically
the first time a configured tool accesses the LLM gateway.
Configured claude-code (direct mode)  →  <sandbox>/.claude/settings.json
exit code: 0
  • ✅ No browser opened — the fake-browser log file never appeared.
  • ~/.claude/settings.json patched with apiKeyHelper set to "…/thv" llm token and with ANTHROPIC_BASE_URL, identical to a normal setup.
  • thv llm config show --format json lists the persisted claude-code entry in configured_tools.

2. thv llm token — must now engage the interactive OIDC flow on a cache miss

With no cached token and the issuer pointed at a refused port (so OIDC discovery fails instantly, with no hang and no browser):

$ thv llm token
Error: OIDC browser flow failed: failed to discover OIDC endpoints: unable to
discover OIDC endpoints at "http://127.0.0.1:9/.well-known/openid-configuration" …
connection refused
exit code: 1
  • ✅ The "OIDC browser flow failed: …" error originates from the interactive tier-4 path in pkg/auth/tokensource/tokensource.go. The previous non-interactive thv llm token never reached that path; it returned ErrTokenRequired (“no cached credentials found; run "thv llm setup" to log in”) immediately. This confirms that the deferred-login browser path is now engaged on a cache miss.

Coverage note: the full browser-completed deferred login (browser opens → callback → token printed) requires a live OIDC server and is covered by the green E2E suite; reproducing it by hand just rebuilds the mock-OIDC harness. The lazy setup path and the thv llm token interactive-engagement were both verified manually as above.

Does this introduce a user-facing change?

Yes. A new opt-in --lazy flag on thv llm setup skips the interactive OIDC
login and defers it to first gateway access. Default behavior for interactive
users is unchanged. Separately, thv llm token now launches the OIDC browser
flow on a cache miss instead of failing fast — a cached or refreshable token is
still printed without prompting.

Implementation plan

Approved implementation plan

Designed via /plan-design for #5386. Four stages, single PR:

  1. pkg/llm.Setup lazy support — add trailing lazy bool; when set, skip the
    login() block and print the deferred-login message; tool detection, patching,
    and persistence still run. Unit test asserts the injected LoginFunc is not
    called yet tools are persisted.
  2. CLI wiring: --lazy flag threaded through runLLMSetup → llm.Setup;
    runLLMToken switched to an interactive token source (browser only on a real
    cache miss); token/setup help text updated; CLI docs regenerated.
  3. Proxy timeout — replace the inline 10s per-request token-fetch timeout with
    a 3-minute tokenFetchTimeout constant so a human can complete the first
    interactive login through the proxy. (Decision: a single generous timeout for
    all requests, for simplicity.)
  4. E2E + docs — lazy setup spec (no browser) and deferred token-login spec
    (drives the browser flow to completion via the generalized
    runWithOIDCCompletion helper).
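As a rough illustration of stage 2, the sketch below parses an opt-in boolean --lazy flag and returns the value that would be threaded into the setup call. It uses the standard library flag package purely as a stand-in; the CLI framework that thv actually uses is not shown in this description, and `parseSetupFlags` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseSetupFlags is a hypothetical helper that parses the arguments of a
// "setup" subcommand and reports whether --lazy was given. It defaults to
// false, so the interactive behavior stays the default.
func parseSetupFlags(args []string) (bool, error) {
	fs := flag.NewFlagSet("setup", flag.ContinueOnError)
	lazy := fs.Bool("lazy", false,
		"skip the interactive OIDC login and defer it to first gateway access")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *lazy, nil
}

func main() {
	lazy, err := parseSetupFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	// In the real CLI this value would be passed on to the setup function.
	fmt.Println("lazy:", lazy)
}
```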

Key decision: thv llm token is always interactive rather than flag-gated — the
4-tier token source only opens a browser on a genuine cache miss, so this is
transparent for cached/refreshable tokens and needs no extra flag.
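The key decision can be illustrated with a minimal sketch of an ordered token source in which the interactive tier comes last. The four real tiers are not enumerated in this description, so the tier names, the `errMiss` sentinel, and the `resolve` function below are all assumptions; the only property carried over from the PR is that the browser tier runs only when every earlier tier misses.

```go
package main

import (
	"errors"
	"fmt"
)

// errMiss signals that a tier has no usable token (hypothetical sentinel).
var errMiss = errors.New("tier has no token")

// tier is one step of an ordered token lookup.
type tier struct {
	name  string
	fetch func() (string, error)
}

// resolve tries each tier in order and returns the first token found along
// with the tier that produced it. Only errMiss falls through to the next
// tier; any other error stops the lookup (a design choice of this sketch).
func resolve(tiers []tier) (string, string, error) {
	for _, t := range tiers {
		tok, err := t.fetch()
		if err == nil {
			return tok, t.name, nil
		}
		if !errors.Is(err, errMiss) {
			return "", t.name, fmt.Errorf("%s: %w", t.name, err)
		}
	}
	return "", "", errMiss
}

func main() {
	cached := "" // empty simulates the state right after a lazy setup
	browserOpened := false
	tiers := []tier{
		{"cache", func() (string, error) {
			if cached == "" {
				return "", errMiss
			}
			return cached, nil
		}},
		{"refresh", func() (string, error) { return "", errMiss }},
		{"browser", func() (string, error) {
			browserOpened = true
			cached = "fresh-token"
			return cached, nil
		}},
	}

	// First call misses the cache and reaches the browser tier; the second
	// call is served from the cache without opening a browser.
	for i := 0; i < 2; i++ {
		browserOpened = false
		tok, from, err := resolve(tiers)
		fmt.Println(tok, from, browserOpened, err)
	}
}
```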

Non-goals (per issue): headless/device-code flows; changing the default
interactive thv llm setup behavior (lazy is strictly opt-in).

Special notes for reviewers

  • Proxy timeout tradeoff: the 3-minute bound applies to every request, not
    just first-login. A genuinely hung upstream token endpoint now holds a request
    up to 3 min instead of 10s. This was the chosen simplicity tradeoff over
    first-login-only detection.
  • No proxy-through-browser E2E: the deferred-login proxy path is covered by
    the timeout change plus existing proxy tests; an E2E that drives a browser login
    through the proxy was deemed optional and not added. The token-helper deferred
    path is fully E2E-covered. Happy to add the proxy variant if reviewers want it.
  • thv llm token behavior change: it is now interactive. This is intended
    (it's what makes lazy transparent) and does not affect callers that already have
    a cached or refreshable token.

Generated with Claude Code

ChrisJBurns and others added 4 commits June 2, 2026 21:02
The inline OIDC browser login in "thv llm setup" cannot complete in
unattended provisioning contexts (e.g. an MDM profile configuring a
fleet of laptops at first login), where no human is present to finish
the callback.

Add a lazy parameter to llm.Setup that skips only the interactive login.
Tool detection, config-file patching, and config persistence still run,
so the on-disk result is identical to a normal setup except that no OIDC
token is obtained yet — login is deferred to first gateway access. The
sole call site passes false for now; the --lazy CLI flag is wired in a
later change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the lazy parameter added to llm.Setup to a new --lazy flag on
"thv llm setup", so unattended provisioning can configure tools without
the interactive OIDC browser login.

For deferred login to be transparent, "thv llm token" now builds an
interactive token source: on a genuine cache miss it launches the same
OIDC browser flow setup would have run, while a cached or refreshable
token is still printed without prompting. This is what signs the user in
on first use after a lazy setup, with no extra command to run.

Regenerate CLI docs for the new flag and updated token help.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy's per-request token fetch used a 10-second timeout. After a
lazy setup, the first request through the proxy drives the interactive
OIDC browser login via the token source, and 10 seconds is far too short
for a human to complete it — the request would be cut off mid-login.

Introduce a tokenFetchTimeout constant set to 3 minutes. Once a token is
cached, subsequent requests are served from cache and return well within
the bound, so only the initial login ever approaches it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cover the lazy provisioning flow end to end: "thv llm setup --lazy"
completes with no browser and patches tool config identically to a normal
setup, and the first "thv llm token" afterwards launches the deferred OIDC
browser flow and prints a fresh token.

Generalize runSetupWithOIDCCompletion into runWithOIDCCompletion so any
command that performs the browser flow can be driven to completion in
tests. Update the expired-cached-token test to drive that completion:
"thv llm token" is now interactive, so an expired token with no refresh
token falls through to the browser flow rather than failing fast.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the size/M Medium PR: 300-599 lines changed label Jun 3, 2026

codecov Bot commented Jun 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.85%. Comparing base (8fce30e) to head (140cf86).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5427   +/-   ##
=======================================
  Coverage   68.85%   68.85%           
=======================================
  Files         634      634           
  Lines       64422    64437   +15     
=======================================
+ Hits        44358    44369   +11     
- Misses      16783    16788    +5     
+ Partials     3281     3280    -1     

@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels Jun 3, 2026
Comment thread pkg/llm/setup.go
@ChrisJBurns ChrisJBurns merged commit 6ebadaf into main Jun 3, 2026
49 checks passed
@ChrisJBurns ChrisJBurns deleted the chris/llm-setup-lazy branch June 3, 2026 16:29
ChrisJBurns added a commit that referenced this pull request Jun 3, 2026
The lazy setup message said the user would be "signed in automatically",
which oversells it: the deferred login still opens a browser for the user
to complete on the first gateway request. Reword to "signed in on the
first request a configured tool makes to the LLM gateway" so the message
accurately describes when (and that) login happens.

Addresses review feedback on #5427.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Labels

size/M Medium PR: 300-599 lines changed

Development

Successfully merging this pull request may close these issues.

thv llm setup: add --lazy mode that defers OIDC login until first token use

2 participants