Add lazy mode to thv llm setup #5427
Merged
The inline OIDC browser login in "thv llm setup" cannot complete in unattended provisioning contexts (e.g. an MDM profile configuring a fleet of laptops at first login), where no human is present to finish the callback. Add a lazy parameter to llm.Setup that skips only the interactive login. Tool detection, config-file patching, and config persistence still run, so the on-disk result is identical to a normal setup except that no OIDC token is obtained yet — login is deferred to first gateway access. The sole call site passes false for now; the --lazy CLI flag is wired in a later change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire the lazy parameter added to llm.Setup to a new --lazy flag on "thv llm setup", so unattended provisioning can configure tools without the interactive OIDC browser login. For deferred login to be transparent, "thv llm token" now builds an interactive token source: on a genuine cache miss it launches the same OIDC browser flow setup would have run, while a cached or refreshable token is still printed without prompting. This is what signs the user in on first use after a lazy setup, with no extra command to run. Regenerate CLI docs for the new flag and updated token help. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy's per-request token fetch used a 10-second timeout. After a lazy setup, the first request through the proxy drives the interactive OIDC browser login via the token source, and 10 seconds is far too short for a human to complete it — the request would be cut off mid-login. Introduce a tokenFetchTimeout constant set to 3 minutes. Once a token is cached, subsequent requests are served from cache and return well within the bound, so only the initial login ever approaches it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cover the lazy provisioning flow end to end: "thv llm setup --lazy" completes with no browser and patches tool config identically to a normal setup, and the first "thv llm token" afterwards launches the deferred OIDC browser flow and prints a fresh token. Generalize runSetupWithOIDCCompletion into runWithOIDCCompletion so any command that performs the browser flow can be driven to completion in tests. Update the expired-cached-token test to drive that completion: "thv llm token" is now interactive, so an expired token with no refresh token falls through to the browser flow rather than failing fast. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #5427 +/- ##
=======================================
Coverage 68.85% 68.85%
=======================================
Files 634 634
Lines 64422 64437 +15
=======================================
+ Hits 44358 44369 +11
- Misses 16783 16788 +5
+ Partials 3281 3280 -1
jerm-dro approved these changes on Jun 3, 2026
ChrisJBurns added a commit that referenced this pull request on Jun 3, 2026
The lazy setup message said the user would be "signed in automatically", which oversells it: the deferred login still opens a browser for the user to complete on the first gateway request. Reword to "signed in on the first request a configured tool makes to the LLM gateway" so the message accurately describes when (and that) login happens. Addresses review feedback on #5427. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
`thv llm setup` runs the interactive OIDC browser login inline, which makes it
unusable in unattended provisioning, e.g. an MDM profile running it at first
login to pre-configure Claude Code, Cursor, etc. across a fleet of laptops. The
browser flow cannot complete without a human present, so setup hangs or fails.
Everything else setup does (flag validation, tool detection, config patching,
persistence) is non-interactive and safe to run unattended.
This PR adds an opt-in `--lazy` flag that skips only the inline login and defers
it to first gateway access, where the same browser flow fires transparently.
Closes #5386
Medium level
- `pkg/llm.Setup` gains a `lazy` parameter: when set, it skips the `login()` call and prints a deferred-login message, while still detecting tools, patching config files, and persisting `ConfiguredTools`. The on-disk result is identical to a normal setup except that no token is obtained yet.
- The `thv llm setup --lazy` flag wires that parameter through the CLI.
- `thv llm token` is now interactive: on a genuine cache miss it launches the same OIDC browser flow setup would have run, so a prior `--lazy` setup signs the user in on first use with no extra command. A cached/refreshable token is still printed without prompting.
- The proxy's per-request token-fetch timeout moves from an inline 10 seconds to a 3-minute `tokenFetchTimeout` constant, so the first request after a lazy setup can drive an interactive browser login without being cut off mid-login.
- Unit and E2E tests; regenerated CLI docs.
Low level
| File | Change |
|------|--------|
| `pkg/llm/setup.go` | Add `lazy bool` param to `Setup`; skip login + print deferred-login message when set |
| `cmd/thv/app/llm.go` | Add `--lazy` flag; make `runLLMToken` build an interactive token source; update token help text |
| `pkg/llm/proxy/proxy.go` | Add `tokenFetchTimeout = 3 * time.Minute`, replacing the inline 10s per-request timeout |
| `docs/cli/thv_llm_setup.md`, `docs/cli/thv_llm_token.md` | Regenerated (`task docs`) |
| `pkg/llm/setup_test.go`, `cmd/thv/app/llm_test.go` | Unit tests |
| `test/e2e/cli_llm_setup_test.go` | Generalize `runSetupWithOIDCCompletion` → `runWithOIDCCompletion`; add lazy setup + deferred token-login specs |
| `test/e2e/cli_llm_all_clients_test.go` | E2E tests |

Type of change
Test plan
- Unit tests (`task test`): `pkg/llm`, `pkg/llm/proxy`, `cmd/thv/app` pass
- Linting (`task lint-fix`): could not run locally (golangci-lint/Go toolchain mismatch); confirmed passing on CI

E2E specifically exercises the real flow end to end: `thv llm setup --lazy` completes with no browser (guarded by a 30s timeout), patches tool config identically to a normal setup, and the subsequent `thv llm token` launches the deferred OIDC browser flow (fake browser + mock OIDC) and prints a fresh token.

Manual test results (built binary, isolated `HOME`/`XDG_CONFIG_HOME` sandbox)

Sandbox setup: a fake-browser stub for `open`/`xdg-open` that records any invocation to a log file (so an opened browser is detectable), a fake `claude` binary + `~/.claude` dir for tool detection, and the environment secrets provider (no keychain).

1. `thv llm setup --lazy`: must configure tools without logging in

- `~/.claude/settings.json` is patched with `apiKeyHelper` → `"…/thv" llm token` and `ANTHROPIC_BASE_URL`, identical to a normal setup.
- `thv llm config show --format json` lists the persisted `claude-code` entry in `configured_tools`.

2. `thv llm token`: must now engage the interactive OIDC flow on a cache miss

With no cached token and the issuer pointed at a refused port (so OIDC discovery fails instantly, with no hang and no browser), the command fails with `OIDC browser flow failed: …`. That error originates from the interactive tier-4 path in `pkg/auth/tokensource/tokensource.go`. The previous non-interactive `thv llm token` never reached it: it returned `ErrTokenRequired` (“no cached credentials found; run "thv llm setup" to log in”) immediately. This confirms the deferred-login browser path is now engaged on a cache miss.

Coverage note: the full browser-completed deferred login (browser opens → callback → token printed) requires a live OIDC server and is covered by the green E2E suite; reproducing it by hand just rebuilds the mock-OIDC harness. The lazy setup path and the `thv llm token` interactive engagement were both verified manually as above.

Does this introduce a user-facing change?
Yes. A new opt-in `--lazy` flag on `thv llm setup` skips the interactive OIDC
login and defers it to first gateway access. Default behavior for interactive
users is unchanged. Separately, `thv llm token` now launches the OIDC browser
flow on a cache miss instead of failing fast; a cached or refreshable token is
still printed without prompting.
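The opt-in nature of the flag can be illustrated with a small sketch. It uses Go's standard `flag` package purely to stay self-contained; the actual command wiring in `cmd/thv/app/llm.go` is not shown in this description and may use a different CLI library, and `parseSetupArgs` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
)

// parseSetupArgs parses the arguments of a hypothetical "setup" subcommand.
// The flag defaults to false, so existing invocations keep their behavior.
func parseSetupArgs(args []string) (bool, error) {
	fs := flag.NewFlagSet("setup", flag.ContinueOnError)
	lazy := fs.Bool("lazy", false, "skip the interactive login and defer it to first gateway access")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *lazy, nil
}

func main() {
	withFlag, _ := parseSetupArgs([]string{"--lazy"})
	withoutFlag, _ := parseSetupArgs(nil)
	fmt.Println(withFlag, withoutFlag)
}
```

The parsed boolean would then be passed through as the `lazy` argument of the setup call, which is the only place it changes behavior.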
Implementation plan
Approved implementation plan
Designed via `/plan-design` for #5386. Four stages, single PR:

1. `pkg/llm.Setup` lazy support: add trailing `lazy bool`; when set, skip the `login()` block and print the deferred-login message; tool detection, patching, and persistence still run. Unit test asserts the injected `LoginFunc` is not called yet tools are persisted.
2. CLI: `--lazy` flag threaded through `runLLMSetup` → `llm.Setup`; `runLLMToken` switched to an interactive token source (browser only on a real cache miss); token/setup help text updated; CLI docs regenerated.
3. Proxy: replace the inline 10s per-request token-fetch timeout with a 3-minute `tokenFetchTimeout` constant so a human can complete the first interactive login through the proxy. (Decision: a single generous timeout for all requests, for simplicity.)
4. E2E: lazy setup and deferred token-login specs (drives the browser flow to completion via the generalized `runWithOIDCCompletion` helper).

Key decision: `thv llm token` is always interactive rather than flag-gated. The
4-tier token source only opens a browser on a genuine cache miss, so this is
transparent for cached/refreshable tokens and needs no extra flag.
Non-goals (per issue): headless/device-code flows; changing the default
interactive `thv llm setup` behavior (lazy is strictly opt-in).

Special notes for reviewers
- The 3-minute timeout applies to every proxy token fetch, not just first-login. A genuinely hung upstream token endpoint now holds a request up to 3 min instead of 10s. This was the chosen simplicity tradeoff over first-login-only detection.
- Proxy deferred login is covered by the timeout change plus existing proxy tests; an E2E that drives a browser login through the proxy was deemed optional and not added. The token-helper deferred path is fully E2E-covered. Happy to add the proxy variant if reviewers want it.
- `thv llm token` behavior change: it is now interactive. This is intended (it's what makes lazy transparent) and does not affect callers that already have a cached or refreshable token.
Generated with Claude Code