Remove token-refresher sidecar, generate tokens in-process#971
Merged
Conversation
There was a problem hiding this comment.
9 issues found across 28 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cmd/kelos-spawner/main_test.go">
<violation number="1" location="cmd/kelos-spawner/main_test.go:2254">
P2: This injected `TokenResolver` bypasses the env-var fallback, so the test no longer verifies `GITHUB_TOKEN`-based reporting auth.</violation>
</file>
<file name="cmd/kelos-webhook-server/main.go">
<violation number="1" location="cmd/kelos-webhook-server/main.go:55">
P1: Do not use secret env vars as `flag` defaults; they can be echoed in help/usage output and leak the token or private key.</violation>
</file>
<file name="cmd/kelos-spawner/reconciler.go">
<violation number="1" location="cmd/kelos-spawner/reconciler.go:81">
P2: Don't silently skip reporting when reporting is enabled but no token resolver is configured.</violation>
<violation number="2" location="cmd/kelos-spawner/reconciler.go:93">
P2: Returning an empty token here turns resolver failures into repeated unauthenticated GitHub write attempts for every task.</violation>
</file>
<file name="internal/controller/workspace_controller_test.go">
<violation number="1" location="internal/controller/workspace_controller_test.go:91">
P2: This test now covers only `GITHUB_APP_ID`, so it can miss regressions where the other required GitHub App env vars are absent or the deleted token-file wiring is still present.
(Based on your team's feedback about adding or extending tests for new behavior.) [FEEDBACK_USED]</violation>
</file>
<file name="cmd/kelos-spawner/main.go">
<violation number="1" location="cmd/kelos-spawner/main.go:202">
P2: Check `ts.Spec.Suspend` before resolving a GitHub token. As written, suspended GitHub-backed spawners can still mint tokens and fail the cycle instead of skipping cleanly.</violation>
<violation number="2" location="cmd/kelos-spawner/main.go:665">
P1: Do not return a nil resolver on invalid GitHub App credentials. That silently falls back to unauthenticated GitHub requests instead of failing fast on a bad secret.</violation>
</file>
<file name="cmd/ghproxy/main.go">
<violation number="1" location="cmd/ghproxy/main.go:388">
P2: Bound GitHub App token generation with a timeout. Using `context.Background()` here lets a stalled token endpoint hang proxied requests indefinitely.</violation>
<violation number="2" location="cmd/ghproxy/main.go:410">
P1: Avoid using env-derived defaults for secret flags; Go's usage output will print those values when help/usage is shown.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| } | ||
|
|
||
| src, err := buildSourceWithProxy(&ts, githubOwner, githubRepo, ghProxyURL, githubAPIBaseURL, githubTokenFile, jiraBaseURL, jiraProject, jiraJQL, httpClient) | ||
| src, err := buildSourceWithProxy(ctx, &ts, githubOwner, githubRepo, ghProxyURL, githubAPIBaseURL, tokenResolver, jiraBaseURL, jiraProject, jiraJQL, httpClient) |
There was a problem hiding this comment.
P2: Check ts.Spec.Suspend before resolving a GitHub token. As written, suspended GitHub-backed spawners can still mint tokens and fail the cycle instead of skipping cleanly.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At cmd/kelos-spawner/main.go, line 202:
<comment>Check `ts.Spec.Suspend` before resolving a GitHub token. As written, suspended GitHub-backed spawners can still mint tokens and fail the cycle instead of skipping cleanly.</comment>
<file context>
@@ -170,27 +179,27 @@ func runReportingCycle(ctx context.Context, cl client.Client, key types.Namespac
}
- src, err := buildSourceWithProxy(&ts, githubOwner, githubRepo, ghProxyURL, githubAPIBaseURL, githubTokenFile, jiraBaseURL, jiraProject, jiraJQL, httpClient)
+ src, err := buildSourceWithProxy(ctx, &ts, githubOwner, githubRepo, ghProxyURL, githubAPIBaseURL, tokenResolver, jiraBaseURL, jiraProject, jiraJQL, httpClient)
if err != nil {
return fmt.Errorf("building source: %w", err)
</file context>
There was a problem hiding this comment.
2 issues found across 8 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="internal/githubapp/token.go">
<violation number="1" location="internal/githubapp/token.go:248">
P1: Recompute the expiry check after the refresh attempt fails; using the pre-request timestamp can return an already expired cached token.</violation>
</file>
<file name="internal/githubapp/token_test.go">
<violation number="1" location="internal/githubapp/token_test.go:392">
P3: This test never verifies that the refresh was attempted, so an implementation that incorrectly skips the refresh path would still pass.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
271170d to
0b7b81e
Compare
Replace the token-refresher sidecar binary with in-process GitHub App token generation using a shared TokenProvider. This eliminates an entire binary, simplifies pod specs, and saves resources on every GitHub App spawner/ghproxy pod. - Add githubapp.TokenProvider: thread-safe, caches tokens with 5-min expiry margin, falls back to cached token on transient refresh errors, 10-second default HTTP timeout on the token client - Deployment builders inject GITHUB_APP_* env vars from the workspace secret instead of running a sidecar with shared volumes - Spawner, ghproxy, and webhook server accept GitHub App credentials as command-line flags (with env var fallback for K8s compatibility); secrets are never exposed in flag defaults or -help output - Replace GitHubReporter.TokenFile with TokenFunc for dynamic token resolution without filesystem I/O - ghproxy token resolver accepts request context so refresh is bounded by request lifetime - Error explicitly when GitHub reporting is enabled but no token resolver is configured - Delete cmd/kelos-token-refresher, its Dockerfile, and all associated controller flags, Helm values, and CLI options - Retain init container / volume reconciliation to clean up existing deployments during upgrade
0b7b81e to
7883042
Compare
kelos-bot bot
pushed a commit
that referenced
this pull request
Apr 12, 2026
…riable Based on review feedback from recent PRs: - PR #971: code review flagged P1 security issue where os.Getenv() for secrets was used as flag.StringVar default in two binaries. Go's flag package prints defaults in usage output, leaking secret values. Added as convention to AGENTS.md, agentconfig, workers config, and reviewer security checklist. - PR #965: human reviewer flagged that .Branch template variable is empty for issue_comment events from issues (not PRs). Clarified the template variable table in self-development/README.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
kelos-bot bot
pushed a commit
that referenced
this pull request
Apr 14, 2026
…eedback Adds conventions learned from recent PR reviews: 1. Fail fast on invalid configuration (PR #971): three P1 and four P2 issues flagged silent degradation when credentials or config were invalid, falling back to unauthenticated requests instead of erroring. 2. No manual PR branch checkout in TaskSpawner prompts (PR #974): Kelos already checks out the PR branch automatically; manual checkout instructions are redundant and confusing. Also carries forward the previously proposed changes from PR #786: - os.Getenv() secret-in-flag-defaults convention (PR #971) - TaskSpawner creation conventions (PR #965) - Branch template variable documentation fix (PR #965) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Removes the
kelos-token-refreshersidecar binary entirely. Spawner and ghproxy pods now generate GitHub App installation tokens in-process using a sharedgithubapp.TokenProvider, eliminating an extra container, two volumes, and a separate Docker image from every GitHub App pod.Key changes:
internal/githubapp: NewTokenProvider— thread-safe, caches tokens with a 5-minute expiry margin, accepts*TokenClientand*Credentialsat the constructor level (no env-var reads inside the package)GITHUB_APP_ID,GITHUB_APP_INSTALLATION_ID,GITHUB_APP_PRIVATE_KEYenv vars injected from the workspace secret--github-app-id, etc.) defaulting toos.Getenv(…)for backward-compatible K8s deployments. Token resolver is created inmain()and passed down — no credential reads inside library packagesinternal/reporting:GitHubReporter.TokenFilereplaced withTokenFunc func() stringfor dynamic token resolution without filesystem I/Ocmd/kelos-token-refresher/, its Dockerfile, all--token-refresher-*controller flags, Helm values, and CLI install optionsInitContainers/Volumesdiffs so existing deployments are cleaned up automaticallyWhich issue(s) this PR is related to:
N/A
Special notes for your reviewer:
GITHUB_API_BASE_URLenv var is also injected for GHES so theTokenProvidercalls the correct API endpointDoes this PR introduce a user-facing change?
🤖 Generated with Claude Code
Summary by cubic
Removed the
kelos-token-refreshersidecar and now generate GitHub App tokens in-process for spawner,ghproxy, and the webhook. This simplifies deployments, cuts per-pod resources, and makes token refresh request-aware with a 10s default HTTP timeout.New Features
internal/githubapp.TokenProvider(thread-safe; caches with a 5‑min expiry margin; falls back to cached token on refresh errors;NewTokenClientuses a 10s timeout).ghproxytoken resolver now acceptscontext.Context, so refresh is bound to each request.ghproxy, and webhook accept PATs via--github-tokenand GitHub App creds via flags/env; a resolver is built inmain()(secrets read after flag parsing and not shown in--help).TokenFuncininternal/reporting.GitHubReporter; spawner errors if reporting is enabled without a token resolver.Migration
cmd/kelos-token-refresherand removed all--token-refresher-*controller/CLI options and Helm values (images, docs, and dev workflow updated).--github-token-fileacross spawner,ghproxy, and webhook. Use--github-tokenor GitHub App flags/env (GITHUB_APP_ID,GITHUB_APP_INSTALLATION_ID,GITHUB_APP_PRIVATE_KEY); for GHES set--github-api-base-url.Written for commit 7883042. Summary will update on new commits.