Skip to content

feat(iam): Implement OIDC discovery and JWKS-based JWT validation#49

Merged
intel352 merged 3 commits intomainfrom
copilot/implement-oidc-operations
Feb 22, 2026
Merged

feat(iam): Implement OIDC discovery and JWKS-based JWT validation#49
intel352 merged 3 commits intomainfrom
copilot/implement-oidc-operations

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Feb 22, 2026

OIDCProvider was a stub that only validated config JSON — no discovery, no JWKS fetch, no token verification. This replaces it with a full OIDC implementation.

Core changes

  • OIDCConfig — adds optional jwks_url and discovery_url for non-standard providers
  • OIDCProvider — adds injectable HTTPClient (defaults to http.DefaultClient) and thread-safe in-memory caches (1-hour TTL) for discovery docs and JWKS keys
  • OIDC discoverydiscover() fetches {issuer}/.well-known/openid-configuration; result is cached per issuer
  • JWKS / JWT validation — fetches and caches RSA public keys; keyFunc auto-refreshes on unknown kid to handle key rotation; validates RS256/384/512 signatures plus iss, aud, exp, iat claims
  • ResolveIdentities() — when id_token is present in credentials, performs full JWT validation and extracts the configured claim; falls back to direct claim extraction otherwise (backward-compatible)
  • TestConnection() — performs real OIDC discovery and verifies JWKS reachability instead of just parsing config JSON

Example

// Wire up a real OIDC provider with an injectable client for testing
p := &OIDCProvider{HTTPClient: testServer.Client()}

cfg, _ := json.Marshal(OIDCConfig{
    Issuer:   "https://auth.example.com",
    ClientID: "my-app",
    ClaimKey: "email",
})

ids, err := p.ResolveIdentities(ctx, cfg, map[string]string{
    "id_token": "eyJhbGci...", // validated against JWKS
})

Tests (iam/oidc_test.go)

Uses httptest.NewServer with in-process RSA key generation to cover: valid token, custom claim, expired token, wrong audience/issuer, invalid signature, JWKS key rotation, and TestConnection success/failure.

Original prompt

Problem

The iam/oidc.go file contains an OIDCProvider struct that is a stub — it validates the configuration JSON format but performs no actual OIDC operations: no discovery, no JWKS fetching, no JWT signature verification, no token validation.

Current Code

// iam/oidc.go

// OIDCProvider maps OIDC claims to roles.
// This is a stub implementation that validates config format but does not make
// actual OIDC discovery or token validation calls.
type OIDCProvider struct{}

func (p *OIDCProvider) ValidateConfig(config json.RawMessage) error {
    var c OIDCConfig
    if err := json.Unmarshal(config, &c); err != nil {
        return fmt.Errorf("invalid oidc config: %w", err)
    }
    if c.Issuer == "" {
        return fmt.Errorf("issuer is required")
    }
    if c.ClientID == "" {
        return fmt.Errorf("client_id is required")
    }
    return nil
}

func (p *OIDCProvider) ResolveIdentities(_ context.Context, config json.RawMessage, credentials map[string]string) ([]ExternalIdentity, error) {
    // ... extracts claim from credentials map directly, no token validation
}

func (p *OIDCProvider) TestConnection(_ context.Context, config json.RawMessage) error {
    return p.ValidateConfig(config) // Just validates JSON format, no network call
}

Requirements

  1. Add OIDC/JWKS dependencies: github.com/MicahParks/keyfunc/v2 (for JWKS) and github.com/golang-jwt/jwt/v5 (for JWT parsing). Alternatively, use github.com/coreos/go-oidc/v3 which provides a complete OIDC client.

  2. Implement OIDC Discovery in TestConnection() and/or a new Discover() method:

    • Fetch {issuer}/.well-known/openid-configuration.
    • Parse the discovery document to extract jwks_uri, authorization_endpoint, token_endpoint, userinfo_endpoint.
    • Cache the discovery document for a configurable TTL (e.g., 1 hour).
  3. Implement JWKS-based token validation in ResolveIdentities():

    • Fetch the JWKS from jwks_uri discovered above.
    • Cache the JWKS with automatic refresh on key rotation.
    • Parse and validate the JWT id_token:
      • Verify signature against JWKS.
      • Verify iss matches the configured issuer.
      • Verify aud contains the configured client_id.
      • Verify exp is not in the past.
      • Verify iat is not too far in the future.
    • Extract claims and return ExternalIdentity with the configured claim_key value.
  4. Implement TestConnection() properly:

    • Perform actual OIDC discovery (fetch .well-known/openid-configuration).
    • Verify the JWKS endpoint is reachable.
    • Return an error if discovery fails or JWKS is unreachable.
  5. Make the HTTP client injectable for testing:

    • Accept an *http.Client or HTTPClient interface so tests can use httptest.NewServer.
  6. Update OIDCConfig if needed:

    type OIDCConfig struct {
        Issuer   string `json:"issuer"`
        ClientID string `json:"client_id"`
        ClaimKey string `json:"claim_key"`
        // Optional: for non-standard providers
        JWKSURL          string `json:"jwks_url,omitempty"`
        DiscoveryURL     string `json:"discovery_url,omitempty"`
    }
  7. Remove the stub label from the struct comment.

  8. Write tests in iam/oidc_test.go:

    • Use httptest.NewServer to mock the OIDC discovery and JWKS endpoints.
    • Generate test RSA keys, create a JWKS document, and sign test JWTs.
    • Test successful token validation with valid JWT.
    • Test rejection of expired tokens.
    • Test rejection of tokens with wrong audience.
    • Test rejection of tokens with wrong issuer.
    • Test rejection of tokens with invalid signature.
    • Test TestConnection with valid and invalid discovery endpoints.
    • Test JWKS key rotation (new key ID works after refresh).
  9. Ensure all existing tests pass (go test ./...).

This pull request was created from Copilot chat.


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
Copilot AI changed the title [WIP] Add OIDC discovery and token validation to OIDCProvider feat(iam): Implement OIDC discovery and JWKS-based JWT validation Feb 22, 2026
Copilot AI requested a review from intel352 February 22, 2026 19:54
@intel352 intel352 marked this pull request as ready for review February 22, 2026 20:16
@intel352 intel352 merged commit c363564 into main Feb 22, 2026
12 checks passed
@intel352 intel352 deleted the copilot/implement-oidc-operations branch February 22, 2026 20:23
intel352 added a commit that referenced this pull request Apr 24, 2026
…egistry, typed gRPC args, migrate image, teardown

Five features bundled into v0.19.0 for shared config-file shape (wfctl.yaml +
.wfctl-lock.yaml) and release boundary. Each addresses architectural debt
surfaced during BMW tonight's deploy blocker chain.

Features:
- A. Plugin manifest + lockfile split (tasks #42/#43)
- B. Multi-registry + IaCProvider.EnsureRegistryAuth (task #48)
- C. Typed-args refactor for IaCProvider gRPC (task #41)
- D. Official workflow-migrate Docker image (task #49)
- E. wfctl infra teardown with mandatory dry-run + --approve flag (new)

Non-goals: constraint-based plugin resolution (v0.20.0), transitive plugin
deps, OCI chart/artifact registries, cross-registry mirroring.

Autonomous pipeline target: v0.19.0 after BMW post-teardown stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intel352 added a commit that referenced this pull request Apr 24, 2026
* docs: v0.19.0 architectural cleanup design — plugin manifest, multi-registry, typed gRPC args, migrate image, teardown

Five features bundled into v0.19.0 for shared config-file shape (wfctl.yaml +
.wfctl-lock.yaml) and release boundary. Each addresses architectural debt
surfaced during BMW tonight's deploy blocker chain.

Features:
- A. Plugin manifest + lockfile split (tasks #42/#43)
- B. Multi-registry + IaCProvider.EnsureRegistryAuth (task #48)
- C. Typed-args refactor for IaCProvider gRPC (task #41)
- D. Official workflow-migrate Docker image (task #49)
- E. wfctl infra teardown with mandatory dry-run + --approve flag (new)

Non-goals: constraint-based plugin resolution (v0.20.0), transitive plugin
deps, OCI chart/artifact registries, cross-registry mirroring.

Autonomous pipeline target: v0.19.0 after BMW post-teardown stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.19.0 design — add Features F, G, H (outputs, verify, secret sinks)

Scope expanded from 5 to 7 features per user feedback on BMW CI gap audit:
- F. wfctl infra outputs with masked-by-default sensitivity + GHA ::add-mask::
- G. wfctl deploy verify with multi-target healthcheck + retry/timeout gate
- H. Declarative secret sinks (outputs.<field>.sinks[]) — plaintext never
  leaves wfctl process; built-in github_secret + github_env handlers;
  aws/gcp/azure sinks via plugin fan-out in v0.19.x

Motivation: BMW's Capture staging DB URL step uses doctl + awk + gh secret
set shell pipeline, leaking DATABASE_URL plaintext through stdout/env/argv.
Declarative sink pattern (like terraform's output-to-secret-manager) writes
the value in-process directly to the GitHub secrets API with libsodium
encryption. Matches user's stated principle: "if BMW CI has provider-specific
shell, fix it in workflow/wfctl so the CI stays declarative."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.19.0 implementation plan — 7 features × 9 phases

Matches design doc 2026-04-24-v0.19.0-architectural-cleanup-design.md:
- Phase 1 alpha.1: Feature A (plugin manifest + lockfile)
- Phase 2 alpha.2: Feature C client-side (typed gRPC args)
- Phase 3 (DO plugin v0.8.0): Feature C server-side + integration tests
- Phase 4 alpha.3: Feature B (multi-registry)
- Phase 5 (DO plugin v0.8.1): Feature B server-side (EnsureRegistryAuth)
- Phase 6a rc1: Feature D (workflow-migrate image)
- Phase 6b rc2: Feature E (wfctl infra teardown)
- Phase 6c rc3: Features F + G + H (outputs + verify + sinks)
- Phase 7: v0.19.0 final + changelog + docs
- Phase 8: Plugin fan-out (aws/gcp/azure/tofu) in parallel
- Phase 9: BMW migration PR (after v0.19.0 stabilizes)

Timing: all phases can merge independently; final v0.19.0 tag and Phase 9
hold until BMW's tonight deploy chain reaches prod /healthz green (task #26).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: address PR #474 review — reconcile feature count, flag naming, source task column

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intel352 added a commit that referenced this pull request Apr 24, 2026
…nfra_output (#476)

* docs: v0.19.0 architectural cleanup design — plugin manifest, multi-registry, typed gRPC args, migrate image, teardown

Five features bundled into v0.19.0 for shared config-file shape (wfctl.yaml +
.wfctl-lock.yaml) and release boundary. Each addresses architectural debt
surfaced during BMW tonight's deploy blocker chain.

Features:
- A. Plugin manifest + lockfile split (tasks #42/#43)
- B. Multi-registry + IaCProvider.EnsureRegistryAuth (task #48)
- C. Typed-args refactor for IaCProvider gRPC (task #41)
- D. Official workflow-migrate Docker image (task #49)
- E. wfctl infra teardown with mandatory dry-run + --approve flag (new)

Non-goals: constraint-based plugin resolution (v0.20.0), transitive plugin
deps, OCI chart/artifact registries, cross-registry mirroring.

Autonomous pipeline target: v0.19.0 after BMW post-teardown stabilizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.19.0 design — add Features F, G, H (outputs, verify, secret sinks)

Scope expanded from 5 to 7 features per user feedback on BMW CI gap audit:
- F. wfctl infra outputs with masked-by-default sensitivity + GHA ::add-mask::
- G. wfctl deploy verify with multi-target healthcheck + retry/timeout gate
- H. Declarative secret sinks (outputs.<field>.sinks[]) — plaintext never
  leaves wfctl process; built-in github_secret + github_env handlers;
  aws/gcp/azure sinks via plugin fan-out in v0.19.x

Motivation: BMW's Capture staging DB URL step uses doctl + awk + gh secret
set shell pipeline, leaking DATABASE_URL plaintext through stdout/env/argv.
Declarative sink pattern (like terraform's output-to-secret-manager) writes
the value in-process directly to the GitHub secrets API with libsodium
encryption. Matches user's stated principle: "if BMW CI has provider-specific
shell, fix it in workflow/wfctl so the CI stays declarative."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.19.0 implementation plan — 7 features × 9 phases

Matches design doc 2026-04-24-v0.19.0-architectural-cleanup-design.md:
- Phase 1 alpha.1: Feature A (plugin manifest + lockfile)
- Phase 2 alpha.2: Feature C client-side (typed gRPC args)
- Phase 3 (DO plugin v0.8.0): Feature C server-side + integration tests
- Phase 4 alpha.3: Feature B (multi-registry)
- Phase 5 (DO plugin v0.8.1): Feature B server-side (EnsureRegistryAuth)
- Phase 6a rc1: Feature D (workflow-migrate image)
- Phase 6b rc2: Feature E (wfctl infra teardown)
- Phase 6c rc3: Features F + G + H (outputs + verify + sinks)
- Phase 7: v0.19.0 final + changelog + docs
- Phase 8: Plugin fan-out (aws/gcp/azure/tofu) in parallel
- Phase 9: BMW migration PR (after v0.19.0 stabilizes)

Timing: all phases can merge independently; final v0.19.0 tag and Phase 9
hold until BMW's tonight deploy chain reaches prod /healthz green (task #26).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: address PR #474 review — reconcile feature count, flag naming, source task column

* docs: v0.18.9 phase-continuation design — env-resolution consistency

BMW deploy run 24888583717 created a duplicate DO App Platform app because
wfctl infra apply used env-resolved name "bmw-staging" while wfctl ci run
--phase deploy used base module name "bmw-app". Both paths call driver.Read
by name; with different names they find different resources (or none) and
each calls Create, producing duplicates.

Root cause: cmd/wfctl/deploy_providers.go:769 reads m.Name directly after
ResolveForEnv has been applied. Same class as v0.18.7's Task #32 fix but
in the deploy-phase code path.

Fix: refactor resolveModCfg closure to return *ResolvedModule, use
resolved.Name at call sites. Audit + patch infra_output source resolution
(task #56) with the same pattern. Ship as v0.18.9.

Does not require state-sharing between IaC and CI phases; the bug is about
names, not state. Both phases use driver.Read by name; aligning the names
aligns the lookups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.18.9 phase-continuation implementation plan

9 tasks across Phase 1 (core fixes: deploy_providers.go + infra_secrets.go
+ regression tests) and Phase 2 (release + BMW unblock: PR, merge, tag,
BMW bump, teardown, redeploy).

Same-class fix as v0.18.7 Task #32: env-resolved Name used consistently
wherever modules are consumed. Target: v0.18.9 hotfix; unblocks BMW
staging deploy from run 24888583717 duplicate-resource failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(wfctl): ci run deploy uses env-resolved module name (not base)

Refactored resolveModCfg closure in deploy_providers.go to return
*config.ResolvedModule so callers see both resolved.Name (env-override
lifted from Config["name"]) and resolved.Config. All three call sites
(iac.provider lookup, findByType, fallback loop) now read resolved.Name
instead of m.Name.

Same class as v0.18.7 Task #32 fix for ResourceSpec.Name — env override
of Config["name"] was lifted into ResolvedModule.Name but deploy_providers.go
read m.Name directly, ignoring the override. Caused BMW deploy run
24888583717 to create duplicate DO apps (bmw-app vs bmw-staging).

Regression tested via:
- TestPluginDeployProvider_UsesEnvResolvedName (new, was failing)
- TestPluginDeployProvider_FallsBackToModuleNameWhenNoEnv (new, baseline)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(wfctl): infra_output source module name flows through env resolution

Introduces resolveInfraOutput(wfCfg, source, envName, stateOutputs)
which translates the base module name in a "module.field" source string
to its env-resolved name before looking up state. State is persisted
under the env-resolved name (e.g. "bmw-staging-db"), so "bmw-database.uri"
with --env staging now correctly finds the state entry.

syncInfraOutputSecrets now accepts wfCfg and envName so the new
resolution is applied for every infra_output secret in the generate list.
The call site in infra.go (runInfraApply) loads the workflow config and
passes it through.

Closes task #56. Regression tested via:
- TestInfraOutput_EnvResolvesModuleSource (new, was failing)
- TestInfraOutput_NoEnvUsesBaseName (new, baseline)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: CHANGELOG v0.18.9 entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(wfctl): stateKeys actually sorts keys (comment matched implementation)

Agent-Logs-Url: https://github.com/GoCodeAlone/workflow/sessions/a0429849-a053-4485-914d-ccb115be94e8

Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>

* fix(wfctl): address 4 Copilot round-1 findings on v0.18.9 (#476)

- resolveInfraOutput: ResolveForEnv ok=false now errors (config error)
  instead of silently falling back to base module name — prevents
  the env-resolution fix from being bypassed on misconfigured envs
- stateKeys: add sort.Strings so error messages list available modules
  in deterministic order (comment already said "sorted")
- infra.go: surface config.LoadFromFile error instead of discarding it —
  silent failure would regress env resolution to the pre-fix nil-wfCfg path
- CHANGELOG: replace "Closes task #60" (ambiguous GitHub issue ref) with
  "Root cause from BMW deploy run 24888583717"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(wfctl): accurate error message + test for explicitly-disabled module in resolveInfraOutput

Agent-Logs-Url: https://github.com/GoCodeAlone/workflow/sessions/3accbfdf-259b-4b98-a44e-8b538d3f5857

Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>

* fix(wfctl): gate LoadFromFile on envName + infra_output presence (#476)

Skip config.LoadFromFile when env resolution is not needed:
- envName="" → no env resolution, wfCfg=nil is correct
- no infra_output generators → syncInfraOutputSecrets ignores wfCfg

Avoids unnecessary file I/O on every infra apply when the caller
has no infra_output secrets or is not running with --env.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants