Skip to content

feat: add session label controller for Workbench session pods#123

Merged
amdove merged 18 commits intomainfrom
label-controller
Apr 14, 2026
Merged

feat: add session label controller for Workbench session pods#123
amdove merged 18 commits intomainfrom
label-controller

Conversation

@amdove
Copy link
Copy Markdown
Contributor

@amdove amdove commented Apr 8, 2026

Description

Adds a SessionGroupLabelReconciler that watches Workbench session pods and injects numbered labels derived from values extracted from a configurable pod field — e.g. user-group-1: some_value, user-group-2: another_value. Labels can be consumed by any downstream tooling that reads pod metadata.

Replaces functionality for: #113
Intends to be more lightweight and supportable.

How it works

The controller reads a configurable field from each session pod, applies a regex to find matching entries, sanitizes them to valid Kubernetes label values, and patches the pod with numbered labels (up to 30). A posit.co/session-group-labels-injected: "true" marker prevents reprocessing.

Configuration is per-site via a new sessionLabels field on the Workbench CRD. Adding it to site.yaml enables the feature for that site; omitting it disables it. PTD automatically enables the controller via Helm when any site has the block configured — no manual flag needed.

Enabling

# site.yaml
spec:
  workbench:
    sessionLabels:
      sourceField: "spec.containers[0].args"  # dot-path into pod spec; supports array index notation
      sourceKey: "--container-user-groups"     # flag name (args) or map key (annotations/labels)
      searchRegex: "_entra_[^ ,]+"            # only matching comma-separated entries become labels
      trimPrefix: "_"                          # stripped from the start of each matched value

All fields are optional — defaults cover the standard Workbench + Entra ID setup.

Reprocessing existing pods

By default, already-processed pods are skipped (the marker label prevents re-reconciliation). To force re-labeling of existing session pods — e.g. after changing searchRegex or trimPrefix — set reprocess: true:

sessionLabels:
  reprocess: true   # re-labels all existing session pods for this site immediately

When reprocess: true is set on a Workbench CR, the controller re-enqueues all existing session pods for that site immediately. Previously written group labels are cleared before the new set is applied, so stale labels from the old config are removed. Set back to false (or omit it) once done — it defaults to false and has no effect otherwise.

Config flow

site.yaml workbench.sessionLabels
  → PTD detects block, sets sessionGroupLabels.enable=true in Helm
  → SiteReconciler propagates to Workbench CR
  → SessionGroupLabelReconciler reads config at reconcile time
  → Pod labels: user-group-1: <matched_value_1>
                user-group-2: <matched_value_2>

Category of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Build: a code change that affects the build system or external dependencies
  • Performance: a code change that improves performance
  • Refactor: a code change that neither fixes a bug nor adds a feature
  • Documentation: documentation changes
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

  • I have run just test and all tests pass
  • I have reviewed my own diff and added inline comments on lines I want reviewers to focus on or that I am uncertain about

@claude

This comment was marked as outdated.

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of session label controller PR — see detailed feedback in the PR comment.

@amdove
Copy link
Copy Markdown
Contributor Author

amdove commented Apr 9, 2026

@claude review this PR

@claude

This comment was marked as outdated.

@amdove
Copy link
Copy Markdown
Contributor Author

amdove commented Apr 9, 2026

@claude review again please

@claude

This comment was marked as outdated.

@amdove
Copy link
Copy Markdown
Contributor Author

amdove commented Apr 9, 2026

@claude one more time! review please.

@claude

This comment was marked as outdated.

@amdove amdove marked this pull request as ready for review April 9, 2026 22:16
@amdove
Copy link
Copy Markdown
Contributor Author

amdove commented Apr 10, 2026

@claude review again please

@claude
Copy link
Copy Markdown

claude bot commented Apr 10, 2026

Claude finished @amdove's task —— View job


Code Review: Session Group Label Controller

  • Read controller implementation
  • Read controller tests
  • Read API types and CRD changes
  • Read Helm chart and deployment changes
  • Read site controller integration
  • Review cmd/main.go changes
  • Compile findings and post review

Overall this is a well-structured, focused feature. The controller design is clean, the configuration is appropriately flexible, and the test coverage for the extraction/sanitization logic is thorough. The documentation is excellent. A few items worth considering before merge:

Issues

1. Regex compiled on every reconcilesessiongrouplabel_controller.go:218

regexp.Compile(searchRegex) is called inside extractGroupLabels on every reconcile. For pods hitting the same site config, this recompiles the same regex repeatedly. Consider caching the compiled regex (e.g., in a sync.Map keyed by the regex string, or at least a simple field that's invalidated when config changes). Not a blocking issue at current scale, but worth noting.

2. JSON marshal/unmarshal round-trip per podsessiongrouplabel_controller.go:274-281

Every reconcile marshals the entire pod to JSON and unmarshals it back to interface{} just to walk a path. This works but is expensive for a hot path. For the default case (spec.containers[0].args), you could read pod.Spec.Containers[0].Args directly and only fall back to the JSON round-trip for non-standard paths. Not blocking, but a performance improvement opportunity.

3. r.Log used in extractGroupLabels instead of context loggersessiongrouplabel_controller.go:235

The extractGroupLabels method uses r.Log directly for the "group label cap reached" message, but the Reconcile method uses a context-derived logger. This means the cap-reached log won't carry the pod/controller structured values. Minor inconsistency — pass the logger in or use the context.

4. RBAC: list for pods may need wider namespace scopesessiongrouplabel_controller.go:59

The RBAC annotation scopes to namespace=posit-team. The workbenchToSessionPods method lists pods in workbench.Namespace. If any Workbench CR ends up in a different namespace, the list call will fail silently (the method swallows errors by returning nil). Confirm this is always posit-team.

5. Workbench CR RBAC only grants get, but watch needs list + watchsessiongrouplabel_controller.go:60

The controller Watches Workbench CRs (line 421-424), which requires list and watch verbs on the Workbench resource. The RBAC annotation only grants get:

//+kubebuilder:rbac:namespace=posit-team,groups=core.posit.team,resources=workbenches,verbs=get

This should be:

//+kubebuilder:rbac:namespace=posit-team,groups=core.posit.team,resources=workbenches,verbs=get;list;watch

Without list and watch, the informer for Workbench CRs won't be able to start, and the controller will fail at runtime. This is a bug.

Suggestions (non-blocking)

6. sanitizeGroupLabelValue truncation edge casesessiongrouplabel_controller.go:366-369

After truncating to 63 chars, strings.Trim could leave the value shorter. But if the 63rd character happens to be a _, -, or ., the trim removes it, which is correct. However, consider whether the truncated value could end with a multi-byte sequence that gets split. Given the sanitizeInvalidChars regex has already replaced non-ASCII chars, this is safe — just wanted to confirm the reasoning.

7. Test coverage for Reconcile loop is unit-only — The tests cover extractGroupLabels, walkJSONPath, and sanitizeGroupLabelValue well (which is the core logic), but there are no integration tests for the Reconcile method itself (e.g., testing the marker label logic, reprocessing, skipping terminating pods). The unit tests for the extraction logic give good confidence, but an envtest-based test for the reconcile loop would catch RBAC or patching issues early.

8. Consider adding a +kubebuilder:validation:MaxLength for sourceField and sourceKeyworkbench_types.go:153,161

SearchRegex has MaxLength=256, but SourceField and SourceKey have no length constraints. These are user-provided strings that get used in path traversal and flag matching. Adding reasonable limits (e.g., 256) would be consistent.

9. Stale label cleanup prefix match is broadsessiongrouplabel_controller.go:164-168

The cleanup loop deletes any label whose key starts with prefix (default user-group-). If another controller or system also writes labels with the user-group- prefix, they'd be removed. This is unlikely given the prefix is configurable and specific, but documenting this behavior would help.

Looks Good

  • Clean separation: config in CRD types, propagation in site controller, extraction in the session label controller
  • The reprocess toggle with stale label cleanup is well thought out
  • The marker label approach (posit.co/session-group-labels-injected) is a good pattern for idempotency
  • Helm integration is minimal and clean — single flag, auto-detected by PTD
  • Documentation in workbench-configuration.md is thorough with a clear config reference table
  • Deep copy generation is correct for the new SessionLabelsConfig type
  • The workbenchToSessionPods mapper correctly only fires when Reprocess is true

TL;DR: Fix the Workbench RBAC annotation (item 5 — needs list;watch). The rest are suggestions for future improvement. The feature is well-designed and ready to ship after the RBAC fix.

@amdove
Copy link
Copy Markdown
Contributor Author

amdove commented Apr 10, 2026

Claude finished @amdove's task —— View job

Fixed #3 and #5. The rest don't feel critical.

@amdove amdove requested a review from stevenolen April 14, 2026 17:29
Copy link
Copy Markdown
Collaborator

@stevenolen stevenolen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!

@amdove amdove added this pull request to the merge queue Apr 14, 2026
Merged via the queue into main with commit f0f8146 Apr 14, 2026
8 checks passed
@amdove amdove deleted the label-controller branch April 14, 2026 20:08
ian-flores pushed a commit that referenced this pull request Apr 14, 2026
# [1.23.0](v1.22.0...v1.23.0) (2026-04-14)

### Bug Fixes

* only tag docker release after corresponding image is pushed ([#122](#122)) ([c1532c0](c1532c0))

### Features

* add session label controller for Workbench session pods ([#123](#123)) ([f0f8146](f0f8146))
amdove added a commit that referenced this pull request Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants