fix: increase readiness probe timeout and period for all products by Lytol · Pull Request #136 · posit-dev/team-operator

Lytol · 2026-05-07T22:05:51Z

Summary

- Raises TimeoutSeconds from 1s → 3s and PeriodSeconds from 3s → 5s on readiness probes for Connect, Workbench, PackageManager, and Flightdeck.
- The 1s timeout was causing false readiness failures on healthy pods under load spikes (observed on Workbench).
- Liveness probes and all other probe fields (thresholds, initial delay) are unchanged.
- No API/CRD changes; these are purely controller defaults. Existing deployments will pick up the new values automatically on the next reconcile via a rolling update.

Note that I also fixed the CLAUDE.md to use the appropriate nesting for worktree paths.

Test plan

- go test ./... passes (pre-existing failures in internal/controller/core are unrelated: the kubebuilder etcd binary is missing locally, and they reproduce identically on unmodified main)
- go vet ./... clean
- Deploy to a test cluster and confirm readiness probes reflect new values (kubectl describe pod)
- Verify no false readiness failures under normal and loaded conditions

claude · 2026-05-07T22:06:17Z

Claude finished @Lytol's task

Review Complete

Read review guidelines
Analyze diff and changed files
Submit review

Submitted a review via the GitHub review system. The changes look good — the timeout/period increases are reasonable for addressing false readiness failures under load. Left one non-blocking suggestion about extracting the duplicated magic numbers into shared constants.

claude

Review Summary

Changes verified: The diff correctly updates TimeoutSeconds (1→3) and PeriodSeconds (3→5) on readiness probes for Connect, Workbench, PackageManager, and Flightdeck. Liveness probes are untouched. The CLAUDE.md changes fix worktree paths (unrelated but fine to bundle).

Controller checklist:

✅ Reconciliation remains idempotent — these are static defaults applied every reconcile
✅ No API/CRD changes required — values are hardcoded in controller logic
✅ Existing deployments pick up new values on next rolling update
✅ No security concerns (probe config only)

One suggestion (non-blocking):

The timeout and period values are now duplicated as magic numbers across four controllers (connect.go:731-732, workbench.go:934-935, package_manager.go:582-583, flightdeck_controller.go:349-350). Consider extracting shared constants like:

const (
    defaultReadinessTimeoutSeconds int32 = 3
    defaultReadinessPeriodSeconds  int32 = 5
)

This would make future tuning a one-line change and make intent clearer. The codebase already uses this pattern for defaultWorkbenchReadinessProbePath.
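One way the suggested constants might be consumed is sketched below. The `probe` struct is a local stand-in for the Kubernetes probe type so the example runs without dependencies. The `newReadinessProbe` helper and the paths are hypothetical and do not come from the repository.

```go
package main

import "fmt"

const (
	defaultReadinessTimeoutSeconds int32 = 3
	defaultReadinessPeriodSeconds  int32 = 5
)

// probe is a hypothetical local stand-in for the probe fields discussed in this review.
type probe struct {
	Path             string
	TimeoutSeconds   int32
	PeriodSeconds    int32
	FailureThreshold int32
}

// newReadinessProbe is a hypothetical helper: timeout and period come from the
// shared constants, while the path and failure threshold stay per-product.
func newReadinessProbe(path string, failureThreshold int32) probe {
	return probe{
		Path:             path,
		TimeoutSeconds:   defaultReadinessTimeoutSeconds,
		PeriodSeconds:    defaultReadinessPeriodSeconds,
		FailureThreshold: failureThreshold,
	}
}

func main() {
	// Paths are placeholders; the thresholds follow the values noted in this review.
	connect := newReadinessProbe("/placeholder-connect", 3)
	flightdeck := newReadinessProbe("/placeholder-flightdeck", 2)
	fmt.Printf("%+v\n%+v\n", connect, flightdeck)
}
```

With this shape, retuning the timeout or period touches only the `const` block, while each controller keeps control of its own threshold.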

Note on FailureThreshold differences: Connect and Workbench use FailureThreshold: 3 while PackageManager and Flightdeck use FailureThreshold: 2. This is pre-existing and likely intentional (heavier products get more chances), but worth calling out for visibility.

LGTM — the values are reasonable for reducing false failures under load while still detecting genuinely unhealthy pods within ~15-25s.
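As a rough cross-check of the detection window, the sketch below uses a simplified model. It assumes probes fire on a fixed period, a failing probe takes at most the full timeout, and scheduling jitter is ignored. The `detectionWindow` function is hypothetical and only does the arithmetic.

```go
package main

import "fmt"

// detectionWindow estimates how long (in seconds) a pod can be unhealthy before
// FailureThreshold consecutive readiness failures are recorded.
//   - fastest: the pod breaks just before a probe and every probe fails instantly.
//   - slowest: the pod breaks just after a successful probe and every probe
//     runs until the timeout.
func detectionWindow(period, timeout, failureThreshold int32) (fastest, slowest int32) {
	fastest = (failureThreshold - 1) * period
	slowest = failureThreshold*period + timeout
	return fastest, slowest
}

func main() {
	// New defaults from this PR: PeriodSeconds 5, TimeoutSeconds 3.
	for _, threshold := range []int32{3, 2} {
		fastest, slowest := detectionWindow(5, 3, threshold)
		fmt.Printf("FailureThreshold=%d: %ds to %ds\n", threshold, fastest, slowest)
	}
	// FailureThreshold=3: 10s to 18s
	// FailureThreshold=2: 5s to 13s
}
```

Under these assumptions, the products with FailureThreshold 3 are marked unready within about 18s and those with FailureThreshold 2 within about 13s.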

Lytol added 2 commits May 7, 2026 15:05

fix: increase readiness probe timeout and period for all products

f083632

Raise TimeoutSeconds from 1→3 and PeriodSeconds from 3→5 across Connect, Workbench, PackageManager, and Flightdeck. The 1s timeout was causing false readiness failures under load spikes on otherwise healthy pods.

docs: fix worktree path in CLAUDE.md

d69b61b

../../.worktrees resolves to Projects/.worktrees, not ptd-workspace/.worktrees. Correct to ../.worktrees throughout.
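The path fix can be checked with a short Go sketch. The checkout location `/home/dev/Projects/ptd-workspace/team-operator` is an assumed layout chosen to match the directory names in the commit message; the real location may differ.

```go
package main

import (
	"fmt"
	"path"
)

// resolve joins a relative path onto a base directory and normalizes ".." segments.
func resolve(base, rel string) string {
	return path.Join(base, rel)
}

func main() {
	// Hypothetical checkout location (assumption, not taken from the repository).
	repo := "/home/dev/Projects/ptd-workspace/team-operator"

	fmt.Println(resolve(repo, "../../.worktrees")) // /home/dev/Projects/.worktrees
	fmt.Println(resolve(repo, "../.worktrees"))    // /home/dev/Projects/ptd-workspace/.worktrees
}
```

Each `..` climbs one directory, so the two-level form escapes `ptd-workspace` entirely, while the one-level form lands in `ptd-workspace/.worktrees` as intended.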

claude Bot reviewed May 7, 2026

Lytol marked this pull request as ready for review May 7, 2026 22:08

amdove previously approved these changes May 7, 2026

refactor: extract readiness probe timeout and period into shared constants

6f25e8f

Lytol dismissed amdove’s stale review via 6f25e8f May 7, 2026 22:11

Lytol enabled auto-merge May 7, 2026 22:13

amdove approved these changes May 7, 2026

Lytol added this pull request to the merge queue May 7, 2026

Merged via the queue into main with commit fe3803e May 7, 2026
7 checks passed

Lytol deleted the probe-readiness-timeout branch May 7, 2026 22:19

ian-flores pushed a commit that referenced this pull request May 7, 2026

chore(release): 1.24.1 [skip ci]

00ee87b

## [1.24.1](v1.24.0...v1.24.1) (2026-05-07)

### Bug Fixes

* increase readiness probe timeout and period for all products ([#136](#136)) ([fe3803e](fe3803e))

stevenolen mentioned this pull request May 15, 2026

chore: bump default team-operator chart version to v1.25.0 posit-dev/ptd#289

Merged

2 tasks

Lytol merged 3 commits into main from probe-readiness-timeout
