feat(helm): Add full probe configuration support for Scheduler #60854
Conversation
This change extends the scheduler pod's probe configuration to support the full Kubernetes probe specification, bringing it in line with other chart components like workers and webserver.

Changes:
- Extended livenessProbe schema to accept httpGet, tcpSocket, grpc handlers
- Added readinessProbe support for scheduler (disabled by default)
- Extended startupProbe schema with the same flexibility
- Added an 'enabled' flag to all probes for fine-grained control
- Updated scheduler-deployment.yaml template to dynamically render probes
- Maintains backward compatibility with existing configurations

Fixes apache#60134
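For illustration, the extended schema would allow values like the following under `scheduler` (a sketch based on the description above, not the literal diff; the port shown is Airflow's default scheduler health-check port, 8974, and exact field names may differ in the final schema):

```yaml
scheduler:
  livenessProbe:
    enabled: true
    initialDelaySeconds: 10
    periodSeconds: 60
    timeoutSeconds: 20
    failureThreshold: 5
    # Exactly one handler is used; the existing exec/command remains the default.
    httpGet:
      path: /health
      port: 8974
  readinessProbe:
    # Newly supported for the scheduler; disabled by default.
    enabled: false
    tcpSocket:
      port: 8974
  startupProbe:
    enabled: true
    periodSeconds: 10
    failureThreshold: 6
```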
Pull request overview
This PR extends the Airflow Helm chart's scheduler component to support the full Kubernetes probe specification, aligning it with other components like workers and webserver. Additionally, it includes a significant refactoring of the workers persistence configuration structure (not mentioned in the PR description).
Changes:
- Extended scheduler probe configurations (liveness, readiness, startup) to support all Kubernetes probe handlers (exec, httpGet, tcpSocket, grpc) with enabled flags for fine-grained control
- Added readinessProbe support for scheduler (disabled by default)
- Refactored workers.persistence configuration to workers.celery.persistence with deprecation of the old structure
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| chart/values.yaml | Added deprecation notices for workers.persistence, introduced workers.celery.persistence configuration, and extended scheduler probe configurations with full Kubernetes probe spec support including comments and examples |
| chart/values.schema.json | Updated schema to mark workers.persistence as deprecated, added workers.celery.persistence schema, and extended scheduler probe schemas to include all probe handler types and optional fields like successThreshold and terminationGracePeriodSeconds |
| chart/templates/scheduler/scheduler-deployment.yaml | Implemented backward compatibility logic for workers.persistence migration, updated probe rendering to support multiple handler types (httpGet, tcpSocket, grpc, exec) with conditional rendering based on enabled flags |
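The probe rendering described in the last row presumably follows the usual Helm pattern of selecting exactly one handler per probe. A minimal sketch of that pattern, assuming the value names from the PR description (not the literal template code from the diff):

```yaml
{{- with .Values.scheduler.livenessProbe }}
{{- if .enabled }}
livenessProbe:
  {{- if .httpGet }}
  httpGet: {{- toYaml .httpGet | nindent 4 }}
  {{- else if .tcpSocket }}
  tcpSocket: {{- toYaml .tcpSocket | nindent 4 }}
  {{- else if .grpc }}
  grpc: {{- toYaml .grpc | nindent 4 }}
  {{- else }}
  exec:
    command: {{- toYaml .command | nindent 6 }}
  {{- end }}
  initialDelaySeconds: {{ .initialDelaySeconds }}
  timeoutSeconds: {{ .timeoutSeconds }}
  failureThreshold: {{ .failureThreshold }}
  periodSeconds: {{ .periodSeconds }}
{{- end }}
{{- end }}
```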
Comments suppressed due to low confidence (3)
chart/templates/scheduler/scheduler-deployment.yaml:28
- The backward compatibility logic for workers.celery.persistence.enabled has a flaw. When a user upgrades from an old chart version with 'workers.persistence.enabled: false' in their values, but hasn't added 'workers.celery.persistence' to their configuration, Helm will merge their old value with the new chart's default value for 'workers.celery.persistence.enabled: true'. The template will then evaluate the first part of the 'or' expression as true (using the new default), ignoring the user's original 'false' setting.
The 'has' check only helps when the value is nil/unset, but with a boolean default value defined in values.yaml line 1074, the field will never be nil. Consider using a pattern that checks if the old config should take precedence, such as checking if workers.persistence.enabled is explicitly set by examining if it differs from the default, or removing the default value for workers.celery.persistence.enabled to let it be truly optional during the migration period.
```yaml
{{- $persistence := or .Values.workers.celery.persistence.enabled (and (not (has .Values.workers.celery.persistence.enabled (list true false))) .Values.workers.persistence.enabled) }}
```
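One way to implement the suggested fix: ship no default for workers.celery.persistence.enabled in values.yaml so that nil means "not configured", and fall back to the deprecated flag only in that case. A sketch, assuming Helm 3's variable reassignment and the standard Sprig nil test `kindIs "invalid"`:

```yaml
{{- /* Fall back to the deprecated workers.persistence.enabled only when the
       new flag is truly unset; this requires values.yaml to leave
       workers.celery.persistence.enabled undefined during the migration. */}}
{{- $persistence := .Values.workers.persistence.enabled }}
{{- if not (kindIs "invalid" .Values.workers.celery.persistence.enabled) }}
{{- $persistence = .Values.workers.celery.persistence.enabled }}
{{- end }}
```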
chart/values.yaml:1099
- The PR title and description focus exclusively on scheduler probe configuration changes, but this PR also includes significant refactoring of the workers.persistence configuration structure (deprecating workers.persistence in favor of workers.celery.persistence). These persistence-related changes should either be documented in the PR description or separated into their own PR, as they represent a distinct feature change beyond the probe configuration improvements. This affects multiple files and introduces backward compatibility considerations that deserve explicit discussion.
```yaml
livenessProbe:
  enabled: true
  initialDelaySeconds: 10
  timeoutSeconds: 20
  failureThreshold: 5
  periodSeconds: 60
  command: ~
# List of worker sets. Each item can override values from the parent 'workers' section.
sets: []
# sets:
#   - name: highcpu
#     replicas: 2
#     queue: "highcpu"
#     resources:
#       requests:
#         memory: "2Gi"
#         cpu: "4000m"
#       limits:
#         memory: "4Gi"
#         cpu: "8000m"
#   - name: highmem
#     replicas: 2
#     queue: "highmem"
#     resources:
#       requests:
#         memory: "4Gi"
#         cpu: "2000m"
#       limits:
```
chart/values.schema.json:2881
- The description states that successThreshold must be 1 for liveness probes, but the schema doesn't enforce this constraint (no enum or const limiting it to 1). While Kubernetes will validate this at deployment time, consider adding schema-level validation (e.g., "const": 1) to catch misconfigurations earlier during values validation for liveness and startup probes, while allowing flexibility for readiness probes where values > 1 are permitted.
"storageClassName": {
"description": "If using a custom StorageClass, pass name ref to all StatefulSets here (templated).",
"type": [
"string",
```json
  }
},
"persistence": {
  "description": "Persistence configuration for Airflow Celery workers.",
```
Copilot AI · Jan 21, 2026
The description states that successThreshold must be 1 for startup probes, but the schema doesn't enforce this constraint. While Kubernetes will validate this at deployment time, consider adding schema-level validation (e.g., "const": 1) to catch misconfigurations earlier during values validation.
jscheffl left a comment
I understand that technically it is possible to add many more probe types than the chart currently supports. Nevertheless, adding more probe options also adds maintenance complexity and confuses users.
So far I am not convinced, because for the scheduler component only certain probes make sense. In my view we should provide options that are meaningful and helpful, so that a standard setup runs in a good and stable manner by default, with some options to tweak. Delivering a Swiss Army knife without accompanying guidance is not what I would favor.
Can you please convince me, for the proposed PR, why and how a grpc check could be a better option than the existing one, or how a tcpSocket check could be used for the scheduler in a way that improves things for the admin who deploys?