Add http-service chart #333

Merged
tasuku43 merged 44 commits into master from feature/add-http-service-chart
Mar 9, 2026

Conversation

tasuku43 (Contributor) commented Mar 6, 2026

Summary

Add http-service chart — an opinionated, language-agnostic Helm chart for HTTP services with sensible defaults.

Designed to absorb common boilerplate currently hand-written across Helmfile repositories using the slime chart.

Key features

  • Opinionated defaults: RollingUpdate strategy, PDB enabled, Reloader, Datadog unified service tags, required graceful shutdown and health probes
  • Type-based autoscaling (none / hpa / keda) with mutual exclusivity validation against replicas
  • Argo Rollouts support via workloadRef with automatic HPA/KEDA scaleTargetRef switching
  • Graceful shutdown: trafficDrainSeconds (preStop sleep for LB deregistration) + appShutdownTimeoutSeconds (time after SIGTERM) = auto-calculated terminationGracePeriodSeconds
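As a hedged illustration, a minimal values file exercising these defaults might look like the following. Top-level field names (`containerPort`, `replicas`, `autoscaling.type`, `gracefulShutdown.*`) come from this PR; the `image` block nesting and the probe endpoints are assumptions for the sketch.

```yaml
# Hypothetical minimal values for the http-service chart.
# Probe paths and the image block layout are placeholders, not the chart's exact schema.
image:
  repository: registry.example.com/my-service
  tag: "1.2.3"

containerPort: 8080

replicas: 3          # only valid while autoscaling.type is "none"

autoscaling:
  type: none         # one of: none / hpa / keda

gracefulShutdown:
  trafficDrainSeconds: 25        # preStop sleep for LB deregistration
  appShutdownTimeoutSeconds: 30  # time the app gets after SIGTERM
  # terminationGracePeriodSeconds renders as 25 + 30 = 55

startupProbe:                    # required by the chart
  httpGet: { path: /healthz, port: 8080 }
livenessProbe:                   # required by the chart
  httpGet: { path: /healthz, port: 8080 }
```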

Testing

helm lint --strict and helm template pass for all 5 example patterns (basic, HPA, Ingress, KEDA, Rollout).

```console
$ for f in examples/*.yaml; do echo "=> $f"; helm lint --strict -f "$f" .; done
=> examples/deployment-hpa.yaml
1 chart(s) linted, 0 chart(s) failed
=> examples/deployment-ingress.yaml
1 chart(s) linted, 0 chart(s) failed
=> examples/deployment-keda.yaml
1 chart(s) linted, 0 chart(s) failed
=> examples/deployment-rollout.yaml
1 chart(s) linted, 0 chart(s) failed
=> examples/deployment.yaml
1 chart(s) linted, 0 chart(s) failed
```

tasuku43 and others added 30 commits March 5, 2026 12:12
Starting point for the new http-service chart.
Subsequent commits will add opinionated defaults for HTTP services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
HTTP services don't need CronJob support.
This keeps the chart focused on its use case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…onfig

The http-service chart assumes a single main container per pod.
- image, containerPort, env, resources, etc. are now top-level values
- Health check probes (startup, liveness, readiness) shown as commented
  samples; startup and liveness are required in the template
- extraContainers escape hatch for sidecar use cases
- initContainers kept as-is for DB migrations and config prep

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This chart always creates a Deployment, so the enabled flag is removed.
Default strategy set to RollingUpdate with maxSurge 25% and
maxUnavailable 1, matching the most common pattern across services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PDB is enabled out of the box to protect service availability during
node drains. maxUnavailable (not minAvailable) is used as the default
to remain safe even with single-replica deployments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the generic ingresses map with explicit private/public ingress
sections. Each has its own enabled flag, allowing services to declare
which ALB types they need. The template will auto-generate appropriate
ALB annotations for each type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Commented-out sample showing the required gracefulShutdown.seconds
value. The template will use this to set terminationGracePeriodSeconds
and generate a preStop lifecycle hook for ALB deregistration delay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Provide nodeAffinity.key and nodeAffinity.value inputs for generating
requiredDuringSchedulingIgnoredDuringExecution node affinity rules.
Concrete defaults (e.g. ARM node groups) are injected via Helmfile
settings, keeping the chart vendor-neutral.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Datadog integration is enabled by default, auto-generating
tags.datadoghq.com/{env,service,version} labels on Deployment and Pod.
When enabled, env/service/version are required values. Services using
alternative observability (e.g. OpenTelemetry) can set enabled to false.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename flat annotations/labels/podAnnotations/podLabels to namespaced
deployment.{annotations,labels} and pod.{annotations,labels,securityContext}
for consistency with other resource sections (serviceAccount, PDB, ingress).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The http-service chart always creates a Deployment, so the conditional
guard is removed. All slime helper references updated to http-service.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align template references with the restructured values.yaml namespaces:
- annotations/labels -> deployment.annotations/labels
- podAnnotations/podLabels -> pod.annotations/labels
- podSecurityContext -> pod.securityContext

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strategy always has a default value (RollingUpdate), so the conditional
with-guard is replaced with unconditional rendering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When reloader.enabled is true, the template adds
reloader.stakater.com/auto: "true" to Deployment annotations
for automatic pod restart on ConfigMap/Secret changes.
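A sketch of how such a conditional annotation block is commonly rendered in a Helm deployment template. This is illustrative, not the chart's verbatim source; it assumes the `reloader.enabled` and `deployment.annotations` values keys described in the commit messages.

```yaml
# Illustrative Helm template fragment (Deployment metadata).
metadata:
  annotations:
    {{- if .Values.reloader.enabled }}
    reloader.stakater.com/auto: "true"   # Reloader restarts pods on ConfigMap/Secret changes
    {{- end }}
    {{- with .Values.deployment.annotations }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
```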

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When datadog.enabled is true, tags.datadoghq.com/{env,service,version}
labels are added to both Deployment metadata and Pod template metadata.
All three values are validated with required to fail early on missing
configuration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the generic containers range loop in favor of a single main
container referencing top-level values (image, containerPort, env, etc).
tpl calls removed for simplicity. extraContainers appended after the
main container as an escape hatch for sidecars.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split into trafficDrainSeconds (preStop sleep for LB deregistration)
and appShutdownTimeoutSeconds (time the app gets after SIGTERM).
terminationGracePeriodSeconds is auto-calculated as their sum.
Added inline documentation explaining the preStop sleep rationale.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both gracefulShutdown values are required:
- trafficDrainSeconds: preStop sleep waiting for LB deregistration
- appShutdownTimeoutSeconds: time the app gets after SIGTERM
terminationGracePeriodSeconds is auto-calculated as their sum.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Template rendering fails with a clear error message if either probe
is not configured. readinessProbe remains optional.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ration

The chart passes affinity config through as-is via toYaml. Concrete
node affinity rules (e.g. ARM node groups) are injected by Helmfile
settings, keeping the chart vendor-neutral and simple.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove hostNetwork, dnsPolicy, dnsConfig, shareProcessNamespace, and
restartPolicy — none are needed for standard HTTP services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove tpl wrapper for simplicity. Template expressions within values
are not needed for this opinionated chart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the range loop that auto-injected selectorLabels with a simple
toYaml passthrough. Users provide the full constraint spec including
labelSelector, giving more flexibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All three are passthrough fields:
- tolerations: for tainted node groups
- topologySpreadConstraints: for pod spread across topology domains
- extraPodSpec: escape hatch for arbitrary Pod spec fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the private/public split with the original ingresses map.
Private vs public is an infrastructure concern handled by Helmfile,
not the chart. The chart stays vendor-neutral.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the semverCompare branch for policy/v1beta1 which was removed
in Kubernetes 1.25. Use policy/v1 unconditionally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename slime to http-service in hpa, rbac, configmap, secret, test
- Delete cronjob.yaml (not applicable to HTTP services)
- HPA: use autoscaling/v2 unconditionally (v2beta2 removed in k8s 1.26)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tasuku43 and others added 11 commits March 5, 2026 18:05
Replace the boolean enabled flag with a type selector. Default is none
(fixed replicas). HPA and KEDA sections are commented out and become
required when their type is selected. minReplicas/maxReplicas also
commented out since they are only needed when autoscaling is active.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Validates autoscaling.type is one of none/hpa/keda. When type is hpa,
renders HPA with required metrics and minReplicas/maxReplicas. behavior
is optional.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When autoscaling.type is keda, renders a ScaledObject with required
triggers and minReplicaCount/maxReplicaCount. pollingInterval and
cooldownPeriod are optional.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Only set fixed replicas when autoscaling.type is none. When hpa or
keda is active, the autoscaler manages replica count.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fail template rendering if replicas is set while autoscaling.type is
hpa or keda. Update replicas comment to reflect type-based autoscaling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New rollout.yaml template creates a Rollout resource using workloadRef
to reference the existing Deployment. When rollout.enabled is true,
HPA and KEDA scaleTargetRef automatically switch from Deployment to
Rollout. rollout.strategy is required when enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CronJob template was deleted from http-service chart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rewrite existing examples (deployment, hpa, ingress) to match new
values structure (single container, gracefulShutdown, required probes,
datadog tags, type-based autoscaling).

Add new examples for KEDA autoscaling and Argo Rollouts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove CronJob lint/test targets. Add lint targets for KEDA and
Rollout examples. Test target only covers the basic deployment
example (helm test requires a running cluster).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document opinionated defaults, required values, autoscaling types,
Argo Rollouts integration, graceful shutdown design, and examples.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align with existing pod.* namespace convention (pod.annotations,
pod.labels, pod.securityContext).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```yaml
tag: ""
pullPolicy: IfNotPresent

containerPort: 8080 # port exposed by the main container
```

Design decision: single container structure

Replaced slime's containers[] array with a single main container defined at the values root (image, containerPort, resources, etc.). HTTP services have one primary container; sidecars go in extraContainers.

```yaml
# deregistration takes time — during that gap, traffic still arrives.
# The sleep holds the container alive until routing fully stops.
#gracefulShutdown:
#  trafficDrainSeconds: 25 # wait for LB deregistration to complete
```

Design decision: separate drain vs app shutdown

Split into trafficDrainSeconds (preStop sleep waiting for LB deregistration) and appShutdownTimeoutSeconds (time the app gets after SIGTERM). The chart auto-calculates terminationGracePeriodSeconds = drain + app.

A single gracefulShutdownSeconds value would silently reduce the app's actual shutdown time by the preStop sleep duration, which is not the user's intent.
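A sketch of the resulting Pod spec for `trafficDrainSeconds: 25` and `appShutdownTimeoutSeconds: 30` (container name and exact preStop command are illustrative; the arithmetic is the chart's documented behavior):

```yaml
# Hypothetical rendered output for drain=25, appShutdown=30.
spec:
  terminationGracePeriodSeconds: 55   # auto-calculated: 25 + 30
  containers:
    - name: main
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "25"]  # keep serving until LB deregistration completes
```

With this split, the app still gets its full 30 seconds after SIGTERM; a single combined value would have let the preStop sleep eat into that budget.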

```yaml
{{- end }}
spec:
{{- if and (ne .Values.autoscaling.type "none") .Values.replicas }}
{{- fail "replicas should not be set when autoscaling.type is hpa or keda — the autoscaler manages replica count" }}
```

Design decision: replicas vs autoscaling guard

Setting replicas when autoscaling is active (hpa or keda) causes a template error. This prevents a common misconfiguration where a fixed replica count conflicts with an autoscaler.
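In values terms, the guard distinguishes these two configurations (a sketch; the error string is quoted from the template above):

```yaml
# Valid: fixed replica count, autoscaling disabled.
autoscaling:
  type: none
replicas: 3

# Invalid: fails at template time with
# "replicas should not be set when autoscaling.type is hpa or keda — ..."
# autoscaling:
#   type: hpa
# replicas: 3
```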

```yaml
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
workloadRef:
```

Design decision: Rollout uses workloadRef

The Rollout references the Deployment via workloadRef instead of embedding its own Pod template. This avoids duplicating the Pod spec and keeps the Deployment as the single source of truth.
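A sketch of the generated Rollout (resource names are placeholders; the `workloadRef` shape follows the Argo Rollouts API):

```yaml
# Illustrative Rollout referencing the chart's Deployment.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-service
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service   # the Pod template lives only on the Deployment
  strategy: {}         # rollout.strategy is required when rollout.enabled is true
```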

```yaml
maxUnavailable: 1 # allow at most 1 pod unavailable during disruptions
minAvailable: # PDB minAvailable

affinity: {} # affinity (e.g. node affinity for ARM, set via Helmfile settings)
```

Design decision: affinity as passthrough

No auto-generated anti-affinity or node affinity rules. The chart passes values through as-is. Environment-specific settings (e.g. ARM node affinity) are injected at the Helmfile layer, keeping the chart vendor-neutral.
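For instance, an ARM node-affinity rule injected at the Helmfile layer might look like this (a hypothetical values fragment; the chart just passes `.Values.affinity` through with `toYaml`):

```yaml
# Hypothetical Helmfile-layer values — not part of the chart itself.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values: ["arm64"]
```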

```yaml
#service: "" # required when enabled
#version: "" # required when enabled

pod:
```

Design decision: pod.* namespace convention

All Pod-level settings grouped under pod.* (annotations, labels, securityContext, disruptionBudget). Mirrors the deployment.* namespace for Deployment-level settings.
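The resulting layout, sketched as a values fragment (field names taken from this comment and the PDB excerpt; exact defaults are illustrative):

```yaml
deployment:
  annotations: {}
  labels: {}

pod:
  annotations: {}
  labels: {}
  securityContext: {}
  disruptionBudget:
    maxUnavailable: 1   # default per the PDB design above
```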

```yaml
# When enabled, the template auto-generates tags.datadoghq.com/{env,service,version}
# labels on both the Deployment and Pod. Set enabled to false for services
# using alternative observability (e.g. OpenTelemetry).
datadog:
```

Design decision: Datadog enabled toggle

Defaults to true with required validation on env/service/version to prevent tag omission. Can be set to false for services using alternative observability (e.g. OpenTelemetry).
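Concretely, with the toggle on, a values fragment like this (placeholder values) yields the unified service tags:

```yaml
datadog:
  enabled: true        # the default; env/service/version become required
  env: production
  service: my-service
  version: "1.2.3"

# Labels rendered on both Deployment and Pod template metadata:
#   tags.datadoghq.com/env: production
#   tags.datadoghq.com/service: my-service
#   tags.datadoghq.com/version: "1.2.3"
```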

```yaml
# periodSeconds: 5
# failureThreshold: 3

extraContainers: [] # additional sidecar containers (escape hatch)
```

Design decision: escape hatches and no tpl

extraContainers for sidecars and extraPodSpec for arbitrary Pod spec fields. These cover edge cases without adding complexity to the main template.

Also note: this chart does not use tpl anywhere — all values are rendered with plain toYaml to avoid the debugging complexity of nested Go template evaluation.

The type enum check is a general concern, not HPA-specific.
deployment.yaml is always rendered, making it the right place for
global validations alongside the existing replicas guard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```yaml
{{- end }}
spec:
{{- if not (has .Values.autoscaling.type (list "none" "hpa" "keda")) }}
{{- fail "autoscaling.type must be one of: none, hpa, keda" }}
```

Design decision: type-based autoscaling

Using type: none | hpa | keda instead of a boolean enabled. HPA and KEDA are mutually exclusive (KEDA creates its own HPA), so an enum makes invalid states unrepresentable. Invalid type values produce a template error.
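Sketches of the two autoscaled variants, assuming the nested `hpa`/`keda` maps described in the commit messages (metric and trigger values are placeholders):

```yaml
# HPA variant — metrics and minReplicas/maxReplicas are required once type is "hpa".
autoscaling:
  type: hpa
  hpa:
    minReplicas: 2
    maxReplicas: 10
    metrics:
      - type: Resource
        resource:
          name: cpu
          target: { type: Utilization, averageUtilization: 70 }

# KEDA variant — triggers and minReplicaCount/maxReplicaCount are required.
# autoscaling:
#   type: keda
#   keda:
#     minReplicaCount: 2
#     maxReplicaCount: 10
#     triggers:
#       - type: cpu
#         metadata: { type: Utilization, value: "70" }
```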

tasuku43 and others added 2 commits March 6, 2026 18:06
- Replace tpl with plain variable/toYaml in configmap, secret, and
  test templates to eliminate all tpl usage from the chart.
- Fix hpa.yaml required message: behavior is optional (wrapped in
  with), so the message should describe the hpa map, not behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Datadog: validate env/service/version once at the top of
  deployment.yaml, use plain .Values references downstream
- Probes: extract startupProbe/livenessProbe into variables
- ScaledObject: extract $keda and $triggers variables, access
  pollingInterval/cooldownPeriod through $keda
- Rollout: extract $strategy variable
- HPA: already done in previous commit ($hpa, $metrics)

Eliminates redundant required calls and inline required+toYaml
nesting throughout all templates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cw-atkhry (Contributor) left a comment:

LGTM!

@tasuku43 tasuku43 merged commit 38609c4 into master Mar 9, 2026
2 checks passed
@tasuku43 tasuku43 deleted the feature/add-http-service-chart branch March 9, 2026 00:56
