Attune

Safe, in-place Kubernetes pod resource right-sizing. VPA done right.

Attune is a Kubernetes operator that automatically right-sizes pod resource requests and limits using In-Place Pod Resize (beta in Kubernetes 1.33+, alpha with feature gate in 1.32). In-place by default, optional eviction fallback for infeasible resizes, and no HPA conflicts.

Why

Problem	Impact
Average CPU utilization is 8%	Billions wasted industry-wide (CAST AI 2026)
70% cite overprovisioning as #1 cost driver	Resources allocated "just in case" never reclaimed (CNCF 2023)
<1% run VPA fully automated	VPA evicts pods, conflicts with HPA, causes outages (ScaleOps 2026)
In-Place Pod Resize is beta (K8s 1.33+, alpha in 1.32)	The foundation for non-disruptive right-sizing now exists

How It's Different

	VPA	Goldilocks	Attune
Resize method	Evicts pods	No resize (recommend only)	In-place (no restarts)
HPA compatible	No (death spirals)	N/A	Yes (adjusts base, not %)
Safety	Minimal guardrails	N/A	Graduated rollout + auto-revert
Algorithm	Backward-looking histograms	VPA recommender	Time-of-day-aware + burst detection
Production confidence	<1% use automated	N/A	Observe -> Recommend -> Canary (auto-promote) -> Auto

Migrating from VPA? See the step-by-step migration guide for field-by-field mapping, side-by-side YAML, and zero-downtime cutover.

Quick Start

Prerequisites

Kubernetes 1.32+ (1.32 requires enabling the InPlacePodVerticalScaling feature gate; 1.33+ has it enabled by default)
Prometheus (for usage metrics)
Helm 3.16+ or 4.x
cert-manager (for admission webhook TLS; to skip, install with --set webhooks.enabled=false)

Optional GitOps export mode: Recommendations can be written to ConfigMaps instead of (or in addition to) direct resizing. Ideal for ArgoCD/Flux workflows. See the Auto mode guide.

Install

helm install attune oci://ghcr.io/attune-io/charts/attune \
  --namespace attune-system --create-namespace

Also available via OperatorHub.io (OLM) and raw manifests. See the Installation Guide for all options.

Create a Policy

Start in Recommend mode (safe, no changes applied):

apiVersion: attune.io/v1alpha1
kind: AttunePolicy
metadata:
  name: api-services
  namespace: production
spec:
  targetRef:
    kind: Deployment
    selector:
      matchLabels:
        tier: api
  metricsSource:
    prometheus:
      address: http://prometheus-server.monitoring:80
  cpu:
    percentile: 95
    overhead: "20"
    minAllowed: "1m"
    maxAllowed: "4000m"
  memory:
    percentile: 99
    overhead: "30"
    minAllowed: "4Mi"
    maxAllowed: "8Gi"
  updateStrategy:
    type: Recommend

kubectl apply -f policy.yaml

Check Recommendations

kubectl get attunepolicies -n production
# NAME            TYPE        WORKLOADS   RECS   RESIZED   READY   AGE
# api-services    Recommend   3           0      0         False   5m

After enough data accumulates, recommendations appear:

kubectl get attunepolicies -n production
# NAME            TYPE        WORKLOADS   RECS   RESIZED   READY   AGE
# api-services    Recommend   3           3      0         True    2d

kubectl attune recommendations -n production
# NAMESPACE   POLICY        WORKLOAD    CONTAINER  CPU REQ  CPU REC  MEM REQ  MEM REC  CONFIDENCE
# production  api-services  api-server  app        500m     320m     512Mi    384Mi    92.0%
# production  api-services  worker      main       1000m    480m     2Gi      1.2Gi    88.5%
# production  api-services  frontend    nginx      250m     120m     256Mi    180Mi    95.1%

kubectl attune savings -n production
# NAMESPACE   NAME          CPU SAVED  MEMORY SAVED  % SAVED  EST. MONTHLY
# production  api-services  830m       1012Mi        34%      $72.40

Note: minimumDataPoints counts Prometheus range-query samples, not hours. With the default queryStep: 5m, minimumDataPoints: 48 needs about 4 hours of data. If you increase queryStep, the wall-clock time rises too. See the quickstart guide for details.

Effective defaults: Most defaultable policy fields are applied by the controller at reconcile time so that AttuneDefaults and AttuneNamespaceDefaults can override them. Those fields may appear empty in kubectl get attunepolicy -o yaml, but the policy still follows the built-in and inherited runtime behavior unless you override those fields. Use kubectl attune explain -n <namespace> <policy> to inspect the effective values for the key controller-applied defaults and see whether each one came from the policy, a namespace default, a cluster default, or the built-in default.

Upgrading? Review the changelog for breaking changes.

Helm installs: If you use restrictive cluster networking, review the chart's Helm README before installing with networkPolicy.enabled=true (the default). The policy allows webhook, metrics, DNS, API server, and Prometheus egress on networkPolicy.prometheusPort (default 9090).

Upgrade to Canary Mode

Once you trust the recommendations, switch to Canary mode to apply changes to 10% of pods first:

spec:
  updateStrategy:
    type: Canary
    canary:
      percentage: 10
      observationPeriod: 30m
    autoRevert: true

See the examples/ directory for more scenarios: Auto mode, HPA coexistence, cluster-wide defaults, and multi-workload selectors.

kubectl Plugin

A kubectl attune plugin provides quick access to policy status, savings, recommendations, resize history, and recommendation reasoning without raw YAML parsing.

# Install via Krew (recommended)
kubectl krew install attune

# Or build from source
make build-plugin
sudo cp bin/kubectl-attune /usr/local/bin/

# Usage
kubectl attune status -n production
kubectl attune savings -n production
kubectl attune recommendations -n production
kubectl attune history -n production
kubectl attune explain -n production api-services

# All namespaces
kubectl attune status -A

Example output:

NAMESPACE   NAME          TYPE    WORKLOADS  RESIZED  READY       AGE
production  api-services  Canary  3          1        Monitoring  2d

NAMESPACE   POLICY        WORKLOAD    CONTAINER  CPU REQ  CPU REC  MEM REQ  MEM REC  CONFIDENCE
production  api-services  api-server  app        500m     320m     512Mi    384Mi    92.0%

Grafana Dashboard

Helm chart (recommended): Enable grafanaDashboard.enabled: true in your Helm values to auto-provision the dashboard via the Grafana sidecar:

helm upgrade attune oci://ghcr.io/attune-io/charts/attune \
  --set grafanaDashboard.enabled=true

Manual import: The raw JSON is at deploy/grafana/dashboard.json. Import it into Grafana and select your Prometheus data source.

The dashboard includes:

Overview: total resizes, reverts, CPU/memory saved
Resize Operations: resize rate by result, reverts by reason
Recommendations: per-workload CPU/memory recommendations and confidence scores
Operator Health: reconcile latency (p50/p99), Prometheus query duration, query errors

Architecture

┌────────────────────────────────────────────────────┐
│                       attune                       │
│                                                    │
│  Policy         Metrics         Recommender        │
│  Controller ──► Collector ──►  Engine              │
│       │                     (percentile -> margin  │
│       │                      -> confidence ->      │
│       ▼                      bounds clamping)      │
│  Resize         Safety                             │
│  Engine ◄────► Monitor                             │
│  (/resize       (OOMKill, throttle,                │
│   subresource)   restarts, auto-revert)            │
└────────────────────────────────────────────────────┘
         │                    │
         ▼                    ▼
    Kubernetes API       Prometheus
    (Pod /resize)        (usage data)

Features

Safety

Auto-revert: automatically restores original resources on OOMKill, CPU throttle, restart spikes, or pod NotReady.
Graduated rollout: five modes from zero-risk to full automation -- Observe, Recommend, OneShot, Canary, Auto.
Node capacity guard: validates post-resize requests fit within node allocatable before applying changes.
LimitRange/ResourceQuota guard: skips resizes that would violate namespace constraints or exceed quota headroom.
Exponential backoff: cooldown doubles per consecutive revert (capped at 16x). Degraded condition flags workloads needing tuning.

Intelligence

Confidence scaling: conservative when data is sparse, precise as it accumulates. No premature optimization.
Time-of-day awareness: hourly usage profiles ensure recommendations cover peak hours, not just the average.
HPA coexistence: adjusts base resource requests without interfering with HPA's percentage-based scaling. No death spirals.
Always-bounded: resource bounds (minAllowed/maxAllowed) per-policy with safe defaults (CPU: 1m-4000m, Memory: 4Mi-8Gi).

Operations

In-place resize: adjusts CPU and memory on running pods via the K8s 1.32+ /resize subresource. The default path is in-place with no restarts. InPlaceOrRecreate can optionally fall back to eviction when kubelet rejects an in-place resize.
Cost savings estimation: per-workload EstimatedMonthlySavings in status, CLI (kubectl attune savings), and Grafana dashboard.
Scheduled resize windows: restrict resizes to specific time windows and days of the week. Recommendations compute continuously regardless.
Per-cycle budget caps: limit aggregate CPU/memory increases per reconcile cycle, preventing cluster-wide spikes.
Concurrent pod processing: parallel pod resizes within a cycle for reduced latency at scale.

Compatibility

Multi-data-source: Thanos, VictoriaMetrics, Grafana Mimir, managed Prometheus. Bearer token auth, custom headers, TLS.
Prometheus auto-discovery: finds Prometheus via the Operator CRD or well-known service names when no address is configured.
Batch workloads: CronJobs and Jobs for recommend-only right-sizing.
Namespace-scoped defaults: per-namespace AttuneNamespaceDefaults override cluster-scoped defaults for production vs staging.
Conflict detection: warns about VPA, overlapping policies, or active rollouts targeting the same workload.
VPA recommendation consumption: use existing VerticalPodAutoscaler recommendations as an alternative to Prometheus queries via metricsSource.vpa.
SLO-based guardrails: PromQL-based application health checks (latency, error rate) that auto-revert resizes on threshold breach.
GitOps diff command: kubectl attune diff outputs recommendations in diff format for ArgoCD/Flux review workflows.
Initial sizing webhook: set pod resources at creation time based on existing policy recommendations, eliminating the "deploy with bad defaults" gap.
Directional change caps: asymmetric maxIncreasePercent/maxDecreasePercent per resource (memory decreases are riskier than CPU increases).
Memory-from-CPU derivation: memoryFromCpuRatio derives memory from CPU for JVM and heap-bound workloads.
Pause reconciliation: spec.paused: true halts all activity without reverting existing resizes.
Webhook warnings: 13 admission-time warnings for nonsensical config combinations with 31 runtime K8s events and per-policy suppression.

Documentation

Guide	Description
Why Attune?	The problem, why VPA fails, and how in-place resize changes everything
Savings Calculator	Estimate your monthly savings with an interactive calculator
Quickstart	Get running in 5 minutes
Migrating from VPA	Step-by-step VPA replacement
HPA Coexistence	Running alongside HPA
Scaling Guide	Cluster size presets, tuning, and HA deployment
Canary Rollout	Graduated rollout strategy
CLI Reference	kubectl plugin commands
API Reference	CRD specification
Troubleshooting	Common issues and solutions
Examples	Ready-to-use policy manifests
Contributing	Development setup and guidelines
Changelog	Release history and breaking changes
Adopters	Organizations using Attune

License

Apache License 2.0. See LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 1,141 Commits
.github		.github
api/v1alpha1		api/v1alpha1
charts/attune		charts/attune
cmd		cmd
config		config
deploy		deploy
dist		dist
docker		docker
docs		docs
examples		examples
hack		hack
internal		internal
pkg/defaults		pkg/defaults
test		test
.chainsaw.yaml		.chainsaw.yaml
.codecov.yml		.codecov.yml
.dockerignore		.dockerignore
.gitignore		.gitignore
.golangci.yml		.golangci.yml
.goreleaser.yaml		.goreleaser.yaml
.ko.yaml		.ko.yaml
.krew.yaml		.krew.yaml
.release-please-manifest.json		.release-please-manifest.json
ADOPTERS.md		ADOPTERS.md
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
DCO		DCO
Dockerfile		Dockerfile
GEMINI.md		GEMINI.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
SECURITY.md		SECURITY.md
artifacthub-repo.yml		artifacthub-repo.yml
codecov.yml		codecov.yml
go.mod		go.mod
go.sum		go.sum
lychee.toml		lychee.toml
mkdocs.yml		mkdocs.yml
release-please-config.json		release-please-config.json
renovate.json		renovate.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Attune

Why

How It's Different

Quick Start

Prerequisites

Install

Create a Policy

Check Recommendations

Upgrade to Canary Mode

kubectl Plugin

Grafana Dashboard

Architecture

Features

Safety

Intelligence

Operations

Compatibility

Documentation

License

About

Uh oh!

Releases 10

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Attune

Why

How It's Different

Quick Start

Prerequisites

Install

Create a Policy

Check Recommendations

Upgrade to Canary Mode

kubectl Plugin

Grafana Dashboard

Architecture

Features

Safety

Intelligence

Operations

Compatibility

Documentation

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 10

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages