Extension: CloudNativePG #19

@dimityrmirchev

What is this extension about?

This extension integrates CloudNativePG into Gardener-managed Kubernetes clusters by deploying and managing the CloudNativePG operator, which in turn automates the deployment and lifecycle management of production-grade PostgreSQL databases.

CloudNativePG Overview

CloudNativePG is a Kubernetes operator that manages PostgreSQL database clusters natively within Kubernetes. It provides:

  • Automated Operations: Handles deployment, failover, and lifecycle management of PostgreSQL clusters using a primary/standby architecture with streaming replication
  • Native Kubernetes Integration: Works directly with the Kubernetes API server to maintain cluster state (no external dependencies)
  • High Availability: Self-healing capabilities with automatic failover
  • Data Management: Custom persistent volume claim handling for PostgreSQL data storage
  • Security: TLS encryption for in-transit data, certificate integration with cert-manager
  • Disaster Recovery: Continuous backups to object stores and point-in-time recovery (PITR)
  • Observability: Built-in Prometheus exporters and JSON-formatted logging

Why?

The Problem: Relational Data in Cloud-Native Ecosystems

The NeoNephos ecosystem hosts multiple projects that produce or consume data fundamentally unsuitable for etcd. While etcd excels as a distributed key-value store for Kubernetes state, it is explicitly not designed for:

  • Large datasets (by default, etcd limits any single request to 1.5 MiB, and its documentation suggests keeping the total database size under 8 GiB)
  • Complex queries (no SQL, joins, or aggregations)
  • High write throughput (optimized for consistency over write performance)
  • Historical/audit data (no built-in retention policies or time-series support)
  • Relational data models (compliance reports, security findings, identity mappings)

Projects that can make use of PostgreSQL (non-exhaustive list)

The following projects have data requirements that a relational database would serve well:

| Project | Database Need | Data Characteristics |
|---|---|---|
| inventory | Resource & relationship storage | Collects resources from multiple cloud providers (AWS, Azure, GCP), persists them, and maps relationships between resources. Functions as a CMDB for infrastructure tracking and dependency analysis; this is inherently relational data requiring complex graph-like queries. Already uses PostgreSQL. |
| diki | Compliance scan storage | Historical scan results, JSON reports, trend analysis across clusters. Diki generates compliance reports comparing security posture over time; this requires queryable historical storage, not key-value lookups. |
| SPIRE (Zero-Trust Identity) | Identity datastore | SPIRE explicitly supports PostgreSQL as its datastore backend. A Gardener-based zero-trust service would need PostgreSQL for storing SVID registrations, attestation policies, and identity mappings across federated trust domains. |
| Open Delivery Gear | Delivery/compliance metadata | Software bill of materials (SBOM), vulnerability scan results, compliance attestations, artifact metadata. These are inherently relational (components → vulnerabilities → remediations). |
| gardener-falco-extension | Security event storage | Runtime security events, threat detections, audit trails. High-volume time-series data requiring retention policies and complex queries for incident investigation. |
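
To make the "inherently relational" point concrete, here is a minimal sketch of the components → vulnerabilities → remediations chain from the Open Delivery Gear row. It is not taken from any of the listed projects: the table names, columns, and sample rows (including the placeholder CVE IDs) are assumptions made up for illustration, and Python's built-in `sqlite3` stands in for PostgreSQL so the example runs without a server.

```python
import sqlite3

# Hypothetical schema: each vulnerability belongs to a component,
# and each remediation belongs to a vulnerability.
SCHEMA = """
CREATE TABLE components (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    version TEXT NOT NULL
);
CREATE TABLE vulnerabilities (
    id INTEGER PRIMARY KEY,
    component_id INTEGER NOT NULL REFERENCES components(id),
    cve TEXT NOT NULL,
    severity TEXT NOT NULL
);
CREATE TABLE remediations (
    id INTEGER PRIMARY KEY,
    vulnerability_id INTEGER NOT NULL REFERENCES vulnerabilities(id),
    fixed_in TEXT NOT NULL
);
"""


def open_findings(conn):
    """Return (component, cve, severity) for vulnerabilities with no remediation."""
    query = """
        SELECT c.name, v.cve, v.severity
        FROM vulnerabilities AS v
        JOIN components AS c ON c.id = v.component_id
        LEFT JOIN remediations AS r ON r.vulnerability_id = v.id
        WHERE r.id IS NULL
        ORDER BY c.name, v.cve
    """
    return conn.execute(query).fetchall()


# In-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO components VALUES (?, ?, ?)",
    [(1, "libfoo", "1.2.0"), (2, "libbar", "0.9.1")],
)
conn.executemany(
    "INSERT INTO vulnerabilities VALUES (?, ?, ?, ?)",
    [(1, 1, "CVE-0000-0001", "high"), (2, 2, "CVE-0000-0002", "medium")],
)
# Only the libfoo vulnerability has a remediation recorded.
conn.execute("INSERT INTO remediations VALUES (1, 1, '1.2.1')")

print(open_findings(conn))
```

A query like this joins three entities in one statement. With a key-value store such as etcd, the same question would have to be answered by fetching and correlating the records in application code.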

Why Not Use Managed Cloud Databases?

While cloud providers offer managed PostgreSQL (Amazon RDS, Azure Database for PostgreSQL, Google Cloud SQL), there are compelling reasons to run self-managed PostgreSQL via CloudNativePG:

| Concern | Managed Service Limitation | CloudNativePG Advantage |
|---|---|---|
| Digital Sovereignty | Data residency concerns; provider lock-in; limited control over encryption keys | Full control over data location, encryption, and access; no vendor dependency |
| Cost at Scale | Managed databases are expensive, especially for multi-region HA (often 3-5x self-managed costs) | Runs on existing Kubernetes infrastructure; pay only for compute/storage |
| Air-Gapped Environments | Not available in disconnected or restricted networks | Fully self-contained; works in air-gapped deployments |
| Compliance Requirements | May not meet specific regulatory frameworks (e.g., BSI C5, GDPR data processing agreements) | Complete audit trail; customizable security policies |
| Latency | Cross-network calls to managed service | Co-located with workloads in the same cluster |
| Consistent Operations | Different APIs/tools per cloud provider | Unified Kubernetes-native operations across all environments |

The Gardener Extension Value Proposition

By providing CloudNativePG as a Gardener extension, we enable:

  1. Unified Database Infrastructure - Same PostgreSQL deployment pattern across AWS, Azure, GCP, OpenStack, and bare-metal shoots
  2. Operator Deployment Only - The extension deploys and manages the CloudNativePG operator itself, not individual database instances. Teams retain full responsibility for provisioning, scaling, backup, and lifecycle operations of their PostgreSQL clusters using the operator's CRDs (Cluster, Backup, ScheduledBackup). This separation ensures teams have flexibility while benefiting from a standardized operator deployment.
  3. Platform Team Enablement - Central teams define secure defaults; application teams consume databases declaratively
  4. Ecosystem Synergy - Other extensions can depend on this extension for their storage needs
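
To illustrate the split described in item 2, here is a rough sketch of what a team might submit once the operator is running in a shoot. It builds `Cluster` and `ScheduledBackup` resources as plain dictionaries and prints them as JSON, which `kubectl apply -f -` accepts. The resource names, namespace, instance count, storage size, and schedule are placeholder assumptions; the field names follow the CloudNativePG `postgresql.cnpg.io/v1` API as far as I know, so check them against the operator version the extension actually ships.

```python
import json

API_VERSION = "postgresql.cnpg.io/v1"


def cluster_manifest(name, namespace, instances=3, storage_size="10Gi"):
    """Build a minimal CloudNativePG Cluster resource as a plain dict."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # One primary plus (instances - 1) streaming-replication standbys.
            "instances": instances,
            "storage": {"size": storage_size},
        },
    }


def scheduled_backup_manifest(name, namespace, cluster_name,
                              schedule="0 0 2 * * *"):
    """Build a ScheduledBackup that targets an existing Cluster.

    The schedule is a six-field cron expression (seconds first); the
    default here means "every day at 02:00:00".
    """
    return {
        "apiVersion": API_VERSION,
        "kind": "ScheduledBackup",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": schedule,
            "cluster": {"name": cluster_name},
        },
    }


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration.
    cluster = cluster_manifest("inventory-db", "inventory")
    backup = scheduled_backup_manifest(
        "inventory-db-nightly", "inventory", cluster["metadata"]["name"]
    )
    for manifest in (cluster, backup):
        print(json.dumps(manifest, indent=2))
```

The `ScheduledBackup` only produces backups if the target `Cluster` also has a backup destination configured (for example an object store). That part is omitted because it depends on the environment.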

How to categorize this topic?

/area cost
/area open-source
/area storage
/area ipcei

/kind enhancement

/label teamsize/small
