From e0e7880abab92f9841e6b65892444f2ea2db03c2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Tue, 17 Mar 2026 10:35:55 +0300 Subject: [PATCH 01/10] Add KMS TPv2 foundations --- .../kms-encryption-foundations.md | 339 ++++++++++++++---- 1 file changed, 276 insertions(+), 63 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 6f350f4dfb..337e99569d 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -3,20 +3,20 @@ title: kms-encryption-foundations authors: - "@ardaguclu" reviewers: - - "@flavianmissi" - - "@ibihim" + - "@p0lyn0mial" + - "@bertinatto" # for plugin lifecycle + - "@flavianmissi" # for API alignment approvers: - "@benluddy" api-approvers: - "@JoelSpeed" creation-date: 2025-12-03 -last-updated: 2026-01-08 +last-updated: 2026-03-17 tracking-link: - - "https://issues.redhat.com/browse/OCPSTRAT-108" + - "https://redhat.atlassian.net/browse/CNTRLPLANE-243" see-also: - "enhancements/kube-apiserver/encrypting-data-at-datastore-layer.md" - "enhancements/etcd/storage-migration-for-etcd-encryption.md" - - "[encrypt data at rest with KMS](https://github.com/openshift/enhancements/pull/1872)" replaces: - "[KMS Encryption Provider for Etcd Secrets](https://github.com/openshift/enhancements/pull/1682/)" --- @@ -49,10 +49,11 @@ KMS support enables integration with external key management systems where encry ### Non-Goals - Implementing KMS plugins (provided by upstream Kubernetes/vendors) -- KMS plugin deployment/lifecycle management -- KMS plugin health checks (Tech Preview v2) -- Recovery from KMS key loss (separate EP for GA) -- Automatic `key_id` rotation detection (Tech Preview v2) +- KMS plugin deployment/lifecycle management (covered by a separate EP) +- KMS plugin health checks (GA) +- Recovery from KMS key loss (GA) +- Automatic `key_id` rotation 
detection (GA) +- API field definitions for KMS provider configuration in APIServer resource (covered by a separate EP) ## Proposal @@ -65,8 +66,21 @@ Encryption controllers use the static endpoint in EncryptionConfiguration. KMS-t **Tech Preview v2 (Managed Plugin Lifecycle):** -Users specify plugin-specific configuration for managed KMS provider types (e.g. Vault). -From the encryption controllers' perspective, the core logic remains the same; only the tracked fields change. +Users specify plugin-specific configuration for managed KMS provider types (e.g. Vault) via the APIServer resource (API fields covered by a separate EP). +Encryption controllers split the KMS configuration API into two parts stored atomically in encryption key secrets: + +1. `kms-config` — fields for EncryptionConfiguration (apiVersion, name, endpoint, timeout) +2. `kms-sidecar-config` — provider-specific fields for sidecar containers (image, vault-address, listen-address, transit-mount, transit-key, etc.) + +Storing both in the same secret avoids race conditions where EncryptionConfiguration references a KMS plugin whose sidecar configuration is not yet available. + +The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.sock`) to ensure uniqueness among providers. The UDS path is the sole configuration shared between kms-config and kms-sidecar-config. + +keyController performs field-level comparison to determine whether a change requires migration or can be applied in-place: +- Migration-triggering fields (affect KEK): vault-address, vault-namespace, transit-key, transit-mount +- In-place fields (container spec only): e.g., image + +keyController validates referenced credential secrets. If missing, the controller goes degraded and no changes are propagated. **Key changes in library-go:** 1. Add KMS mode constant to encryption state types @@ -74,9 +88,10 @@ From the encryption controllers' perspective, the core logic remains the same; o 3. 
Manage encryption key secrets with KMS configuration (actual keys are stored externally in KMS provider) 4. Detect configuration changes to trigger migration 5. Reuse existing migration controller (no changes needed) - -**Additional Tech Preview v2 capabilities:** -- Poll KMS plugin Status endpoint for health checks and `key_id` changes to detect external key rotation +6. Split KMS configuration into kms-config and kms-sidecar-config (Tech Preview v2) +7. Copy kms-sidecar-config with keyID suffix to encryption-configuration secrets (Tech Preview v2) +8. Field-level comparison to distinguish migration-requiring vs. in-place changes (Tech Preview v2) +9. Credential secret validation with degraded status reporting (Tech Preview v2) ### Workflow Description @@ -86,16 +101,18 @@ From the encryption controllers' perspective, the core logic remains the same; o **KMS** is the external Key Management Service that stores and manages the Key Encryption Key (KEK). -**KMS plugin** is a gRPC service implementing Kubernetes KMS v2 API, running as a static pod on each control plane node. It communicates with the external KMS to encrypt/decrypt data encryption keys (DEKs). +**KMS plugin** is a gRPC service implementing Kubernetes KMS v2 API. In Tech Preview v1, it runs as a static pod on each control plane node. In Tech Preview v2, it runs as a sidecar container alongside the API servers (kube-apiserver, oauth-apiserver, openshift-apiservers) managed by the API server operators. It communicates with the external KMS to encrypt/decrypt data encryption keys. **API server operator** is the OpenShift operator (kube-apiserver-operator, openshift-apiserver-operator, or authentication-operator) managing API server deployments. #### Encryption Controllers **keyController** manages encryption key lifecycle. Creates encryption key secrets in `openshift-config-managed` namespace. For KMS mode, creates secrets storing KMS configuration.
+For Tech Preview v2, also splits configuration into `kms-config` and `kms-sidecar-config`, performs field-level comparison, and validates credential secrets. **stateController** generates EncryptionConfiguration for API server consumption. Implements distributed state machine ensuring all API servers converge to same revision. For KMS mode, generates EncryptionConfiguration using the KMS configuration. +For Tech Preview v2, also copies `kms-sidecar-config` with keyID suffix (e.g., `kms-sidecar-config-1`) to the encryption-configuration secret. **migrationController** orchestrates resource re-encryption. Marks resources as migrated after rewriting in etcd. Works with all encryption modes including KMS. @@ -122,15 +139,15 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire apiVersion: v1 kind: Secret metadata: - name: openshift-kube-apiserver-encryption-1 + name: encryption-key-kube-apiserver-1 namespace: openshift-config-managed annotations: - encryption.apiserver.operator.openshift.io/mode: "kms" + encryption.apiserver.operator.openshift.io/mode: "KMS" data: encryption.apiserver.operator.openshift.io-key: "" # Contains base64-encoded structured data with KMS configuration: # - Tech Preview v1: Static endpoint path (unix:///var/run/kmsplugin/kms.sock) - # - Tech Preview v2: Will also include key_id and other plugin-specific configuration for other kms provider types + # - Tech Preview v2: kms-ec-config and kms-sidecar-config (see Tech Preview v2 section below) ``` 4. stateController generates EncryptionConfiguration using the endpoint: @@ -152,37 +169,172 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire **Note:** Automatic weekly key rotation (used for aescbc/aesgcm) is disabled for KMS since rotation is triggered externally. -#### Variation: KMS Key Rotation (Tech Preview v2) +#### Steps for Enabling KMS Encryption (Tech Preview v2) + +1. 
Cluster admin configures KMS provider in the APIServer resource (API fields covered by a separate EP): + ```yaml + apiVersion: config.openshift.io/v1 + kind: APIServer + spec: + encryption: + type: KMSv2 + # Vault API specific fields + ``` + +2. keyController detects the configuration, splits it into `kms-config` and `kms-sidecar-config`, and creates an encryption key secret: + ```yaml + apiVersion: v1 + kind: Secret + metadata: + name: encryption-key-kube-apiserver-1 + namespace: openshift-config-managed + annotations: + encryption.apiserver.operator.openshift.io/mode: "KMSv2" + type: Opaque + data: + kms-ec-config: + kms-sidecar-config: + ``` + +3. stateController uses `kms-ec-config` to generate the EncryptionConfiguration (with keyID in the endpoint and provider name): + ```yaml + apiVersion: apiserver.config.k8s.io/v1 + kind: EncryptionConfiguration + resources: + - resources: + - secrets + providers: + - kms: + apiVersion: v2 + name: kms-1_secrets + endpoint: unix:///var/run/kmsplugin/kms-1.sock + timeout: 10s + ``` + +4. stateController copies `kms-sidecar-config` with keyID suffix to the encryption-configuration secret: + ```yaml + apiVersion: v1 + kind: Secret + metadata: + name: encryption-config-kube-apiserver-9 + namespace: openshift-kube-apiserver + type: Opaque + data: + encryption-config: + kms-sidecar-config-1: + ``` + +5. The encryption-configuration secret is revisioned, triggering a new rollout. The respective operator configures sidecars accordingly. + +6. migrationController initiates re-encryption (no code changes - works with any mode). + +7. conditionController updates status conditions: `EncryptionInProgress`, then `EncryptionCompleted`. -When external KMS rotates the key internally: +There are no preconditions for enabling KMS for the first time. -1. keyController polls KMS plugin Status endpoint for `key_id`. -2. Compares `key_id` with `key_id` stored in secret `Data` field. -3. 
If `key_id` differs: - - Creates new encryption key secret with new `key_id` - - migrationController automatically triggers re-encryption -4. If `key_id` matches: No action. +#### Variation: Updates Requiring Migration (Tech Preview v2) -> **Note:** API server operators are not privileged and cannot directly communicate with KMS plugins running as static pods on control plane nodes. -> Tech Preview v2 will require introducing a mechanism to poll KMS plugin Status endpoints for `key_id` changes and health monitoring, and expose this information to the operators. +If a field affecting the KEK is changed (**vault-address**, **vault-namespace**, **transit-key**, **transit-mount**), keyController creates a new encryption key secret with the next keyID. -**Two change detection mechanisms:** -- Tracking KMS configuration detects admin configuration changes -- Tracking key_id detects external key rotation +stateController generates an EncryptionConfiguration with both providers — new as write key, old as read key: -#### Variation: Migration Between Encryption Modes +```yaml +apiVersion: apiserver.config.k8s.io/v1 +kind: EncryptionConfiguration +resources: + - resources: + - secrets + providers: + - kms: + apiVersion: v2 + name: kms-2_secrets + endpoint: unix:///var/run/kmsplugin/kms-2.sock + timeout: 10s + - kms: + apiVersion: v2 + name: kms-1_secrets + endpoint: unix:///var/run/kmsplugin/kms-1.sock + timeout: 10s +``` + +stateController copies kms-config and kms-sidecar-config from both encryption key secrets into the encryption-configuration secret: + +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: encryption-config-kube-apiserver-9 + namespace: openshift-kube-apiserver +data: + encryption-config: + kms-sidecar-config-1: + kms-sidecar-config-2: +``` + +Both providers run as separate sidecar containers with different unix domain sockets (kms-1.sock, kms-2.sock). 
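The naming convention visible in the EncryptionConfiguration examples above (one socket and one provider name per keyID) can be illustrated with a small Go sketch. The helper names `socketEndpoint` and `providerName` are hypothetical and are not taken from library-go; only the path and name patterns come from this proposal.

```go
package main

import "fmt"

// kmsSocketDir is the directory used in this proposal's examples.
const kmsSocketDir = "/var/run/kmsplugin"

// socketEndpoint (hypothetical helper) builds the per-key UDS endpoint,
// following the unix:///var/run/kmsplugin/kms-{keyID}.sock convention.
func socketEndpoint(keyID uint64) string {
	return fmt.Sprintf("unix://%s/kms-%d.sock", kmsSocketDir, keyID)
}

// providerName (hypothetical helper) builds the provider name as it appears
// in the examples, e.g. kms-1_secrets for keyID 1 and resource "secrets".
func providerName(keyID uint64, resource string) string {
	return fmt.Sprintf("kms-%d_%s", keyID, resource)
}

func main() {
	// Newest keyID first (write key), followed by the older read key.
	for _, id := range []uint64{2, 1} {
		fmt.Println(providerName(id, "secrets"), socketEndpoint(id))
	}
}
```

Because both values are derived from the keyID alone, two providers with otherwise identical configuration still get distinct sockets and names.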
+ +#### Variation: Updates Not Requiring Migration (Tech Preview v2) + +Fields that only affect the container spec (e.g., image for CVE fixes) do not change the KEK: + +1. keyController updates the existing encryption key secret in-place. No new secret is created. +2. stateController detects the change and triggers a new revision with the updated `kms-sidecar-config`. + +Only the active provider receives the update. Older providers retain their original sidecar configuration as fallback. + +#### Variation: Disabling KMS Encryption (Tech Preview v2) + +When the user sets the encryption mode to identity, keyController creates a new encryption key secret for identity mode. The EncryptionConfiguration contains identity as write provider and the KMS plugin as read provider until migration completes. + +After migration, the unused KMS plugin is removed from EncryptionConfiguration and status conditions notify the admin. Backups encrypted with the previous KMS plugin are not restorable without access to that plugin. This removal mechanism is out of scope in Tech Preview v2. + +#### Variation: Migration from KMS Plugin A to KMS Plugin B (Tech Preview v2) + +keyController creates a new encryption key secret with the new plugin's configuration. stateController generates an EncryptionConfiguration with both providers — new as write key, old as read key. Both run as separate sidecars until migration completes. + +#### Variation: Migration Between KMS and Static Encryption (Tech Preview v2) + +**From KMS to static encryption (aesgcm/aescbc):** +keyController creates a new encryption key secret for the static mode. EncryptionConfiguration contains static as write provider and KMS as read provider until migration completes. The KMS plugin must remain accessible during migration. + +**From static encryption to KMS:** +keyController creates a new encryption key secret with KMS configuration. EncryptionConfiguration contains KMS as write key and static provider as read key. 
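The field-level comparison that separates migration-requiring updates from in-place updates could look roughly like the Go sketch below. It is an illustration under stated assumptions, not the library-go implementation: the flat `map[string]string` representation, the `classify` function, and the `changeKind` type are hypothetical, and treating every field outside the migration-triggering set as in-place is an assumption (the proposal only names `image` as an example of an in-place field).

```go
package main

import (
	"fmt"
	"sort"
)

// changeKind is a hypothetical result type for the comparison.
type changeKind string

const (
	noChange          changeKind = "none"
	inPlace           changeKind = "in-place"
	requiresMigration changeKind = "migration"
)

// migrationFields are the fields this proposal lists as affecting the KEK.
var migrationFields = map[string]bool{
	"vault-address":   true,
	"vault-namespace": true,
	"transit-key":     true,
	"transit-mount":   true,
}

// classify compares two provider configurations field by field and returns
// the kind of change plus the sorted names of the fields that differ.
// Assumption: any changed field not in migrationFields is an in-place change.
func classify(oldCfg, newCfg map[string]string) (changeKind, []string) {
	seen := map[string]bool{}
	var changed []string
	for _, cfg := range []map[string]string{oldCfg, newCfg} {
		for k := range cfg {
			if !seen[k] && oldCfg[k] != newCfg[k] {
				changed = append(changed, k)
			}
			seen[k] = true
		}
	}
	sort.Strings(changed)

	kind := noChange
	for _, k := range changed {
		if migrationFields[k] {
			return requiresMigration, changed
		}
		kind = inPlace
	}
	return kind, changed
}

func main() {
	current := map[string]string{"image": "example/plugin:v1", "transit-key": "key-a"}
	imageBump := map[string]string{"image": "example/plugin:v2", "transit-key": "key-a"}
	newKey := map[string]string{"image": "example/plugin:v1", "transit-key": "key-b"}

	fmt.Println(classify(current, imageBump))
	fmt.Println(classify(current, newKey))
}
```

A single migration-triggering field is enough to classify the whole update as requiring a new encryption key, even when in-place fields change at the same time.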
+ +#### Variation: KMS Plugin A to Identity to KMS Plugin A (Tech Preview v2) + +Even with identical plugin configuration, keyController creates a new encryption key secret with the next keyID (e.g., keyID 3 vs original keyID 1). stateController generates an EncryptionConfiguration with kms-3 as write key, identity and kms-1 as read providers: + +```yaml +apiVersion: apiserver.config.k8s.io/v1 +kind: EncryptionConfiguration +resources: + - resources: + - secrets + providers: + - kms: + apiVersion: v2 + name: kms-3_secrets + endpoint: unix:///var/run/kmsplugin/kms-3.sock + timeout: 10s + - identity: {} + - kms: + apiVersion: v2 + name: kms-1_secrets + endpoint: unix:///var/run/kmsplugin/kms-1.sock + timeout: 10s +``` + +Both KMS providers run as separate sidecar containers without deduplication, maintaining full isolation. -**From aescbc to KMS:** -1. Admin deploys KMS plugin and updates APIServer: `type: KMS` with KMS configuration. -2. keyController creates KMS secret (empty data, with KMS configuration annotation). -3. migrationController re-encrypts resources using external KMS. +#### Preconditions for Configuration Changes (Tech Preview v2) -**From KMS to aescbc:** -1. Admin updates APIServer: `type: aescbc`. -2. keyController creates aescbc secret (with actual key material). -3. migrationController re-encrypts resources using local AES key. +- No preconditions for first-time KMS enablement. +- During write key promotion, keyController will not generate a new encryption key. The in-progress key must complete the full state machine first. +- To fix in-progress configuration (e.g., increase timeout), admin must provide the same KMS configuration. This associates the fix with the existing encryption key. -Migration controller reuses existing logic - no changes required. +#### Variation: KMS Key Rotation + +When a KMS plugin rotates its `key_id` (KEK), this triggers neither a new encryption key secret nor a new revision. 
The mechanism for detecting and handling `key_id` rotation is under evaluation and not covered in this enhancement. ### User Stories @@ -206,6 +358,10 @@ For Tech Preview v1, no new API fields are added to the APIServer resource. Users simply set `encryption.type: KMS` ([EncryptionType](https://github.com/openshift/api/blob/6fb7fdae95fd20a36809d502cfc0e0459550d527/config/v1/types_apiserver.go#L214)) and deploy KMS plugins at the hardcoded endpoint `unix:///var/run/kmsplugin/kms.sock`. Current `KMSConfig` will not be used. +**Tech Preview V2** + +API changes for Tech Preview v2 are covered by a separate EP. This EP assumes the API exists and describes only the encryption controller-side implementation. The API provides provider-specific fields (image, vault-address, vault-namespace, transit-key, transit-mount, etc.) that keyController splits into `kms-config` and `kms-sidecar-config`. + ### Topology Considerations #### Hypershift / Hosted Control Planes @@ -231,15 +387,27 @@ This feature does not depend on the features that are excluded from the OKE prod ### Implementation Details/Notes/Constraints +- Both `kms-config` and `kms-sidecar-config` are stored in the same encryption key secret for atomicity +- keyController uses provider-specific field-level comparison (not simple equality) to determine migration necessity +- UDS path convention: `unix:///var/run/kmsplugin/kms-{keyID}.sock` — keyID appended for uniqueness + ### Risks and Mitigations **Risk: KMS Plugin Unavailable During Controller Sync** - **Impact:** Controllers cannot detect key rotation -- **Mitigation:** No mitigation in Tech Preview. Tech Preview v2 will add health checks and expose it to cluster admin via operator conditions to degrade +- **Mitigation:** No mitigation in Tech Preview. 
GA will add health checks and expose it to cluster admin via operator conditions to degrade + +**Risk: Race Condition Between EncryptionConfiguration and Sidecar Availability (Tech Preview v2)** +- **Impact:** KAS instance broken if sidecar configuration not yet available +- **Mitigation:** Atomic storage of both configs in same encryption key secret -**Risk: etcd Backup Restoration Without KMS Key Access** -- **Impact:** Cannot decrypt data if KMS key deleted/unavailable/expired -- **Mitigation:** No mitigation in Tech Preview. Document KMS key retention requirements. +**Risk: Invalid Credential Secret (Tech Preview v2)** +- **Impact:** KMS plugin cannot authenticate to external KMS +- **Mitigation:** keyController validates and goes degraded; old credentials continue to be used + +**Risk: Configuration Change During Write Key Promotion (Tech Preview v2)** +- **Impact:** Conflict with in-progress state machine +- **Mitigation:** keyController blocks new encryption key generation during promotion ### Drawbacks @@ -258,11 +426,16 @@ This feature does not depend on the features that are excluded from the OKE prod - Explore MOM framework for integration tests in apiserver operators (add tests if it makes sense) **E2E Tests** (v1): -- Migration between identity ↔ KMS +- Migration between identity ↔ KMS **E2E Tests** (v2): - Full cluster with KMS encryption enabled - Migration between encryption modes (aescbc → KMS, KMS → KMS) +- Migration from KMS Plugin A to KMS Plugin B +- In-place update (image change without migration) +- KMS to identity and back to KMS (duplicate provider scenario) +- KMS to static encryption and vice versa +- Invalid credential secret handling (degraded state) - Verify data re-encryption completes ## Graduation Criteria @@ -271,12 +444,20 @@ This feature does not depend on the features that are excluded from the OKE prod None +### Tech Preview v1 -> Tech Preview v2 + +- KMS configuration splitting into kms-config and kms-sidecar-config with 
atomic storage in encryption key secrets +- Multiple concurrent KMS providers during migration with UDS path isolation +- Field-level comparison for migration-requiring vs. in-place configuration changes +- Credential secret validation with degraded status reporting +- All migration scenarios validated (KMS-to-KMS, KMS-to-static, KMS-to-identity-to-KMS) + ### Tech Preview -> GA -- Dynamic `key_id` fetching via KMS plugin Status endpoint - Full support for key rotation, with automated data re-encryption -- Migration support between different KMS providers, with automated data re-encryption - Health check preconditions (block operations when plugin unhealthy) +- Failure mode coverage: loss of access to KMS service (detection + mitigation) +- Failure mode coverage: lost encryption keys (detection + mitigation) - Comprehensive integration and E2E test coverage - Production validation in multiple environments @@ -320,21 +501,24 @@ No special handling required. ### Failure Modes -**KMS Plugin Unavailable:** -- New resource creation fails -- Existing resources readable (if DEKs remain cached in API server memory; cache clears on restart) -- Detection: `KMSPluginDegraded=True` -- Recovery: Plugin restart (automatic or manual) +**Invalid Credential Secret:** +- keyController goes degraded, no changes propagated, old credentials continue to be used +- Detection: `EncryptionControllerDegraded=True` +- Recovery: Create/fix the credential secret; keyController resumes automatically -**Invalid KMS Configuration:** -- Plugin fails to start -- Detection: Plugin container crash loops -- Recovery: Fix APIServer configuration +**Configuration Change During Write Key Promotion:** +- keyController will not generate a new encryption key during promotion +- Admin can fix in-progress config by providing the same KMS configuration (e.g., increase timeout) +- Detection: `EncryptionMigrationControllerProgressing=True` -**Key Rotation Stuck:** -- Migration unable to complete -- Detection: 
`EncryptionMigrationControllerProgressing=True` for extended period -- Recovery: Check migration controller logs, verify KMS health +**Configuration Updates During Migration:** +- # TODO: Fix incorrect configurations. For instance there is a typo in transit-key (which triggers a new key), how cluster-admin can fix it +- Older KMS plugins (read-only providers) cannot be updated; only the active (write) provider can be changed + +**Non-Migration Update Fallback:** +- Only the active provider's sidecar config is updated; older providers retain their original configuration as fallback +- Detection: Revision rollout failure in operator status +- Recovery: Provide corrected configuration via APIServer resource ## Support Procedures @@ -347,11 +531,34 @@ oc get secrets -n openshift-config-managed -l encryption.apiserver.operator.open oc logs -n openshift-kube-apiserver-operator deployment/kube-apiserver-operator | grep -i kms ``` +### Inspecting Encryption Configuration (Tech Preview v2) +```bash +# Check encryption-configuration secrets for sidecar configs +oc get secrets -n openshift-kube-apiserver -l encryption.apiserver.operator.openshift.io/component -o yaml + +# Check encryption key secrets +oc get secrets -n openshift-config-managed -l encryption.apiserver.operator.openshift.io/component=encryption-key -o yaml +``` + ### Disabling KMS Encryption +**Tech Preview v1:** 1. Update APIServer: `spec.encryption.type: "aescbc"` 2. Wait for migration to complete -3. KMS plugin pods removed by operators +3. Manually remove KMS plugin static pods from control plane nodes + +**Tech Preview v2:** +1. Update APIServer: `spec.encryption.type: "aescbc"` (or `identity`) +2. keyController creates a new encryption key secret for the target mode +3. Migration proceeds automatically — KMS remains as read provider until migration completes +4. After migration, encryption controllers notify the cluster admin via status conditions that the KMS plugin can be safely decommissioned (GA) +5. 
Backups encrypted with the previous KMS plugin will not be restorable without access to that plugin + +### Recovering from Invalid KMS Configuration (Tech Preview v2) + +1. Check operator status: `oc get co kube-apiserver -o jsonpath='{.status.conditions}'` +2. If degraded due to missing credential secret: create/fix the secret. keyController resumes automatically. +3. If stuck during write key promotion: provide the same KMS configuration via APIServer resource. **etcd Backup/Restore:** - Before backup: Document KMS configuration, verify key availability @@ -366,9 +573,15 @@ Instead of extending existing controllers, create new KMS-only controllers. **Why not chosen:** - Code duplication (migration logic, state management) -- User confusion (different controllers for different encryption types) - More operational burden (additional monitoring, alerts) +### Alternative: Separate Secrets for EncryptionConfiguration and Sidecar Configuration + +**Why not chosen:** Creates race conditions — EncryptionConfiguration could reference a KMS plugin before sidecar configuration is available. + +### Alternative: Deduplication of KMS Plugin Instances During Migration + +**Why not chosen:** Adds complexity to plugin lifecycle (must detect identical providers), breaks isolation, and complicates rollback scenarios. 
## Infrastructure Needed From 48d4b93cfbed1ab4500ab4479295da195c242b75 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Thu, 2 Apr 2026 08:51:04 +0300 Subject: [PATCH 02/10] Add read key promotion stall design --- enhancements/kube-apiserver/kms-encryption-foundations.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 337e99569d..8d5f8ec1a4 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -177,7 +177,7 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire kind: APIServer spec: encryption: - type: KMSv2 + type: KMS # Vault API specific fields ``` @@ -189,7 +189,7 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire name: encryption-key-kube-apiserver-1 namespace: openshift-config-managed annotations: - encryption.apiserver.operator.openshift.io/mode: "KMSv2" + encryption.apiserver.operator.openshift.io/mode: "KMS" type: Opaque data: kms-ec-config: @@ -512,7 +512,7 @@ No special handling required. - Detection: `EncryptionMigrationControllerProgressing=True` **Configuration Updates During Migration:** -- # TODO: Fix incorrect configurations. For instance there is a typo in transit-key (which triggers a new key), how cluster-admin can fix it +- When a migration-triggering field is misconfigured (e.g., typo in transit-key), the resulting encryption key is deployed but non-functional, and the system cannot recover because the key must complete its cycle. To prevent this, keyController runs pre-flight checks before generating a new encryption key: a pod with the KMS plugin is deployed to verify status and encrypt/decrypt capability. A new encryption key is only generated after pre-flight checks succeed. 
- Older KMS plugins (read-only providers) cannot be updated; only the active (write) provider can be changed **Non-Migration Update Fallback:** From a534616df07573eee0a3aa4efef1a70711140a42 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Wed, 8 Apr 2026 10:31:23 +0300 Subject: [PATCH 03/10] Add missing sections like preflight, cred, configmap --- .../kms-encryption-foundations.md | 56 +++++++++++++------ 1 file changed, 39 insertions(+), 17 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 8d5f8ec1a4..51506c73e8 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -51,7 +51,7 @@ KMS support enables integration with external key management systems where encry - Implementing KMS plugins (provided by upstream Kubernetes/vendors) - KMS plugin deployment/lifecycle management (covered by a separate EP) - KMS plugin health checks (GA) -- Recovery from KMS key loss (GA) +- Recovery from KMS key loss - Automatic `key_id` rotation detection (GA) - API field definitions for KMS provider configuration in APIServer resource (covered by a separate EP) @@ -67,12 +67,14 @@ Encryption controllers use the static endpoint in EncryptionConfiguration. KMS-t **Tech Preview v2 (Managed Plugin Lifecycle):** Users specify plugin-specific configuration for managed KMS provider types (e.g. Vault) via the APIServer resource (API fields covered by a separate EP). -Encryption controllers split the KMS configuration API into two parts stored atomically in encryption key secrets: +Encryption controllers split the KMS configuration API into multiple parts stored atomically in encryption key secrets: 1. `kms-config` — fields for EncryptionConfiguration (apiVersion, name, endpoint, timeout) 2. 
`kms-sidecar-config` — provider-specific fields for sidecar containers (image, vault-address, listen-address, transit-mount, transit-key, etc.) +3. `kms-credentials` — credential data fetched from referenced secrets (e.g., approle credentials from `openshift-config` namespace) +4. `kms-configmap-data` — ConfigMap data needed by KMS plugins (e.g., CA bundles) -Storing both in the same secret avoids race conditions where EncryptionConfiguration references a KMS plugin whose sidecar configuration is not yet available. +Storing all in the same secret avoids race conditions where EncryptionConfiguration references a KMS plugin whose sidecar configuration or credentials are not yet available. The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.sock`) to ensure uniqueness among providers. The UDS path is the sole configuration shared between kms-config and kms-sidecar-config. @@ -82,16 +84,19 @@ keyController performs field-level comparison to determine whether a change requ keyController validates referenced credential secrets. If missing, the controller goes degraded and no changes are propagated. +keyController periodically watches the content of referenced Secrets and ConfigMaps and keeps all active key secrets up to date — not just the latest write key. When referenced data changes (e.g., credential rotation), keyController updates the corresponding encryption key secrets without triggering key rotation or data migration. + **Key changes in library-go:** 1. Add KMS mode constant to encryption state types 2. Track KMS configuration in encryption key secrets 3. Manage encryption key secrets with KMS configuration (actual keys are stored externally in KMS provider) 4. Detect configuration changes to trigger migration 5. Reuse existing migration controller (no changes needed) -6. Split KMS configuration into kms-config and kms-sidecar-config (Tech Preview v2) -7. 
Copy kms-sidecar-config with keyID suffix to encryption-configuration secrets (Tech Preview v2) +6. Split KMS configuration into kms-config, kms-sidecar-config, kms-credentials, and kms-configmap-data (Tech Preview v2) +7. Copy kms-sidecar-config, kms-credentials, and kms-configmap-data with keyID suffix to encryption-configuration secrets (Tech Preview v2) 8. Field-level comparison to distinguish migration-requiring vs. in-place changes (Tech Preview v2) -9. Credential secret validation with degraded status reporting (Tech Preview v2) +9. Credential secret and ConfigMap validation with degraded status reporting (Tech Preview v2) +10. Periodic sync of referenced Secrets and ConfigMaps to all active key secrets (Tech Preview v2) ### Workflow Description @@ -108,11 +113,11 @@ keyController validates referenced credential secrets. If missing, the controlle #### Encryption Controllers **keyController** manages encryption key lifecycle. Creates encryption key secrets in `openshift-config-managed` namespace. For KMS mode, creates secrets storing KMS configuration. -For Tech Preview v2, also splits configuration into `kms-config` and `kms-sidecar-config`, performs field-level comparison, and validates credential secrets. +For Tech Preview v2, also splits configuration into `kms-config`, `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data`, performs field-level comparison, validates credential secrets, and periodically syncs referenced Secrets/ConfigMaps to all active key secrets. **stateController** generates EncryptionConfiguration for API server consumption. Implements distributed state machine ensuring all API servers converge to same revision. For KMS mode, generates EncryptionConfiguration using the KMS configuration. -For Tech Preview v2, also copies `kms-sidecar-config` with keyID suffix (e.g., `kms-sidecar-config-1`) to the encryption-configuration secret. 
+For Tech Preview v2, also copies `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data` with keyID suffix (e.g., `kms-sidecar-config-1`, `kms-credentials-1`, `kms-configmap-data-1`) to the encryption-configuration secret. **migrationController** orchestrates resource re-encryption. Marks resources as migrated after rewriting in etcd. Works with all encryption modes including KMS. @@ -194,6 +199,8 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire data: kms-ec-config: kms-sidecar-config: + kms-credentials: + kms-configmap-data: ``` 3. stateController uses `kms-ec-config` to generate the EncryptionConfiguration (with keyID in the endpoint and provider name): @@ -211,7 +218,7 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire timeout: 10s ``` -4. stateController copies `kms-sidecar-config` with keyID suffix to the encryption-configuration secret: +4. stateController copies `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data` with keyID suffix to the encryption-configuration secret: ```yaml apiVersion: v1 kind: Secret @@ -222,6 +229,8 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire data: encryption-config: kms-sidecar-config-1: + kms-credentials-1: + kms-configmap-data-1: ``` 5. The encryption-configuration secret is revisioned, triggering a new rollout. The respective operator configures sidecars accordingly. 
@@ -257,7 +266,7 @@ resources: timeout: 10s ``` -stateController copies kms-config and kms-sidecar-config from both encryption key secrets into the encryption-configuration secret: +stateController copies kms-sidecar-config, kms-credentials, and kms-configmap-data from both encryption key secrets into the encryption-configuration secret: ```yaml apiVersion: v1 @@ -269,6 +278,10 @@ data: encryption-config: kms-sidecar-config-1: kms-sidecar-config-2: + kms-credentials-1: + kms-credentials-2: + kms-configmap-data-1: + kms-configmap-data-2: ``` Both providers run as separate sidecar containers with different unix domain sockets (kms-1.sock, kms-2.sock). @@ -286,7 +299,7 @@ Only the active provider receives the update. Older providers retain their origi When the user sets the encryption mode to identity, keyController creates a new encryption key secret for identity mode. The EncryptionConfiguration contains identity as write provider and the KMS plugin as read provider until migration completes. -After migration, the unused KMS plugin is removed from EncryptionConfiguration and status conditions notify the admin. Backups encrypted with the previous KMS plugin are not restorable without access to that plugin. This removal mechanism is out of scope in Tech Preview v2. +After migration, the unused KMS plugin is removed from EncryptionConfiguration. This is important because leaving stale providers in EncryptionConfiguration means the API server will continue attempting to connect to the old KMS plugin at startup, blocking readiness if the plugin is no longer available. Status conditions notify the admin that the KMS plugin can be safely decommissioned. Backups encrypted with the previous KMS plugin are not restorable without access to that plugin. The removal mechanism is out of scope in Tech Preview v2. 
#### Variation: Migration from KMS Plugin A to KMS Plugin B (Tech Preview v2) @@ -328,9 +341,18 @@ Both KMS providers run as separate sidecar containers without deduplication, mai #### Preconditions for Configuration Changes (Tech Preview v2) -- No preconditions for first-time KMS enablement. -- During write key promotion, keyController will not generate a new encryption key. The in-progress key must complete the full state machine first. -- To fix in-progress configuration (e.g., increase timeout), admin must provide the same KMS configuration. This associates the fix with the existing encryption key. +**Invariants:** +1. Once an encryption key is generated, it must propagate through the entire state machine. Each key has a monotonically increasing ID that determines provider ordering in the EncryptionConfiguration. +2. Once a write key has been used by a single instance, it must be assumed to have encrypted data. The rollout must finish before proceeding to the next key. +3. The API configuration must resolve to the same encryption key instance. + +**Pre-flight checks:** Before generating a new encryption key for migration-triggering changes, keyController deploys a pod with the KMS plugin to verify status and encrypt/decrypt capability. A new encryption key is only generated after pre-flight checks succeed. This prevents deadlocks where a misconfigured key (e.g., typo in transit-key) is deployed but non-functional, and the system cannot recover because the key must complete its cycle. + +**Blocked operations during promotion:** keyController will not generate a new encryption key while the in-progress key is being promoted. If the admin overwrites the configuration (e.g., switches from KMS1 to KMS2 while KMS1 is still rolling out), the new key is not generated. To fix the in-progress configuration, admin must provide the same KMS configuration — this associates the fix with the existing encryption key. 
+ +**Recovery from incorrect configuration:** +- Migration-triggering fields: prevented by pre-flight checks (misconfiguration is caught before key generation). +- Non-migration fields (e.g., image): admin provides corrected configuration via APIServer resource. A new revision is created; older providers retain their original configuration as fallback. #### Variation: KMS Key Rotation @@ -387,7 +409,7 @@ This feature does not depend on the features that are excluded from the OKE prod ### Implementation Details/Notes/Constraints -- Both `kms-config` and `kms-sidecar-config` are stored in the same encryption key secret for atomicity +- `kms-config`, `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data` are stored in the same encryption key secret for atomicity - keyController uses provider-specific field-level comparison (not simple equality) to determine migration necessity - UDS path convention: `unix:///var/run/kmsplugin/kms-{keyID}.sock` — keyID appended for uniqueness @@ -507,12 +529,12 @@ No special handling required. - Recovery: Create/fix the credential secret; keyController resumes automatically **Configuration Change During Write Key Promotion:** -- keyController will not generate a new encryption key during promotion +- keyController will not generate a new encryption key during promotion — admin cannot overwrite the current configuration with a different provider - Admin can fix in-progress config by providing the same KMS configuration (e.g., increase timeout) - Detection: `EncryptionMigrationControllerProgressing=True` **Configuration Updates During Migration:** -- When a migration-triggering field is misconfigured (e.g., typo in transit-key), the resulting encryption key is deployed but non-functional, and the system cannot recover because the key must complete its cycle. 
To prevent this, keyController runs pre-flight checks before generating a new encryption key: a pod with the KMS plugin is deployed to verify status and encrypt/decrypt capability. A new encryption key is only generated after pre-flight checks succeed. +- Migration-triggering field misconfigurations are prevented by pre-flight checks (see Preconditions section) - Older KMS plugins (read-only providers) cannot be updated; only the active (write) provider can be changed **Non-Migration Update Fallback:** From 24d8f2bc4fccddd57abcf4d9122b1ef288fedc2d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Thu, 9 Apr 2026 10:04:30 +0300 Subject: [PATCH 04/10] Address suggested changes --- .../kms-encryption-foundations.md | 94 ++++++++++--------- 1 file changed, 51 insertions(+), 43 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 51506c73e8..79d3f3603a 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -41,19 +41,36 @@ KMS support enables integration with external key management systems where encry ### Goals +**Tech Preview v1 — Goals:** - Support KMS v2 as a new encryption mode in existing encryption controllers -- Seamless migration between encryption modes (aescbc ↔ KMS, KMS ↔ KMS) - Provider-agnostic implementation with minimal provider-specific code +- Migration between identity ↔ KMS + +**Tech Preview v2 — Goals:** +- Split KMS configuration into kms-encryption-config, kms-provider-config, kms-secret-data, and kms-configmap-data +- Seamless migration between encryption modes (aescbc ↔ KMS, KMS ↔ KMS) +- Field-level comparison to distinguish migration-requiring vs. 
in-place changes +- Pre-flight checks before generating new encryption keys +- Credential/ConfigMap validation with degraded status reporting +- Periodic sync of referenced Secrets and ConfigMaps to all active key secrets +- KMS plugin deployment/lifecycle management (covered by a separate EP) +- API field definitions for KMS provider configuration in APIServer resource (covered by a [separate EP](https://github.com/openshift/enhancements/pull/1954)) + +**Tech Preview v3 — Goals:** +- Report current KMS encryption status to platform users (e.g., active KMS plugins, migration progress) +- Automatic `key_id` rotation detection +- KMS plugin health checks - Feature parity with existing modes (monitoring, migration, key rotation) +- Removal of unused KMS plugins from EncryptionConfiguration after migration completes +- Support updating the KMS timeout field via `unsupportedConfigOverrides` + +**GA — Goals:** +- Failure mode coverage: loss of access to KMS service, lost encryption keys, loss of credentials ### Non-Goals - Implementing KMS plugins (provided by upstream Kubernetes/vendors) -- KMS plugin deployment/lifecycle management (covered by a separate EP) -- KMS plugin health checks (GA) - Recovery from KMS key loss -- Automatic `key_id` rotation detection (GA) -- API field definitions for KMS provider configuration in APIServer resource (covered by a separate EP) ## Proposal @@ -69,22 +86,13 @@ Encryption controllers use the static endpoint in EncryptionConfiguration. KMS-t Users specify plugin-specific configuration for managed KMS provider types (e.g. Vault) via the APIServer resource (API fields covered by a separate EP). Encryption controllers split the KMS configuration API into multiple parts stored atomically in encryption key secrets: -1. `kms-config` — fields for EncryptionConfiguration (apiVersion, name, endpoint, timeout) -2. 
`kms-sidecar-config` — provider-specific fields for sidecar containers (image, vault-address, listen-address, transit-mount, transit-key, etc.) -3. `kms-credentials` — credential data fetched from referenced secrets (e.g., approle credentials from `openshift-config` namespace) -4. `kms-configmap-data` — ConfigMap data needed by KMS plugins (e.g., CA bundles) - -Storing all in the same secret avoids race conditions where EncryptionConfiguration references a KMS plugin whose sidecar configuration or credentials are not yet available. - -The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.sock`) to ensure uniqueness among providers. The UDS path is the sole configuration shared between kms-config and kms-sidecar-config. - -keyController performs field-level comparison to determine whether a change requires migration or can be applied in-place: -- Migration-triggering fields (affect KEK): vault-address, vault-namespace, transit-key, transit-mount -- In-place fields (container spec only): e.g., image - -keyController validates referenced credential secrets. If missing, the controller goes degraded and no changes are propagated. +1. `kms-encryption-config` — structured Kubernetes KMS v2 provider configuration used to generate the EncryptionConfiguration provider entry (apiVersion: v2, name, endpoint, timeout) +2. `kms-provider-config` — serialized `KMSConfig` resource ([config.openshift.io/v1](https://github.com/openshift/api/blob/master/config/v1/types_kmsencryption.go)), giving consumers access to provider-specific configuration (image, vault-address, transit-mount, transit-key, etc.) +3. `kms-secret-data` — content of the referenced Secret (e.g., approle credentials) +4. `kms-configmap-data` — content of the referenced ConfigMap (e.g., CA bundles) -keyController periodically watches the content of referenced Secrets and ConfigMaps and keeps all active key secrets up to date — not just the latest write key. 
When referenced data changes (e.g., credential rotation), keyController updates the corresponding encryption key secrets without triggering key rotation or data migration. +Storing all related data in a single secret ensures consistency and leverages existing revisioning and cleanup mechanisms. +The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.sock`) to ensure uniqueness among providers, enabling KMS-to-KMS migrations with multiple concurrent plugins. **Key changes in library-go:** 1. Add KMS mode constant to encryption state types @@ -92,8 +100,8 @@ keyController periodically watches the content of referenced Secrets and ConfigM 3. Manage encryption key secrets with KMS configuration (actual keys are stored externally in KMS provider) 4. Detect configuration changes to trigger migration 5. Reuse existing migration controller (no changes needed) -6. Split KMS configuration into kms-config, kms-sidecar-config, kms-credentials, and kms-configmap-data (Tech Preview v2) -7. Copy kms-sidecar-config, kms-credentials, and kms-configmap-data with keyID suffix to encryption-configuration secrets (Tech Preview v2) +6. Split KMS configuration into kms-encryption-config, kms-provider-config, kms-secret-data, and kms-configmap-data (Tech Preview v2) +7. Copy kms-provider-config, kms-secret-data, and kms-configmap-data with keyID suffix to encryption-configuration secrets (Tech Preview v2) 8. Field-level comparison to distinguish migration-requiring vs. in-place changes (Tech Preview v2) 9. Credential secret and ConfigMap validation with degraded status reporting (Tech Preview v2) 10. Periodic sync of referenced Secrets and ConfigMaps to all active key secrets (Tech Preview v2) @@ -113,11 +121,11 @@ keyController periodically watches the content of referenced Secrets and ConfigM #### Encryption Controllers **keyController** manages encryption key lifecycle. Creates encryption key secrets in `openshift-config-managed` namespace. 
For KMS mode, creates secrets storing KMS configuration. -For Tech Preview v2, also splits configuration into `kms-config`, `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data`, performs field-level comparison, validates credential secrets, and periodically syncs referenced Secrets/ConfigMaps to all active key secrets. +For Tech Preview v2, also splits configuration into `kms-encryption-config`, `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data`, performs field-level comparison, validates credential secrets, and periodically syncs referenced Secrets/ConfigMaps to all active key secrets. **stateController** generates EncryptionConfiguration for API server consumption. Implements distributed state machine ensuring all API servers converge to same revision. For KMS mode, generates EncryptionConfiguration using the KMS configuration. -For Tech Preview v2, also copies `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data` with keyID suffix (e.g., `kms-sidecar-config-1`, `kms-credentials-1`, `kms-configmap-data-1`) to the encryption-configuration secret. +For Tech Preview v2, also copies `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data` with keyID suffix (e.g., `kms-provider-config-1`, `kms-secret-data-1`, `kms-configmap-data-1`) to the encryption-configuration secret. **migrationController** orchestrates resource re-encryption. Marks resources as migrated after rewriting in etcd. Works with all encryption modes including KMS. 
@@ -149,10 +157,10 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire annotations: encryption.apiserver.operator.openshift.io/mode: "KMS" data: - encryption.apiserver.operator.openshift.io-key: "" + encryption.apiserver.operator.openshift.io-key: "" # Contains base64-encoded structured data with KMS configuration: # - Tech Preview v1: Static endpoint path (unix:///var/run/kmsplugin/kms.sock) - # - Tech Preview v2: kms-ec-config and kms-sidecar-config (see Tech Preview v2 section below) + # - Tech Preview v2: kms-encryption-config and kms-provider-config (see Tech Preview v2 section below) ``` 4. stateController generates EncryptionConfiguration using the endpoint: @@ -186,7 +194,7 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire # Vault API specific fields ``` -2. keyController detects the configuration, splits it into `kms-config` and `kms-sidecar-config`, and creates an encryption key secret: +2. keyController detects the configuration, splits it into `kms-encryption-config`, `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data`, and creates an encryption key secret: ```yaml apiVersion: v1 kind: Secret @@ -197,13 +205,13 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire encryption.apiserver.operator.openshift.io/mode: "KMS" type: Opaque data: - kms-ec-config: - kms-sidecar-config: - kms-credentials: + kms-encryption-config: + kms-provider-config: + kms-secret-data: kms-configmap-data: ``` -3. stateController uses `kms-ec-config` to generate the EncryptionConfiguration (with keyID in the endpoint and provider name): +3. stateController uses `kms-encryption-config` to generate the EncryptionConfiguration (with keyID in the endpoint and provider name): ```yaml apiVersion: apiserver.config.k8s.io/v1 kind: EncryptionConfiguration @@ -218,7 +226,7 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire timeout: 10s ``` -4. 
stateController copies `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data` with keyID suffix to the encryption-configuration secret: +4. stateController copies `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data` with keyID suffix to the encryption-configuration secret: ```yaml apiVersion: v1 kind: Secret @@ -228,8 +236,8 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire type: Opaque data: encryption-config: - kms-sidecar-config-1: - kms-credentials-1: + kms-provider-config-1: + kms-secret-data-1: kms-configmap-data-1: ``` @@ -266,7 +274,7 @@ resources: timeout: 10s ``` -stateController copies kms-sidecar-config, kms-credentials, and kms-configmap-data from both encryption key secrets into the encryption-configuration secret: +stateController copies kms-provider-config, kms-secret-data, and kms-configmap-data from both encryption key secrets into the encryption-configuration secret: ```yaml apiVersion: v1 @@ -276,10 +284,10 @@ metadata: namespace: openshift-kube-apiserver data: encryption-config: - kms-sidecar-config-1: - kms-sidecar-config-2: - kms-credentials-1: - kms-credentials-2: + kms-provider-config-1: + kms-provider-config-2: + kms-secret-data-1: + kms-secret-data-2: kms-configmap-data-1: kms-configmap-data-2: ``` @@ -291,7 +299,7 @@ Both providers run as separate sidecar containers with different unix domain soc Fields that only affect the container spec (e.g., image for CVE fixes) do not change the KEK: 1. keyController updates the existing encryption key secret in-place. No new secret is created. -2. stateController detects the change and triggers a new revision with the updated `kms-sidecar-config`. +2. stateController detects the change and triggers a new revision with the updated `kms-provider-config`. Only the active provider receives the update. Older providers retain their original sidecar configuration as fallback. 
@@ -382,7 +390,7 @@ and deploy KMS plugins at the hardcoded endpoint `unix:///var/run/kmsplugin/kms. **Tech Preview V2** -API changes for Tech Preview v2 are covered by a separate EP. This EP assumes the API exists and describes only the encryption controller-side implementation. The API provides provider-specific fields (image, vault-address, vault-namespace, transit-key, transit-mount, etc.) that keyController splits into `kms-config` and `kms-sidecar-config`. +API changes for Tech Preview v2 are covered by a separate EP. This EP assumes the API exists and describes only the encryption controller-side implementation. The API provides provider-specific fields (image, vault-address, vault-namespace, transit-key, transit-mount, etc.) that keyController splits into `kms-encryption-config` and `kms-provider-config`. ### Topology Considerations @@ -409,7 +417,7 @@ This feature does not depend on the features that are excluded from the OKE prod ### Implementation Details/Notes/Constraints -- `kms-config`, `kms-sidecar-config`, `kms-credentials`, and `kms-configmap-data` are stored in the same encryption key secret for atomicity +- `kms-encryption-config`, `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data` are stored in the same encryption key secret for atomicity - keyController uses provider-specific field-level comparison (not simple equality) to determine migration necessity - UDS path convention: `unix:///var/run/kmsplugin/kms-{keyID}.sock` — keyID appended for uniqueness @@ -468,7 +476,7 @@ None ### Tech Preview v1 -> Tech Preview v2 -- KMS configuration splitting into kms-config and kms-sidecar-config with atomic storage in encryption key secrets +- KMS configuration splitting into kms-encryption-config and kms-provider-config with atomic storage in encryption key secrets - Multiple concurrent KMS providers during migration with UDS path isolation - Field-level comparison for migration-requiring vs. 
in-place configuration changes - Credential secret validation with degraded status reporting From 44e716aa03dc745438b9a95da375d9763b408e2e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Thu, 9 Apr 2026 10:31:13 +0300 Subject: [PATCH 05/10] Add preflight checks in TP v2 goal --- .../kube-apiserver/kms-encryption-foundations.md | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 79d3f3603a..7e2e46ef8b 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -95,16 +95,10 @@ Storing all related data in a single secret ensures consistency and leverages ex The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.sock`) to ensure uniqueness among providers, enabling KMS-to-KMS migrations with multiple concurrent plugins. **Key changes in library-go:** -1. Add KMS mode constant to encryption state types -2. Track KMS configuration in encryption key secrets -3. Manage encryption key secrets with KMS configuration (actual keys are stored externally in KMS provider) -4. Detect configuration changes to trigger migration -5. Reuse existing migration controller (no changes needed) -6. Split KMS configuration into kms-encryption-config, kms-provider-config, kms-secret-data, and kms-configmap-data (Tech Preview v2) -7. Copy kms-provider-config, kms-secret-data, and kms-configmap-data with keyID suffix to encryption-configuration secrets (Tech Preview v2) -8. Field-level comparison to distinguish migration-requiring vs. in-place changes (Tech Preview v2) -9. Credential secret and ConfigMap validation with degraded status reporting (Tech Preview v2) -10. Periodic sync of referenced Secrets and ConfigMaps to all active key secrets (Tech Preview v2) +1. 
Add KMS mode constant and track KMS configuration in encryption key secrets +2. Split configuration into kms-encryption-config, kms-provider-config, kms-secret-data, and kms-configmap-data; copy with keyID suffix to encryption-configuration secrets (Tech Preview v2) +3. Field-level comparison, credential/ConfigMap validation, and periodic sync of referenced resources to all active key secrets (Tech Preview v2) +4. Reuse existing migration controller (no changes needed) ### Workflow Description From 8fad1eb6f877013cca0a9e98a36d6b4a504018e8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Thu, 9 Apr 2026 10:48:49 +0300 Subject: [PATCH 06/10] Update graduation criteria accordingly --- .../kms-encryption-foundations.md | 20 +++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 7e2e46ef8b..bc8ec1793c 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -470,18 +470,26 @@ None ### Tech Preview v1 -> Tech Preview v2 -- KMS configuration splitting into kms-encryption-config and kms-provider-config with atomic storage in encryption key secrets +- KMS configuration splitting into kms-encryption-config, kms-provider-config, kms-secret-data, and kms-configmap-data with atomic storage in encryption key secrets - Multiple concurrent KMS providers during migration with UDS path isolation - Field-level comparison for migration-requiring vs. 
in-place configuration changes -- Credential secret validation with degraded status reporting +- Pre-flight checks before generating new encryption keys +- Credential/ConfigMap validation with degraded status reporting +- Periodic sync of referenced Secrets and ConfigMaps to all active key secrets - All migration scenarios validated (KMS-to-KMS, KMS-to-static, KMS-to-identity-to-KMS) +### Tech Preview v2 -> Tech Preview v3 + +- Report current KMS encryption status to platform users (e.g., active KMS plugins, migration progress) +- Automatic `key_id` rotation detection +- KMS plugin health checks +- Feature parity with existing modes (monitoring, migration, key rotation) +- Removal of unused KMS plugins from EncryptionConfiguration after migration completes +- Support updating the KMS timeout field via `unsupportedConfigOverrides` + ### Tech Preview -> GA -- Full support for key rotation, with automated data re-encryption -- Health check preconditions (block operations when plugin unhealthy) -- Failure mode coverage: loss of access to KMS service (detection + mitigation) -- Failure mode coverage: lost encryption keys (detection + mitigation) +- Failure mode coverage: loss of access to KMS service - Comprehensive integration and E2E test coverage - Production validation in multiple environments From 4e5f0c59207583cd198222ea34372bcb4983ce1e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Mon, 13 Apr 2026 10:11:04 +0300 Subject: [PATCH 07/10] Address requested changes --- .../kms-encryption-foundations.md | 22 ++++++++++++------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index bc8ec1793c..95ffa3f0c5 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -60,17 +60,19 @@ KMS support enables integration with external key 
management systems where encry - Report current KMS encryption status to platform users (e.g., active KMS plugins, migration progress) - Automatic `key_id` rotation detection - KMS plugin health checks -- Feature parity with existing modes (monitoring, migration, key rotation) - Removal of unused KMS plugins from EncryptionConfiguration after migration completes - Support updating the KMS timeout field via `unsupportedConfigOverrides` **GA — Goals:** -- Failure mode coverage: loss of access to KMS service, lost encryption keys, loss of credentials +- Failure mode coverage (detection + mitigation for each): + - Misconfiguration of the KMS plugin + - Loss of access to the KMS service + - Loss of credentials ### Non-Goals - Implementing KMS plugins (provided by upstream Kubernetes/vendors) -- Recovery from KMS key loss +- Recovery from KMS key loss (if the key is deleted externally, recovery is equivalent to bootstrapping the cluster from scratch) ## Proposal @@ -91,7 +93,11 @@ Encryption controllers split the KMS configuration API into multiple parts store 3. `kms-secret-data` — content of the referenced Secret (e.g., approle credentials) 4. `kms-configmap-data` — content of the referenced ConfigMap (e.g., CA bundles) -Storing all related data in a single secret ensures consistency and leverages existing revisioning and cleanup mechanisms. +Storing all related data in a single secret avoids race conditions caused by reading live, independently changing configuration. +In kas-o, the targetConfigController operates on live data and may generate a manifest based on the current sidecar configuration. However, this configuration can change before the RevisionController creates a revision. +As a result, the generated manifest may no longer match the actual configuration state at the time the revision is created. Keeping all dependent configuration in a single secret ensures consistency and guarantees that both controllers operate on the same, atomic snapshot of data. 
+ +Additionally, consolidating the data in a single secret leverages existing revisioning and cleanup mechanisms. The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.sock`) to ensure uniqueness among providers, enabling KMS-to-KMS migrations with multiple concurrent plugins. **Key changes in library-go:** @@ -115,7 +121,7 @@ The keyID is appended to the UDS path (`unix:///var/run/kmsplugin/kms-{keyID}.so #### Encryption Controllers **keyController** manages encryption key lifecycle. Creates encryption key secrets in `openshift-config-managed` namespace. For KMS mode, creates secrets storing KMS configuration. -For Tech Preview v2, also splits configuration into `kms-encryption-config`, `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data`, performs field-level comparison, validates credential secrets, and periodically syncs referenced Secrets/ConfigMaps to all active key secrets. +For Tech Preview v2, also propagates updates from the API configuration, splits configuration into `kms-encryption-config`, `kms-provider-config`, `kms-secret-data`, and `kms-configmap-data`, performs field-level comparison, validates credential secrets, and periodically syncs referenced Secrets/ConfigMaps to all active key secrets. **stateController** generates EncryptionConfiguration for API server consumption. Implements distributed state machine ensuring all API servers converge to same revision. For KMS mode, generates EncryptionConfiguration using the KMS configuration. @@ -241,11 +247,11 @@ To enable the apiservers to access the KMS plugin, the `/var/run/kmsplugin` dire 7. conditionController updates status conditions: `EncryptionInProgress`, then `EncryptionCompleted`. -There are no preconditions for enabling KMS for the first time. +For first-time KMS enablement, keyController runs pre-flight checks by deploying a pod with the KMS plugin to verify status and encrypt/decrypt capability before generating the first encryption key. 
#### Variation: Updates Requiring Migration (Tech Preview v2) -If a field affecting the KEK is changed (**vault-address**, **vault-namespace**, **transit-key**, **transit-mount**), keyController creates a new encryption key secret with the next keyID. +If a field affecting the KEK is changed (**vault-address**, **vault-namespace**, **transit-key**, **transit-mount**), keyController creates a new encryption key secret with the next keyID (see [Preconditions for Configuration Changes](#preconditions-for-configuration-changes-tech-preview-v2) for invariants and pre-flight checks that apply before a new key is generated). stateController generates an EncryptionConfiguration with both providers — new as write key, old as read key: @@ -480,7 +486,7 @@ None ### Tech Preview v2 -> Tech Preview v3 -- Report current KMS encryption status to platform users (e.g., active KMS plugins, migration progress) +- Report current KMS encryption status to platform users (e.g., active KMS plugins) - Automatic `key_id` rotation detection - KMS plugin health checks - Feature parity with existing modes (monitoring, migration, key rotation) From b00d188a6dcb129c2e23975e59e9a61db6723f59 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Mon, 13 Apr 2026 13:19:53 +0300 Subject: [PATCH 08/10] Add key lost consideration --- .../kube-apiserver/kms-encryption-foundations.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md index 95ffa3f0c5..7b128a009b 100644 --- a/enhancements/kube-apiserver/kms-encryption-foundations.md +++ b/enhancements/kube-apiserver/kms-encryption-foundations.md @@ -72,7 +72,7 @@ KMS support enables integration with external key management systems where encry ### Non-Goals - Implementing KMS plugins (provided by upstream Kubernetes/vendors) -- Recovery from KMS key loss (if the key is deleted externally, 
recovery is equivalent to bootstrapping the cluster from scratch)
+- Recovery from KMS key loss (see [KMS Key Loss Considerations](#kms-key-loss-considerations) for details)
 
 ## Proposal
 
@@ -439,6 +439,18 @@ This feature does not depend on the features that are excluded from the OKE prod
 - **Impact:** Conflict with in-progress state machine
 - **Mitigation:** keyController blocks new encryption key generation during promotion
 
+#### KMS Key Loss Considerations
+
+If the KMS key (the KEK used to encrypt the cluster seed, which Kubernetes then uses to generate DEKs for encrypting cluster data) is deleted externally, all encrypted resources in etcd become unreadable.
+
+Recovery from this situation would require deleting all resources that we are unable to decode and then recreating them from scratch. This process is costly and complex to implement (for example, all certificates would need to be reissued, the etcd cluster rebuilt, etc.), and is comparable in effort to implementing a full re-bootstrap. Additionally, the recovery flow would need to be covered by CI tests to catch potential regressions.
+
+Moreover, the platform itself would not be able to recreate resources required by user workloads, since only users have the necessary knowledge about them. In practice, this means users must have their own mechanisms for restoring these resources.
+
+On the Vault side, the key is stored in Vault's Transit secrets engine. By default, keys in Transit have `deletion_allowed` set to `false`. A Vault administrator would need to explicitly change this setting to `true` in order to allow key deletion. In general, standard best practices should be followed. This includes enforcing least-privilege access to sensitive API endpoints, such as those used for key deletion or key configuration updates. It is also recommended to periodically back up keys, so they can be restored if needed.
+
+For these reasons, recovery from KMS key loss is a non-goal of this enhancement.
+
 ### Drawbacks
 
 - Adds complexity to encryption controllers for KMS-specific logic

From b06d684b2ecc47e768fb95b053172e68cdbd5639 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?=
Date: Mon, 13 Apr 2026 14:06:15 +0300
Subject: [PATCH 09/10] Mention about unfinalized data content mechanism

---
 enhancements/kube-apiserver/kms-encryption-foundations.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md
index 7b128a009b..ee336510b5 100644
--- a/enhancements/kube-apiserver/kms-encryption-foundations.md
+++ b/enhancements/kube-apiserver/kms-encryption-foundations.md
@@ -90,8 +90,8 @@ Encryption controllers split the KMS configuration API into multiple parts store
 
 1. `kms-encryption-config` — structured Kubernetes KMS v2 provider configuration used to generate the EncryptionConfiguration provider entry (apiVersion: v2, name, endpoint, timeout)
 2. `kms-provider-config` — serialized `KMSConfig` resource ([config.openshift.io/v1](https://github.com/openshift/api/blob/master/config/v1/types_kmsencryption.go)), giving consumers access to provider-specific configuration (image, vault-address, transit-mount, transit-key, etc.)
-3. `kms-secret-data` — content of the referenced Secret (e.g., approle credentials)
-4. `kms-configmap-data` — content of the referenced ConfigMap (e.g., CA bundles)
+3. `kms-secret-data` — content of the referenced Secret (e.g., approle credentials). The exact mechanism and content are still under experimentation; this EP will be updated once finalized.
+4. `kms-configmap-data` — content of the referenced ConfigMap (e.g., CA bundles). The exact mechanism and content are still under experimentation; this EP will be updated once finalized.
 
 Storing all related data in a single secret avoids race conditions caused by reading live, independently changing configuration.
 In kas-o, the targetConfigController operates on live data and may generate a manifest based on the current sidecar configuration. However, this configuration can change before the RevisionController creates a revision.

From 36c0e741063be46ed4f2ce546a913907182e787e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?=
Date: Mon, 13 Apr 2026 18:00:50 +0300
Subject: [PATCH 10/10] Add secret/configmap reference mechanism

---
 .../kms-encryption-foundations.md | 20 +++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/enhancements/kube-apiserver/kms-encryption-foundations.md b/enhancements/kube-apiserver/kms-encryption-foundations.md
index ee336510b5..be8bdfcb08 100644
--- a/enhancements/kube-apiserver/kms-encryption-foundations.md
+++ b/enhancements/kube-apiserver/kms-encryption-foundations.md
@@ -90,8 +90,24 @@ Encryption controllers split the KMS configuration API into multiple parts store
 
 1. `kms-encryption-config` — structured Kubernetes KMS v2 provider configuration used to generate the EncryptionConfiguration provider entry (apiVersion: v2, name, endpoint, timeout)
 2. `kms-provider-config` — serialized `KMSConfig` resource ([config.openshift.io/v1](https://github.com/openshift/api/blob/master/config/v1/types_kmsencryption.go)), giving consumers access to provider-specific configuration (image, vault-address, transit-mount, transit-key, etc.)
-3. `kms-secret-data` — content of the referenced Secret (e.g., approle credentials). The exact mechanism and content are still under experimentation; this EP will be updated once finalized.
-4. `kms-configmap-data` — content of the referenced ConfigMap (e.g., CA bundles). The exact mechanism and content are still under experimentation; this EP will be updated once finalized.
+3.
`kms-secret-{key}-{keyID}` — individual keys from the referenced Secret are stored as separate entries (e.g., `kms-secret-id-1`, `kms-secret-login-1`, `kms-secret-password-1` for Vault approle credentials)
+4. `kms-configmap-{key}-{keyID}` — individual keys from the referenced ConfigMap are stored as separate entries (e.g., `kms-configmap-ca-1` for CA bundles)
+
+   For example, an encryption-configuration secret with this layout:
+   ```yaml
+   apiVersion: v1
+   kind: Secret
+   metadata:
+     name: encryption-config-kube-apiserver-9
+   data:
+     kms-provider-config-1: |
+       address: bar
+       ...
+     kms-secret-id-1: VALUE
+     kms-secret-login-1: VALUE
+     kms-secret-password-1: VALUE
+     kms-configmap-ca-1: VALUE
+   ```
 
 Storing all related data in a single secret avoids race conditions caused by reading live, independently changing configuration.
 In kas-o, the targetConfigController operates on live data and may generate a manifest based on the current sidecar configuration. However, this configuration can change before the RevisionController creates a revision.
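
The `kms-secret-{key}-{keyID}` and `kms-configmap-{key}-{keyID}` naming introduced in this patch could be produced by a small helper along the following lines. This is only a sketch of the naming scheme: `flattenEntries` is a hypothetical function, not an existing library-go API, and the credential values are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// flattenEntries is a hypothetical helper (an assumption, not library-go code).
// It copies every key of a referenced Secret or ConfigMap into a new map and
// names each entry "<prefix>-<key>-<keyID>", e.g. "kms-secret-login-1".
func flattenEntries(prefix string, keyID uint64, src map[string][]byte) map[string][]byte {
	out := make(map[string][]byte, len(src))
	for key, value := range src {
		out[fmt.Sprintf("%s-%s-%d", prefix, key, keyID)] = value
	}
	return out
}

func main() {
	// Placeholder data standing in for the referenced Secret and ConfigMap.
	secret := map[string][]byte{
		"id":       []byte("VALUE"),
		"login":    []byte("VALUE"),
		"password": []byte("VALUE"),
	}
	configMap := map[string][]byte{"ca": []byte("VALUE")}

	// Merge both sets of entries into the data of a single secret for keyID 1.
	data := flattenEntries("kms-secret", 1, secret)
	for name, value := range flattenEntries("kms-configmap", 1, configMap) {
		data[name] = value
	}

	// Print the entry names in a stable order.
	names := make([]string, 0, len(data))
	for name := range data {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name)
	}
}
```

Because the keyID is part of every entry name, entries that belong to different providers do not collide when they are stored in the same secret.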