9 changes: 9 additions & 0 deletions charts/postgres-operator/crds/operatorconfigurations.yaml
@@ -122,6 +122,15 @@ spec:
users:
type: object
properties:
enable_password_rotation:
type: boolean
default: false
password_rotation_interval:
type: integer
default: 90
password_rotation_user_retention:
type: integer
default: 180
replication_username:
type: string
default: standby
10 changes: 10 additions & 0 deletions charts/postgres-operator/crds/postgresqls.yaml
@@ -551,6 +551,16 @@ spec:
- SUPERUSER
- nosuperuser
- NOSUPERUSER
usersWithSecretRotation:
type: array
nullable: true
items:
type: string
usersWithInPlaceSecretRotation:
type: array
nullable: true
items:
type: string
volume:
type: object
required:
78 changes: 78 additions & 0 deletions docs/administrator.md
@@ -293,6 +293,84 @@ that are aggregated into the K8s [default roles](https://kubernetes.io/docs/refe

For Helm deployments setting `rbac.createAggregateClusterRoles: true` adds these clusterroles to the deployment.

## Password rotation in K8s secrets

If the `enable_password_rotation` option is set to `true` in the configuration,
the operator regularly updates the credentials in the K8s secrets. Rotation
only applies to `LOGIN` roles with an associated secret (manifest roles and
default users from `preparedDatabases`). The following roles are excluded:

1. Infrastructure roles, since their secrets should be rotated by the infrastructure.
2. Team API roles, which connect via OAuth2 and JWT tokens (these roles have no secrets anyway).
3. Database owners, since ownership of database objects cannot be inherited.
4. System users such as the `postgres`, `standby` and `pooler` users.

The rotation interval is set in days with `password_rotation_interval`
(default `90`, minimum `1`). On each rotation the user name and password values
are replaced in the K8s secret. They belong to a newly created user named after
the original role plus the rotation date in YYMMDD format. All privileges are
inherited, meaning that migration scripts should still grant and revoke rights
against the original role. The timestamp of the next rotation is written to the
secret as well. Note that if the rotation interval is decreased, the change is
reflected in a secret only if its next rotation date is more days away than
the new interval.
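
As a rough illustration of the naming and scheduling rules above, the following
Python sketch derives the rotation user name and the next rotation date. The
helper names (`rotation_username`, `next_rotation`, `effective_next_rotation`)
are hypothetical and not part of the operator, and the last function encodes
only one possible reading of the note on decreased intervals.

```python
from datetime import date, timedelta


def rotation_username(role: str, rotation_date: date) -> str:
    # Original role name plus the rotation date in YYMMDD format.
    return role + rotation_date.strftime("%y%m%d")


def next_rotation(rotation_date: date, interval_days: int = 90) -> date:
    # Assumption: the next rotation is due `interval_days` after this one.
    return rotation_date + timedelta(days=interval_days)


def effective_next_rotation(stored_next: date, today: date, interval_days: int) -> date:
    # Assumption: after the interval is decreased, a stored date is only pulled
    # forward if it lies further away than the new interval allows.
    latest_allowed = today + timedelta(days=interval_days)
    return min(stored_next, latest_allowed)


print(rotation_username("foo_user", date(2022, 3, 1)))  # foo_user220301
print(next_rotation(date(2022, 3, 1), 30))              # 2022-03-31
```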

Pods that keep the previous secret values in memory can continue to connect to
the database, since the password of the corresponding user is not replaced.
However, a retention policy for users created by the password rotation
feature can be configured with `password_rotation_user_retention`. The
operator ensures that this period is at least twice as long as the configured
rotation interval, hence the default of `180` days. A rotated user whose
creation date is older than the retention period might not be removed
immediately: the operator only checks whether users can be removed on the next
user rotation. Therefore, you might want to configure the retention to be a
multiple of the rotation interval.
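
The retention rule can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not the operator's implementation: the
function names are hypothetical, the creation date is taken from the YYMMDD
suffix of the rotation user name, and a user counts as expired only when that
date is strictly older than the retention period.

```python
from datetime import date, datetime, timedelta
from typing import Optional


def effective_retention(retention_days: int, interval_days: int) -> int:
    # The retention period is raised to at least twice the rotation interval.
    return max(retention_days, 2 * interval_days)


def rotation_date_of(username: str, role: str) -> Optional[date]:
    # Parse the YYMMDD suffix of a rotation user; None if it does not match.
    if not username.startswith(role):
        return None
    suffix = username[len(role):]
    if len(suffix) != 6 or not suffix.isdigit():
        return None
    try:
        return datetime.strptime(suffix, "%y%m%d").date()
    except ValueError:
        return None


def is_expired(username: str, role: str, today: date, retention_days: int) -> bool:
    # Assumption: expired means created before (today - retention period).
    created = rotation_date_of(username, role)
    return created is not None and created < today - timedelta(days=retention_days)


retention = effective_retention(30, 30)
print(retention)  # 60
print(is_expired("foo_user201031", "foo_user", date(2022, 3, 1), retention))  # True
```

Note that the original role itself (`foo_user`) has no date suffix and is
therefore never treated as a rotation user in this sketch.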

### Password rotation for single users

When enabled in the configuration, password rotation applies to all secrets
apart from the exceptions mentioned above. If you wish to first test rotation
for a single user (or enable it for only a few secrets), you can list the users
in the cluster manifest. The rotation and retention intervals can only be
configured globally.

```
spec:
  usersWithSecretRotation:
  - foo_user
  - bar_reader_user
```

### Password replacement without extra users

For use cases where the secret is only used rarely (think of a `flyway` user
running a migration script on pod start) there is no need to create extra
database users; it is enough to replace only the password in the K8s secret.
This type of rotation cannot be configured globally, only per cluster in the
manifest:

```
spec:
  usersWithInPlaceSecretRotation:
  - flyway
  - bar_owner_user
```

This is the recommended option for enabling rotation in secrets of database
owners, but only if they are not used as application users for regular read
and write operations.
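
Both manifest options can also be set on an existing cluster by patching the
`postgresql` resource. The Python sketch below only builds the patch body;
`build_rotation_patch` is a hypothetical helper and not part of the operator.
A body of this shape could then be sent, for example, with
`patch_namespaced_custom_object` of the Kubernetes Python client.

```python
from typing import Dict, List, Sequence


def build_rotation_patch(
    rotate: Sequence[str] = (),
    rotate_in_place: Sequence[str] = (),
) -> Dict[str, Dict[str, List[str]]]:
    # Hypothetical helper: only the requested keys are added to the spec.
    spec: Dict[str, List[str]] = {}
    if rotate:
        spec["usersWithSecretRotation"] = list(rotate)
    if rotate_in_place:
        spec["usersWithInPlaceSecretRotation"] = list(rotate_in_place)
    return {"spec": spec}


patch = build_rotation_patch(rotate=["foo_user"], rotate_in_place=["flyway"])
print(patch)
```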

### Turning off password rotation

When password rotation is turned off again, the operator checks whether the
`username` value in the secret matches the original username and, if not,
replaces it with the latter. A new password is assigned and the `nextRotation`
field is cleared. A final lookup for child (rotation) users to be removed is
done, but they are only dropped if the retention policy allows for it. This
avoids sudden connection issues in pods which still use credentials of these
users in memory. Remaining child users have to be removed manually, or you can
re-enable password rotation with a smaller interval so they get cleaned up.
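
To verify that a secret has been reset, its base64-encoded `data` fields can be
decoded and compared against the original role name. The sketch below works on
a plain dictionary such as the `data` attribute of a secret read through a K8s
client; the helper names are hypothetical, and treating a missing
`nextRotation` key like an empty value is an assumption.

```python
import base64
from typing import Dict


def decode_secret_data(data: Dict[str, str]) -> Dict[str, str]:
    # Values in a K8s secret's `data` field are base64-encoded strings.
    return {key: base64.b64decode(value).decode("utf-8") for key, value in data.items()}


def rotation_is_off(data: Dict[str, str], original_username: str) -> bool:
    # Rotation counts as switched off when the original username is back
    # and the nextRotation field is empty (or, by assumption, absent).
    decoded = decode_secret_data(data)
    return decoded.get("username") == original_username and decoded.get("nextRotation", "") == ""


example = {
    "username": base64.b64encode(b"foo_user").decode("ascii"),
    "nextRotation": "",
}
print(rotation_is_off(example, "foo_user"))  # True
```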

## Use taints and tolerations for dedicated PostgreSQL nodes

To ensure Postgres pods are running on nodes without any other application pods,
16 changes: 16 additions & 0 deletions docs/reference/cluster_manifest.md
@@ -115,6 +115,22 @@ These parameters are grouped directly under the `spec` key in the manifest.
create the K8s secret in that namespace. The part after the first `.` is
considered to be the user name. Optional.

* **usersWithSecretRotation**
list of users for which credential rotation in K8s secrets is enabled. The
rotation interval can only be configured globally. On each rotation a new user
is added in the database, replacing the `username` value in the secret of
the listed user. Although rotation users inherit all rights from the
original role, keep in mind that ownership is not transferred. See more
details in the [administrator docs](https://github.com/zalando/postgres-operator/blob/master/docs/administrator.md#password-rotation-in-k8s-secrets).

* **usersWithInPlaceSecretRotation**
list of users for which in-place password rotation in K8s secrets is enabled.
The rotation interval can only be configured globally. On each rotation the
password value is replaced in the secret, and the operator applies the new
password in the database, too. Only list users here that rarely connect to the
database, like a flyway user running a migration on pod start. See more
details in the [administrator docs](https://github.com/zalando/postgres-operator/blob/master/docs/administrator.md#password-replacement-without-extra-users).

* **databases**
a map of database names to database owners for the databases that should be
created by the operator. The owner users should already exist on the cluster
22 changes: 22 additions & 0 deletions docs/reference/operator_parameters.md
@@ -174,6 +174,28 @@ under the `users` key.
Postgres username used for replication between instances. The default is
`standby`.

* **enable_password_rotation**
For all `LOGIN` roles that are not database owners the operator can rotate
credentials in the corresponding K8s secrets by replacing the username and
password. This means that a new user is added on each rotation, inheriting
all privileges from the original role. The rotation date (in YYMMDD format)
is appended to the name of the new user. The timestamp of the next rotation
is written to the secret. The default is `false`.

* **password_rotation_interval**
If password rotation is enabled (either from config or cluster manifest) the
interval can be configured with this parameter. The interval is given in days,
which means daily rotation (`1`) is the most frequent interval possible.
Default is `90`.

* **password_rotation_user_retention**
To avoid an ever-growing number of new users due to password rotation the
operator removes the created users again after a certain number of days
has passed. The number can be configured with this parameter. However, the
operator checks that the retention period is at least twice as long as
the rotation interval and raises it to this minimum in case it is not.
Default is `180`.

## Major version upgrades

Parameters configuring automatic major version upgrades. In a
3 changes: 3 additions & 0 deletions e2e/tests/k8s_api.py
@@ -321,6 +321,9 @@ def get_cluster_leader_pod(self, labels='application=spilo,cluster-name=acid-min
    def get_cluster_replica_pod(self, labels='application=spilo,cluster-name=acid-minimal-cluster', namespace='default'):
        return self.get_cluster_pod('replica', labels, namespace)

    def get_secret_data(self, username, clustername='acid-minimal-cluster', namespace='default'):
        return self.api.core_v1.read_namespaced_secret(
                "{}.{}.credentials.postgresql.acid.zalan.do".format(username.replace("_","-"), clustername), namespace).data

class K8sBase:
'''
122 changes: 119 additions & 3 deletions e2e/tests/test_e2e.py
@@ -4,8 +4,9 @@
import timeout_decorator
import os
import yaml
import base64

-from datetime import datetime
+from datetime import datetime, date, timedelta
from kubernetes import client

from tests.k8s_api import K8s
@@ -579,6 +580,7 @@ def verify_role():
"Parameters": None,
"AdminRole": "",
"Origin": 2,
"IsDbOwner": False,
"Deleted": False
})
return True
@@ -600,7 +602,6 @@ def test_lazy_spilo_upgrade(self):
but lets pods run with the old image until they are recreated for
reasons other than operator's activity. That works because the operator
configures stateful sets to use "onDelete" pod update policy.

The test covers:
1) enabling lazy upgrade in existing operator deployment
2) forcing the normal rolling upgrade by changing the operator
@@ -695,7 +696,6 @@ def test_logical_backup_cron_job(self):
Ensure we can (a) create the cron job at user request for a specific PG cluster
(b) update the cluster-wide image for the logical backup pod
(c) delete the job at user request

Limitations:
(a) Does not run the actual batch job because there is no S3 mock to upload backups to
(b) Assumes 'acid-minimal-cluster' exists as defined in setUp
@@ -1056,6 +1056,122 @@ def test_overwrite_pooler_deployment(self):
self.eventuallyEqual(lambda: k8s.count_running_pods("connection-pooler=acid-minimal-cluster-pooler"),
0, "Pooler pods not scaled down")

    @timeout_decorator.timeout(TEST_TIMEOUT_SEC)
    def test_password_rotation(self):
        '''
           Test password rotation and removal of users due to retention policy
        '''
        k8s = self.k8s
        leader = k8s.get_cluster_leader_pod()
        today = date.today()

        # enable in-place password rotation for owner of foo database
        pg_patch_inplace_rotation_for_owner = {
            "spec": {
                "usersWithInPlaceSecretRotation": [
                    "zalando"
                ]
            }
        }
        k8s.api.custom_objects_api.patch_namespaced_custom_object(
            "acid.zalan.do", "v1", "default", "postgresqls", "acid-minimal-cluster", pg_patch_inplace_rotation_for_owner)
        self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"}, "Operator does not get in sync")

        # check if next rotation date was set in secret (default interval is 90 days)
        secret_data = k8s.get_secret_data("zalando")
        next_rotation_timestamp = datetime.fromisoformat(str(base64.b64decode(secret_data["nextRotation"]), 'utf-8'))
        today90days = today+timedelta(days=90)
        self.assertEqual(today90days, next_rotation_timestamp.date(),
                        "Unexpected rotation date in secret of zalando user: expected {}, got {}".format(today90days, next_rotation_timestamp.date()))

        # create fake rotation users that should be removed by operator
        # but have one that would still fit into the retention period
        create_fake_rotation_user = """
            CREATE ROLE foo_user201031 IN ROLE foo_user;
            CREATE ROLE foo_user211031 IN ROLE foo_user;
            CREATE ROLE foo_user"""+(today-timedelta(days=40)).strftime("%y%m%d")+""" IN ROLE foo_user;
        """
        self.query_database(leader.metadata.name, "postgres", create_fake_rotation_user)

        # patch foo_user secret with a rotation date that is already due
        fake_rotation_date = today.isoformat() + ' 00:00:00'
        fake_rotation_date_encoded = base64.b64encode(fake_rotation_date.encode('utf-8'))
        secret_fake_rotation = {
            "data": {
                "nextRotation": str(fake_rotation_date_encoded, 'utf-8'),
            },
        }
        k8s.api.core_v1.patch_namespaced_secret(
            name="foo-user.acid-minimal-cluster.credentials.postgresql.acid.zalan.do",
            namespace="default",
            body=secret_fake_rotation)

        # enable password rotation for all other users (foo_user)
        # this will force a sync of secrets for further assertions
        enable_password_rotation = {
            "data": {
                "enable_password_rotation": "true",
                "password_rotation_interval": "30",
                "password_rotation_user_retention": "30",  # operator should raise this to 60 (2 * interval)
            },
        }
        k8s.update_config(enable_password_rotation)
        self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"},
                             "Operator does not get in sync")

        # check if next rotation date and username have been replaced
        secret_data = k8s.get_secret_data("foo_user")
        secret_username = str(base64.b64decode(secret_data["username"]), 'utf-8')
        next_rotation_timestamp = datetime.fromisoformat(str(base64.b64decode(secret_data["nextRotation"]), 'utf-8'))
        rotation_user = "foo_user"+today.strftime("%y%m%d")
        today30days = today+timedelta(days=30)

        self.assertEqual(rotation_user, secret_username,
                        "Unexpected username in secret of foo_user: expected {}, got {}".format(rotation_user, secret_username))
        self.assertEqual(today30days, next_rotation_timestamp.date(),
                        "Unexpected rotation date in secret of foo_user: expected {}, got {}".format(today30days, next_rotation_timestamp.date()))

        # check if oldest fake rotation users were deleted
        # there should only be foo_user, foo_user+today and foo_user+today-40days
        user_query = """
            SELECT rolname
              FROM pg_catalog.pg_roles
             WHERE rolname LIKE 'foo_user%';
        """
        self.eventuallyEqual(lambda: len(self.query_database(leader.metadata.name, "postgres", user_query)), 3,
            "Found incorrect number of rotation users", 10, 5)

        # disable password rotation for all other users (foo_user)
        # and pick smaller intervals to see if the third fake rotation user is dropped
        disable_password_rotation = {
            "data": {
                "enable_password_rotation": "false",
                "password_rotation_interval": "15",
                "password_rotation_user_retention": "30",  # 2 * rotation interval
            },
        }
        k8s.update_config(disable_password_rotation)
        self.eventuallyEqual(lambda: k8s.get_operator_state(), {"0": "idle"},
                             "Operator does not get in sync")

        # check if username in foo_user secret is reset and nextRotation is cleared
        secret_data = k8s.get_secret_data("foo_user")
        secret_username = str(base64.b64decode(secret_data["username"]), 'utf-8')
        next_rotation_timestamp = str(base64.b64decode(secret_data["nextRotation"]), 'utf-8')
        self.assertEqual("foo_user", secret_username,
                        "Unexpected username in secret of foo_user: expected {}, got {}".format("foo_user", secret_username))
        self.assertEqual('', next_rotation_timestamp,
                        "Unexpected rotation date in secret of foo_user: expected empty string, got {}".format(next_rotation_timestamp))

        # check roles again, there should only be foo_user and foo_user+today
        user_query = """
            SELECT rolname
              FROM pg_catalog.pg_roles
             WHERE rolname LIKE 'foo_user%';
        """
        self.eventuallyEqual(lambda: len(self.query_database(leader.metadata.name, "postgres", user_query)), 2,
            "Found incorrect number of rotation users", 10, 5)

@timeout_decorator.timeout(TEST_TIMEOUT_SEC)
def test_patroni_config_update(self):
'''
3 changes: 3 additions & 0 deletions manifests/complete-postgres-manifest.yaml
@@ -16,6 +16,9 @@ spec:
zalando:
- superuser
- createdb
foo_user: []
# usersWithSecretRotation: ["foo_user"]
# usersWithInPlaceSecretRotation: ["flyway", "bar_owner_user"]
enableMasterLoadBalancer: false
enableReplicaLoadBalancer: false
enableConnectionPooler: false # enable/disable connection pooler deployment
3 changes: 3 additions & 0 deletions manifests/configmap.yaml
@@ -44,6 +44,7 @@ data:
# enable_init_containers: "true"
# enable_lazy_spilo_upgrade: "false"
enable_master_load_balancer: "false"
enable_password_rotation: "false"
enable_pgversion_env_var: "true"
# enable_pod_antiaffinity: "false"
# enable_pod_disruption_budget: "true"
@@ -91,6 +92,8 @@ data:
# pam_configuration: |
# https://info.example.com/oauth2/tokeninfo?access_token= uid realm=/employees
# pam_role_name: zalandos
# password_rotation_interval: "90"
# password_rotation_user_retention: "180"
pdb_name_format: "postgres-{cluster}-pdb"
# pod_antiaffinity_topology_key: "kubernetes.io/hostname"
pod_deletion_wait_timeout: 10m
9 changes: 9 additions & 0 deletions manifests/operatorconfiguration.crd.yaml
@@ -120,6 +120,15 @@ spec:
users:
type: object
properties:
enable_password_rotation:
type: boolean
default: false
password_rotation_interval:
type: integer
default: 90
password_rotation_user_retention:
type: integer
default: 180
replication_username:
type: string
default: standby
3 changes: 3 additions & 0 deletions manifests/postgresql-operator-default-configuration.yaml
@@ -25,6 +25,9 @@ configuration:
# protocol: TCP
workers: 8
users:
enable_password_rotation: false
password_rotation_interval: 90
password_rotation_user_retention: 180
replication_username: standby
super_username: postgres
major_version_upgrade: