Set node_readiness_label default to an empty value.
Previously, it was set to lifecycle-status:ready, breaking a
lot of minikube deployments. Also, it was not possible before to run
with this label set to an empty value.

Document the effect of the label in the new section of the
documentation.
alexeyklyukin committed Jan 11, 2018
1 parent 56359d2 commit c9776c4
Showing 4 changed files with 35 additions and 4 deletions.
26 changes: 25 additions & 1 deletion README.md
@@ -32,7 +32,7 @@ This project is currently in active development. It is however already [used int

There is a talk about this project delivered by Josh Berkus on KubeCon 2017: [Kube-native Postgres](https://www.youtube.com/watch?v=Zn1vd7sQ_bc)

Please, report any issues discovered to https://github.com/zalando-incubator/postgres-operator/issues.
Please, report any issues discovered to httnodeps://github.com/zalando-incubator/postgres-operator/issues.

## Running and testing the operator

@@ -148,6 +148,30 @@ spec:
Please be aware that the taint and toleration only ensures that no other pod gets scheduled to a PostgreSQL node
but not that PostgreSQL pods are placed on such a node. This can be achieved by setting a node affinity rule in the ConfigMap.

### Using the operator to minimize the number of failovers during a cluster upgrade

To ensure that each PostgreSQL deployment goes through only one failover when nodes of the cluster are decommissioned
during a cluster upgrade, the operator relies on the `node_readiness_label` parameter. The parameter determines whether
a given node is considered ready to run postgres pods. When a node is both `unschedulable` and lacks the readiness
label, the operator starts moving master pods off it. In addition, the operator sets a `PodDisruptionBudget` for the
postgres pods so that pods with the role `master` are not killed by cluster scale-out operations. Each postgres
pod also has a `nodeAffinity` that keeps it from being scheduled on nodes without the readiness label. Together, these measures
give the following guarantees:

* no postgres pod runs on a node that is not ready
* no postgres master pod is killed during a scale event before the operator has had a chance to move it off the node
* when the operator kills a postgres pod because its node is being decommissioned, the pod does not respawn on another node that is about to be decommissioned.
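
The decision rule described above can be illustrated with a minimal Go sketch. It is not the operator's actual code: the `node` struct is a simplified, hypothetical stand-in for a Kubernetes node object, and the names `hasReadinessLabels` and `shouldMoveMasters` are invented for this example. It also assumes that an empty label set means the feature is disabled.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for a Kubernetes node object.
type node struct {
	Name          string
	Unschedulable bool
	Labels        map[string]string
}

// hasReadinessLabels reports whether every required key/value pair
// is present on the node with exactly the required value.
func hasReadinessLabels(n node, required map[string]string) bool {
	for k, v := range required {
		if got, ok := n.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// shouldMoveMasters mirrors the rule from the text: master pods are moved
// off a node that is both unschedulable and missing the readiness label.
// Assumption: with no readiness labels configured, nothing is ever moved.
func shouldMoveMasters(n node, required map[string]string) bool {
	if len(required) == 0 {
		return false
	}
	return n.Unschedulable && !hasReadinessLabels(n, required)
}

func main() {
	required := map[string]string{"lifecycle-status": "ready"}
	draining := node{Name: "node-a", Unschedulable: true, Labels: map[string]string{}}
	healthy := node{Name: "node-b", Labels: map[string]string{"lifecycle-status": "ready"}}

	for _, n := range []node{draining, healthy} {
		fmt.Printf("%s: move masters = %v\n", n.Name, shouldMoveMasters(n, required))
	}
}
```

Both conditions must hold before masters are moved. A node that is merely cordoned but still carries the readiness label is left alone in this sketch, and so is a schedulable node that lacks the label.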

By default, `node_readiness_label` is set to an empty string, which disables this feature altogether. It can be set to a
string containing one or more key:value pairs, e.g.:
```
node_readiness_label: "lifecycle-status:ready,diagnostic-checks:ok"
```

When multiple labels are set, the operator requires all of them to be present on a node (and set to the specified values) before it considers
the node ready.
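
To make the `key:value,key:value` format concrete, here is a small Go sketch that turns such a string into a map. The function `parseReadinessLabel` is hypothetical and written only for illustration; the operator's own configuration decoding may differ in details such as whitespace handling or error reporting.

```go
package main

import (
	"fmt"
	"strings"
)

// parseReadinessLabel is a hypothetical helper that splits a
// comma-separated list of key:value pairs into a map.
// An empty input yields an empty map, i.e. the feature is disabled.
func parseReadinessLabel(s string) (map[string]string, error) {
	labels := make(map[string]string)
	if strings.TrimSpace(s) == "" {
		return labels, nil
	}
	for _, pair := range strings.Split(s, ",") {
		kv := strings.SplitN(pair, ":", 2)
		if len(kv) != 2 || strings.TrimSpace(kv[0]) == "" {
			return nil, fmt.Errorf("malformed label %q, expected key:value", pair)
		}
		labels[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
	}
	return labels, nil
}

func main() {
	labels, err := parseReadinessLabel("lifecycle-status:ready,diagnostic-checks:ok")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Every entry in the map must match a node label for the node to count as ready.
	for k, v := range labels {
		fmt.Printf("required node label %s=%s\n", k, v)
	}
}
```

Returning an empty map for an empty string keeps the "disabled" case distinguishable by a simple `len(labels) == 0` check.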

#### Custom Pod Environment Variables

It is possible to configure a config map which is used by the Postgres pods as an additional provider for environment variables.
3 changes: 2 additions & 1 deletion manifests/minimal-postgres-manifest.yaml
@@ -1,7 +1,7 @@
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: acid-minimal-cluster
name: acid-minimal-cluster-recent
spec:
teamId: "ACID"
volume:
@@ -21,3 +21,4 @@ spec:
foo: zalando
postgresql:
version: "10"
dockerImage: registry.opensource.zalan.do/acid/spilotest-10:improve-support-for-namespaces-v1
8 changes: 7 additions & 1 deletion pkg/cluster/k8sres.go
@@ -238,6 +238,9 @@ PatroniInitDBParams:

func (c *Cluster) nodeAffinity() *v1.Affinity {
matchExpressions := make([]v1.NodeSelectorRequirement, 0)
if len(c.OpConfig.NodeReadinessLabel) == 0 {
return nil
}
for k, v := range c.OpConfig.NodeReadinessLabel {
matchExpressions = append(matchExpressions, v1.NodeSelectorRequirement{
Key: k,
@@ -431,10 +434,13 @@ func (c *Cluster) generatePodTemplate(
ServiceAccountName: c.OpConfig.ServiceAccountName,
TerminationGracePeriodSeconds: &terminateGracePeriodSeconds,
Containers: []v1.Container{container},
Affinity: c.nodeAffinity(),
Tolerations: c.tolerations(tolerationsSpec),
}

if affinity := c.nodeAffinity(); affinity != nil {
podSpec.Affinity = affinity
}

if c.OpConfig.ScalyrAPIKey != "" && c.OpConfig.ScalyrImage != "" {
podSpec.Containers = append(
podSpec.Containers,
2 changes: 1 addition & 1 deletion pkg/util/config/config.go
@@ -32,7 +32,7 @@ type Resources struct {
DefaultCPULimit string `name:"default_cpu_limit" default:"3"`
DefaultMemoryLimit string `name:"default_memory_limit" default:"1Gi"`
PodEnvironmentConfigMap string `name:"pod_environment_configmap" default:""`
NodeReadinessLabel map[string]string `name:"node_readiness_label" default:"lifecycle-status:ready"`
NodeReadinessLabel map[string]string `name:"node_readiness_label" default:""`
MaxInstances int32 `name:"max_instances" default:"-1"`
MinInstances int32 `name:"min_instances" default:"-1"`
}