
[epic] API Design for EtcdCluster resource #109

Closed · 6 tasks done
kvaps opened this issue Apr 2, 2024 · 8 comments

@kvaps
Member

kvaps commented Apr 2, 2024

We are going to release the MVP (v0.1.0) and we need a stable spec. Here is the corrected version of the spec from proposal #62 (original author: @sergeyshevch).

I'm going to use this meta-issue to link all the parts needed to implement it:

---
apiVersion: etcd.aenix.io/v1alpha1
kind: EtcdCluster
metadata:
  name: test
  namespace: default
spec:
  replicas: 3
  options: # map[string]string
    election-timeout: "1000"
    max-wals: "5"
    max-snapshots: "5"

  storage:
    emptyDir: {} # core.v1.EmptyDirVolumeSource Ready k8s type
    volumeClaimTemplate:
      metadata:
        labels:
          env: prod
        annotations:
          example.com/annotation: "true"
      spec: # core.v1.PersistentVolumeClaimSpec Ready k8s type
        storageClassName: gp3
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi

  podTemplate:
    metadata:
      labels:
        env: prod
      annotations:
        example.com/annotation: "true"
    spec:
      imagePullSecrets:  # core.v1.LocalObjectReference Ready k8s type
      - name: myregistrykey
      serviceAccountName: default
      affinity: {} # core.v1.Affinity Ready k8s type
      nodeSelector: {} # map[string]string
      tolerations: [] # core.v1.Toleration Ready k8s type
      securityContext: {} # core.v1.PodSecurityContext Ready k8s type
      priorityClassName: "low"
      topologySpreadConstraints: [] # core.v1.TopologySpreadConstraint Ready k8s type
      terminationGracePeriodSeconds: 30 # int64
      schedulerName: "default-scheduler"
      runtimeClassName: "legacy"
      readinessGates: [] # core.v1.PodReadinessGate Ready k8s type
      containers: # []v1.Container
      - name: etcd
        image: "quay.io/coreos/etcd:v3.5.12"
        imagePullPolicy: Always
        resources: # core.v1.ResourceRequirements Ready k8s type
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 200Mi
      volumes: [] # []v1.Volume

  serviceTemplate:
    metadata:
      labels:
        env: prod
      annotations:
        example.com/annotation: "true"
    spec: # core.v1.ServiceSpec Ready k8s type

  podDisruptionBudgetTemplate:
    metadata:
      labels:
        env: prod
      annotations:
        example.com/annotation: "true"
    spec:
      maxUnavailable: 1 # intstr.IntOrString
      minAvailable: 2

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-03-06T18:39:45Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-03-06T18:39:45Z"
    status: "True"
    type: Initialized
@sergeyshevch
Member

sergeyshevch commented Apr 2, 2024

I don't like this idea, nor a podSpec field.

I guess a simple implementation with a flat list of parameters inside the spec would be simpler to understand and use.

Some good examples in my opinion:

The proposed design closely matches the Kubernetes spec, but it looks like a lot of operators prefer a different, simpler approach.

I suggest building a few use cases, like:

  • Simple cluster with a few replicas
  • Cluster with a custom image
  • Cluster with persistent storage
  • Cluster with resources
  • Cluster with all fields specified

Then we can look at both approaches and compare them more fairly.
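
For illustration, a flat-style spec might look roughly like this (only a sketch of the alternative being discussed, not an agreed design; the exact field names here are hypothetical):

apiVersion: etcd.aenix.io/v1alpha1
kind: EtcdCluster
metadata:
  name: test
  namespace: default
spec:
  replicas: 3
  image: "quay.io/coreos/etcd:v3.5.12"
  resources:
    requests:
      cpu: 100m
      memory: 100Mi
    limits:
      cpu: 200m
      memory: 200Mi
  storage:
    storageClassName: gp3
    size: 10Gi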

@sergeyshevch
Member

Also, I guess we need to be aware of the landscape around Kubernetes. Let's imagine this use case:

  • I deploy applications in my org with a common chart
  • I want to add an EtcdCluster to that chart
  • I cannot just map some default container resources from values as-is. I have to re-implement the whole CR spec in my chart, because Helm doesn't support array merging (see the sketch below)
  • I have already run into this with other operators, and it was painful to support
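
To make the array-merging pain point concrete, here is a hypothetical snippet of such a common chart (the names and paths are illustrative, not from any real chart). Helm deep-merges maps in values but replaces lists wholesale, so the chart cannot overlay default resources onto podTemplate.spec.containers; it has to re-declare the whole list:

# values.yaml of the hypothetical common chart
etcd:
  resources:
    requests:
      cpu: 100m
      memory: 100Mi

# templates/etcdcluster.yaml: the entire containers list must be re-declared
# just to inject the default resources, because list items are replaced, not merged
apiVersion: etcd.aenix.io/v1alpha1
kind: EtcdCluster
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: 3
  podTemplate:
    spec:
      containers:
      - name: etcd
        resources:
          {{- toYaml .Values.etcd.resources | nindent 10 }}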

@kvaps
Member Author

kvaps commented Apr 2, 2024

> I guess a simple implementation with a flat list of parameters inside the spec would be simpler to understand and use.

It makes sense; I just don't like the idea of parameters repeating themselves. It's better to have an extendable spec without sugar than sugar without the ability to extend :)

> The proposed design closely matches the Kubernetes spec, but it looks like a lot of operators prefer a different, simpler approach.

I would also mention the ElasticSearch operator and piraeus-operator v2 (the second major version, reworked based on the mistakes of v1); both use podTemplate to define default fields for pods:

> I suggest building a few use cases, like:
>
>   • Simple cluster with a few replicas
>   • Cluster with a custom image
>   • Cluster with persistent storage
>   • Cluster with resources
>   • Cluster with all fields specified
>
> Then we can look at both approaches and compare them more fairly.

Great idea, I will prepare a PR with such examples and we can move the discussion into the review.
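
For instance, under the template-style spec above, the "simple cluster with a few replicas" case could stay very small, assuming the template sections are optional and defaulted by the operator (that defaulting is my assumption here, not something already agreed):

apiVersion: etcd.aenix.io/v1alpha1
kind: EtcdCluster
metadata:
  name: test
  namespace: default
spec:
  replicas: 3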

@lllamnyp
Collaborator

lllamnyp commented Apr 2, 2024

I'm happy with the spec as of the time of this comment.

@relyt0925

One thing I would like to note:
A number of folks I have talked to who run etcd clusters run them purely on nodes with local disk storage. Note that this storage can be wiped when nodes are updated (and is therefore reinitialized back to an empty directory). Currently I see that done as etcd pods with an emptyDir running in clustered mode.

The operator watches the state of the cluster (the etcd pods). If a pod dies or is removed, the operator calls into the etcd cluster to remove the member, then calls back into the cluster to re-create the member, and then schedules the pod associated with that member. Effectively, this allows pods to move between nodes, tolerate failures, and recover automatically.

In the current spec I see we are using StatefulSets only. I don't think using a StatefulSet necessarily precludes this mode, but we need to be able to use local volumes. Additionally, the etcd controller then has to be smart enough to know when the node that an etcd pod/volume was scheduled onto has been fully removed, and to handle recreating the PV/PVC for the local volume so it can move to a different node.

More of a question: although it is not implemented in the controller backend yet (which I think can be extended), does the design support that use case?
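
For reference, under the proposed spec this mode would presumably be expressed through the storage section, either with emptyDir (as in the example above) or with a PVC template bound to a local-volume StorageClass; the class name below is only an assumption for illustration:

storage:
  volumeClaimTemplate:
    spec:
      storageClassName: local-storage # hypothetical StorageClass backed by local volumes
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi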

@relyt0925

Additionally: is there a mode where users can specify their own PKI infrastructure (CA, certs, secrets) for the etcd cluster?

@kvaps
Member Author

kvaps commented Apr 2, 2024

> Is there a mode where users can specify their own PKI infrastructure (CA, certs, secrets) for the etcd cluster?

Yeah, right now there is a suggestion from @lllamnyp to add an additional security section for that, but not in this iteration:

spec:
  security:
    serverTLSSecretRef: # secretRef
      name: server-tls-secret
    clientCertAuth: true # bool
    trustedCAFile: # secretRef
      name: trusted-tls-secret # Client certificates may be signed by a different CA than server certificates
    peerTLSSecretRef: # secretRef
      name: peer-tls-secret # However, the CA for intra-cluster certificates is the same for both incoming and outgoing requests
    peerClientCertAuth: true # bool

The implementation design is discussed in #87.

@Kirill-Garbar
Collaborator

Kirill-Garbar commented Apr 3, 2024

For now, most of the information regarding the auth and security topic is described in issue #76. The PR referred to in the previous comment will include the results of the discussion and the implementation.

@sergeyshevch sergeyshevch pinned this issue Apr 3, 2024
kvaps added a commit that referenced this issue Apr 3, 2024
following up
#109 (comment)

Here are examples for:

- Simple cluster with few replicas
- Cluster with custom image
- Cluster with persistent storage
- Cluster with resources
- Cluster with all field specified

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@kvaps kvaps modified the milestones: v0.1.0, v0.2.0 Apr 11, 2024
@kvaps kvaps closed this as completed Apr 23, 2024
@kvaps kvaps unpinned this issue Apr 24, 2024