etcd operator storage, CRD and certificate issues #75
Comments
How about opening a PR in the etcd-operator repo to make this configurable?
How about an init container in the etcd operator deployment, to ensure that both the azure container and storage account are created before etcd comes up? And moving the backup operator to run as a second container in the etcd operator deployment?
I like to think of the CRD as a global default in every underlay cluster. Our etcd operators shouldn't need cluster-wide access in order to create/delete the CRD.
We could do something like this: run all 3 operators in one pod. One question is whether it's acceptable for etcd-controller to provision storage for itself and create a secret for it (I think yes). The other question is which credential we should use to store all these "backup storage accounts".
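A rough sketch of what that combined deployment could look like, assuming a hypothetical `ensure-storage` image that creates the storage account and blob container (RBAC and service account omitted):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: etcd-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: etcd-operator
  template:
    metadata:
      labels:
        app: etcd-operator
    spec:
      initContainers:
      # Provisions the Azure storage account + container before the
      # operators start; the image and its behaviour are hypothetical.
      - name: ensure-storage
        image: example.azurecr.io/ensure-storage
        envFrom:
        - secretRef:
            name: abs-credentials   # storage account name/key
      containers:
      - name: etcd-operator
        image: quay.io/coreos/etcd-operator:v0.9.2
        command: ["etcd-operator"]
        env:   # the operator reads its own pod metadata
        - name: MY_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      # The same MY_POD_* env is needed for the two containers below
      # (omitted here for brevity).
      - name: etcd-backup-operator
        image: quay.io/coreos/etcd-operator:v0.9.2
        command: ["etcd-backup-operator"]
      - name: etcd-restore-operator
        image: quay.io/coreos/etcd-operator:v0.9.2
        command: ["etcd-restore-operator"]
```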
Agreed. We just need a nice way to manage and lifecycle them too.
The operator is used for AKS. Might be a good topic for the sync call. I'd review the AKS helm charts too.
@pweil- It can be used but we need to agree on how. A sync call would be good. Check the google doc in the email with a small review. Is there any chance I can get access to the azure repo too?
/kind feature
The plan was to use the upstream etcd operators. Here is where we struggle.

First, we need to know the namespace when generating certificates. This is because of
https://github.com/coreos/etcd-operator/blob/70d3bd74960dc7127870a393affffbe1df94728e/pkg/util/etcdutil/member.go#L38-L40
The result is that etcd advertises itself as `name.namespace.svc`, and we need to have this in the certificates.
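For illustration, with the operator's static TLS support the certificates would have to cover those service DNS names. A sketch, where the cluster name `example-etcd` and namespace `myproject` are placeholders:

```yaml
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: example-etcd
  namespace: myproject
spec:
  size: 3
  TLS:
    static:
      member:
        # peer.crt needs SAN *.example-etcd.myproject.svc, since each
        # member advertises <member>.example-etcd.myproject.svc
        peerSecret: etcd-peer-tls
        # server.crt needs to cover *.example-etcd.myproject.svc and
        # the client service name example-etcd-client.myproject.svc
        serverSecret: etcd-server-tls
      operatorSecret: etcd-client-tls
```

This is why the namespace must be known at certificate-generation time.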
Second (and a little bit bigger) is storage. First, the etcd-operator docs and examples online are misleading in multiple places, so we rely on the code:
https://github.com/coreos/etcd-operator/blob/master/pkg/apis/etcd/v1beta2/cluster.go#L137
Upstream issue:
Persistent/Durable etcd cluster coreos/etcd-operator#1323
The idea is that we run etcd in memory and back it up constantly. In a DR situation, if a single pod is alive, the operator will recover. If all pods restart, recovery is done using `etcd-restore-operator`, restoring from backup. For this we need the `etcd-backup` and `etcd-restore` operators.

The backup operator supports 2 backup methods (Azure ABS and AWS S3):
https://github.com/coreos/etcd-operator/blob/master/pkg/apis/etcd/v1beta2/backup_types.go#L19-L28
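Going by the types linked above, an ABS-backed `EtcdBackup` would look roughly like this (names and paths are placeholders):

```yaml
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdBackup
metadata:
  name: example-etcd-backup
spec:
  etcdEndpoints:
  - https://example-etcd-client:2379
  storageType: ABS
  abs:
    # <container>/<blob-path> inside the storage account
    path: etcd-backups/example-etcd.backup
    # secret holding the storage account name and key (see below)
    absSecret: abs-credentials
```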
Configuration is what causes an issue: we need to have a secret with the storage account name and key.
https://github.com/coreos/etcd-operator/blob/master/doc/design/abs_backup.md
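The secret itself, following the key names in the design doc above (values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: abs-credentials
type: Opaque
stringData:
  storage-account: <storage-account-name>
  storage-key: <storage-account-key>
```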
This means the prerequisites are: an Azure storage account, a blob container in it, and a secret holding the account name and key.
We don't want to create a storage account during ARM deployment, as it is not a client-facing configuration artifact. We could use one storage account with multiple containers (one per customer) and inject the credentials from the backend.
The last issue is helm ordering for CRDs:
helm/helm#2994
TL;DR: when helm creates CRDs, it takes some time for the cluster to accept them, so creating CRD resources immediately afterwards fails because the CRD is not yet available.
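One possible workaround is to block until the API server reports the CRD as Established before creating any CRD resources. A sketch, assuming an image with kubectl available:

```yaml
initContainers:
# Waits for the etcd CRD to be accepted so the chart's CRD resources
# don't race the CRD registration itself.
- name: wait-for-etcd-crd
  image: example/kubectl   # placeholder image containing kubectl
  command:
  - kubectl
  - wait
  - --for=condition=established
  - --timeout=120s
  - crd/etcdclusters.etcd.database.coreos.com
```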
In addition, we don't want to manage global CRDs for all users from the user configuration side. If a CRD is deleted, all etcd clusters are deleted too. It looks like we need to manage them outside `azure-helm`, as part of HCP management (see the CRD sketch below).

cc: @jim-minter @Kargakis @pweil-
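For reference, managing the CRDs out-of-band would mean applying manifests like the following; the backup and restore CRDs (`etcdbackups.etcd.database.coreos.com`, `etcdrestores.etcd.database.coreos.com`) follow the same pattern:

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: etcdclusters.etcd.database.coreos.com
spec:
  group: etcd.database.coreos.com
  version: v1beta2
  scope: Namespaced
  names:
    kind: EtcdCluster
    listKind: EtcdClusterList
    plural: etcdclusters
```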