-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added role, rolebinding, serviceaccount, and environment variables. #233
Conversation
87e5df5
to
60e57fe
Compare
/assign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. I left a few comments and questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaronfern Thanks for the PR. This PR will now need to be modified to remove cronjob, after out-of-band discussions to introduce trigger-based compaction jobs. Also, please remove the claim logic for the three resources you've introduced, as this is not a necessity, as mentioned here. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the feedback @aaronfern. After having a second look, I added some additional comments.
controllers/etcd_controller.go
Outdated
|
||
if !backupCompactionScheduleFound { | ||
if err == nil { | ||
err = r.Delete(ctx, &cronJob, client.PropagationPolicy(metav1.DeletePropagationForeground)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we can simply remove the deletion of cron jobs because people might have already enabled that feature, so these objects will remain in the cluster after switching to Job
in #235.
In general I have to admit, I don't like that this PR tries to include both: Adding new resource for Backup-Restore and removing the CronJob
behavior. Why?
- It makes it depended on Enabled jobs instead of cronjobs to take care of running compaction. #235.
- It breaks the main branch after merging.
- It makes reviews harder to understand.
- It makes the change history complex.
Not sure why that decision was made Added role, rolebinding, serviceaccount, and environment variables. #233 (review) (/cc @shreyas-s-rao).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@timuthy Unfortunately since we saw the cost impact of cron-schedule-based compaction only after making the release v0.6.0, we came up with the on-demand compaction job approach and decided to release it as a patch on v0.6.0 rather than a new release v0.7.0, to avoid making v0.6.0 a "dead" release (because v0.6.0, though technically usable, is not practically a viable solution).
I agree, we also need to include code to remove any possible cronjob resources in the cluster created previously, which should be ideally handled by #235 .
Agreed that it would make more sense that the switch from cronjob to job happens fully in #235 , while this PR only takes care of adding the RBAC required for backup sidecar's lease updation feature.
@abdasgupta WDYT? Can you add the cronjob removal code and removal of BackupCompactionSchedule
in your PR with a breaking change release note?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I already took care of cron job related codes in my PR. Would add a release note. Thanks.
api/v1alpha1/etcd_types.go
Outdated
@@ -141,9 +141,6 @@ type BackupSpec struct { | |||
// EnableProfiling defines if profiling should be enabled for the etcd-backup-restore-sidecar | |||
// +optional | |||
EnableProfiling *bool `json:"enableProfiling,omitempty"` | |||
// BackupCompactionSchedule defines the cron standard for compacting the snapstore | |||
// +optional | |||
BackupCompactionSchedule *string `json:"backupCompactionSchedule,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a breaking change release note for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please undo this deletion. See previous comment about separation of duties between this PR and #235
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaronfern Please remove the cronjob-removal related changes from your PR, since @abdasgupta will be including those changes in #235 with a breaking change release note. Let's keep this PR only for adding RBAC for backup sidecar to renew snapshot and member leases.
Please also add the relevant tests for sa, role, rb, like ensuring that the sa has read/write permissions on leases but not on other objects, and also ensuring that if the role/rb are not deployed, then sa cannot access leases as well.
@@ -1,241 +0,0 @@ | |||
apiVersion: batch/v1beta1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should ideally be removed in #235. Please undo this deletion here, since we don't want to mix the RBAC for backup sidecar along with cronjob removal. Those are two separate changes.
cc @abdasgupta
@@ -87,11 +87,6 @@ func (in *BackupSpec) DeepCopyInto(out *BackupSpec) { | |||
*out = new(bool) | |||
**out = **in | |||
} | |||
if in.BackupCompactionSchedule != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comment.
api/v1alpha1/etcd_types.go
Outdated
@@ -141,9 +141,6 @@ type BackupSpec struct { | |||
// EnableProfiling defines if profiling should be enabled for the etcd-backup-restore-sidecar | |||
// +optional | |||
EnableProfiling *bool `json:"enableProfiling,omitempty"` | |||
// BackupCompactionSchedule defines the cron standard for compacting the snapstore | |||
// +optional | |||
BackupCompactionSchedule *string `json:"backupCompactionSchedule,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please undo this deletion. See previous comment about separation of duties between this PR and #235
@@ -46,12 +46,6 @@ spec: | |||
backup: | |||
description: BackupSpec defines parametes associated with the full and delta snapshots of etcd | |||
properties: | |||
backupCompactionSchedule: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comment about separation of tasks between this PR and #235
@@ -31,8 +31,8 @@ import ( | |||
"k8s.io/client-go/tools/cache" | |||
|
|||
appsv1 "k8s.io/api/apps/v1" | |||
batchv1 "k8s.io/api/batch/v1beta1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my previous comment about separation of tasks between this PR and #235. Also applies to the other cronjob related changes in this file as well.
@@ -104,7 +104,7 @@ var _ = BeforeSuite(func(done Done) { | |||
Expect(err).NotTo(HaveOccurred()) | |||
|
|||
Expect(err).NotTo(HaveOccurred()) | |||
er, err := NewEtcdReconcilerWithImageVector(mgr, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment about separation of tasks between this PR and #235
controllers/etcd_controller.go
Outdated
@@ -791,142 +797,129 @@ func (r *EtcdReconciler) getStatefulSetFromEtcd(etcd *druidv1alpha1.Etcd, values | |||
return decoded, nil | |||
} | |||
|
|||
func (r *EtcdReconciler) reconcileCronJob(ctx context.Context, logger logr.Logger, etcd *druidv1alpha1.Etcd, values map[string]interface{}) (*batchv1.CronJob, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous comment about separation of tasks between this PR and #235. Also applies to other cronjob-related changes in this file.
/needs rebase |
… allow etcd-backup-restore sidecar to become kubernetes aware
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one nit, otherwise it looks good.
controllers/etcd_controller_test.go
Outdated
@@ -777,8 +804,90 @@ var _ = Describe("Druid", func() { | |||
}) | |||
}) | |||
|
|||
Describe("with etcd resource", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to create a separate test case for the additional resources? I would suggest to add the validation to existing tests, e.g. DescribeTable("when etcd resource is created with backupCompactionSchedule field"
. This will save some testing time. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense
Updated to add role validations to existing tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Added role, rolebinding, serviceaccount, and environment variables to allow etcd-backup-restore sidecar to become kubernetes aware
How to categorize this PR?
/area backup
/area disaster-recovery
/area high-availability
/kind enhancement
What this PR does / why we need it:
As part of making the backup-restore sidecar kubernetes aware, this PR does the following
serviceaccount
,role
, androlebinding
and associates them to the etcd pods, to authenticate it to the API server.backup-restore
pod as environment variablesWhich issue(s) this PR fixes:
Partially fixes gardener/etcd-backup-restore#333
Special notes for your reviewer:
Release note: