New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup DataImportCron jobs on deletion #2088
Cleanup DataImportCron jobs on deletion #2088
Conversation
c145c6c
to
9922d21
Compare
Functional test? |
9922d21
to
248312d
Compare
@@ -471,6 +470,34 @@ func (r *DataImportCronReconciler) cleanup(ctx context.Context, dataImportCron * | |||
return nil | |||
} | |||
|
|||
func (r *DataImportCronReconciler) deleteJobs(ctx context.Context, dataImportCron *cdiv1.DataImportCron) error { | |||
cronJobName := GetCronJobName(dataImportCron) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this not work?
cronJobName := GetCronJobName(dataImportCron)
deletePropagationBackground := metav1.DeletePropagationBackground
cronJob := &v1beta1.CronJob{ObjectMeta: metav1.ObjectMeta{Namespace: r.cdiNamespace, Name: cronJobName}}
if err := r.client.Delete(ctx, cronJob, &client.DeleteOptions{PropagationPolicy: &deletePropagationBackground}); IgnoreNotFound(err) != nil {
return err
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this will delete the current cronjob job, but not the initial job which is not a cronjob one. However, initial job should be deleted by name, so no need to list the jobs. Fixed. Thanks @awels .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good, what about a functional test to make sure everything is gone?
898a069
to
055e08d
Compare
tests/dataimportcron_test.go
Outdated
cron, err = f.CdiClient.CdiV1beta1().DataImportCrons(f.Namespace.Name).Create(context.TODO(), cron, metav1.CreateOptions{}) | ||
Expect(err).ToNot(HaveOccurred()) | ||
|
||
By("Verify jobs were created") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it also create pods associated with those jobs? AFAICT there are 3 levels of objects. Top level DataImportCrons, second level CronJob and Initial Job, and third level the actual pods created by those jobs. So I would expect the test to verify all 3 get created, and that all 3 get deleted when the top level is deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
it is interesting - on the ceph lane the DV actually succeeded, but test failed (maybe it did not finish in time)
|
/retest |
055e08d
to
4f1c6a3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic is fine, just some alternative suggestions on some tests to better convey their intentions.
tests/dataimportcron_test.go
Outdated
|
||
By("Verify initial job created") | ||
initialJobName := controller.GetInitialJobName(cron) | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is more readable
Eventually(func() *corev1.Job {
job, _ := f.K8sClient.BatchV1().Jobs(f.CdiInstallNs).Get(context.TODO(), initialJobName, metav1.GetOptions{})
return job
}, dataImportCronTimeout, pollingInterval).ShouldNot(BeNil())
The intention is to verify the job exists, the eventually func doesn't have to return a bool, it can return anything.
tests/dataimportcron_test.go
Outdated
}, dataImportCronTimeout, pollingInterval).Should(BeTrue()) | ||
|
||
By("Verify initial job pod created") | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative suggestion:
Eventually(func() *corev1.Pod {
pod, _ := utils.FindPodByPrefixOnce(f.K8sClient, f.CdiInstallNs, initialJobName, "")
return pod
}, dataImportCronTimeout, pollingInterval).ShouldNot(BeNil())
tests/dataimportcron_test.go
Outdated
By("Verify cronjob created and has active job") | ||
cronJobName := controller.GetCronJobName(cron) | ||
var jobName string | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative suggestion:
Eventually(func() string {
jobName = ""
cronjob, _ := f.K8sClient.BatchV1beta1().CronJobs(f.CdiInstallNs).Get(context.TODO(), cronJobName, metav1.GetOptions{})
if cronjob != nil && len(cronjob.Status.Active) > 0 {
jobName = cronjob.Status.Active[0].Name
}
return jobName
}, dataImportCronTimeout, pollingInterval).ShouldNot(BeEmpty())
Note we are ignoring errors here, maybe write them to the output to help debug if there are issues.
tests/dataimportcron_test.go
Outdated
}, dataImportCronTimeout, pollingInterval).Should(BeTrue()) | ||
|
||
By("Verify cronjob first job created") | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative:
Eventually(func() *corev1.Job {
job, _ := f.K8sClient.BatchV1().Jobs(f.CdiInstallNs).Get(context.TODO(), jobName, metav1.GetOptions{})
return job
}, dataImportCronTimeout, pollingInterval).ShouldNot(BeNil())
tests/dataimportcron_test.go
Outdated
}, dataImportCronTimeout, pollingInterval).Should(BeTrue()) | ||
|
||
By("Verify cronjob first job pod created") | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative:
Eventually(func() *corev1.Pod {
pod, _ := utils.FindPodByPrefixOnce(f.K8sClient, f.CdiInstallNs, jobName, "")
return pod
}, dataImportCronTimeout, pollingInterval).ShouldNot(BeNil())
}, dataImportCronTimeout, pollingInterval).Should(BeTrue()) | ||
|
||
By("Verify initial job pod deleted") | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative:
Eventually(func() bool {
_, err := utils.FindPodByPrefixOnce(f.K8sClient, f.CdiInstallNs, intialJobName, "")
return errors.IsNotFound(err)
}, dataImportCronTimeout, pollingInterval).Should(BeTrue())
}, dataImportCronTimeout, pollingInterval).Should(BeTrue()) | ||
|
||
By("Verify cronjob first job pod deleted") | ||
Eventually(func() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative:
Eventually(func() bool {
_, err := utils.FindPodByPrefixOnce(f.K8sClient, f.CdiInstallNs, jobName, "")
return errors.IsNotFound(err)
}, dataImportCronTimeout, pollingInterval).Should(BeTrue())
Signed-off-by: Arnon Gilboa <agilboa@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
lets see if our fix to the findPodByPrefix func breaks anything, but looks fine otherwise.
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: awels The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherrypick release-v1.43 |
@arnongilboa: new pull request created: #2100 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Signed-off-by: Arnon Gilboa agilboa@redhat.com
What this PR does / why we need it:
Mostly for the case the job pod is stuck
Pending
(e.g. due to failed MountVolume.SetUp) and not terminated even when deleting its DataImportCron.Warning FailedMount 24s (x8 over 87s) kubelet MountVolume.SetUp failed for volume "cdi-cert-vol" : configmap "no-certs" not found
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Release note: