Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to disable medusa-purge cron #1292

Closed
wants to merge 8 commits into from

Conversation

ericsmalling
Copy link
Contributor

@ericsmalling ericsmalling commented Apr 19, 2024

This change adds a toggle to disable the medusa-purge cronjob added in v1.12.0.

Specifically, it adds an optional purgeBackups: field to k8ssandraCluster CRD:

  medusa:
    purgeBackups: false
Set to false, the cronjob will not be created
Set to true (or omitted) the cronjob will be created

Fixes part of #1278 (Doesn't completely fix that issue but is a pre-cursor to resolve the bug)

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CHANGELOG.md updated (not required for documentation PRs)
  • CLA Signed: DataStax CLA

@ericsmalling ericsmalling requested a review from a team as a code owner April 19, 2024 11:05
Copy link

No linked issues found. Please add the corresponding issues in the pull request description.
Use GitHub automation to close the issue when a PR is merged

@ericsmalling
Copy link
Contributor Author

ericsmalling commented Apr 19, 2024

Apparently I need to update an admission controller?

validatingwebhookconfiguration.admissionregistration.k8s.io/k8ssandra-operator-validating-webhook-configuration serverside-applied
Error from server: failed to create typed patch object (/k8ssandraclusters.k8ssandra.io; apiextensions.k8s.io/v1, Kind=CustomResourceDefinition): .spec.versions[0].schema.openAPIV3Schema.properties.spec.properties.medusa.properties.purgeBackups: expected map, got &{false}

    suite_test.go:563: failed to deploy k8ssandra-operator
    suite_test.go:474: before test setup failed: exit status 1

@burmanm
Copy link
Contributor

burmanm commented Apr 19, 2024

Apparently I need to update an admission controller?

No, but did you forget to run make manifests generate ? As after running it the CRD yaml file becomes correct.

@ericsmalling ericsmalling force-pushed the cronconfig branch 2 times, most recently from 2c42d6d to c6f57a2 Compare April 19, 2024 16:32
@ericsmalling
Copy link
Contributor Author

Apparently I need to update an admission controller?

No, but did you forget to run make manifests generate ? As after running it the CRD yaml file becomes correct.

Is this documented? I don't know what all is needed for this project I guess 😊

@burmanm
Copy link
Contributor

burmanm commented Apr 19, 2024

I think tests should run it (and you should see that your git repo isn't clean anymore).

It's pretty normal to not edit the YAML file by hand, especially in controller-runtime based projects. But I don't remember if there's a warning to touch it, only a mention that it is a generated file.

@ericsmalling
Copy link
Contributor Author

Any advice on troubleshooting the remaining failures?

@ericsmalling
Copy link
Contributor Author

Looks like a rebase fixed the failing test. The only check failing now is related to the lack of a "fixed" issue. I did mention #1278 but this is more of a partial "fix" until a better solution can address the concerns.

@adejanovski
Copy link
Contributor

Hi @ericsmalling,

thanks for working on this!
I'd like to add one thing that would be needed in this PR, which is modifying an e2e test to verify this new behavior.
You could modify the fixture here and add this new field set to false, and verify that the CronJob was never created by modifying this. You'd probably need to create a checkNoPurgeCronJob as a copy of checkPurgeCronJobExists with a require.Never() block checking the CronJob object absence.

I thought I had posted this comment yesterday but probably didn't hit the button, sorry about that.

@ericsmalling
Copy link
Contributor Author

ericsmalling commented Apr 23, 2024 via email

@venkatsunil5
Copy link

Hi, We are running into same issue. Was this solution approved and when will this be merged?

@ericsmalling
Copy link
Contributor Author

ericsmalling commented May 6, 2024 via email

@ericsmalling
Copy link
Contributor Author

Back looking into this now

@ericsmalling
Copy link
Contributor Author

ericsmalling commented May 22, 2024

@adejanovski

I'd like to add one thing that would be needed in this PR, which is modifying an e2e test to verify this new behavior.
You could modify the fixture here and add this new field set to false, and verify that the CronJob was never created by modifying this. You'd probably need to create a checkNoPurgeCronJob as a copy of checkPurgeCronJobExists with a require.Never() block checking the CronJob object absence.

I'm finally back to look at this and have a question: would I need to create a complete copy of func createMultiDcSingleMedusaJob so we have one for each state?

@adejanovski
Copy link
Contributor

@adejanovski

I'd like to add one thing that would be needed in this PR, which is modifying an e2e test to verify this new behavior.
You could modify the fixture here and add this new field set to false, and verify that the CronJob was never created by modifying this. You'd probably need to create a checkNoPurgeCronJob as a copy of checkPurgeCronJobExists with a require.Never() block checking the CronJob object absence.

I'm finally back to look at this and have a question: would I need to create a complete copy of func createMultiDcSingleMedusaJob so we have one for each state?

Since we have the createSingleMedusaJob using the other setting we can avoid duplicating the whole multi dc e2e test. We have to be mindful of the number of parallel tests we're launching as we have a fairly limited number of GHA runners in the k8ssandra org.

test/e2e/medusa_test.go Outdated Show resolved Hide resolved
@ericsmalling
Copy link
Contributor Author

I am not sure what is causing the ChangeDseWorkload test to fail

@adejanovski
Copy link
Contributor

I am not sure what is causing the ChangeDseWorkload test to fail

It was just a flake, it's green after a re-run 👍

Copy link
Contributor

@adejanovski adejanovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ericsmalling there are a few more things to do here. I can push some commits to get this PR to a wrap, unless you want to devote more cycles to it.
Let me know what works for you.

controllers/k8ssandra/medusa_reconciler.go Outdated Show resolved Hide resolved
controllers/k8ssandra/medusa_reconciler.go Outdated Show resolved Hide resolved
@ericsmalling
Copy link
Contributor Author

ericsmalling commented May 28, 2024

@ericsmalling there are a few more things to do here. I can push some commits to get this PR to a wrap, unless you want to devote more cycles to it. Let me know what works for you.

If you have the time to make the changes, I'm more than happy to let you do them! Thanks! If not, I'll try to tackle them later today.

@ericsmalling
Copy link
Contributor Author

ericsmalling commented May 31, 2024

@adejanovski I just pushed up fixes for your review comments. Ready for your re-review.

A couple of the tests failed but I'm wondering if it's the same flaky ones we saw before. I don't seem to have priv's to be able to re-run it.

@ericsmalling ericsmalling force-pushed the cronconfig branch 3 times, most recently from 558733e to b413fb3 Compare June 3, 2024 19:26
@adejanovski
Copy link
Contributor

Hi @ericsmalling, I'll review the PR today and see what's going on with the failing tests.

Copy link

sonarcloud bot commented Jun 10, 2024

Quality Gate Passed Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@adejanovski
Copy link
Contributor

Merged as part of #1346

I had to create another PR as I didn't want to force push to your branch @ericsmalling.
I had to rework how the check was done for the existence of a cron job to delete if the purge feature was disabled. I added proper checks in the e2e tests for this part.

You're still credited for the commits there 👌
Thanks for all the time spent on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants