Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: catalog source pod scheduling config overrides #2512

Merged

Conversation

perdasilva
Copy link
Collaborator

@perdasilva perdasilva commented Dec 8, 2021

Description of the change:
The CatalogSource CRD was recently updated to carry an additional and optional attribute grpcPodConfig as an envelope for pod spec overrides that we may want to give to the users, starting with: tolerations, nodeSelector, and priorityClassName. For instance:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
...
# optional
grpcPodConfig:
  # optional
  priorityClassName: https://github.com/operator-framework/api/pull/173
  #optional
  tolerations:
    - key: some
      operator: Equal
      value: value
      effect: NoSchedule
  # optional
  nodeSelector: 
     kubernetes.io/os: linux
     some: tag

Motivation for the change:
We want to give cluster admins a way to force catalog source pods to be scheduled on specific infrastructure (i.e. nodes)

Questions

Does it make sense to expose priorityClassName and not priority?

priorityClassName seems to be used with special (custom) priority classes and system-node-critical and system-cluster-critical (which I think are reserved and very high priority). I think it should be fine to not expose the priority knob as well. I think priorityClassName covers most cases where an admin would want to affect priority. But, I'd be keen to have this assumption checked =)

Should we update the api and use *string for PriorityClassName

Since we want to override settings, it could make sense to differentiate between not set and empty. A priorityClassName of "" says to use the default priority. We don't set our own priorityClassName in our catalog source pod definition. So, the only issue here is that in case we decide to do that (which I don't think is likely), the user will not be able to unset it, i.e. override it to be "". But, even though unlikely, if we have the chance and it is quick, why not "do it right"? (if you guys also agree that seems the most correctest)

NOTE: This PR will eventually also introduce vendor changes. For reviewer sanity, I'll try to keep those off the PR for as long as possible

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive

@perdasilva
Copy link
Collaborator Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 8, 2021
@perdasilva
Copy link
Collaborator Author

holding until api changes are finalized and merged

Copy link
Contributor

@tylerslaton tylerslaton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. We can probably remove the hold to see if the tests work now as it has been a number of hours since the CRD change went in.

non-block question: Do we need to write an E2E test that validates that the changes to the pod spec are actually being utilized as expected? Or is that something that is owned by some other k8s API?

deploy/chart/crds/0000_50_olm_00-catalogsources.crd.yaml Outdated Show resolved Hide resolved
@tylerslaton
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 8, 2021
@perdasilva
Copy link
Collaborator Author

non-block question: Do we need to write an E2E test that validates that the changes to the pod spec are actually being utilized as expected? Or is that something that is owned by some other k8s API?

Yeah, I'll def add some e2e tests!

Copy link
Member

@njhale njhale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: This PR will eventually also introduce vendor changes. For reviewer sanity, I'll try to keep those off the PR for as long as possible

I think it's sufficiently clean to include vendor changes in a separate commit, that way we can still view the meaningful diffs separately and don't need to re-lgtm later.

Other than that, this is looking good! I think we just need an e2e test to make sure OLM respects these settings and updates existing catalog pods when changes are detected.

pkg/controller/registry/reconciler/reconciler.go Outdated Show resolved Hide resolved
pkg/controller/registry/reconciler/reconciler.go Outdated Show resolved Hide resolved
Copy link
Member

@gallettilance gallettilance left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 11, 2021
@perdasilva perdasilva force-pushed the pod_sched_override branch 4 times, most recently from fe3c84e to 41070eb Compare December 13, 2021 15:24
Copy link
Member

@gallettilance gallettilance left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 13, 2021
@perdasilva
Copy link
Collaborator Author

non-block question: Do we need to write an E2E test that validates that the changes to the pod spec are actually being utilized as expected? Or is that something that is owned by some other k8s API?

@tylerslaton I thought about it, but yeah. That's the conclusion I reached. As long as its setting the right things in the right places the rest is up to kube

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 13, 2021
Copy link
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good mostly @perdasilva , had a couple of non-blocking questions otherwise this looks good to go.

test/e2e/catsrc_pod_config_e2e_test.go Outdated Show resolved Hide resolved
Signed-off-by: Per G. da Silva <perdasilva@redhat.com>
@perdasilva
Copy link
Collaborator Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 13, 2021
@anik120
Copy link
Contributor

anik120 commented Dec 14, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 14, 2021
@kevinrizza
Copy link
Member

/approve

@openshift-ci
Copy link

openshift-ci bot commented Dec 14, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gallettilance, kevinrizza, perdasilva

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 14, 2021
@openshift-merge-robot openshift-merge-robot merged commit be0c556 into operator-framework:master Dec 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants