Proposal: Kubernetes tolerations #1125

Open
alexellis opened this Issue Mar 13, 2019 · 4 comments

alexellis commented Mar 13, 2019

Feature: Kubernetes tolerations

https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/

Expected Behaviour

Tolerations allow a Pod (function) to be scheduled on a node where there is a "taint" preventing Pods from being scheduled there.
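
To make the matching rule concrete, here is a minimal Python sketch of how a toleration is compared against a taint, following the rules in the Kubernetes documentation linked above. The function names `tolerates` and `schedulable` and the `dedicated=gpu` taint are made up for illustration; this is a simplified model, not the scheduler's actual code.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect", ""):
        return False

    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")  # "Equal" is the default

    if key and key != taint.get("key", ""):
        return False
    if not key and operator != "Exists":
        return False  # an empty key is only valid with operator "Exists"

    if operator == "Exists":
        return True  # any value is accepted
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def schedulable(pod_tolerations: list, node_taints: list) -> bool:
    """A Pod fits a node only if every hard taint on the node is tolerated."""
    # PreferNoSchedule is a soft preference, so it does not block scheduling.
    blocking = [t for t in node_taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in blocking
    )


# Hypothetical taint on a node-pool and a matching toleration on a function Pod.
gpu_taint = {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}
gpu_toleration = {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}

print(schedulable([], [gpu_taint]))
print(schedulable([gpu_toleration], [gpu_taint]))
```

Without the toleration the tainted node rejects the Pod, which is the situation described under "Current Behaviour" below.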

Current Behaviour

We support constraints and labelled node-pools for scheduling, but a user reported a scenario where a node-pool has a taint, so constraints are not enough: the function Pod also needs a "toleration".

Possible Solution

We could extend the function schema to allow the Kubernetes tolerations spec to be specified.

This is not available in Swarm and possibly not available in the other back-ends, so it's the first hard requirement for orchestrator-specific knowledge in the OpenFaaS API.

Suggestions are welcome.

  • Extend the function spec with a tolerations map - taints and breaks the OpenFaaS API for other providers.
  • Add an untyped "metadata" field for custom providers such as Kubernetes - needs to be parsed in the provider, generic, extensible
  • Use an annotation or annotations with our existing mechanism to prevent littering the API spec - but then read this field and act upon it in the back-end - faas-netes / openfaas-operator
  • Do not support the feature - the user would have to find another way to achieve the same result, or create an operator to apply the tolerations after the functions are created. Perhaps a mutating admission webhook or similar. This could be a quick work-around for the user. See https://medium.com/ibm-cloud/diving-into-kubernetes-mutatingadmissionwebhook-6ef3c5695f74 and https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/

Steps to Reproduce (for bugs)

  1. Taint a node-pool
  2. Deploy a function with a constraint to run in that node-pool
  3. The function cannot be scheduled
  4. Apply a toleration to the function Pod, manually or programmatically

Context

This affects advanced scheduling scenarios on Kubernetes.



mercul3s commented Mar 14, 2019

Using annotations for this purpose sounds pretty reasonable. It's already supported within the OpenFaaS API, and should only require a modification to faas-netes to work with Kubernetes, i.e. looking for a tolerations key in the annotations map and adding its value to the deployment spec. This would ensure the OpenFaaS API doesn't have to be aware of Kubernetes-specific tolerations, while still allowing support for them via faas-netes.
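
A rough sketch of that idea in Python, for illustration only (faas-netes itself is written in Go). The annotation key `tolerations`, the JSON encoding of its value, and the helper name `apply_tolerations` are all assumptions; nothing in this thread fixes those details.

```python
import copy
import json

# Hypothetical annotation key; the thread does not settle on a name.
TOLERATIONS_ANNOTATION = "tolerations"


def apply_tolerations(annotations: dict, deployment: dict) -> dict:
    """Return a deployment with tolerations copied in from the annotations."""
    raw = annotations.get(TOLERATIONS_ANNOTATION)
    if not raw:
        return deployment  # nothing requested, leave the spec untouched

    # Annotation values are strings, so the list is assumed to be JSON-encoded.
    tolerations = json.loads(raw)
    if not isinstance(tolerations, list):
        raise ValueError("tolerations annotation must hold a JSON list")

    patched = copy.deepcopy(deployment)  # do not mutate the caller's object
    pod_spec = (
        patched.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("spec", {})
    )
    pod_spec["tolerations"] = tolerations
    return patched


# Example: a function annotated with one toleration (values are made up).
annotations = {
    "tolerations": json.dumps(
        [{"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}]
    )
}
deployment = {"spec": {"template": {"spec": {"containers": []}}}}

print(json.dumps(apply_tolerations(annotations, deployment), indent=2))
```

Because the value is opaque to the OpenFaaS API, other providers can simply ignore the annotation.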


alexellis commented Mar 14, 2019

I chatted with @stefanprodan and @embano1 - we had an idea for a way for users to extend OpenFaaS.

If you use a Mutating Admission Webhook [1] then you can look for your annotation and act upon it as soon as the Pod's creation is requested. You get all the benefits of automation and of extending the API in an open-closed way, without creating any engineering burden on the community. The kubewebhook project by slok [2] can be used to put this together in a very short period of time.
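
As a sketch of what such a webhook's core logic might look like, the Python below builds the AdmissionReview response for an incoming Pod, attaching a base64-encoded JSON Patch when the annotation is present. The annotation key `tolerations` and the function name `mutate` are assumptions, and the HTTPS server, TLS setup, and webhook registration are left out; kubewebhook [2] would handle the equivalent plumbing in Go.

```python
import base64
import json

# Hypothetical annotation key carrying a JSON-encoded list of tolerations.
ANNOTATION = "tolerations"


def mutate(review: dict) -> dict:
    """Build an AdmissionReview response that adds tolerations to a Pod."""
    request = review["request"]
    pod = request["object"]
    response = {"uid": request["uid"], "allowed": True}

    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    raw = annotations.get(ANNOTATION)
    if raw:
        # Keep tolerations already on the Pod and append the requested ones.
        existing = (pod.get("spec") or {}).get("tolerations") or []
        patch = [
            {"op": "add", "path": "/spec/tolerations", "value": existing + json.loads(raw)}
        ]
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()

    # Echo the apiVersion of the request, as the API server expects.
    return {
        "apiVersion": review["apiVersion"],
        "kind": "AdmissionReview",
        "response": response,
    }


# Example request, trimmed to the fields this sketch reads (values are made up).
example = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "example-uid",
        "object": {
            "metadata": {"annotations": {ANNOTATION: '[{"key": "dedicated", "operator": "Exists"}]'}},
            "spec": {"containers": []},
        },
    },
}

print(json.dumps(mutate(example), indent=2))
```

Pods without the annotation are admitted unchanged, so the webhook stays invisible to functions that do not opt in.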

Stefan suggests you'll need the CA from the cluster to sign the TLS key needed for the controller:

kubectl get configmap -n kube-system extension-apiserver-authentication -o=jsonpath='{.data.client-ca-file}' | base64 | tr -d '\n'

This may even be possible to deploy as an OpenFaaS function itself.

[1] https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook-beta-in-19

[2] https://github.com/slok/kubewebhook

This could be a great example of how to extend OpenFaaS on Kubernetes for others to follow, too.


mercul3s commented Mar 14, 2019

@alexellis ok, let me give that a try - Admission Controllers are new to me, but kubewebhook looks pretty straightforward. I'll follow up with results here.

alexellis changed the title from "Feature: Kubernetes tolerations" to "Proposal: Kubernetes tolerations" on Mar 14, 2019


alexellis commented Mar 14, 2019

Keep us in the loop on it and feel free to ask questions on Kubernetes Slack or in our own #kubernetes channel on OF Slack.
