
Function Update if config/secret changes #1224

Merged: 10 commits merged into master from enc_cms_update on Jul 23, 2019

Conversation

@vishal-biyani (Collaborator) commented Jul 8, 2019

If a secret/configmap value changes, the function should update with the new value.

In the case of the pool manager, this works because the specialized pod always picks up the new value from the configmap/secret.

In the case of newdeploy, the controller terminates the old pods, which in turn causes new pods to be created with the new values of the configmap/secret.
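A rough sketch of that newdeploy behavior with client-go - the label selector and helper name here are illustrative, not Fission's actual code:

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteFuncPods deletes a function's pods so that the deployment recreates
// them with the current configmap/secret values.
func deleteFuncPods(client kubernetes.Interface, ns, funcName string) error {
	podList, err := client.CoreV1().Pods(ns).List(metav1.ListOptions{
		LabelSelector: "functionName=" + funcName, // hypothetical label
	})
	if err != nil {
		return err
	}
	for _, po := range podList.Items {
		err := client.CoreV1().Pods(po.ObjectMeta.Namespace).Delete(po.ObjectMeta.Name, &metav1.DeleteOptions{})
		if err != nil {
			return err
		}
	}
	return nil
}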

Fixes #1164



@vishal-biyani vishal-biyani added this to the 1.4 milestone Jul 8, 2019

newCm := newObj.(*apiv1.ConfigMap)
if oldCm.ObjectMeta.ResourceVersion != newCm.ObjectMeta.ResourceVersion {
	if newCm.ObjectMeta.Namespace != "kube-system" {
		logger.Info("Configmap changed",

@life1347 (Member) commented Jul 9, 2019

Could we change it to logger.Debug?

pkg/executor/newdeploy/newdeploymgr.go (resolved review thread)
}

if f.Spec.InvokeStrategy.ExecutionStrategy.ExecutorType == fv1.ExecutorTypePoolmgr {
	// Future implementation

@life1347 (Member) commented Jul 9, 2019

We need to document the different behavior of newdeploy/poolmgr when facing a secret/configmap update in https://docs.fission.io/usage/access-secret-cfgmap-in-function/ so that users won't be confused about this.

func getSecretRelatedFuncs(logger *zap.Logger, m *metav1.ObjectMeta, fissionClient *crd.FissionClient) []fv1.Function {
	funcList, err := fissionClient.Functions(metav1.NamespaceAll).List(metav1.ListOptions{})
	if err != nil {
		logger.Error("Error getting functions while updating secrets", zap.Error(err), zap.Any("secret", m))

@life1347 (Member) commented Jul 9, 2019

If an error happened, we should return here.

func getConfigmapRelatedFuncs(logger *zap.Logger, m *metav1.ObjectMeta, fissionClient *crd.FissionClient) []fv1.Function {
	funcList, err := fissionClient.Functions(metav1.NamespaceAll).List(metav1.ListOptions{})
	if err != nil {
		logger.Error("Error getting functions while updating configmap", zap.Error(err), zap.Any("configmap", m))

@life1347 (Member) commented Jul 9, 2019

If an error happened, we should return here.

@vishal-biyani (Collaborator, Author) commented Jul 9, 2019

Thinking more about the implementation, and after looking at the issue, I think deleting pods like this is a pretty bad idea. Especially for the newdeploy executor, where, if all pods are deleted and it takes time for new pods to come up, the latency is higher than one would expect from a newdeploy function.

The closest solution I can think of is similar to kubernetes/kubernetes#76062 - but that is only on the client side, and something similar would have to be done on the server side within Fission.
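For reference, the approach in kubernetes/kubernetes#76062 boils down to patching a pod-template annotation so that the Deployment controller performs a rolling update. A rough server-side equivalent with client-go - the function name and annotation key are illustrative, not anything from this PR:

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// restartDeployment bumps a pod-template annotation; since the pod template
// changed, Kubernetes rolls the pods instead of deleting them all at once.
func restartDeployment(client kubernetes.Interface, ns, name string) error {
	patch := fmt.Sprintf(`{"spec":{"template":{"metadata":{"annotations":{"fission.io/restartedAt":%q}}}}}`,
		time.Now().UTC().Format(time.RFC3339))
	_, err := client.AppsV1().Deployments(ns).Patch(name, types.StrategicMergePatchType, []byte(patch))
	return err
}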

@life1347 (Member) commented Jul 10, 2019

Yeah, I agree that would be a problem under heavy workloads.

Of course, it's better to switch to the new function version. But considering the problem you mentioned, I think we should focus on how to migrate traffic from the old function to the new one smoothly, instead of just on how fast we can get the user's function updated.

How about we add a SHA of the secret/configmap to the deployment, and then simply update that label to trigger a rolling update? Kubernetes will handle the rolling update of instances for us.

And in fact, it adds extra protection for users in case their newer function won't come up.

What do you think?

UPDATE:
kubernetes/kubernetes#76062 also looks like a possible way to do this. But I don't think we need to do a similar thing in Fission; we can call the Kubernetes API in newdeploy to trigger a deployment restart.
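A minimal sketch of the SHA idea, assuming the hash goes into a pod-template label (the label key and helper names are illustrative):

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	apiv1 "k8s.io/api/core/v1"
)

// configmapHash returns a short, stable hash of a configmap's data.
// json.Marshal sorts map keys, so the same data always yields the same hash.
func configmapHash(cm *apiv1.ConfigMap) (string, error) {
	b, err := json.Marshal(cm.Data)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(b))[:16], nil
}

// setConfigHash stamps the hash onto the deployment's pod template;
// any change to a pod-template label triggers a rolling update.
func setConfigHash(deploy *appsv1.Deployment, hash string) {
	if deploy.Spec.Template.Labels == nil {
		deploy.Spec.Template.Labels = map[string]string{}
	}
	deploy.Spec.Template.Labels["fission-configmap-hash"] = hash
}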

@vishal-biyani (Collaborator, Author) commented Jul 11, 2019

I think the migration of traffic from old to new pods is taken care of by TerminationGracePeriod - it ensures that existing traffic is served - and routing new traffic to the new pod is handled by the Service (at least in newdeploy).

SHA of configmap - the reason I don't like it is that we can potentially have multiple configmaps/secrets per function, which creates a lot of extra decisions.

I am inclined towards adding a new annotation/ENV variable whose value is a simple timestamp, called ConfigLastUpdated or something like that. Any time a configmap/secret is updated, we simply update the timestamp, and that triggers the rolling update.
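A minimal sketch of that timestamp approach with client-go - the env name CONFIG_LAST_UPDATED and the container argument are illustrative, not the PR's actual identifiers:

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// touchConfigTimestamp bumps a single env var on the function's deployment.
// Any change to the pod template makes Kubernetes perform a rolling update.
func touchConfigTimestamp(client kubernetes.Interface, ns, deployName, container string) error {
	patch := fmt.Sprintf(`{"spec":{"template":{"spec":{"containers":[{"name":%q,"env":[{"name":"CONFIG_LAST_UPDATED","value":%q}]}]}}}}`,
		container, time.Now().UTC().Format(time.RFC3339))
	// A strategic merge patch merges containers and env entries by name,
	// so only this one env var is added or updated.
	_, err := client.AppsV1().Deployments(ns).Patch(deployName, types.StrategicMergePatchType, []byte(patch))
	return err
}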

@life1347 (Member) commented Jul 11, 2019

I like the idea of having a timestamp. If I remember correctly, an annotation update won't trigger an update; we have to add it to an ENV variable.

@vishal-biyani vishal-biyani force-pushed the enc_cms_update branch from bec81b7 to 6b7179c Jul 16, 2019

@vishal-biyani (Collaborator, Author) commented Jul 16, 2019

The PR has now changed and is ready for re-review based on the discussion above. The implementation differs between the pool manager and the newdeploy executor, as described below:

  1. For the newdeploy executor, a timestamp ENV variable has been added to the function deployment. This env is updated whenever a secret/configmap changes, which causes a rolling update (currently with MaxUnavailable 25% and MaxSurge 100%; see the sketch after this list). The pods in the deployment are replenished and therefore have the new configmaps and secrets when they start.

  2. For the pool manager, the specialized pod is deleted. Existing requests will be served based on the PreStop hook and TerminationGracePeriod, but all new requests will go to a pod newly specialized from the pool. I want to monitor this more and make changes as needs arise. We cannot do a rolling update here, since the pod is not under any deployment - and even if it were, we would need to specialize it again on start!
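A sketch of the deployment strategy described in (1), using the standard Kubernetes API types rather than the PR's exact code:

import (
	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// rollingUpdateStrategy mirrors the settings above: at most 25% of pods
// unavailable at a time, and up to 100% extra pods created during the update.
func rollingUpdateStrategy() appsv1.DeploymentStrategy {
	maxUnavailable := intstr.FromString("25%")
	maxSurge := intstr.FromString("100%")
	return appsv1.DeploymentStrategy{
		Type: appsv1.RollingUpdateDeploymentStrategyType,
		RollingUpdate: &appsv1.RollingUpdateDeployment{
			MaxUnavailable: &maxUnavailable,
			MaxSurge:       &maxSurge,
		},
	}
}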

A doc update PR has also been raised: fission/docs.fission.io#140

}

for _, po := range podList.Items {
	err := gpm.kubernetesClient.CoreV1().Pods(po.ObjectMeta.Namespace).Delete(po.ObjectMeta.Name, &metav1.DeleteOptions{})

@life1347 (Member) commented Jul 17, 2019

We also need to invalidate the cache so that the executor will specialize a new pod for the function.
One thing I worry about with deleting pods here: when a configmap/secret is consumed by many functions, the pool manager will specialize tons of new pods once it gets updated, and the cluster may run out of available resources if the old pods are still in the Terminating state.

	return relatedFunctions, nil
}

func recyclePods(logger *zap.Logger, funcs []fv1.Function, ndm *nd.NewDeploy, gpm *gpm.GenericPoolManager) {

@life1347 (Member) commented Jul 17, 2019

I don't really like the term recycle here, since we are just trying to do a rolling update. Any thoughts? rolloutNewPods?

@vishal-biyani (Collaborator, Author) commented Jul 18, 2019

I think Rollout is a bit confusing given Kubernetes's rollout, and in the case of the pool manager we are simply deleting, not rolling out. I want to convey that we are taking old pods, fitting them with new configmaps/secrets, and bringing them back into service. WDYT?

if f.Spec.InvokeStrategy.ExecutionStrategy.ExecutorType == fv1.ExecutorTypeNewdeploy {
	err = ndm.RecycleFuncPods(logger, f)
}
if f.Spec.InvokeStrategy.ExecutionStrategy.ExecutorType == fv1.ExecutorTypePoolmgr {

@life1347 (Member) commented Jul 17, 2019

Using a switch here would be better.
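A sketch of the suggested switch form, using only the identifiers from the hunk above (the poolmgr branch body is elided):

switch f.Spec.InvokeStrategy.ExecutionStrategy.ExecutorType {
case fv1.ExecutorTypeNewdeploy:
	err = ndm.RecycleFuncPods(logger, f)
case fv1.ExecutorTypePoolmgr:
	// poolmgr path: delete the specialized pods, as in the earlier hunk
}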

newS := newObj.(*apiv1.Secret)
if oldS.ObjectMeta.ResourceVersion != newS.ObjectMeta.ResourceVersion {
	if newS.ObjectMeta.Namespace != "kube-system" {
		logger.Info("Secret changed",

@life1347 (Member) commented Jul 17, 2019

The Info level is a little annoying; how about changing it to Debug?

@vishal-biyani vishal-biyani force-pushed the enc_cms_update branch from 78d2614 to 5088748 Jul 22, 2019

@life1347 (Member) commented Jul 23, 2019

LGTM

@life1347 life1347 merged commit 93c5b53 into master Jul 23, 2019

6 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
security/snyk - environments/jvm/pom.xml (soamvasani): No new issues
security/snyk - environments/nodejs/package.json (soamvasani): No manifest changes detected
security/snyk - environments/python/requirements.txt (soamvasani): No manifest changes detected
security/snyk - environments/ruby/Gemfile.lock (soamvasani): No manifest changes detected
security/snyk - examples/nodejs/package.json (soamvasani): No manifest changes detected

@life1347 life1347 deleted the enc_cms_update branch Jul 23, 2019

vishal-biyani added a commit to fission/docs.fission.io that referenced this pull request Jul 25, 2019

Configmap Doc Update (#140)
Doc update related to the configmap update change in fission/fission#1224.