
Allow users to upgrade Policy Server using Helm chart #52

Closed
jvanz opened this issue Dec 15, 2021 · 17 comments · Fixed by #65
jvanz commented Dec 15, 2021

After installing the Kubewarden stack, it is not possible to change configuration values with the helm upgrade command.

Steps to reproduce

  1. Launch a Kubernetes cluster:
$ cat k3d-default.yaml
---
apiVersion: k3d.io/v1alpha3
kind: Simple
name: k3s-default
servers: 1
agents: 0
image: docker.io/rancher/k3s:v1.22.4-k3s1

$ k3d cluster create $(PROFILE) --config k3d-default.yaml -v /dev/mapper:/dev/mapper
$ kubectl wait --for=condition=Ready nodes --all
  2. Install the Kubewarden stack:
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.3/cert-manager.yaml                                                                                                                                  
$ kubectl wait --for=condition=Available deployment --timeout=3m -n cert-manager --

$ cat kubewarden-controller-values.yaml 
telemetry:
  enabled: false
policyServer:
  image:
    repository: ghcr.io/kubewarden/policy-server
    tag: "v0.2.5"
  replicaCount: 1
  telemetry:
    metrics:
      port: 8080
  env:
    - name: KUBEWARDEN_LOG_LEVEL
      value: debug
  annotations: {}

$ helm repo add --force-update kubewarden https://charts.kubewarden.io                                                                                                                                                                  
$ helm upgrade --install --wait --namespace kubewarden --create-namespace kubewarden-crds kubewarden/kubewarden-crds                                                                                                                    
$ helm upgrade --install --wait --namespace kubewarden --values kubewarden-controller-values.yaml kubewarden-controller  kubewarden/kubewarden-controller 
  3. Upgrade the Kubewarden release, changing some configuration:
$ cat values.yaml                      
policyServer:
  replicaCount: 2
  image:
    tag: "v0.2.4"

$ helm upgrade  --namespace kubewarden --values values.yaml kubewarden-controller kubewarden/kubewarden-controller
  4. Check whether the policy server definition was updated:
$ kubectl get policyservers default -o=jsonpath="{['.spec.image', '.spec.replicas']}"
ghcr.io/kubewarden/policy-server:v0.2.5 1
Note that the output still shows the old image tag (v0.2.5) and replica count (1) instead of the expected v0.2.4 and 2 replicas.

viccuad commented Dec 16, 2021

Looked into this, and I think it is because the changes to the policyServer are part of the post-install hook. It seems that post-install hooks are not tracked by Helm once they run, and can't be updated: https://helm.sh/docs/topics/charts_hooks/#hook-resources-are-not-managed-with-corresponding-releases.
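
For context, a Helm hook is just an annotation on a templated resource. A minimal sketch of what a hooked PolicyServer template could look like (the contents here are an illustrative assumption, not the chart's actual template):

apiVersion: policies.kubewarden.io/v1alpha2
kind: PolicyServer
metadata:
  name: default
  annotations:
    "helm.sh/hook": post-install   # created after install; Helm stops tracking the resource afterwards

Because Helm does not manage hook resources as part of the release, a later helm upgrade never touches this object.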

Tried a helm upgrade --wait -n kubewarden kubewarden-controller kubewarden/kubewarden-controller --set image.tag=latest, which upgrades the controller image, and that did work.


viccuad commented Dec 16, 2021

It seems that to be able to upgrade the default policy-server, it would need to live in its own chart. It's possible to use subcharts:

kubewarden-stack
└── charts
    ├── kubewarden-controller
    └── kubewarden-crds
  • kubewarden-stack chart: the umbrella chart, deploys the default policy-server (and, in the future, default policies for minimum security of the controller, etc.). It depends on two subcharts, which would get deployed automatically:
    • kubewarden-controller: the current chart
    • kubewarden-crds: the current chart

We will need to wait for kubewarden-controller to be up and with its webhooks set. It seems that the idiomatic way to do this is with init-containers.
Separating into these charts also allows users to install the crds or controller chart on their own.
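
A hypothetical Chart.yaml for such a parent chart (the names and version constraints are assumptions for illustration):

$ cat kubewarden-stack/Chart.yaml
apiVersion: v2
name: kubewarden-stack
version: 0.1.0
dependencies:
  - name: kubewarden-crds
    version: ">=0.1.0"
    repository: https://charts.kubewarden.io
  - name: kubewarden-controller
    version: ">=0.1.0"
    repository: https://charts.kubewarden.io

Note that Helm renders and installs subcharts together with the parent chart, not strictly before it.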

jvanz added the kind/bug label on Dec 16, 2021
jvanz changed the title from "It's not possible to reconfigure Kubewarden stack with helm upgrade" to "It's not possible to reconfigure Kubewarden default policy server with helm upgrade" on Dec 16, 2021
jvanz self-assigned this on Jan 27, 2022
jvanz removed their assignment on Jan 31, 2022

viccuad commented Feb 1, 2022

Also, we will need to ship a minimum set of policies so the Kubewarden stack is secure (e.g. no privileged pods on the same nodes); see kubewarden/kubewarden-controller#129.

Maybe it is worth writing an RFC on how the Kubewarden subcharts are laid out.


jvanz commented Mar 2, 2022

I've been working on this, both to allow policy server upgrades and to install the resources in the proper order, to avoid weird issues like the one reported in kubewarden/kubewarden-controller#110. However, I found issues during my tests that I would like to discuss with you.

I've created a job to check whether the controller webhook is reachable. But it only works if we do not install the Policy Server at the same time. This happens because the hook is the last thing to run, so it is too late to check whether the service is reachable: Helm has already tried to deploy the Policy Server and, potentially, failed. Thus, I cannot coordinate the installation of the controller and the Policy Server purely with Helm charts.

Using Helm sub-charts, as mentioned in #52 (comment), does not help either: Helm merges the templates from all the sub-charts and the parent chart into a single set, which reproduces the problem mentioned above.

With that in mind, I can see two options now:

  1. Separate charts for the crds, controller, and policy server, installing each one individually.
  2. Add a flag to the current controller chart to enable/disable the default policy server installation, with no job to check the controller service.

I would say to go with option 2 (see the sketch below). The time it takes for the controller to become reachable is a known issue that we can work around in different ways, and we can still install the policy server by enabling a configuration flag.
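
A minimal sketch of what such a flag could look like in the chart's values (the flag name is a hypothetical illustration, not an existing chart option):

$ cat values.yaml
policyServer:
  enabled: true   # set to false to skip installing the default Policy Server

The PolicyServer template would then be wrapped in a {{- if .Values.policyServer.enabled }} ... {{- end }} guard.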

Any thoughts?


jvanz commented Mar 2, 2022

Another idea that I just had is to change the controller: remove the webhook for the policy server and move the logic that adds the finalizers into the reconcile loop. I think this is the "right" fix.

ereslibre commented

Another idea that I just had is to change the controller: remove the webhook for the policy server and move the logic that adds the finalizers into the reconcile loop. I think this is the "right" fix.

I disagree on this one. I think webhooks are important so invalid or incoherent resources can be identified before they are even persisted. This can only be done with webhooks, and I think it makes the logic easier: if we allow the persistence of invalid or incoherent resources, our Reconcile() logic has to be defensive against these cases, making the controller logic more fragile and more complex to maintain.

In my opinion, and given the complexity that has proved to come from these timing problems, I would consider not installing a default policy server with the Helm chart.

Installing a default policy server would become another step in the documentation.

Another variant of my solution would be for the controller, when it starts, to reconcile a default policy server if one is missing. This behavior could be disabled with a flag on the controller.
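
For illustration only, such an opt-out could surface as a command-line flag on the controller Deployment (the flag name is an assumption, not an existing option):

# controller Deployment snippet with a hypothetical flag
containers:
  - name: manager
    args:
      - --create-default-policy-server=false   # skip reconciling a default PolicyServer on startup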


viccuad commented Mar 3, 2022

Thanks José for looking into this and coming up with several paths!

I would skip (2) as it complicates updating and tracking the state of the default policy-server.

Another variant of my solution would be for the controller, when it starts, to reconcile a default policy server if one is missing. This behavior could be disabled with a flag on the controller.

This solves the issue we are talking about (reconfiguring the policy-server), but makes automatic upgrades/reconfiguration more complicated.

My vote goes for (1), a kubewarden-defaults chart. It gives us the functionality we need (being able to update the policy-server on its own). It's a pity that it's yet another chart to install by hand, but to me it's the right abstraction. In the future, it can contain the default policies needed to secure the Kubewarden stack against tampering, default configmaps if needed, etc.

The more I think about this problem (and the problem of installing several charts that depend on each other), the more it reminds me of why we built https://github.com/rancher-sandbox/hypper. I don't want to tell Kubewarden users to install with hypper instead of helm, though; just that we have thought about the problem, and I think it's a real shortcoming of helm.

raulcabello commented

Can we remove the post-install hook from the default PolicyServer? We added the post-install hook when we had just one Helm chart for everything (including the crds). Unless I'm missing something, I don't see the need for the hook anymore, since we now have a separate chart for the crds. This should fix the upgrade problem.


jvanz commented Mar 3, 2022

Thank you all for the comments!

Okay, let's remove the default policy server from the kubewarden-controller chart. I don't know if we should add another chart just for the policy server now; maybe we can wait until we define which policies we would like to install in a cluster by default, as mentioned in #52 (comment).

@kubewarden/kubewarden-documentation , can you help us with the docs? :)

Can we remove the post-install hook from the default PolicyServer? We added the post-install hook when we had just one Helm chart for everything (including the crds). Unless I'm missing something, I don't see the need for the hook anymore, since we now have a separate chart for the crds. This should fix the upgrade problem.

Yes, I'll remove the hook. The hook is not necessary for the CRDs; AFAICS, it tries to coordinate the installation of the Policy Server, because the Kubewarden controller should be running before we try to deploy a policy server. Due to the time it sometimes takes for the controller to become ready to handle requests, the Helm installation can fail.
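
As a manual illustration of this ordering constraint, one would wait for the controller Deployment to become Available before creating any PolicyServer resource (the deployment name and manifest below are assumptions):

$ kubectl wait --for=condition=Available deployment/kubewarden-controller -n kubewarden --timeout=3m
$ kubectl apply -f default-policy-server.yaml   # hypothetical manifest for the default Policy Server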


divya-mohan0209 commented Mar 3, 2022

@jvanz Sure. To the best of my understanding, what we want to document is:

  • Removal of the default PolicyServer resource from the kubewarden-controller chart

Once we decide which policies to incorporate as defaults, we'll need to document that as well.

Did I get the expectation right?

Where do we expect this to be changed? Architecture and Quickstart are the sections I can readily think of.


jvanz commented Mar 3, 2022

Did I get the expectation right?

Yes. :)

Where do we expect this to be changed? Architecture and Quickstart are the sections I can readily think of.

I'm reading the docs, and at first glance they already explain to the user how to deploy a Policy Server. So we may not need to document that.

I'm wondering if we need to change anything at all. Maybe a warning after the instructions on how to install the Kubewarden stack, saying that since Helm chart version x the default policy server is no longer installed. What do you think, @divya-mohan0209?


viccuad commented Mar 3, 2022

Can we remove the post-install hook from the default PolicyServer? [...] This should fix the upgrade problem.

Hey, I didn't realize this. If that's all it takes, then I prefer to just do that 🤦‍♂️, and add a new values.yml boolean option for "install default policy-server". Less hassle for the user, fewer charts to install, and the default policy-server can still be upgraded.


jvanz commented Mar 4, 2022

Just to keep it documented here as well: removing the post-install hook will allow us to upgrade the policy server. But what I'm trying to do is also find a way to mitigate issue #52.

Our Helm chart has a webhook for the PolicyServer kind, which means that every time you deploy a Policy Server, the controller needs to be running and able to respond. That's the problem: when we install the chart, we try to install a default policy server, which means the controller must be running and ready to receive requests from the kube-api. There are some situations, like the one described in #52, where the controller does not receive requests for some reason, which causes our Helm installation to fail.

From: #65 (comment)
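
To illustrate why the controller must be reachable, the admission webhook registration for the PolicyServer kind looks roughly like the sketch below; the names, path, and exact webhook kind are assumptions:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: kubewarden-controller   # assumed name
webhooks:
  - name: validate.policyservers.kubewarden.io   # assumed name
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail   # PolicyServer creation is rejected while the controller is unreachable
    clientConfig:
      service:
        name: kubewarden-controller-webhook-service   # assumed service name
        namespace: kubewarden
        path: /validate-policyserver   # assumed path
    rules:
      - apiGroups: ["policies.kubewarden.io"]
        apiVersions: ["v1alpha2"]
        operations: ["CREATE", "UPDATE"]
        resources: ["policyservers"]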


jvanz commented Mar 7, 2022

I've updated the PR, adding a Helm chart for the Policy Server.


jvanz commented Mar 10, 2022

I'm moving this issue to blocked, due to the good arguments from @viccuad in this comment: #65 (review). So, this issue is blocked until we decide what to do with the default policies installed by the chart.


jvanz commented Mar 11, 2022

After a team conversation, I'm converting this into an epic, because we need to do other tasks before merging these changes.

jvanz added the epic label on Mar 11, 2022
jvanz changed the title from "It's not possible to reconfigure Kubewarden default policy server with helm upgrade" to "Allow users to upgrade Policy Server using Helm chart" on Mar 11, 2022
jvanz closed this as completed in #65 on Mar 21, 2022