Integrate LitmusChaos - Pod Memory Hog experiment #1938

Closed
wants to merge 59 commits into from

Contributor

@mhmohona mhmohona commented May 28, 2021

Signed-off-by: Mahfuza mhmohona@gmail.com

Related issue: #1622

What type of PR is this

/kind feature

Proposed Changes

Litmus provides a large number of experiments for testing containers, pods, and nodes, as well as specific platforms and tools. Chaos engineering can quickly surface issues that other testing layers do not easily capture, which saves time later and helps find and fix weaknesses in the system.

In this PR, I integrated a LitmusChaos experiment that tests the Kyverno pod to check its behavior.

Proof Manifests

  • Tested locally with the Alpine Docker image.
  • Pull the Docker image: docker pull ghcr.io/kyverno/kyverno:test-litmuschaos
  • Restart the Kyverno pod: kubectl -n kyverno delete pod --all
  • Run the chaos experiment: go test ./litmuschaos/pod_cpu_hog -v
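
The steps above can be collected into one script. This is only a sketch: the `run` helper, the `run_proof` function, and the `DRY_RUN` switch are illustrative names that do not exist in the PR, and the script defaults to printing the commands rather than executing them.

```bash
#!/usr/bin/env bash
# Sketch of the proof steps listed above.
# DRY_RUN, run(), and run_proof() are hypothetical helpers, not part of the PR.
set -eu

IMAGE="ghcr.io/kyverno/kyverno:test-litmuschaos"  # tag quoted in the steps above
NAMESPACE="kyverno"

# run: print the command; execute it only when DRY_RUN is set to something other than 1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

run_proof() {
  run docker pull "$IMAGE"
  run kubectl -n "$NAMESPACE" delete pod --all
  run go test ./litmuschaos/pod_cpu_hog -v
}

run_proof
```

Running it with `DRY_RUN=0` would execute the three commands against the current cluster context, so the default dry run is the safer way to review what will happen first.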

A successful test run should look like this:

image

Checklist

  • I have read the contributing guidelines.
  • I have added tests that prove my fix is effective or that my feature works.
  • My PR contains new or altered behavior to Kyverno and
    • [] I have added or changed the documentation myself in an existing PR and the link is:
    • [] I have raised an issue in kyverno/website to track the doc update and the link is:
    • [] I have read the PR documentation guide and followed the process including adding proof manifests to this PR.

@mhmohona mhmohona marked this pull request as draft May 28, 2021 15:49
@realshuting realshuting added the wip work in progress label May 28, 2021
@realshuting realshuting self-assigned this Jun 1, 2021
Member

@realshuting realshuting left a comment

@mhmohona - can you please add "proof" of this test? And convert the draft to the PR?

litmuschaos/README.md Outdated
@realshuting realshuting removed the wip work in progress label Jun 1, 2021
@mhmohona mhmohona marked this pull request as ready for review June 2, 2021 14:17
@realshuting
Member

> @mhmohona - can you please add "proof" of this test?

Is this addressed?

@realshuting realshuting changed the title WIP - Integrate LitmusChaos - Pod Memory Hog experiment Integrate LitmusChaos - Pod Memory Hog experiment Jun 2, 2021
@realshuting
Member

@mhmohona - please resolve CI errors:

Run if [ "$(gofmt -s -l . | wc -l)" -ne 0 ]
The following files were found to be not go formatted:
test/e2e/utils.go
Please run 'make fmt' to go format the above files.
Error: Process completed with exit code 1.
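
For reference, the failing check can be reproduced locally before pushing. The sketch below mirrors the logic visible in the log above; the function name `report_unformatted` is made up for illustration and is not something the CI workflow defines.

```bash
#!/usr/bin/env bash
# report_unformatted: read file names (one per line) on stdin, in the form
# printed by `gofmt -s -l .`, and fail if any file was listed.
# The function name is hypothetical; the CI job inlines equivalent logic.
report_unformatted() {
  local files
  files="$(cat)"
  if [ -n "$files" ]; then
    echo "The following files were found to be not go formatted:"
    echo "$files"
    echo "Please run 'make fmt' to go format the above files."
    return 1
  fi
  return 0
}
```

Typical use from the repository root would be `gofmt -s -l . | report_unformatted`, which prints nothing and succeeds once `make fmt` has been run.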

@mhmohona mhmohona requested a review from realshuting June 3, 2021 16:08
@realshuting
Member

@mhmohona - can you please rebase onto the main branch to pick up the fix for the e2e failure?

litmuschaos/README.md Outdated
litmuschaos/README.md Outdated
litmuschaos/README.md Outdated
Comment on lines 8 to 13
### Prerequisites
- At first, ensure that the Kyverno is running by executing `kubectl get pods` in operator namespace.If not, install from [here](https://kyverno.io/docs/installation/)
- Install Litmus Chaos operator using `make install-litmus-chaos`.
- We will change the base image soon so that the Litmuschaos tests can be run against the official images. For that, in [Dockerfile](https://github.com/kyverno/kyverno/blob/main/cmd/kyverno/Dockerfile) and [localDockerfile](https://github.com/kyverno/kyverno/blob/5dfd16ce44131c05c3867409f1edf9953e7b45c0/cmd/kyverno/localDockerfile) change `scratch` to `alpine` and execute both commands - `make docker-build-all-amd64` and `make docker-build-local-kyverno`.
- Pull the Docker image with test-litmuschaos tag ` docker pull ghcr.io/kyverno/kyverno:test-litmuschaos `.
- Restart the Kyverno pod so that new changes can be applied using `kubectl -n kyverno delete pod --all `.
Member

Let's update to:

  • Ensure that the Kubernetes version is > 1.15.
  • Ensure that Kyverno is running by executing kubectl get pods in the operator namespace (typically, kyverno). If not, install from here.
  • Update the Kyverno Deployment to use the ghcr.io/kyverno/kyverno:test-litmuschaos image. Note that this image is built specifically to run Litmuschaos experiments per this request, CHAOS_KILL_COMMAND. The official Kyverno images will adopt this soon.
  • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here.
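
The version prerequisite in the list above could be checked with a small helper. This is a sketch under stated assumptions: `version_gt` is a hypothetical function name, and it compares only the major and minor components.

```bash
#!/usr/bin/env bash
# version_gt A B: succeed when version A is strictly greater than version B.
# Only major and minor are compared; a leading "v", the patch level, and
# suffixes such as "+" are ignored. Both arguments must contain "major.minor".
# The helper is illustrative and is not part of the PR.
version_gt() {
  local a="${1#v}" b="${2#v}"
  local a_major="${a%%.*}" b_major="${b%%.*}"
  local a_rest="${a#*.}" b_rest="${b#*.}"
  local a_minor="${a_rest%%[!0-9]*}" b_minor="${b_rest%%[!0-9]*}"

  if [ "$a_major" -ne "$b_major" ]; then
    [ "$a_major" -gt "$b_major" ]
    return
  fi
  [ "$a_minor" -gt "$b_minor" ]
}
```

One way to feed it the server version, assuming `jq` is installed, is `version_gt "$(kubectl version -o json | jq -r '"\(.serverVersion.major).\(.serverVersion.minor)"')" 1.15`. The pod checks in the same list are plain `kubectl -n kyverno get pods` and `kubectl -n litmus get pods` calls, using the namespaces the list calls typical.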

Makefile Outdated
Comment on lines 255 to 256
install-litmus-chaos:
kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.13.5.yaml
Member

Can we remove this and add the instruction in "Prerequisites"? Otherwise we have to maintain the versions.

@realshuting
Member

@mhmohona - can you please complete the DCO by signing your commits?
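
As a rough illustration of what the DCO request is looking for, the helper below tests whether a commit message carries a sign-off trailer. The name `has_signoff` is hypothetical, and the pattern is a simplification of what the actual DCO check enforces.

```bash
#!/usr/bin/env bash
# has_signoff: succeed when the commit message on stdin contains a
# "Signed-off-by: Name <email>" trailer line.
# Hypothetical helper; the real DCO check may apply stricter rules.
has_signoff() {
  grep -Eq '^Signed-off-by: .+ <[^<>]+@[^<>]+>$'
}
```

For the latest commit this could be used as `git log -1 --format=%B | has_signoff`. A missing trailer can be added with `git commit --amend --signoff --no-edit` for a single commit, or `git rebase --signoff main` for every commit on the branch, followed by `git push --force-with-lease`.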

mhmohona and others added 14 commits June 15, 2021 02:11
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* Add: resources for initContainers

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Update: increase memory limit for init container

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Add: init container resources

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Fix: kustomize CRD

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* removed additionalProperties from policy schema

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added test cases

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
… not exist (kyverno#1881)

* Improved testing to allow 'skip' status and fail if tested results do not exist

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Ensure exit 0 is seen as failure when should be failure

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
… policyreports (kyverno#1897)

Signed-off-by: Yashvardhan Kukreja <yash.kukreja.98@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* Pass by value in policy cache

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* Removes check for strategicMergePatch in forceMutate

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* Removes failed test

Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Jim Bugwadia <jim@nirmata.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Metzger, Simon <smnmtzgr@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* Fix Dev setup

* fix GVK Issue for policy cache

Co-authored-by: vyankatesh <vyankatesh@neualto.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* bump swagger doc to 1.21.0

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* stores openapi schema by gvk

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* fix schema validation in CLI

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add missing resource lists

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add e2e tests

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* address review doc comments

Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* fixed {{@}} behavior

Signed-off-by: Max Goncharenko <kacejot@fex.net>

* removed white space from test

Signed-off-by: Max Goncharenko <kacejot@fex.net>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
mhmohona and others added 27 commits June 15, 2021 02:12
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Marcel Mueller <marcel.mueller1@rwth-aachen.de>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* Add: Recommanded Kubernetes labels

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Add: feature to add custom labels to resources metadata

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Add: manage labels with Kustomize

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Add: app label

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Add: app label for chart

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Update: make kustomize-crds

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Update: refactoring labels

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Fix: clean kustomize code

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Fix: typo

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Update: application version v1.3.6

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>

* Update: version v1.3.6

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
While trying out the tutorial found a recent change that caused the tutorial to not work.

```bash
$ k create -f install.yaml
namespace/kyverno created
customresourcedefinition.apiextensions.k8s.io/clusterpolicies.kyverno.io created
customresourcedefinition.apiextensions.k8s.io/clusterpolicyreports.wgpolicyk8s.io created
customresourcedefinition.apiextensions.k8s.io/clusterreportchangerequests.kyverno.io created
customresourcedefinition.apiextensions.k8s.io/generaterequests.kyverno.io created
customresourcedefinition.apiextensions.k8s.io/policies.kyverno.io created
customresourcedefinition.apiextensions.k8s.io/policyreports.wgpolicyk8s.io created
customresourcedefinition.apiextensions.k8s.io/reportchangerequests.kyverno.io created
serviceaccount/kyverno-service-account created
clusterrole.rbac.authorization.k8s.io/kyverno:admin-policies created
clusterrole.rbac.authorization.k8s.io/kyverno:admin-policyreport created
clusterrole.rbac.authorization.k8s.io/kyverno:admin-reportchangerequest created
clusterrole.rbac.authorization.k8s.io/kyverno:customresources created
clusterrole.rbac.authorization.k8s.io/kyverno:generatecontroller created
error: error validating "install.yaml": error validating data: ValidationError(ClusterRole.metadata): unknown field "app"
if you choose to ignore these errors, turn validation off with --validate=false
```

Signed-off-by: William Montgomery <wmontgomery@apexclearing.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
… release/install.yaml (kyverno#1945)

Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* added sample test

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* case: when creating the new namespace without the label, there should not have any generated resource

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* case: when adding the matched label to the namespace, the target resource should be generated

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* removing comments

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* trying to check updated network policy

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* case: when synchronize flag is set to true in the policy, one cannot delete the generated resource

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* trying to check updated generate policy

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* case: change synchronize to false in the policy, the label in generated resource should be updated to policy.kyverno.io/synchronize: disable

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* case: when changing the content in generate.data, the change should be synced to the generated resource

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added comments

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* case: with synchronize==false, one should be able to delete the generated resource

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* handling error

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added retrying

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* minor e2e fixes

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* e2e fixes

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added logs of mutate error

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* printing configmap

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* printing configmap using BY

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* removing print statements

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* print configmap name

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* printing complete configmap

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: RinkiyaKeDad <arshsharma461@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: RinkiyaKeDad <arshsharma461@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Yashvardhan Kukreja <yash.kukreja.98@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* fixed generate flow

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added test for generate policy with clone

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* small conflict fix

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* print logs for e2e

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* changing log level

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added wait while creating policy

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* remove log level from e2e

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* added a clusterpolicy check while creating a namespaced resource in e2e tests

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* updated the github_action name for e2e tests

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* changing waiting time to 1 sec

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* remove log

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

Co-authored-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
…cy_milliseconds and a small fix (kyverno#1970)

Signed-off-by: Yashvardhan Kukreja <yash.kukreja.98@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
…o#1914)

* Fix Dev setup

* Update variable paths

* fix testcase issue

Co-authored-by: vyankatesh <vyankatesh@neualto.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
This fails on openshift since we cannot specify users within this range. Also, this template should be as close as possible to the vanilla manifest for deployment https://github.com/kyverno/kyverno/blob/main/definitions/release/install.yaml

Vanilla manifest omits the user specification https://github.com/kyverno/kyverno/blob/main/definitions/release/install.yaml#L2478

Signed-off-by: Waleed Malik <ahmedwaleedmalik@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
* Fix Dev setup

* webhook monitor - start webhook monitor in main process

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add leaderelection

Signed-off-by: Jim Bugwadia <jim@nirmata.com>

* - add isLeader; - update to use configmap lock

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* - add initialization method - add methods to get attributes

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* address comments

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* remove newContext in runLeaderElection

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add leader election to GenerateController

Signed-off-by: Jim Bugwadia <jim@nirmata.com>

* skip processing for non-leaders

Signed-off-by: Jim Bugwadia <jim@nirmata.com>

* skip processing for non-leaders

Signed-off-by: Jim Bugwadia <jim@nirmata.com>

* add leader election to generate cleanup controller

Signed-off-by: Jim Bugwadia <jim@nirmata.com>

* Gracefully drain request

* HA - Webhook Register / Webhook Monitor / Certificate Renewer (kyverno#1920)

* enable leader election for webhook register

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* extract certManager to its own process

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* leader election for cert manager

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* certManager - init certs by the leader

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add leader election to webhook monitor

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* update log message

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add leader election to policy controller

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add leader election to policy report controller

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* rebuild leader election config

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* start informers in leaderelection

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* start policy informers in main

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* enable leader election in main

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* move eventHandler to the leader election start method

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* address reviewdog comments

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* add clusterrole leaderelection

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* fixed generate flow (kyverno#1936)

Signed-off-by: NoSkillGirl <singhpooja240393@gmail.com>

* - init separate kubeclient for leaderelection - fix webhook monitor

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* address reviewdog comments

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* cleanup Kyverno managed resources on stopLeading

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* tag v1.4.0-beta1

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* fix cleanup process on Kyverno stops

Signed-off-by: Shuting Zhao <shutting06@gmail.com>

* bump kind to 0.11.0, k8s v1.21 (kyverno#1980)

Co-authored-by: vyankatesh <vyankatesh@neualto.com>
Co-authored-by: vyankatesh <vyankateshkd@gmail.com>
Co-authored-by: Jim Bugwadia <jim@nirmata.com>
Co-authored-by: Pooja Singh <36136335+NoSkillGirl@users.noreply.github.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Shuting Zhao <shutting06@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
…o#1935)

* change min support kubernetes version to 1.16 for kyverno 1.4

Signed-off-by: vineethvanga18 <reddy.8@iitj.ac.in>

* migrate deployment to apps/v1

Signed-off-by: vineethvanga18 <reddy.8@iitj.ac.in>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
…#1978)

* Fix Dev setup

* Fix mutate policies kept applying to these terminating Pods

* fix patch resource issue

Co-authored-by: vyankatesh <vyankatesh@neualto.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: vineethvanga18 <reddy.8@iitj.ac.in>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
…rvice (kyverno#1988)

* Allow metrics service annotations to be defined separate from main service

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Add test for metrics during Helm deployment testing

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Make services separate for kustomize

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Run 'make kustomize-crd'

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Fix e2e tests for metrics

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Fix Helm chart for metrics service

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>

* Fix helm chart testing

Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
Signed-off-by: Mahfuza Humayra Mohona <mhmohona@gmail.com>
@mhmohona
Contributor Author

Reopened it in #2014

@mhmohona mhmohona closed this Jun 14, 2021