diff --git a/website/content/docs/proposals/calling-benchmark-job.png b/website/content/docs/proposals/calling-benchmark-job.png
new file mode 100644
index 0000000..13b150e
Binary files /dev/null and b/website/content/docs/proposals/calling-benchmark-job.png differ
diff --git a/website/content/docs/proposals/green-reviews-pipeline-components.png b/website/content/docs/proposals/green-reviews-pipeline-components.png
new file mode 100644
index 0000000..abc2a69
Binary files /dev/null and b/website/content/docs/proposals/green-reviews-pipeline-components.png differ
diff --git a/website/content/docs/proposals/pipeline-run.png b/website/content/docs/proposals/pipeline-run.png
new file mode 100644
index 0000000..b4c43a9
Binary files /dev/null and b/website/content/docs/proposals/pipeline-run.png differ
diff --git a/website/content/docs/proposals/proposal-002-run.md b/website/content/docs/proposals/proposal-002-run.md
index c6c6e27..b34a2f3 100644
--- a/website/content/docs/proposals/proposal-002-run.md
+++ b/website/content/docs/proposals/proposal-002-run.md
@@ -67,9 +67,9 @@ It is helpful to frame this to answer the question: "What is the problem this
 proposal is trying to solve?"
 -->
 
-This proposal is part of the pipeline automation of the Green Review for Falco. Currently, we are using Flux to watch the upstream Falco repository and run the benchmark tests constantly. For example, [this benchmark test](https://github.com/falcosecurity/cncf-green-review-testing/blob/main/kustomize/falco-driver/ebpf/stress-ng.yaml#L27-L32) is setup as a Kubernetes Deployment that runs an endless loop of [`stress-ng`](https://wiki.ubuntu.com/Kernel/Reference/stress-ng), which applies stress to the kernel. Instead, this proposal aims to provide a solution for how to deploy the benchmark tests only when they are needed.
+This proposal is part of the pipeline automation of the Green Reviews tooling for Falco (and new CNCF projects in the future). Currently, we are using Flux to watch the upstream Falco repository and run the benchmark tests constantly. For example, [this benchmark test](https://github.com/falcosecurity/cncf-green-review-testing/blob/main/kustomize/falco-driver/ebpf/stress-ng.yaml#L27-L32) is set up as a Kubernetes Deployment that runs an endless loop of [`stress-ng`](https://wiki.ubuntu.com/Kernel/Reference/stress-ng), which applies stress to the kernel. Instead, this proposal aims to provide a solution for deploying the benchmark tests only when they are needed.
 
-Secondly, automating the way we run benchmark tests in this pipeline will help to make it easier and faster to add new benchmark tests. It will enable both the WG Green Reviews and CNCF Project Maintainers to come up with new benchmark tests and run them to get feedback faster.
+Secondly, automating the way we run benchmark tests in this pipeline will make it easier and faster to add new benchmark tests. It will enable both the WG Green Reviews and CNCF project maintainers to come up with new benchmark tests and run them to get feedback faster.
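+
+For context, the shape of such an always-running benchmark manifest is sketched below. This is a minimal illustration only; the container image and `stress-ng` flags are placeholders, not the actual values from the upstream manifest linked above.
+
+```yaml
+# illustrative sketch of a Deployment that stresses the kernel in an endless loop
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: stress-ng-benchmark
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: stress-ng-benchmark
+  template:
+    metadata:
+      labels:
+        app: stress-ng-benchmark
+    spec:
+      containers:
+        - name: stress-ng
+          image: alexeiled/stress-ng # illustrative image, not the upstream one
+          command: ["/bin/sh", "-c"]
+          # endless loop: the benchmark never terminates on its own
+          args:
+            - while true; do stress-ng --cpu 1 --timeout 30s; sleep 30; done
+```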
 
 ### Goals
 
@@ -78,8 +78,8 @@ List the specific goals of the proposal. What is it trying to achieve? How will
 know that this has succeeded?
 -->
 
- - Describe the actions to take immediately after the trigger from Proposal 1 https://github.com/cncf-tags/green-reviews-tooling/issues/84
- - Describe how the pipeline should _fetch_ the benchmark tests either from this repository (`cncf-tags/green-reviews-tooling`) or from an upstream repository (Falco's [`falcosecurity/cncf-green-review-testing`](https://github.com/falcosecurity/cncf-green-review-testing)).
+ - Describe the actions to take immediately after the trigger from [Proposal 1](https://github.com/cncf-tags/green-reviews-tooling/issues/84)
+ - Describe how the pipeline should _fetch_ the benchmark tests either from this repository (`cncf-tags/green-reviews-tooling`) or from an upstream repository (e.g. Falco's [`falcosecurity/cncf-green-review-testing`](https://github.com/falcosecurity/cncf-green-review-testing)).
 - Describe how the pipeline should _run_ the benchmark tests through GitHub Actions for a specific project e.g. Falco
 - Communicate the changes needed to be made by the Falco team to change the benchmark test to a GitHub Action file.
 - Provide _modularity_ for the benchmark tests.
@@ -118,7 +118,7 @@ implementation. The "Design Details" section below is for
 the real nitty-gritty.
 -->
 
-### User Stories (Optional)
+### User Stories
 
-As a CNCF Project Maintainer,
-I want to run a benchmark test for a specific CNCF Project **from a separate GitHub repository**,
-So that it runs in the Green Review pipeline and so that I can see the project's sustainability metrics.
+**CNCF project maintainer creates a new benchmark test for their project**
+
+As a project maintainer, I create and run a benchmark test from a separate GitHub repository or from the `green-reviews-tooling` repository, by following the steps indicated in the Green Reviews documentation, so that the project's benchmark tests run in the Green Reviews pipeline and so that I can see the project's sustainability metrics.
+
+**Green Reviews maintainer helps to create a new benchmark test for a specific CNCF project**
+
+As a Green Reviews maintainer, I can help CNCF project maintainers to define the Functional Unit of a project so that the project maintainers can create a benchmark test.
+
+**Project maintainer modifies or removes a benchmark test**
+
+As a project maintainer, I can edit or remove a benchmark test if it is in a repository owned by the CNCF project itself, or, if it is in the Green Reviews repository, by making a pull request with the changes.
 
-As a CNCF Project Maintainer,
-I want to run a benchmark test for a specific CNCF Project **to this GitHub repository**,
-So that it runs in the Green Review pipeline and so that I can see the project's sustainability metrics.
+As with every design document, there is a risk that the solution does not cover all cases, especially considering that at Green Reviews we have only had one test case (a much-appreciated guinea pig 🙂), which is Falco. This proposal needs to at least support that case. When other CNCF projects start using Green Reviews we will learn more and adapt the project as needed.
+
+Another risk is the question of ownership between Green Reviews contributors and CNCF project maintainers. As much as possible, Green Reviews contributors should empower project maintainers to create the benchmark tests. This does not stop Green Reviews maintainers from creating new benchmark tests. However, project maintainers should be encouraged to lead this work; otherwise, it adds load on Green Reviews maintainers and is hard to scale.
+Project maintainers are best placed to lead since they have the expertise on how their project scales and performs under load.
 
 
 ## Design Details
 
-There are different components defined here.
+The Green Reviews automated pipeline is composed of reusable GitHub Action workflows that modularise the different moving parts. A workflow runs one or more jobs, and each job runs one or more steps. It may help to first read the documentation on [GitHub Action workflows](https://docs.github.com/en/actions/using-workflows/about-workflows) and especially [Reusing workflows](https://docs.github.com/en/actions/using-workflows/reusing-workflows), which explain these concepts well. The section on [Calling a reusable workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows#calling-a-reusable-workflow) describes a concept that is referenced later in this proposal.
+
+### Definitions
+
+There are different components defined here and shown in the following diagram.
+
+![Green Reviews pipeline components](green-reviews-pipeline-components.png "Green Reviews pipeline components")
 
-* **Trigger Pipeline**: This refers to the initial GitHub Action Workflow. The name is up to debate.
-* **Project Pipeline**: There is one project pipeline per CNCF Project. A project pipeline should be able to find a list of jobs and run them one after the other. One pipeline can contain one or multiple Jobs.
-* **Job**: A Job refers to a GitHub Action Job. This is essentially the technical implementation of the benchmark test definition. It is what contains the test instructions.
-* **Test instructions**: This refers to the actual benchmark test steps that should run on the cluster. These are usually related to the tool's Functional Unit as defined by the SCI.
+1. **Green Reviews pipeline**: The Continuous Integration pipeline which deploys a CNCF project to a test cluster, runs a set of benchmarks while measuring carbon emissions, and stores the results. It is implemented by the workflows listed below.
+2. **Cron workflow**: The initial GitHub Action workflow (described in [Proposal 1](https://github.com/cncf-tags/green-reviews-tooling/issues/84)), which dispatches a project workflow (see next definition).
+3. **Project workflow**: The project workflow is dispatched by the cron workflow; a Falco workflow is one example. It deploys the project, calls the benchmark workflow (see below), and then cleans up the deployment. A project workflow can be dispatched more than once if there are multiple subcomponents. Since a project workflow is just another GitHub Action workflow, it contains a list of GitHub Action Jobs: the Benchmark Jobs described next.
+4. **[new] Benchmark Job**: An instance of a GitHub Action Job within the project workflow, and the main concern of this proposal. The benchmark job runs the benchmark tests of a CNCF project; which benchmark test to run is defined by inputs in the calling workflow: a CNCF project and a subcomponent. It is essentially the technical implementation of the benchmark test definition and contains the test instructions. It is described in Running & Defining Benchmark Jobs below.
+5. **[new] Benchmark workflow**: A separate workflow manifest containing the Benchmark Instructions.
+6. **[new] Benchmark Instructions**: The actual benchmark test steps that should run on the cluster. These are usually related to the tool's Functional Unit as defined by the SCI. They are described further in Running & Defining Benchmark Jobs.
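+
+As a minimal sketch of the calling mechanism (the file names, inputs and refs below are illustrative assumptions, not final interfaces), the cron workflow could dispatch a project workflow like this:
+
+```yaml
+# .github/workflows/cron.yaml (illustrative)
+name: cron
+on:
+  schedule:
+    - cron: "0 6 * * 1" # example schedule
+jobs:
+  dispatch-falco:
+    # call the reusable project workflow with the project and subcomponent inputs
+    uses: cncf-tags/green-reviews-tooling/.github/workflows/falco.yaml@main
+    with:
+      project: falco
+      subcomponent: ebpf
+```
+
+and the project workflow would declare itself callable with those inputs:
+
+```yaml
+# .github/workflows/falco.yaml (illustrative)
+name: falco-project-workflow
+on:
+  workflow_call:
+    inputs:
+      project:
+        type: string
+        required: true
+      subcomponent:
+        type: string
+        required: true
+jobs:
+  deploy-and-benchmark:
+    runs-on: ubuntu-latest
+    steps:
+      - run: echo "deploy ${{ inputs.project }} (${{ inputs.subcomponent }}), then run the benchmark job"
+```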
 
-The next steps describe how to deploy a benchmark test to the cluster, taking Falco & [stress-ng](https://wiki.ubuntu.com/Kernel/Reference/stress-ng) as an example.
 
-### Defining Jobs & how to run them from the project pipeline
+### How the project workflow calls the benchmark job
 
-This section dives deeper into the following:
-* How a Job is related to a benchmark test
-* How a Job should be called from the project pipeline
+When the project workflow starts, it deploys the project on the test environment and then runs the benchmark job. For modularity and clarity, the benchmark test instructions can be defined in 3 different ways:
+
+1. As a Job with in-line instructions/steps
+2. As a Job that calls another GitHub Action workflow (yes, yet another workflow 🙂) that contains the instructions. The workflow can be either:
+   1. In the Green Reviews WG repository
+   2. In a separate repository
+
+The three options for defining a benchmark test are illustrated below.
+
+![Calling the benchmark job](calling-benchmark-job.png "Calling the Benchmark job")
+
+### Running & Defining Benchmark Jobs
+
+This section defines Benchmark Jobs (referred to as Jobs) and benchmark instructions, and describes how to run them from the project workflow. It dives deeper into the following:
+
+* How a Job should be called from the project workflow
 * What a Job must contain in order to run on the cluster
+* How a Job is related to benchmark instructions
 
-At a bare minimum, the benchmark test must contain test instructions of what should run in the Kubernetes cluster. Below are some examples of different benchmark test setups.
-For example, Falco has define their `stress-ng` test in a Deployment manifest which is ready to be applied to a cluster. This Deployment manifest can be applied in a similar way to how Falco was deployed to the cluster, using `kubectl`, for example. This will help to not have a benchmark test running 24/7 on the cluster.
+At a bare minimum, the benchmark test must contain test instructions of what should run in the Kubernetes cluster. For example, the Falco project maintainers have identified that one way to test the Falco project is through a test that runs `stress-ng` for a given period of time. The test instructions are contained in a Deployment manifest which can be directly applied to the benchmark test cluster using `kubectl`.
 
-Below is an example of the simplest benchmark test definition as part of a GitHub Action.
-The pipeline:
+Below are 3 benchmark test use cases and their implementations.
+
+**Use Case 1: The Benchmark Job contains the test instructions in-line**
+
+Below is an example of a self-contained benchmark job with instructions defined in-line.
 ```
-# pipeline.yaml
+# falco-ebpf-project-workflow.yaml
 jobs:
   # first, must authenticate to the Kubernetes cluster
-  call-workflow-in-current-repo: # this is a Job
-    uses: octo-org/current-repo/.github/workflows/workflow.yml@v1 # refers to Job contained in the current repository
+  example_falco_test:
+    runs-on: ubuntu-latest
+    steps:
+      - run: |
+          # depends on the Functional Unit of the CNCF project
+          # apply the manifest with the stress-ng loop, wait, then clean up
+          kubectl apply -f https://raw.githubusercontent.com/falcosecurity/cncf-green-review-testing/main/kustomize/falco-driver/ebpf/stress-ng.yaml
+          sleep 15m
+          kubectl delete -f https://raw.githubusercontent.com/falcosecurity/cncf-green-review-testing/main/kustomize/falco-driver/ebpf/stress-ng.yaml
 ```
 
-We assume that the workflow already contains a kubeconfig to authenticate with the test cluster and Falco has already been deployed to it. It is required that the pipeline authenticates with the Kubernetes cluster before running the job with the test.
+We assume that the workflow already contains a secret with a kubeconfig to authenticate with the test cluster (see for example here) and that Falco has already been deployed to it. The pipeline must authenticate with the Kubernetes cluster before running the job with the test.
+
+The job applies the upstream Kubernetes manifest, which contains the test instructions for Falco: a `while` loop that runs `stress-ng`. The functional unit test is time-bound in this case and scoped to 15 minutes. Therefore, we deploy this test, wait for 15 minutes, then delete the manifest to end the loop. The action to take in the Job depends on the Functional Unit of the CNCF project.
+
+For other CNCF projects, the test instructions could look different. For example, for Flux and ArgoCD, the benchmark tests defined [here](https://github.com/nikimanoledaki/gitops-energy-tests/blob/main/experiments/01-first-scenario.md#deploy-the-application-using-flux-cd) test these CNCF projects by using their CLI and performing CRUD operations on a demo workload. The functional units of Flux and ArgoCD are different from Falco's. For these GitOps tools, the functional unit depends on whether there are changes detected that they need to reconcile. In other words, the GitHub Action Job could do different things depending on how the CNCF project should be tested.
+
+Lastly, it is important that the test runs in a way that is highly configurable. For example, the job can set parameters to indicate where the test should run, e.g. which namespace to use.
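+
+As an illustration, the benchmark workflow could expose such parameters as `workflow_call` inputs. The input name and default below are assumptions, not an agreed interface.
+
+```yaml
+# illustrative: a parameterised benchmark workflow
+on:
+  workflow_call:
+    inputs:
+      namespace:
+        type: string
+        default: benchmark
+jobs:
+  example_falco_test:
+    runs-on: ubuntu-latest
+    steps:
+      - run: |
+          # run the test in the namespace chosen by the calling workflow
+          kubectl apply -n ${{ inputs.namespace }} -f https://raw.githubusercontent.com/falcosecurity/cncf-green-review-testing/main/kustomize/falco-driver/ebpf/stress-ng.yaml
+          sleep 15m
+          kubectl delete -n ${{ inputs.namespace }} -f https://raw.githubusercontent.com/falcosecurity/cncf-green-review-testing/main/kustomize/falco-driver/ebpf/stress-ng.yaml
+```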
 
+**Use Case 2: The Benchmark Job calls a Benchmark workflow with instructions defined in the same repository**
 
-The job applies the upstream Kubernetes manifest:
+The project workflow's benchmark job calls a separate workflow which is located in the Green Reviews repository.
+
+```yaml
+# falco-ebpf-project-workflow.yaml
+jobs:
+  # first, must authenticate to the Kubernetes cluster
+  # this is a Job that calls a separate workflow
+  call-workflow-in-current-repo:
+    uses: octo-org/current-repo/.github/workflows/workflow.yml@v1 # refers to a workflow contained in the current repository
+```
+
+The separate workflow contains the benchmark test instructions.
+
-```
+```yaml
 # stress-ng-falco-test.yaml
 jobs:
   example_falco_test:
     runs-on: ubuntu-latest
     steps:
       - run: |
-          # the action to take here depends on the Functional Unit of the CNCF Project. wait for amount of time, for resources
+          # the action to take here depends on the Functional Unit of the CNCF project:
+          # wait for an amount of time, for resources
           kubectl apply -f https://raw.githubusercontent.com/falcosecurity/cncf-green-review-testing/main/kustomize/falco-driver/ebpf/stress-ng.yaml
           sleep 15m
           kubectl delete -f https://raw.githubusercontent.com/falcosecurity/cncf-green-review-testing/main/kustomize/falco-driver/ebpf/stress-ng.yaml
 ```
 
-The Kubernetes manifest contains the test instructions for Falco. It contains a `while` loop that runs `stress-ng`. The functional unit test is time-bound in this case and scoped to 15 minutes. Therefore, we deploy this test, wait for 15 minutes, then delete the manifest to end the loop. The action to take in the Job depends on the Functional Unit of the CNCF Project.
-
-In the example above, the Kubernetes manifest is located in a different repository. This is due to unique circumstances and limitations within the Falco project. To work around this, the Job points to the location of the `falcosecurity/cncf-green-review-testing` repository. The pipeline should accomodate different CNCF Projects. If CNCF Project Maintainers are able to contribute to `green-reviews-tooling`, then they are welcome to store test artefacts in the `green-reviews-tooling` repository. There is also nothing stopping us from having a Kubernetes manifest inline.
+In the example above, the Kubernetes manifest is located in a different repository. This is due to unique circumstances and limitations within the Falco project. To work around this, the Job points to the location of the `falcosecurity/cncf-green-review-testing` repository. The pipeline should accommodate different CNCF projects. If CNCF project maintainers are able to contribute to `green-reviews-tooling`, then they are welcome to store test artefacts in the `green-reviews-tooling` repository. There is also nothing stopping us from having a Kubernetes manifest inline.
 
-For other CNCF Projects, the test instructions could look different. For example, for Flux and ArgoCD, the benchmark tests defined [here](https://github.com/nikimanoledaki/gitops-energy-tests/blob/main/experiments/01-first-scenario.md#deploy-the-application-using-flux-cd) test these CNCF Projects by using their CLI and performing CRUD operations on a demo workload. The functional unit of Flux and ArgoCD are different to Flux. For these GitOps tools, the functional unit depends on whether there are changes detected that they need to reconcile. In other words, the GitHub Action Job could do different things depending on how the CNCF Project should be tested.
-
-Lastly, it is important that the test should run in a way that is highly configurable. For example, the job can configure parameters to indicate where the test should run e.g. define which namespace.
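+
+Note that when the called workflow lives in the same repository as the caller, GitHub Actions also supports referencing it by a relative path instead of the `owner/repo@ref` form; the workflow then runs from the caller's own commit. For example:
+
+```yaml
+jobs:
+  call-local-workflow:
+    # same-repository reusable workflow, referenced by path (no ref needed)
+    uses: ./.github/workflows/stress-ng-falco-test.yaml
+```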
 
-#### Job defined in upstream Kubernetes manifests
+**Use Case 3: The Benchmark Job calls a Benchmark workflow with instructions defined in a different repository**
 
-We want to accomodate different methods of setting up the tests depending on the CNCF Project. Given this, the Job containing the test instructions could be defined in a different repository. In this case, the pipeline could call the job like this:
+We want to accommodate different methods of setting up the tests depending on the CNCF project. Given this, the Job containing the test instructions can be defined in a different repository. In this case, the pipeline could call the job like this:
 
-```
-# pipeline.yaml
+```yaml
+# falco-ebpf-project-workflow.yaml
 jobs:
   call-workflow-in-another-repo:
     uses: octo-org/another-repo/.github/workflows/workflow.yml@v1 # refer to other repo
@@ -238,21 +294,9 @@ jobs:
 
 A Job can be fetched from other GitHub organizations and repositories using the `jobs.<job_id>.uses` syntax defined [here](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_iduses). This syntax can be configured to use `@main` or `@another-branch`, which would be nice for versioning and testing specific releases.
 
-TODO:
-* From the trigger pipeline, go to the Project Pipeline to run the tests of a specific CNCF Project.
-* Directory structure for CNCF Projects, their pipeline/Jobs, etc. Collecting the Jobs & categorising them in "libraries"?
-* Cleaning up test artefacts
-* Creating a template for how to "register" a benchmark test with everything that is needed and clear instructions on how to add a new one. Audience is primarily CNCF Project Maintainers.
-
-### Graduation Criteria (Optional)
+![Pipeline run](pipeline-run.png "An example pipeline run")
-
 ## Drawbacks (Optional)
 
@@ -278,3 +322,9 @@ Use this section if you need things from the project/SIG. Examples include
 a new subproject, repos requested, or GitHub details. Listing these here allows a SIG
 to get the process for these resources started right away.
 -->
+
+TODO:
+* From the trigger pipeline, go to the project workflow to run the tests of a specific CNCF project. → I suppose this is covered between Proposal 1 and this proposal. The only thing we didn't dig into is the subcomponents.
+* Directory structure for CNCF projects, their pipeline/Jobs, etc. Collecting the Jobs & categorising them in "libraries"? → We should specify a standard way to store the tests in directories (where they go and what the directory structure is), in line with the subcomponents in Falco.
+* Cleaning up test artefacts → Oops, not yet there either 😛
+* Creating a template for how to "register" a benchmark test with everything that is needed and clear instructions on how to add a new one. The audience is primarily CNCF project maintainers.