KEP-32: Community Repository Management #1574

Merged
merged 9 commits into from
Jul 17, 2020
117 changes: 117 additions & 0 deletions keps/0032-community-repository-management.md
@@ -0,0 +1,117 @@
---
kep-number: 32
title: Community Repository Management
short-desc: Details on how to add operators to the community repository
authors:
- "@nfnt"
owners:
- "@nfnt"
creation-date: 2020-07-02
last-updated: 2020-07-15
status: provisional
see-also:
- KEP-10
- KEP-15
---

# Community Repository Management

## Table of Contents

- [Community Repository Management](#community-repository-management)
  - [Table of Contents](#table-of-contents)
  - [Summary](#summary)
  - [Motivation](#motivation)
    - [Goals](#goals)
    - [Non-Goals](#non-goals)
  - [Proposal](#proposal)
    - [Risks and Mitigations](#risks-and-mitigations)
  - [Graduation Criteria](#graduation-criteria)
  - [Implementation History](#implementation-history)

## Summary

By default, every KUDO deployment installs operator packages from the community repository. As we encourage the KUDO community to add new operator packages to this repository, this KEP defines a workflow to add new operator packages to the community repository.

## Motivation

Currently, most operator packages in the community repository are provided by the `kudobuilder/operators` Git repository. This Git repository contains package definitions, documentation and sometimes tests for each operator. While it is convenient to have all of this in a single repository, it is challenging for larger or third-party operators:

- Large operators usually provide their own Docker images and/or additional dependencies. These have to be built, tested and deployed. The tools for this can't be provided by `kudobuilder/operators`.
- Some operator tests cannot be covered by the test tooling provided in `kudobuilder/operators` making it necessary to host the tests in a separate Git repository.
- Providing multiple versions of an operator is only possible by using separate folders. Teams developing operators might prefer different workflows (e.g. Git tags).

These requirements make it necessary for operator packages to be developed in a separate Git repository. As a result, the operator packages in `kudobuilder/operators` are copies of specific versions of the respective upstream Git repository. This approach has challenges as well, because bugs discovered in an operator need to be resolved in the upstream Git repository, not in `kudobuilder/operators`. That is, there needs to be metadata linking each package to its upstream Git repository. Once such metadata describes upstream operator package sources, it can also describe other properties of a package, e.g. the maturity level.

### Goals

- Provide a simple workflow for upstream operator packages to get added to the community repository
- Remove the need to "host" copies of upstream operator packages in a Git repository
- Provide a mechanism to define upstream operator package sources

### Non-Goals

- Change the practice of creating a PR against a Git repository to add operator packages to the community repository
- Change operator package management as defined in [KEP-10](0010-package-manager.md) and [KEP-15](0015-repository-management.md)
- Integration with [ArtifactHub](https://artifacthub.io/)

## Proposal

Upstream operator developers still create PRs against a Git repository to add their operator packages to the community repository. This Git repository lists references to upstream operator packages instead of a full copy of an operator package. A reference can point to a Git repository or a package tarball. A version of an operator package is described by a specific tag of a Git repository or a URL pointing to an operator tarball of that release.

For example, consider an operator package developed at `github.com/example/example-operator` that has tagged operator versions `1.0.0`, `1.1.0` and the operator package in the `operator` folder. To add or update this operator package, the developers would create a PR referencing their upstream Git repository and the specific version, e.g. by adding a file `example-operator.yaml` like the following:

```yaml
apiVersion: index.kudo.dev/v1alpha1
kind: Operator
name: Example Operator
gitSources:
  - name: git-repo
    url: github.com/example/example-operator.git
versions:
  - appVersion: "1.0.0"
    operatorVersion: "1.0.0"
    git:
      source: git-repo
      tag: "1.0.0_1.0.0"
      directory: operator
  - appVersion: "1.1.0"
    operatorVersion: "1.0.0"
    git:
      source: git-repo
      tag: "1.1.0_1.0.0"
      directory: operator
```
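
To make the shape of such a reference file concrete, here is a minimal Python sketch of the data model the example implies. The class names (`GitSource`, `GitRef`, `Version`, `OperatorReference`) and the `unknown_sources` helper are hypothetical and not part of KUDO; only the field names mirror the YAML keys. The helper checks one plausible consistency rule, assumed here rather than stated in the KEP: every `git.source` must name an entry declared under `gitSources`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GitSource:
    name: str
    url: str


@dataclass
class GitRef:
    source: str      # expected to match the name of a declared GitSource
    tag: str
    directory: str


@dataclass
class Version:
    operator_version: str
    app_version: Optional[str] = None   # the application version is optional
    git: Optional[GitRef] = None


@dataclass
class OperatorReference:
    name: str
    git_sources: List[GitSource] = field(default_factory=list)
    versions: List[Version] = field(default_factory=list)


def unknown_sources(ref: OperatorReference) -> List[str]:
    """Return git source names that versions use but `gitSources` never declares."""
    declared = {s.name for s in ref.git_sources}
    used = {v.git.source for v in ref.versions if v.git is not None}
    return sorted(used - declared)


# The reference file from the example, written out as Python objects.
ref = OperatorReference(
    name="Example Operator",
    git_sources=[GitSource("git-repo", "github.com/example/example-operator.git")],
    versions=[
        Version(operator_version="1.0.0", app_version="1.0.0",
                git=GitRef("git-repo", "1.0.0_1.0.0", "operator")),
        Version(operator_version="1.0.0", app_version="1.1.0",
                git=GitRef("git-repo", "1.1.0_1.0.0", "operator")),
    ],
)
print(unknown_sources(ref))  # prints [] because both versions use "git-repo"
```

Declaring the Git source once and referring to it by name keeps the repository URL in a single place, which is presumably why the example separates `gitSources` from `versions`.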

Member

Should we be responsible for compiling the operator as part of the KUDO community repo, or should we be referencing releases that are built and hosted on GitHub (e.g. like Krew does https://github.com/kubernetes-sigs/krew-index/blob/master/plugins/kudo.yaml)?

By referencing a release, we could eventually support various release objects (e.g. helm charts/Docker Images/OCI Artifact Specs/etc) rather than requiring the internal objects to build those objects.

Member Author

It depends on what "compiling" should cover. We should be responsible for creating tarballs from operator package folders because these tarballs are specific to repositories and operator developers shouldn't keep them as part of their Git repositories. Though we could also support these tarballs directly if developers want to do that.

Member

Agreed. Creating the tarballs ourselves also makes sure that they're not changed afterwards.

Member

I'm ok with having a model that creates the tarballs... but I also want to support referencing prebuilt tars. Requiring us to build them assumes that the build process is exactly how we define it... and that the git repo has an instance of this... I could imagine additional templates or build techniques for which the files to create the operator are an intermediary step (which an org may not want to check-in, outside of our requirement on them). Supporting prebuilts is necessary IMO. We should: 1) either make this a current non-goal (for future definition) or 2) define it.

Member

The use of "version" concerns me a bit...

context: The version of an operator is defined in the operator.yaml and is part of the tarball name. The tarball version also defines this as app_version and op_version. so...

  1. Is the "version" used here... app, or op?
  2. Is this expected to be the app_V-op_v?
  3. Is it checked / verified against the artifact? does the PR fail if it doesn't match?

Additionally... can we make it easy for people... can we assume that the "tag" is the "version" if not specified? It seems like we could assume that the version is the tag but allow tag to override that assumption.

Member Author

Each package that is to be added to the community repo needs to be checked for validity by the CI. This would at least cover that it is recognized by KUDO as an operator package. More checks for conformance are of course possible as well. All of these checks will work with directories as well as tarballs. When referencing tarballs, these will be copied into the community repository. I'll clarify this in the KEP.

@kensipe Good point regarding the version. This is again a question of which metadata should be in these reference files versus what metadata is provided by the actual operator package. A reference file could be as simple as

versions:
  - git:
      repository: github.com/example/example-operator.git
      tag: "1.0.0"
      dir: operator

because the other metadata (name as well as version) is redundant. But IMO there's value to have them duplicated here, as it helps developers to understand what is being referenced. These values will show up in CI logs and could help debug CI failures, e.g. when using a wrong Git tag.

Another example is an operator package that is provided as a tarball. This package doesn't set the optional application version and provides a URL for the package tarball.

```yaml
apiVersion: index.kudo.dev/v1alpha1
kind: Operator
name: Example Operator
versions:
  - operatorVersion: "0.9.0"
    url: example.org/example-operator-0.9.0.tgz
```
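
With two kinds of references in play, tooling has to decide per version whether to clone a Git repository or fetch a tarball. The sketch below shows one way to make that decision on an already parsed entry of the version list. The function name `source_kind` is hypothetical, and the rule that an entry carries exactly one of `git` or `url` is an assumption drawn from the wording of the proposal, not a documented KUDO constraint.

```python
from typing import Any, Dict


def source_kind(version: Dict[str, Any]) -> str:
    """Return "git" or "url" for one parsed version entry.

    Assumption: an entry must carry exactly one of the two keys.
    """
    has_git = version.get("git") is not None
    has_url = version.get("url") is not None
    if has_git == has_url:
        # Either both keys are present or both are missing.
        raise ValueError("a version needs exactly one of 'git' or 'url'")
    return "git" if has_git else "url"


# The tarball entry from the example above.
tarball = {
    "operatorVersion": "0.9.0",
    "url": "example.org/example-operator-0.9.0.tgz",
}
print(source_kind(tarball))  # prints "url"
```

Rejecting ambiguous entries early would give a contributor a clear error in the PR checks instead of a surprising choice made by the tooling.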

While metadata like `name`, `appVersion`, and `operatorVersion` are also present in the referenced operator package, it is helpful for debugging purposes to duplicate this information here. This metadata will be available even if resolving the actual operator package fails.
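
Because the same fields exist in both places, a check can compare them and report any drift. The following is a rough sketch of such a comparison, assuming both the reference entry and the package metadata have already been loaded into dictionaries keyed by `name`, `appVersion` and `operatorVersion`. The function `metadata_mismatches` and the sample values are hypothetical; the KEP does not state whether a mismatch should fail the PR.

```python
from typing import Dict, List, Optional

CHECKED_FIELDS = ("name", "appVersion", "operatorVersion")


def metadata_mismatches(
    reference: Dict[str, Optional[str]],
    package: Dict[str, Optional[str]],
) -> List[str]:
    """Describe every checked field where reference and package disagree."""
    problems = []
    for key in CHECKED_FIELDS:
        expected = reference.get(key)
        if expected is None:
            continue  # fields left out of the reference are not compared
        actual = package.get(key)
        if actual != expected:
            problems.append(
                f"{key}: reference says {expected!r}, package says {actual!r}"
            )
    return problems


# Hypothetical data: the reference omits appVersion, the package is newer.
reference = {"name": "Example Operator", "operatorVersion": "0.9.0"}
package = {"name": "Example Operator", "appVersion": "2.0.0", "operatorVersion": "0.9.1"}
for problem in metadata_mismatches(reference, package):
    print(problem)
```

Messages of this kind in a CI log would point directly at a wrong tag or URL, which is the debugging benefit the duplication is meant to provide.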

Once this PR is merged, CI tooling detects the new YAML file, clones the referenced upstream Git repository, checks out the tag, and adds the operator package in the specified folder to the existing index. This workflow is similar to [krew-index](https://github.com/kubernetes-sigs/krew-index). Of course, CI tests that don't update the community repository can run before the PR is merged. These tests include checking the referenced operator package for validity. Additional conformance testing can be added as well.
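
The clone-and-checkout part of this workflow can be sketched as a function that plans the `git` invocation for one version entry. This is an illustration only: `plan_checkout` is a hypothetical name, the `https://` prefix for scheme-less URLs is an assumption (the example reference lists the repository without a scheme), and packaging the folder and updating the index are left to KUDO's own tooling and not shown.

```python
import posixpath
from typing import List, Tuple


def plan_checkout(url: str, tag: str, directory: str, workdir: str) -> Tuple[List[str], str]:
    """Return the git command for one version and the resulting package folder.

    `git clone --branch` also accepts tag names, and `--depth 1` skips
    history that is not needed to build the package.
    """
    if "://" not in url:
        url = "https://" + url  # assumption: scheme-less URLs mean HTTPS
    command = ["git", "clone", "--depth", "1", "--branch", tag, url, workdir]
    package_dir = posixpath.join(workdir, directory)
    return command, package_dir


command, package_dir = plan_checkout(
    url="github.com/example/example-operator.git",
    tag="1.0.0_1.0.0",
    directory="operator",
    workdir="/tmp/example-operator",
)
print(" ".join(command))
print(package_dir)  # prints /tmp/example-operator/operator
```

Returning the command as an argument list rather than a shell string means it can be passed to `subprocess.run` without quoting concerns, and it keeps the planning step testable without network access.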

We can add more metadata to the YAML reference file. E.g., support for different upstream sources like Mercurial.

### Risks and Mitigations

- The current `kudobuilder/operators` Git repository contains some operator packages that don't have an upstream Git repository. If we want to keep them as part of the community repository, we need to ensure that they continue to be hosted.
- A repository index and individual operator packages already provide metadata, e.g. operator maintainers. We should use this data (if possible) instead of adding similar metadata fields to package references.

## Graduation Criteria

- Provide a tool to update the community repository from a list of operator references
- Update `kudobuilder/operators` to the new workflow
- Ensure that existing operators that aren't hosted in a separate Git repository are still part of the community repository

## Implementation History

- 2020/07/02 - Initial draft (@nfnt)
- 2020/07/15 - Updated API after tests with a prototype. Removed maturity levels (@nfnt)