Added document explaining the scope of the Kubernetes project, including
historical rationales and decision criteria.
1 parent 380e6a8, commit f6e1200
Showing 1 changed file with 342 additions and 0 deletions.
@@ -0,0 +1,342 @@
# Kubernetes scope

Purpose of this doc: Clarify factors affecting decisions regarding
what is and is not in scope for the Kubernetes project.

Related documents:
* [What is Kubernetes?](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/)
* [Kubernetes design and architecture](architecture.md)
* [Kubernetes architectural roadmap (2017)](architectural-roadmap.md)
* [Design principles](principles.md)
* [Kubernetes resource management](resource-management.md)

Kubernetes is a portable, extensible open-source platform for managing
containerized workloads and services that facilitates both
declarative configuration and automation. Workload portability is an
especially high priority. Kubernetes provides a flexible, easy-to-run,
secure foundation for running containerized applications on any cloud
provider or your own systems.

While not a full distribution in the Linux sense, adoption of
Kubernetes has been facilitated by the fact that the upstream releases
are usable on their own, with minimal dependencies (e.g., etcd, a
container runtime, and a networking implementation).

The high-level scope and goals are often insufficient for making
decisions about where to draw the line, so this document describes
where the line is, the rationale for some past decisions, and some
general criteria that have been applied, including non-technical
considerations. For instance, user adoption and continued operation of
the project itself are also important factors.

## Significant areas

More details can be found below, but a concise list of areas in scope follows:
* Containerized workload execution and management
* Service discovery, load balancing, and routing
* Workload identity propagation and authentication
* Declarative resource management platform
* Command-line tool
* Web dashboard (UI)
* Cluster lifecycle tools
* Extensibility to support execution and management in diverse environments
* Multi-cluster management tools and systems
* Project GitHub automation and other process automation
* Project continuous build and test infrastructure
* Release tooling
* Documentation
* Usage data collection mechanisms

## Scope domains

Most decisions concern whether any part of the project should
undertake efforts in a particular area. However, decisions are
sometimes necessary for smaller scopes. The term "core" is sometimes
used, but is not well defined. The following are scopes that may be relevant:
* Kubernetes project GitHub orgs
  * All GitHub orgs
  * The kubernetes GitHub org
  * The kubernetes-sigs and kubernetes-incubator GitHub orgs
  * The kubernetes-client GitHub org
  * Other GitHub orgs
* Release artifacts
  * The Kubernetes release bundle
  * Binaries built in kubernetes/kubernetes
    * “core” server components: apiserver, controller manager, scheduler, kube-proxy, kubelet
    * kubectl
    * kubeadm
  * Other images, packages, etc.
* The kubernetes/kubernetes repository (aka k/k)
  * master branch
  * kubernetes/kubernetes/master/pkg
  * kubernetes/kubernetes/master/staging
* [Functionality layers](architectural-roadmap.md)
  * required
  * pluggable
  * optional
  * usable independently of the rest of Kubernetes

## Other inclusion considerations

The Kubernetes project is a large, complex effort. The following
questions also bear on whether functionality belongs in the project:

* Is the functionality consistent with the existing implementation
  conventions, design principles, architecture, and direction?

* Do the subproject owners, approvers, reviewers, and regular contributors
  agree to maintain the functionality?

* Do the contributors to the functionality agree to follow the
  project’s development conventions and requirements, including CLA,
  code of conduct, GitHub and build tooling, testing, documentation,
  and release criteria?

* Does the functionality improve existing use cases, or mostly enable
  new ones? The project isn't blocking new functionality outright,
  but it is trying to reduce the rate of expansion, limit additions
  to kubernetes/kubernetes/master, and improve the quality of the
  functionality that already exists.

* Is it needed by project contributors? Example: We need cluster
  creation and upgrade functionality in order to run end-to-end tests.

* Is it necessary in order to enable workload portability?

* Is it needed in order for upstream releases to be usable? For
  example, functionality without which users would otherwise have to
  reverse-engineer Kubernetes, or copy code out of Kubernetes itself,
  to make things work.

* Is it functionality that users expect, such as because other
  container platforms and/or service discovery and routing mechanisms
  provide it? If a capability that relates to Kubernetes's fundamental
  purpose were to become table stakes in the industry, Kubernetes
  would need to support it in order to stay relevant. (Whether it
  would need to be addressed by the core project would depend on the
  other criteria.)

* Is there sufficiently broad user demand and/or sufficient expected
  user benefit for the functionality?

* Is there an adequate mechanism to discover, deploy, express a
  dependency on, and upgrade the functionality if implemented using an
  extension mechanism? Are there consistent notions of releases, maturity,
  quality, version skew, conformance, etc. for extensions?

* Is it needed as a reference implementation exercising extension
  points or other APIs?

* Is the functionality sufficiently general-purpose?

* Is it an area where we want to provide an opinionated solution
  and/or where fragmentation would be problematic for users, or are
  there many reasonable alternative approaches and solutions to the
  problem?

* Is it an area where we want to foster exploration and innovation in
  the ecosystem?

* Has the ecosystem produced adequate solutions on its own? For
  instance, have ecosystem projects taken on requirements of the
  Kubernetes project, if needed? Example: etcd3 added a number of features
  and other improvements to benefit Kubernetes, so the project didn't
  need to launch a separate storage effort.

* Is there an acceptable home for the recommended ecosystem solution(s)?
  Example: the [CNCF Sandbox](https://github.com/cncf/toc/blob/master/process/sandbox.md) is one possible home.

* Has the functionality been provided by the project/release/component
  historically?

## Technical scope details and rationale

### Containerized workload execution and management

Including:
* common general categories of workloads, such as stateless, stateful, batch, and cluster services
* provisioning, allocation, accessing, and managing compute, storage, and network resources on behalf of the workloads, and enforcement of security policies on those resources
* workload prioritization, capacity assessment, placement, and relocation (aka scheduling)
* graceful workload eviction
* local container image caching
* configuration and secret distribution
* manual and automatic horizontal and vertical scaling
* deployment, progressive (aka rolling) upgrades, and downgrades
* self-healing
* exposing container logs, status, health, and resource usage metrics for collection

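
Self-healing and horizontal scaling in the list above can both be viewed as comparing a desired state with an observed state and correcting the difference. The following is a minimal sketch of that idea in Go. The `Action` type and the `reconcile` function are hypothetical names introduced only for illustration; they are not part of any Kubernetes API, and real controllers handle many more conditions.

```go
package main

import "fmt"

// Action is a hypothetical description of the corrective step a
// controller would take. It is not a Kubernetes type.
type Action struct {
	Create int // number of replicas to start
	Delete int // number of replicas to stop
}

// reconcile compares the desired replica count with the number of
// replicas currently observed to be healthy and returns the action
// that would bring the two back in line.
func reconcile(desired, healthy int) Action {
	switch {
	case healthy < desired:
		return Action{Create: desired - healthy}
	case healthy > desired:
		return Action{Delete: healthy - desired}
	default:
		return Action{}
	}
}

func main() {
	// Self-healing: one of three replicas has failed, so one is recreated.
	fmt.Printf("%+v\n", reconcile(3, 2))
	// Scaling down: the desired count was lowered from five to three.
	fmt.Printf("%+v\n", reconcile(3, 5))
}
```

The same comparison serves both cases: a failed replica and a changed desired count look identical to the loop, which is one reason a declarative model keeps such logic small.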
### Service discovery, load balancing, and routing

Including:
* endpoint tracking and discovery, including pod and non-pod endpoints
* the most common L4 and L7 Internet protocols (TCP, UDP, SCTP, HTTP, HTTPS)
* intra-cluster DNS configuration and serving
* external DNS configuration
* accessing external services (e.g., imported services, Open Service Broker)
* exposing traffic latency, throughput, and status metrics for collection
* access authorization

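
As a rough illustration of what endpoint tracking combined with load balancing means, the sketch below keeps an in-memory map from service names to endpoint addresses and hands them out in round-robin order. The `Registry` type, its methods, and the addresses are assumptions made for this example; this is not how kube-proxy or any Kubernetes component is implemented, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// Registry is a hypothetical in-memory map from a service name to the
// endpoint addresses currently backing it.
type Registry struct {
	endpoints map[string][]string
	next      map[string]int // index of the next endpoint to hand out
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{endpoints: map[string][]string{}, next: map[string]int{}}
}

// SetEndpoints replaces the tracked endpoints for a service and resets
// its round-robin position.
func (r *Registry) SetEndpoints(service string, addrs []string) {
	r.endpoints[service] = append([]string(nil), addrs...)
	r.next[service] = 0
}

// Pick returns the next endpoint for the service in round-robin order.
func (r *Registry) Pick(service string) (string, error) {
	addrs := r.endpoints[service]
	if len(addrs) == 0 {
		return "", errors.New("no endpoints for service " + service)
	}
	i := r.next[service] % len(addrs)
	r.next[service] = i + 1
	return addrs[i], nil
}

func main() {
	r := NewRegistry()
	r.SetEndpoints("web", []string{"10.0.0.1:80", "10.0.0.2:80"})
	// Three picks cycle through the two endpoints and wrap around.
	for i := 0; i < 3; i++ {
		addr, err := r.Pick("web")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(addr)
	}
}
```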
### Workload identity propagation and authentication

Including:
* internal identity (e.g., SPIFFE support)
* external identity (e.g., TLS certificate management)

### Declarative resource management platform

Including:
* CRUD API operations and behaviors, diff, patch, dry run, watch
* declarative updates (apply)
* resource type definition, registration, discovery, documentation, and validation mechanisms
* pluggable authentication, authorization, admission (API-level policy enforcement), and audit-logging mechanisms
* Namespace (resource scoping primitive) lifecycle
* resource instance persistence and garbage collection
* asynchronous event reporting
* API producer SDK
* API client SDK / libraries in widely used languages
* dynamic, resource-oriented CLI, as a reference implementation for interacting with the API and basic tool for declarative and imperative management
  * this simplifies getting started and avoids the complexity of documenting the system using only, for instance, curl

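
The diff, dry run, and apply operations listed above share one underlying idea: compute the difference between the live state and the desired state, then optionally persist the result. The sketch below shows that idea on a deliberately simplified, flat key/value resource. `Resource`, `Change`, `Diff`, and `Apply` are hypothetical names for this example; the real Kubernetes apply mechanism merges structured fields and is considerably more involved than replacing the whole state as done here.

```go
package main

import (
	"fmt"
	"sort"
)

// Resource is a hypothetical flat key/value view of a resource's fields.
type Resource map[string]string

// Change describes one field-level difference between live and desired
// state. An empty Old means the field is added; an empty New means it is
// removed (this toy model cannot represent empty-string values).
type Change struct {
	Field string
	Old   string
	New   string
}

// Diff returns the changes needed to turn live into desired, sorted by
// field name so that the output is deterministic.
func Diff(live, desired Resource) []Change {
	var changes []Change
	for k, want := range desired {
		if have, ok := live[k]; !ok || have != want {
			changes = append(changes, Change{Field: k, Old: have, New: want})
		}
	}
	for k, have := range live {
		if _, ok := desired[k]; !ok {
			changes = append(changes, Change{Field: k, Old: have})
		}
	}
	sort.Slice(changes, func(i, j int) bool { return changes[i].Field < changes[j].Field })
	return changes
}

// Apply computes the diff and returns the resulting state. With dryRun
// set, the live state is returned unchanged alongside the would-be changes.
func Apply(live, desired Resource, dryRun bool) (Resource, []Change) {
	changes := Diff(live, desired)
	if dryRun {
		return live, changes
	}
	updated := Resource{}
	for k, v := range desired {
		updated[k] = v
	}
	return updated, changes
}

func main() {
	live := Resource{"image": "app:v1", "replicas": "2", "debug": "true"}
	desired := Resource{"image": "app:v2", "replicas": "2"}

	// Dry run: report what would change without modifying anything.
	_, changes := Apply(live, desired, true)
	for _, c := range changes {
		fmt.Printf("%s: %q -> %q\n", c.Field, c.Old, c.New)
	}
}
```

Keeping the diff separate from the write is what makes a dry run cheap: the same code path produces the change list whether or not the result is persisted.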
### Command-line tool

Since some Kubernetes primitives are fairly low-level, in addition to
general-purpose resource-oriented operations, the CLI also supports
“porcelain” for common, simple, domain-specific operations (both
status/progress extraction and mutations) that don’t have discrete API
implementations, such as run, expose, rollout, cp, top, cordon, and
drain. The CLI should also support non-resource-oriented APIs,
such as exec, logs, attach, port-forward, and proxy.

### Web dashboard (UI)

The project supported a dashboard, initially built into the apiserver,
almost from the beginning. Other projects in the space had UIs and
users expected one. There wasn’t a vendor-neutral one in the
ecosystem, however, and a solution was needed for the project's local
cluster environment, minikube. The dashboard has also served as a UI
reference implementation and a vehicle to drive conventions (e.g.,
around resource category terminology). The dashboard has also been
useful as a tool to demonstrate and to learn about Kubernetes
concepts, features, and behaviors.

### Cluster lifecycle tools

Cluster lifecycle includes provisioning, bootstrapping,
upgrade/downgrade, and teardown. The project develops several such tools.
Tools are needed for the following scenarios/purposes:
* usability of upstream releases: at least one solution that can be used to bootstrap the upstream release (e.g., kubeadm)
* testing: solutions that can be used to run multi-node end-to-end tests (e.g., kind), integration tests, upgrade/downgrade tests, version-skew tests, scalability tests, and other types of tests the project deems necessary to ensure adequate release quality
* portable, low-dependency local environment: at least one local environment (e.g., minikube), in order to simplify documentation tutorials that require a cluster to exist

### Extensibility to support execution and management in diverse environments

Including:
* CRI
* CNI
* CSI
* external cloud providers
* KMS providers
* OSB brokers
* Cluster APIs

### Multi-cluster management tools and systems

Many users desire to operate in and deploy applications to multiple
geographic locations and environments, even across multiple providers.
This generally requires managing multiple Kubernetes clusters. While
general deployment pipeline tools and continuous deployment systems
are not in scope, the project has explored multiple mechanisms to
simplify management of resources across multiple clusters, including
Federation v1, Federation v2, and the Cluster Registry API.

### Project GitHub automation and other process automation

As one of the largest, most active projects on GitHub, Kubernetes has
unusually demanding automation needs.

Including:
* prow
* gubernator
* velodrome and kettle
* website infrastructure
* k8s.io

### Project continuous build and test infrastructure

Including:
* prow
* tide
* triage dashboard

### Release tooling

Including:
* anago

### Documentation

Documentation of project-provided functionality and components, for
multiple audiences, including:
* application developers
* application operators
* cluster operators
* ecosystem developers
* distribution providers, and others who want to port Kubernetes to new environments
* project contributors

### Usage data collection mechanisms

Including:
* Spartakus

## Examples of projects and areas not in scope

Some of these are obvious, but many have been seriously deliberated in the
past.
* The resource instance store (etcd)
* Container runtimes, other than current grandfathered ones
* Network and storage plugins, other than current grandfathered ones
* CoreDNS
  * Since intra-cluster DNS is in scope, we need to ensure we have
    some solution, which has been kubedns, but now that there is an
    adequate alternative outside the project, we are adopting it.
* Service load balancers (e.g., Envoy, Linkerd), other than kube-proxy
* Cloud provider implementations, other than current grandfathered ones
* Container image build tools
* Image registries and distribution mechanisms
* Identity (user/group) sources of truth (e.g., LDAP)
* Key management systems (e.g., Vault)
* CI, CD, and GitOps (push to deploy) systems, other than
  infrastructure used to build and test the Kubernetes project itself
* Application-level services, such as middleware (e.g., message
  buses), data-processing frameworks (e.g., Spark), machine-learning
  frameworks (e.g., Kubeflow), databases (e.g., MySQL), caches, and
  cluster storage systems (e.g., Ceph), as built-in services. Such
  components can run on Kubernetes, and/or can be accessed by
  applications running on Kubernetes through portable mechanisms, such
  as the Open Service Broker. Application-specific Operators (e.g.,
  Cassandra Operator) are also not in scope.
* Application and cluster log aggregation and searching, application
  and cluster monitoring aggregation and dashboarding (other than
  heapster, which is grandfathered), alerting, application performance
  management, tracing, and debugging tools
* General-purpose machine configuration (e.g., Chef, Puppet, Ansible,
  Salt), maintenance, automation (e.g., Rundeck), and management systems
* Templating and configuration languages (e.g., jinja, jsonnet,
  starlark, hcl, dhall, hocon)
* File packaging tools (e.g., helm, kpm, kubepack, duffle)
* Managing non-containerized applications in VMs, and other general
  IaaS functionality
* Full Platform as a Service functionality
* Full Functions as a Service functionality
* [Workflow orchestration](https://github.com/kubernetes/kubernetes/pull/24781#issuecomment-215914822):
  "Workflow" is a very broad, diverse area, with solutions typically
  tailored to specific use cases (e.g., data-flow graphs, data-driven
  processing, deployment pipelines, event-driven automation,
  business-process execution, iPaaS) and specific input and event
  sources, and often requires arbitrary code to evaluate conditions,
  actions, and/or failure handling.
* Human-oriented and programmatic interfaces over the Kubernetes API
  other than “basic” CLIs (e.g., kubectl) and UI (dashboard), such as
  mobile dashboards, IDEs, chat bots, SQL, and interactive shells