Added document explaining the scope of the Kubernetes project, including
historical rationales and decision criteria.
kubernetes · Jan 31, 2019 · f6e1200
1 parent 380e6a8
Showing 1 changed file with 342 additions and 0 deletions.
diff --git a/contributors/design-proposals/architecture/scope.md b/contributors/design-proposals/architecture/scope.md
@@ -0,0 +1,342 @@
+# Kubernetes scope
+
+Purpose of this doc: Clarify factors affecting decisions regarding
+what is and is not in scope for the Kubernetes project.
+
+Related documents:
+* [What is Kubernetes?](https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/)
+* [Kubernetes design and architecture](architecture.md)
+* [Kubernetes architectural roadmap (2017)](architectural-roadmap.md)
+* [Design principles](principles.md)
+* [Kubernetes resource management](resource-management.md)
+
+Kubernetes is a portable, extensible open-source platform for managing
+containerized workloads and services that facilitates both
+declarative configuration and automation.  Workload portability is an
+especially high priority. Kubernetes provides a flexible, easy-to-run,
+secure foundation for running containerized applications on any cloud
+provider or your own systems.
+
+While not a full distribution in the Linux sense, adoption of
+Kubernetes has been facilitated by the fact that the upstream releases
+are usable on their own, with minimal dependencies (e.g., etcd, a
+container runtime, and a networking implementation). 
+
+The high-level scope and goals are often insufficient for making
+decisions about where to draw the line, so this document records where
+the line is, the rationale for some past decisions, and some general
+criteria that have been applied, including non-technical
+considerations. For instance, user adoption and continued operation of
+the project itself are also important factors.
+
+## Significant areas
+
+More details can be found below, but a concise list of areas in scope follows:
+* Containerized workload execution and management
+* Service discovery, load balancing, and routing
+* Workload identity propagation and authentication
+* Declarative resource management platform
+* Command-line tool
+* Web dashboard (UI)
+* Cluster lifecycle tools
+* Extensibility to support execution and management in diverse environments
+* Multi-cluster management tools and systems
+* Project GitHub automation and other process automation
+* Project continuous build and test infrastructure
+* Release tooling
+* Documentation
+* Usage data collection mechanisms
+
+## Scope domains
+
+Most decisions concern whether any part of the project should
+undertake efforts in a particular area. However, decisions are
+sometimes necessary for smaller scopes. The term "core" is sometimes
+used, but it is not well defined. The following scopes may be relevant:
+* Kubernetes project GitHub orgs
+  * All GitHub orgs
+  * The kubernetes GitHub org
+  * The kubernetes-sigs and kubernetes-incubator GitHub orgs
+  * The kubernetes-client GitHub org
+  * Other GitHub orgs
+* Release artifacts
+  * The Kubernetes release bundle
+  * Binaries built in kubernetes/kubernetes
+    * “core” server components: apiserver, controller manager, scheduler, kube-proxy, kubelet
+    * kubectl
+    * kubeadm
+  * Other images, packages, etc.
+* The kubernetes/kubernetes repository (aka k/k)
+  * master branch
+  * kubernetes/kubernetes/master/pkg
+  * kubernetes/kubernetes/master/staging
+* [Functionality layers](architectural-roadmap.md)
+  * required
+  * pluggable
+  * optional
+  * usable independently of the rest of Kubernetes
+
+## Other inclusion considerations
+
+The Kubernetes project is a large, complex effort, so the following
+questions are also weighed when deciding whether functionality belongs
+in it.
+
+* Is the functionality consistent with the existing implementation
+  conventions, design principles, architecture, and direction?
+
+* Do the subproject owners, approvers, reviewers, and regular contributors
+  agree to maintain the functionality?
+
+* Do the contributors to the functionality agree to follow the
+  project’s development conventions and requirements, including CLA,
+  code of conduct, GitHub and build tooling, testing, documentation,
+  and release criteria?
+
+* Does the functionality improve existing use cases, or mostly enable
+  new ones? The project isn't blocking new functionality outright
+  (it is reducing the rate of expansion), but it is trying to
+  limit additions to kubernetes/kubernetes/master and aims to improve the
+  quality of the functionality that already exists.
+
+* Is it needed by project contributors? Example: We need cluster
+  creation and upgrade functionality in order to run end-to-end tests.
+
+* Is it necessary in order to enable workload portability?
+
+* Is it needed in order for upstream releases to be usable? For
+  example, things that users were otherwise reverse-engineering
+  Kubernetes to figure out, and/or copying code out of Kubernetes
+  itself to make work.
+
+* Is it functionality that users expect, for instance because other
+  container platforms and/or service discovery and routing mechanisms
+  provide it? If a capability that relates to Kubernetes's fundamental
+  purpose were to become table stakes in the industry, Kubernetes
+  would need to support it in order to stay relevant. (Whether it
+  would need to be addressed by the core project would depend on the
+  other criteria.)
+
+* Is there sufficiently broad user demand and/or sufficient expected
+  user benefit for the functionality?
+
+* Is there an adequate mechanism to discover, deploy, express a
+  dependency on, and upgrade the functionality if implemented using an
+  extension mechanism? Are there consistent notions of releases, maturity,
+  quality, version skew, conformance, etc. for extensions?
+
+* Is it needed as a reference implementation exercising extension
+  points or other APIs?
+
+* Is the functionality sufficiently general-purpose?
+
+* Is it an area where we want to provide an opinionated solution
+  and/or where fragmentation would be problematic for users, or are
+  there many reasonable alternative approaches and solutions to the
+  problem?
+
+* Is it an area where we want to foster exploration and innovation in
+  the ecosystem?
+
+* Has the ecosystem produced adequate solutions on its own? For
+  instance, have ecosystem projects taken on requirements of the
+  Kubernetes project, if needed? Example: etcd3 added a number of features
+  and other improvements to benefit Kubernetes, so the project didn't
+  need to launch a separate storage effort.
+
+* Is there an acceptable home for the recommended ecosystem solution(s)?
+  Example: the [CNCF Sandbox](https://github.com/cncf/toc/blob/master/process/sandbox.md) is one possible home.
+
+* Has the functionality been provided by the project/release/component
+  historically?
+
+## Technical scope details and rationale
+
+### Containerized workload execution and management
+
+Including:
+* common general categories of workloads, such as stateless, stateful, batch, and cluster services
+* provisioning, allocating, accessing, and managing compute, storage, and network resources on behalf of the workloads, and enforcement of security policies on those resources
+* workload prioritization, capacity assessment, placement, and relocation (aka scheduling)
+* graceful workload eviction
+* local container image caching
+* configuration and secret distribution
+* manual and automatic horizontal and vertical scaling
+* deployment, progressive (aka rolling) upgrades, and downgrades
+* self-healing
+* exposing container logs, status, health, and resource usage metrics for collection
+
+### Service discovery, load balancing, and routing
+
+Including:
+* endpoint tracking and discovery, including pod and non-pod endpoints
+* the most common L4 and L7 Internet protocols (TCP, UDP, SCTP, HTTP, HTTPS)
+* intra-cluster DNS configuration and serving
+* external DNS configuration
+* accessing external services (e.g., imported services, Open Service Broker)
+* exposing traffic latency, throughput, and status metrics for collection
+* access authorization
+
+### Workload identity propagation and authentication
+
+Including:
+* internal identity (e.g., SPIFFE support)
+* external identity (e.g., TLS certificate management)
+
+### Declarative resource management platform
+
+Including:
+* CRUD API operations and behaviors, diff, patch, dry run, watch
+* declarative updates (apply)
+* resource type definition, registration, discovery, documentation, and validation mechanisms
+* pluggable authentication, authorization, admission (API-level policy enforcement), and audit-logging mechanisms
+* Namespace (resource scoping primitive) lifecycle
+* resource instance persistence and garbage collection
+* asynchronous event reporting
+* API producer SDK
+* API client SDK / libraries in widely used languages
+* dynamic, resource-oriented CLI, as a reference implementation for interacting with the API and basic tool for declarative and imperative management
+  * simplifies getting started and avoids complexities of documenting the system with just, for instance, curl
+
+### Command-line tool
+
+Since some Kubernetes primitives are fairly low-level, in addition to
+general-purpose resource-oriented operations, the CLI also supports
+“porcelain” for common, simple, domain-specific operations (both
+status/progress extraction and mutations) that don’t have discrete API
+implementations, such as run, expose, rollout, cp, top, cordon, and
+drain. The CLI should also support non-resource-oriented APIs,
+such as exec, logs, attach, port-forward, and proxy.
+
+### Web dashboard (UI)
+
+The project supported a dashboard, initially built into the apiserver,
+almost from the beginning. Other projects in the space had UIs and
+users expected one. There wasn’t a vendor-neutral one in the
+ecosystem, however, and a solution was needed for the project's local
+cluster environment, minikube. The dashboard has also served as a UI
+reference implementation and a vehicle to drive conventions (e.g.,
+around resource category terminology). It has also been useful as a
+tool for demonstrating and learning about Kubernetes concepts,
+features, and behaviors.
+
+### Cluster lifecycle tools
+
+Cluster lifecycle includes provisioning, bootstrapping,
+upgrade/downgrade, and teardown. The project develops several such tools.
+Tools are needed for the following scenarios/purposes:
+* usability of upstream releases: at least one solution that can be used to bootstrap the upstream release (e.g., kubeadm)
+* testing: solutions that can be used to run multi-node end-to-end tests (e.g., kind), integration tests, upgrade/downgrade tests, version-skew tests, scalability tests, and other types of tests the project deems necessary to ensure adequate release quality
+* portable, low-dependency local environment: at least one local environment (e.g., minikube), in order to simplify documentation tutorials that require a cluster to exist
+
+### Extensibility to support execution and management in diverse environments
+
+Including:
+* CRI
+* CNI
+* CSI
+* external cloud providers
+* KMS providers
+* OSB brokers
+* Cluster APIs
+
+### Multi-cluster management tools and systems
+
+Many users desire to operate in and deploy applications to multiple
+geographic locations and environments, even across multiple providers.
+This generally requires managing multiple Kubernetes clusters.  While
+general deployment pipeline tools and continuous deployment systems
+are not in scope, the project has explored multiple mechanisms to
+simplify management of resources across multiple clusters, including
+Federation v1, Federation v2, and the Cluster Registry API.
+
+### Project GitHub automation and other process automation
+
+As one of the largest, most active projects on GitHub, Kubernetes has
+some extreme needs.
+
+Including:
+* prow
+* gubernator
+* velodrome and kettle
+* website infrastructure
+* k8s.io
+
+### Project continuous build and test infrastructure
+
+Including:
+* prow
+* tide
+* triage dashboard
+
+### Release tooling
+
+Including:
+* anago
+
+### Documentation
+
+Documentation of project-provided functionality and components, for
+multiple audiences, including:
+* application developers
+* application operators
+* cluster operators
+* ecosystem developers
+* distribution providers, and others who want to port Kubernetes to new environments
+* project contributors
+
+### Usage data collection mechanisms
+
+Including:
+* Spartakus
+
+## Examples of projects and areas not in scope
+
+Some of these are obvious, but many have been seriously deliberated in the
+past.
+* The resource instance store (etcd)
+* Container runtimes, other than current grandfathered ones
+* Network and storage plugins, other than current grandfathered ones
+* CoreDNS
+  * Intra-cluster DNS is in scope, so the project needs to ensure
+    some solution exists. That solution has been kubedns, but now that
+    there is an adequate alternative outside the project, we are
+    adopting it.
+* Service load balancers (e.g., Envoy, Linkerd), other than kube-proxy
+* Cloud provider implementations, other than current grandfathered ones
+* Container image build tools
+* Image registries and distribution mechanisms
+* Identity (user/group) sources of truth (e.g., LDAP)
+* Key management systems (e.g., Vault)
+* CI, CD, and GitOps (push to deploy) systems, other than
+  infrastructure used to build and test the Kubernetes project itself
+* Application-level services, such as middleware (e.g., message
+  buses), data-processing frameworks (e.g., Spark), machine-learning
+  frameworks (e.g., Kubeflow), databases (e.g., MySQL), caches, and
+  cluster storage systems (e.g., Ceph) as built-in services. Such
+  components can run on Kubernetes, and/or can be accessed by
+  applications running on Kubernetes through portable mechanisms, such
+  as the Open Service Broker. Application-specific Operators (e.g.,
+  Cassandra Operator) are also not in scope.
+* Application and cluster log aggregation and searching, application
+  and cluster monitoring aggregation and dashboarding (other than
+  heapster, which is grandfathered), alerting, application performance
+  management, tracing, and debugging tools
+* General-purpose machine configuration (e.g., Chef, Puppet, Ansible,
+  Salt), maintenance, automation (e.g., Rundeck), and management systems
+* Templating and configuration languages (e.g., jinja, jsonnet,
+  starlark, hcl, dhall, hocon)
+* File packaging tools (e.g., helm, kpm, kubepack, duffle)
+* Managing non-containerized applications in VMs, and other general
+  IaaS functionality
+* Full Platform as a Service functionality
+* Full Functions as a Service functionality
+* [Workflow
+  orchestration](https://github.com/kubernetes/kubernetes/pull/24781#issuecomment-215914822):
+  "Workflow" is a very broad, diverse area, with solutions typically
+  tailored to specific use cases (e.g., data-flow graphs, data-driven
+  processing, deployment pipelines, event-driven automation,
+  business-process execution, iPaaS) and specific input and event
+  sources, and often requires arbitrary code to evaluate conditions,
+  actions, and/or failure handling.
+* Human-oriented and programmatic interfaces over the Kubernetes API
+  other than “basic” CLIs (e.g., kubectl) and UI (dashboard), such as
+  mobile dashboards, IDEs, chat bots, SQL, and interactive shells