
Commit

Merge pull request #2423 from bgrant0607/docfix
Documentation improvements. Fixes #2004, #2115, #2171.
brendandburns committed Nov 17, 2014
2 parents f0fce55 + d5700ea commit 6fa798c
Showing 5 changed files with 145 additions and 67 deletions.
4 changes: 2 additions & 2 deletions docs/README.md
@@ -23,7 +23,7 @@ Kubernetes documentation is organized into several categories.
- in [examples](../examples)
- Hands on introduction and example config files
- **API documentation**
- in [api](../api)
- automatically generated REST API documentation
- in [the API conventions doc](api-conventions.md)
- and automatically generated API documentation served by the master
- **Wiki**
- in [wiki](https://github.com/GoogleCloudPlatform/kubernetes/wiki)
104 changes: 101 additions & 3 deletions docs/api-conventions.md
@@ -36,6 +36,8 @@ All singular JSON resources returned by an API MUST have the following fields:

### Objects

#### Metadata

Every object MUST have the following metadata in a nested object field called "metadata":

* namespace: a namespace is a DNS compatible subdomain that objects are subdivided into. The default namespace is 'default'. See [namespaces.md](namespaces.md) for more.
@@ -44,17 +46,43 @@ Every object MUST have the following metadata in a nested object field called "m

Every object SHOULD have the following metadata in a nested object field called "metadata":

* resourceVersion: a string that identifies the internal version of this object that can be used by clients to determine when objects have changed. This value MUST be treated as opaque by clients and passed unmodified back to the server. Clients should not assume that the resource version has meaning across namespaces, different kinds of resources, or different servers.
* resourceVersion: a string that identifies the internal version of this object that can be used by clients to determine when objects have changed. This value MUST be treated as opaque by clients and passed unmodified back to the server. Clients should not assume that the resource version has meaning across namespaces, different kinds of resources, or different servers. (see [concurrency control](#concurrency), below, for more details)
* creationTimestamp: a string in RFC 3339 format giving the date and time the object was created
* labels: a map of string keys and values that can be used to organize and categorize objects (see [labels.md](labels.md))
* annotations: a map of string keys and values that can be used by external tooling to store and retrieve arbitrary metadata about this object (see [annotations.md](annotations.md))

Labels are intended for organizational purposes by end users (select the pods that match this label query). Annotations enable third party automation and tooling to decorate objects with additional metadata for their own use.

By convention, the Kubernetes API makes a distinction between the specification of a resource (a nested object field called "spec") and the status of the resource at the current time (a nested object field called "status"). The specification is persisted in stable storage with the API object and reflects user input. The status is generated at runtime and summarizes the current effect that the spec has on the system, and is ignored on user input. The PUT and POST verbs will ignore the "status" values.
#### Spec and Status

By convention, the Kubernetes API makes a distinction between the specification of the desired state of a resource (a nested object field called "spec") and the status of the resource at the current time (a nested object field called "status"). The specification is persisted in stable storage with the API object and reflects user input. The status is generated at runtime and summarizes the current effect that the spec has on the system.

For example, a pod object has a "spec" field that defines how the pod should be run. The pod also has a "status" field that shows details about what is happening on the host that is running the containers in the pod (if available) and a summarized "status" string that can guide callers as to the overall state of their pod.

When a new version of an object is POSTed or PUT, the "spec" is updated and available immediately. Over time the system will work to bring the "status" into line with the "spec". The system will drive toward the most recent "spec" regardless of previous versions of that stanza. In other words, if a value is changed from 2 to 5 in one PUT and then back down to 3 in another PUT the system is not required to 'touch base' at 5 before changing the "status" to 3.
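
The "latest spec wins" behavior described above can be illustrated with a toy reconcile loop. This is only a sketch: the `Resource` class, the `replicas` field, and the one-step-at-a-time reconciliation are assumptions made for illustration, not part of the Kubernetes API.

```python
# Minimal sketch of level-based reconciliation (all names are hypothetical).
class Resource:
    def __init__(self, desired):
        self.spec = {"replicas": desired}    # desired state, set by the user
        self.status = {"replicas": 0}        # observed state, set by the system

    def put(self, desired):
        # A PUT replaces the spec immediately; status is left untouched.
        self.spec = {"replicas": desired}

    def reconcile_step(self):
        # Move status one step toward whatever the spec says *now*.
        current, target = self.status["replicas"], self.spec["replicas"]
        if current < target:
            self.status["replicas"] = current + 1
        elif current > target:
            self.status["replicas"] = current - 1
        return self.status["replicas"] == target


res = Resource(desired=2)
while not res.reconcile_step():
    pass                      # status converges to the initial spec

res.put(5)                    # the user asks for 5 ...
res.put(3)                    # ... then for 3 before the system has acted

seen = []                     # status values observed while converging
while True:
    done = res.reconcile_step()
    seen.append(res.status["replicas"])
    if done:
        break
```

Because each step looks only at the current spec, `seen` never contains the intermediate value 5; the system goes straight from 2 to 3.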

The PUT and POST verbs will ignore the "status" values. Otherwise, PUT expects the whole object to be specified. Therefore, if a field is omitted it is assumed that the client wants to clear that field's value.

Modification of just part of an object may be achieved by GETting the resource, modifying part of the spec, labels, or annotations, and then PUTting it back. See [concurrency control](#concurrency), below, regarding read-modify-write consistency when using this pattern.
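
The PUT semantics above (whole-object replacement, client-supplied "status" ignored) can be sketched with an in-memory stand-in for the server. The `put` and `get` helpers, the `_store` dictionary, and the choice to keep the previously observed status are assumptions for illustration only.

```python
import copy

# Hypothetical in-memory stand-in for the API server's storage.
_store = {}

def put(name, obj):
    """Replace the stored object wholesale, ignoring any client-sent status."""
    previous = _store.get(name, {})
    stored = copy.deepcopy(obj)
    # "status" is owned by the system, so the client's value is dropped
    # and the previously observed status (if any) is kept.
    stored.pop("status", None)
    if "status" in previous:
        stored["status"] = previous["status"]
    _store[name] = stored
    return copy.deepcopy(stored)

def get(name):
    return copy.deepcopy(_store[name])


put("web", {"metadata": {"labels": {"tier": "frontend"}},
            "spec": {"image": "nginx", "port": 80}})
_store["web"]["status"] = {"phase": "Running"}   # written by the system

# Omitting a field on PUT clears it: "port" disappears from the spec.
put("web", {"metadata": {"labels": {"tier": "frontend"}},
            "spec": {"image": "nginx"},
            "status": {"phase": "Bogus"}})
after_partial_put = get("web")

# To change one field without clearing others, GET, modify, then PUT back.
obj = get("web")
obj["spec"]["port"] = 8080
put("web", obj)
after_rmw = get("web")
```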

#### Lists of named subobjects preferred over maps

Discussed in [#2004](https://github.com/GoogleCloudPlatform/kubernetes/issues/2004) and elsewhere. There are no maps of subobjects in any API objects. Instead, the convention is to use a list of subobjects containing name fields.

For example:
```yaml
ports:
  - name: www
    containerPort: 80
```
vs.
```yaml
ports:
  www:
    containerPort: 80
```
This rule maintains the invariant that all JSON/YAML keys are fields in API objects. The only exceptions are pure maps in the API (currently, labels, selectors, and annotations), as opposed to sets of subobjects.
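
A client that wants map-style lookup can still build it locally from the list form. The following is a sketch; the `metrics` port and both helper functions are hypothetical and only show one way to work with lists of named subobjects.

```python
# Hypothetical container fragment using the list-of-named-subobjects form.
container = {
    "ports": [
        {"name": "www", "containerPort": 80},
        {"name": "metrics", "containerPort": 9090},
    ]
}

def find_by_name(items, name):
    """Return the first subobject whose "name" field matches, else None."""
    return next((item for item in items if item.get("name") == name), None)

def index_by_name(items):
    """Build a client-side lookup table; the wire format stays a list."""
    return {item["name"]: item for item in items}

www = find_by_name(container["ports"], "www")
ports = index_by_name(container["ports"])
```

Every key in the serialized object (`ports`, `name`, `containerPort`) remains an API field name; user-chosen names such as `www` appear only as values.
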
### Lists
@@ -72,6 +100,8 @@ A "Status" object SHOULD be returned by an API when an operation is not successf
An "Operation" object MAY be returned by any non-GET API if the operation may take a significant amount of time. The name of the Operation may be used to retrieve the final result of an operation at a later time.
TODO: More details (refer to another doc for details)
TODO: Use SelfLink to retrieve operation instead.
@@ -100,6 +130,10 @@ Kubernetes by convention exposes additional verbs as new endpoints with singular
Support of additional verbs is not required for all object types.
TODO: document proxy
TODO: more documentation of Watch
Idempotency
-----------
@@ -108,7 +142,33 @@ All compatible Kubernetes APIs MUST support "name idempotency" and respond with
TODO: name generation
APIs SHOULD set resourceVersion on retrieved resources, and support PUT idempotency by rejecting HTTP requests with a 409 HTTP status code where an HTTP header `If-Match: resourceVersion=` or `?resourceVersion=` query parameter are set and do not match the currently stored version of the resource.
Concurrency Control and Consistency
-----------------------------------
<a name="#concurrency"></a>
Read-modify-write consistency is accomplished with optimistic concurrency.
All resources have "resourceVersion" as part of their metadata. resourceVersion is a string that identifies the internal version of an object that can be used by clients to determine when objects have changed. It is changed by the server every time an object is modified. If resourceVersion is included with the PUT operation the system will verify that there have not been other successful mutations to the resource during a read/modify/write cycle, by verifying that the current value of resourceVersion matches the specified value.
The only way for a client to know the expected value of resourceVersion is to have received it from the server in response to a prior operation, typically a GET. This value MUST be treated as opaque by clients and passed unmodified back to the server. Clients should not assume that the resource version has meaning across namespaces, different kinds of resources, or different servers. Currently, the value of resourceVersion is set to match etcd's sequencer. You could think of it as a logical clock the API server can use to order requests. However, we expect the implementation of resourceVersion to change in the future, such as in the case we shard the state by kind and/or namespace, or port to another storage system.
APIs SHOULD set resourceVersion on retrieved resources, and support PUT idempotency by rejecting HTTP requests with a StatusConflict (409) HTTP status code when an HTTP header `If-Match: resourceVersion=` or a `?resourceVersion=` query parameter is set and does not match the currently stored version of the resource. (Currently, the API simply uses the value from the PUT request body.) The correct client action at this point is to GET the resource again, apply the changes afresh, and try submitting again.

This mechanism can be used to prevent races like the following:

```
Client #1                                  Client #2
GET Foo                                    GET Foo
Set Foo.Bar = "one"                        Set Foo.Baz = "two"
PUT Foo                                    PUT Foo
```

When these sequences occur in parallel, either the change to Foo.Bar or the change to Foo.Baz can be lost.

On the other hand, when specifying the resourceVersion, one of the PUTs will fail, since whichever write succeeds changes the resourceVersion for Foo.
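
The conflict-and-retry cycle can be sketched with an in-memory store. Everything here is an assumption for illustration: the `Store` class, the `Conflict` exception standing in for a 409 response, and the counter used as resourceVersion (real clients must treat the value as opaque).

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 (StatusConflict) response."""

class Store:
    # Hypothetical in-memory server; a counter plays the role of the
    # opaque resourceVersion (clients must not interpret it).
    def __init__(self):
        self._objects = {}
        self._counter = 0

    def _next_version(self):
        self._counter += 1
        return str(self._counter)

    def create(self, name, fields):
        self._objects[name] = dict(fields, resourceVersion=self._next_version())

    def get(self, name):
        return dict(self._objects[name])

    def put(self, name, obj):
        current = self._objects[name]
        sent = obj.get("resourceVersion")
        # Only check when the client supplied a version.
        if sent is not None and sent != current["resourceVersion"]:
            raise Conflict(name)
        self._objects[name] = dict(obj, resourceVersion=self._next_version())

def update_with_retry(store, name, mutate, attempts=5):
    """GET, apply the change afresh, PUT; retry from the GET on conflict."""
    for _ in range(attempts):
        obj = store.get(name)
        mutate(obj)
        try:
            store.put(name, obj)
            return
        except Conflict:
            continue
    raise Conflict(name)


store = Store()
store.create("Foo", {"Bar": "", "Baz": ""})

one = store.get("Foo")          # Client #1 reads
two = store.get("Foo")          # Client #2 reads the same version
one["Bar"] = "one"
two["Baz"] = "two"
store.put("Foo", one)           # first write wins and bumps the version

try:
    store.put("Foo", two)       # stale resourceVersion is rejected
    second_put_rejected = False
except Conflict:
    second_put_rejected = True

# Client #2 recovers by re-reading and re-applying its change.
update_with_retry(store, "Foo", lambda o: o.update(Baz="two"))
final = store.get("Foo")
```

After the retry, `final` carries both changes, so neither the update to Foo.Bar nor the update to Foo.Baz is lost.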

resourceVersion may be used as a precondition for other operations (e.g., GET, DELETE) in the future, such as for read-after-write consistency in the presence of caching.

"Watch" operations specify resourceVersion using a query parameter. It is used to specify the point at which to begin watching the specified resources. This may be used to ensure that no mutations are missed between a GET of a resource (or list of resources) and a subsequent Watch, even if the current version of the resource is more recent. This is currently the main reason that list operations (GET on a collection) return resourceVersion.

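The list-then-watch pattern can be sketched as follows. The `EventLog` class and its methods are hypothetical, and integer versions are used only so the toy server can order events; a real client simply echoes back whatever resourceVersion the list operation returned.

```python
class EventLog:
    # Hypothetical server-side change log. Integer versions are used here
    # only so the sketch can order events; real clients treat
    # resourceVersion as opaque and pass it back unmodified.
    def __init__(self):
        self.version = 0
        self.items = {}
        self.events = []          # (version, name, value)

    def record(self, name, value):
        self.version += 1
        self.items[name] = value
        self.events.append((self.version, name, value))

    def list_all(self):
        # A list (GET on a collection) returns the items plus the
        # resourceVersion of the snapshot.
        return dict(self.items), self.version

    def watch(self, resource_version):
        # Replay every change made after the given version.
        return [e for e in self.events if e[0] > resource_version]


log = EventLog()
log.record("a", 1)
snapshot, rv = log.list_all()

# Changes that land between the list and the watch ...
log.record("b", 2)
log.record("a", 3)

# ... are still delivered, because the watch starts at the list's version.
missed = log.watch(rv)
for _, name, value in missed:
    snapshot[name] = value
```
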
TODO: better syntax?

@@ -131,3 +191,41 @@ Examples:
* Find the field "current" in the object "state" in the second item in the array "fields": `fields[1].state.current`
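
A resolver for paths of this shape might look like the sketch below. It assumes zero-based array indices and a grammar of dotted field names with an optional `[index]` suffix; the `resolve` function and the sample document are hypothetical.

```python
import re

# One path segment: a field name optionally followed by [index].
_SEGMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)(?:\[(\d+)\])?$")

def resolve(obj, path):
    """Walk a dotted path such as 'fields[1].state.current' through obj."""
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError("bad path segment: %r" % segment)
        name, index = match.groups()
        obj = obj[name]
        if index is not None:
            obj = obj[int(index)]   # zero-based: [1] is the second item
    return obj


doc = {"fields": [{"state": {"current": "first"}},
                  {"state": {"current": "second"}}]}
second = resolve(doc, "fields[1].state.current")
```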

TODO: Plugins, extensions, nested kinds, headers


Status codes
------------

The following status codes may be returned by the API.

TODO: Document when each of these codes is returned

#### Success codes

* `StatusOK`
* `StatusCreated`
* `StatusAccepted`
* `StatusNoContent`

#### Error codes

* `StatusNotFound`
* `StatusMethodNotAllowed`
* `StatusUnsupportedMediaType`
* `StatusNotAcceptable`
* `StatusBadRequest`
* `StatusUnauthorized`
* `StatusForbidden`
* `StatusRequestTimeout`
* `StatusConflict`
* `StatusPreconditionFailed`
* `StatusUnprocessableEntity`
* `StatusInternalServerError`
* `StatusServiceUnavailable`
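
For readers who want the numeric values, the names above can be mapped onto standard HTTP status codes. The sketch below uses Python's standard-library `http.HTTPStatus`; the `STATUS_CODES` table and `is_success` helper are illustrative, and the mapping assumes the names carry their usual HTTP meanings.

```python
from http import HTTPStatus

# Names as listed above, mapped to standard-library equivalents.
STATUS_CODES = {
    "StatusOK": HTTPStatus.OK,
    "StatusCreated": HTTPStatus.CREATED,
    "StatusAccepted": HTTPStatus.ACCEPTED,
    "StatusNoContent": HTTPStatus.NO_CONTENT,
    "StatusNotFound": HTTPStatus.NOT_FOUND,
    "StatusMethodNotAllowed": HTTPStatus.METHOD_NOT_ALLOWED,
    "StatusUnsupportedMediaType": HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
    "StatusNotAcceptable": HTTPStatus.NOT_ACCEPTABLE,
    "StatusBadRequest": HTTPStatus.BAD_REQUEST,
    "StatusUnauthorized": HTTPStatus.UNAUTHORIZED,
    "StatusForbidden": HTTPStatus.FORBIDDEN,
    "StatusRequestTimeout": HTTPStatus.REQUEST_TIMEOUT,
    "StatusConflict": HTTPStatus.CONFLICT,
    "StatusPreconditionFailed": HTTPStatus.PRECONDITION_FAILED,
    "StatusUnprocessableEntity": HTTPStatus.UNPROCESSABLE_ENTITY,
    "StatusInternalServerError": HTTPStatus.INTERNAL_SERVER_ERROR,
    "StatusServiceUnavailable": HTTPStatus.SERVICE_UNAVAILABLE,
}

def is_success(name):
    """True for the 2xx codes listed under "Success codes"."""
    return 200 <= STATUS_CODES[name] < 300
```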

TODO: also document API status strings, reasons, and causes

Events
------

TODO: Document events (refer to another doc for details)
32 changes: 16 additions & 16 deletions docs/container-environment.md
@@ -5,12 +5,12 @@
This document describes the environment for Kubelet managed containers on a Kubernetes node (kNode).  In contrast to the Kubernetes cluster API, which provides an API for creating and managing containers, the Kubernetes container environment provides the container access to information about what else is going on in the cluster. 

This cluster information makes it possible to build applications that are *cluster aware*.  
Additionally, the Kubernetes container environment defines a series of signals that are surfaced to optional signal handlers defined as part of individual containers. Container signals are somewhat analogous to operating system signals in a traditional process model. However, these signals are designed to make it easier to build reliable, scalable cloud applications in the Kubernetes cluster. Containers that participate in this cluster lifecycle become *cluster native*.
Additionally, the Kubernetes container environment defines a series of hooks that are surfaced to optional hook handlers defined as part of individual containers. Container hooks are somewhat analogous to operating system signals in a traditional process model. However, these hooks are designed to make it easier to build reliable, scalable cloud applications in the Kubernetes cluster. Containers that participate in this cluster lifecycle become *cluster native*.

Another important part of the container environment is the file system that is available to the container. In Kubernetes, the filesystem is a combination of an [image](./images.md) and one or more [volumes](./volumes.md).


The following sections describe both the cluster information provided to containers, as well as the signals and life-cycle that allow containers to interact with the management system.
The following sections describe both the cluster information provided to containers, as well as the hooks and life-cycle that allow containers to interact with the management system.

## Cluster Information
There are two types of information that are available within the container environment.  There is information about the container itself, and there is information about other objects in the system.
@@ -32,29 +32,29 @@ FOO_SERVICE_PORT=<the port the service is running on>

Going forward, we expect that Services will have a dedicated IP address.  In that context, we will also surface services to the container via DNS.  Of course DNS is still not an enumerable protocol, so we will continue to provide environment variables so that containers can do discovery.
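
Because the environment is enumerable, a container can discover services by scanning for variables that follow the `FOO_SERVICE_PORT` pattern. The sketch below assumes the values are plain port numbers; the `discover_service_ports` helper and the sample environment are hypothetical.

```python
import os

_SUFFIX = "_SERVICE_PORT"

def discover_service_ports(environ=None):
    """Map a service prefix (e.g. 'FOO') to its port from *_SERVICE_PORT vars."""
    environ = os.environ if environ is None else environ
    ports = {}
    for key, value in environ.items():
        if key.endswith(_SUFFIX) and len(key) > len(_SUFFIX):
            try:
                ports[key[: -len(_SUFFIX)]] = int(value)
            except ValueError:
                continue      # ignore values that are not plain port numbers
    return ports


# Sample environment for illustration; a container would pass nothing
# and let the function read os.environ.
example_env = {"FOO_SERVICE_PORT": "8080", "BAR_SERVICE_PORT": "tcp", "PATH": "/bin"}
found = discover_service_ports(example_env)
```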

## Container Signals
*NB*: Container signals are under active development; we anticipate adding additional signals as the Kubernetes container management system evolves.
## Container Hooks
*NB*: Container hooks are under active development; we anticipate adding additional hooks as the Kubernetes container management system evolves.

Container signals provide information to the container about events in its management lifecycle.  For example, immediately after a container is started, it receives a *PostStart* signal.  These signals are broadcast *into* the container with information about the life-cycle of the container.  They are different from the events provided by Docker and other systems which are *output* from the container.  Output events provide a log of what has already happened.  Input signals provide real-time notification about things that are happening, but no historical log.  
Container hooks provide information to the container about events in its management lifecycle.  For example, immediately after a container is started, it receives a *PostStart* hook.  These hooks are broadcast *into* the container with information about the life-cycle of the container.  They are different from the events provided by Docker and other systems which are *output* from the container.  Output events provide a log of what has already happened.  Input hooks provide real-time notification about things that are happening, but no historical log.  

### Signal Details
There are currently two container signals that are surfaced to containers, and two proposed signals:
### Hook Details
There are currently two container hooks that are surfaced to containers, and two proposed hooks:

*PreStart - ****Proposed***

This signal is sent immediately before a container is created. It signals that the container will be created immediately after the call completes. No parameters are passed. *Note:* Some event handlers (namely ‘exec’) are incompatible with this event.
This hook is sent immediately before a container is created. It notifies that the container will be created immediately after the call completes. No parameters are passed. *Note:* Some event handlers (namely ‘exec’) are incompatible with this event.

*PostStart*

This signal is sent immediately after a container is created.  It signals to the container that it has been created.  No parameters are passed to the handler.
This hook is sent immediately after a container is created.  It notifies the container that it has been created.  No parameters are passed to the handler.

*PostRestart - ****Proposed***

This signal is called before the PostStart handler, when a container has been restarted, rather than started for the first time.  No parameters are passed to the handler.
This hook is called before the PostStart handler, when a container has been restarted, rather than started for the first time.  No parameters are passed to the handler.

*PreStop*

This signal is called immediately before a container is terminated.  This event handler is blocking, and must complete before the call to delete the container is sent to the Docker daemon. The SIGTERM notification sent by Docker is also still sent.
This hook is called immediately before a container is terminated.  This event handler is blocking, and must complete before the call to delete the container is sent to the Docker daemon. The SIGTERM notification sent by Docker is also still sent.

A single parameter named reason is passed to the handler which contains the reason for termination.  Currently the valid values for reason are:
* ```Delete``` - indicating an API call to delete the pod containing this container.
@@ -64,13 +64,13 @@ A single parameter named reason is passed to the handler which contains the reas
Eventually, user specified reasons may be [added to the API](https://github.com/GoogleCloudPlatform/kubernetes/issues/137).


### Signal Handler Execution
When a management signal occurs, the management system calls into any registered signal handlers in the container for that signal. These signal handler calls are synchronous in the context of the pod containing the container. Note: this means that signal handler execution blocks any further management of the pod. If your signal handler blocks, no other management (including health checks) will occur until the signal handler completes. Blocking signal handlers do *not* affect management of other Pods. Typically we expect that users will make their signal handlers as lightweight as possible, but there are cases where long-running commands make sense (e.g. saving state prior to container stop).
### Hook Handler Execution
When a management hook occurs, the management system calls into any registered hook handlers in the container for that hook. These hook handler calls are synchronous in the context of the pod containing the container. Note: this means that hook handler execution blocks any further management of the pod. If your hook handler blocks, no other management (including health checks) will occur until the hook handler completes. Blocking hook handlers do *not* affect management of other Pods. Typically we expect that users will make their hook handlers as lightweight as possible, but there are cases where long-running commands make sense (e.g. saving state prior to container stop).

For signals which have parameters, these parameters are passed to the event handler as a set of key/value pairs. The details of this parameter passing are handler-implementation dependent (see below).
For hooks which have parameters, these parameters are passed to the event handler as a set of key/value pairs. The details of this parameter passing are handler-implementation dependent (see below).

### Signal Handler Implementations
Signal handlers are the way that signals are surfaced to containers.  Containers can select the type of signal handler they would like to implement.  Kubernetes currently supports two different signal handler types:
### Hook Handler Implementations
Hook handlers are the way that hooks are surfaced to containers.  Containers can select the type of hook handler they would like to implement.  Kubernetes currently supports two different hook handler types:

* Exec - Executes a specific command (e.g. pre-stop.sh) inside the cgroup and namespaces of the container.  Resources consumed by the command are counted against the container.  Commands which return non-zero values are treated as container failures (and will cause kubelet to forcibly restart the container).  Parameters are passed to the command as traditional linux command line flags (e.g. pre-stop.sh --reason=HEALTH)
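
An Exec-style handler for PreStop could be sketched as below. This is an illustration only: the `save_state` placeholder and the script layout are assumptions, and the only details taken from the text are the `--reason` flag and the rule that a non-zero exit is treated as a container failure.

```python
import argparse
import sys

def save_state(reason):
    # Placeholder for real work, e.g. flushing buffers before shutdown.
    return "state saved (reason=%s)" % reason

def main(argv):
    parser = argparse.ArgumentParser(description="PreStop hook handler sketch")
    parser.add_argument("--reason", default="",
                        help="why the container is being terminated")
    args = parser.parse_args(argv)
    try:
        print(save_state(args.reason))
    except Exception as exc:
        print("pre-stop failed: %s" % exc, file=sys.stderr)
        return 1          # non-zero exit is treated as a container failure
    return 0


# A real handler script would end with: sys.exit(main(sys.argv[1:]))
# Here the handler is invoked directly with the flag form shown above.
exit_code = main(["--reason=HEALTH"])
```

Since handler execution blocks further management of the pod, the work done in `save_state` should be kept as short as the application allows.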
