Admission control plugins: LimitRanger and ResourceQuota #3057

122 changes: 122 additions & 0 deletions docs/design/admission_control_limit_range.md
@@ -0,0 +1,122 @@
# Admission control plugin: LimitRanger

## Background

This document proposes a system for enforcing min/max limits per resource as part of admission control.

Comment (Member):

I'd like to see a user-documentation version of this, in addition to, or in place of this "docs/design" file. It would be pretty close to this, but with phrases like "is introduced" and "future enhancements" removed. Assuming you do that in a separate PR. You may want to merge all the admission controller user docs into one file for readability. Up to you.

Comment (Member):

Also, user docs would just describe the JSON, not the go code.

Comment (Member):

And it would have a sentence that explains why one might want to set up a LimitRanger, e.g. to prevent users from creating Pods that are too large to fit on any machine, or which have such low limits that they are sure to fail.

Comment (Member Author):

100% agree, I think this document has lived a long time, so wanted to just get agreement on the design. The implementation PRs can add the user documentation.

Comment (Member):

Never mind. I was thinking this PR included code too. It is still just outline. So, ignore above comments.

Comment (Member Author):

No problem. There are implementation PRs per each design, I can add user documentation to those assuming you were good with this outline.


## Model Changes

A new resource, **LimitRange**, is introduced to enumerate min/max limits for a resource type scoped to a
Kubernetes namespace.

```
const (
	// Limit that applies to all pods in a namespace
	LimitTypePod string = "Pod"
	// Limit that applies to all containers in a namespace
	LimitTypeContainer string = "Container"
)

// LimitRangeItem defines a min/max usage limit for any resource that matches on kind
type LimitRangeItem struct {
	// Type of resource that this limit applies to
	Type string `json:"type,omitempty"`
	// Max usage constraints on this kind by resource name
	Max ResourceList `json:"max,omitempty"`
	// Min usage constraints on this kind by resource name
	Min ResourceList `json:"min,omitempty"`
}

// LimitRangeSpec defines a min/max usage limit for resources that match on kind
type LimitRangeSpec struct {
	// Limits is the list of LimitRangeItem objects that are enforced
	Limits []LimitRangeItem `json:"limits"`
}

// LimitRange sets resource usage limits for each kind of resource in a Namespace
type LimitRange struct {
	TypeMeta   `json:",inline"`
	ObjectMeta `json:"metadata,omitempty"`

	// Spec defines the limits enforced
	Spec LimitRangeSpec `json:"spec,omitempty"`
}

// LimitRangeList is a list of LimitRange items.
type LimitRangeList struct {
	TypeMeta `json:",inline"`
	ListMeta `json:"metadata,omitempty"`

	// Items is a list of LimitRange objects
	Items []LimitRange `json:"items"`
}
```
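
To illustrate how the JSON tags above shape the serialized form, here is a minimal sketch that marshals a sample spec. It makes assumptions that are not part of the proposal: `ResourceList` is reduced to a map of strings, `TypeMeta` and `ObjectMeta` are left out, and `exampleSpecJSON` is a hypothetical helper. The sample values are borrowed from the `kubectl describe` output in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResourceList is a simplified stand-in (assumption): resource name to
// quantity string. The real type uses a richer quantity representation.
type ResourceList map[string]string

// LimitRangeItem and LimitRangeSpec reuse the JSON tags from the proposal.
type LimitRangeItem struct {
	Type string       `json:"type,omitempty"`
	Max  ResourceList `json:"max,omitempty"`
	Min  ResourceList `json:"min,omitempty"`
}

type LimitRangeSpec struct {
	Limits []LimitRangeItem `json:"limits"`
}

// exampleSpecJSON is a hypothetical helper that serializes a sample spec
// holding a single Pod-level limit.
func exampleSpecJSON() (string, error) {
	spec := LimitRangeSpec{Limits: []LimitRangeItem{{
		Type: "Pod",
		Max:  ResourceList{"cpu": "2", "memory": "1Gi"},
		Min:  ResourceList{"cpu": "250m", "memory": "1Mi"},
	}}}
	b, err := json.Marshal(spec)
	return string(b), err
}

func main() {
	s, err := exampleSpecJSON()
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s)
}
```

The sketch covers only the `spec` portion; a complete object would also carry the type and metadata fields.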

## AdmissionControl plugin: LimitRanger

The **LimitRanger** plug-in introspects all incoming admission requests.

It makes decisions by evaluating the incoming object against all defined **LimitRange** objects in the request context namespace.

The following min/max limits are imposed:

**Type: Container**

| ResourceName | Description |
| ------------ | ----------- |
| cpu | Min/Max amount of cpu per container |
| memory | Min/Max amount of memory per container |

**Type: Pod**

| ResourceName | Description |
| ------------ | ----------- |
| cpu | Min/Max amount of cpu per pod |
| memory | Min/Max amount of memory per pod |

If the incoming object would cause a violation of the enumerated constraints, the request is denied with a set of
messages explaining what constraints were the source of the denial.

If a constraint is not enumerated by a **LimitRange**, it is not tracked.
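
The evaluation described above could be sketched roughly as follows. This is not the plug-in's implementation: `checkLimits` is a hypothetical helper, quantities are simplified to plain integers (millicores for cpu, bytes for memory), and skipping a resource that is absent from the object's usage is an assumption the proposal does not spell out.

```go
package main

import (
	"fmt"
	"sort"
)

// ResourceList is a simplified stand-in for the proposal's ResourceList.
// Assumption: quantities are plain integers (millicores for cpu, bytes for memory).
type ResourceList map[string]int64

// LimitRangeItem mirrors the proposed type, minus the JSON tags.
type LimitRangeItem struct {
	Type string
	Max  ResourceList
	Min  ResourceList
}

// checkLimits is a hypothetical helper, not part of the proposal. It compares
// the usage of one object of the given kind against every LimitRangeItem of
// that Type and returns one message per violated constraint.
// Assumption: a resource missing from usage is not checked.
func checkLimits(kind string, usage ResourceList, items []LimitRangeItem) []string {
	var violations []string
	for _, item := range items {
		if item.Type != kind {
			continue
		}
		for name, minimum := range item.Min {
			if got, ok := usage[name]; ok && got < minimum {
				violations = append(violations,
					fmt.Sprintf("%s %s usage %d is below the minimum %d", kind, name, got, minimum))
			}
		}
		for name, maximum := range item.Max {
			if got, ok := usage[name]; ok && got > maximum {
				violations = append(violations,
					fmt.Sprintf("%s %s usage %d exceeds the maximum %d", kind, name, got, maximum))
			}
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(violations)
	return violations
}

func main() {
	limits := []LimitRangeItem{{
		Type: "Container",
		Min:  ResourceList{"cpu": 250, "memory": 1 << 20},
		Max:  ResourceList{"cpu": 2000, "memory": 1 << 30},
	}}
	// A container asking for too little cpu and too much memory.
	usage := ResourceList{"cpu": 100, "memory": 2 << 30}
	for _, v := range checkLimits("Container", usage, limits) {
		fmt.Println(v)
	}
}
```

Returning every violation, rather than stopping at the first, matches the requirement that a denial carries a set of messages explaining which constraints were violated.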

## kube-apiserver

The server is updated to be aware of **LimitRange** objects.

The constraints are only enforced if the kube-apiserver is started as follows:

```
$ kube-apiserver -admission_control=LimitRanger
```

## kubectl

kubectl is modified to support the **LimitRange** resource.

```kubectl describe``` provides a human-readable output of limits.

For example,

```
$ kubectl namespace myspace
$ kubectl create -f examples/limitrange/limit-range.json
$ kubectl get limits
NAME
limits
$ kubectl describe limits limits
Name: limits
Type        Resource    Min     Max
----        --------    ---     ---
Pod         memory      1Mi     1Gi
Pod         cpu         250m    2
Container   cpu         250m    2
Container   memory      1Mi     1Gi
```

## Future Enhancements: Define limits for a particular pod or container.

In the current proposal, a **LimitRangeItem** matches purely on **LimitRangeItem.Type**.

We expect to want limits for particular pods or containers, selected by name/uid and label/field selector.

We intend to add these additional restrictions at a future point in time, so that a **LimitRangeItem** can be made more restrictive.
152 changes: 152 additions & 0 deletions docs/design/admission_control_resource_quota.md
@@ -0,0 +1,152 @@
# Admission control plugin: ResourceQuota

## Background

This document proposes a system for enforcing hard resource usage limits per namespace as part of admission control.

## Model Changes

A new resource, **ResourceQuota**, is introduced to enumerate hard resource limits in a Kubernetes namespace.

A new resource, **ResourceQuotaUsage**, is introduced to support atomic updates of a **ResourceQuota** status.

```
// The following identify resource constants for Kubernetes object types
const (
	// Pods, number
	ResourcePods ResourceName = "pods"
	// Services, number
	ResourceServices ResourceName = "services"
	// ReplicationControllers, number
	ResourceReplicationControllers ResourceName = "replicationcontrollers"
	// ResourceQuotas, number
	ResourceQuotas ResourceName = "resourcequotas"
)

// ResourceQuotaSpec defines the desired hard limits to enforce for Quota
type ResourceQuotaSpec struct {
	// Hard is the set of desired hard limits for each named resource
	Hard ResourceList `json:"hard,omitempty"`
}

// ResourceQuotaStatus defines the enforced hard limits and observed use
type ResourceQuotaStatus struct {
	// Hard is the set of enforced hard limits for each named resource
	Hard ResourceList `json:"hard,omitempty"`
	// Used is the current observed total usage of the resource in the namespace
	Used ResourceList `json:"used,omitempty"`
}

// ResourceQuota sets aggregate quota restrictions enforced per namespace
type ResourceQuota struct {
	TypeMeta   `json:",inline"`
	ObjectMeta `json:"metadata,omitempty"`

	// Spec defines the desired quota
	Spec ResourceQuotaSpec `json:"spec,omitempty"`

	// Status defines the actual enforced quota and its current usage
	Status ResourceQuotaStatus `json:"status,omitempty"`
}

// ResourceQuotaUsage captures system observed quota status per namespace
// It is used to enforce atomic updates of a backing ResourceQuota.Status field in storage
type ResourceQuotaUsage struct {
	TypeMeta   `json:",inline"`
	ObjectMeta `json:"metadata,omitempty"`

	// Status defines the actual enforced quota and its current usage
	Status ResourceQuotaStatus `json:"status,omitempty"`
}

// ResourceQuotaList is a list of ResourceQuota items
type ResourceQuotaList struct {
	TypeMeta `json:",inline"`
	ListMeta `json:"metadata,omitempty"`

	// Items is a list of ResourceQuota objects
	Items []ResourceQuota `json:"items"`
}
```

## AdmissionControl plugin: ResourceQuota

The **ResourceQuota** plug-in introspects all incoming admission requests.

It makes decisions by evaluating the incoming object against all defined **ResourceQuota.Status.Hard** resource limits in the request
namespace. If acceptance of the resource would cause the total usage of a named resource to exceed its hard limit, the request is denied.

The following resource limits are imposed as part of core Kubernetes at the namespace level:

| ResourceName | Description |
| ------------ | ----------- |
| cpu | Total cpu usage |
| memory | Total memory usage |
| pods | Total number of pods |
| services | Total number of services |
| replicationcontrollers | Total number of replication controllers |
| resourcequotas | Total number of resource quotas |

Any resource that is not part of core Kubernetes must follow the resource naming convention prescribed by Kubernetes.

This means the resource must have a fully-qualified name (e.g. mycompany.org/shinynewresource).

If the incoming request does not cause the total usage to exceed any of the enumerated hard resource limits, the plug-in will post a
**ResourceQuotaUsage** document to the server to atomically update the observed usage based on the previously read
**ResourceQuota.ResourceVersion**. This keeps incremental usage atomically consistent, but intentionally introduces a bottleneck
into the system.
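
The check-then-update flow can be sketched as below. Everything here is an illustrative assumption rather than the proposed implementation: `store` is a hypothetical in-memory stand-in for the API server's storage, `admit` and `updateUsage` are hypothetical names, quantities are plain integers, and the sketch simply returns a conflict error instead of deciding how the caller should react to one.

```go
package main

import (
	"errors"
	"fmt"
)

// ResourceList is a simplified stand-in: resource name to integer quantity.
type ResourceList map[string]int64

// quota is a reduced view of a stored ResourceQuota (assumption).
type quota struct {
	ResourceVersion int64
	Hard            ResourceList
	Used            ResourceList
}

// store is a hypothetical in-memory stand-in for the server's storage.
type store struct{ q quota }

var errConflict = errors.New("resource version conflict")

// updateUsage plays the role of posting a ResourceQuotaUsage document: the
// write succeeds only if the caller read the latest ResourceVersion.
func (s *store) updateUsage(version int64, used ResourceList) error {
	if version != s.q.ResourceVersion {
		return errConflict
	}
	s.q.Used = used
	s.q.ResourceVersion++
	return nil
}

// admit denies the request if adding delta would exceed a hard limit;
// otherwise it records the new usage against the version it read.
func admit(s *store, delta ResourceList) error {
	q := s.q // read the quota and remember its ResourceVersion
	next := ResourceList{}
	for name, used := range q.Used {
		next[name] = used
	}
	for name, d := range delta {
		hard, tracked := q.Hard[name]
		if !tracked {
			continue // resources without a hard limit are not constrained
		}
		if next[name]+d > hard {
			return fmt.Errorf("quota exceeded for %s: used %d + requested %d > hard %d",
				name, next[name], d, hard)
		}
		next[name] += d
	}
	return s.updateUsage(q.ResourceVersion, next)
}

func main() {
	s := &store{q: quota{
		ResourceVersion: 1,
		Hard:            ResourceList{"pods": 2},
		Used:            ResourceList{"pods": 1},
	}}
	fmt.Println(admit(s, ResourceList{"pods": 1})) // fits within the hard limit
	fmt.Println(admit(s, ResourceList{"pods": 1})) // would exceed the hard limit
}
```

Because every admitted request must win the version check on the same document, writers are serialized per quota, which is the intentional bottleneck mentioned above.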

To optimize system performance, all resource quotas in a namespace should be tracked on the same **ResourceQuota** document. As a result,
it is encouraged to cap the number of individual quotas tracked in the **Namespace** at 1 by explicitly setting that limit
(`resourcequotas`) in the **ResourceQuota** document.

## kube-apiserver

The server is updated to be aware of **ResourceQuota** objects.

The quota is only enforced if the kube-apiserver is started as follows:

```
$ kube-apiserver -admission_control=ResourceQuota
```

## kube-controller-manager

A new controller is defined that runs a sync loop to calculate quota usage across the namespace.

**ResourceQuota** usage is only calculated if a namespace has a **ResourceQuota** object.

If the observed usage differs from the recorded usage, the controller sends a **ResourceQuotaUsage** resource
to the server to atomically update the recorded usage.

The synchronization loop frequency will control how quickly DELETE actions are recorded in the system and usage is ticked down.

To optimize the synchronization loop, this controller will WATCH on Pod resources to track DELETE events, and in response, recalculate
usage. This is because a Pod deletion will have the most impact on observed cpu and memory usage in the system, and we anticipate
these being the resources that run closest to their prescribed quota limits.
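
One pass of that recalculation might look like the following sketch. The `pod` type, `observedUsage`, and `needsUpdate` are hypothetical names introduced for illustration, usage is reduced to cpu, memory, and pod count, and the WATCH itself is replaced by simply dropping a pod from a slice.

```go
package main

import "fmt"

// ResourceList is a simplified stand-in: resource name to integer quantity.
type ResourceList map[string]int64

// pod is a hypothetical, reduced view of a Pod: only what it consumes.
type pod struct {
	CPUMilli    int64
	MemoryBytes int64
}

// observedUsage recalculates usage from the pods currently in the namespace.
func observedUsage(pods []pod) ResourceList {
	usage := ResourceList{"cpu": 0, "memory": 0, "pods": int64(len(pods))}
	for _, p := range pods {
		usage["cpu"] += p.CPUMilli
		usage["memory"] += p.MemoryBytes
	}
	return usage
}

// needsUpdate reports whether the observed usage differs from the recorded
// usage, in which case a ResourceQuotaUsage update would be sent.
func needsUpdate(recorded, observed ResourceList) bool {
	if len(recorded) != len(observed) {
		return true
	}
	for name, value := range observed {
		if got, ok := recorded[name]; !ok || got != value {
			return true
		}
	}
	return false
}

func main() {
	pods := []pod{
		{CPUMilli: 100, MemoryBytes: 64 << 20},
		{CPUMilli: 250, MemoryBytes: 128 << 20},
	}
	recorded := observedUsage(pods)

	// Simulate a Pod DELETE event by dropping the last pod.
	pods = pods[:1]
	observed := observedUsage(pods)

	if needsUpdate(recorded, observed) {
		fmt.Println("usage changed; send ResourceQuotaUsage:", observed)
	}
}
```

Recomputing from the full pod list on each pass keeps the recorded usage self-correcting, at the cost of a list operation per sync.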

## kubectl

kubectl is modified to support the **ResourceQuota** resource.

```kubectl describe``` provides a human-readable output of quota.

For example,

```
$ kubectl namespace myspace
$ kubectl create -f examples/resourcequota/resource-quota.json
$ kubectl get quota
NAME
myquota
$ kubectl describe quota myquota
Name: myquota
Resource                 Used    Hard
--------                 ----    ----
cpu                      100m    20
memory                   0       1.5Gb
pods                     1       10
replicationControllers   1       10
services                 2       3
```