Merge pull request #20 from stefanprodan/scheduler
Add canary analysis schedule interval to CRD
stefanprodan committed Jan 11, 2019
2 parents b035c1e + 07f66e8 commit 3bff2c3
Showing 46 changed files with 328 additions and 219 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -25,8 +25,7 @@ helm repo add flagger https://flagger.app
# install or upgrade
helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set metricsServer=http://prometheus.istio-system:9090 \
--set controlLoopInterval=1m
--set metricsServer=http://prometheus.istio-system:9090
```

Flagger is compatible with Kubernetes >1.10.0 and Istio >1.0.0.
@@ -75,7 +74,7 @@ You can change the canary analysis _max weight_ and the _step weight_ percentage
For a deployment named _podinfo_, a canary promotion can be defined using Flagger's custom resource:

```yaml
apiVersion: flagger.app/v1alpha2
apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
name: podinfo
@@ -102,8 +101,10 @@ spec:
- public-gateway.istio-system.svc.cluster.local
# Istio virtual service host names (optional)
hosts:
- app.iowa.weavedx.com
- podinfo.example.com
canaryAnalysis:
# schedule interval (default 60s)
interval: 1m
# max number of failed metric checks before rollback
threshold: 10
# max traffic percentage routed to canary
6 changes: 4 additions & 2 deletions artifacts/canaries/canary.yaml
@@ -1,4 +1,4 @@
apiVersion: flagger.app/v1alpha2
apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
name: podinfo
@@ -27,6 +27,8 @@ spec:
hosts:
- app.iowa.weavedx.com
canaryAnalysis:
# schedule interval (default 60s)
interval: 10s
# max number of failed metric checks before rollback
threshold: 10
# max traffic percentage routed to canary
@@ -50,7 +52,7 @@ spec:
# external checks (optional)
webhooks:
- name: integration-tests
url: http://podinfo.test:9898/echo
url: https://httpbin.org/post
timeout: 1m
metadata:
test: "all"
18 changes: 13 additions & 5 deletions artifacts/flagger/crd.yaml
@@ -4,11 +4,14 @@ metadata:
name: canaries.flagger.app
spec:
group: flagger.app
version: v1alpha2
version: v1alpha3
versions:
- name: v1alpha2
- name: v1alpha3
served: true
storage: true
- name: v1alpha2
served: true
storage: false
- name: v1alpha1
served: true
storage: false
@@ -39,7 +42,9 @@ spec:
name:
type: string
autoscalerRef:
type: object
anyOf:
- type: string
- type: object
required: ['apiVersion', 'kind', 'name']
properties:
apiVersion:
@@ -56,6 +61,9 @@ spec:
type: number
canaryAnalysis:
properties:
interval:
type: string
pattern: "^[0-9]+(m|s)"
threshold:
type: number
maxWeight:
@@ -73,7 +81,7 @@ spec:
type: string
interval:
type: string
pattern: "^[0-9]+(m)"
pattern: "^[0-9]+(m|s)"
threshold:
type: number
webhooks:
@@ -90,4 +98,4 @@ spec:
format: url
timeout:
type: string
pattern: "^[0-9]+(s)"
pattern: "^[0-9]+(m|s)"
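
The duration patterns in this schema can be exercised with Go's standard `regexp` package. This is a minimal sketch, not Flagger code: the `matchesCRDPattern` helper and the sample inputs are hypothetical, and it assumes the validator applies the pattern as an unanchored search, the way `regexp.MatchString` does. Under that assumption the pattern accepts any string that starts with digits followed by `m` or `s`, so a value with trailing characters also passes, whereas `time.ParseDuration` rejects it.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// crdPattern is the pattern string copied from the CRD schema in the diff.
var crdPattern = regexp.MustCompile("^[0-9]+(m|s)")

// matchesCRDPattern is a hypothetical helper: it reports whether s passes the
// pattern when the pattern is applied as an unanchored search.
func matchesCRDPattern(s string) bool {
	return crdPattern.MatchString(s)
}

func main() {
	// Sample inputs are illustrative only.
	for _, s := range []string{"10s", "1m", "1h", "90", "10sx"} {
		_, err := time.ParseDuration(s)
		fmt.Printf("%-5s pattern=%-5t parseDuration=%t\n", s, matchesCRDPattern(s), err == nil)
	}
}
```

If stricter validation is wanted, appending `$` to the pattern would reject trailing characters; whether that suits the CRD is a design decision outside this sketch.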
2 changes: 1 addition & 1 deletion artifacts/flagger/deployment.yaml
@@ -22,7 +22,7 @@ spec:
serviceAccountName: flagger
containers:
- name: flagger
image: quay.io/stefanprodan/flagger:0.2.0
image: quay.io/stefanprodan/flagger:0.3.0-beta.1
imagePullPolicy: Always
ports:
- name: http
4 changes: 2 additions & 2 deletions charts/flagger/Chart.yaml
@@ -1,11 +1,11 @@
apiVersion: v1
name: flagger
version: 0.2.0
appVersion: 0.2.0
appVersion: 0.3.0-beta.1
kubeVersion: ">=1.9.0-0"
engine: gotpl
description: Flagger is a Kubernetes operator that automates the promotion of canary deployments using Istio routing for traffic shifting and Prometheus metrics for canary analysis.
home: https://flagger.app
home: https://docs.flagger.app
icon: https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/logo/flagger-icon.png
sources:
- https://github.com/stefanprodan/flagger
10 changes: 5 additions & 5 deletions charts/flagger/README.md
@@ -1,7 +1,7 @@
# Flagger

[Flagger](https://flagger.app) is a Kubernetes operator that automates the promotion of canary deployments
using Istio routing for traffic shifting and Prometheus metrics for canary analysis.
[Flagger](https://github.com/stefanprodan/flagger) is a Kubernetes operator that automates the promotion of
canary deployments using Istio routing for traffic shifting and Prometheus metrics for canary analysis.
Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators
like HTTP requests success rate, requests average duration and pods health.
Based on the KPIs analysis a canary is promoted or aborted and the analysis result is published to Slack.
@@ -48,7 +48,6 @@ Parameter | Description | Default
`image.repository` | image repository | `quay.io/stefanprodan/flagger`
`image.tag` | image tag | `<VERSION>`
`image.pullPolicy` | image pull policy | `IfNotPresent`
`controlLoopInterval` | wait interval between checks | `10s`
`metricsServer` | Prometheus URL | `http://prometheus.istio-system:9090`
`slack.url` | Slack incoming webhook | None
`slack.channel` | Slack channel | None
@@ -68,7 +67,8 @@ Specify each parameter using the `--set key=value[,key=value]` argument to `helm
```console
$ helm upgrade -i flagger flagger/flagger \
--namespace istio-system \
--set controlLoopInterval=1m
--set slack.url=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK \
--set slack.channel=general
```

Alternatively, a YAML file that specifies the values for the above parameters can be provided while installing the chart. For example,
@@ -80,5 +80,5 @@ $ helm upgrade -i flagger flagger/flagger \
```

> **Tip**: You can use the default [values.yaml](values.yaml)
```

19 changes: 13 additions & 6 deletions charts/flagger/templates/crd.yaml
@@ -5,11 +5,14 @@ metadata:
name: canaries.flagger.app
spec:
group: flagger.app
version: v1alpha2
version: v1alpha3
versions:
- name: v1alpha2
- name: v1alpha3
served: true
storage: true
- name: v1alpha2
served: true
storage: false
- name: v1alpha1
served: true
storage: false
@@ -40,7 +43,9 @@ spec:
name:
type: string
autoscalerRef:
type: object
anyOf:
- type: string
- type: object
required: ['apiVersion', 'kind', 'name']
properties:
apiVersion:
@@ -57,6 +62,9 @@ spec:
type: number
canaryAnalysis:
properties:
interval:
type: string
pattern: "^[0-9]+(m|s)"
threshold:
type: number
maxWeight:
@@ -74,7 +82,7 @@ spec:
type: string
interval:
type: string
pattern: "^[0-9]+(m)"
pattern: "^[0-9]+(m|s)"
threshold:
type: number
webhooks:
@@ -91,6 +99,5 @@ spec:
format: url
timeout:
type: string
pattern: "^[0-9]+(s)"

pattern: "^[0-9]+(m|s)"
{{- end }}
1 change: 0 additions & 1 deletion charts/flagger/templates/deployment.yaml
@@ -35,7 +35,6 @@ spec:
command:
- ./flagger
- -log-level=info
- -control-loop-interval={{ .Values.controlLoopInterval }}
- -metrics-server={{ .Values.metricsServer }}
{{- if .Values.slack.url }}
- -slack-url={{ .Values.slack.url }}
3 changes: 1 addition & 2 deletions charts/flagger/values.yaml
@@ -2,10 +2,9 @@

image:
repository: quay.io/stefanprodan/flagger
tag: 0.2.0
tag: 0.3.0-beta.1
pullPolicy: IfNotPresent

controlLoopInterval: "1m"
metricsServer: "http://prometheus.istio-system.svc.cluster.local:9090"

slack:
4 changes: 2 additions & 2 deletions cmd/flagger/main.go
@@ -37,7 +37,7 @@ func init() {
flag.StringVar(&kubeconfig, "kubeconfig", "", "Path to a kubeconfig. Only required if out-of-cluster.")
flag.StringVar(&masterURL, "master", "", "The address of the Kubernetes API server. Overrides any value in kubeconfig. Only required if out-of-cluster.")
flag.StringVar(&metricsServer, "metrics-server", "http://prometheus:9090", "Prometheus URL")
flag.DurationVar(&controlLoopInterval, "control-loop-interval", 10*time.Second, "wait interval between rollouts")
flag.DurationVar(&controlLoopInterval, "control-loop-interval", 10*time.Second, "Kubernetes API sync interval")
flag.StringVar(&logLevel, "log-level", "debug", "Log level can be: debug, info, warning, error.")
flag.StringVar(&port, "port", "8080", "Port to listen on.")
flag.StringVar(&slackURL, "slack-url", "", "Slack hook URL.")
@@ -77,7 +77,7 @@ func main() {
}

flaggerInformerFactory := informers.NewSharedInformerFactory(flaggerClient, time.Second*30)
canaryInformer := flaggerInformerFactory.Flagger().V1alpha2().Canaries()
canaryInformer := flaggerInformerFactory.Flagger().V1alpha3().Canaries()

logger.Infof("Starting flagger version %s revision %s", version.VERSION, version.REVISION)

13 changes: 8 additions & 5 deletions docs/gitbook/how-it-works.md
@@ -9,7 +9,7 @@
For a deployment named _podinfo_, a canary promotion can be defined using Flagger's custom resource:

```yaml
apiVersion: flagger.app/v1alpha2
apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
name: podinfo
@@ -38,6 +38,8 @@ spec:
hosts:
- podinfo.example.com
canaryAnalysis:
# schedule interval (default 60s)
interval: 1m
# max number of failed metric checks before rollback
threshold: 10
# max traffic percentage routed to canary
@@ -128,12 +130,13 @@ You can change the canary analysis _max weight_ and the _step weight_ percentage
### Canary Analysis

The canary analysis runs periodically until it reaches the maximum traffic weight or the failed checks threshold.
By default the analysis interval is set to one minute and can be configured with the `controlLoopInterval` command flag.

Spec:

```yaml
canaryAnalysis:
# schedule interval (default 60s)
interval: 1m
# max number of failed metric checks before rollback
threshold: 10
# max traffic percentage routed to canary
@@ -148,13 +151,13 @@ The above analysis, if it succeeds, will run for 25 minutes while validating the
You can determine the minimum time that it takes to validate and promote a canary deployment using this formula:

```
controlLoopInterval * (maxWeight / stepWeight)
interval * (maxWeight / stepWeight)
```

And the time it takes for a canary to be rollback:
And the time it takes for a canary to be rollback when the metrics or webhook checks are failing:

```
controlLoopInterval * threshold
interval * threshold
```

### HTTP Metrics
3 changes: 1 addition & 2 deletions docs/gitbook/install/install-flagger.md
@@ -23,8 +23,7 @@ Deploy Flagger in the _**istio-system**_ namespace:
```bash
helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set metricsServer=http://prometheus.istio-system:9090 \
--set controlLoopInterval=1m
--set metricsServer=http://prometheus.istio-system:9090
```

Enable **Slack** notifications:
4 changes: 3 additions & 1 deletion docs/gitbook/usage/progressive-delivery.md
@@ -20,7 +20,7 @@ kubectl apply -f ${REPO}/artifacts/canaries/hpa.yaml
Create a canary custom resource \(replace example.com with your own domain\):

```yaml
apiVersion: v1alpha2
apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
name: podinfo
@@ -49,6 +49,8 @@ spec:
hosts:
- app.example.com
canaryAnalysis:
# schedule interval (default 60s)
interval: 1m
# max number of failed metric checks before rollback
threshold: 5
# max traffic percentage routed to canary
2 changes: 1 addition & 1 deletion hack/update-codegen.sh
@@ -23,6 +23,6 @@ CODEGEN_PKG=${CODEGEN_PKG:-$(cd ${SCRIPT_ROOT}; ls -d -1 ./vendor/k8s.io/code-ge

${CODEGEN_PKG}/generate-groups.sh "deepcopy,client,informer,lister" \
github.com/stefanprodan/flagger/pkg/client github.com/stefanprodan/flagger/pkg/apis \
flagger:v1alpha2 \
flagger:v1alpha3 \
--go-header-file ${SCRIPT_ROOT}/hack/boilerplate.go.txt

@@ -16,6 +16,6 @@ limitations under the License.

// +k8s:deepcopy-gen=package

// Package v1alpha2 is the v1alpha2 version of the API.
// Package v1alpha3 is the v1alpha3 version of the API.
// +groupName=flagger.app
package v1alpha2
package v1alpha3
@@ -14,7 +14,7 @@ See the License for the specific language governing permissions and
limitations under the License.
*/

package v1alpha2
package v1alpha3

import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -25,7 +25,7 @@ import (
)

// SchemeGroupVersion is group version used to register these objects
var SchemeGroupVersion = schema.GroupVersion{Group: rollout.GroupName, Version: "v1alpha2"}
var SchemeGroupVersion = schema.GroupVersion{Group: rollout.GroupName, Version: "v1alpha3"}

// Kind takes an unqualified kind and returns back a Group qualified GroupKind
func Kind(kind string) schema.GroupKind {
