docs: Add several docs and re-organize #1095

Merged · 8 commits · Mar 5, 2021
129 changes: 129 additions & 0 deletions docs/dr_ha_recommendations.md
@@ -0,0 +1,129 @@
# HA/DR Recommendations

## EventBus

A simple EventBus for non-prod deployments or testing purposes could be:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  nats:
    native:
      auth: token
```

However, this is not good enough for a production deployment. The following
settings are recommended to make it more reliable and achieve high
availability.

### Persistent Volumes

Even though the EventBus PODs already have a data sync mechanism between them,
persistent volumes are still recommended to avoid any event data loss when the
PODs crash.

An EventBus with persistent volumes looks like the following:

```yaml
spec:
  nats:
    native:
      auth: token
      persistence:
        storageClassName: standard
        accessMode: ReadWriteOnce
        volumeSize: 20Gi
```

### Anti-Affinity

You can run the EventBus PODs with anti-affinity to avoid the situation where
all the PODs are gone when a disaster happens.

An EventBus with best effort node anti-affinity:

```yaml
spec:
  nats:
    native:
      auth: token
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    controller: eventbus-controller
                    eventbus-name: default
                topologyKey: kubernetes.io/hostname
              weight: 100
```

An EventBus with hard requirement node anti-affinity:

```yaml
spec:
  nats:
    native:
      auth: token
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  controller: eventbus-controller
                  eventbus-name: default
              topologyKey: kubernetes.io/hostname
```

To do AZ (Availability Zone) anti-affinity, change the value of `topologyKey`
from `kubernetes.io/hostname` to `topology.kubernetes.io/zone`.
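
For example, the hard-requirement rule above becomes a zone-level rule by
swapping only that one field (a sketch derived from the previous example; the
`affinity` block still sits under `spec.nats.native`):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            controller: eventbus-controller
            eventbus-name: default
        topologyKey: topology.kubernetes.io/zone # zone-level instead of node-level
```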

Besides `affinity`,
[nodeSelector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector)
and
[tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/)
can also be set, through `spec.nats.native.nodeSelector` and
`spec.nats.native.tolerations`.
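
For instance, a minimal sketch that schedules the EventBus PODs onto a
dedicated node pool; the `disktype` label and the `dedicated` taint are
assumptions about your cluster, not anything Argo Events defines:

```yaml
spec:
  nats:
    native:
      auth: token
      nodeSelector:
        disktype: ssd # assumption: nodes pre-labeled for the EventBus
      tolerations:
        - key: dedicated # assumption: a taint reserved for EventBus nodes
          operator: Equal
          value: eventbus
          effect: NoSchedule
```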

### POD Priority

Setting
[POD Priority](https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/)
could reduce the chance of PODs being evicted.

> **Contributor:** spelling
>
> **Member Author:** Thanks, fixed!

Priority can be set through `spec.nats.native.priorityClassName` or
`spec.nats.native.priority`.
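
For example, a minimal sketch referencing a PriorityClass; the class name
`high-priority` is an assumption, and the PriorityClass itself must be created
beforehand:

```yaml
spec:
  nats:
    native:
      auth: token
      priorityClassName: high-priority # assumption: an existing PriorityClass
```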

## EventSources

### Replicas

For the following types of EventSources, `spec.replica` can be set to a number
`>1` to make them HA (see more detail
[here](eventsources/deployment-strategies.md)); a sketch follows the list
below.

- AWS SNS
- AWS SQS
- GitHub
- GitLab
- NetApp Storage GRID
- Slack
- Stripe
- Webhook
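
For example, a minimal sketch of a `Webhook` EventSource running 2 replicas;
the event name, port and endpoint are assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: webhook
spec:
  replica: 2 # run 2 PODs of this EventSource for HA
  webhook:
    example: # assumption: an example webhook event
      port: "12000"
      endpoint: /example
      method: POST
```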

### EventSource POD Node Selection

EventSource POD `affinity`, `nodeSelector` and `tolerations` can be set through
`spec.template.affinity`, `spec.template.nodeSelector` and
`spec.template.tolerations`.
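
A sketch under assumed cluster labels and taints (none of the values below
come from Argo Events itself):

```yaml
spec:
  template:
    nodeSelector:
      disktype: ssd # assumption
    tolerations:
      - key: dedicated # assumption
        operator: Equal
        value: eventsources
        effect: NoSchedule
```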

## Sensors

### Sensor POD Node Selection

Sensor POD `affinity`, `nodeSelector` and `tolerations` can likewise be set
through `spec.template.affinity`, `spec.template.nodeSelector` and
`spec.template.tolerations`.
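
For instance, a best-effort anti-affinity sketch for Sensor PODs; the
`sensor-name: default` label is an assumption and may need adapting to the
labels your Sensor PODs actually carry:

```yaml
spec:
  template:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  sensor-name: default # assumption
              topologyKey: kubernetes.io/hostname
            weight: 100
```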
39 changes: 39 additions & 0 deletions docs/eventsources/calendar-catch-up.md
@@ -0,0 +1,39 @@
# Calendar EventSource Catch Up

The catch-up feature allows Calendar EventSources to execute missed schedules
from the last run.

## Enable Catch-up for EventSource Definition

Users can configure catch-up for each event in the EventSource.

```yaml
example-with-catch-up:
  # Catch up on missed events from the last event timestamp. The last event
  # will be persisted in a ConfigMap.
  schedule: "* * * * *"
  persistence:
    catchup:
      enabled: true # Check missed schedules from last persisted event time on every start
      maxDuration: 5m # Maximum duration to go back for the catch-up
    configMap: # ConfigMap to persist the last successful event timestamp
      createIfNotExist: true
      name: test-configmap
```

The last calendar event is persisted in the configured ConfigMap. Multiple
events can use the same ConfigMap to persist their events.

```yaml
data:
  calendar.example-with-catch-up: '{"eventTime":"2020-10-19 22:50:00.0003192 +0000 UTC m=+683.567066901"}'
```

## Disable Catch-up

Set `enabled: false` in the `catchup` element:

```yaml
catchup:
  enabled: false
```
30 changes: 0 additions & 30 deletions docs/eventsources/catup.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/eventsources/services.md
@@ -29,5 +29,5 @@ expose the endpoint for external access, please manage it by using native K8s
objects (i.e. a Load Balancer type Service, or an Ingress), and remove `service`
field from the EventSource object.

-You can refer to [webhook health check](../webhook-health-check.md) if you need a
+You can refer to [webhook health check](webhook-health-check.md) if you need a
health check endpoint for LB Service or Ingress configuration.
18 files renamed without changes.
@@ -1,6 +1,6 @@
# Webhook Authentication

-![GA](assets/ga.svg)
+![GA](../assets/ga.svg)

> v1.0 and after

@@ -1,6 +1,6 @@
# Webhook Health Check

-![GA](assets/ga.svg)
+![GA](../assets/ga.svg)

> v1.0 and after

113 changes: 113 additions & 0 deletions docs/more-about-sensors-and-triggers.md
@@ -0,0 +1,113 @@
# More About Sensors And Triggers

## Multiple Dependencies

If multiple dependencies are defined in the `Sensor`, you can configure
[Trigger Conditions](trigger-conditions.md) to determine under what
circumstances the trigger will be executed.

For example, if 2 dependencies `A` and `B` are defined, the condition `A || B`
means an event from either `A` or `B` will execute the trigger.

What happens if `A && B` is defined? Assume that before `B` has an event `b1`
delivered, `A` has already received events `a1` - `a10`; in this case, `a10`
and `b1` will be used to execute the trigger, and `a1` - `a9` will be dropped.

In short, at the moment the `Trigger Conditions` resolve to true, the latest
events from each dependency will be used to trigger the actions.
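
As an illustration, a sketch of how a condition could be declared on a
trigger; the dependency names, trigger name and HTTP target are assumptions
(the HTTP trigger shape is borrowed from the retry examples below):

```yaml
spec:
  dependencies:
    - name: A
      eventSourceName: webhook
      eventName: example-a # assumption
    - name: B
      eventSourceName: webhook
      eventName: example-b # assumption
  triggers:
    - template:
        conditions: "A && B" # executed only once both A and B have delivered events
        name: http-trigger
        http:
          url: https://xxxxx.com/
          method: GET
```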

## Duplicate Dependencies

Due to technical reasons, the same `eventSourceName` and `eventName`
combination can not be referenced twice in one `Sensor` object. For example,
the following dependency definitions are not allowed. However, the combination
can be referenced an unlimited number of times in different `Sensor` objects,
so if you do have such a requirement, use 2 `Sensor` objects instead (see the
sketch after the example below).

```yaml
spec:
  dependencies:
    - name: dep01
      eventSourceName: webhook
      eventName: example
      filters:
        data:
          - path: body.value
            type: number
            comparator: "<"
            value:
              - "20.0"
    - name: dep02
      eventSourceName: webhook
      eventName: example
      filters:
        data:
          - path: body.value
            type: number
            comparator: ">"
            value:
              - "50.0"
```
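
A sketch of the two-`Sensor` workaround mentioned above, splitting the same
`eventSourceName`/`eventName` combo across two objects; the metadata names and
HTTP triggers are assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: webhook-sensor-below-20 # assumption
spec:
  dependencies:
    - name: dep01
      eventSourceName: webhook
      eventName: example
      filters:
        data:
          - path: body.value
            type: number
            comparator: "<"
            value:
              - "20.0"
  triggers:
    - template:
        name: http-trigger
        http:
          url: https://xxxxx.com/
          method: GET
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: webhook-sensor-above-50 # assumption
spec:
  dependencies:
    - name: dep02
      eventSourceName: webhook
      eventName: example
      filters:
        data:
          - path: body.value
            type: number
            comparator: ">"
            value:
              - "50.0"
  triggers:
    - template:
        name: http-trigger
        http:
          url: https://xxxxx.com/
          method: GET
```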

## Events Delivery Order

The following statements are based on using `NATS Streaming` as the EventBus.

In general, the order of events delivered to a `Sensor` is the order they were
published, but there's no guarantee of that. There could be cases where the
`Sensor` fails to acknowledge the first message, and then succeeds in
acknowledging the second one before the first one is redelivered.

## Events Delivery Guarantee

`NATS Streaming` offers an `at-least-once` delivery guarantee. In the `Sensor`
application, an in-memory cache is implemented to store the event IDs delivered
in the last 5 minutes; this is used to make sure there won't be any duplicate
events delivered.

Based on this, it can be considered `exactly-once` delivery.

## Trigger Retries

By default, there is no retry for trigger execution; this is because the
`Sensor` has no way to know whether retrying a failed execution would bring any
unexpected results.

If you prefer to have retries for the `trigger`, add a `retryStrategy` to the
spec.

```yaml
spec:
  triggers:
    - template:
        name: http-trigger
        http:
          url: https://xxxxx.com/
          method: GET
      retryStrategy:
        # Give up after this many times
        steps: 3
```

Or, if you want more control over the retries:

```yaml
spec:
  triggers:
    - retryStrategy:
        # Give up after this many times
        steps: 3
        # The initial duration, use strings like "2s", "1m"
        duration: 2s
        # Duration is multiplied by factor each retry, if factor is not zero
        # and the steps limit has not been reached.
        # Should not be negative.
        #
        # Defaults to "1.0"
        factor: 2.0
        # The sleep between each retry is the duration plus an additional
        # amount chosen uniformly at random from the interval between
        # zero and `jitter * duration`.
        #
        # Defaults to "1"
        jitter: 2
```
22 changes: 11 additions & 11 deletions docs/validating-admission-webhook.md
@@ -19,11 +19,11 @@

Using the validating webhook has the following benefits:

-1. It notifies the error at the time applying the faulty spec, so that you don't
-   need to check the CRD object `status` field to see if there's any condition
-   errors later on.
+- It notifies the error at the time applying the faulty spec, so that you don't
+  need to check the CRD object `status` field to see if there's any condition
+  errors later on.

-   e.g. Creating an `exotic` NATS EventBus without `ClusterID` specified:
+  e.g. Creating an `exotic` NATS EventBus without `ClusterID` specified:

```sh
cat <<EOF | kubectl create -f -
@@ -38,15 +38,15 @@
Error from server (BadRequest): error when creating "STDIN": admission webhook "webhook.argo-events.argoproj.io" denied the request: "spec.nats.exotic.clusterID" is missing
```

-2. Spec updating behavior can be validated.
+- Spec updating behavior can be validated.

-   Updating existing specs requires more validation, besides checking if the new
-   spec is valid, we also need to check if there's any immutable fields being
-   updated. This can not be done in the controller reconciliation, but we can do it
-   by using the validating webhook.
+  Updating existing specs requires more validation, besides checking if the new
+  spec is valid, we also need to check if there's any immutable fields being
+  updated. This can not be done in the controller reconciliation, but we can do
+  it by using the validating webhook.

-   For example, updating Auth Strategy for a native NATS EventBus is prohibited, a
-   denied response as following will be returned.
+  For example, updating Auth Strategy for a native NATS EventBus is prohibited,
+  a denied response as following will be returned.

```sh
Error from server (BadRequest): error when applying patch: