Commit ee2f139

admin: schedule: Updating for style and consistency
Signed-off-by: Lynette Miles <lynette.miles@chronosphere.io>
1 parent ccb54db commit ee2f139

1 file changed

+50
-36
lines changed

administration/scheduling-and-retries.md

Lines changed: 50 additions & 36 deletions
@@ -2,30 +2,39 @@
22

33
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=a70a6008-106f-43c8-8930-243806371482" />
44

5-
[Fluent Bit](https://fluentbit.io) has an Engine that helps to coordinate the data ingestion from input plugins and calls the _Scheduler_ to decide when it is time to flush the data through one or multiple output plugins. The Scheduler flushes new data at a fixed time of seconds and the _Scheduler_ retries when asked.
5+
[Fluent Bit](https://fluentbit.io) has an engine that helps to coordinate the data
6+
ingestion from input plugins. The engine calls the _scheduler_ to decide when it's time to
7+
flush the data through one or multiple output plugins. The scheduler flushes new data
8+
at a fixed interval of seconds and retries when asked.
69

7-
Once an output plugin gets called to flush some data, after processing that data it can notify the Engine three possible return statuses:
10+
When an output plugin gets called to flush some data, after processing that data it
11+
can notify the engine using these possible return statuses:
812

9-
* OK
10-
* Retry
11-
* Error
13+
- `OK`: Data successfully processed and flushed.
14+
- `Retry`: If a retry is requested, the engine asks the scheduler to retry flushing
15+
that data. The scheduler decides how many seconds to wait before retry.
16+
- `Error`: An unrecoverable error occurred and the engine shouldn't try to flush that data again.
1217

13-
If the return status was **OK**, it means it was successfully able to process and flush the data. If it returned an **Error** status, it means that an unrecoverable error happened and the engine should not try to flush that data again. If a **Retry** was requested, the _Engine_ will ask the _Scheduler_ to retry to flush that data, the Scheduler will decide how many seconds to wait before that happens.
18+
## Configure wait time for retry
1419

15-
## Configuring Wait Time for Retry
20+
The scheduler provides two configuration options called `scheduler.cap` and
21+
`scheduler.base`, which can be set in the Service section. These determine the waiting
22+
time before a retry happens.
1623

17-
The Scheduler provides two configuration options called **scheduler.cap** and **scheduler.base** which can be set in the Service section.
24+
| Key | Description | Default |
25+
| --- | ------------| --------------|
26+
| `scheduler.cap` | Set a maximum retry time in seconds. Supported in v1.8.7 or greater. | `2000` |
27+
| `scheduler.base` | Set a base of exponential backoff. Supported in v1.8.7 or greater. | `5` |
1828

19-
| Key | Description | Default Value |
20-
| -- | ------------| --------------|
21-
| scheduler.cap | Set a maximum retry time in seconds. The property is supported from v1.8.7. | 2000 |
22-
| scheduler.base | Set a base of exponential backoff. The property is supported from v1.8.7. | 5 |
29+
The `scheduler.base` determines the lower bound of time and the `scheduler.cap`
30+
determines the upper bound for each retry.
2331

24-
These two configuration options determine the waiting time before a retry will happen.
32+
Fluent Bit uses an exponential backoff and jitter algorithm to determine the waiting
33+
time before a retry. The waiting time is a random number between a configurable upper
34+
and lower bound. For a detailed explanation of the exponential backoff and jitter algorithm, see
35+
[Exponential Backoff And Jitter](https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/).
2536

26-
Fluent Bit uses an exponential backoff and jitter algorithm to determine the waiting time before a retry.
27-
28-
The waiting time is a random number between a configurable upper and lower bound.
37+
For example:
2938

3039
For the Nth retry, the lower bound of the random number will be:
3140

@@ -35,23 +44,26 @@ The upper bound will be:
3544

3645
`min(base * (Nth power of 2), cap)`
3746

38-
Given an example where `base` is set to 3 and `cap` is set to 30.
39-
40-
1st retry: The lower bound will be 3, the upper bound will be 3 * 2 = 6. So the waiting time will be a random number between (3, 6).
47+
For example:
4148

42-
2nd retry: the lower bound will be 3, the upper bound will be 3 * (2 * 2) = 12. So the waiting time will be a random number between (3, 12).
49+
When `base` is set to 3 and `cap` is set to 30:
4350

44-
3rd retry: the lower bound will be 3, the upper bound will be 3 * (2 * 2 * 2) = 24. So the waiting time will be a random number between (3, 24).
51+
First retry: The lower bound will be 3. The upper bound will be `3 * 2 = 6`.
52+
The waiting time will be a random number between (3, 6).
4553

46-
4th retry: the lower bound will be 3, since 3 * (2 * 2 * 2 * 2) = 48 > 30, the upper bound will be 30. So the waiting time will be a random number between (3, 30).
54+
Second retry: The lower bound will be 3. The upper bound will be `3 * (2 * 2) = 12`.
55+
The waiting time will be a random number between (3, 12).
4756

48-
Basically, the **scheduler.base** determines the lower bound of time between each retry and the **scheduler.cap** determines the upper bound.
57+
Third retry: The lower bound will be 3. The upper bound will be `3 * (2 * 2 * 2) = 24`.
58+
The waiting time will be a random number between (3, 24).
4959

50-
For a detailed explanation of the exponential backoff and jitter algorithm, please check this [blog](https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/).
60+
Fourth retry: The lower bound will be 3. Because `3 * (2 * 2 * 2 * 2) = 48` is greater than `30`,
61+
the upper bound will be 30. The waiting time will be a random number between (3, 30).
5162

52-
### Example
63+
### Wait time example
5364

54-
The following example configures the **scheduler.base** as 3 seconds and **scheduler.cap** as 30 seconds.
65+
The following example configures the `scheduler.base` as `3` seconds and
66+
`scheduler.cap` as `30` seconds.
5567

5668
```text
5769
[SERVICE]
@@ -64,26 +76,29 @@ The following example configures the **scheduler.base** as 3 seconds and **sched
6476

6577
The waiting time will be:
6678

67-
| Nth retry | waiting time range (seconds) |
68-
| --- | --- |
79+
| Nth retry | Waiting time range (seconds) |
80+
| --- | --- |
6981
| 1 | (3, 6) |
7082
| 2 | (3, 12) |
7183
| 3 | (3, 24) |
7284
| 4 | (3, 30) |
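
The ranges in this table can be reproduced with a short Python sketch. This is an illustration only, not Fluent Bit's implementation: the function names are hypothetical, the lower bound is assumed to equal `base` (as in the worked example), and drawing a uniform floating-point value is an assumption about how the jitter is sampled.

```python
import random


def retry_wait_bounds(n, base=5, cap=2000):
    """Return (lower, upper) wait bounds in seconds for the Nth retry (N starts at 1).

    Assumption: the lower bound is `base`; the upper bound is
    min(base * 2**N, cap), as described in the text.
    """
    lower = base
    upper = min(base * 2 ** n, cap)
    return lower, upper


def retry_wait_seconds(n, base=5, cap=2000):
    """Pick a random wait time within the bounds for the Nth retry.

    Assumption: a uniform draw stands in for the jitter step.
    """
    lower, upper = retry_wait_bounds(n, base, cap)
    return random.uniform(lower, upper)


if __name__ == "__main__":
    # Same settings as the example: scheduler.base 3, scheduler.cap 30.
    for n in range(1, 5):
        print(n, retry_wait_bounds(n, base=3, cap=30))
```

With `base=3` and `cap=30`, the bounds for retries 1 through 4 match the table rows.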
7385

74-
## Configuring Retries
86+
## Configure retries
7587

76-
The Scheduler provides a simple configuration option called **Retry\_Limit**, which can be set independently on each output section. This option allows us to disable retries or impose a limit to try N times and then discard the data after reaching that limit:
88+
The scheduler provides a configuration option called `Retry_Limit`, which can be set
89+
independently on each output section. This option lets you disable retries or
90+
impose a limit to try N times and then discard the data after reaching that limit:
7791

7892
| Key | Value | Description |
7993
| :--- | :--- | :--- |
80-
| Retry\_Limit | N | Integer value to set the maximum number of retries allowed. N must be &gt;= 1 \(default: 1\) |
81-
| Retry\_Limit | `no_limits` or `False` | When Retry\_Limit is set to `no_limits` or`False`, means that there is not limit for the number of retries that the Scheduler can do. |
82-
| Retry\_Limit | no\_retries | When Retry\_Limit is set to no\_retries, means that retries are disabled and Scheduler would not try to send data to the destination if it failed the first time. |
94+
| `Retry_Limit` | N | Integer value to set the maximum number of retries allowed. N must be &gt;= 1 (default: `1`) |
95+
| `Retry_Limit` | `no_limits` or `False` | When set, there's no limit to the number of retries that the scheduler can perform. |
96+
| `Retry_Limit` | `no_retries` | When set, retries are disabled and the scheduler doesn't try to send data to the destination again if the first attempt fails. |
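
The three kinds of `Retry_Limit` value can be modeled with a small Python sketch. It only illustrates the semantics in the table and isn't how Fluent Bit parses the option: the function names are hypothetical, and matching the values case-insensitively is an assumption.

```python
def max_retries(retry_limit="1"):
    """Map a Retry_Limit value to a retry budget.

    Returns None for unlimited retries, 0 when retries are disabled,
    or a positive integer N. Case-insensitive matching is an assumption.
    """
    value = str(retry_limit).strip().lower()
    if value in ("no_limits", "false"):
        return None  # no limit on the number of retries
    if value == "no_retries":
        return 0  # never retry after the first failed attempt
    n = int(value)
    if n < 1:
        raise ValueError("Retry_Limit must be >= 1")
    return n


def should_retry(retry_limit, retries_done):
    """Decide whether another retry is allowed after `retries_done` retries."""
    limit = max_retries(retry_limit)
    return limit is None or retries_done < limit
```

For example, `should_retry(5, 4)` allows a fifth retry, while `should_retry(5, 5)` indicates that the data would be discarded.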
8397

84-
### Example
98+
### Retry example
8599

86-
The following example configures two outputs where the HTTP plugin has an unlimited number of while the Elasticsearch plugin have a limit of 5 retries:
100+
The following example configures two outputs where the HTTP plugin has an unlimited
101+
number of retries, while the Elasticsearch plugin has a limit of `5` retries:
87102

88103
```text
89104
[OUTPUT]
@@ -99,4 +114,3 @@ The following example configures two outputs where the HTTP plugin has an unlimi
99114
Logstash_Format On
100115
Retry_Limit 5
101116
```
102-
