Commit 6daba46

filter_lua: Document record split and many minor improvements (#436)
Document that the Lua callback can return an array of records, i.e., the third return value of the callback function can be an array of tables. This feature enables the Lua filter to split an input record into multiple records. Document the record split feature as a new subsection and add an example of it. See also fluent/fluent-bit#811. While there, make various minor grammar and format fixes. Signed-off-by: Weitian LI <liweitianux@live.com>
1 parent 4619c45 commit 6daba46

1 file changed

+79
-22
lines changed

pipeline/filters/lua.md

Lines changed: 79 additions & 22 deletions
@@ -1,28 +1,28 @@
11
# Lua
22

3-
Lua Filter allows you to modify the incoming records using custom [Lua](https://www.lua.org/) Scripts.
3+
The **Lua** filter allows you to modify the incoming records (even split one record into multiple records) using custom [Lua](https://www.lua.org/) scripts.
44

5-
Due to the necessity to have a flexible filtering mechanism, now is possible to extend Fluent Bit capabilities writing simple filters using Lua programming language. A Lua based filter takes two steps:
5+
Due to the necessity to have a flexible filtering mechanism, it is now possible to extend Fluent Bit capabilities by writing custom filters using the Lua programming language. Setting up a Lua-based filter takes two steps:
66

7-
* Configure the Filter in the main configuration
8-
* Prepare a Lua script that will be used by the Filter
7+
1. Configure the Filter in the main configuration
8+
2. Prepare a Lua script that will be used by the Filter
99

1010
## Configuration Parameters <a id="config"></a>
1111

1212
The plugin supports the following configuration parameters:
1313

1414
| Key | Description |
1515
| :--- | :--- |
16-
| script | Path to the Lua script that will be used. |
17-
| call | Lua function name that will be triggered to do filtering. It's assumed that the function is declared inside the Script defined above. |
16+
| script | Path to the Lua script that will be used. This can be a relative path against the main configuration file. |
17+
| call | Lua function name that will be triggered to do filtering. It's assumed that the function is declared inside the **script** parameter defined above. |
1818
| type\_int\_key | If these keys are matched, the fields are converted to integer. If more than one key, delimit by space. Note that starting from Fluent Bit v1.6 integer data types are preserved and not converted to double as in previous versions. |
1919
| type\_array\_key | If these keys are matched, the fields are handled as arrays. If more than one key, delimit by space. This is useful when the array can be empty. |
20-
| protected\_mode | If enabled, Lua script will be executed in protected mode. It prevents to crash when invalid Lua script is executed. Default is true. |
21-
| time\_as\_table | By default when the Lua script is invoked, the record timestamp is passed as a Floating number which might lead to loss precision when the data is converted back. If you desire timestamp precision enabling this option will pass the timestamp as a Lua table with keys `sec` for seconds since epoch and `nsec` for nanoseconds. |
20+
| protected\_mode | If enabled, the Lua script will be executed in protected mode. This prevents Fluent Bit from crashing when an invalid Lua script is executed or the triggered Lua function throws an exception. Default is true. |
21+
| time\_as\_table | By default, when the Lua script is invoked, the record timestamp is passed as a *floating-point number*, which might lead to precision loss when it is converted back. If you need full timestamp precision, enabling this option will pass the timestamp as a Lua table with keys `sec` for seconds since epoch and `nsec` for nanoseconds. |
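
As a rough illustration of the `sec`/`nsec` table form described for `time_as_table`, the sketch below shifts a timestamp by whole seconds without going through a floating-point value. The function name `cb_shift_time`, the one-hour offset, and the assumption that the callback hands the timestamp back in the same table form are illustrative and not taken from the Fluent Bit sources.

```lua
-- Hypothetical callback; assumes time_as_table is enabled, so `timestamp`
-- arrives as a table: { sec = <seconds since epoch>, nsec = <nanoseconds> }.
function cb_shift_time(tag, timestamp, record)
    -- Build a new table instead of mutating the argument.
    local shifted = {
        sec = timestamp.sec + 3600,  -- illustrative one-hour offset
        nsec = timestamp.nsec,       -- nanoseconds are carried over untouched
    }
    -- Code 1: both the timestamp and the record are replaced.
    return 1, shifted, record
end
```

Keeping the seconds and nanoseconds as separate integers is what avoids the rounding that a single double would introduce.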
2222

2323
## Getting Started <a id="getting_started"></a>
2424

25-
In order to test the filter, you can run the plugin from the command line or through the configuration file. The following examples uses the [dummy](../inputs/dummy.md) input plugin for data ingestion, invoke Lua filter using the [test.lua](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua) script and calls the [cb\_print\(\)](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua#L29) function which only print the same information to the standard output:
25+
In order to test the filter, you can run the plugin from the command line or through the configuration file. The following examples use the [dummy](../inputs/dummy.md) input plugin for data ingestion, invoke the Lua filter using the [test.lua](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua) script and call the [cb\_print\(\)](https://github.com/fluent/fluent-bit/blob/master/scripts/test.lua#L29) function, which only prints the same information to the standard output:
2626

2727
### Command Line
2828

@@ -38,7 +38,7 @@ In your main configuration file append the following _Input_, _Filter_ & _Output
3838

3939
```python
4040
[INPUT]
41-
Name dummy
41+
Name dummy
4242

4343
[FILTER]
4444
Name lua
@@ -47,26 +47,27 @@ In your main configuration file append the following _Input_, _Filter_ & _Output
4747
call cb_print
4848

4949
[OUTPUT]
50-
Name null
51-
Match *
50+
Name null
51+
Match *
5252
```
5353

5454
## Lua Script Filter API <a id="lua_script"></a>
5555

5656
The life cycle of a filter has the following steps:
5757

58-
* Upon Tag matching by filter\_lua, it may process or bypass the record.
59-
* If filter\_lua accepts the record, it will invoke the function defined in the _call_ property which basically is the name of a function defined in the Lua _script_.
60-
* Invoke Lua function passing each record in JSON format.
61-
* Upon return, validate return value and take some action \(described above\)
58+
1. Upon Tag matching by this filter, it may process or bypass the record.
59+
2. If the tag matches, the filter accepts the record and invokes the function named in the `call` property, which is a function defined in the Lua `script`.
60+
3. Invoke Lua function and pass each record in JSON format.
61+
4. Upon return, validate return value and continue the pipeline.
6262

6363
## Callback Prototype
6464

65-
The Lua script can have one or multiple callbacks that can be used by filter\_lua, it prototype is as follows:
65+
The Lua script can have one or multiple callbacks that can be used by this filter. The function prototype is as follows:
6666

6767
```lua
6868
function cb_print(tag, timestamp, record)
69-
return code, timestamp, record
69+
...
70+
return code, timestamp, record
7071
end
7172
```
7273

@@ -75,7 +76,7 @@ end
7576
| name | description |
7677
| :--- | :--- |
7778
| tag | Name of the tag associated with the incoming record. |
78-
| timestamp | Unix timestamp with nanoseconds associated with the incoming record. The original format is a double \(seconds.nanoseconds\) |
79+
| timestamp | Unix timestamp with nanoseconds associated with the incoming record. The original format is a double (seconds.nanoseconds) |
7980
| record | Lua table with the record content |
8081

8182
#### Return Values
@@ -84,9 +85,9 @@ Each callback **must** return three values:
8485

8586
| name | data type | description |
8687
| :--- | :--- | :--- |
87-
| code | integer | The code return value represents the result and further action that may follows. If _code_ equals -1, means that filter\_lua must drop the record. If _code_ equals 0 the record will not be modified, otherwise if _code_ equals 1, means the original timestamp and record have been modified so it must be replaced by the returned values from _timestamp_ \(second return value\) and _record_ \(third return value\). If _code_ equals 2, means the original timestamp is not modified and the record has been modified so it must be replaced by the returned values from _record_ \(third return value\). The _code_ 2 is supported from v1.4.3. |
88+
| code | integer | The _code_ return value represents the result and the further action that may follow. If _code_ equals -1, the record will be dropped. If _code_ equals 0, the record will not be modified. If _code_ equals 1, the original timestamp and record have been modified, so they must be replaced by the returned values from _timestamp_ (second return value) and _record_ (third return value). If _code_ equals 2, the original timestamp is not modified and the record has been modified, so it must be replaced by the returned value from _record_ (third return value). The _code_ 2 is supported from v1.4.3. |
8889
| timestamp | double | If code equals 1, the original record timestamp will be replaced with this new value. |
89-
| record | table | if code equals 1, the original record information will be replaced with this new value. Note that the format of this value **must** be a valid Lua table. |
90+
| record | table | If code equals 1 or 2, the original record information will be replaced with this new value. Note that the _record_ value **must** be a valid Lua table. This value can be an array of tables (i.e., array of objects in JSON format), in which case the input record is effectively split into multiple records (see below for more details). |
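
The return codes in the table above can be sketched in a single callback. The function name `cb_route` and the record keys `level`, `message`, and `seen_by` are hypothetical and chosen only for illustration.

```lua
-- Hypothetical callback exercising return codes -1, 0, and 2.
function cb_route(tag, timestamp, record)
    -- Code -1: drop the record entirely.
    if record["level"] == "debug" then
        return -1, timestamp, record
    end
    -- Code 0: keep the record exactly as it arrived.
    if record["message"] == nil then
        return 0, timestamp, record
    end
    -- Code 2: keep the original timestamp, replace the record.
    record["seen_by"] = "lua"
    return 2, timestamp, record
end
```

All three values are returned on every path, since each callback must return three values regardless of the code.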
9091

9192
### Code Examples
9293

@@ -96,9 +97,65 @@ For functional examples of this interface, please refer to the code samples prov
9697

9798
### Number Type
9899

99-
In Lua, Fluent Bit treats number as double. It means an integer field \(e.g. IDs, log levels\) will be converted double. To avoid type conversion, **Type\_int\_key** property is available.
100+
Lua treats numbers as doubles. This means an integer field (e.g. IDs, log levels) will be converted to a double. To avoid this type conversion, the `type_int_key` property is available.
100101

101102
### Protected Mode
102103

103104
Fluent Bit supports protected mode to prevent a crash when an invalid Lua script is executed. See also [Error Handling in Application Code](https://www.lua.org/pil/24.3.1.html).
104105

106+
### Record Split
107+
108+
The Lua callback function can return an array of tables (i.e., array of records) in its third _record_ return value. With this feature, the Lua filter can split one input record into multiple records according to custom logic.
109+
110+
For example:
111+
112+
#### Lua script
113+
114+
```lua
115+
function cb_split(tag, timestamp, record)
116+
if record["x"] ~= nil then
117+
return 2, timestamp, record["x"]
118+
else
119+
return 2, timestamp, record
120+
end
121+
end
122+
```
123+
124+
#### Configuration
125+
126+
```python
127+
[Input]
128+
Name stdin
129+
130+
[Filter]
131+
Name lua
132+
Match *
133+
script test.lua
134+
call cb_split
135+
136+
[Output]
137+
Name stdout
138+
Match *
139+
```
140+
141+
#### Input
142+
143+
```
144+
{"x": [ {"a1":"aa", "z1":"zz"}, {"b1":"bb", "x1":"xx"}, {"c1":"cc"} ]}
145+
{"x": [ {"a2":"aa", "z2":"zz"}, {"b2":"bb", "x2":"xx"}, {"c2":"cc"} ]}
146+
{"a3":"aa", "z3":"zz", "b3":"bb", "x3":"xx", "c3":"cc"}
147+
```
148+
149+
#### Output
150+
151+
```
152+
[0] stdin.0: [1538435928.310583591, {"a1"=>"aa", "z1"=>"zz"}]
153+
[1] stdin.0: [1538435928.310583591, {"x1"=>"xx", "b1"=>"bb"}]
154+
[2] stdin.0: [1538435928.310583591, {"c1"=>"cc"}]
155+
[3] stdin.0: [1538435928.310588359, {"z2"=>"zz", "a2"=>"aa"}]
156+
[4] stdin.0: [1538435928.310588359, {"b2"=>"bb", "x2"=>"xx"}]
157+
[5] stdin.0: [1538435928.310588359, {"c2"=>"cc"}]
158+
[6] stdin.0: [1538435928.310589790, {"z3"=>"zz", "x3"=>"xx", "c3"=>"cc", "a3"=>"aa", "b3"=>"bb"}]
159+
```
160+
161+
See also [Fluent Bit: PR 811](https://github.com/fluent/fluent-bit/pull/811).
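
Splitting according to custom logic could also mean building the output records rather than returning a nested array as-is. The following is a minimal sketch under that assumption: the function name `cb_split_items` and the keys `items`, `item`, and `host` are hypothetical.

```lua
-- Hypothetical callback: emits one record per element of record["items"],
-- copying the shared "host" field into each output record.
function cb_split_items(tag, timestamp, record)
    local items = record["items"]
    if type(items) ~= "table" or #items == 0 then
        -- Nothing to split: leave the record unmodified (code 0).
        return 0, timestamp, record
    end
    local out = {}
    for i, item in ipairs(items) do
        out[i] = { host = record["host"], item = item }
    end
    -- Code 2: timestamp unchanged, record replaced by an array of tables.
    return 2, timestamp, out
end
```

Because code 2 is used, every record produced by the split shares the timestamp of the input record, matching the output shown in the example above.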
