diff --git a/.gitbook.yaml b/.gitbook.yaml index e823767ac..17e9f0e60 100644 --- a/.gitbook.yaml +++ b/.gitbook.yaml @@ -85,3 +85,9 @@ redirects: # Other concepts/buffering: ./pipeline/buffering.md + stream-processing/changelog: ./stream-processing/overview.md + stream-processing/introduction: ./stream-processing/overview.md + stream-processing/get-started: ./stream-processing/overview.md + stream-processing/getting-started/fluent-bit-sql: ./stream-processing/fluent-bit-sql.md + stream-processing/getting-started/check-keys-null-values: ./stream-processing/check-keys-null-values.md + stream-processing/getting-started/hands-on: ./stream-processing/tutorial.md diff --git a/.gitbook/assets/stream_processor.png b/.gitbook/assets/stream_processor.png deleted file mode 100644 index 475e1c796..000000000 Binary files a/.gitbook/assets/stream_processor.png and /dev/null differ diff --git a/README.md b/README.md index b658e7cb4..5e10f0763 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ description: High Performance Telemetry Agent for Logs, Metrics and Traces - Wasm: [Wasm Filter Plugins](development/wasm-filter-plugins.md) or [Wasm Input Plugins](development/wasm-input-plugins.md) - Write [Filters in Lua](pipeline/filters/lua.md) or [Output plugins in Golang](development/golang-output-plugins.md) - [Monitoring](administration/monitoring.md): Expose internal metrics over HTTP in JSON and [Prometheus](https://prometheus.io/) format -- [Stream Processing](stream-processing/introduction.md): Perform data selection and transformation using basic SQL queries +- [Stream Processing](stream-processing/overview.md): Perform data selection and transformation using basic SQL queries - Create new streams of data using query results - Aggregation windows - Data analysis and prediction: Time series forecasting diff --git a/SUMMARY.md b/SUMMARY.md index 6c8dbf5c8..cc115411b 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -46,7 +46,7 @@ * [YAML 
configuration](administration/configuring-fluent-bit/yaml.md) * [Configuration file](administration/configuring-fluent-bit/yaml/configuration-file.md) * [Environment variables](administration/configuring-fluent-bit/yaml/environment-variables-section.md) - * [Includes](administration/configuring-fluent-bit/yaml/includes-section.md) + * [Includes](administration/configuring-fluent-bit/yaml/includes-section.md) * [Service](administration/configuring-fluent-bit/yaml/service-section.md) * [Parsers](administration/configuring-fluent-bit/yaml/parsers-section.md) * [Multiline parsers](administration/configuring-fluent-bit/yaml/multiline-parsers-section.md) @@ -54,15 +54,15 @@ * [Plugins](administration/configuring-fluent-bit/yaml/plugins-section.md) * [Upstream servers](administration/configuring-fluent-bit/yaml/upstream-servers-section.md) * [Classic mode](administration/configuring-fluent-bit/classic-mode.md) - * [Configuration file](administration/configuring-fluent-bit/classic-mode/configuration-file.md) + * [Configuration file](administration/configuring-fluent-bit/classic-mode/configuration-file.md) * [Commands](administration/configuring-fluent-bit/classic-mode/commands.md) * [Format and schema](administration/configuring-fluent-bit/classic-mode/format-schema.md) - * [Record accessor syntax](administration/configuring-fluent-bit/classic-mode/record-accessor.md) + * [Record accessor syntax](administration/configuring-fluent-bit/classic-mode/record-accessor.md) * [Upstream servers](administration/configuring-fluent-bit/classic-mode/upstream-servers.md) * [Variables](administration/configuring-fluent-bit/classic-mode/variables.md) * [Multiline parsing](administration/configuring-fluent-bit/multiline-parsing.md) * [Unit Sizes](administration/configuring-fluent-bit/unit-sizes.md) -* [AWS credentials](administration/aws-credentials.md) +* [AWS credentials](administration/aws-credentials.md) * [Backpressure](administration/backpressure.md) * [Buffering and 
storage](administration/buffering-and-storage.md) * [Hot reload](administration/hot-reload.md) @@ -73,7 +73,7 @@ * [Networking](administration/networking.md) * [Performance tips](administration/performance.md) * [Scheduling and retries](administration/scheduling-and-retries.md) -* [TLS](administration/transport-security.md) +* [TLS](administration/transport-security.md) * [Troubleshooting](administration/troubleshooting.md) ## Local testing @@ -225,13 +225,10 @@ ## Stream processing -* [Introduction to stream processing](stream-processing/introduction.md) * [Overview](stream-processing/overview.md) -* [Changelog](stream-processing/changelog.md) -* [Get started](stream-processing/get-started.md) - * [Fluent Bit and SQL](stream-processing/getting-started/fluent-bit-sql.md) - * [Check keys and null values](stream-processing/getting-started/check-keys-null-values.md) - * [Tutorial](stream-processing/getting-started/hands-on.md) +* [Fluent Bit and SQL](stream-processing/fluent-bit-sql.md) +* [Check keys and null values](stream-processing/check-keys-null-values.md) +* [Tutorial](stream-processing/tutorial.md) ## Fluent Bit for developers diff --git a/administration/configuring-fluent-bit/classic-mode/configuration-file.md b/administration/configuring-fluent-bit/classic-mode/configuration-file.md index bf1c12ebb..ddf6a9815 100644 --- a/administration/configuring-fluent-bit/classic-mode/configuration-file.md +++ b/administration/configuring-fluent-bit/classic-mode/configuration-file.md @@ -27,7 +27,7 @@ The `Service` section defines global properties of the service. The following ke | `log_level` | Set the logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are cumulative. If `debug` is set, it will include `error`, `warning`, `info`, and `debug`. Trace mode is only available if Fluent Bit was built with the _`WITH_TRACE`_ option enabled. | `info` | | `parsers_file` | Path for a `parsers` configuration file. 
Multiple `Parsers_File` entries can be defined within the section. | _none_ | | `plugins_file` | Path for a `plugins` configuration file. A `plugins` configuration file defines paths for external plugins. [See an example](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). | _none_ | -| `streams_file` | Path for the Stream Processor configuration file. [Learn more about Stream Processing configuration](../../../stream-processing/introduction.md). | _none_| +| `streams_file` | Path for the Stream Processor configuration file. [Learn more about Stream Processing configuration](../../../stream-processing/overview.md). | _none_| | `http_server` | Enable the built-in HTTP Server. | `Off` | | `http_listen` | Set listening interface for HTTP Server when it's enabled. | `0.0.0.0` | | `http_port` | Set TCP Port for the HTTP Server. | `2020` | diff --git a/administration/configuring-fluent-bit/yaml/configuration-file.md b/administration/configuring-fluent-bit/yaml/configuration-file.md index 90e475ce7..0f41ef1f1 100644 --- a/administration/configuring-fluent-bit/yaml/configuration-file.md +++ b/administration/configuring-fluent-bit/yaml/configuration-file.md @@ -68,7 +68,7 @@ The `service` section defines the global properties of the service. The Service | `log_level` | Set the logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are accumulative. For example, if `debug` is set, it will include `error`, `warning`, `info`, and `debug`. `trace` mode is only available if Fluent Bit was built with the `WITH_TRACE` option enabled. | `info` | | `parsers_file` | Path for a file that defines custom parsers. Only a single entry is supported. | _none_ | | `plugins_file` | Path for a `plugins` configuration file. A `plugins` configuration file allows the definition of paths for external plugins; for an example, [see here](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). 
| _none_ | -| `streams_file` | Path for the Stream Processor configuration file. Learn more about [Stream Processing configuration](../../../stream-processing/introduction.md). | _none_ | +| `streams_file` | Path for the Stream Processor configuration file. Learn more about [Stream Processing configuration](../../../stream-processing/overview.md). | _none_ | | `http_server` | Enable built-in HTTP server. | `Off` | | `http_listen` | Set listening interface for HTTP server when it's enabled. | `0.0.0.0` | | `http_port` | Set TCP Port for the HTTP server | `2020` | diff --git a/administration/configuring-fluent-bit/yaml/service-section.md b/administration/configuring-fluent-bit/yaml/service-section.md index 05049e093..367f6c4b8 100644 --- a/administration/configuring-fluent-bit/yaml/service-section.md +++ b/administration/configuring-fluent-bit/yaml/service-section.md @@ -12,7 +12,7 @@ The `service` section defines global properties of the service. The available co | `log_level` | Sets the logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are cumulative. If `debug` is set, it will include `error`, `warn`, `info`, and `debug`. Trace mode is only available if Fluent Bit was built with the _`WITH_TRACE`_ option enabled. | `info` | | `parsers_file` | Path for a `parsers` configuration file. Multiple `parsers_file` entries can be defined within the section. However, with the new YAML configuration schema, defining parsers using this key is now optional. Parsers can be declared directly in the `parsers` section of your YAML configuration, offering a more streamlined and integrated approach. | _none_ | | `plugins_file` | Path for a `plugins` configuration file. This file specifies the paths to external plugins (.so files) that Fluent Bit can load at runtime. With the new YAML schema, the `plugins_file` key is optional. 
External plugins can now be referenced directly within the `plugins` section, simplifying the plugin management process. [See an example](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). | _none_ | -| `streams_file` | Path for the Stream Processor configuration file. This file defines the rules and operations for stream processing within Fluent Bit. The `streams_file` key is optional, as Stream Processor configurations can be defined directly in the `streams` section of the YAML schema. This flexibility allows for easier and more centralized configuration. [Learn more about Stream Processing configuration](../../../stream-processing/introduction.md). | _none_ | +| `streams_file` | Path for the Stream Processor configuration file. This file defines the rules and operations for stream processing within Fluent Bit. The `streams_file` key is optional, as Stream Processor configurations can be defined directly in the `streams` section of the YAML schema. This flexibility allows for easier and more centralized configuration. [Learn more about Stream Processing configuration](../../../stream-processing/overview.md). | _none_ | | `http_server` | Enables the built-in HTTP Server. | `off` | | `http_listen` | Sets the listening interface for the HTTP Server when it's enabled. | `0.0.0.0` | | `http_port` | Sets the TCP port for the HTTP Server. 
| `2020` | diff --git a/installation/requirements.md b/installation/requirements.md index f71f5365d..e994729b8 100644 --- a/installation/requirements.md +++ b/installation/requirements.md @@ -6,7 +6,7 @@ The build process requires the following components: - Compiler: GCC or clang - CMake -- Flex and Bison: Required for [Stream Processor](https://docs.fluentbit.io/manual/stream-processing/introduction) or [Record Accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor) +- Flex and Bison: Required for [Stream Processor](https://docs.fluentbit.io/manual/stream-processing/overview) or [Record Accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor) - Libyaml development headers and libraries Core has no other dependencies. Some features depend on third-party components. For example, output plugins with special backend libraries like Kafka include those libraries in the main source code repository. diff --git a/pipeline/outputs/kinesis.md b/pipeline/outputs/kinesis.md index 047ba89f6..dd6617ce3 100644 --- a/pipeline/outputs/kinesis.md +++ b/pipeline/outputs/kinesis.md @@ -53,7 +53,7 @@ In your main configuration file append the following: pipeline: outputs: - - name: kinesis_steams + - name: kinesis_streams match: '*' region: us-east-1 stream: my-stream diff --git a/stream-processing/changelog.md b/stream-processing/changelog.md deleted file mode 100644 index f51dca6f1..000000000 --- a/stream-processing/changelog.md +++ /dev/null @@ -1,54 +0,0 @@ -# Changelog - -This page details new additions to the stream processor engine in major release versions of Fluent Bit. - -## Fluent Bit v1.2 - -> Release date: June 27, 2019 - -### Sub-key selection and conditionals support - -Added the ability to use nested maps and sub-keys to perform conditions and key selections. 
For example, consider the following record: - -```javascript -{ - "key1": 123, - "key2": 456, - "key3": { - "sub1": { - "sub2": 789 - } - } -} -``` - -Now you can perform queries like: - -```sql -SELECT key3['sub1']['sub2'] FROM STREAM:test WHERE key3['sub1']['sub2'] = 789; -``` - -### New @record functions - -For conditionals, added the new _@record_ functions: - -| Function | Description | -| :--- | :--- | -| `@record.time()` | Returns the record timestamp. | -| `@record.contains(key)` | Returns `true` or false if `key` exists in the record, or `false` if not. | - -### `IS NULL` and `IS NOT NULL` - -Added `IS NULL` and `IS NOT NULL` statements to determine whether an existing key in a record has a null value. For example: - -```sql -SELECT * FROM STREAM:test WHERE key3['sub1'] IS NOT NULL; -``` - -For more details, see [Check keys and null values](../stream-processing/getting-started/check-keys-null-values.md). - -## Fluent Bit v1.1 - -> Release date: 2019-05-09 - -Added the stream processor to Fluent Bit. diff --git a/stream-processing/getting-started/check-keys-null-values.md b/stream-processing/check-keys-null-values.md similarity index 100% rename from stream-processing/getting-started/check-keys-null-values.md rename to stream-processing/check-keys-null-values.md diff --git a/stream-processing/getting-started/fluent-bit-sql.md b/stream-processing/fluent-bit-sql.md similarity index 100% rename from stream-processing/getting-started/fluent-bit-sql.md rename to stream-processing/fluent-bit-sql.md diff --git a/stream-processing/get-started.md b/stream-processing/get-started.md deleted file mode 100644 index 86e9b2346..000000000 --- a/stream-processing/get-started.md +++ /dev/null @@ -1,9 +0,0 @@ -# Get started - -| Concept | Description | -| :--- | :--- | -| Stream | A stream is a single flow of data being ingested by an input plugin. By default, each stream name is the name of its input plugin plus a number (for example, `tail.0`). 
You can use the `alias` property to change this name. | -| Task | A single execution unit. For example, a SQL query. | -| Results | After a stream processor runs a SQL query, results are generated. You can re-ingest these results back into the main Fluent Bit pipeline or redirect them to the standard output interface for debugging purposes. | -| Tag | Fluent Bit groups records and assigns tags to them. These tags define routing rules and can be used to apply stream processors to specific tags that match a pattern. | -| Match | Matching rules can use a wildcard to match specific records associated with a tag. | diff --git a/stream-processing/introduction.md b/stream-processing/introduction.md deleted file mode 100644 index 1fbea8754..000000000 --- a/stream-processing/introduction.md +++ /dev/null @@ -1,7 +0,0 @@ -# Introduction to stream processing - -![Fluent Bit stream processing](../.gitbook/assets/stream_processor.png) - -Fluent Bit is a fast and flexible log processor that collects, parsers, filters, and delivers logs to remote databases, where data analysis can then be performed. - -For real-time and complex analysis needs, you can also process the data while it's still in motion through _stream processing on the edge_. diff --git a/stream-processing/overview.md b/stream-processing/overview.md index 7d0cee09e..511af1d21 100644 --- a/stream-processing/overview.md +++ b/stream-processing/overview.md @@ -1,12 +1,12 @@ # Overview -Stream processing is a feature that lets you query continuous data streams while they're still in motion. Fluent Bit uses a streaming SQL engine for this process. +Stream processing is a feature that lets you query continuous data streams while they're still in motion within the log processor. Fluent Bit uses a streaming SQL engine for this process. To understand how stream processing works in Fluent Bit, follow this overview of Fluent Bit architecture and how data travels through the pipeline. 
## Fluent Bit data pipeline -[Fluent Bit](https://fluentbit.io) collects and process logs (also known as _records_) from different input sources, then parses and filters these records before they're stored. After data is processed and in a safe state, meaning either in memory or in the file system, the records are routed through the proper output destinations. +Fluent Bit collects and processes logs (also known as _records_) from different input sources, then parses and filters these records before they're stored. After data is processed and in a safe state, meaning either in memory or in the file system, the records are routed through the proper output destinations. Most of the phases in the pipeline are implemented through plugins: input, filter, and output. @@ -25,3 +25,13 @@ Every input instance is considered a stream. These streams collect data and inge By configuring specific SQL queries, you can perform specific tasks like key selections, filtering, and data aggregation. Keep in mind that there is no database; everything is schema-less and happens in memory. Concepts like tables that are common in relational database don't exist in Fluent Bit. One powerful feature of the Fluent Bit stream processor is the ability to create new streams of data using the results from a previous SQL query. These results are re-ingested back into the pipeline to be consumed again for the stream processor, if desired, or routed to output destinations by any common record using tag/matching rules. (Stream processor results can be tagged.) + +## Concepts + +| Concept | Description | +| :--- | :--- | +| Stream | A stream is a single flow of data being ingested by an input plugin. By default, each stream name is the name of its input plugin plus a number (for example, `tail.0`). You can use the `alias` property to change this name. | +| Task | A single execution unit. For example, a SQL query. | +| Results | After a stream processor runs a SQL query, results are generated. 
You can re-ingest these results back into the main Fluent Bit pipeline or redirect them to the standard output interface for debugging purposes. | +| Tag | Fluent Bit groups records and assigns tags to them. These tags define routing rules and can be used to apply stream processors to specific tags that match a pattern. | +| Match | Matching rules can use a wildcard to match specific records associated with a tag. | diff --git a/stream-processing/stream-processing.md b/stream-processing/stream-processing.md deleted file mode 100644 index bd06b3687..000000000 --- a/stream-processing/stream-processing.md +++ /dev/null @@ -1,7 +0,0 @@ -# Introduction - -![Fluent Bit stream processing](../.gitbook/assets/stream_processor.png) - -[Fluent Bit](https://fluentbit.io) is a fast and flexible log processor that aims to collect, parse, filter, and deliver logs to remote databases so data analysis can be performed. - -Data analysis usually happens after the data is stored and indexed in a database. However, for real-time and complex analysis needs, processing the data while it's still in motion in the log processor brings a lot of advantages. This approach is called **Stream Processing on the Edge**. diff --git a/stream-processing/getting-started/hands-on.md b/stream-processing/tutorial.md similarity index 100% rename from stream-processing/getting-started/hands-on.md rename to stream-processing/tutorial.md
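For reviewers checking the pages consolidated above: the `streams_file` key documented in the touched Service-section pages points at a Stream Processor file like the sketch below. It combines the `CREATE STREAM`, record-accessor, and `IS NOT NULL` constructs covered by the relocated `fluent-bit-sql.md` and `check-keys-null-values.md` pages; the task name, tag, and stream name (`tail.0`) are illustrative, not taken from the docs.

```text
[STREAM_TASK]
    Name   drop_null_sub1
    Exec   CREATE STREAM results WITH (tag='sp.results') AS SELECT * FROM STREAM:tail.0 WHERE key3['sub1'] IS NOT NULL;
```

Referenced from the classic-mode `Service` section as `Streams_File streams.conf`, the resulting `sp.results` records re-enter the pipeline and can be routed with ordinary tag/match rules, as described in the new Concepts table.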