Introduce datasources in package to configure inputs and streams
elastic/beats#15940 introduces datasources, inputs, and streams into the agent config. To make these configurable in the UI and through the API, the manifest definitions of packages and datasets need some changes.

**Package manifest**

Each package must specify the datasources it supports, and each datasource lists its supported inputs. So far every package supports only one datasource, but I want to keep the door open for this to change in the future. It also makes it possible for the datasource config in the manifest to be identical to the config which ends up in the agent config.

The datasource definition in the package manifest looks as follows (nginx example):

```
datasources:
  -
    # Do we need a name for the data source?
    name: nginx

    # List of inputs this datasource supports
    inputs:
      -
        # An id can be given, in case the type used here is not unique
        # This is for selection in the stream
        # id: nginx
        type: metrics/nginx

        # Common configuration options for this input
        vars:
          - name: hosts
            description: Nginx hosts
            default:
              ["http://127.0.0.1"]
            # All the config options that are required should be shown in the UI
            required: true
          - name: period
            description: "Collection period. Valid values: 10s, 5m, 2h"
            default: "10s"
          - name: username
            type: text
          - name: password
            # This is the html input type?
            type: password

      -
        type: logs

        # Common configuration options for this input
        vars:

      -
        type: syslog

        # Common configuration options for this input
        vars:

```

Inside the datasource, each supported input is specified together with the variables that are common to all streams using that input. In the UI, I expect that we show the `required` configs by default and put all the others under "Advanced" or similar.
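
To illustrate the UI split described here, a minimal Python sketch follows. It is not part of this commit: the function name `split_for_ui` is hypothetical, and it assumes that a variable without a `required` key counts as optional.

```python
from __future__ import annotations

from typing import Any

# Variable definitions copied from the metrics/nginx input in the manifest above.
NGINX_VARS: list[dict[str, Any]] = [
    {"name": "hosts", "default": ["http://127.0.0.1"], "required": True},
    {"name": "period", "default": "10s"},
    {"name": "username", "type": "text"},
    {"name": "password", "type": "password"},
]


def split_for_ui(variables: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Return (shown_by_default, advanced) variable names.

    Assumption: a missing `required` key means the variable is optional.
    """
    shown = [v["name"] for v in variables if v.get("required", False)]
    advanced = [v["name"] for v in variables if not v.get("required", False)]
    return shown, advanced


shown, advanced = split_for_ui(NGINX_VARS)
print(shown)     # ['hosts']
print(advanced)  # ['period', 'username', 'password']
```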

**Dataset manifest**

With the datasources and inputs defined on the package level, each dataset can specify which inputs it supports. Most datasets will only support one input for now. For the nginx metrics this looks as follows:

```
inputs:
  - type: "metrics/nginx"

    # Only variables that are not already specified as part of the input have to be listed here
    vars:
      # All variables are specified in the input already
```
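
One way to read "only the missing variables are repeated" is as a merge of the package-level input variables with the dataset-level ones. The Python sketch below is an illustration under assumptions, not the implementation: `merge_vars` is a hypothetical name, variables are assumed to be keyed by `name`, and a dataset-level definition is assumed to replace a package-level one with the same name (this commit does not say how conflicts are resolved). Note that a `vars:` key containing only comments parses as null in YAML, which is why `None` is accepted.

```python
from __future__ import annotations

from typing import Any


def merge_vars(
    input_vars: list[dict[str, Any]] | None,
    dataset_vars: list[dict[str, Any]] | None,
) -> list[dict[str, Any]]:
    """Combine package-level input vars with dataset-level vars.

    Assumption: same-named dataset vars replace the package-level definition.
    """
    merged = {var["name"]: var for var in input_vars or []}
    for var in dataset_vars or []:
        merged[var["name"]] = var
    return list(merged.values())


# Two of the variables from the metrics/nginx input in the package manifest.
PACKAGE_INPUT_VARS = [
    {"name": "hosts", "default": ["http://127.0.0.1"], "required": True},
    {"name": "period", "default": "10s"},
]

# The stubstatus dataset lists no vars of its own, so `vars` is None.
effective = merge_vars(PACKAGE_INPUT_VARS, None)
print([var["name"] for var in effective])  # ['hosts', 'period']

# Hypothetical dataset-level override of `period`, for illustration only.
overridden = merge_vars(PACKAGE_INPUT_VARS, [{"name": "period", "default": "1m"}])
print(overridden[1]["default"])  # 1m
```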

As an example of a dataset supporting multiple inputs, here are the nginx error logs:

```
inputs:
  - type: log
    vars:
      - name: paths
        required: true
        default:
          - /var/log/nginx/error.log*
        os.darwin:
          - /usr/local/var/log/nginx/error.log*
        os.windows:
          - c:/programdata/nginx/logs/error.log*

  - type: syslog
    vars:
      # Are udp and tcp syslog input two different inputs?
      - name: protocol.udp.host
        required: true
        default:
          - "localhost:9000"
```

Both the log and the syslog input are supported (not the case today, just an example). On the dataset level, all additional variables for this dataset are also specified. The ones already specified on the input level in the package don't have to be specified again.
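
The `os.darwin` and `os.windows` keys on the `paths` variable suggest per-OS defaults. A small Python sketch of how such a default could be resolved, assuming overrides live under keys named `os.<name>` next to `default` and that the generic `default` is the fallback (`resolve_default` is a hypothetical helper, not something this commit adds):

```python
from __future__ import annotations

from typing import Any

# The `paths` variable of the log input, as it would look after YAML parsing.
PATHS_VAR: dict[str, Any] = {
    "name": "paths",
    "required": True,
    "default": ["/var/log/nginx/error.log*"],
    "os.darwin": ["/usr/local/var/log/nginx/error.log*"],
    "os.windows": ["c:/programdata/nginx/logs/error.log*"],
}


def resolve_default(var: dict[str, Any], os_name: str) -> Any:
    """Return the OS-specific default if present, else the generic default."""
    return var.get(f"os.{os_name}", var.get("default"))


print(resolve_default(PATHS_VAR, "darwin"))  # ['/usr/local/var/log/nginx/error.log*']
print(resolve_default(PATHS_VAR, "linux"))   # ['/var/log/nginx/error.log*']
```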

**Stream definition**

Now that the dataset has its supported inputs and variables defined, the stream can be defined. The stream defines which input it uses from the dataset and its configuration variables. Here is an example for nginx metrics:

```
input: metrics/nginx
metricsets: ["stubstatus"]
period: {{period}}
enabled: true

hosts: {{hosts}}

{{#if username}}
username: "{{username}}"
{{/if}}
{{#if password}}
password: "{{password}}"
{{/if}}
```

When the stream config is created, the variables from the datasource inputs and the local variables from the dataset are filled in.
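
To make the fill-in step concrete, here is a deliberately tiny Python stand-in for the template engine. It is a sketch under assumptions: it only handles `{{name}}` and non-nested `{{#if name}}...{{/if}}` blocks, it is not the real template engine, and `render` is a hypothetical name. List-valued variables such as `hosts` are left out because this commit does not specify how they are serialized.

```python
from __future__ import annotations

import re
from typing import Any

# Matches a non-nested {{#if name}} ... {{/if}} block, including trailing newlines.
IF_BLOCK = re.compile(r"\{\{#if (\w+)\}\}\n?(.*?)\{\{/if\}\}\n?", re.DOTALL)
# Matches a plain {{name}} placeholder.
VARIABLE = re.compile(r"\{\{(\w+)\}\}")

# Shortened version of the nginx metrics stream definition above.
TEMPLATE = """input: metrics/nginx
period: {{period}}
{{#if username}}
username: "{{username}}"
{{/if}}
"""


def render(template: str, values: dict[str, Any]) -> str:
    """Fill a stream template; raises KeyError for an unknown placeholder."""

    def keep_if_set(match: re.Match) -> str:
        name, body = match.group(1), match.group(2)
        return body if values.get(name) else ""

    without_blocks = IF_BLOCK.sub(keep_if_set, template)
    return VARIABLE.sub(lambda m: str(values[m.group(1)]), without_blocks)


print(render(TEMPLATE, {"period": "10s"}))
print(render(TEMPLATE, {"period": "10s", "username": "admin"}))
```

With only `period` set, the `username` block is dropped entirely. With `username` set, the line `username: "admin"` is kept.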

A stream definition could also support multiple inputs as seen in the following example:

```

{{#if input == log}}
input: log

paths:
{{#each paths}}
  - "{{this}}"
{{/each}}
exclude_files: [".gz$"]

processors:
  - add_locale: ~
{{/if}}

{{#if input == syslog}}
input: syslog

{{/if}}
```
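
Since a stream picks one input from the dataset, and the dataset inputs refer back to the inputs declared in the package datasource, it may be useful to check that the type names line up. The Python sketch below assumes exact string matching on `type`, which this commit does not confirm; `undeclared_inputs` is a hypothetical helper.

```python
from __future__ import annotations

# Input types declared by the nginx datasource in the package manifest above.
PACKAGE_INPUTS = ["metrics/nginx", "logs", "syslog"]


def undeclared_inputs(package_types: list[str], dataset_types: list[str]) -> list[str]:
    """Return the dataset input types the package does not declare.

    Assumption: types are compared as exact strings.
    """
    declared = set(package_types)
    return [t for t in dataset_types if t not in declared]


# Input types used by the nginx error logs dataset example.
print(undeclared_inputs(PACKAGE_INPUTS, ["log", "syslog"]))  # ['log']
print(undeclared_inputs(PACKAGE_INPUTS, ["metrics/nginx"]))  # []
```

Under an exact-match rule, `log` in the dataset would not match `logs` in the package manifest, so either the real matching rule differs or the names need aligning.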

**Further changes**

* Rename `agent/input` to `agent/stream` as a stream is configured there.
ruflin committed Feb 11, 2020
1 parent b9cd78b commit e1e166b
Showing 18 changed files with 145 additions and 64 deletions.

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
type: log

input: log
paths:
{{#each paths}}
paths: "{{this}}"
- "{{this}}"
{{/each}}
ingest_pipeline: {{pipeline}}

exclude_files: [".gz$"]

processors:
28 changes: 16 additions & 12 deletions dev/package-examples/nginx-1.2.0/dataset/access/manifest.yml
@@ -8,18 +8,22 @@ type: logs
# The ingest pipeline which should be used.
ingest_pipeline: default

vars:
- name: paths
# Should we define this as array? How will the UI best make sense of it?
type: textarea
default:
- /var/log/nginx/access.log*
# I suggest to use ECS fields for this config options here: https://github.com/elastic/ecs/blob/master/schemas/os.yml
# This would need to be based on a predefined definition on what can be filtered on
os.darwin:
- /usr/local/var/log/nginx/access.log*
os.windows:
- c:/programdata/nginx/logs/*access.log*
# List of supported inputs
inputs:
- type: log
vars:
- name: paths
required: true
# Should we define this as array? How will the UI best make sense of it?
type: textarea
default:
- /var/log/nginx/access.log*
# I suggest to use ECS fields for this config options here: https://github.com/elastic/ecs/blob/master/schemas/os.yml
# This would need to be based on a predefined definition on what can be filtered on
os.darwin:
- /usr/local/var/log/nginx/access.log*
os.windows:
- c:/programdata/nginx/logs/*access.log*

requirements:
elasticsearch.processors:
@@ -0,0 +1,23 @@

# The selected input has to be passed to the stream config when processed.

{{#if input == log}}
input: log

{{#each paths}}
paths: "{{this}}"
{{/each}}
exclude_files: [".gz$"]

processors:
- add_locale: ~
{{/if}}


# This is an example stream config on how multiple inputs could be supported

{{#if input == syslog}}
input: syslog

# TODO: would need some more config options
{{/if}}
32 changes: 24 additions & 8 deletions dev/package-examples/nginx-1.2.0/dataset/error/manifest.yml
@@ -1,11 +1,27 @@
title: Nginx Error Logs
type: logs
ingest_pipeline: pipeline
vars:
- name: paths
default:
- /var/log/nginx/error.log*
os.darwin:
- /usr/local/var/log/nginx/error.log*
os.windows:
- c:/programdata/nginx/logs/error.log*


# This is an example that multiple inputs are supported by one dataset
inputs:
- type: log
vars:
- name: paths
required: true
default:
- /var/log/nginx/error.log*
os.darwin:
- /usr/local/var/log/nginx/error.log*
os.windows:
- c:/programdata/nginx/logs/error.log*

- type: syslog
vars:
# Are udp and tcp syslog input two different inputs?
- name: protocol.udp.host
required: true
default:
- "localhost:9000"


@@ -1,4 +1,4 @@
type: metric/nginx
input: metrics/nginx
metricsets: ["stubstatus"]
period: {{period}}
enabled: true
21 changes: 7 additions & 14 deletions dev/package-examples/nginx-1.2.0/dataset/stubstatus/manifest.yml
@@ -10,17 +10,10 @@ compatibility: linux, freebsd
# Each input can be in its own release status
release: beta

vars:
- name: hosts
description: Nginx hosts
default:
["http://127.0.0.1"]
required: true
- name: period
description: "Collection period. Valid values: 10s, 5m, 2h"
default: "10s"
- name: username
type: text
- name: password
# This is the html input type?
type: password
# List of inputs this dataset supports
inputs:
- type: "metric/nginx"

# Only the variables have to be repeated that are not specified as part of the input
vars:
# All variables are specified in the input already
42 changes: 42 additions & 0 deletions dev/package-examples/nginx-1.2.0/manifest.yml
@@ -12,3 +12,45 @@ requirement:
versions: ">7.1.0 <7.6.0"
elasticsearch:
versions: ">7.0.1"

datasources:
-
# Do we need a name for the data source?
name: nginx

# List of inputs this datasource supports
inputs:
-
# An id can be given, in case the type used here is not unique
# This is for selection in the stream
# id: nginx
type: metrics/nginx

# Common configuration options for this input
vars:
- name: hosts
description: Nginx hosts
default:
["http://127.0.0.1"]
# All the config options that are required should be shown in the UI
required: true
- name: period
description: "Collection period. Valid values: 10s, 5m, 2h"
default: "10s"
- name: username
type: text
- name: password
# This is the html input type?
type: password

-
type: logs

# Common configuration options for this input
vars:

-
type: syslog

# Common configuration options for this input
vars:

This file was deleted.

@@ -0,0 +1,6 @@
input: metrics/system
enabled: false # default true
metricset: cpu
period: 10s
dataset: system.cpu
metrics: ["percentages", "normalized_percentages"]
3 changes: 2 additions & 1 deletion dev/package-examples/system-0.9.0/dataset/cpu/manifest.yml
@@ -5,4 +5,5 @@ release: beta
# Needs to describe the type of this input
type: metrics


inputs:
- type: metrics/system

This file was deleted.

@@ -0,0 +1,4 @@
inputy: metric/system
enabled: true
metricsets:
- load
3 changes: 2 additions & 1 deletion dev/package-examples/system-0.9.0/dataset/load/manifest.yml
@@ -5,4 +5,5 @@ release: beta
# Needs to describe the type of this input
type: metrics


inputs:
- type: system/metrics

This file was deleted.

@@ -0,0 +1,4 @@
input: metric/system
enabled: true
metricsets:
- memory
2 changes: 2 additions & 0 deletions dev/package-examples/system-0.9.0/dataset/memory/manifest.yml
@@ -5,4 +5,6 @@ release: beta
# Needs to describe the type of this input
type: metrics

inputs:
- type: system/metrics

7 changes: 7 additions & 0 deletions dev/package-examples/system-0.9.0/manifest.yml
@@ -10,3 +10,10 @@ release: ga
requirement:
kibana:
versions: "<8.0.0"


datasources:
- inputs:
type: metrics/system
- inputs:
type: log
