Introduce datasources in package to configure inputs and streams (#212)
In elastic/beats#15940 datasources, inputs and streams are introduced into the agent config. To make it possible to configure these in the UI and through the API, some changes to the manifest definitions of a package and datasets are needed.

**Package manifest**

Each package must specify the datasources it supports with the supported inputs inside. So far all the packages only support one datasource but I want to keep the door open for this to potentially change in the future. It also makes it possible to have the manifest config of a datasource be identical to the config which ends up in the agent config.

The datasource definition in the package manifest looks as follows (nginx example):

```
datasources:
  -
    # Do we need a name for the data source?
    name: nginx

    # List of inputs this datasource supports
    inputs:
      -
        # An id can be given, in case the type used here is not unique
        # This is for selection in the stream
        # id: nginx
        type: metrics/nginx

        # Common configuration options for this input
        vars:
          - name: hosts
            description: Nginx hosts
            default:
              ["http://127.0.0.1"]
            # All the config options that are required should be shown in the UI
            required: true
          - name: period
            description: "Collection period. Valid values: 10s, 5m, 2h"
            default: "10s"
          - name: username
            type: text
          - name: password
            # This is the html input type?
            type: password

      -
        type: logs

        # Common configuration options for this input
        vars:

      -
        type: syslog

        # Common configuration options for this input
        vars:

```

Inside the datasource, each supported input is specified together with the variables common to all streams that use that input. In the UI, I expect that we show the `required` configs by default and put all the others under "Advanced" or similar.
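
The required/advanced split could be derived from the `vars` list roughly as follows. This is a minimal Python sketch and not part of this commit: `split_vars` is a hypothetical helper, the dict layout just mirrors the manifest above, and treating a missing `required` key as "optional" is an assumption.

```python
from typing import Any

# Variables of the nginx metrics input, mirroring the package manifest above.
NGINX_INPUT_VARS = [
    {"name": "hosts", "default": ["http://127.0.0.1"], "required": True},
    {"name": "period", "default": "10s"},
    {"name": "username", "type": "text"},
    {"name": "password", "type": "password"},
]


def split_vars(vars_: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Return (required, advanced) variable names, preserving manifest order.

    Assumption: a variable without a `required` key counts as optional.
    """
    required = [v["name"] for v in vars_ if v.get("required", False)]
    advanced = [v["name"] for v in vars_ if not v.get("required", False)]
    return required, advanced


required, advanced = split_vars(NGINX_INPUT_VARS)
print("shown by default:", required)
print("under Advanced:", advanced)
```

With the nginx example only `hosts` would be shown up front, and the remaining three options would land in the advanced section.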

**Dataset manifest**

With the datasources and inputs defined on the package level, each dataset can specify which inputs it supports. Most datasets will only support one input for now. For the nginx metrics this looks as follows:

```
inputs:
  - type: "metrics/nginx"

    # Only the variables have to be repeated that are not specified as part of the input
    vars:
      # All variables are specified in the input already
```

As an example of a dataset supporting multiple inputs, we have the nginx error logs:

```
inputs:
  - type: log
    vars:
      - name: paths
        required: true
        default:
          - /var/log/nginx/error.log*
        os.darwin:
          - /usr/local/var/log/nginx/error.log*
        os.windows:
          - c:/programdata/nginx/logs/error.log*

  - type: syslog
    vars:
      # Are udp and tcp syslog input two different inputs?
      - name: protocol.udp.host
        required: true
        default:
          - "localhost:9000"
```

Both the log and the syslog input are supported (not the case today, just an example). On the dataset level, all additional variables for this dataset are also specified. The ones already specified on the input level in the package don't have to be specified again.
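
To make the layering concrete, here is a small Python sketch of how the variables available to a stream might be assembled from the two levels. `effective_vars` is a hypothetical helper, not something this commit adds, and the rule that a dataset entry overrides a package entry of the same name is an assumption; the text above only says package-level variables need not be repeated.

```python
from typing import Any


def effective_vars(
    package_vars: list[dict[str, Any]],
    dataset_vars: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Combine input-level vars from the package with dataset-level vars.

    Assumption: if both levels define the same name, the dataset entry wins.
    """
    merged = {v["name"]: v for v in package_vars}
    merged.update({v["name"]: v for v in dataset_vars})
    return list(merged.values())


# nginx metrics: everything is defined on the package input, nothing on the dataset.
package_metrics_vars = [
    {"name": "hosts", "required": True},
    {"name": "period"},
    {"name": "username"},
    {"name": "password"},
]
metrics_names = [v["name"] for v in effective_vars(package_metrics_vars, [])]

# nginx error logs: the package input defines no vars, the dataset adds `paths`.
dataset_log_vars = [{"name": "paths", "required": True}]
log_names = [v["name"] for v in effective_vars([], dataset_log_vars)]

print(metrics_names)
print(log_names)
```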

**Stream definition**

Now that the dataset has its supported inputs and variables defined, the stream can be defined. The stream defines which input it uses from the dataset and its configuration variables. Here is an example for nginx metrics:

```
input: metrics/nginx
metricsets: ["stubstatus"]
period: {{period}}
enabled: true

hosts: {{hosts}}

{{#if username}}
username: "{{username}}"
{{/if}}
{{#if password}}
password: "{{password}}"
{{/if}}
```

When the stream config is created, the variables from the datasource inputs and the local variables from the dataset are filled in.
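
The fill-in step can be illustrated with a tiny Python stand-in for a template renderer. This is only a sketch under stated assumptions: a real implementation would use a proper Handlebars library, `render` is a hypothetical function that handles just `{{name}}` and `{{#if name}}` blocks, and writing list values in JSON syntax (which is valid YAML flow style) is my choice, not something the commit specifies.

```python
import json
import re
from typing import Any

STREAM_TEMPLATE = """\
input: metrics/nginx
metricsets: ["stubstatus"]
period: {{period}}
enabled: true

hosts: {{hosts}}

{{#if username}}
username: "{{username}}"
{{/if}}
{{#if password}}
password: "{{password}}"
{{/if}}
"""

IF_BLOCK = re.compile(r"\{\{#if (\w+)\}\}\n(.*?)\{\{/if\}\}\n", re.DOTALL)
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, values: dict[str, Any]) -> str:
    """Resolve {{#if x}} blocks first, then substitute {{x}} placeholders."""

    def keep_if_set(match: re.Match) -> str:
        # Drop the whole block when the variable is missing or empty.
        return match.group(2) if values.get(match.group(1)) else ""

    def fill(match: re.Match) -> str:
        value = values[match.group(1)]
        # Strings are inserted as-is; lists are written in JSON/YAML flow style.
        return value if isinstance(value, str) else json.dumps(value)

    return PLACEHOLDER.sub(fill, IF_BLOCK.sub(keep_if_set, template))


values = {"hosts": ["http://127.0.0.1"], "period": "10s", "username": "admin"}
rendered = render(STREAM_TEMPLATE, values)
print(rendered)
```

Because no `password` is supplied here, the corresponding block is dropped from the output entirely, while the `username` block is kept.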

A stream definition could also support multiple inputs as seen in the following example:

```

{{#if input == log}}
input: log

paths:
{{#each paths}}
  - "{{this}}"
{{/each}}
exclude_files: [".gz$"]

processors:
  - add_locale: ~
{{/if}}

{{#if input == syslog}}
input: syslog

{{/if}}
```
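
Stock Handlebars `#if` does not evaluate comparisons such as `input == log`, so the selection above would need either a custom helper or a pre-processing step that receives the selected input. The following Python sketch shows the pre-processing variant on a shortened template; `select_input` is a hypothetical function and the block syntax is taken literally from the example above.

```python
import re

# Shortened version of the multi-input stream definition above.
MULTI_INPUT_TEMPLATE = """\
{{#if input == log}}
input: log
exclude_files: [".gz$"]
{{/if}}

{{#if input == syslog}}
input: syslog
{{/if}}
"""

INPUT_BLOCK = re.compile(
    r"\{\{#if input == (\w+)\}\}\n(.*?)\{\{/if\}\}\n", re.DOTALL
)


def select_input(template: str, selected: str) -> str:
    """Keep the body of the block matching `selected` and drop the others."""

    def keep(match: re.Match) -> str:
        return match.group(2) if match.group(1) == selected else ""

    # Strip the blank lines left behind by removed blocks.
    return INPUT_BLOCK.sub(keep, template).strip() + "\n"


print(select_input(MULTI_INPUT_TEMPLATE, "syslog"))
```

Any remaining `{{#each}}` or `{{name}}` expressions inside the kept block would still be handed to the regular template engine afterwards.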

**Further changes**

* Rename `agent/input` to `agent/stream` as a stream is configured there.
ruflin committed Feb 18, 2020
1 parent 7ef0654 commit e965931
Showing 18 changed files with 157 additions and 64 deletions.

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
type: log

input: log
paths:
{{#each paths}}
paths: "{{this}}"
- "{{this}}"
{{/each}}
ingest_pipeline: {{pipeline}}

exclude_files: [".gz$"]

processors:
32 changes: 20 additions & 12 deletions dev/package-examples/nginx-1.2.0/dataset/access/manifest.yml
@@ -8,18 +8,26 @@ type: logs
# The ingest pipeline which should be used.
ingest_pipeline: default

vars:
- name: paths
# Should we define this as array? How will the UI best make sense of it?
type: textarea
default:
- /var/log/nginx/access.log*
# I suggest to use ECS fields for this config options here: https://github.com/elastic/ecs/blob/master/schemas/os.yml
# This would need to be based on a predefined definition on what can be filtered on
os.darwin:
- /usr/local/var/log/nginx/access.log*
os.windows:
- c:/programdata/nginx/logs/*access.log*
# List of supported inputs
inputs:
- type: log
vars:
- name: paths
required: true
# Should we define this as array? How will the UI best make sense of it?
description: Paths to the nginx access log file.
type: text
multi: true
default:
- /var/log/nginx/access.log*
# I suggest to use ECS fields for this config options here: https://github.com/elastic/ecs/blob/master/schemas/os.yml
# This would need to be based on a predefined definition on what can be filtered on
os.darwin:
default:
- /usr/local/var/log/nginx/access.log*
os.windows:
default:
- c:/programdata/nginx/logs/*access.log*

requirements:
elasticsearch.processors:
@@ -0,0 +1,23 @@

# The selected input has to be passed to the stream config when processed.

{{#if input == log}}
input: log

{{#each paths}}
paths: "{{this}}"
{{/each}}
exclude_files: [".gz$"]

processors:
- add_locale: ~
{{/if}}


# This is an example stream config on how multiple inputs could be supported

{{#if input == syslog}}
input: syslog

# TODO: would need some more config options
{{/if}}
34 changes: 26 additions & 8 deletions dev/package-examples/nginx-1.2.0/dataset/error/manifest.yml
@@ -1,11 +1,29 @@
title: Nginx Error Logs
type: logs
ingest_pipeline: pipeline
vars:
- name: paths
default:
- /var/log/nginx/error.log*
os.darwin:
- /usr/local/var/log/nginx/error.log*
os.windows:
- c:/programdata/nginx/logs/error.log*


# This is an example that multiple inputs are supported by one dataset
inputs:
- type: log
vars:
- name: paths
required: true
default:
- /var/log/nginx/error.log*

# TODO: The exact definition of os specific paths still needs to be defined
os.darwin:
- /usr/local/var/log/nginx/error.log*
os.windows:
- c:/programdata/nginx/logs/error.log*

- type: syslog
vars:
# Are udp and tcp syslog input two different inputs?
- name: protocol.udp.host
required: true
default:
- "localhost:9000"


@@ -1,4 +1,5 @@
type: metric/nginx
# Defines which input to use, should be stripped out when creating the config
input: metrics/nginx
metricsets: ["stubstatus"]
period: {{period}}
enabled: true
21 changes: 7 additions & 14 deletions dev/package-examples/nginx-1.2.0/dataset/stubstatus/manifest.yml
@@ -10,17 +10,10 @@ compatibility: linux, freebsd
# Each input can be in its own release status
release: beta

vars:
- name: hosts
description: Nginx hosts
default:
["http://127.0.0.1"]
required: true
- name: period
description: "Collection period. Valid values: 10s, 5m, 2h"
default: "10s"
- name: username
type: text
- name: password
# This is the html input type?
type: password
# List of inputs this dataset supports
inputs:
- type: "metric/nginx"

# Only the variables have to be repeated that are not specified as part of the input
vars:
# All variables are specified in the input already
47 changes: 47 additions & 0 deletions dev/package-examples/nginx-1.2.0/manifest.yml
@@ -12,3 +12,50 @@ requirement:
versions: ">7.1.0 <7.6.0"
elasticsearch:
versions: ">7.0.1"

datasources:
-
# Do we need a name for the data source?
name: nginx

# List of inputs this datasource supports
inputs:
-
# An id can be given, in case the type used here is not unique
# This is for selection in the stream
# id: nginx
type: nginx/metrics
descrition: Collecting metrics for nginx.

# Common configuration options for this input
vars:
- name: hosts
description: Nginx hosts
default:
["http://127.0.0.1"]
# All the config options that are required should be shown in the UI
required: true
multi: true
type: text
- name: period
description: "Collection period. Valid values: 10s, 5m, 2h"
default: "10s"
type: duration
- name: username
type: text
- name: password
# This is the html input type?
type: password

-
type: logs
description: Collect nginx logs.

# Common configuration options for this input
vars:

-
type: syslog

# Common configuration options for this input
vars:

This file was deleted.

@@ -0,0 +1,6 @@
input: metrics/system
enabled: false # default true
metricset: cpu
period: 10s
dataset: system.cpu
metrics: ["percentages", "normalized_percentages"]
3 changes: 2 additions & 1 deletion dev/package-examples/system-0.9.0/dataset/cpu/manifest.yml
@@ -5,4 +5,5 @@ release: beta
# Needs to describe the type of this input
type: metrics


inputs:
- type: metrics/system

This file was deleted.

@@ -0,0 +1,4 @@
inputy: metric/system
enabled: true
metricsets:
- load
3 changes: 2 additions & 1 deletion dev/package-examples/system-0.9.0/dataset/load/manifest.yml
@@ -5,4 +5,5 @@ release: beta
# Needs to describe the type of this input
type: metrics


inputs:
- type: system/metrics

This file was deleted.

@@ -0,0 +1,4 @@
input: metric/system
enabled: true
metricsets:
- memory
2 changes: 2 additions & 0 deletions dev/package-examples/system-0.9.0/dataset/memory/manifest.yml
@@ -5,4 +5,6 @@ release: beta
# Needs to describe the type of this input
type: metrics

inputs:
- type: system/metrics

7 changes: 7 additions & 0 deletions dev/package-examples/system-0.9.0/manifest.yml
@@ -10,3 +10,10 @@ release: ga
requirement:
kibana:
versions: "<8.0.0"


datasources:
- inputs:
type: metrics/system
- inputs:
type: log
