
[Elastic Agent] Support variable substitution from providers #20781

Closed
blakerouse opened this issue Aug 25, 2020 · 22 comments
Labels: Ingest Management:beta2

blakerouse (Contributor) commented Aug 25, 2020

Overview

To enable dynamic inputs #19225, Elastic Agent needs to support variable substitution. The key/values for the variable substitution come from sources known as providers. This issue is to add support for both the required variables and the providers, the first part of handling dynamic inputs.

Providers

Providers supply the key/values that can be used in variable substitution. Each provider's keys are automatically prefixed with the name of the provider in the context of the Elastic Agent. This removes the need to worry about conflicts, overwriting, and other undetermined behavior.

For example, if a provider named foo provides {"key1": "value1", "key2": "value2"}, it is placed in {"foo": {"key1": "value1", "key2": "value2"}}, allowing it to be referenced as ${foo.key1} and ${foo.key2}.
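
As a minimal illustration (a hypothetical helper, not the actual Agent code), the prefixing step boils down to nesting the provider's mapping under its own name:

package main

import "fmt"

// prefixProvider nests a provider's key/values under the provider name,
// so "key1" from provider "foo" becomes addressable as foo.key1.
func prefixProvider(name string, vars map[string]interface{}) map[string]interface{} {
    return map[string]interface{}{name: vars}
}

func main() {
    foo := map[string]interface{}{"key1": "value1", "key2": "value2"}
    fmt.Println(prefixProvider("foo", foo))
    // Output: map[foo:map[key1:value1 key2:value2]]
}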

After discussions it is clear that two different types of providers need to be supported.

Context Providers

Context providers provide the current context of the running Elastic Agent. Examples are agent information (ID, version, etc.), host information (hostname, IP addresses), and environment information (environment variables).

They can only provide a single key/value mapping. Think of them as singletons: an update to the key/value mapping results in a re-evaluation of the entire configuration. These providers are normally very static, but they are not required to behave that way; a value can change, which results in re-evaluation.

ECS naming should be used for context providers when possible, so that documentation and field usage stay clear and consistent across Elastic projects.
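
As a rough sketch of the contract (illustrative names only, not the final Agent API), a context provider exposes exactly one mapping and a way to signal that it changed:

package provider

// ContextProvider is an illustrative contract for a context provider: it
// exposes a single key/value mapping (a singleton) and notifies the Agent
// whenever that mapping changes so the entire configuration can be
// re-evaluated.
type ContextProvider interface {
    // Name is the prefix under which the mapping is exposed, e.g. "host".
    Name() string
    // Current returns the provider's single key/value mapping.
    Current() map[string]interface{}
    // Updates delivers a new mapping whenever the underlying values change.
    Updates() <-chan map[string]interface{}
}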

Initial Set

  • agent - provides agent information
    • id - current agent ID
    • version - current agent version information object
      • version - version string
      • commit - version commit
      • build_time - version build time
      • snapshot - version is snapshot build
  • host - provides host information
    • name - host hostname
    • platform - host platform
    • architecture - host architecture
    • ip[] - host IP addresses
    • mac[] - host MAC addresses
  • env - provides environment variables
    • key/values from current environment variables
  • local - provides a configurable static mapping
    • vars - key/values from the user-defined configuration in elastic-agent.yml

Dynamic Providers

Dynamic providers differ from context providers in that they provide an array of multiple key/value mappings. Each key/value mapping is combined with the context providers' key/value mapping, producing a new unique key/value mapping that is then used to generate a configuration.

Each mapping must provide a unique ID. This allows the provider to modify the data of an already provided mapping or remove a mapping. This is represented below by the id and mapping objects.

Each mapping can also provide processors for the input. These are only applied when a substitution from that mapping was made on the input. See the processors in the Docker provider example below.
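
An illustrative sketch of the data a dynamic provider emits (hypothetical types, not the final Agent API): each entry carries a stable ID, the key/value mapping used for substitution, and optional processors that are attached only when a substitution from that mapping was made.

package provider

// DynamicMapping is one entry emitted by a dynamic provider. ID is stable so
// the provider can later update or remove the entry, Mapping holds the
// key/values used for substitution, and Processors are prepended to the input
// only if a substitution from this mapping was made.
type DynamicMapping struct {
    ID         string
    Mapping    map[string]interface{}
    Processors []map[string]interface{}
}

// DynamicProvider is an illustrative contract for a dynamic provider: unlike
// a context provider it returns many mappings, one per discovered entity
// (for example, one per running Docker container).
type DynamicProvider interface {
    Name() string
    GetMappings() []DynamicMapping
}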

Below shows the flow of how this will work synchronously (it will be implemented asynchronously, so changes are handled as they occur):

config := getConfig()   // current unparsed Elastic Agent configuration
current := getContext()  // current context from all the Context Providers
for _, provider := range providers {
    for _, mapping := range provider.GetMappings() {
        providerContext := duplicateContext(current)  // duplicate current context
        providerContext.Merge(mapping)  // merge in the provider's key/value mapping
        newConfig := parseConfig(config, providerContext) // parse the config using the new key/value mapping
        //
        // use the new config to create inputs
        //
    }
}

A good example of this is a Docker dynamic provider. Imagine that the Docker provider provides the following:

[
    {
        "id": "1",
        "mapping": {"id": "1", "paths": {"log": "/var/log/containers/1.log"}},
        "processors": {"add_fields": {"container.name": "my-container"}}
    },
    {
        "id": "2",
        "mapping": {"id": "2", "paths": {"log": "/var/log/containers/2.log"}},
        "processors": {"add_fields": {"container.name": "other-container"}}
    }
]

Elastic Agent automatically prefixes the result with docker.

[
    {"docker": {"id": "1", "paths": {"log": "/var/log/containers/1.log"}}},
    {"docker": {"id": "2", "paths": {"log": "/var/log/containers/2.log"}}}
]

With the following defined configuration in Elastic Agent:

inputs:
  - type: logfile
    path: "${docker.paths.log}"

This would result in the following generated configuration:

inputs:
  - type: logfile
    path: "/var/log/containers/1.log"
    processors:
      - add_fields:
          container.name: my-container
  - type: logfile
    path: "/var/log/containers/2.log"
    processors:
      - add_fields:
          container.name: other-container

Initial Set

  • docker - provides an inventory of Docker containers
    • id - ID of the container
    • cmd - Arg path of container
    • name - Name of the container
    • image - Image of the container
    • labels - Labels of the container
    • ports - Ports of the container
    • paths - Object of paths for the container
      • log - Log path of the container
  • local_dynamic - provides a static configuration of dynamic mappings
    • vars - List of key/value mappings

Variable Substitution

To align with the syntax of Beats configuration variable substitution, ${ var } will be used.

When an input uses a variable substitution that is not present in the current key/value mapping being evaluated, that input is removed from the result.

inputs:
  - type: logfile
    path: "/var/log/foo"
  - type: logfile
    path: "${ unknown.key }"

The result, because no unknown.key exists:

inputs:
  - type: logfile
    path: "/var/log/foo"

Variable substitution can also define alternative variables or a constant. A constant must be quoted with either ' or ". Once a constant is reached during evaluation, any remaining alternatives are ignored, so a constant should be the last entry in the substitution. Alternatives are defined with | followed by the next variable or constant. The power comes from allowing the input to define the preference order of the substitution by chaining multiple |..var.. entries together.

inputs:
  - type: logfile
    path: "/var/log/foo"
  - type: logfile
    path: "${docker.paths.log|kubernetes.container.paths.log|'/var/log/other'}"

NOTE: Because ${ } collides with the variable syntax of the go-ucfg library that Elastic Agent uses to parse the configuration file, variable parsing by go-ucfg will be disabled for all of Elastic Agent.

Configuring

Providers are configured under the top-level providers key in the elastic-agent.yml configuration. By default all registered providers are enabled; if a provider cannot connect (as in the Docker case) it simply produces no mappings.

providers:
  local:
    vars:
      foo: bar
  local_dynamic:
    vars:
      - item: key1
      - item: key2

A provider can be explicitly disabled with enabled: false. Because all providers are prefixed and cannot collide, the provider name is used as the key in the configuration.

providers:
  docker:
    enabled: false

Debugging

Moved to elastic/elastic-agent#123

elasticmachine (Collaborator) commented

Pinging @elastic/ingest-management (Team:Ingest Management)

ph (Contributor) commented Aug 25, 2020

I like the distinction between context providers and inventory providers. Another provider to consider could be a "static provider" or "map provider", where a user could define a list of key/value pairs that could be used in the templates, and could define the "prefix" to use for that provider. This could help in two different use cases:

  • When developing configurations or testing.
  • When we have "reusable" integration policies, they could make the provider different for a specific agent.

ph (Contributor) commented Aug 25, 2020

Let's make sure we add a point for "default" values for variables.

ph (Contributor) commented Aug 25, 2020

@blakerouse Seriously, the ability to communicate with the running daemon opens up so many use cases; this is a really great addition for debugging. I haven't considered the security implications, but could we have an option to "tail" configuration changes as dynamic inputs get created? Maybe this idea needs more thinking, but I wanted to push it out there.

blakerouse (Contributor, Author) commented Aug 25, 2020

@ph Updated the description with many changes.

  • Added a new configuration section
  • Added static context provider
  • Added initial docker inventory provider
  • Added initial items inventory provider
  • Added the processors addition to inventory providers
  • Added default variable substitution
  • Added --watch for inspect command

ruflin (Member) commented Aug 26, 2020

Proposal LGTM. Few random notes from my end:

  • For context providers we control, we should ensure ECS field naming is used
  • Do we need chaining of defaults?
  • You specified processors in the provider metadata. Do you see other processors used there? Alternative idea is that the provider just provides data that needs to be added and does not require knowledge around processors.
  • "Inventory" as a name could lead to a naming conflict. Perhaps we call it "dynamic" providers. But naming ...
  • Is there a use case to have a provider specified twice? I can't see one but wanted to bring it up.
  • Can a user specify variables from two different dynamic providers in a single input? I assume in this case it would never run, as both variables never match?

To move forward on this quickly I would even skip the Docker provider at first and only have "items". The Docker provider has a dependency on code from the current autodiscovery.

On naming: I was confused at first around static vs items. Potential alternatives: local and local_list, local_context, local_dynamic. My guess is that the items would be mainly for testing. The odd part here would be the prefix. The other part that threw me off at first was mappings as I thought of Elasticsearch mappings. In the packages we call this vars: https://github.com/elastic/package-storage/blob/production/packages/aws/0.2.4/manifest.yml#L135

blakerouse (Contributor, Author) commented

Proposal LGTM. Few random notes from my end:

  • For context providers we control, we should ensure ECS field naming is used

Added information that ECS naming should be used if a field already exists that matches that value.

  • Do we need chaining of defaults?

No, because each provider has a unique name and is always prefixed with that name, so there is no need to worry about preference order or collisions.

  • You specified processors in the provider metadata. Do you see other processors used there? Alternative idea is that the provider just provides data that needs to be added and does not require knowledge around processors.

I do not at the moment, but I also wanted to future-proof it. If you think it will only ever be adding fields to the produced events, I can change to that behavior.

  • "Inventory" as a name could lead to a naming conflict. Perhaps we call it "dynamic" providers. But naming ...

Renamed "Inventory" to "Dynamic".

  • Is there a use case to have a provider specified twice? I can't see one but wanted to bring it up.

No.

  • Can a user specify variables from two different dynamic providers in a single input? I assume in this case it would never run, as both variables never match?

No, an input can only target a single dynamic provider, or it can use variable defaults such as {{docker.paths.log|kubernetes.container.paths.log}}.

To move forward on this quickly I would even skip the Docker provider at first and only have "items". The Docker provider has a dependency on code from the current autodiscovery.

Okay. Honestly, I think adding a basic Docker provider will be very simple. We will see.

On naming: I was confused at first around static vs items. Potential alternatives: local and local_list, local_context, local_dynamic. My guess is that the items would be mainly for testing. The odd part here would be the prefix. The other part that threw me off at first was mappings as I thought of Elasticsearch mappings. In the packages we call this vars: https://github.com/elastic/package-storage/blob/production/packages/aws/0.2.4/manifest.yml#L135

Updated to local and local_dynamic. Updated to use vars instead of mapping and mappings.

ruflin (Member) commented Aug 27, 2020

@blakerouse Thanks for the changes. On the naming side, I'm not very happy with my proposals either; they were just suggestions, so if you have other ideas please share them :-) At the same time, this should not block any implementation steps.

blakerouse (Contributor, Author) commented

@ruflin I am also not set on the names. So I changed it for now.

I do wonder whether, instead of the name Dynamic Inputs, we should call it Composable Inputs, with ContextProviders and DynamicProviders as the provider types.

ruflin (Member) commented Aug 31, 2020

@blakerouse WDYT about putting a Gdoc together with all the things we need to name?

blakerouse (Contributor, Author) commented

@ruflin I can do that.

blakerouse (Contributor, Author) commented

I have updated the description to use the new variable syntax of ${env.host|host.name|'constant'} and the disablement of using go-ucfg variable substitution for all of Elastic Agent.

The reason this decision was made is that using Handlebars-style syntax for the variables as well would have required integration package developers to worry about escaping. To remove that concern and make it clear that ${ } is for Agent and {{ }} is for packages, we chose to make this change.

Removing go-ucfg variable parsing also makes it clear that only one type of syntax is supported by Agent. This will require some extra work on the Agent side to parse configurations, but I think it is the proper approach moving forward.

blakerouse (Contributor, Author) commented

@ph @ruflin After playing around with my branch that implements the features described in this issue, I think we should remove the automatic appending of processors to the input. I think it adds too much magic and doesn't really let a user control those fields. It also places the logic in Elastic Agent instead of a package, which I think is wrong. If we remove the logic from Agent and allow the package to define the add_fields processors, then any adjustments or changes can be done in the package instead of needing to release a new Agent. That does not remove the need for a new Agent release if a new variable needs to be added to a provider.

Without the automatic processors, the package can easily add them to the input, including making them selectable in Kibana. That selectability gives Fleet more feature-rich control and allows users to control the size of their data streams by not including fields they do not care about.

Simple Example:

inputs:
  - type: logfile
    streams:
      - paths: /var/lib/docker/containers/${docker.container.id}/${docker.container.id}-json.log
    processors:
      - add_fields:
          fields:
            id: ${docker.container.id}
            name: ${docker.container.name}
            image: ${docker.container.image}
            labels: ${docker.container.labels}
          to: container

Package Example (ability to disable labels in package):

inputs:
  - type: logfile
    streams:
      - paths: /var/lib/docker/containers/${docker.container.id}/${docker.container.id}-json.log
    processors:
      - add_fields:
          fields:
            id: ${docker.container.id}
            name: ${docker.container.name}
            image: ${docker.container.image}
            {{#if enable_labels}}
            labels: ${docker.container.labels}
            {{/if}}
          to: container

ph (Contributor) commented Sep 2, 2020

I am always interested in how we can remove magic. I know I've mentioned that to you, and I do see the benefits of having control, reducing data, and "what you define is what you get". This indeed puts more value in the integration, which is where we invest in the magic user experience.

I've raised the idea with @ruflin and he had really good counter-arguments, which I will let him describe in more detail. This indeed moves the effort/burden/testing/development to the integration, maybe simplified with tooling, but the main concern is that without these fields some things will break: other apps and the dashboards expect those fields to exist.

@andresrc @roncohen might also have an opinion on this.

ruflin (Member) commented Sep 3, 2020

For some fields, I don't think we should give the user the option to remove them, as they are required to have complete events. Let's take container.id as an example: if the events coming from Docker don't contain it, the ingested events cannot be correlated anymore.

There is the other downside that if a package maintainer makes a small mistake or typo, the fields end up in the wrong place.

These fields will also be critical for parts of the UI to work properly, so I would even go so far as to say we should not allow users to remove them (even though they can if they really want to).

I still think each provider should ship the minimum fields that are really required to differentiate events, like container.id, while more fields like images or labels can be added by the package.

exekias (Contributor) commented Sep 3, 2020

In general I would prefer to be opinionated about what we ship. I would leave this to the provider, always being thoughtful about what we put there. Users (and integrations) can drop fields if they are not interested in them.

This raised a question on my side: what happens if I add a drop_fields processor to an input definition? Will the add_fields processor that we inject be appended or prepended?

ph (Contributor) commented Sep 3, 2020

Not sure of the current behavior, but I would expect that add_fields is prepended to the user-defined processors?

blakerouse (Contributor, Author) commented

The processors will be prepended to the user/package defined list of processors.

roncohen (Contributor) commented Sep 9, 2020

Just checking my understanding of the proposal. It says:

Elastic Agent automatically prefixes the result with docker.

[
    {"docker": {"id": "1", "paths": {"log": "/var/log/containers/1.log"}}},
    {"docker": {"id": "2", "paths": {"log": "/var/log/containers/2.log"}},
]

With the following defined configuration in Elastic Agent:

inputs:
  - type: logfile
    path: "{$docker.path$}"

But, this:

path: "{$docker.path$}"

should be
path: "${docker.paths.log}"

right?

ruflin (Member) commented Sep 9, 2020

@roncohen Correct.

ph (Contributor) commented Sep 9, 2020

I have updated the description per Ron's comment to remove some confusion.

blakerouse (Contributor, Author) commented

Moved debugging part to its own issue elastic/elastic-agent#123. Closing this now that variable substitution is implemented.
