[Fleet] Configure Fleet packages and integrations through config file #88956

Closed
ruflin opened this issue Jan 21, 2021 · 42 comments · Fixed by #94509
Labels: Feature:Fleet · Team:Defend Workflows · Team:Fleet

Comments

ruflin commented Jan 21, 2021

Today, installing packages and setting up integrations requires either the UI or Kibana API calls. In some cases it would be more convenient to specify the state through configuration files, for example in k8s. This issue is to discuss / brainstorm how this could be achieved and is not meant as a proposal for the implementation.

Basic idea

There are 2 parts which normally happen through API calls to Fleet:

  • Installation / Upgrade of packages
  • Creation of policies with integrations, and adding integrations to policies

Instead of doing this through an API call, a yml file could be used to configure the desired state. This config file should be dynamically reloadable, so that when a new package is needed, Kibana does not have to be restarted; the config can simply be reloaded.

Config

The config could be either inside the kibana.yml file or a separate config file. The important part is that it would be possible to reload the content and act on it. As package installation and integration configuration are different concerns, different configs would be needed.

Package setup config

The config to set up a package would have to indicate the package name and version. It could look like the following:

packages:
- nginx:1.7.2
- apache:1.3.2

This would install the nginx and apache packages. It could likely also be used to upgrade a package to a certain version: if nginx:1.7.1 is installed, it would be upgraded to 1.7.2.

Policies and integrations config

The second part would define the policies with the integrations inside and variables. As each integration must define which package it belongs to, the package installation could be skipped if integrations are configured.

policies:
- name: nginx-policy
  id: 1234
  integrations:
    - integration: nginx:1.7.2/logs
      name: nginx-logs
      paths: /foo/bar/nginx.log*
    - integration: nginx:1.7.2/metrics
      name: nginx-metrics
      hosts: 127.0.0.1

For the policies that are created through a config file, it should be defined whether they can also be modified manually.

NOTE: This is only an indication for discussion of the config file and not the final format.

Permission challenge

One challenge around package installation is that kibana_system lacks the permissions to install Elasticsearch Index Templates and other assets. A user with more permissions (currently superuser) is required. Two ideas on how to solve this problem:

  • API Endpoint to trigger: Have an api endpoint /fleet/refresh that can be called to trigger a reload of the config file and setup the packages / integrations
  • API Key in config file: An API Key with sufficient permissions could be put in the config file, which can then be used to install packages (see the sketch below). This is probably not great from a security perspective.
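
For illustration, a hedged sketch of the API-key idea in kibana.yml; the setting name xpack.fleet.setupApiKey is hypothetical, not an existing Kibana setting:

# Hypothetical setting name; the key would need permissions to install
# index templates and other assets (currently superuser-level)
xpack.fleet.setupApiKey: "base64-encoded-api-key"

As noted above, storing such a key in plain text in the config file is the main security concern with this variant.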
@ruflin added the Feature:Fleet and Team:Fleet labels Jan 21, 2021
@elasticmachine

Pinging @elastic/ingest-management (Team:Ingest Management)

@elasticmachine

Pinging @elastic/fleet (Feature:Fleet)

@kevinlog added the Team:Defend Workflows label Jan 21, 2021
@kevinlog

cc @pzl - this could help in the future with our Endpoint Package upgrade process

ph commented Jan 21, 2021

@ruflin Can you clarify the use case:

Today, installing packages and setting up integrations requires either the UI or Kibana API calls. In some cases it would be more convenient to specify the state through configuration files, for example in k8s. This issue is to discuss / brainstorm how this could be achieved and is not meant as a proposal for the implementation.

Today we ship Kibana with a set of integrations that need to be installed by default during setup. Do you expect these new settings to replace that behavior or complement it?

For the permission challenge, we have been pushing this forward for a long time; we need to prioritize and formalize where we want to go.

ruflin commented Jan 25, 2021

@ph Complement. There might be packages which we need to hardcode, as disabling them could break the setup.

ruflin commented Jan 26, 2021

For reference, here is a Kibana issue that discusses dynamic config reloading: #52756

ph commented Mar 8, 2021

Discussed over zoom:

  • This is required to bootstrap apm-server.
  • Decide what the yaml document will look like (see what it looks like for the REST call; maybe a JSON document)
  • Build a service that receives an object that describes the state of the system for integration and agent policy.
  • Refactor the current setup to use the service.
  • Expose the service as a rest endpoint to be called by cloud.

ph commented Mar 8, 2021

@Zacqary @ruflin pinged me on this and suggested we start with the yaml format because once we have it we can't change it. I've updated the above list.

Zacqary commented Mar 10, 2021

How should this config file relate to the default policies? I can think of a few options:

  1. The default policy still gets created regardless of what's in the config
  2. The default policy should only be created if the config defines no policies
  3. The config can redefine the default policy by passing something like name: default (sketched below)
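
For illustration, a minimal sketch of option 3 in the draft format from the issue description (the name: default convention is hypothetical):

policies:
- name: default
  integrations:
    - integration: nginx:1.7.2/logs
      name: nginx-logs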

Zacqary commented Mar 10, 2021

policies:
- name: nginx-policy
  id: 1234

id isn't required when creating an agent policy; it's automatically generated. Is there a reason we want to ask the user to manually specify an ID?

I'm also noticing there's no value for namespace, which is a required value on policy creation. Should we just come up with a namespace like preconfigured or should the namespace be configurable?
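
For illustration, a hedged sketch of the configurable-namespace option, extending the draft format (the field placement is an assumption):

policies:
- name: nginx-policy
  namespace: preconfigured
  integrations:
    - integration: nginx:1.7.2/logs
      name: nginx-logs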

ruflin commented Mar 11, 2021

For the default config, good question. I would go with option (3): the default policy has a unique id / name. If it should be overwritten, this specific id must be overwritten. This leaves us with the issue: what if someone does not want to have it at all?

For the id part, I was trying to find a way to identify the same policy for an update. How do we know whether to update a policy vs create a new one?

Don't take the yaml example I put in as "the way it should be"; it was more to show the concept.

simitt commented Mar 11, 2021

Isn't a policy name required to be unique? Then we could use the policy name for applying an update, rather than the policy id.

This leaves us with the issue, what if someone does not want to have it all?

@ruflin could you elaborate what you mean by this?

Zacqary commented Mar 11, 2021

integrations:
    - integration: nginx:1.7.2/logs

Do we want the semver to be required here, or optional? e.g. if no semver is specified, should it just install the latest available version?
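
For illustration, a sketch of what optional semver could look like in the packages config (the version-less form is hypothetical):

packages:
- nginx            # hypothetical: no version specified, install latest
- apache:1.3.2     # pinned to a specific version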

Zacqary commented Mar 11, 2021

Need some guidance on how to add a package in the form of nginx:1.7.2/logs to a policy. I know that ensureInstalledPackage requires us to omit the semver, but would it accept nginx/logs in this case?

Zacqary commented Mar 11, 2021

Update: I'm realizing that nginx/logs and nginx/metrics are actually stored as two different inputs of the same nginx integration. Does EPM differentiate between them with the / naming convention, or is this just example code? Trying to understand the intention of this example config.

Zacqary commented Mar 11, 2021

So far I have preconfigured packages working, and preconfigured policies using the default settings of the specified integrations. Where I'm stuck is on being able to preconfigure anything beyond the default integration settings.

- integration: nginx:1.7.2/logs
  name: nginx-logs
  paths: /foo/bar/nginx.log*
- integration: nginx:1.7.2/metrics
  name: nginx-metrics
  hosts: 127.0.0.1

This is a very simple representation of what seems to be represented in code as:

"inputs": [
        {
            "type": "logfile",
            "enabled": true,
            "streams": [
                {
                    "enabled": true,
                    "data_stream": {
                        "type": "logs",
                        "dataset": "nginx.access"
                    },
                    "vars": {
                        "paths": {
                            "value": [
                                "/var/log/nginx/access.log*"
                            ],
                            "type": "text"
                        }
                    },
                    "id": "logfile-nginx.access-fca53a35-e013-490c-b3f5-03e6c68ec4be"
                },
                {
                    "enabled": true,
                    "data_stream": {
                        "type": "logs",
                        "dataset": "nginx.error"
                    },
                    "vars": {
                        "paths": {
                            "value": [
                                "/var/log/nginx/error.log*"
                            ],
                            "type": "text"
                        }
                    },
                    "id": "logfile-nginx.error-fca53a35-e013-490c-b3f5-03e6c68ec4be"
                }
            ]
        },
        {
            "type": "nginx/metrics",
            "enabled": true,
            "vars": {
                "hosts": {
                    "value": [
                        "http://127.0.0.1:80"
                    ],
                    "type": "text"
                }
            },
            "streams": [
                {
                    "enabled": true,
                    "data_stream": {
                        "type": "metrics",
                        "dataset": "nginx.stubstatus"
                    },
                    "vars": {
                        "period": {
                            "value": "10s",
                            "type": "text"
                        },
                        "server_status_path": {
                            "value": "/nginx_status",
                            "type": "text"
                        }
                    },
                    "id": "nginx/metrics-nginx.stubstatus-fca53a35-e013-490c-b3f5-03e6c68ec4be"
                }
            ]
        }
    ]

There's a lot of specificity to translate here. Is there a predictable way for us to know which key/value pairs to translate into a var or a stream, what type to apply, etc.?

It's possible this task will be complex enough to warrant a second PR.

ruflin commented Mar 15, 2021

@simitt For #88956 (comment), my perspective is that a name can change over time even though it might be unique, while an id can't. It is possible that we treat name like an id at the moment, so both would work.

@Zacqary You will need to specify vars for each input / stream in the schema. Let's try to find some time to sync up and talk through it. I think overall you are on the right track. @jen-huang Would be great if you could also chime in here.

For the versions of the packages, I would stick to requiring a version for now and not support latest. We can still add these features later.
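
For illustration, a rough sketch of what vars specified per input / stream might look like, extending the draft format (the exact shape and field names are assumptions, loosely mirroring the inputs JSON above):

policies:
- name: nginx-policy
  integrations:
  - package: nginx:1.7.2
    inputs:
    - type: logfile
      streams:
      - dataset: nginx.access
        vars:
          paths: [/foo/bar/nginx.log*]
      - dataset: nginx.error
    - type: nginx/metrics
      vars:
        hosts: [http://127.0.0.1:80]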

@nchaulet

I think we could use ids for integrations; it would make things a lot easier, and there is no technical limitation preventing it.

Zacqary commented Mar 15, 2021

What does the is_managed flag on agent policies do? Should we enable that for config-defined policies, or is that for a different thing?

@nchaulet

I do not think we should enable the is_managed flag here. This flag kind of locks a config: you cannot enroll/reassign/unenroll agents into that config, and you cannot remove integrations from a managed config.

Zacqary commented Mar 15, 2021

A question was raised on the draft PR about what should happen if the user:

  1. Adds a policy to the config file, and lets Kibana configure it
  2. Removes this same policy from the config file and restarts Kibana

Should Kibana delete that policy? (If no agents are assigned)

Speaking of which, if the user removes something from packages, should Kibana uninstall it on setup?

ruflin commented Mar 16, 2021

My suggestion for now, to keep things simple, is that removal is not supported. We can add this later.

ruflin commented Mar 17, 2021

After a conversation with @Zacqary I realised that for the policy part we might need to take a step back and also think a bit about where policy configs might go in the future.

Elastic Agent policy

There are ongoing discussions around the Elastic Agent policy format, especially the part inside inputs. One of the main discussions is: do we need the streams part? The streams part was introduced to allow specifying certain config options at the input level, like hosts, username, password, or data_stream.namespace, so they don't have to be repeated all the time. This should make it easier for users who write their Elastic Agent config manually. It does not matter much in the case of Fleet, as the policy is generated.

Below is an example where, in theory, all 3 variants should be identical. 1 and 2 work today; 3 doesn't, but this could be changed:

# Example: What we use today
inputs:
  - type: system/metrics
    use_output: default
    data_stream.namespace: foo
    streams:
      - metricset: cpu
        data_stream.dataset: system.cpu
      - metricset: memory
        data_stream.dataset: system.memory
        
# Example: inputs repeated        
inputs:
  - type: system/metrics
    use_output: default
    data_stream.namespace: foo
    streams:
      - metricset: cpu
        data_stream.dataset: system.cpu
  - type: system/metrics
    use_output: default
    data_stream.namespace: foo
    streams:
      - metricset: memory
        data_stream.dataset: system.memory
        
# Example: no streams
inputs:
  - type: system/metrics
    data_stream.namespace: foo
    use_output: default
    metricset: cpu
    data_stream.dataset: system.cpu
    
  - type: system/metrics
    data_stream.namespace: foo
    use_output: default
    metricset: memory
    data_stream.dataset: system.memory

Kibana policy generation

I showed all the above to explain why the stream nesting is there, but in the end it is a nesting that might not be required, or can be partially ignored in this discussion.

On the Kibana side, we need a format that supports the following things:

  • Create an Elastic Agent Policy
  • Contains a list of integration policies
  • Each integration policy contains a list of inputs (ignoring streams here)
  • Each input must be able to contain a list of variables.

Here is a potential example which might work:

policies:
- name: nginx-policy
  integrations:
  - package: nginx:1.7.2
    name: nginx-logs
    inputs:
    - # This is required to tell where in the package to take the defaults from
      data_stream_path: access
      # This overwrites the path variables
      paths: /var/log/access.log
      # For error logs, all the defaults are taken 
    - data_stream_path: error
    # Metrics is not enabled as it is not specified

The important part here is that there must be some way to tell Kibana which data_stream_path it should use for filling in each input and taking its defaults.

I don't know if Kibana today has any validation against packages when a policy is submitted. But if, for example, the above contained a data stream path twice, non-existing variables, or anything similar, it should be rejected.

Note: The above is not meant as the final format; it is just meant to provide an example.

@jen-huang

@Zacqary @ruflin Leaving the gist here instead since it's more related to the above ^ comment. We discussed making the shape of the YAML more similar to what Kibana stores internally in its agent policy SOs and package policy SOs, here is a gist with that structure and notes around what Kibana should do for various fields: https://gist.github.com/jen-huang/c39232dfa59d7f97327f5af7909dc5f4

Zacqary commented Mar 17, 2021

@jen-huang Should the YAML field be agent_policies or agentPolicies? It seems like we're using camelcase for everything else in the config.

@jen-huang

@Zacqary Good point, let's go with agentPolicies to be consistent.

Zacqary commented Mar 17, 2021

What do we think of this error message? For when the user tries to add an agent policy with a package that's not installed:
[screenshot of the proposed error message]

@jen-huang

It's very verbose :D You could say it like this to tighten up the wording, but we can ping our wonderful tech writers during PR review for real wordsmithing: {agent policy name} could not be added. {package name} is not installed, add {package name} to xpack.fleet.packages or remove it from {package policy name}.

If you don't have it already, I would add a similar message to logging via logger.warn (and maybe logger.trace if there's json or yaml that would be helpful).

Another thing: it is strange that running into an error while setting up preconfigured policies blocks the entire Fleet UI, but this is a larger issue with the /setup endpoint than just this work; for reference you can read more about that in #91864. Just an FYI, I'm fine with the current blocking behavior since that's how /setup currently behaves.

ruflin commented Mar 18, 2021

Nit on the naming: in the Agent Policy itself we currently don't use any camelCase but always snake_case. As I think this should be closer to the Agent config, my vote would be for that.

On the /setup part: I thought we agreed that it would be a separate API endpoint, so it does not change the current setup. In phase 2, it might be that /setup calls this new API endpoint with some internal requirements, and then it would be blocking. As this is currently only an API feature, I'm not sure where we would have an error in the UI?

@jen-huang

For the config naming, most kibana.yml settings are camelCase, including most existing ones for Fleet. I thought it a little strange to have one non-camelCase setting for this, but I agree that everything else in the policy itself is snake_case (monitoring_enabled, package_policies), so it might not be so bad to align with that instead.

I may have misunderstood on our call, but I thought this initialization was being run as part of /setup; actually, I thought the order was the opposite: that we first do it through setup and later add another dedicated API endpoint. I actually like having it under another endpoint better, so ++ on that. We currently do a lot of things in setup that take a while to execute, so not having this under there would prevent additional overhead and slower UI loading times. @Zacqary Sorry for the confusion; hopefully it is not too bad to move the code to a new route handler.

Zacqary commented Mar 18, 2021

Running into a problem:

vars:
  system.hostfs:
    type: text
    value: home/test

YAML interprets this as:

{
  "vars": {
    "system": {
      "hostfs": {
        "value": "home/test"
      }
    }
  }
}

Wrapping it in quotes like 'system.hostfs' doesn't seem to fix this.

@jen-huang

Hmm, is that the JSON output after running it through js-yaml? I did a quick test on the demo and keys with dots seem to parse ok?

Zacqary commented Mar 18, 2021

Could be. It's how kibana.yml is getting parsed in my testing. I'll see if I can find out more.

Zacqary commented Mar 18, 2021

Looks like https://github.com/elastic/kibana/blob/master/packages/kbn-config/src/raw/ensure_deep_object.ts is the culprit. Not sure how to disable this for a particular YAML blob, but I'm checking with the core team.

Zacqary commented Mar 18, 2021

Okay so it turns out . is considered a reserved character in kibana.yml. We're not going to be able to adhere directly to the SavedObject schema if we want to do this through the config file.

I know the API is the first priority, but I assume we want to preserve compatibility with kibana.yml, so I suggest we come up with a slightly different schema that allows us to set var names as a value instead of a key.

@jen-huang

I know the API is the first priority, but I assume we want to preserve compatibility with kibana.yml, so I suggest we come up with a slightly different schema that allows us to set var names as a value instead of a key.

👍🏻 Should be pretty easy to change the vars to an array of var objects that include a name field, I updated the gist here: https://gist.github.com/jen-huang/c39232dfa59d7f97327f5af7909dc5f4/revisions#diff-4760b8308c337d4a06364b03d527bf8bc5b8e1c0776fd5bb67b05cb040a6c6fa
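
For illustration, a minimal sketch of that array shape, per the updated gist (reusing the values from the earlier example):

vars:
  - name: system.hostfs
    value: home/test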

Zacqary commented Mar 18, 2021

@jen-huang What do you think of key instead of name? I feel like the combination of key and value is clearer.

vars:
  - key: system.hostfs
    value: home/test

@jen-huang

I don't feel strongly one way or another, so that works for me.

ruflin commented Mar 22, 2021

For key vs name: even though I kind of agree that key might be better, we also have to keep in mind there are some existing values for this already: https://github.com/elastic/package-storage/blob/production/packages/apache/0.3.4/manifest.yml#L38 So I'm not sure we should introduce new "meanings" even if they are better.

As we don't use kibana.yml for the config in the first iteration, only the API, is . still an issue?

Zacqary commented Mar 22, 2021

I'd say it's still an issue; we shouldn't have one schema for JSON and a different schema for YAML if we can help it.

Zacqary commented Mar 23, 2021

So as of today, my draft PR actually has this working through BOTH an API and plugin setup through kibana.yml. I can disable the kibana.yml parsing if we're not ready to ship that, but it seems to be working in my local testing.

ruflin commented Mar 24, 2021

I would prefer to remove kibana.yml support for now, as I'm worried it might add complexity in case of conflicts.
