[Fleet] Configure Fleet packages and integrations through config file #88956

Closed
ruflin opened this issue Jan 21, 2021 · 42 comments · Fixed by #94509
Labels: Feature:Fleet · Team:Defend Workflows · Team:Fleet

Comments

ruflin commented Jan 21, 2021

Today, installing packages and setting up integrations requires either the UI or Kibana API calls. In some cases it would be more convenient to specify the state through configuration files, for example in k8s. This issue is to discuss / brainstorm how this could be achieved and is not meant as a proposal for the implementation.

Basic idea

There are 2 parts which normally happen through API calls to Fleet:

  • Installation / Upgrade of packages
  • Creation of policies with integrations, and adding integrations to policies

Instead of doing this through an API call, a yml file could be used to configure the desired state. This config file should be dynamically reloadable, so that when a new package is needed, Kibana does not have to be restarted; the config can simply be reloaded.

Config

The config could be either inside the kibana.yml file or a separate config file. The important part is that it would be possible to reload the content and act on it. As package installation and integration configuration are different concerns, different configs would be needed.

Package setup config

The config to set up a package would have to indicate the package name and version. It could look like the following:

packages:
- nginx:1.7.2
- apache:1.3.2

This would install the nginx and apache packages. It could likely also be used to upgrade a package to a certain version: if nginx:1.7.1 is installed, it would be upgraded to 1.7.2.

Policies and integrations config

The second part would define the policies with the integrations inside and variables. As each integration must define which package it belongs to, the package installation could be skipped if integrations are configured.

policies:
- name: nginx-policy
  id: 1234
  integrations:
    - integration: nginx:1.7.2/logs
      name: nginx-logs
      paths: /foo/bar/nginx.log*
    - integration: nginx:1.7.2/metrics
      name: nginx-metrics
      hosts: 127.0.0.1

For the policies that are created through a config file, it should be defined whether they can also be modified manually.

NOTE: This is only an indication for discussion of the config file and not the final format.

Permission challenge

One challenge around package installation is that kibana_system lacks the permissions to install Elasticsearch Index Templates and other assets. A user with more permissions (currently superuser) is required. Two ideas on how to solve this problem:

  • API Endpoint to trigger: Have an api endpoint /fleet/refresh that can be called to trigger a reload of the config file and setup the packages / integrations
  • API Key in config file: An API Key with sufficient permissions could be put in the config file, which can then be used to install packages (see the sketch below). This is probably not great from a security perspective.
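
For illustration, a hedged sketch of the API-key idea in kibana.yml; the setting name xpack.fleet.setupApiKey is hypothetical, not an existing Kibana setting:

# Hypothetical setting name; the key would need permissions to install
# index templates and other assets (currently superuser-level)
xpack.fleet.setupApiKey: "base64-encoded-api-key"

As noted above, storing such a key in plain text in the config file is the main security concern with this variant.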
@ruflin added the Feature:Fleet and Team:Fleet labels Jan 21, 2021
@elasticmachine

Pinging @elastic/ingest-management (Team:Ingest Management)

@elasticmachine

Pinging @elastic/fleet (Feature:Fleet)

@kevinlog added the Team:Defend Workflows label Jan 21, 2021
@kevinlog

cc @pzl - this could help in the future with our Endpoint Package upgrade process

ph commented Jan 21, 2021

@ruflin Can you clarify the use case:

Today, installing packages and setting up integrations requires either the UI or Kibana API calls. In some cases it would be more convenient to specify the state through configuration files, for example in k8s. This issue is to discuss / brainstorm how this could be achieved and is not meant as a proposal for the implementation.

Today we ship Kibana with a set of integrations that need to be installed by default during setup. Do you expect these new settings to replace that behavior or complement it?

For the permission challenge, we have been pushing this forward for a long time; we need to prioritize and formalize where we want to go.

ruflin commented Jan 25, 2021

@ph Complement. There might be packages which we need to hardcode, as disabling them could break the setup.

ruflin commented Jan 26, 2021

For reference, here is a Kibana issue that discusses dynamic config reloading: #52756

ph commented Mar 8, 2021

Discussed over zoom:

  • This is required to bootstrap apm-server.
  • Decide what the yaml document will look like (see what it looks like for the REST call; maybe a JSON document)
  • Build a service that receives an object that describes the state of the system for integration and agent policy.
  • Refactor the current setup to use the service.
  • Expose the service as a rest endpoint to be called by cloud.

ph commented Mar 8, 2021

@Zacqary @ruflin pinged me on this and suggested we start with the yaml format because once we have it we can't change it. I've updated the above list.

Zacqary commented Mar 10, 2021

How should this config file relate to the default policies? I can think of a few options:

  1. The default policy still gets created regardless of what's in the config
  2. The default policy should only be created if the config defines no policies
  3. The config can redefine the default policy by passing something like name: default (sketched below)
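
For illustration, a minimal sketch of option 3 in the draft format from the issue description (the name: default convention is hypothetical):

policies:
- name: default
  integrations:
    - integration: nginx:1.7.2/logs
      name: nginx-logs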

Zacqary commented Mar 10, 2021

policies:
- name: nginx-policy
  id: 1234

id isn't required when creating an agent policy; it's automatically generated. Is there a reason we want to ask the user to manually specify an ID?

I'm also noticing there's no value for namespace, which is a required value on policy creation. Should we just come up with a namespace like preconfigured or should the namespace be configurable?
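
For illustration, a hedged sketch of the configurable-namespace option, extending the draft format (the field placement is an assumption):

policies:
- name: nginx-policy
  namespace: preconfigured
  integrations:
    - integration: nginx:1.7.2/logs
      name: nginx-logs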

ruflin commented Mar 11, 2021

For the default config, good question. I would go with option (3): the default policy has a unique id / name. If it should be overwritten, this specific id must be overwritten. This leaves us with the issue: what if someone does not want to have it at all?

For the id part, I was trying to find a way to identify the same policy for an update. How do we know whether to update a policy vs create a new one?

Don't take the yaml example I put in as "the way it should be"; it was more to show the concept.

simitt commented Mar 11, 2021

Isn't a policy name required to be unique? Then we could use the policy name for applying an update, rather than the policy id.

This leaves us with the issue, what if someone does not want to have it all?

@ruflin could you elaborate what you mean by this?

Zacqary commented Mar 11, 2021

integrations:
    - integration: nginx:1.7.2/logs

Do we want the semver to be required here, or optional? e.g. if no semver is specified, should it just install the latest available version?
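
For illustration, a sketch of what optional semver could look like in the packages config (the version-less form is hypothetical):

packages:
- nginx            # hypothetical: no version specified, install latest
- apache:1.3.2     # pinned to a specific version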

Zacqary commented Mar 11, 2021

Need some guidance on how to add a package in the form of nginx:1.7.2/logs to a policy. I know that ensureInstalledPackage requires us to omit the semver, but would it accept nginx/logs in this case?

Zacqary commented Mar 11, 2021

Update: I'm realizing that nginx/logs and nginx/metrics are actually stored as two different inputs of the same nginx integration. Does EPM differentiate between them with the / naming convention, or is this just example code? Trying to understand the intention of this example config.

Zacqary commented Mar 11, 2021

So far I have preconfigured packages working, and preconfigured policies using the default settings of the specified integrations. Where I'm stuck is on being able to preconfigure anything beyond the default integration settings.

- integration: nginx:1.7.2/logs
  name: nginx-logs
  paths: /foo/bar/nginx.log*
- integration: nginx:1.7.2/metrics
  name: nginx-metrics
  hosts: 127.0.0.1

This is a very simple representation of what seems to be represented in code as:

"inputs": [
        {
            "type": "logfile",
            "enabled": true,
            "streams": [
                {
                    "enabled": true,
                    "data_stream": {
                        "type": "logs",
                        "dataset": "nginx.access"
                    },
                    "vars": {
                        "paths": {
                            "value": [
                                "/var/log/nginx/access.log*"
                            ],
                            "type": "text"
                        }
                    },
                    "id": "logfile-nginx.access-fca53a35-e013-490c-b3f5-03e6c68ec4be"
                },
                {
                    "enabled": true,
                    "data_stream": {
                        "type": "logs",
                        "dataset": "nginx.error"
                    },
                    "vars": {
                        "paths": {
                            "value": [
                                "/var/log/nginx/error.log*"
                            ],
                            "type": "text"
                        }
                    },
                    "id": "logfile-nginx.error-fca53a35-e013-490c-b3f5-03e6c68ec4be"
                }
            ]
        },
        {
            "type": "nginx/metrics",
            "enabled": true,
            "vars": {
                "hosts": {
                    "value": [
                        "http://127.0.0.1:80"
                    ],
                    "type": "text"
                }
            },
            "streams": [
                {
                    "enabled": true,
                    "data_stream": {
                        "type": "metrics",
                        "dataset": "nginx.stubstatus"
                    },
                    "vars": {
                        "period": {
                            "value": "10s",
                            "type": "text"
                        },
                        "server_status_path": {
                            "value": "/nginx_status",
                            "type": "text"
                        }
                    },
                    "id": "nginx/metrics-nginx.stubstatus-fca53a35-e013-490c-b3f5-03e6c68ec4be"
                }
            ]
        }
    ]

There's a lot of specificity to translate here. Is there a predictable way for us to know which key/value pairs to translate into a var or a stream, what type to apply, etc.?

It's possible this task will be complex enough to warrant a second PR.

ruflin commented Mar 15, 2021

@simitt For #88956 (comment), my perspective is that a name can change over time even though it might be unique, while an id can't. It is possible that we treat name like an id at the moment, so both would work.

@Zacqary You will need to specify vars for each input / stream in the schema. Let's try to find some time to sync up and talk through it. I think overall you are on the right track. @jen-huang Would be great if you could also chime in here.

For the versions of the packages, I would stick to requiring a version for now and not support latest. We can still add these features later.
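
For illustration, a rough sketch of what vars specified per input / stream might look like, extending the draft format (the exact shape and field names are assumptions, loosely mirroring the inputs JSON above):

policies:
- name: nginx-policy
  integrations:
  - package: nginx:1.7.2
    inputs:
    - type: logfile
      streams:
      - dataset: nginx.access
        vars:
          paths: [/foo/bar/nginx.log*]
      - dataset: nginx.error
    - type: nginx/metrics
      vars:
        hosts: [http://127.0.0.1:80]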

@nchaulet

I think we could use ids for integrations; it would make things a lot easier, and there is no technical limitation preventing it.

Zacqary commented Mar 15, 2021

What does the is_managed flag on agent policies do? Should we enable that for config-defined policies, or is that for a different thing?

@nchaulet

I do not think we should enable the is_managed flag here. This flag kind of locks a config: you cannot enroll/reassign/unenroll agents into that config, and you cannot remove integrations from a managed config.

Zacqary commented Mar 15, 2021

A question was raised on the draft PR about what should happen if the user:

  1. Adds a policy to the config file, and lets Kibana configure it
  2. Removes this same policy from the config file and restarts Kibana

Should Kibana delete that policy? (If no agents are assigned)

Speaking of which, if the user removes something from packages, should Kibana uninstall it on setup?

ruflin commented Mar 16, 2021

My suggestion for now, to keep things simple, is that removal is not supported. We can add this later.

ruflin commented Mar 17, 2021

After a conversation with @Zacqary I realised that for the policy part we might need to take a step back and also think a bit about where policy configs might go in the future.

Elastic Agent policy

There are ongoing discussions around the Elastic Agent policy format, especially the part inside inputs. One of the main discussions is: do we need the streams part? The streams part was introduced to allow specifying certain config options at the input level, like hosts, username, password, or data_stream.namespace, so they don't have to be repeated all the time. This should make it easier for users who write their Elastic Agent config manually. It does not matter much in the case of Fleet, as the policy is generated.

Below is an example where, in theory, all 3 variants should be identical. 1 and 2 work today; 3 doesn't, but this could be changed:

# Example: What we use today
inputs:
  - type: system/metrics
    use_output: default
    data_stream.namespace: foo
    streams:
      - metricset: cpu
        data_stream.dataset: system.cpu
      - metricset: memory
        data_stream.dataset: system.memory
        
# Example: inputs repeated        
inputs:
  - type: system/metrics
    use_output: default
    data_stream.namespace: foo
    streams:
      - metricset: cpu
        data_stream.dataset: system.cpu
  - type: system/metrics
    use_output: default
    data_stream.namespace: foo
    streams:
      - metricset: memory
        data_stream.dataset: system.memory
        
# Example: no streams
inputs:
  - type: system/metrics
    data_stream.namespace: foo
    use_output: default
    metricset: cpu
    data_stream.dataset: system.cpu
    
  - type: system/metrics
    data_stream.namespace: foo
    use_output: default
    metricset: memory
    data_stream.dataset: system.memory

Kibana policy generation

I showed all the above to explain why the stream nesting is there, but in the end it is a nesting that might not be required, or can be partially ignored in this discussion.

On the Kibana side, we need a format that supports the following things:

  • Create an Elastic Agent Policy
  • Contains a list of integration policies
  • Each integration policy contains a list of inputs (ignoring streams here)
  • Each input must be able to contain a list of variables.

Here is a potential example which might work:

policies:
- name: nginx-policy
  integrations:
  - package: nginx:1.7.2
    name: nginx-logs
    inputs:
    - # This is required to tell where in the package to take the defaults from
      data_stream_path: access
      # This overwrites the path variables
      paths: /var/log/access.log
      # For error logs, all the defaults are taken 
    - data_stream_path: error
    # Metrics is not enabled as it is not specified

The important part here is that there must be some way to tell Kibana which data_stream_path it should use for filling in each input and taking its defaults.

I don't know if Kibana today has any validation against packages when a policy is submitted. But if, for example, the above contained a data stream path twice, non-existing variables, or anything similar, it should be rejected.

Note: The above is not meant as the final format; it is just meant to provide an example.

@jen-huang

@Zacqary @ruflin Leaving the gist here instead since it's more related to the above ^ comment. We discussed making the shape of the YAML more similar to what Kibana stores internally in its agent policy SOs and package policy SOs, here is a gist with that structure and notes around what Kibana should do for various fields: https://gist.github.com/jen-huang/c39232dfa59d7f97327f5af7909dc5f4

Zacqary commented Mar 17, 2021

@jen-huang Should the YAML field be agent_policies or agentPolicies? It seems like we're using camelcase for everything else in the config.

@jen-huang

@Zacqary Good point, let's go with agentPolicies to be consistent.

Zacqary commented Mar 17, 2021

What do we think of this error message? For when the user tries to add an agent policy with a package that's not installed:
[screenshot of the proposed error message]

@jen-huang

It's very verbose :D You could say it like this to tighten up the wording, but we can ping our wonderful tech writers during PR review for real wordsmithing: {agent policy name} could not be added. {package name} is not installed, add {package name} to xpack.fleet.packages or remove it from {package policy name}.

If you don't have it already, I would add a similar message to logging via logger.warn (and maybe logger.trace if there's json or yaml that would be helpful).

Another thing: it is strange that running into an error while setting up preconfigured policies blocks the entire Fleet UI, but this is a larger issue with the /setup endpoint than just this work; for reference you can read more about that in #91864. Just an FYI, I'm fine with the current blocking behavior since that's how /setup currently behaves.

ruflin commented Mar 18, 2021

Nit on the naming: in the Agent Policy itself we currently don't use any camelCase but always snake_case. As I think this should be closer to the Agent config, my vote would be for that.

On the /setup part: I thought we agreed that it would be a separate API endpoint, so it does not change the current setup. In phase 2, it might be that /setup calls this new API endpoint with some internal requirements, and then it would be blocking. As this is currently only an API feature, I'm not sure where we would have an error in the UI?

@jen-huang

For the config naming, most kibana.yml settings are camelCase, including most existing ones for Fleet. I thought it a little strange to have one non-camelCase setting for this, but I agree that everything else in the policy itself is snake_case (monitoring_enabled, package_policies), so it might not be so bad to align with that instead.

I may have misunderstood on our call, but I thought this initialization was being run as part of /setup; actually, I thought the order was the opposite: that we first do it through setup and later add another dedicated API endpoint. I actually like having it under another endpoint better, so ++ on that. We currently do a lot of things in setup that take a while to execute, so not having this under there would prevent additional overhead and slower UI loading times. @Zacqary Sorry for the confusion; hopefully it is not too bad to move the code to a new route handler.

Zacqary commented Mar 18, 2021

Running into a problem:

vars:
  system.hostfs:
    type: text
    value: home/test

YAML interprets this as:

{
  "vars": {
    "system": {
      "hostfs": {
        "value": "home/test"
      }
    }
  }
}

Wrapping it in quotes like 'system.hostfs' doesn't seem to fix this.

@jen-huang

Hmm, is that the JSON output after running it through js-yaml? I did a quick test on the demo and keys with dots seem to parse ok?

Zacqary commented Mar 18, 2021

Could be. It's how kibana.yml is getting parsed in my testing. I'll see if I can find out more.

Zacqary commented Mar 18, 2021

Looks like https://github.com/elastic/kibana/blob/master/packages/kbn-config/src/raw/ensure_deep_object.ts is the culprit. Not sure how to disable this for a particular YAML blob, but I'm checking with the core team.

Zacqary commented Mar 18, 2021

Okay so it turns out . is considered a reserved character in kibana.yml. We're not going to be able to adhere directly to the SavedObject schema if we want to do this through the config file.

I know the API is the first priority, but I assume we want to preserve compatibility with kibana.yml, so I suggest we come up with a slightly different schema that allows us to set var names as a value instead of a key.

@jen-huang

I know the API is the first priority, but I assume we want to preserve compatibility with kibana.yml, so I suggest we come up with a slightly different schema that allows us to set var names as a value instead of a key.

👍🏻 Should be pretty easy to change the vars to an array of var objects that include a name field, I updated the gist here: https://gist.github.com/jen-huang/c39232dfa59d7f97327f5af7909dc5f4/revisions#diff-4760b8308c337d4a06364b03d527bf8bc5b8e1c0776fd5bb67b05cb040a6c6fa
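
For illustration, a minimal sketch of that array shape, per the updated gist (reusing the values from the earlier example):

vars:
  - name: system.hostfs
    value: home/test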

Zacqary commented Mar 18, 2021

@jen-huang What do you think of key instead of name? I feel like the combination of key and value is clearer.

vars:
  - key: system.hostfs
    value: home/test

@jen-huang

I don't feel strongly one way or another, so that works for me.

ruflin commented Mar 22, 2021

For key vs name: even though I kind of agree that key might be better, we also have to keep in mind there are some existing values for this already: https://github.com/elastic/package-storage/blob/production/packages/apache/0.3.4/manifest.yml#L38 So I'm not sure we should introduce new "meanings" even if they are better.

As we don't use kibana.yml for the config in the first iteration, only the API, is . still an issue?

Zacqary commented Mar 22, 2021

I'd say it's still an issue; we shouldn't have one schema for JSON and a different schema for YAML if we can help it.

Zacqary commented Mar 23, 2021

So as of today, my draft PR actually has this working through BOTH an API and plugin setup through kibana.yml. I can disable the kibana.yml parsing if we're not ready to ship that, but it seems to be working in my local testing.

ruflin commented Mar 24, 2021

I would prefer to remove kibana.yml support for now, as I'm worried it might add complexity in case of conflicts.
