[Ingest Management] Agent supports capabilities definition #23848

michalpristas · 2021-02-04T13:31:06Z

What does this PR do?

This PR introduces capabilities as described in #21000
It covers the first step of the implementation: a bare input/output/upgrade filter, without additional filtering options such as restricting an input to a subset of metricsets.

A Capability is injected into the components and checks the objects passed to its Apply function.
This capability holds a set of input/output/upgrade capabilities built from the definitions in the capabilities.yml file. This file sits next to the agent config and contains a list of definitions.

Each of these capabilities decides whether the input object is relevant to it, that is, whether it can operate on it.
If so, it checks whatever it is configured for and returns the updated object.
It always returns an object of the same type as the one passed in.
E.g. the input capability can operate on maps and on the AST; if an AST is passed in, a map is never returned.
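A minimal sketch of the "same type in, same type out" behaviour described above. The names inputFilter and astNode are hypothetical stand-ins made up for this illustration, not the types used in the PR, and the real AST is considerably richer than a list of strings.

```go
package main

import "fmt"

// astNode is a hypothetical stand-in for the agent's AST type.
type astNode struct {
	Inputs []string
}

// inputFilter is an illustrative capability that drops one blocked input type.
type inputFilter struct {
	blocked string
}

// Apply returns a value of the same dynamic type it received.
// Objects it does not understand are passed through unchanged.
func (f inputFilter) Apply(in interface{}) interface{} {
	switch v := in.(type) {
	case map[string]interface{}:
		inputs, ok := v["inputs"].([]string)
		if !ok {
			return in // not interesting for this capability
		}
		out := make(map[string]interface{}, len(v))
		for key, val := range v {
			out[key] = val
		}
		out["inputs"] = f.keep(inputs)
		return out
	case *astNode:
		return &astNode{Inputs: f.keep(v.Inputs)}
	default:
		return in
	}
}

// keep returns the inputs that are not blocked.
func (f inputFilter) keep(inputs []string) []string {
	kept := make([]string, 0, len(inputs))
	for _, name := range inputs {
		if name != f.blocked {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	f := inputFilter{blocked: "logfile"}
	fmt.Printf("%T\n", f.Apply(map[string]interface{}{"inputs": []string{"logfile", "system/metrics"}}))
	fmt.Printf("%T\n", f.Apply(&astNode{Inputs: []string{"logfile"}}))
}
```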

When the agent rules out an input/output, a log entry is written at ERROR level and the agent's health is degraded, as discussed in elastic/kibana#76841.

The feature works in both standalone and Fleet-managed mode.

Sample capabilities.yml allowing system metrics and nothing else:

version: 0.0.1
capabilities:
- rule: allow
  input: system/metrics
- rule: deny
  input: "*"
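One way to read the sample above, sketched in Go. This assumes rules are evaluated top to bottom, the first matching rule decides, and an input with no matching rule is allowed; those semantics, the rule struct and the helper names are assumptions for illustration, not taken from the PR code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule mirrors one entry of the sample file (hypothetical struct).
type rule struct {
	Rule  string // "allow" or "deny"
	Input string // input type; "*" matches any run of characters
}

// matches reports whether name satisfies the wildcard pattern.
func matches(pattern, name string) bool {
	expr := "^" + strings.ReplaceAll(regexp.QuoteMeta(pattern), `\*`, ".*") + "$"
	return regexp.MustCompile(expr).MatchString(name)
}

// allowed applies the rules in order; the first matching rule wins.
// Inputs that match no rule are allowed (an assumption of this sketch).
func allowed(rules []rule, input string) bool {
	for _, r := range rules {
		if matches(r.Input, input) {
			return r.Rule == "allow"
		}
	}
	return true
}

func main() {
	rules := []rule{
		{Rule: "allow", Input: "system/metrics"},
		{Rule: "deny", Input: "*"},
	}
	for _, input := range []string{"system/metrics", "logfile"} {
		fmt.Println(input, allowed(rules, input))
	}
}
```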

Testing

The ideal way to try this out is to package an agent, modify elastic-agent.yml and capabilities.yml, and then invoke ./elastic-agent inspect to see what the resulting config looks like.

Why is it important?

#21000

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

ruflin · 2021-02-05T10:00:57Z

x-pack/elastic-agent/pkg/capabilities/testdata/allow_metrics-capabilities.yml

@@ -0,0 +1,4 @@
+- rule: allow


Having a top-level entry capabilities instead of directly the array would allow us to extend it later with other things if needed. For example the version of the capabilities file.

+1 on adding a version from day 1.

michalpristas · 2021-02-05T16:06:46Z

/package

michalpristas · 2021-02-05T20:45:16Z

/package

urso · 2021-02-08T14:33:09Z

x-pack/elastic-agent/pkg/agent/application/emitter.go

@@ -142,7 +154,7 @@ func (e *emitterController) update() error {
 	return e.router.Dispatch(ast.HashStr(), programsToRun)
 }

-func emitter(ctx context.Context, log *logger.Logger, agentInfo *info.AgentInfo, controller composable.Controller, router programsDispatcher, modifiers *configModifiers, reloadables ...reloadable) (emitterFunc, error) {
+func emitter(ctx context.Context, log *logger.Logger, agentInfo *info.AgentInfo, controller composable.Controller, router programsDispatcher, modifiers *configModifiers, caps capabilities.Capability, reloadables ...reloadable) (emitterFunc, error) {


nit: The number of parameters gets a little out of hand. Do we really need this function, or can we just use the struct and have a separate "init" or "start" function/method?

Do we pass a tuple of ctx, log, agentInfo often? In the future we might even want to pass something for metrics and APM tracing. The 'tuple' could be combined into an appContext struct.

I'm +1 on this. It would be worth a small refactor across the whole codebase; I think we have this pattern in multiple places. These things were simple at first but got bloated over time.

The emitter function has actually become more of a wrapper for the emitterController with the addition of dynamic inputs and conditions.

We should probably refactor that into a single controller.

x-pack/elastic-agent/pkg/capabilities/capabilities.go

x-pack/elastic-agent/pkg/capabilities/expr.go

urso · 2021-02-08T15:22:03Z

x-pack/elastic-agent/pkg/capabilities/capabilities.go

+type Capability interface {
+	// Apply applies capabilities on input and returns true if input should be completely blocked
+	// otherwise, false and updated input is returned
+	Apply(interface{}) (bool, interface{})


The interface looks hard to use correctly long term.

Some capabilities seem to accept map[string]interface{} and in some places an "AST" object is required. The return type is also not so clear.

Is the return type always map[string]interface{} or can it be an AST? Does the return type depend on the input type?

Will a capability modify the input, or will it always create a new object?

For the implementor of a capability: how do I know which input type I must accept? Does it depend on where the capability will be applied? Do we have capabilities that need to implement multiple possible input types?

For example inputCapability: blocked is false if the Input gets passed the wrong type. The inputCapability even silently ignores programmer errors if I pass in the wrong type by accident in the future.

Can we settle on a single type as input to a Capability to ease maintenance in the future? In case we really need the two implementations for Apply, let's use the type system to reflect this e.g. ApplyConfig(m map[string]interface{}) ... and ApplyAST(...) (maybe even in a separate interface).
If possible I would prefer concrete types over anything interface{}.

How about returning an error instead of a boolean? E.g. common signals would be defined as constants such as ErrBlocked, while we would still be able to pass errors up if an instance is used incorrectly.

I was aiming for an abstract interface. For some capabilities the object may be a map, it may be a struct, it may be both. It depends on where the object is coming from.
E.g. for upgrade it is an Action coming from fleetapi or an object conforming to an interface; for input it can be a map or an AST, or if we implement it elsewhere it may also be an Action. Separate implementations would lead to more objects there, and at the top level we would probably need all of those methods as well (at the level of the top capability returned by Load, which is then passed and injected down the stream).

It's up to the implementor of a capability to be aware of the hooks and the places where the data are coming from.

returning an error seems reasonable

x-pack/elastic-agent/pkg/capabilities/capabilities_test.go

urso · 2021-02-08T15:57:45Z

x-pack/elastic-agent/pkg/capabilities/input.go

+	Input    string `json:"input" yaml:"input"`
+}
+
+func (c *inputCapability) Apply(in interface{}) (bool, interface{}) {


inputCapability seems to be internal to this package only. Actually seems to be internal to the multiInputsCapability. Why not accept and return concrete types such that multiInputsCapability does not need to do special casts and runtime type checks validating the output of inputCapability?

i delegated this responsibility to the latest caller. in case inputCapability is used outside of multiInput i wanted to have checks in place.
it might be redundant but it's not run that often

x-pack/elastic-agent/pkg/capabilities/input.go

urso · 2021-02-08T16:10:34Z

x-pack/elastic-agent/pkg/capabilities/input.go

+		if inputs := inputsMap(inputsIface, c.log); inputs != nil {
+			renderedInputs, err := c.renderInputs(inputs)
+			if err != nil {
+				c.log.Errorf("marking inputs failed for capability '%s': %v", c.name(), err)


Structured logging. If possible, let's include the input id and other metadata (if present) that allow us to identify the input: c.log.With(...).Errorf or c.log.Errorw.

x-pack/elastic-agent/pkg/capabilities/expr.go

michalpristas · 2021-02-09T13:46:28Z

/package

blakerouse

Looks like @urso gave a good review. I didn't catch anything that you have not already fixed. +1 from me.

michalpristas · 2021-02-15T08:09:21Z

/package

mdelapenya · 2021-02-15T09:40:08Z

/package

Hey @michalpristas, I'm seeing the following error while running the standalone tests (using the docker image):

[fleet_elastic-agent_1][docker.elastic.co/observability-ci/elastic-agent:pr-23848][e52698e5e9c49fd623fdffaadae8be1b883b05627aa09938f23a234b93273122][2021-02-15T09:28:28.128Z] standard_init_linux.go:219: exec user process caused: exec format error

Container logs: https://beats-ci.elastic.co/job/e2e-tests/job/e2e-testing-mbp/job/master/351/artifact/docker_logs_fleet_stand_alone_agent%20&&%20~@nightly.log

do you think it's related?

michalpristas · 2021-02-15T09:42:04Z

@mdelapenya exec format error? I don't think so, maybe ARM?

michalpristas · 2021-02-15T10:17:55Z

Tried Linux locally and it works. @mdelapenya is talking to the team to see what's causing the failures. I will not merge until I know it's not related.

ruflin · 2021-02-15T11:48:47Z

@michalpristas @urso I assume in any case we should not merge when CI is red? I guess this also applies to E2E?

michalpristas · 2021-02-15T13:42:18Z

The failures don't look related; it seems to be something with the e2e tests in general.


urso · 2021-02-15T13:29:14Z

x-pack/elastic-agent/pkg/capabilities/rule.go

+)
+
+type ruler interface {
+	Rule() string


What is 'Rule' supposed to return? Maybe we should name the method RuleType()?

Why do we need the return type "string" here? AFAICT the only valid values are allow and deny. If valid values are restricted we should introduce an enum.

I don't see the method Rule() to be used anywhere? Do you plan to use it somewhere in the future?

urso · 2021-02-15T13:50:58Z

x-pack/elastic-agent/pkg/capabilities/capabilities.go

+type Capability interface {
+	// Apply applies capabilities on input and returns true if input should be completely blocked
+	// otherwise, false and updated input is returned
+	Apply(interface{}) (interface{}, error)


I still would say the interface is doing too much, while not really abstracting things.

The input and output types are interface{}, that is, the internal representations of the objects are not really abstracted away behind coherent interfaces. Instead we rely on internal knowledge of the potential structures passed in from different call sites. Due to the need to downcast and analyze the internal structure, we fail to have proper abstractions here. Instead we allow for potential bugs being introduced in the future in case we change the internal representation of e.g. Inputs to a proper Go type without finding and updating the right capabilities.

The Apply method indicates that we are doing a transformation. Yet we only want to filter. Would it make sense to change the interface to act more like a predicate and have the code to move through the configured structures in some other central place?
My understanding is that we have the 'walking' on the structure already in the transpiler, why should we replicate the walking in the capabilities?

E.g.

type Capability interface {
	// TestFeature returns nil if the kind and typeName are accepted by the capability.
	// ErrDeny is returned if the capability rejects the current feature.
	TestFeature(kind ComponentKind, typeName string) (err error)
}

var ErrDeny = errors.New("deny")

type ComponentKind int

const (
	ComponentInput ComponentKind = iota
	ComponentOutput
)

Alternatively (in case we want transformations), we might want to pass the root configuration object to Apply. In that case only the implementation is abstracted away, but the input type is known. If we change the go type for inputs/outputs to proper structures, the actual implementation of the input capability should fail to compile.
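To make the predicate variant concrete, here is a sketch of how it might be used: capabilities only answer yes or no, and a single central function walks the inputs. denyList and filterInputs are hypothetical names, and the flat list of input type names stands in for the real configuration structure.

```go
package main

import (
	"errors"
	"fmt"
)

type ComponentKind int

const (
	ComponentInput ComponentKind = iota
	ComponentOutput
)

var ErrDeny = errors.New("deny")

// Capability is the predicate-style interface proposed above.
type Capability interface {
	TestFeature(kind ComponentKind, typeName string) error
}

// denyList rejects the named types of one component kind (illustrative).
type denyList struct {
	kind  ComponentKind
	names map[string]bool
}

func (d denyList) TestFeature(kind ComponentKind, typeName string) error {
	if kind == d.kind && d.names[typeName] {
		return ErrDeny
	}
	return nil
}

// filterInputs is the one central place that walks the inputs;
// the capabilities never see the surrounding structure.
func filterInputs(caps []Capability, inputTypes []string) (kept, denied []string) {
	for _, name := range inputTypes {
		rejected := false
		for _, c := range caps {
			if errors.Is(c.TestFeature(ComponentInput, name), ErrDeny) {
				rejected = true
				break
			}
		}
		if rejected {
			denied = append(denied, name)
		} else {
			kept = append(kept, name)
		}
	}
	return kept, denied
}

func main() {
	caps := []Capability{denyList{kind: ComponentInput, names: map[string]bool{"logfile": true}}}
	kept, denied := filterInputs(caps, []string{"system/metrics", "logfile"})
	fmt.Println("kept:", kept, "denied:", denied)
}
```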

x-pack/elastic-agent/pkg/capabilities/input.go


michalpristas added 25 commits January 26, 2021 09:01

base

e364446

test

51ff837

yaml parsing

c78b8fe

interim

5861d56

Merge branch 'master' of github.com:elastic/beats into agent-21000

e5c76d8

rename

d4f938d

upgrade logic done

cd7468f

expr

c21a7c7

input in progress

c1d244b

conditions

33f6329

multi inputs

595c253

ast to map, saving 80perc allocations

716c4d6

output

8667e67

logging

3214f2e

load form file

8ec47a0

Merge branch 'master' into agent-21000

03acaea

prepare tests of load

faacd96

tests load

2472827

tests load

0124574

injecting

8882956

reporter in

29f3413

refinement

b094755

unhealthy on filter

7fa392f

Merge branch 'master' of github.com:elastic/beats into agent-21000

0037255

gitignore revert

d53b4eb

michalpristas added the enhancement, needs_backport, Team:Ingest Management, and Team:Elastic-Agent labels Feb 4, 2021

michalpristas self-assigned this Feb 4, 2021

ruflin reviewed Feb 5, 2021


michalpristas added 2 commits February 5, 2021 16:20

subtree

4e23669

Merge branch 'master' of github.com:elastic/beats into agent-21000

5b0bdca

Merge branch 'master' of github.com:elastic/beats into agent-21000

75dc879

Merge branch 'master' of github.com:elastic/beats into agent-21000

9a337b9

urso reviewed Feb 8, 2021


ph mentioned this pull request Feb 8, 2021

[Fleet] Show agent as Unhealthy if agent reports an error about incompatible input(s) elastic/kibana#76841

Closed

michalpristas added 3 commits February 9, 2021 10:45

small refactor

58a04ce

Merge branch 'master' of github.com:elastic/beats into agent-21000

41443f9

lint

ea33dff

comment typo

02c88d8

blakerouse approved these changes Feb 10, 2021


michalpristas added 2 commits February 12, 2021 09:39

Merge branch 'master' of github.com:elastic/beats into agent-21000

6aa4ca0

Merge branch 'master' of github.com:elastic/beats into agent-21000

c18b5cd

michalpristas merged commit 282a7bc into elastic:master Feb 15, 2021

michalpristas added a commit to michalpristas/beats that referenced this pull request Feb 15, 2021

[Ingest Management] Agent supports capabilities definition (elastic#23848)

b62dc78

michalpristas mentioned this pull request Feb 15, 2021

Cherry-pick #23848 to 7.x: Agent supports capabilities definition #24037

Merged

6 tasks

urso reviewed Feb 15, 2021


michalpristas added a commit that referenced this pull request Feb 15, 2021

[Ingest Management] Agent supports capabilities definition (#23848) (#24037)

86a0bd8

ph mentioned this pull request Apr 4, 2022

Refactor: Use the capabilities to defined if an Agent is upgradable or not elastic/elastic-agent#290

Open

felixbarny mentioned this pull request Jun 18, 2021

add java attacher config elastic/apm-server#5483

Merged

2 tasks
