Elasticsearch output #137

faec · 2022-10-13T21:15:49Z

Port of the Elasticsearch output from Beats to the shipper.

Currently it accepts a Beats-style configuration tree, creates an ES client, reads batches from the queue, and sends them to the fixed elastic-agent-shipper index in the target cluster.

This pass does not include:

Support for workers / load balancing
Detailed batching / timeout / error recovery parameters
Most other configuration options that are tied to the internals of the Beats pipeline
Beats tests that depend on unimplemented features

Because this code has been uprooted from a completely different setting, much of it is nonfunctional. However, the intent is eventually to fully support the same configurations as the original, so rather than delete all these fields and functions I have left most of them either commented out or idle, so that (vast and trunkless) they can remind us of the work that is still ahead and where it needs to go.

mergify · 2022-10-13T21:16:22Z

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @faec? 🙏.
For such, you'll need to label your PR with:

The upcoming major version of the Elastic Stack
The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v8.\d.0 is the label to automatically backport to the 8.\d branch, where \d is the digit.

elasticmachine · 2022-10-13T21:19:05Z

💚 Build Succeeded

Build stats

Start Time: 2022-10-27T17:39:19.309+0000
Duration: 12 min 9 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

fearful-symmetry · 2022-10-24T21:25:01Z

controller/runner.go

+	if config.Console != nil && config.Console.Enabled {
+		return output.NewConsole(queue)
+	}
+	if config.Elasticsearch != nil {


Is there a reason we're just checking for != nil ? I'm imagining a scenario where someone sets output.elasticsearch.enabled: false and it still starts up or something.

The reason was that the Beats config structures didn't have an Enabled, and if we add one the shipper doesn't have a way to default its value to true the way Beats does. But new outputs will be coming soon so I guess we need to handle this anyway -- I added an enabled flag (which must now be explicitly specified in the config), and revised this check.
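
A minimal sketch of what the revised check could look like, assuming hypothetical `Config`, `ConsoleConfig`, and `ElasticsearchConfig` types and a hypothetical `selectOutput` helper (the shipper's real structs and constructors may differ). Because a plain Go `bool` is `false` when unset, an output whose `enabled` key is omitted stays off, which matches the "must now be explicitly specified" behavior described above.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical config types for illustration only; these are not the
// shipper's actual structs.
type ConsoleConfig struct {
	Enabled bool
}

type ElasticsearchConfig struct {
	Enabled bool
	Hosts   []string
}

type Config struct {
	Console       *ConsoleConfig
	Elasticsearch *ElasticsearchConfig
}

var errNoOutput = errors.New("no enabled output in config")

// selectOutput returns the name of the first output that is both present
// and explicitly enabled. A section that is present but not enabled is skipped.
func selectOutput(c Config) (string, error) {
	if c.Console != nil && c.Console.Enabled {
		return "console", nil
	}
	if c.Elasticsearch != nil && c.Elasticsearch.Enabled {
		return "elasticsearch", nil
	}
	return "", errNoOutput
}

func main() {
	// The section exists, but "enabled" was never set, so it is ignored.
	cfg := Config{Elasticsearch: &ElasticsearchConfig{Hosts: []string{"https://localhost:9200"}}}
	_, err := selectOutput(cfg)
	fmt.Println("without enabled:", err)

	cfg.Elasticsearch.Enabled = true
	name, _ := selectOutput(cfg)
	fmt.Println("with enabled:", name)
}
```

If defaulting to `true` were wanted later, one common Go approach is a `*bool` field, so that "unset" can be told apart from an explicit `false`.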

leehinman

Probably not for this PR, but it would be nice to get ES started from Docker and add it to the integration test.

output/elasticsearch/client.go

leehinman · 2022-10-25T18:46:33Z

output/elasticsearch/client.go

+	}
+
+	count := len(data)
+	failed := data[:0]


Suggested change

failed := data[:0]

failed := make([]*messages.Event, count)

nit

I added the other one, but I think this one makes sense as-is: there's no need to preallocate anything, especially a full-length buffer, since this array will usually (knock on wood 😜) be empty.

The slice is pre-allocating; that was part of why I proposed the change. The make is a little more declarative.

See https://go.dev/play/p/JXwF_bFUYgE

And of course I screwed up in the suggestion: it should be make([]*messages.Event, 0, len(data)); I forgot the 0 for the length. If we really expect failed to be empty, we could do a make with len and cap set to 0.

Sorry, could you elaborate on how an allocation is happening? As I understand it, failed := data[:0] creates a slice but doesn't allocate anything new, it just references the existing array data -- it should cause no heap allocations unless append is called, which creates a new underlying array if needed. I've seen this pattern used as a short way to create an auxiliary list that will often be empty, specifically because it avoids the allocation.

Sorry, could you elaborate on how an allocation is happening?

🤦 You're right, never mind.
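
For reference, a small sketch of the `data[:0]` pattern discussed in this thread, using a hypothetical `Event` type and `filterFailed` helper rather than the actual client code. The zero-length slice shares the backing array of `data`, so appending up to `len(data)` elements never allocates a new array.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for *messages.Event.
type Event struct{ ID int }

// filterFailed collects the events for which isFailed returns true.
// failed starts as a zero-length view of data's backing array, so the
// appends below reuse that array instead of allocating a new one.
// This is safe because the write position never passes the read position,
// but it does overwrite the front of data.
func filterFailed(data []*Event, isFailed func(*Event) bool) []*Event {
	failed := data[:0]
	for _, ev := range data {
		if isFailed(ev) {
			failed = append(failed, ev)
		}
	}
	return failed
}

func main() {
	data := []*Event{{ID: 1}, {ID: 2}, {ID: 3}, {ID: 4}}
	failed := filterFailed(data, func(ev *Event) bool { return ev.ID%2 == 0 })

	fmt.Println("failed events:", len(failed))
	fmt.Println("same capacity as data:", cap(failed) == cap(data))
	fmt.Println("same backing array:", &failed[0] == &data[0])
}
```

The trade-off is that the caller must not rely on the original contents of `data` afterwards, which is one reason `make([]*Event, 0, len(data))` can read as more explicit even though it allocates.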

leehinman · 2022-10-25T19:01:07Z

output/elasticsearch/json_read.go

+//
+// Due to parser simply stepping through the input buffer, almost no additional
+// allocations are required.
+type jsonReader struct {


do we have any unit tests for jsonReader?

we do not :-/

leehinman · 2022-10-25T19:13:15Z

output/elasticsearch/config.go

+// Config specifies all configurable parameters for the Elasticsearch output.
+// Currently these are identical to the parameters in the Beats Elasticsearch
+// output, however this is subject to change as we approach official release.
+type Config struct {


probably want some unit tests for the config validation.

The config validation is sparse and, as in Beats, we have no reliable way to evaluate whether a config is valid. For example, other than the baseline validation inherited from the network config, the ES output validation only checks that at most one of (apikey) and (username+password) is specified. There are many other ways to produce invalid configurations, but we only check them implicitly when the fields are used by the ES client.

All that said, I can still add a test that at least invokes what we have.
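
As an illustration of the one explicit check described above, here is a sketch with a hypothetical, trimmed-down `Config` struct and `Validate` method; the field names and error text are assumptions, not the shipper's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical subset of the Elasticsearch output settings,
// reduced to the fields involved in the authentication check.
type Config struct {
	APIKey   string
	Username string
	Password string
}

var errConflictingAuth = errors.New("cannot set both api_key and username/password")

// Validate rejects configs that specify both an API key and
// username/password credentials. Everything else is accepted here.
func (c Config) Validate() error {
	if c.APIKey != "" && (c.Username != "" || c.Password != "") {
		return errConflictingAuth
	}
	return nil
}

func main() {
	cases := []struct {
		name string
		cfg  Config
	}{
		{"no auth", Config{}},
		{"api key only", Config{APIKey: "id:key"}},
		{"basic auth only", Config{Username: "elastic", Password: "secret"}},
		{"both", Config{APIKey: "id:key", Username: "elastic", Password: "secret"}},
	}
	for _, tc := range cases {
		fmt.Printf("%-16s valid=%v\n", tc.name, tc.cfg.Validate() == nil)
	}
}
```

A table of cases like the one in `main` translates directly into a table-driven unit test, which would at least exercise the validation that exists today.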

One thing we could look at to get an idea of the "core" Elasticsearch client settings under agent is the Endpoint security ES client. It is written to support only the configuration that can be provided from an agent policy's output settings. It can be found here (private repository). It doesn't have any of the historical configuration that standalone Beats have accumulated.

This will mostly be the minimum set of parameters specified in the Fleet documentation https://www.elastic.co/guide/en/fleet/current/fleet-settings.html#output-settings. Note that users can put whatever they want in the "Advanced YAML configuration" block, but the UI doesn't validate anything and there is no guarantee that every process running under agent supports them. We likely need to clarify the backwards compatibility guarantees for Advanced YAML configuration, but I am reasonably confident that it is mostly used to specify workers and bulk_max_size for Beats.

cmacknz

Still going through this, but I found at least two features I think we can drop.

+1 to Lee's suggestion to get an integration test using Docker up, but we can do that as a follow-up task.

Since automated tests will come later, what are the steps to test this manually? Are there example Beat+Shipper configs you can share?

output/elasticsearch/dead_letter_selector.go

output/elasticsearch/elasticsearch.go

faec · 2022-10-26T17:50:09Z

Since automated tests will come later, what are the steps to test this manually? Are there example Beat+Shipper configs you can share?

The config I've been testing with is:

# filebeat.yml
...
output.shipper:
  server: "localhost:50052"

# elastic-agent-shipper.yml
...
output:
  elasticsearch:
    enabled: true
    hosts: ["https://localhost:9200"]
    username: "elastic"
    password: "[password]"
    allow_older_versions: true
    ssl.verification_mode: none
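
One possible way to confirm that events arrived, sketched under assumptions: it queries the Elasticsearch `_count` endpoint for the `elastic-agent-shipper` index named in the PR description, skips TLS verification to mirror `ssl.verification_mode: none`, and reads the password from a hypothetical `ES_PASSWORD` environment variable. The helper names `parseCount` and `countDocs` are illustrative.

```go
package main

import (
	"crypto/tls"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// The fixed index the shipper writes to, per the PR description.
const index = "elastic-agent-shipper"

// parseCount extracts the "count" field from a _count API response body.
func parseCount(body []byte) (int64, error) {
	var resp struct {
		Count int64 `json:"count"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, err
	}
	return resp.Count, nil
}

// countDocs asks the cluster how many documents the index holds.
func countDocs(client *http.Client, host, user, pass string) (int64, error) {
	req, err := http.NewRequest(http.MethodGet, host+"/"+index+"/_count", nil)
	if err != nil {
		return 0, err
	}
	req.SetBasicAuth(user, pass)
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s: %s", resp.Status, body)
	}
	return parseCount(body)
}

func main() {
	// Skipping certificate verification is for local testing only.
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	}}
	// ES_PASSWORD is a hypothetical variable name used for this sketch.
	n, err := countDocs(client, "https://localhost:9200", "elastic", os.Getenv("ES_PASSWORD"))
	if err != nil {
		fmt.Println("count failed:", err)
		return
	}
	fmt.Println("documents in", index+":", n)
}
```

Running it before and after starting Filebeat should show the count increasing if the shipper is ingesting.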

mergify · 2022-10-27T13:59:54Z

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b es-output-staging upstream/es-output-staging
git merge upstream/main
git push upstream es-output-staging

output/elasticsearch/client.go

output/elasticsearch/non_indexable_policy.go

cmacknz

We'll need to get issues created for all the follow-up work we need to do, but this is good as a starting point. Thanks!

faec added 8 commits August 11, 2022 12:54

prototyping ES output

cf32709

Merge branch 'main' into es-output

bad4725

add ES code to the repo

cabef2a

working on prototype

99d1523

making more things build

38a1f42

add a bunch more pieces

8e8ec6d

idk

dbe6af1

it builds

28ab8a2

faec self-assigned this Oct 13, 2022

jlind23 linked an issue Oct 17, 2022 that may be closed by this pull request

Implement an MVP of the Elasticsearch output #10

Closed

faec added 4 commits October 17, 2022 17:22

Merge branch 'main' of github.com:elastic/elastic-agent-shipper into …

5190662

…es-output-staging

propagate configs, start ES output from server

7de4aaa

a bunch more fiddling

d89ab82

it ingests data

0ae11df

faec marked this pull request as ready for review October 24, 2022 20:54

faec requested a review from a team as a code owner October 24, 2022 20:54

faec requested review from belimawr and leehinman and removed request for a team October 24, 2022 20:54

cleanup

9c995ed

fearful-symmetry reviewed Oct 24, 2022

faec added 7 commits October 24, 2022 17:27

lint

cbf62cf

lint

e672a9a

lint

d0a348a

lint

e28cd05

lint

a47c0e2

lint

f7df3b7

lint

3e08fef

faec added 3 commits October 25, 2022 12:38

add license headers

9fe0929

mage check

c8a27d8

Merge branch 'main' of github.com:elastic/elastic-agent-shipper into …

f9b5ca7

…es-output-staging

faec changed the title ~~Elasticsearch output draft~~ Elasticsearch output Oct 25, 2022

faec added 2 commits October 25, 2022 13:33

working on broken test

97a3bc8

fix unit test

32b131f

fearful-symmetry mentioned this pull request Oct 25, 2022

Implement occupied_retry metrics for the shipper #143

Closed

leehinman approved these changes Oct 25, 2022

cmacknz reviewed Oct 25, 2022

output/elasticsearch/dead_letter_selector.go (outdated)

output/elasticsearch/elasticsearch.go (outdated)

faec added 4 commits October 26, 2022 13:32

remove dead_letter_selector / NonIndexableAction

7310352

remove more idle code

bddfbbb

remove old init code with pipeline selectors and dead letter support

9119f29

remove more code that will not be used

0daab6d

faec and others added 3 commits October 26, 2022 14:17

Update output/elasticsearch/client.go

36c575c

Co-authored-by: Lee E Hinman <57081003+leehinman@users.noreply.github.com>

add config validation test

b0af58b

fix redundant import

73f907d

cmacknz reviewed Oct 27, 2022

output/elasticsearch/client.go (outdated)

cmacknz reviewed Oct 27, 2022

output/elasticsearch/non_indexable_policy.go (outdated)

cmacknz approved these changes Oct 27, 2022

faec added 3 commits October 27, 2022 13:30

remove more unneeded code

4fc420e

Comment the placeholder output configuration

98d2a6f

Merge branch 'main' of github.com:elastic/elastic-agent-shipper into …

5874354

…es-output-staging

faec merged commit 0f4c555 into elastic:main Oct 27, 2022

faec deleted the es-output-staging branch October 27, 2022 17:54

cmacknz mentioned this pull request Oct 31, 2022

Use the go-elasticsearch client for the Elasticsearch output #14

Closed
