Integrate with Elastic Agent #4004

Closed
14 of 15 tasks
axw opened this issue Jul 21, 2020 · 20 comments

@axw (Member) commented Jul 21, 2020

Things to investigate/consider

  • elastic-agent configuration
  • elastic-agent program spec
  • indexing strategy (e.g. include service name in data stream name? specific data sets for transaction metrics, breakdown metrics, etc.?), constant_keyword fields, reducing flexibility in configuration
  • how to read/write non-timeseries data (tail-based sampling, API Keys, source maps)
  • how to integrate with cloud-on-k8s (depends on working group) (WIP, tracked separately)
  • upgrade plan to 8.0 (will be tracked separately)
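
As a rough illustration of the constant_keyword idea mentioned in the indexing strategy bullet, such a field could be declared in a package's fields.yml along these lines (the field name and description are hypothetical, not the actual apm package definition):

```yaml
# Hypothetical fields.yml entry; constant_keyword stores a single value
# per index, which suits values that are constant within a data stream.
- name: processor.event
  type: constant_keyword
  description: Event type (e.g. transaction, span), constant within a data stream.
```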

Update 03/11, 08/12 (Juan)

Main lines of work required


Issues

This list is non-exhaustive. Check the comments for links to PRs related to the APM package.

Required

Closed issues

@jalvz (Contributor) commented Sep 10, 2020

@axw you mentioned that you had some details you wanted to add here, right? Or did I misinterpret?

@axw (Member, Author) commented Sep 14, 2020

@jalvz I had previously added a list, but I think I forgot to save and it was lost. I'll add things as I remember them 😅
Feel free to add things you think we should look at too.

@jalvz (Contributor) commented Oct 21, 2020

  • In Fleet / Ingest Manager, the top level entity in an integration is the data stream. Each integration can have 1 or more data streams. Each data stream has a type. Currently there are 2 types defined in the spec: logs and metrics.

  • Each data stream can have any number of inputs. Every input has its own settings, although there is a way to associate settings to a given data stream and have them applied to all its inputs. There is no way however to go up 1 level and have a setting mapped to the whole integration.

  • Data streams and inputs are defined in manifest.yml files for each package. Both data streams and inputs can be enabled or disabled. Kibana will parse the manifest file and display the information in 2 levels, with a toggle switch for each.
    This is how it looks for the Apache integration:

[screenshot: Apache integration settings in Kibana]

Note that the hosts setting applies to the whole metrics data stream, which currently only has 1 input (status metrics).
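
The structure described above can be sketched in a manifest.yml roughly like this (field names are illustrative and may not match the exact package-spec schema of the time):

```yaml
# Hypothetical data stream manifest.yml: one data stream of type
# "metrics" with a single input, whose settings Kibana renders
# with a toggle switch.
title: Apache status metrics
type: metrics            # spec currently defines two types: logs and metrics
streams:
  - input: apache/metrics
    vars:
      - name: hosts      # applies to the whole metrics data stream
        type: text
        multi: true
        default:
          - "http://127.0.0.1"
```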


Now we need to define an APM integration in terms of data streams and inputs with the structure described above. I'm going with the assumption that we will have only one APM integration for apm-server and APM agents.

Following the Indexing Strategy proposal, we are going to introduce a new traces type, and use both existing metrics and logs types. The most suitable option then is to treat APM agents as inputs.
Then, to configure e.g. the Java agent, one would find some settings under traces, others under metrics, etc.
We could further decide that the inputs under traces carry general and tracing-related settings, while the inputs under the other data streams (metrics, logs) have settings specific to that type only.

From there, 2 options come to mind:

  • We could adopt the convention of having apm-server configuration under traces, and make sure that the traces data stream can't be disabled in the UI. We would also need the UI to propagate that configuration even if no input is enabled (this is currently not supported in Ingest Manager).

A simplified version of this would look like:

[screenshot: simplified APM integration settings]

  • Alternatively, we could treat apm-server as a separate input. In that case its settings would be scattered across the data streams, as for APM agents; we would still need to pick one type for generic settings (which are almost all of them), and we would still need the UI to enforce that specific input (i.e. not allow users to disable apm-server).
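
To make the first option concrete, a minimal sketch of apm-server configuration attached to the traces data stream might look like this (variable names are hypothetical, not the actual package definition):

```yaml
# Hypothetical traces data stream manifest carrying apm-server settings;
# the UI would need to keep this data stream always enabled.
title: Traces
type: traces             # new type introduced by the Indexing Strategy proposal
streams:
  - input: apm
    vars:
      - name: host
        type: text
        default: "localhost:8200"
      - name: secret_token
        type: text
        required: false
```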

@jalvz (Contributor) commented Oct 21, 2020

@elastic/apm-server @felixbarny @ruflin looking for feedback on the comment above, if anything of what I wrote makes sense.

@graphaelli (Member) commented

I would vote for the simplest UI that gets APM Server and the Agents deployed. If possible, just a "Monitor applications" section with a toggle per language, deferring all opting in/out of trace, log, profile, and metric collection, and any configuration customization, to central configuration. Picking on these screenshots, I would not expect to see secret token, reporting interval, or max spans per transaction at this point in the integration setup.

@axw (Member, Author) commented Oct 22, 2020

I agree with @graphaelli's comments above, with the exception of maybe secret_token. That's something that's defined in the server's config, so I think it does belong in that UI... Maybe we shouldn't add it though, and instead only provide an option to enable/disable API Key auth?

I don't think it makes a heap of sense to opt in/out of individual data types for APM agents at this level, since it doesn't give a way of filtering by service. You might have multiple Java services, but want to enable logs for only a handful of them. This is different to existing integrations, where the integration is tied to a specific service (apache, nginx, etc.)

Going further, I think the only agent-related config that should go in the package should be to do with auto-instrumentation: whether to enable/disable per language, and perhaps some parameters like a pattern for matching process names. @felixbarny WDYT?

@jalvz (Contributor) commented Oct 22, 2020

Thanks all,

I don't intend to add any agent-related stuff at this point at all, but since we have to make room for it, I wanted to show how it could look. I don't have an opinion on what settings should be there; this was just an example.

There is no way around the toggle for the data types and inputs (that is what I tried to explain in standup yesterday @axw), I agree it doesn't make sense for us.

I guess my biggest question now is whether to treat apm-server as an input (confusing UI IMO), or just have all its settings under eg. the traces data stream (needs additional support in Ingest Manager to have those settings propagated downstream even if there are 0 inputs).

@felixbarny (Member) commented

I don't think it makes a heap of sense to opt in/out of individual data types for APM agents at this level

++
I think we should not constrain ourselves with sticking to an input per data stream. Instead, let's think about what would be an ideal solution from the perspective of the user.

I've created a document with a proposal of how to integrate agents in the workflow that includes some mockups:
https://docs.google.com/document/d/1b_Fy0KcgQBer_jQpgCjVaIbsg8--oGX_XqNy6mSONsU/edit?usp=sharing

I agree with @graphaelli's comments above, with the exception of maybe secret_token.

Do we need a secret token? When APM Server is installed locally, could we make it accept connections only from localhost? On k8s, can we inject (via a mutating webhook) the secret_token as an environment variable from a k8s secret?
See also this section of the proposal doc.
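
A sketch of what such an injection could produce in the pod spec, assuming the agent reads the token from an environment variable (ELASTIC_APM_SECRET_TOKEN is used by several APM agents; the Secret and image names here are made up):

```yaml
# Pod spec fragment a mutating webhook could patch in: the token comes
# from a k8s Secret rather than being hardcoded in the agent policy.
apiVersion: v1
kind: Pod
metadata:
  name: instrumented-app
spec:
  containers:
    - name: app
      image: my-app:latest       # hypothetical application image
      env:
        - name: ELASTIC_APM_SECRET_TOKEN
          valueFrom:
            secretKeyRef:
              name: apm-secret   # hypothetical Secret holding the token
              key: secret_token
```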

deferring all opting in/out of trace, log, profile, metric collection and any configuration customization to central configuration.

++
An exception to that rule is if we'd want users to set options that are not dynamic, meaning they have to be known upfront, before querying remote config.

@jalvz (Contributor) commented Oct 22, 2020

I think we should not constrain ourselves with sticking to an input per data stream.

Not sure if I understand, I think we are not doing that?

let's think about what would be an ideal solution from the perspective of the user

SGTM, but note that the proposal in the doc you linked (one data stream per agent, which is what I first thought of) might not be very compatible with how Ingest Manager works now. E.g., what would the type of the agent data stream be?
Depending on the release expectations, that can be more or less ambitious, so for now I just wanted to find the path of least resistance to move forward (progress over perfection and all that).

@felixbarny (Member) commented

In the very first iteration, I think it's probably fine to not have any agent related inputs. The next step would be to include documentation on how to install the agents (similar to the add data dialog). On top of that, we can add controls for auto attachment of agents.

@axw (Member, Author) commented Oct 22, 2020

Do we need a secret token? When APM Server is installed locally, could we make it accept connections only from localhost?

Good point, we probably don't need it to start with. Perhaps later we could orchestrate secrets (e.g. server generates secret_tokens and/or API Keys, and passes them to instrumented apps). It's probably not high priority since it's localhost-only.

@jalvz reiterating my statement from standup for posterity: I'm on board with squishing the config into an arbitrary (i.e. traces) data stream config as a way to move forward, and working with the Fleet team to improve the UX.

@axw (Member, Author) commented Oct 22, 2020

Copying a little bit of conversation from Slack. I failed to write down some thoughts I had swimming in my mind which led to some confusion.

We talked a bit about needing a way to configure APM Server for hosted cases, e.g. I may want to run an APM Server for RUM, Lambda, etc., where APM Server does not/cannot run on the same host. I'm thinking we might want two separate packages: one that is all about Hosted APM Server, and one that is focused more on APM Agents, where APM Server is configured transparently in order to service the agents.

@jalvz (Contributor) commented Oct 22, 2020

Ok, so let's elaborate that. The integrations page would offer something like:

[screenshot]

Click on the left one:

[screenshot]

Again, that is just an example. But I was also thinking that it would make sense (eventually) to include agent settings not supported by central config?
Anyways, move on to the APM Server integration:

[screenshot]

Now maybe it is more obvious what I mentioned about the confusing UI:

  • We have 2 levels ("Collect application traces" -> "APM Server") that we don't need. We can change the titles, but there is no way around the 2 levels. I wouldn't bet on a "simple" fix for that in Ingest Manager, but maybe I am wrong.
  • In addition to the problematic toggle switch for traces, we now have a problematic toggle switch for the APM Server input as well, which doesn't make sense.

Having 2 similar-yet-different ways to install/manage APM Server means that we need 2 ways of handling configuration coming from Elastic agent in the apm-server code, more documentation, user confusion, etc.

All in all, I think the option with best trade-offs is still the initial suggestion.

@jalvz (Contributor) commented Oct 22, 2020

Answering myself here and cross-posting my confusion: we might need the apm-server input after all, as otherwise templates will not be installed if there are no inputs...

@jalvz (Contributor) commented Nov 23, 2020

@jalvz (Contributor) commented Dec 1, 2020

Update 01 Dec

Work that was WIP at the time of the last status update has been finished. This means: Kibana support for top-level variables, a way for us to generate a package out of our fields.yml, and some code refactoring to have data streams and ILM not interfere with each other. This is needed e.g. to ensure that setup subcommands don't work when data streams are enabled.

The main body of work currently WIP is #4473. The most important thing that it brings to the package are pipelines.
APM Server uses pipelines to drop span metadata (usually not needed), add an ingest timestamp, and enrich events with user agent and geo data.
There was a misunderstanding: the apm package expected to define pipelines outside of data streams (following the described format), because some of them need to run for different event types. However, that feature is not actually implemented. We should be able to move forward by copying the pipelines to each relevant data stream, so this is not a blocker.
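
For reference, the enrichment described above could be expressed as a per-data-stream ingest pipeline along these lines (a simplified sketch, not the actual apm-server pipeline definitions):

```yaml
# Hypothetical ingest_pipeline/default.yml for one data stream.
description: Simplified APM enrichment pipeline
processors:
  - set:                       # add an ingest timestamp
      field: event.ingested
      value: "{{_ingest.timestamp}}"
  - user_agent:                # parse the raw user agent string
      field: user_agent.original
      ignore_missing: true
  - geoip:                     # enrich with geo data from the client IP
      field: client.ip
      target_field: client.geo
      ignore_missing: true
```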

The templates currently generated by the apm package seem all right. Once we wire the result of kibana/83878 down to the apm-server config reloader and verify that the whole thing (hopefully!) works, we can finally copy the package to elastic/integrations and take it from there.

@jalvz (Contributor) commented Dec 8, 2020

Update 08 Dec

Work that was WIP at the time of the last status update has been finished. It took a couple of rounds of fixes to get pipelines working, and with the addition of 4486 we now have all the pieces together.
Installation, ingestion, and the APM UI (with index setting changes) seem to work fine together, but more exhaustive testing is needed.

I opened apm-server issues to iterate on docs and testing, and to add icons/screenshots (to be worked on next). Also filed package-registry/659 to improve pipeline support in Fleet.

First package should come to the snapshot registry hopefully soon via package-storage/688, which is now waiting on a pending registry release.

@ph (Contributor) commented Dec 10, 2020

Thanks @jalvz for the updates!

@jalvz (Contributor) commented Dec 16, 2020

Update 16 Dec

Work that was WIP at the time of the last status update has been finished. We have the APM package version 0.1.0-dev.1 in the snapshot registry, and 0.1.0-dev.2 is on its way, bringing more configuration options (and an icon).

We identified and fixed a permission issue in Fleet, which now needs an enhancement to smooth out the experience for 7.12.

There is now support in apm-integration-testing to run apm-server under Agent, and we will continue to iterate on that in the upcoming weeks and flip the switch to run apm-server under Agent by default (that is, in apm-integration-testing).

There has also been good progress this week adding tests in apm-server for the package; and last but not least, a new proposal for sourcemapping is out for review (this already has a POC implementation).

In the server weekly today we decided not to release the APM package in 7.11, and instead use a whole release cycle to improve, test, and play with it. The reason is that the package as-is doesn't bring a huge benefit to existing APM users; they would just have fewer features and a bit more friction.
All code is in place and has been backported to 7.x.

Therefore I will close this issue soon and open a new meta issue for tracking 7.12 goals.

@jalvz (Contributor) commented Dec 16, 2020

Forgot to mention that there is an ongoing discussion about whether or not to bundle apm-server with Agent.

I'll close this now in favor of #4558, thanks for following 👋
