Improve and unify dimensions for Elastic-Agent and Beats metrics #8238

Merged
merged 2 commits into from
Nov 6, 2023

Conversation

belimawr
Contributor

@belimawr belimawr commented Oct 18, 2023

Proposed commit message

The metrics from Elastic-Agent package define some time series but
not enough dimensions for the entries to be unique. This commit adds
new mappings and dimensions to make all metrics events unique as well
as unifies the dimensions from Elastic-Agent and Beats.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

Why do we need those new dimensions?

The combination of the dimension fields and the timestamp field must be unique within a time series. When two events share the same dimension values and the same timestamp, Elasticsearch returns 409 Conflict for the duplicated event; this is what happens in the indices used by the Elastic-Agent integration.

This PR uses the following fields for metrics from Elastic-Agent and Beats:

  • agent.id: Each deployed Elastic-Agent has a unique ID that is present in every event generated by its components, so different Beats processes share the same ID.
  • component.id: Each component run by Elastic-Agent has a unique ID and runs in an independent process, which is enough to ensure that different processes using the same Beats binary have unique IDs.
  • metricset.name: For each component we collect more than one set of metrics, so some of them may have the same timestamp, which leads to collisions. Having the metricset name as a dimension ensures there are no collisions. For some Beats we collect at least the stats and state metricsets (the input configuration can be seen here). We also collect a json metricset.
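The effect of each dimension can be sketched in a few lines of Python. This is only an illustration of the uniqueness rule, not how Elasticsearch implements it: the helper `find_collisions`, the sample events, and their field values are all hypothetical.

```python
from collections import defaultdict


def find_collisions(events, dimensions):
    """Group events by (dimension values, timestamp) and return the
    groups that contain more than one event, i.e. the collisions."""
    groups = defaultdict(list)
    for event in events:
        key = tuple(event.get(d) for d in dimensions) + (event["@timestamp"],)
        groups[key].append(event)
    return {key: evs for key, evs in groups.items() if len(evs) > 1}


# Hypothetical events from one Elastic-Agent, all with the same timestamp.
TS = "2023-10-18T10:00:00Z"
events = [
    {"@timestamp": TS, "agent.id": "agent-1",
     "component.id": "filestream-default", "metricset.name": "stats"},
    {"@timestamp": TS, "agent.id": "agent-1",
     "component.id": "filestream-default", "metricset.name": "state"},
    {"@timestamp": TS, "agent.id": "agent-1",
     "component.id": "system/metrics-default", "metricset.name": "stats"},
]

# agent.id alone: all three events share one key.
print(len(find_collisions(events, ["agent.id"])))  # 1
# Adding component.id: the two filestream events still collide.
print(len(find_collisions(events, ["agent.id", "component.id"])))  # 1
# Adding metricset.name: every event has a unique key.
print(len(find_collisions(
    events, ["agent.id", "component.id", "metricset.name"])))  # 0
```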

Why was service.address removed?
service.address is frequently the same, because the address can be a named pipe on Windows, e.g.:

  "agent": {
    "name": "agent",
    "type": "metricbeat",
    "version": "8.9.1"
  },
  "service": {
    "name": "beat",
    "type": "beat",
    "address": "http://npipe/stats"
  }

#7977 contains more details about some possible collisions and how to avoid them.
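A quick way to see why a field like service.address does not help as a dimension is to count its distinct values across events that should belong to different time series. The sketch below assumes two hypothetical events from different components that both report the named-pipe address shown above; the helper `distinct_values` is illustrative only.

```python
def distinct_values(events, field):
    """Return the set of values a field takes across the given events."""
    return {event.get(field) for event in events}


# Hypothetical events from two different components on a Windows host.
events = [
    {"component.id": "filestream-default",
     "service.address": "http://npipe/stats"},
    {"component.id": "system/metrics-default",
     "service.address": "http://npipe/stats"},
]

# One distinct value: service.address cannot tell the components apart.
print(len(distinct_values(events, "service.address")))  # 1
# Two distinct values: component.id can.
print(len(distinct_values(events, "component.id")))  # 2
```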

How to test this PR locally

  1. Build the Elastic-Agent package
cd packages/elastic_agent
elastic-package -v build
  2. Deploy the stack
elastic-package stack up --version=8.12.0-SNAPSHOT -v -d
  3. Build a custom Elastic-Agent
    Clone the code from the branch fix-tsdb-metrics or the Elastic-Agent PR, then build
DEV=true EXTERNAL=false PACKAGES=tar.gz PLATFORMS=linux/amd64 mage -v package

Adjust the PLATFORMS according to your system.

  4. Create a new policy with system metrics and monitoring enabled
  5. Deploy your build of the Elastic-Agent. You can either install it or enroll and then execute as a normal user.
  6. Go to Kibana -> Discover, select the metrics-* dataview. The entries from data_stream.dataset: elastic_agent.* sent by the Elastic-Agent you built (you can also filter by host.name) should all contain a component.id field.
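If you prefer to check the last step on exported documents rather than by eye, a small script can do it. This is a sketch under assumptions: the documents are plain nested dictionaries (for example, copied from Discover as JSON), and the helper names and sample values are hypothetical.

```python
def get_field(doc, dotted_path):
    """Look up a dotted field path such as 'component.id' in a nested dict.
    Returns None if any part of the path is missing."""
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def missing_component_id(docs):
    """Return the elastic_agent.* documents that lack a component.id."""
    missing = []
    for doc in docs:
        dataset = get_field(doc, "data_stream.dataset") or ""
        if dataset.startswith("elastic_agent.") and get_field(doc, "component.id") is None:
            missing.append(doc)
    return missing


# Hypothetical sample documents.
docs = [
    {"data_stream": {"dataset": "elastic_agent.filebeat"},
     "component": {"id": "filestream-default"}},
    {"data_stream": {"dataset": "elastic_agent.metricbeat"}},  # no component.id
    {"data_stream": {"dataset": "system.cpu"}},  # not an elastic_agent.* dataset
]

print(len(missing_component_id(docs)))  # 1
```

An empty result means every elastic_agent.* document in the sample carries a component.id.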

Bonus: Testing with Logstash

Testing with Logstash has the advantage that Logstash logs 409s from Elasticsearch, making it clear whether the fields are correctly ingested in Elasticsearch. However, this PR alone is not enough to prevent the 409s; you also need to use the Elastic-Agent build mentioned above (see step 3).

First you need to configure Logstash to receive data from Elastic-Agent and send it to Elasticsearch. For this test the easiest way to get a Stack running is to use elastic-package, which also lets you deploy the custom elastic-agent integration from the PR mentioned above.

Here is a configuration for Logstash. It uses self-generated certificates and the default credentials from elastic-package.

logstash-sample.conf

input {
  elastic_agent {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/home/tiago/sandbox/tls/rootCA.crt"]
    ssl_certificate => "/home/tiago/sandbox/tls/server__server_cert.crt"
    ssl_key => "/home/tiago/sandbox/tls/server_cert.pkcs8.key"
    ssl_verify_mode => "force_peer"    
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    user => "elastic"
    password => "changeme"
    cacert => "/home/tiago/.elastic-package/profiles/default/certs/ca-cert.pem"
  }
}

  1. Once Logstash is up and running, in Kibana go to Fleet -> Settings and add a Logstash output.
  2. Create a new policy with system metrics and monitoring enabled
  3. Go to Fleet -> Agent policies -> -> Settings and change the output for integrations and monitoring to use Logstash
  4. Deploy the Elastic-Agent
  5. Ensure the fields component.id and component.binary are set for events that match data_stream.dataset: elastic_agent.*
  6. There should be no 409 error from Elasticsearch.

Important:
Make sure your Elastic-Agent can resolve the hostnames the elastic-package stack uses. On Linux, add this to your /etc/hosts:

 127.0.0.1 localhost elasticsearch fleet-server kibana
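To confirm the entry took effect, you can try resolving those hostnames from the machine running the Elastic-Agent. The following is a minimal sketch using only the Python standard library; the function name `unresolved_hosts` and its `resolve` parameter are illustrative, and the hostname list is taken from the /etc/hosts line above.

```python
import socket

# Hostnames from the /etc/hosts entry above.
STACK_HOSTNAMES = ["elasticsearch", "fleet-server", "kibana"]


def unresolved_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that cannot be resolved.

    `resolve` defaults to socket.gethostbyname, which honours /etc/hosts
    on Linux; it is a parameter so the check can be exercised offline.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

Calling `unresolved_hosts(STACK_HOSTNAMES)` on the agent host should return an empty list once the entry is in place.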


@elasticmachine

elasticmachine commented Oct 18, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-10-31T15:00:12.150+0000

  • Duration: 15 min 54 sec


@elasticmachine

elasticmachine commented Oct 18, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (0/0) 💚
Files 100.0% (0/0) 💚
Classes 100.0% (0/0) 💚
Methods 33.333% (28/84)
Lines 100.0% (0/0) 💚
Conditionals 100.0% (0/0) 💚

@belimawr belimawr marked this pull request as ready for review October 19, 2023 17:07
@belimawr belimawr requested a review from a team as a code owner October 19, 2023 17:07
@pierrehilbert pierrehilbert added the Team:Elastic-Agent Label for the Agent team label Oct 20, 2023
@elasticmachine

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@belimawr belimawr changed the title from "Fix dimensions for Elastic-Agent metrics" to "Improve and unify dimensions for Elastic-Agent and Beats metrics" Oct 27, 2023
@belimawr belimawr requested a review from rdner October 27, 2023 15:19
@belimawr belimawr requested a review from cmacknz October 31, 2023 14:55
The metrics from Elastic-Agent package define some time series but
not enough dimensions for the entries to be unique. This commit adds
new mappings and dimensions to make all metrics events unique as well
as unifies the dimensions from Elastic-Agent and Beats
Add missing mappings for some other Beats
@belimawr
Contributor Author

belimawr commented Nov 6, 2023

The Elastic-Agent PR (elastic/elastic-agent#3626) has been merged, and this PR adds new mappings/settings. I did a quick test and did not find any issues running an older version of Elastic-Agent with the changes introduced here.

@belimawr belimawr merged commit 494f42e into elastic:main Nov 6, 2023
4 checks passed
@elasticmachine

Package elastic_agent - 1.16.0 containing this change is available at https://epr.elastic.co/search?package=elastic_agent

Labels
Team:Elastic-Agent Label for the Agent team
Development

Successfully merging this pull request may close these issues.

[Elastic Agent] TSDB metrics dimensions do not create unique time series
5 participants