[QASource] Test collection of system process metrics under docker #39900

Open
fearful-symmetry opened this issue Jun 13, 2024 · 4 comments
Labels
QA:Needs Validation Needs validation by the QA Team Team:Fleet-QA

Comments

@fearful-symmetry
Contributor

In the past months, we've run into a considerable number of bugs when monitoring host metrics while running under docker. I'm writing up these test steps in the hope that they can become a regular set of tests run with every release.

Steps to test

  1. Run metricbeat via docker with the following:
 docker run --label co.elastic.metrics/module=system \
--mount type=bind,source=/proc,target=/hostfs/proc,readonly \
--mount type=bind,source=/sys/fs/cgroup,target=/hostfs/sys/fs/cgroup,readonly \
--mount type=bind,source=/,target=/hostfs,readonly \
--mount type=bind,source=/var/run/dbus/system_bus_socket,target=/hostfs/var/run/dbus/system_bus_socket,readonly \
--env DBUS_SYSTEM_BUS_ADDRESS='unix:path=/hostfs/var/run/dbus/system_bus_socket' \
--net=host --cgroupns=host docker.elastic.co/beats/metricbeat:VERSION_TO_TEST metricbeat -e -E output.elasticsearch.hosts='[ES_ENDPOINT]' -d '*'
  2. In Elasticsearch, ensure that there are documents with metricset.name matching process
  3. In the debug logs, ensure that there are no log lines that contain the strings:
    • Non fatal error fetching PID some info
    • Error fetching PID info for
    • GetInfoForPid:
  4. Repeat steps 1-3, but omit the --cgroupns=host flag:
 docker run --label co.elastic.metrics/module=system \
--mount type=bind,source=/proc,target=/hostfs/proc,readonly \
--mount type=bind,source=/sys/fs/cgroup,target=/hostfs/sys/fs/cgroup,readonly \
--mount type=bind,source=/,target=/hostfs,readonly \
--mount type=bind,source=/var/run/dbus/system_bus_socket,target=/hostfs/var/run/dbus/system_bus_socket,readonly \
--env DBUS_SYSTEM_BUS_ADDRESS='unix:path=/hostfs/var/run/dbus/system_bus_socket' \
--net=host docker.elastic.co/beats/metricbeat:VERSION_TO_TEST metricbeat -e -E output.elasticsearch.hosts='[ES_ENDPOINT]' -d '*'
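The debug-log check above can be scripted. The following is a minimal sketch, assuming the container's debug output has already been saved to a file (for example with `docker logs <container> > metricbeat.log 2>&1`). The function name `check_metricbeat_log` and the file name are hypothetical and not part of metricbeat.

```shell
#!/bin/sh
# Sketch: fail if a saved metricbeat debug log contains any of the error
# strings listed in the test steps. check_metricbeat_log is a hypothetical
# helper name, not something shipped with metricbeat.

check_metricbeat_log() {
  local logfile="$1"
  local found=0
  local pattern
  for pattern in \
    'Non fatal error fetching PID some info' \
    'Error fetching PID info for' \
    'GetInfoForPid:'
  do
    # -F matches the pattern as a fixed string, -q suppresses grep's output
    if grep -Fq -- "$pattern" "$logfile"; then
      echo "FOUND: $pattern"
      found=1
    fi
  done
  # Exit status 0 means none of the strings were found, 1 means at least one was
  return "$found"
}
```

Usage would look like `check_metricbeat_log metricbeat.log && echo "log is clean"`, run once for the log with --cgroupns=host and once for the log without it.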

Test Targets

This should be run under Docker on Linux, preferably across a range of Linux distros from our support matrix, including at least:

  • Ubuntu 16.04
  • Ubuntu 20.04
  • Ubuntu 24.04
  • RHEL 7, 8, and 9, if possible.
@fearful-symmetry fearful-symmetry added the QA:Needs Validation Needs validation by the QA Team label Jun 13, 2024
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 13, 2024
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jun 13, 2024
@elasticmachine
Collaborator

Pinging @elastic/fleet-qasource-external (Team:Fleet-QA)

@amolnater-qasource

Hi @fearful-symmetry

We tested this on the latest 8.15.0 SNAPSHOT Kibana cloud environment and have the following observations:

Observation Table:

| S.no. | Host OS | Data under metricbeat-* | Data under metricbeat-* without --cgroupns=host | Non fatal error fetching PID some info | Error fetching PID info for | GetInfoForPid: |
|---|---|---|---|---|---|---|
| 1 | Ubuntu 16.04 | Available | Available | No errors observed | No errors observed | No errors observed |
| 2 | Ubuntu 20.04 | Available | Available | No errors observed | No errors observed | No errors observed |
| 3 | Ubuntu 24.04 | Available | Available | No errors observed | No errors observed | No errors observed |
| 4 | RHEL 7 | AWS template not working | AWS template not working | NA | NA | NA |
| 5 | RHEL 8 | Available | Available | No errors observed | No errors observed | No errors observed |
| 6 | RHEL 9 | Available | Available | No errors observed | No errors observed | No errors observed |

Artifact used: docker.elastic.co/beats/metricbeat:8.15.0-ee48b214-SNAPSHOT metricbeat

We were also getting authentication errors, so we added credentials to the run command:

sudo docker run --label co.elastic.metrics/module=system \
--mount type=bind,source=/proc,target=/hostfs/proc,readonly \
--mount type=bind,source=/sys/fs/cgroup,target=/hostfs/sys/fs/cgroup,readonly \
--mount type=bind,source=/,target=/hostfs,readonly \
--mount type=bind,source=/var/run/dbus/system_bus_socket,target=/hostfs/var/run/dbus/system_bus_socket,readonly \
--env DBUS_SYSTEM_BUS_ADDRESS='unix:path=/hostfs/var/run/dbus/system_bus_socket' \
--net=host --cgroupns=host docker.elastic.co/beats/metricbeat:8.15.0-ee48b214-SNAPSHOT metricbeat -e -E output.elasticsearch.hosts='https://host-url:443' \
-E output.elasticsearch.username='elastic' \
-E output.elasticsearch.password='password' \
-d '*'
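Since the two test runs differ only in the --cgroupns=host flag, the command above could be wrapped so that both variants come from one place. This is a sketch only: `run_metricbeat` and the `ES_HOST`, `ES_USER`, `ES_PASS` and `DOCKER` variables are hypothetical names introduced for illustration, and the mounts are copied from the command above.

```shell
#!/bin/sh
# Sketch: run metricbeat under docker with or without --cgroupns=host.
# run_metricbeat, ES_HOST, ES_USER, ES_PASS and DOCKER are hypothetical names.

run_metricbeat() {
  local image="$1"          # e.g. docker.elastic.co/beats/metricbeat:VERSION_TO_TEST
  local cgroupns="${2:-}"   # "host" adds --cgroupns=host, anything else omits it

  # Abort early if the connection settings are missing
  : "${ES_HOST:?set ES_HOST}" "${ES_USER:?set ES_USER}" "${ES_PASS:?set ES_PASS}"

  set -- run --label co.elastic.metrics/module=system \
    --mount type=bind,source=/proc,target=/hostfs/proc,readonly \
    --mount type=bind,source=/sys/fs/cgroup,target=/hostfs/sys/fs/cgroup,readonly \
    --mount type=bind,source=/,target=/hostfs,readonly \
    --mount type=bind,source=/var/run/dbus/system_bus_socket,target=/hostfs/var/run/dbus/system_bus_socket,readonly \
    --env DBUS_SYSTEM_BUS_ADDRESS=unix:path=/hostfs/var/run/dbus/system_bus_socket \
    --net=host

  if [ "$cgroupns" = "host" ]; then
    set -- "$@" --cgroupns=host
  fi

  set -- "$@" "$image" metricbeat -e \
    -E "output.elasticsearch.hosts=${ES_HOST}" \
    -E "output.elasticsearch.username=${ES_USER}" \
    -E "output.elasticsearch.password=${ES_PASS}" \
    -d '*'

  # DOCKER can be set to "echo" to print the command instead of running it
  "${DOCKER:-docker}" "$@"
}
```

With the three `ES_*` variables exported, `run_metricbeat "$IMAGE" host` would cover the first run and `run_metricbeat "$IMAGE"` the run without the flag. Prefix the call with `sudo -E` or set `DOCKER` accordingly if docker needs elevated privileges on the host.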
  1. In elasticsearch, ensure that there are documents with metricset.name matching process

For this, we checked metricbeat-* under the Discover tab.

image

  1. in the debug logs, ensure that there are no log lines that contain the strings:

For this, we searched the CLI logs where metricbeat is running.

image

Logs with cgroups:
with cg.txt

Logs without cgroups:
without cg.txt

Further, could you please share a working AWS RHEL 7 template? With the AWS RHEL 7 templates we are using, the errors below are observed when running any install command.
image

Please let us know if we are missing anything here.

cc: @pierrehilbert

Thanks!!

@fearful-symmetry
Contributor Author

Looked at the logs, nothing seems suspicious.

@amolnater-qasource

Further, could you please share a working AWS RHEL 7 template? With the AWS RHEL 7 templates we are using, the errors below are observed when running any install command.

I've tested this myself entirely in local VMs, so I can't comment on any AWS-specific configs needed.

Based on the screenshot above, it looks like the Elasticsearch check may be incorrect. We need to check for the presence of documents with metricset.name = process, but the screenshot appears to show process.name = metricbeat, which matches events about the metricbeat process itself rather than events from the process metricset.
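As an alternative to checking in Discover, the document check can be run directly against Elasticsearch with the `_count` API. This is a sketch under assumptions: `ES_HOST`, `ES_USER` and `ES_PASS` are hypothetical variables holding the endpoint and credentials, the helper names are illustrative, and the index pattern metricbeat-* is taken from the observation table.

```shell
#!/bin/sh
# Sketch: count documents produced by the system "process" metricset.
# process_count_query and count_process_docs are hypothetical helper names;
# ES_HOST, ES_USER and ES_PASS are assumed to be set by the caller.

process_count_query() {
  # Term query on metricset.name, not on process.name
  printf '%s' '{"query":{"term":{"metricset.name":"process"}}}'
}

count_process_docs() {
  # Prints the raw JSON response from the _count API, which includes a
  # "count" field with the number of matching documents
  curl -sS -u "${ES_USER}:${ES_PASS}" \
    -H 'Content-Type: application/json' \
    "${ES_HOST}/metricbeat-*/_count" \
    -d "$(process_count_query)"
}
```

A non-zero `count` in the response would indicate that process metricset documents are arriving for that deployment. To check a single host, the query would need an additional filter, which is left out here.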

@amolnater-qasource

Hi @fearful-symmetry

Thank you for the update. We have applied the filter metricset.name : "process" under the Discover tab.

  • The data for this dataset is available for all 5 hosts.

Screen Captures:
image

Discover.-.Elastic.-.Google.Chrome.2024-06-21.10-32-40.mp4

Please let us know if we are still missing anything here.

Thanks!
