
[Stack Monitoring] update packages dataset #4018

Merged
6 commits merged into elastic:main on Sep 6, 2022

Conversation

klacabane
Contributor

@klacabane klacabane commented Aug 17, 2022

Summary

Closes #3929

This change updates the datasets of the Stack Monitoring packages to include a stack_monitoring part (e.g. elasticsearch.stack_monitoring.node). The identifier has two purposes: 1) clarify the intent of these data streams, and 2) free the stack products' namespaces for the upcoming Platform Observability (PO) initiative. Only the metrics are impacted, because logs will be collected the same way in both Stack Monitoring and Platform Observability.
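For illustration, here is how the rename affects a resulting data stream name; the node metricset and the default namespace are just examples, the concrete names come from each package's data streams:

    before: metrics-elasticsearch.node-default
    after:  metrics-elasticsearch.stack_monitoring.node-default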

The change also bumps the elasticsearch, kibana and logstash packages to their next major version while keeping their release as experimental until elastic/kibana#120415 is completed. This allows the packages to be built into the registry so that we don't have to build them locally by hand when testing. I thought that aligning the metrics mappings was a relevant milestone for the bump.

Testing

  • Start an elastic-package stack with elasticsearch, kibana and logstash packages installed:
    • We can automate the package installation by providing the right fleet configuration to kibana, and we can use elastic-package profiles to do that. Let's download a profile that does exactly that. You can skip this step if you want to install the packages manually:
    curl https://drive.google.com/uc\?export\=download\&id\=1aqdqNb9JaYXL-C3WjiT2t58Q65Cl4NIV -L -o /tmp/stack_monitoring-profile.zip && \
    unzip -o /tmp/stack_monitoring-profile.zip -d ~/.elastic-package/profiles && \
    rm /tmp/stack_monitoring-profile.zip
    
    • Now cd to the root of the integrations repository and let's build the packages, start the stack with the downloaded profile, and also start a logstash service with some predefined pipelines. The command may take a moment to complete; it is done once the logstash service is started and this log message appears: Service is up, please use ctrl+c to take it down
    (cd packages/elasticsearch && elastic-package build) && \
    (cd packages/kibana && elastic-package build) && \
    (
      cd packages/logstash && elastic-package build && \
      elastic-package stack up -v -d --profile stack_monitoring --version 8.5.0-SNAPSHOT && \
      elastic-package service up -v
    )
    
  • Start a local Kibana with this change: [Stack Monitoring] Add stack_monitoring suffix to metrics-* index pattern kibana#137904. See how to connect a local kibana
  • Open kibana at http://localhost:5602 and navigate to the Stack Monitoring app. The Elasticsearch, Kibana and Logstash sections should show up with all their views populated. Inspect all the views and report anything bizarre
  • Verify that every stack monitoring metrics-* data stream is formatted as metrics-{product}.stack_monitoring.{metricset} (a spot-check sketch follows this list)
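A minimal way to spot-check the naming from the command line, assuming the default elastic-package stack credentials (elastic/changeme) and Elasticsearch listening on https://localhost:9200 (jq is optional and only used to extract the names):

    curl -sk -u elastic:changeme "https://localhost:9200/_data_stream/metrics-*" | \
      jq -r '.data_streams[].name'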

@klacabane klacabane added the v8.5.0 and Team:Infra Monitoring UI labels Aug 17, 2022
@klacabane klacabane self-assigned this Aug 17, 2022
@elasticmachine

elasticmachine commented Aug 17, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-09-06T22:31:39.774+0000

  • Duration: 14 min 53 sec

Test stats 🧪

Test Results
Failed 0
Passed 62
Skipped 0
Total 62

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine

elasticmachine commented Aug 17, 2022

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (0/0) 💚
Files 100.0% (0/0) 💚 2.745
Classes 100.0% (0/0) 💚 2.745
Methods 49.462% (46/93) 👎 -40.185
Lines 100.0% (0/0) 💚 9.503
Conditionals 100.0% (0/0) 💚

@klacabane klacabane marked this pull request as ready for review August 17, 2022 15:24
@klacabane klacabane requested a review from a team as a code owner August 17, 2022 15:24
@crespocarlos
Contributor

For Logstash internal monitoring I had to add the lines below to packages/logstash/_dev/deploy/docker/config/logstash.yml

monitoring.enabled: true
monitoring.elasticsearch.hosts: https://elasticsearch:9200
monitoring.elasticsearch.username: elastic
monitoring.elasticsearch.password: changeme

And for some reason the docker container name is elastic-package-service-logstash-1, with hyphens instead of underscores.

I've tested this branch using local kibana with changes from elastic/kibana#137904 and everything looks ok, except for this standalone cluster. I'm trying to understand why this is there.
[screenshot]

I don't see some data streams like ml_job, but that's probably because I don't have any ML jobs running.

@crespocarlos
Contributor

crespocarlos commented Sep 5, 2022

Not related to this change, but enrich consistently fails due to lack of permission. I think the agent user lacks some privileges

[screenshot]

@klacabane
Contributor Author

klacabane commented Sep 5, 2022

For Logstash internal monitoring I had to add the lines below

Are these settings allowing the Agent to collect data, or are they publishing data to .monitoring-logstash?

And for some reason the docker container name is elastic-package-service-logstash-1, without _.

It seems dependent on the docker version you're running, which makes the profiles difficult to share and probably not the best approach when it comes to installing packages. @matschaffer had a similar issue when testing the elasticsearch package.

except for this standalone cluster

The logstash service defined in _dev contains a standalone pipeline so that is expected, unless the Standalone Cluster view breaks

Not related to this change but enrich consistently fails due to lack of permission. I think the agent user lacks some privileges

Interesting, I'm wondering if this happens in standalone metricbeat as well. We can track this as a separate ticket

Contributor

@crespocarlos crespocarlos left a comment


LGTM. I just couldn't verify ccr and enrich (because enrich fails).

The standalone cluster was fixed after setting monitoring.cluster_uuid in logstash.yml.
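A minimal sketch of that fix, assuming the same _dev docker config path mentioned above and a placeholder UUID (the real value is the cluster_uuid returned by a GET / request to Elasticsearch):

# hypothetical: associate standalone Logstash metrics with the monitored cluster
echo 'monitoring.cluster_uuid: "<elasticsearch-cluster-uuid>"' >> packages/logstash/_dev/deploy/docker/config/logstash.yml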

@crespocarlos
Contributor

crespocarlos commented Sep 5, 2022

Are these settings allowing the Agent to collect data or is it publishing data in .monitoring-logstash ?

I removed those config lines. The problem was actually that I used the wrong logstash host. Even with those lines in the config file, .monitoring-logstash remains empty.

@matschaffer
Contributor

matschaffer commented Sep 6, 2022

Glad to hear it wasn't the settings from #4018 (comment) - those would be for internal collection which wouldn't exercise the agent at all.

Also, the underscore vs. hyphen is a difference between docker-compose v1 (underscores) and v2 (hyphens). I was running v1 a while ago, then switched to v2 to see if it would help at all with stack up slowness. You can run docker-compose version to check (a quick sketch below).
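For reference, a minimal check; the container-name separator follows the major version reported here:

# v1 prints e.g. "docker-compose version 1.29.2, build ..." and joins container names with "_"
# v2 prints e.g. "Docker Compose version v2.x.y" and joins container names with "-"
docker-compose version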

This is part of why I'm a little skeptical of the "canned profile" approach and would like to work on better elastic-package level support for additional services used in testing. There's already code in there that deals with docker-compose differences so I think we can probably deal with it at the golang layer more easily.

@matschaffer
Contributor

matschaffer commented Sep 6, 2022

Wondering if we might have a new issue with the saved profile now. When I try to connect a main kibana it seems to fail at this stage:

[2022-09-06T13:12:35.789+09:00][INFO ][savedobjects-service] [.kibana] OUTDATED_DOCUMENTS_REFRESH -> UPDATE_TARGET_MAPPINGS. took: 34ms.
[2022-09-06T13:12:35.889+09:00][INFO ][savedobjects-service] [.kibana] UPDATE_TARGET_MAPPINGS -> UPDATE_TARGET_MAPPINGS_WAIT_FOR_TASK. took: 100ms.
{"log.level":"error","@timestamp":"2022-09-06T04:12:44.321Z","log":{"logger":"elastic-apm-node"},"ecs":{"version":"1.6.0"},"message":"APM Server transport error: intake response timeout: APM server did not respond within 10s of gzip stream finish"}
{"log.level":"error","@timestamp":"2022-09-06T04:13:34.146Z","log":{"logger":"elastic-apm-node"},"ecs":{"version":"1.6.0"},"message":"APM Server transport error: error fetching APM Server version: timeout (30000ms) fetching APM Server version"}
[2022-09-06T13:13:35.492+09:00][ERROR][savedobjects-service] [.kibana_task_manager] Action failed with '[timeout_exception] Timed out waiting for completion of [Task{id=19169, type='transport', action='indices:data/write/update/byquery', description='update-by-query [.kibana_task_manager_8.5.0_001]', parentTask=unset, startTime=1662437555506, startTimeNanos=1380955459148883}]'. Retrying attempt 1 in 2 seconds.
[2022-09-06T13:13:35.493+09:00][INFO ][savedobjects-service] [.kibana_task_manager] UPDATE_TARGET_MAPPINGS_WAIT_FOR_TASK -> UPDATE_TARGET_MAPPINGS_WAIT_FOR_TASK. took: 59993ms.

update: nm... just yet another potential manifestation of the low-docker-disk issue 🤦🏻

					"explanation": "the node is above the high watermark cluster setting [cluster.routing.allocation.disk.watermark.high=90%], having less than the minimum required [5.8gb] free space, actual free: [5.5gb], actual used: [90.5%]"

Contributor

@matschaffer matschaffer left a comment


This seems good to merge. The data stream names are as expected. I wish we had a more complete stack to test with (ccr, logstash, beats, monitoring), but I'm not sure how easy/hard that'd be with the current setup.

I'm catching some logstash docs that are showing up as "standalone", but they cause the UI to just look "blank".

[screenshots]

But I'm fairly certain that's up to elastic/kibana#137904 to fix.

One thing on my mind for this PR is the logs. I guess they don't really need a .stack_monitoring data stream though, since the shape won't be changing between now and platform observability.

@matschaffer
Contributor

Yeah, looks like those are connection errors.

[screenshot]

So we can fix that part up in kibana.

@matschaffer
Contributor

Ah, and the connection failures are due to elastic-package-service_logstash_1 (docker compose v1 style) :)

So many little things to iron out here but I think this PR is fine.

@klacabane
Contributor Author

Merging - the ghost standalone cluster will be fixed with elastic/kibana#140102

@klacabane klacabane merged commit aadbf6e into elastic:main Sep 6, 2022