Add elasticsearch/ingest_pipeline metricset #34012

joshdover · 2022-12-10T00:31:32Z

What does this PR do?

Depends on:

Add support for optional metricsets in xpack mode #34273

Ports elastic/integrations#4597 to Metricbeat. This adds a new ingest_pipeline metricset to the elasticsearch module of Metricbeat. The metricset does the following:

- Fetches the ingest metrics from the Nodes Stats API.
- Ingests two types of documents:
  - Top-level pipeline metrics are ingested on every interval, including the counters for total time, documents processed, and failures. A "self time" metric is also calculated by subtracting the time spent in processor calls to other pipelines from the pipeline's total time.
  - Processor-level metrics are ingested at a configurable sampling rate (25% by default, i.e. every 4th interval). For these metrics, a separate document is created for every processor in every pipeline, on every ES node, so sampling is used to limit the amount of data produced.
- Supports local and cluster scopes for better performance.
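
For illustration, the sampling strategy described above could be sketched with a simple fetch counter. This is a minimal sketch under assumptions, not the code from this PR: the `processorSampler` type, its fields, and the choice to emit on the last fetch of each cycle (rather than the first) are all hypothetical.

```go
package main

import "math"

// processorSampler decides on each fetch whether processor-level documents
// should be emitted. Hypothetical helper, not the PR's actual code.
type processorSampler struct {
	everyN  int // emit processor documents on every Nth fetch
	fetches int // number of fetches seen so far
}

// newProcessorSampler turns a sampling rate in (0, 1] into an "every Nth
// fetch" period; 0.25 becomes 4. Out-of-range rates fall back to 0.25,
// the default mentioned in the PR description (the fallback itself is an
// assumption of this sketch).
func newProcessorSampler(rate float64) *processorSampler {
	if rate <= 0 || rate > 1 {
		rate = 0.25
	}
	return &processorSampler{everyN: int(math.Round(1 / rate))}
}

// shouldSample reports whether the current fetch should include
// processor-level documents. Pipeline-level documents are emitted on
// every fetch regardless of this result.
func (s *processorSampler) shouldSample() bool {
	s.fetches++
	return s.fetches%s.everyN == 0
}
```

With a rate of 0.25, `shouldSample` returns true on the 4th, 8th, 12th, ... fetch, so processor-level documents are produced on one interval in four while pipeline-level documents are produced on all of them.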

The UI for visualizing this data will be included as a dashboard. Right now this is only being shipped in an Agent integration, with support for both Stack Monitoring and Metricbeat index patterns. A follow-up PR to Kibana will either add a link from the Stack Monitoring UI to this dashboard or direct the user to install the package to get the dashboard.

Here's a summary of the new fields that are added:

- name: ingest_pipeline
  type: group
  release: beta
  description: Runtime metrics on ingest pipeline execution
  fields:
    - name: name
      type: wildcard
      description: Name / id of the ingest pipeline
    - name: total
      type: group
      description: Metrics on the total ingest pipeline execution, including all processors.
      fields:
        - name: count
          type: long
          description: Number of documents processed by this pipeline
        - name: failed
          type: long
          description: Number of documents that failed to be processed by this pipeline
        - name: time.total.ms
          type: long
          description: Total time spent processing documents through this pipeline, inclusive of other pipelines called
        - name: time.self.ms
          type: long
          description: Time spent processing documents through this pipeline, exclusive of other pipelines called
    - name: processor
      type: group
      fields:
        - name: type
          type: keyword
          description: The type of ingest processor
        - name: type_tag
          type: keyword
          description: The type and the tag for this processor in the format "<type>:<tag>"
        - name: order_index
          type: long
          description: The order this processor appears in the pipeline definition
        - name: count
          type: long
          description: Number of documents processed by this processor
        - name: failed
          type: long
          description: Number of documents that failed to be processed by this processor
        - name: time.total.ms
          type: long
          description: Total time spent processing documents through this processor
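
To make the difference between `time.total.ms` and `time.self.ms` concrete, here is a minimal sketch of how a "self time" value could be derived. It assumes that calls to other pipelines show up as processors of type `pipeline`; the `processorStats` type and `selfTimeMs` function are hypothetical names, not the ones used in this PR.

```go
package main

// processorStats holds the per-processor values needed for the sketch.
// Hypothetical type, not the PR's actual data model.
type processorStats struct {
	Type   string // processor type, e.g. "set" or "pipeline"
	TimeMs int64  // total time spent in this processor, in milliseconds
}

// selfTimeMs subtracts the time spent in processors that call other
// pipelines from the pipeline's total time. Treating only processors of
// type "pipeline" as calls to other pipelines is an assumption.
func selfTimeMs(totalMs int64, processors []processorStats) int64 {
	self := totalMs
	for _, p := range processors {
		if p.Type == "pipeline" {
			self -= p.TimeMs
		}
	}
	return self
}
```

For a pipeline with a total time of 100 ms, a `set` processor taking 20 ms, and a `pipeline` processor taking 30 ms, this yields a self time of 70 ms.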

Why is it important?

With the adoption of Agent integrations, ingest pipeline performance is very important to overall ingest performance. Users have very little insight into this today, and giving them dashboards and metrics from the existing ES APIs is a good first step toward improving the situation.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

Add unit tests
Add monitoring mappings for es ingest metricset elasticsearch#92950
Determine how we want to handle xpack mode. By default, xpack mode enables all metricsets, but we want to ship this as beta, so we may need to modify this behavior
- Solved by Add support for optional metricsets in xpack mode #34273

How to test this PR locally

Related issues

Relates to [Stack Monitoring] Ingest pipeline monitoring kibana#41936

Use cases

Screenshots

Logs

mergify · 2022-12-10T00:32:08Z

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @joshdover? 🙏.
For such, you'll need to label your PR with:

The upcoming major version of the Elastic Stack
The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

elasticmachine · 2022-12-10T01:23:53Z

💚 Build Succeeded

Build stats

Start Time: 2023-01-27T13:38:35.125+0000
Duration: 58 min 33 sec

Test stats 🧪

Test	Results
Failed	0
Passed	4067
Skipped	887
Total	4954

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.
/package : Generate the packages and run the E2E tests.
/beats-tester : Run the installation tests with beats-tester.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

sonarcloud · 2022-12-10T01:52:06Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
3 Code Smells

No Coverage information
No Duplication information

sonarcloud · 2022-12-14T19:12:46Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
1 Code Smell

No Coverage information
No Duplication information

elasticmachine · 2023-01-16T13:56:22Z

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

ruflin

Overall LGTM. Left some minor comments. +1 on moving forward with it.

metricbeat/docs/modules/elasticsearch/ingest.asciidoc

metricbeat/helper/elastic/elastic.go

metricbeat/metricbeat.reference.yml

metricbeat/module/elasticsearch/ingest/_meta/data.json

metricbeat/module/elasticsearch/ingest/data_test.go

metricbeat/helper/elastic/elastic.go

mergify · 2023-01-25T00:19:00Z

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b es-ingest upstream/es-ingest
git merge upstream/main
git push upstream es-ingest

klacabane · 2023-01-27T11:58:19Z

metricbeat/module/elasticsearch/ingest_pipeline/ingest_pipeline.go

+// format. It publishes the event which is then forwarded to the output. In case
+// of an error set the Error field of mb.Event or simply call report.Error().
+func (m *IngestMetricSet) Fetch(report mb.ReporterV2) error {
+	shouldSkip, err := m.ShouldSkipFetch()


ShouldSkipFetch skips the fetch when the scope is ScopeNode and we're not talking to the master node. Most metricsets implement this because they hit an API that returns the global state from the master, so there is no need to collect data from individual nodes, but that does not seem to apply here.
IIUC we want to fetch individual node data when the scope is ScopeNode and hit the global API when it is ScopeCluster, so we should remove that call and implement our own logic (similar to the node_stats metricset).

Good catch, thank you. I think I can just remove this entirely as I'm already setting the URI correctly on the lines below this. Do you agree?

Sounds good
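
For readers following the thread, the scope-dependent request discussed above could look roughly like this. It is a minimal sketch, not the code from this PR: the `scope` type, its constants, and `ingestStatsPath` are hypothetical, and the paths are assumed from the Nodes Stats API, which accepts the `_local` and `_all` node filters and the `ingest` metric.

```go
package main

// scope mirrors the idea of the module's node and cluster scopes.
// Hypothetical type for this sketch only.
type scope int

const (
	scopeNode    scope = iota // collect stats only from the node being queried
	scopeCluster              // collect stats for all nodes through one endpoint
)

// ingestStatsPath picks the Nodes Stats API path for the given scope.
// With the node scope every Metricbeat instance asks its local node; with
// the cluster scope a single request covers all nodes, so no
// "skip unless master" check is needed in either case.
func ingestStatsPath(s scope) string {
	if s == scopeCluster {
		return "/_nodes/_all/stats/ingest"
	}
	return "/_nodes/_local/stats/ingest"
}
```

Because the path already restricts the response to the right set of nodes, the fetch can run unconditionally, which matches the conclusion reached in the review above.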

metricbeat/module/elasticsearch/ingest_pipeline/_meta/data.json

amitkanfer · 2023-01-27T15:38:55Z

🚢

botelastic bot added the needs_team label Dec 10, 2022

joshdover mentioned this pull request Dec 10, 2022

Add ingest_pipeline data streams and dashboards to Elasticsearch package elastic/integrations#4597

Merged

20 tasks

mergify bot assigned joshdover Dec 10, 2022

joshdover force-pushed the es-ingest branch from 1e2d82f to c3170df Compare December 10, 2022 01:16

joshdover force-pushed the es-ingest branch from c3170df to 3b25c82 Compare December 10, 2022 01:45

joshdover force-pushed the es-ingest branch from 3b25c82 to 015c6ac Compare December 10, 2022 10:56

joshdover changed the title ~~Add elasticsearch.ingest metricset~~ Add elasticsearch/ingest metricset Dec 12, 2022

joshdover added the enhancement, Team:Elastic-Agent, backport-skip, Team:Infra Monitoring UI - DEPRECATED, and v8.7.0 labels Dec 12, 2022

botelastic bot removed the needs_team label Dec 12, 2022

Add support for optional metricsets in xpack mode

c01bbf2

joshdover mentioned this pull request Jan 16, 2023

Add support for optional metricsets in xpack mode #34273

Merged

6 tasks

joshdover force-pushed the es-ingest branch from 28e2dfa to 17b9a06 Compare January 16, 2023 13:06

joshdover mentioned this pull request Jan 16, 2023

Add monitoring mappings for es ingest metricset elastic/elasticsearch#92950

Merged

joshdover marked this pull request as ready for review January 16, 2023 13:56

joshdover requested review from a team as code owners January 16, 2023 13:56

joshdover requested review from belimawr and rdner and removed request for a team January 16, 2023 13:56

joshdover requested a review from klacabane January 16, 2023 16:36

ruflin reviewed Jan 17, 2023

joshdover added 11 commits January 24, 2023 10:24

Remame pipeline time fields

00342d3

Make ingest metricset optional in xpack mode

fd056fd

Update x-pack metricbeat reference yml

9783b05

Update changelog and docs

399be78

Update fields

111a3b0

Rename metricset to ingest_pipeline

682f01f

Rename time fields to match node_stats metricset

60fae09

Add example event

e223a20

Fix docs mentioning host module

a13b2b7

Update docs and fields

35931c4

Remove useless unit test

d09de60

joshdover force-pushed the es-ingest branch from ae8c466 to d09de60 Compare January 24, 2023 09:24

joshdover mentioned this pull request Jan 24, 2023

[Stack Monitoring] Add new tab on ES monitoring page to link to Ingest Pipeline dashboard elastic/kibana#149386

Closed

Merge remote-tracking branch 'upstream/main' into es-ingest

2c0b9b2

joshdover added 2 commits January 25, 2023 10:00

Merge branch 'main' into es-ingest

f912521

Merge branch 'main' into es-ingest

605f3e6

klacabane reviewed Jan 27, 2023

metricbeat/module/elasticsearch/ingest_pipeline/_meta/data.json (outdated, resolved)

joshdover requested a review from a team as a code owner January 27, 2023 13:36

joshdover requested a review from klacabane January 27, 2023 13:36

joshdover added 2 commits January 27, 2023 14:37

Use realistic cluster and node ids

ed09ca6

Remove ShouldSkipFetch call

3f8dbed

joshdover force-pushed the es-ingest branch from a866415 to 3f8dbed Compare January 27, 2023 13:38

joshdover removed the request for review from a team January 27, 2023 13:38

klacabane approved these changes Jan 27, 2023

joshdover merged commit e355974 into elastic:main Jan 27, 2023

joshdover deleted the es-ingest branch January 27, 2023 15:37

chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023

Add elasticsearch/ingest_pipeline metricset (#34012)

691d09f
