Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect queue growth events and bytes metrics when PQ is enabled. #14554

Merged
merged 24 commits into from Oct 13, 2022

Conversation

mashhurs
Copy link
Contributor

@mashhurs mashhurs commented Sep 26, 2022

Release notes

  • Logstash introduced instance and pipeline level flow metrics, growth_bytes and growth_events for persisted queue to provide a better visibility about how fast pipeline queue is growing each single seconds.

What does this PR do?

Introducing instance and pipeline level flow metrics, growth_bytes and growth_events for persisted queue.

Why is it important/What is the impact to the user?

Refer to #14463 to get to know about motivation and importance.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • Check Lazy gauge initialization logics with Ry
  • Check integration tests CI before merging.

How to test this PR locally

  • Checkout the change locally and simply run the Logstash after assembling.
  • Set the pipeline with PQ enabled.
  • Call the node stats (or pipeline) API to check the growth flow rates.

Unit tests

$bin/rspec logstash-core/spec/logstash/api/modules/node_stats_spec.rb
$bin/rspec logstash-core/spec/logstash/api/commands/stats_spec.rb:184

Related issues

Closes #14556

Use cases

Screenshots

  • Pipeline level PQ flow metrics (IFF only for persisted)

Screen Shot 2022-10-12 at 2 08 45 PM

Logs

// not sure how to replicate but tested the log readability with mockups.
[2022-09-26T10:14:01,982][WARN ][org.logstash.execution.AbstractPipelineExt] Cannot initialize `growth_events` metric. Retrieving `events` number gauge from `[queue]` namespace(s) failed.
[2022-09-26T10:14:01,982][WARN ][org.logstash.execution.AbstractPipelineExt] Cannot initialize `growth_bytes` metric. Retrieving `queue_size_in_bytes` number gauge from `[queue, capacity]` namespace(s) failed.

@mashhurs mashhurs self-assigned this Sep 27, 2022
@mashhurs mashhurs changed the title Collect growth events and bytes metrics if PQ is enabled. Collect queue growth events and bytes metrics when PQ is enabled. Sep 27, 2022
@mashhurs mashhurs marked this pull request as ready for review September 28, 2022 20:46
@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this looks to be on the right track. I'll take an attempt at making it possible for a FlowMetric to take Supplier<Metric<? extends Number>, or some other way to create a flow metric that can tolerate the metrics not yet being populated.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 4, 2022

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

@mashhurs mashhurs mentioned this pull request Oct 5, 2022
4 tasks
@mashhurs mashhurs force-pushed the feature/flow-metrics-integration-pq branch from 0728109 to ab2cd72 Compare October 10, 2022 15:42
@cla-checker-service
Copy link

cla-checker-service bot commented Oct 10, 2022

💚 CLA has been signed

@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

@mashhurs mashhurs linked an issue Oct 10, 2022 that may be closed by this pull request
@mashhurs mashhurs changed the base branch from feature/flow-metrics-integration-pq to main October 10, 2022 16:55
@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

Enables interface `FlowMetrics::create` to take suppliers that _implement_
a `Metric<? extends Number>` instead of requiring them to be pre-cast, and
avoid unnecessary exposure of the metrics value-type into our lazy init.
Avoids routing two enum values through `MetricType#toString()`
and `String#equals()` when they can be compared directly.
@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

- Unit and integration tests are added or fixed.
- Documentation added along with sample response data
@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

@@ -534,6 +534,18 @@ A pipeline with only one single-threaded input may contribute up to 1.00, a pipe

Additionally, some amount of back-pressure is both _normal_ and _expected_ for pipelines that are _pulling_ data, as this back-pressure allows them to slow down and pull data at a rate its downstream pipeline can tolerate.

| `queue_persisted_growth_bytes` |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaauie, I need your review these important lines. Want to be sure every single point we are going to deliver to the end-user shouldn't be missed. Should be concise!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet. I had been working on the docs and have just rebased and made a PR against your branch -> mashhurs#3

mashhurs and others added 2 commits October 12, 2022 15:40
… section.


Update logstash-core/spec/logstash/api/commands/stats_spec.rb

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Mistake: `flow_status` should be `pipeline_flow_stats`

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦🏼 s/Number/Numeric/g

qa/integration/specs/reload_config_spec.rb Outdated Show resolved Hide resolved
qa/integration/specs/monitoring_api_spec.rb Outdated Show resolved Hide resolved
Number should be Numeric in the ruby specs.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

@mashhurs
Copy link
Contributor Author

Generated doc preview:
image

@mashhurs mashhurs requested a review from yaauie October 13, 2022 00:39
@@ -56,7 +56,7 @@ def block_until(limit_seconds, &condition)

before :all do
clear_data_dir
settings = mock_settings("config.reload.automatic" => true)
settings = mock_settings("config.reload.automatic" => true, "queue.type" => "persisted")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: don't enable the PQ for all of the shared tests in order to validate behavior of the PQ in just one of those tests. I recently worked with this shared context so I can see if I can find a way to limit the scope in which we have to incur the cost of spooling up PQ's.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR here -> mashhurs#4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great approach to apply specific setting through the context:

include_context "api setup", {"queue.type" => "persisted"}

@github-actions
Copy link
Contributor

📃 DOCS PREVIEWhttps://logstash_14554.docs-preview.app.elstc.co/diff

Copy link
Member

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 LGTM

@mashhurs mashhurs merged commit f19e9cb into elastic:main Oct 13, 2022
kares added a commit to kares/logstash that referenced this pull request Oct 17, 2022
* main: (131 commits)
Collect queue growth events and bytes metrics when PQ is enabled.
(elastic#14554)
  DRA: Handle env variables better (elastic#14644)
Follow up PR of elastic#14645, adds version qualifier to the plain version
variable (elastic#14646)
Avoid to pass SNAPSHOT particle to the version passed to
release-manager (elastic#14645)
  Disable -x in dra build scripts (elastic#14643)
  Enable debug for DRA shells scripts (elastic#14642)
  dra_upload.sh: Leave artifacts under build/ (elastic#14639)
  fix PipelineIR.getPostQueue by accounting for vertex copies (elastic#13621)
  Fix/dra use another technique to extract branch name (elastic#14636)
  Re-added execution rights to dra_upload.sh (elastic#14626)
  timestamp: respect locale's decimal-style when parsing (elastic#14628)
Switch branch selector from major.minor to read the current branch
name (elastic#14619)
  api: source pipelines that are fully-loaded (elastic#14595)
  Updates DRA scripts to build snapshot artifacts (elastic#14600)
DRA - Update scripts to use the version qualifier in stack_version
(elastic#14589)
Adds new close method to Java's Filter API to be used to clean
shutdown resources allocated by the filter during registration phase.
(elastic#14485)
  Fix DLQ fails to start due to read 1 byte file (elastic#14605)
Extract the branch name passed to release-manager from version file
(elastic#14592)
  Extended Flow Metrics (elastic#14571)
  update ci release version (elastic#14598)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pipeline Health API - PQ Metrics
3 participants