Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[reciever/kafkametricsreciever] Expand broker metrics #14167

Closed

Conversation

abeach-nr
Copy link

@abeach-nr abeach-nr commented Sep 16, 2022

Description:
This PR adds additional broker metrics for the kafka metrics receiver.

Link to tracking Issue:
#14166

Testing:
Manual testing, running the collector and looking at the metrics exported to a file

Documentation:
Updates to metadata.yaml, and auto generated docs

@abeach-nr abeach-nr requested review from a team and dmitryax as code owners September 16, 2022 15:23
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Sep 16, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

receiver/kafkametricsreceiver/broker_scraper.go Outdated Show resolved Hide resolved
receiver/kafkametricsreceiver/broker_scraper.go Outdated Show resolved Hide resolved
receiver/kafkametricsreceiver/broker_scraper.go Outdated Show resolved Hide resolved
receiver/kafkametricsreceiver/broker_scraper.go Outdated Show resolved Hide resolved
receiver/kafkametricsreceiver/broker_scraper.go Outdated Show resolved Hide resolved
receiver/kafkametricsreceiver/documentation.md Outdated Show resolved Hide resolved
@@ -9,6 +9,15 @@ These are the metrics available for this scraper.
| Name | Description | Unit | Type | Attributes |
| ---- | ----------- | ---- | ---- | ---------- |
| **kafka.brokers** | Number of brokers in the cluster. | {brokers} | Gauge(Int) | <ul> </ul> |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have a problem with metric names. According the the OTel spec metric name cannot be used as namespace for any other metrics. So we need to either pick another namespace for the added metrics or rename kafka.brokers through a deprecation process

Copy link
Author

@abeach-nr abeach-nr Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. I see this rule in the spec here:

https://github.com/open-telemetry/opentelemetry-specification/blob/ded25b221d8405e9dae4bc6aa293e9002c2aedbe/specification/common/attribute-naming.md#attribute-naming.

I would prefer to change the added metrics namespace to avoid deprecation. Let me review the spec and other receivers and I can change the namespace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss options here. I don't see a big problem in renaming kafka.brokers metric if we don't find a good alternative for the namespace

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe kafka.brokers.count would be the best option but happy to see other suggestions

Copy link
Author

@abeach-nr abeach-nr Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@abeach-nr abeach-nr Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure of how to handle the rename/deprecation. This would be a breaking change and I don't see how to include that in the unreleased/*yaml file.

I added a new attribute kafka.brokers.count that duplicates the kafka.brokers with a deprecation warning log, but am unsure if this is the correct way to handle this. Any guidance or links to spec docs would be helpful for renaming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's handle it in a separate PR and break it down into several releases, let's allow 2 versions for every step:

  1. Introduce kafka.brokers.count as a new optional metric, add warning if kafka.brokers is enabled which is by default.
  2. Enable kafka.brokers.count and disable kafka.brokers by default.
  3. Remove kafka.brokers metric

@abeach-nr abeach-nr force-pushed the expand-kafka-broker-metrics branch 2 times, most recently from 05ec6a5 to 835f2a4 Compare October 5, 2022 16:00
@abeach-nr abeach-nr force-pushed the expand-kafka-broker-metrics branch 3 times, most recently from 3fc7f13 to fac8916 Compare October 18, 2022 15:12
@abeach-nr
Copy link
Author

tests keep failing. I have been rebasing main to keep the branch up to date

=== RUN   TestStalenessMarkersEndToEnd
    staleness_end_to_end_test.go:240: 
        	Error Trace:	/Users/abeach/go/src/github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prometheusreceiver/internal/staleness_end_to_end_test.go:240
        	Error:      	Should be true
        	Test:       	TestStalenessMarkersEndToEnd
        	Messages:   	Expected at least 1 sample
--- FAIL: TestStalenessMarkersEndToEnd (6.89s)

@abeach-nr abeach-nr force-pushed the expand-kafka-broker-metrics branch 2 times, most recently from 14f66e6 to 36f2ea0 Compare December 7, 2022 17:54
@runforesight
Copy link

runforesight bot commented Dec 7, 2022

Foresight Summary

    
Major Impacts

build-and-test duration(27 minutes 30 seconds) has decreased 37 minutes 40 seconds compared to main branch avg(1 hour 5 minutes 10 seconds).
View More Details

✅  tracegen workflow has finished in 1 minute 8 seconds (2 minutes 10 seconds less than main branch avg.) and finished at 7th Dec, 2022.


Job Failed Steps Tests
build-dev -     🔗  N/A See Details
publish-latest -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

⭕  build-and-test-windows workflow has finished in 7 seconds (43 minutes 28 seconds less than main branch avg.) and finished at 10th Mar, 2023.


Job Failed Steps Tests
windows-unittest-matrix -     🔗  N/A See Details
windows-unittest -     🔗  N/A See Details

✅  telemetrygen workflow has finished in 1 minute 3 seconds (1 minute 3 seconds less than main branch avg.) and finished at 10th Mar, 2023.


Job Failed Steps Tests
publish-latest -     🔗  N/A See Details
build-dev -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details

✅  check-links workflow has finished in 1 minute 29 seconds and finished at 10th Mar, 2023.


Job Failed Steps Tests
changed files -     🔗  N/A See Details
check-links -     🔗  N/A See Details

✅  changelog workflow has finished in 2 minutes 17 seconds and finished at 10th Mar, 2023.


Job Failed Steps Tests
changelog -     🔗  N/A See Details

❌  build-and-test workflow has finished in 27 minutes 30 seconds (37 minutes 40 seconds less than main branch avg.) and finished at 10th Mar, 2023. 3 jobs failed. There are 1 test failures.


Job Failed Steps Tests
unittest-matrix (1.20, extension) Run Unit Tests     🔗  ✅ 187  ❌ 1  ⏭ 0    🔗 See Details
unittest-matrix (1.20, internal) -     🔗  ✅ 166  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, extension) -     🔗  ✅ 384  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, receiver-0) -     🔗  ✅ 101  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, other) -     🔗  ✅ 0  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, internal) -     🔗  ✅ 483  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, other) -     🔗  ✅ 0  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, processor) -     🔗  ✅ 762  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, receiver-1) -     🔗  ✅ 300  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-0) -     🔗  ✅ 292  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, processor) -     🔗  ✅ 762  ❌ 0  ⏭ 0    🔗 See Details
correctness-metrics -     🔗  ✅ 2  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, receiver-1) -     🔗  ✅ 300  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.19, exporter) -     🔗  ✅ 593  ❌ 0  ⏭ 0    🔗 See Details
unittest-matrix (1.20, exporter) -     🔗  ✅ 593  ❌ 0  ⏭ 0    🔗 See Details
correctness-traces -     🔗  ✅ 17  ❌ 0  ⏭ 0    🔗 See Details
integration-tests -     🔗  ✅ 55  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details
check-collector-module-version -     🔗  N/A See Details
lint-matrix (receiver-0) -     🔗  N/A See Details
lint-matrix (receiver-1) -     🔗  N/A See Details
lint-matrix (processor) -     🔗  N/A See Details
lint-matrix (exporter) -     🔗  N/A See Details
lint-matrix (extension) -     🔗  N/A See Details
lint-matrix (internal) -     🔗  N/A See Details
lint-matrix (other) -     🔗  N/A See Details
build-examples -     🔗  N/A See Details
check-codeowners -     🔗  N/A See Details
checks -     🔗  N/A See Details
unittest (1.20) Interpret result     🔗  N/A See Details
unittest (1.19) Interpret result     🔗  N/A See Details
lint -     🔗  N/A See Details
cross-compile -     🔗  N/A See Details
build-package -     🔗  N/A See Details
windows-msi -     🔗  N/A See Details
publish-stable -     🔗  N/A See Details
publish-check -     🔗  N/A See Details
publish-dev -     🔗  N/A See Details

✅  prometheus-compliance-tests workflow has finished in 10 minutes 51 seconds (⚠️ 3 minutes 6 seconds more than main branch avg.) and finished at 10th Mar, 2023.


Job Failed Steps Tests
prometheus-compliance-tests -     🔗  ✅ 21  ❌ 0  ⏭ 0    🔗 See Details

✅  load-tests workflow has finished in 15 minutes 44 seconds (⚠️ 2 minutes 5 seconds more than main branch avg.) and finished at 10th Mar, 2023.


Job Failed Steps Tests
loadtest (TestTraceAttributesProcessor) -     🔗  ✅ 3  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestIdleMode) -     🔗  ✅ 1  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetric10kDPS|TestMetricsFromFile) -     🔗  ✅ 6  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceNoBackend10kSPS|TestTrace1kSPSWithAttrs) -     🔗  ✅ 8  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestMetricResourceProcessor|TestTrace10kSPS) -     🔗  ✅ 12  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestTraceBallast1kSPSWithAttrs|TestTraceBallast1kSPSAddAttrs) -     🔗  ✅ 10  ❌ 0  ⏭ 0    🔗 See Details
loadtest (TestBallastMemory|TestLog10kDPS) -     🔗  ✅ 18  ❌ 0  ⏭ 0    🔗 See Details
setup-environment -     🔗  N/A See Details

✅  e2e-tests workflow has finished in 17 minutes 56 seconds (⚠️ 2 minutes 46 seconds more than main branch avg.) and finished at 10th Mar, 2023.


Job Failed Steps Tests
kubernetes-test (v1.26.0) -     🔗  N/A See Details
kubernetes-test (v1.24.7) -     🔗  N/A See Details
kubernetes-test (v1.25.3) -     🔗  N/A See Details
kubernetes-test (v1.23.13) -     🔗  N/A See Details

🔎 See details on Foresight

*You can configure Foresight comments in your organization settings page.

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Please rebase and address a couple of comments. Otherwise, it should be good to go

@abeach-nr abeach-nr force-pushed the expand-kafka-broker-metrics branch 2 times, most recently from f0ce5a6 to 9d16cbd Compare February 23, 2023 16:36
@@ -48,6 +48,7 @@ var newMetricsReceiver = func(
params receiver.CreateSettings,
consumer consumer.Metrics,
) (receiver.Metrics, error) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -59,6 +59,7 @@ func createMetricsReceiver(
cfg component.Config,
nextConsumer consumer.Metrics) (receiver.Metrics, error) {
c := cfg.(*Config)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -29,6 +29,8 @@ import (
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/kafkametricsreceiver/internal/metadata"
)

type saramaMetrics map[string]map[string]interface{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is sarama a meaningful name here?

@dmitryax
Copy link
Member

@atoulme, thank you for bumping this PR :)

@abeach-nr I lost it, sorry. Could you please rebase?

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Jun 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants