feat: ensure OTel collector is able to export metrics and traces #298

Merged: leninmehedy merged 5 commits into main from 174-export-metrics-and-traces on Sep 7, 2023

Conversation


@leninmehedy leninmehedy commented Aug 29, 2023

Description

This pull request changes the following:

  • Enables node metrics (from port 9999) to be exposed through otel-collector's Prometheus port 8889.
  • Removes otel-collector's hostmetrics to reduce noise (we may need them again later).
  • Enables traces to be exported using the OTLP exporter (to Tempo by default).
  • Adds helper scripts to deploy a telemetry stack (Grafana + Tempo + Prometheus) for local testing.
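
The changes above correspond roughly to a collector configuration with one metrics pipeline and one traces pipeline. The following is a minimal sketch, not the chart's actual configuration: ports 9999 and 8889 come from the description, while the job name `network-node`, the `localhost` scrape target, the scrape interval, the Tempo endpoint `tempo:4317`, and the use of an OTLP receiver for traces are assumptions made for illustration. It emits JSON, which is also valid YAML.

```python
import json


def build_collector_config(node_metrics_port=9999,
                           export_port=8889,
                           tempo_endpoint="tempo:4317"):
    """Return a hypothetical OTel collector config as a plain dict."""
    return {
        "receivers": {
            # Scrape the node's metrics endpoint (port from the PR description).
            "prometheus": {
                "config": {
                    "scrape_configs": [
                        {
                            "job_name": "network-node",  # assumed name
                            "scrape_interval": "15s",    # assumed interval
                            "static_configs": [
                                {"targets": [f"localhost:{node_metrics_port}"]}
                            ],
                        }
                    ]
                }
            },
            # Assumed: traces arrive at the collector over OTLP/gRPC.
            "otlp": {"protocols": {"grpc": {}}},
        },
        "exporters": {
            # Re-expose the scraped metrics on the collector's Prometheus port.
            "prometheus": {"endpoint": f"0.0.0.0:{export_port}"},
            # Forward traces to Tempo; endpoint and insecure TLS are assumptions.
            "otlp": {"endpoint": tempo_endpoint, "tls": {"insecure": True}},
        },
        "service": {
            "pipelines": {
                # No hostmetrics receiver, matching its removal in this PR.
                "metrics": {"receivers": ["prometheus"],
                            "exporters": ["prometheus"]},
                "traces": {"receivers": ["otlp"], "exporters": ["otlp"]},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_collector_config(), indent=2))
```

Keeping the ports as parameters makes it easy to compare the sketch against whatever values the chart actually uses.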

Notes

  • A local Prometheus instance can be deployed to scrape port 8889 directly.
  • Prometheus remote write can be used to export the metrics to Grafana Cloud or another Prometheus instance.
  • `make deploy-telemetry-stack` deploys the telemetry stack (Grafana + Tempo + Prometheus) for local testing.
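
A quick way to check that port 8889 can be scraped directly is to fetch the endpoint and list the metric names it serves. This is a sketch under assumptions not stated in the PR: that the collector is reachable at `localhost:8889` (for example through a port-forward) and that it serves the Prometheus text format at `/metrics`. The metric names in the usage note are hypothetical.

```python
import urllib.request


def parse_metric_names(exposition_text):
    """Extract the sorted, unique metric names from Prometheus text format."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{labels} value [timestamp]
        name = line.split("{", 1)[0].split(None, 1)[0]
        names.add(name)
    return sorted(names)


def fetch_metric_names(url="http://localhost:8889/metrics", timeout=5.0):
    """Fetch the endpoint (assumed URL) and return the metric names found."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metric_names(resp.read().decode("utf-8"))
```

For example, `parse_metric_names('foo_total{a="b"} 1\nbar 2.5')` returns `['bar', 'foo_total']`; calling `fetch_metric_names()` with the stack running should return a non-empty list if the node metrics are flowing through the collector.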

Limitations

  • Prometheus alerting is not available yet; it requires deploying additional pods and services, which will be done in separate PRs.
  • Grafana dashboards need to be imported manually.
  • I wasn't able to configure the OTel collector to post data to Grafana Cloud (Tempo). Since this is not required, I stopped that effort. Instead, a user can deploy Grafana + Tempo in the cluster and get traces for local testing.

Related Issues

@leninmehedy changed the title from "feat: enable node metrics port through open-collector" to "feat: enable node metrics and tracing through open-collector" Aug 29, 2023
@leninmehedy changed the title from "feat: enable node metrics and tracing through open-collector" to "feat: enable node metrics port and tracing through otel-collector" Aug 29, 2023
@leninmehedy changed the title from "feat: enable node metrics port and tracing through otel-collector" to "fix: enable node metrics port and tracing through otel-collector" Aug 29, 2023
@leninmehedy changed the title from "fix: enable node metrics port and tracing through otel-collector" to "fix: ensure OTel collector is able to export metrics and traces" Aug 29, 2023
@leninmehedy changed the title from "fix: ensure OTel collector is able to export metrics and traces" to "fix: ensure OTel collector is able to export metrics" Aug 29, 2023
@leninmehedy leninmehedy marked this pull request as ready for review August 29, 2023 03:45

github-actions bot commented Aug 29, 2023

Unit Test Results

16 files  ±0  16 suites  ±0   31s ⏱️ -1s
85 tests ±0  82 ✔️ ±0  3 💤 ±0  0 ±0 
86 runs  ±0  83 ✔️ ±0  3 💤 ±0  0 ±0 

Results for commit 780ea1c. ± Comparison against base commit 230b291.

♻️ This comment has been updated with latest results.

@leninmehedy changed the title from "fix: ensure OTel collector is able to export metrics" to "fix: ensure OTel collector is able to export metrics and traces" Sep 1, 2023
@leninmehedy changed the title from "fix: ensure OTel collector is able to export metrics and traces" to "feat: ensure OTel collector is able to export metrics and traces" Sep 1, 2023
Signed-off-by: Lenin Mehedy <lenin.mehedy@swirldslabs.com>
Signed-off-by: Lenin Mehedy <lenin.mehedy@swirldslabs.com>
Signed-off-by: Lenin Mehedy <lenin.mehedy@swirldslabs.com>
…debugging

Signed-off-by: Lenin Mehedy <lenin.mehedy@swirldslabs.com>
Signed-off-by: Lenin Mehedy <lenin.mehedy@swirldslabs.com>

sonarcloud bot commented Sep 7, 2023

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
No Duplication information

@leninmehedy leninmehedy merged commit 8c24733 into main Sep 7, 2023
10 checks passed
@leninmehedy leninmehedy deleted the 174-export-metrics-and-traces branch September 7, 2023 20:43
swirlds-automation added a commit that referenced this pull request Sep 14, 2023
## [0.8.0](v0.7.0...v0.8.0) (2023-09-14)

### Features

* add hedera node explorer as a conditional sub chart ([#275](#275)) ([c4eb8d7](c4eb8d7))
* add helm chart tests to check haproxy and envoy proxy deployments ([#319](#319)) ([08c3f85](08c3f85))
* add helm chart tests to validate state of sidecars ([#314](#314)) ([e04de57](e04de57))
* add optional Gateway API resource definition to expose endpoints ([#280](#280)) ([9b9effd](9b9effd))
* allow pod-monitor-role to have access to secrets ([#309](#309)) ([75e66c3](75e66c3))
* convert helm client test source set to a module ([#300](#300)) ([2fcb3c3](2fcb3c3))
* enable locally building and loading kubectl-bats docker image into the cluster for helm tests ([#310](#310)) ([230b291](230b291))
* ensure OTel collector is able to export metrics and traces ([#298](#298)) ([8c24733](8c24733))
* expose envoy-proxy prometheus metrics endpoint ([#328](#328)) ([f052481](f052481))
* expose prometheus metrics from HAProxy ([#320](#320)) ([8d4f51c](8d4f51c))
* introduces the build-logic project instead of buildSrc ([#327](#327)) ([aeb4c48](aeb4c48))
* parameterize gateway port mapping using values file ([#339](#339)) ([0c16855](0c16855))
* support higher number of nodes with consistent gateway port mapping ([#337](#337)) ([8b44410](8b44410))
* switch from JPMS module conventions to project level plugin ([#312](#312)) ([b3a7429](b3a7429))
* update Open Telemetry configuration to support optional remote write pushing of metrics and log ([#281](#281)) ([f5aadc5](f5aadc5))

### Bug Fixes

* lock the semantic release toolchain to specific known working versions ([#342](#342)) ([5eb8600](5eb8600))
* resolves the accidental mangling of Helm subchart versions on release ([#285](#285)) ([4e6a539](4e6a539))
* use 9090 as prometheus port across services for consistency ([#330](#330)) ([5052945](5052945))
@swirlds-automation

🎉 This PR is included in version 0.8.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

jeromy-cannon pushed a commit that referenced this pull request Sep 14, 2023
Signed-off-by: Lenin Mehedy <lenin.mehedy@swirldslabs.com>
Signed-off-by: Jeromy Cannon <jeromy@swirldslabs.com>
jeromy-cannon pushed a commit that referenced this pull request Sep 14, 2023
Signed-off-by: Jeromy Cannon <jeromy@swirldslabs.com>
Development

Successfully merging this pull request may close these issues.

Charts[Network Node]: Ensure OTel collector is able to export metrics and traces
4 participants