[Summer 2023] Add new client metrics observability. #10377

Open
KomachiSion opened this issue Apr 24, 2023 · 2 comments
@KomachiSion
Collaborator

The Open Source Promotion Plan (OSPP) is a summer open source program initiated and supported over the long term by the "Open Source Software Supply Chain Promotion Plan" of the Institute of Software, Chinese Academy of Sciences. It aims to encourage students to take part in developing and maintaining open source software, to cultivate and discover more excellent developers, to promote the growth of open source communities, and to help build the open source software supply chain.

Nacos will join OSPP 2023 as a mentoring organization.

Background

In last year's OSPP, the Nacos community reworked the observability system of the Nacos server and improved server-side observability. Client-side observability, however, is still relatively lacking.

Meanwhile, as cloud-native and distributed microservice applications become more common, client-side observability matters more and more, because the client sits closest to the user. The industry has also gradually introduced observability standards such as OpenTelemetry.

Therefore, the Nacos community hopes to add and evolve observability capabilities in nacos-client, integrate with the OpenTelemetry observability standard, and provide metrics or traces that help users discover and locate problems.

Target

First add some key metrics and monitoring content to nacos-client through Micrometer. At the same time, integrate with the OpenTelemetry observability standard so that key metrics can be exported, and explore tracing for the registration and subscription of services and configurations.

Difficulty

Basic

Mentor

GengTuo Yuan
gengtuo.ygt@alibaba-inc.com

Output Requirements

  • Add some key metrics and monitoring content to nacos-client through Micrometer
  • Support exporting key metrics through the OpenTelemetry observability standard
  • Support tracing the registration and subscription paths of services and configurations through the OpenTelemetry observability standard
  • Add documentation for the new metrics and a usage guide for the client observability capabilities
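
Tracing a registration or subscription call from client to server requires the trace context to travel with the request. One widely used format for this is the W3C Trace Context `traceparent` header, which OpenTelemetry's default propagator also emits. The following is a minimal, library-free sketch of that header format only; the class name `TraceContextHeaders` and its methods are hypothetical, and whether nacos-client carries the context in exactly this header is an assumption, not something this issue states.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Nacos or OpenTelemetry.
final class TraceContextHeaders {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Returns a random lowercase hex string of byteCount bytes.
    // An all-zero id is invalid in W3C Trace Context, so it is re-rolled.
    static String randomHex(int byteCount) {
        byte[] bytes = new byte[byteCount];
        boolean allZero = true;
        while (allZero) {
            RANDOM.nextBytes(bytes);
            for (byte b : bytes) {
                if (b != 0) {
                    allZero = false;
                    break;
                }
            }
        }
        StringBuilder sb = new StringBuilder(byteCount * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }

    // Layout: version "00", 16-byte trace id, 8-byte parent span id, flags ("01" = sampled).
    static String traceparent(String traceId, String spanId, boolean sampled) {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    // Adds the header to an outgoing request's header map.
    static void inject(Map<String, String> headers, String traceId, String spanId, boolean sampled) {
        headers.put("traceparent", traceparent(traceId, spanId, sampled));
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(headers, randomHex(16), randomHex(8), true);
        System.out.println(headers);
    }
}
```

In a real integration the OpenTelemetry SDK's propagator would normally write this header; the sketch is only meant to show what ends up on the wire.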

Technical Requirements

  • Familiar with the Java programming language
  • Understand how to use Micrometer
  • Understand the OpenTelemetry observability standard
  • Familiar with writing Markdown documents
@FAWC438

FAWC438 commented Apr 25, 2023

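Hello, I would like to take part in OSPP 2023. I have some observability experience with OpenTelemetry and Apache SkyWalking, and I am interested in this idea. You mentioned that observability on the Nacos server side has already been improved. Are there any docs, PRs, or issues that describe that work? I would like to do some research to refine my project proposal.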

@pixystone
Contributor

For the server side, see #8461.

KomachiSion pushed a commit that referenced this issue Oct 30, 2023
…11166)

* Use Micrometer to monitor the metrics previously detected by Prometheus

* Replace all Prometheus implement to Micrometer

* Add unit test

* Define unit test case

* Remove unnecessary dependencies

* Remove unnecessary dependencies

* Fix magic value

* Optimize code architecture

* Use a new CompositeMeterRegistry instead of the globalRegistry

* Use `NacosClientProperties` to get the env value

* Finish `configNotifyCostDuration` metric

* Finish all config metric

* Finish record naming rpc request duration

* Finish all naming meters

* Fix meter names

* Add unit tests

* Add Config trace spans

* Add Naming trace spans

* Add trace unit test

* Inject trace context with request headers

* Test trace to Jaeger

* Add config trace nested spans

* Add naming trace nested spans

* Add naming trace namespace attr and fix unit tests

* Fix Jaeger test case

* Update author info

* Add config serverNumber metric

* Add server request handle meters

* Fix metric/trace tests

* Clean all trace content, now a pure metric branch

* Clean maven dependencies

* Clean maven dependencies
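
Several of the commits above add duration meters, for example `configNotifyCostDuration` and the naming RPC request duration. As a rough illustration of what such a timer tracks (a call count plus accumulated time), here is a library-free sketch. `SimpleTimer` is a hypothetical stand-in written for this example; the actual change uses Micrometer, whose API is not reproduced here.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical stand-in for a timer meter; NOT Micrometer's Timer.
final class SimpleTimer {
    private final String name;
    private final LongAdder count = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    SimpleTimer(String name) {
        this.name = name;
    }

    // Records one observation of the given duration.
    void record(long amount, TimeUnit unit) {
        count.increment();
        totalNanos.add(unit.toNanos(amount));
    }

    // Times the supplied work and records its duration, even if it throws.
    <T> T record(Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            record(System.nanoTime() - start, TimeUnit.NANOSECONDS);
        }
    }

    String name() {
        return name;
    }

    long count() {
        return count.sum();
    }

    double totalTime(TimeUnit unit) {
        return totalNanos.sum() / (double) unit.toNanos(1);
    }

    double mean(TimeUnit unit) {
        long n = count();
        return n == 0 ? 0.0 : totalTime(unit) / n;
    }
}

final class SimpleTimerDemo {
    public static void main(String[] args) {
        // The meter name is illustrative only.
        SimpleTimer timer = new SimpleTimer("config.notify.cost");
        timer.record(10, TimeUnit.MILLISECONDS);
        timer.record(30, TimeUnit.MILLISECONDS);
        System.out.println(timer.name() + " count=" + timer.count()
                + " meanMs=" + timer.mean(TimeUnit.MILLISECONDS));
    }
}
```

Keeping the timer as an instance owned by the client, rather than a static global, mirrors the intent of the commit that moved from the global registry to a dedicated one: the application's own meters and the client's meters stay separate.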
KomachiSion pushed a commit that referenced this issue Oct 30, 2023
…11138)

* Use Micrometer to monitor the metrics previously detected by Prometheus

* Replace all Prometheus implement to Micrometer

* Add unit test

* Define unit test case

* Remove unnecessary dependencies

* Remove unnecessary dependencies

* Fix magic value

* Optimize code architecture

* Use a new CompositeMeterRegistry instead of the globalRegistry

* Use `NacosClientProperties` to get the env value

* Finish `configNotifyCostDuration` metric

* Finish all config metric

* Finish record naming rpc request duration

* Finish all naming meters

* Fix meter names

* Add unit tests

* Add Config trace spans

* Add Naming trace spans

* Add trace unit test

* Inject trace context with request headers

* Test trace to Jaeger

* Add config trace nested spans

* Add naming trace nested spans

* Add naming trace namespace attr and fix unit tests

* Fix Jaeger test case

* Update author info

* Add config serverNumber metric

* Add server request handle meters

* Fix metric/trace tests

* Set grpc attr to lower case

* Set span name "nacos" to upper case

* Fix trace tests assert

* Add server request handler traces

* Fix config magic values and make sure config trace attributes are set when exceptions occur

* Fix naming magic values and make sure naming trace attributes are set when exceptions occur

* Add trace about EncryptDataKey

* Roll back enhanced subclass

* Roll back enhanced subclass

* Nacos common no longer depends on opentelemetry-api

* Add spanProxy

* Set spanProxy params type to SpanBuilder

* Set the span kind of outgoing spans to `SpanKind.CLIENT`

* Call `getTracer()` every time when acquiring spans to follow OpenTelemetry doc

* Fix a null pointer issue

* Using dynamic proxy to tracing ClientWorker

* Using dynamic proxy to tracing all config module, except some static methods

* Using dynamic proxy to tracing Service level naming spans

* Using dynamic proxy to tracing naming redo service

* Almost all trace spans are refactored by JDK dynamic proxy

* Add unit tests for `TraceDynamicProxy`
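
The last commits above move span creation into JDK dynamic proxies, so that tracing wraps existing interfaces instead of being written into each method. The following is a minimal sketch of that technique using only `java.lang.reflect.Proxy`. The interface `ConfigFetcher`, the class `TracingHandler`, and the string-based "span" log are hypothetical stand-ins made up for this example; they are not the `TraceDynamicProxy` from the commit, and no OpenTelemetry API is used.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side interface; not a Nacos type.
interface ConfigFetcher {
    String getConfig(String dataId, String group);
}

// Records one entry per intercepted call: "<method name> <OK|ERROR>".
final class TracingHandler implements InvocationHandler {
    private final Object target;
    private final List<String> finishedSpans;

    TracingHandler(Object target, List<String> finishedSpans) {
        this.target = target;
        this.finishedSpans = finishedSpans;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String status = "OK";
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            status = "ERROR";
            // Unwrap so callers see the original exception, not the reflection wrapper.
            throw e.getCause();
        } finally {
            // A real implementation would end a span here and set its status.
            finishedSpans.add(method.getName() + " " + status);
        }
    }

    // Wraps an implementation of iface in a tracing proxy.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target, List<String> finishedSpans) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                new TracingHandler(target, finishedSpans));
    }
}

final class TraceProxyDemo {
    public static void main(String[] args) {
        List<String> spans = new ArrayList<>();
        ConfigFetcher real = (dataId, group) -> dataId + "@" + group;
        ConfigFetcher traced = TracingHandler.wrap(ConfigFetcher.class, real, spans);
        System.out.println(traced.getConfig("app.yaml", "DEFAULT_GROUP")); // prints "app.yaml@DEFAULT_GROUP"
        System.out.println(spans); // prints "[getConfig OK]"
    }
}
```

JDK proxies only intercept interface methods, which is consistent with the commit note that some static methods in the config module are not covered.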