[RIP-46][Task1]: Define the specification of metric. #5366
Comments
@yangwenting-ywt are there any pop metrics on the broker or consumer?
Will you add metric gauges for consumeOKTPS and consumeFailedTPS in ConsumeStatus.java? These data on the consumer client are very helpful for finding errors caused by dirty data or virtual machine issues. And will you add similar metrics on the producer client, such as producerOKTPS or producerFailedTPS? Although these are not as important as the consumer TPS, they make it easier for the business to find out who is sending large amounts of garbage.
Histograms can track the number of observations; the count, which shows up as a time series with a _count suffix, is inherently a counter.
Thanks for your reply. I also see the topic and client_id labels in the producer metrics. Does that mean the metrics can show the message send count per topic for each producer machine?
In the Prometheus metrics spec, the histogram rocketmq_send_cost_time is exposed as rocketmq_send_cost_time_count, rocketmq_send_cost_time_sum, and rocketmq_send_cost_time_bucket. I think rocketmq_send_cost_time_count is what you need.
Each label combination generates its own time series, so you will see several series (1, 2, 3, 4). You can read the Prometheus Data Model documentation for more information.
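For illustration, a histogram such as rocketmq_send_cost_time appears in the Prometheus exposition format roughly as follows (the label values, bucket bounds, and sample numbers here are made up, not taken from RocketMQ):

```text
# TYPE rocketmq_send_cost_time histogram
rocketmq_send_cost_time_bucket{topic="TopicTest",client_id="producer-1",le="1"} 4
rocketmq_send_cost_time_bucket{topic="TopicTest",client_id="producer-1",le="5"} 19
rocketmq_send_cost_time_bucket{topic="TopicTest",client_id="producer-1",le="+Inf"} 25
rocketmq_send_cost_time_sum{topic="TopicTest",client_id="producer-1"} 87
rocketmq_send_cost_time_count{topic="TopicTest",client_id="producer-1"} 25
```

Note that the buckets are cumulative, so the +Inf bucket always equals the _count series.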
Thanks, I got it. Where does Metrics collect the rocketmq_send_cost_time data from? The broker's in-memory Stats, or somewhere new? And when will Metrics be released? Only in 5.x, or also in 4.9.x?
rocketmq_send_cost_time is collected by the producer and reported to the OpenTelemetry collector. We will release metrics in 5.x first. The server-side metrics are easy to backport to 4.x, but the client metrics probably can't be.
Metrics
RocketMQ exposes the following metrics in Prometheus format. You can monitor your clusters with these metrics.
Details of metrics
Metric types
RocketMQ defines its metrics following the conventions of open source Prometheus. The metric types that RocketMQ offers are counters, gauges, and histograms. For more information, see METRIC TYPES.
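As a rough illustration of how these types behave (this is not RocketMQ code; the class and names are invented for this sketch), a counter only goes up, a gauge can move both ways, and a histogram counts each observation into cumulative `le` buckets plus a running count and sum:

```python
class Histogram:
    """Minimal cumulative-bucket histogram, Prometheus style (illustrative only)."""

    def __init__(self, boundaries):
        self.boundaries = sorted(boundaries)  # upper bounds of the "le" buckets
        self.bucket_counts = [0] * (len(self.boundaries) + 1)  # last slot = +Inf
        self.count = 0    # exported as the _count series
        self.sum = 0.0    # exported as the _sum series

    def observe(self, value):
        self.count += 1
        self.sum += value
        # Cumulative semantics: the value lands in every bucket whose bound covers it.
        for i, bound in enumerate(self.boundaries):
            if value <= bound:
                self.bucket_counts[i] += 1
        self.bucket_counts[-1] += 1  # every observation lands in the +Inf bucket


h = Histogram([1, 5, 10])
for v in (0.5, 3, 7, 42):
    h.observe(v)
# cumulative buckets: le=1 -> 1, le=5 -> 2, le=10 -> 3, +Inf -> 4
```

The _count series is the +Inf bucket, which is why, as noted in the discussion above, it behaves as a plain counter.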
Broker metrics
The following table describes the labels of the metrics that are related to the Message Queue for Apache RocketMQ broker.
Normal: normal messages;
FIFO: ordered messages;
Transaction: transactional messages;
Delay: scheduled or delayed messages.
le_1_kb: ≤ 1 KB
le_4_kb: ≤ 4 KB
le_512_kb: ≤ 512 KB
le_1_mb: ≤ 1 MB
le_2_mb: ≤ 2 MB
le_4_mb: ≤ 4 MB
le_overflow: > 4 MB
le_1_kb: ≤ 1 KB
le_4_kb: ≤ 4 KB
le_512_kb: ≤ 512 KB
le_1_mb: ≤ 1 MB
le_2_mb: ≤ 2 MB
le_4_mb: ≤ 4 MB
le_overflow: > 4 MB
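The size buckets above are cumulative `le` (less-than-or-equal) boundaries. A hypothetical helper that maps a message size to its first matching bucket label (the boundaries come from the list above; the function name is invented) might look like:

```python
# Bucket boundaries taken from the size buckets listed above.
_SIZE_BUCKETS = [
    (1 * 1024, "le_1_kb"),
    (4 * 1024, "le_4_kb"),
    (512 * 1024, "le_512_kb"),
    (1 * 1024 * 1024, "le_1_mb"),
    (2 * 1024 * 1024, "le_2_mb"),
    (4 * 1024 * 1024, "le_4_mb"),
]


def size_bucket(size_bytes):
    """Return the smallest bucket label whose upper bound covers size_bytes."""
    for bound, label in _SIZE_BUCKETS:
        if size_bytes <= bound:
            return label
    return "le_overflow"
```

For example, a 512-byte message falls into le_1_kb, while a 5 MB message overflows into le_overflow.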
Producer metrics
The following table describes the labels of the metrics that are related to the producers in Message Queue for Apache RocketMQ.
Normal: normal messages;
FIFO: ordered messages;
Transaction: transactional messages;
Delay: scheduled or delayed messages.
le_1_ms
le_5_ms
le_10_ms
le_20_ms
le_50_ms
le_200_ms
le_500_ms
le_overflow
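With these buckets exported, a common way to read send latency in Prometheus is histogram_quantile over the bucket series. The metric name comes from this page; the quantile and 5-minute window are just an example:

```text
# Approximate p95 send cost time across all producers
histogram_quantile(0.95, sum(rate(rocketmq_send_cost_time_bucket[5m])) by (le))
```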
Consumer metrics
The following table describes the labels of the metrics that are related to the consumers in Message Queue for Apache RocketMQ.
le_1_ms
le_5_ms
le_10_ms
le_100_ms
le_10000_ms
le_60000_ms
le_overflow
le_1_ms
le_5_ms
le_20_ms
le_100_ms
le_1000_ms
le_5000_ms
le_10000_ms
le_overflow
Background information
RocketMQ defines metrics based on the following business scenarios.
Message accumulation scenarios
The above figure shows the number of messages in each stage and the time they spend there. By monitoring these metrics, you can determine whether consumption on the business side is abnormal. The following table describes what these metrics mean and the formulas used to calculate them.
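As a simplified sketch of one such formula (this is an illustration, not the exact RIP-46 definition; the function name is invented), the number of ready messages in a queue can be approximated as the broker's max offset minus the consumer's committed offset:

```python
def ready_message_count(broker_max_offset, committed_consume_offset):
    """Approximate count of messages written to a queue but not yet consumed.

    Simplified illustration: offsets are per message queue, and the real
    accumulation metrics also distinguish inflight (pulled but un-acked)
    messages from ready ones.
    """
    return max(0, broker_max_offset - committed_consume_offset)


# e.g. the broker has written up to offset 1200, the consumer has committed 1150
lag = ready_message_count(1200, 1150)
```
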
Scheduled message: the time at which the scheduled delay ends.
Transactional message: the time at which the transaction is committed.
This time reflects how promptly the consumer completes message processing.
PushConsumer consumption scenarios
In PushConsumer, real-time message processing is implemented on top of the typical Reactor thread model inside the SDK. As shown below, the SDK has a built-in long-polling thread that asynchronously pulls messages into the SDK's built-in buffer queue and then dispatches them to the consumer threads, which trigger the listener to execute the local consumption logic.
![PushConsumer client](https://user-images.githubusercontent.com/3804270/196887265-016af4d5-01c7-4bb8-b2aa-2dcc420247ac.png)
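The flow described above, a pulling thread filling a bounded cache and consumer threads draining it through a listener, can be sketched as follows. The thread count, queue size, and all names are invented for illustration and do not mirror the SDK's actual internals:

```python
import queue
import threading


def run_push_consumer_sketch(messages, listener, workers=2, cache_size=8):
    """Pull messages into a bounded buffer and consume them on worker threads."""
    cache = queue.Queue(maxsize=cache_size)  # stands in for the SDK's buffer queue
    done = object()  # sentinel to stop the workers

    def pull_loop():
        # Stands in for the long-polling thread fetching from the broker.
        for msg in messages:
            cache.put(msg)  # blocks when the cache is full (back pressure)
        for _ in range(workers):
            cache.put(done)

    def consume_loop():
        while True:
            msg = cache.get()
            if msg is done:
                break
            listener(msg)  # the user's message listener runs here

    threads = [threading.Thread(target=pull_loop)]
    threads += [threading.Thread(target=consume_loop) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


seen = []
lock = threading.Lock()


def listener(msg):
    with lock:
        seen.append(msg)


run_push_consumer_sketch(range(100), listener)
```

The bounded queue is what the buffer-queue metrics below would observe: its depth reflects how far pulling runs ahead of consumption.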
The metrics of local buffer queues in the PushConsumer scenario are as follows: