Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(metrics): change gauge metrics to be scoped by worker id #9605

Closed

Conversation

thalesmg
Copy link
Contributor

@thalesmg thalesmg commented Dec 22, 2022

https://emqx.atlassian.net/browse/EMQX-8548

Currently, we face several issues trying to keep resource metrics
reasonable. For example, when a resource is re-created and has its
metrics reset, but then its durable queue resumes its previous work
and leads to strange (often negative) metrics.

Instead using counters that are shared by more than one worker to
manage gauges, we introduce an ETS table whose key is not only scoped
by the Resource ID as before, but also by the worker ID. This way,
when a worker starts/terminates, they should set their own gauges to
their values (often 0 or replayq:count when resuming off a queue).
With this scoping and initialization procedure, we'll hopefully avoid
hitting those strange metrics scenarios and have better control over
the gauges.

Related Wolff PR: kafka4beam/wolff#41

@thalesmg thalesmg force-pushed the refactor-gauge-metrics-e50 branch 2 times, most recently from ef3d229 to 50b87fd Compare December 22, 2022 20:25
Currently, we face several issues trying to keep resource metrics
reasonable.  For example, when a resource is re-created and has its
metrics reset, but then its durable queue resumes its previous work
and leads to strange (often negative) metrics.

Instead using `counters` that are shared by more than one worker to
manage gauges, we introduce an ETS table whose key is not only scoped
by the Resource ID as before, but also by the worker ID.  This way,
when a worker starts/terminates, they should set their own gauges to
their values (often 0 or `replayq:count` when resuming off a queue).
With this scoping and initialization procedure, we'll hopefully avoid
hitting those strange metrics scenarios and have better control over
the gauges.
@coveralls
Copy link
Collaborator

coveralls commented Dec 23, 2022

Pull Request Test Coverage Report for Build 3781769126

  • 189 of 220 (85.91%) changed or added relevant lines in 12 files are covered.
  • 44 unchanged lines in 19 files lost coverage.
  • Overall coverage increased (+0.009%) to 79.491%

Changes Missing Coverage Covered Lines Changed/Added Lines %
apps/emqx/src/emqx_metrics_worker.erl 35 36 97.22%
apps/emqx_resource/src/emqx_resource_metrics.erl 14 17 82.35%
lib-ee/emqx_ee_bridge/src/kafka/emqx_ee_bridge_kafka_consumer_sup.erl 12 17 70.59%
apps/emqx_resource/src/emqx_resource_worker.erl 42 51 82.35%
lib-ee/emqx_ee_bridge/src/kafka/emqx_bridge_impl_kafka_consumer.erl 57 70 81.43%
Files with Coverage Reduction New Missed Lines %
apps/emqx_authn/src/emqx_authn.erl 1 76.67%
apps/emqx_authn/src/simple_authn/emqx_authn_jwt.erl 1 85.8%
apps/emqx_dashboard/src/emqx_dashboard_monitor.erl 1 69.44%
apps/emqx_gateway/src/mqttsn/emqx_sn_channel.erl 1 73.04%
apps/emqx_resource/src/emqx_resource.erl 1 84.72%
apps/emqx_resource/src/emqx_resource_manager.erl 1 89.69%
apps/emqx/src/emqx_authentication_config.erl 1 83.62%
apps/emqx/src/emqx_channel.erl 1 87.17%
apps/emqx/src/emqx_logger.erl 1 80.49%
apps/emqx/src/emqx_router_helper.erl 1 86.27%
Totals Coverage Status
Change from base Build 3749779121: 0.009%
Covered Lines: 22623
Relevant Lines: 28460

💛 - Coveralls

}
label {
en: "Kafka Consumer"
zh: "卡夫卡消费者"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
zh: "卡夫卡消费者"
zh: "Kafka 消费者"

}
label {
en: "Kafka Consumer"
zh: "卡夫卡消费者"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
zh: "卡夫卡消费者"
zh: "Kafka 消费者"

consumer_kafka_opts {
desc {
en: "Kafka consumer configs."
zh: "Kafka消费者配置。"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
zh: "Kafka消费者配置。"
zh: "Kafka 消费者配置。"

en: "MQTT to Kafka"
zh: "MQTT 到 Kafka"
en: "Kafka Producer"
zh: "卡夫卡制片人"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
zh: "卡夫卡制片人"
zh: "Kafka 生产者"

}
label {
en: "Kafka topic"
zh: "卡夫卡主题 "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
zh: "卡夫卡主题 "
zh: "Kafka 主题 "

@thalesmg
Copy link
Contributor Author

thalesmg commented Dec 27, 2022

Port of the gauge part of this for master (since e501 is a bit on hold): #9619

@thalesmg
Copy link
Contributor Author

thalesmg commented Jan 5, 2023

Superseded by #9619 and #9642 (both on master).

@thalesmg thalesmg closed this Jan 5, 2023
@thalesmg thalesmg deleted the refactor-gauge-metrics-e50 branch January 5, 2023 19:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants