Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed AspectJ based metrics for ZooKeeper #10533

Merged
merged 3 commits into from
May 13, 2021
Merged

Conversation

merlimat
Copy link
Contributor

Motivation

So far, we have been relying on a wacky approach to gather metrics for Zookeeper usage. That was done because ZK used not to have a decent set of exported metrics in server or client side.

We were relying on AspectJ to intercept specific points in ZooKeeper client/server execution and collect the data points and expose them through Prometheus.

There are few problem with this approach:

  1. It's hacky because we're jumping inside ZK undocumented code internals, which break between releases and there's overhead in the AspectJ interception.
  2. AspectJ doesn't work anymore with recent versions of Java 16+. I does not look like there is any work in progress to achieve that.

Since we have already upgraded ZooKeeper to 3.6, it already has support for pluggable metrics provider and Prometheus is one of the options. There's no point anymore to continue doing the AspectJ hack.

In separate PRs I'll add the ZK config and submit ZK Grafana dashboards.

@merlimat merlimat added the type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages label May 10, 2021
@merlimat merlimat added this to the 2.8.0 milestone May 10, 2021
@merlimat merlimat self-assigned this May 10, 2021
@linlinnn
Copy link
Contributor

/pulsarbot run-failure-checks

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Also, for reference, if we update to ZooKeeper 3.7.0 there is an official API to bootstrap ZooKeeper Server Nodes for production from Java code, instead of using internal ZooKeeper classes, that may change behaviour from one release to the next

https://zookeeper.apache.org/doc/current/apidocs/zookeeper-server/org/apache/zookeeper/server/embedded/ZooKeeperServerEmbedded.html

@merlimat
Copy link
Contributor Author

LGTM

Also, for reference, if we update to ZooKeeper 3.7.0 there is an official API to bootstrap ZooKeeper Server Nodes for production from Java code, instead of using internal ZooKeeper classes, that may change behaviour from one release to the next

https://zookeeper.apache.org/doc/current/apidocs/zookeeper-server/org/apache/zookeeper/server/embedded/ZooKeeperServerEmbedded.html

Sounds good. We could go to 3.7.0 after Pulsar 2.8 is cut.

Copy link
Contributor

@rdhabalia rdhabalia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in next PR, please make sure we update existing metrics placeholder so, existing monitoring doesn't break.

@merlimat
Copy link
Contributor Author

merlimat commented May 11, 2021

@rdhabalia The metrics that ZooKeeper exports are going to be different, in names but also types. It will not be possible to maintain a compatibility layer here.

Even with that, I think that it would be much better to just move to the metrics supported by ZooKeeper instead of the current situation.

@eolivelli eolivelli merged commit 4f88f5a into apache:master May 13, 2021
@merlimat merlimat deleted the aspectj branch May 13, 2021 19:15
codelipenghui pushed a commit that referenced this pull request Jun 3, 2021
…10803)

### Motivation
After upgrade zookeeper version to 3.6.2 in #8590 and removed AspectJ based metrics for ZooKeeper in #10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

### Modification
1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
yangl pushed a commit to yangl/pulsar that referenced this pull request Jun 23, 2021
…pache#10803)

### Motivation
After upgrade zookeeper version to 3.6.2 in apache#8590 and removed AspectJ based metrics for ZooKeeper in apache#10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

### Modification
1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
pkumar-singh pushed a commit to pkumar-singh/pulsar that referenced this pull request Aug 10, 2021
…pache#10803)

After upgrade zookeeper version to 3.6.2 in apache#8590 and removed AspectJ based metrics for ZooKeeper in apache#10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…pache#10803)

### Motivation
After upgrade zookeeper version to 3.6.2 in apache#8590 and removed AspectJ based metrics for ZooKeeper in apache#10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

### Modification
1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants