[doc] Add a documentation page for metrics reference #4910

Merged
6 commits merged on Aug 21, 2019

Member

@sijie sijie commented Aug 7, 2019

Motivation

Add a documentation page for metrics reference

@sijie added the doc label ("Your PR contains doc changes, no matter whether the changes are in markdown or code files.") on Aug 7, 2019
@sijie added this to the 2.5.0 milestone on Aug 7, 2019
@sijie self-assigned this on Aug 7, 2019
@Jennifer88huang-zz
Contributor

@Anonymitaet Could you please help review? Thank you.

@Anonymitaet
Member

@jennifer88huang glad to help

The broker metrics are exposed under "/metrics" at port 8080. You can change the port by updating `webServicePort` to a different port
in `broker.conf` configuration file.

All the metrics exposed by broker are labelled with `cluster=${pulsar_cluster}`. The value of `${pulsar_cluster}` is the pulsar cluster
Member

Suggested change
All the metrics exposed by broker are labelled with `cluster=${pulsar_cluster}`. The value of `${pulsar_cluster}` is the pulsar cluster
All the metrics exposed by a broker are labelled with `cluster=${pulsar_cluster}`. The value of `${pulsar_cluster}` is the pulsar cluster
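
A minimal sketch of how the `cluster` label claim could be checked against a scraped `/metrics` payload. It assumes the payload is in the Prometheus text format; the function name `samples_missing_cluster` and the metric names in the usage note are hypothetical and not part of Pulsar.

```python
import re

# One labelled sample line of the Prometheus text format, for example:
#   metric_name{label="value",other="x"} 12.5
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)\{(.*)\}\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def samples_missing_cluster(text, expected_cluster):
    """Return names of samples whose `cluster` label is absent or different.

    `text` is the body returned by the broker's /metrics endpoint.
    """
    missing = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        match = SAMPLE_RE.match(line)
        if match is None:
            # A sample without any labels cannot carry `cluster`.
            missing.append(line.split()[0])
            continue
        labels = dict(LABEL_RE.findall(match.group(2)))
        if labels.get("cluster") != expected_cluster:
            missing.append(match.group(1))
    return missing
```

For example, passing a payload with the hypothetical lines `example_metric_a{cluster="standalone"} 3` and `example_metric_b{cluster="other"} 1.5` together with `"standalone"` would report only `example_metric_b`.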


> Namespace metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to false.

All the namespace metrics will be labelled with following labels:
Member

Suggested change
All the namespace metrics will be labelled with following labels:
All the namespace metrics will be labelled with the following labels:


## Monitoring

You can [setup a Prometheus instance](https://prometheus.io/) to collect all the metrics exposed at Pulsar components and setup
Member

Suggested change
You can [setup a Prometheus instance](https://prometheus.io/) to collect all the metrics exposed at Pulsar components and setup
You can [set up a Prometheus instance](https://prometheus.io/) to collect all the metrics exposed at Pulsar components and set up

setup is a noun
set up is a verb phrase

The metrics exposed by Pulsar are in Prometheus format. The types of metrics are:

- [Counter](https://prometheus.io/docs/concepts/metric_types/#counter): a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
- [Gauge](https://prometheus.io/docs/concepts/metric_types/#gauge): A *gauge* is a metric that represents a single numerical value that can arbitrarily go up and down.
Member

Suggested change
- [Gauge](https://prometheus.io/docs/concepts/metric_types/#gauge): A *gauge* is a metric that represents a single numerical value that can arbitrarily go up and down.
- [Gauge](https://prometheus.io/docs/concepts/metric_types/#gauge): a *gauge* is a metric that represents a single numerical value that can arbitrarily go up and down.

Same cases for the two items below
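
As a rough illustration of the metric types discussed above, the following sketch reads the `# TYPE <name> <type>` comment lines that the Prometheus text format uses to declare each metric's type. The function name `metric_types` and the sample metric names are hypothetical.

```python
def metric_types(text):
    """Map metric name to its declared type, read from `# TYPE` lines."""
    types = {}
    for line in text.splitlines():
        parts = line.split()
        # A type declaration has exactly four tokens: "#", "TYPE", name, type.
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types
```

Applied to a payload that declares `# TYPE example_requests_total counter` and `# TYPE example_queue_size gauge`, the result maps each name to `counter` or `gauge`, which is one way to cross-check the Type column of the reference tables.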

| Name | Type | Description |
|---|---|---|
| bookie_SERVER_STATUS | Gauge | The server status for bookie server. <br><ul><li>1: the bookie is running in writable mode.</li><li>0: the bookie is running in readonly mode.</li></ul> |
| bookkeeper_server_ADD_ENTRY_count | Counter | The total number of ADD_ENTRY requests received at the bookie. Label `success` used to distinguish successes and failures |
Member

A period should be placed after the last word of the sentence. Please check all cases.


> Subscription metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to true.

All the subscription metrics will be labelled with following labels:
Member

Suggested change
All the subscription metrics will be labelled with following labels:
All the subscription metrics are labelled with the following labels:

> Consumer metrics are only exposed when both `exposeTopicLevelMetricsInPrometheus` and `exposeConsumerLevelMetricsInPrometheus`
> are set to true.

All the consumer metrics will be labelled with following labels:
Member

Suggested change
All the consumer metrics will be labelled with following labels:
All the consumer metrics are labelled with the following labels:

The zookeeper metrics are exposed under "/metrics" at port 8000. You can change the port by configuring a system
property `stats_server_port` to use a different port.

### Server Metrics
Member

Suggested change
### Server Metrics
### Server metrics

We use sentence case rather than title case for headings. Please check all occurrences.


All the metrics exposed by broker are labelled with `cluster=${pulsar_cluster}`. The value of `${pulsar_cluster}` is the pulsar cluster
name you configured in `broker.conf`.

Member

Consider adding the following to give readers a preview of the structure of long content.
It also makes it easier for users to locate #### (heading 4) sections, which are not shown in the TOC on the right side of the web page.

If you agree, please check all cases.

Broker has the following kinds of metrics:

* [Namespace metrics](#namespace-metrics)
    * [Replication metrics](#replication-metrics)
* [Topic metrics](#topic-metrics)
* [Subscription metrics](#subscription-metrics)
* [Consumer metrics](#consumer-metrics)

| pulsar_consumer_msg_throughput_out | Gauge | The total message dispatch throughput for a consumer (bytes/second) |
| pulsar_consumer_available_permits | Gauge | The available permits for a consumer |

## Monitoring
Member

Suggested change
## Monitoring
## Monitor

Use a base verb rather than a noun in headings, which facilitates searching.

@sijie
Member Author

sijie commented Aug 10, 2019

@Anonymitaet I have addressed all your review comments. PTAL

@sijie
Member Author

sijie commented Aug 12, 2019

run cpp tests
run java8 tests

sidebar_label: Pulsar Metrics
---

<style type="text/css">
Contributor

do we display this to users?

Member Author

It sets the CSS style for the table, to keep it consistent with other reference pages.

Contributor

ok, it belongs to metadata, make sure it will not be displayed to users.


## ZooKeeper

The zookeeper metrics are exposed under "/metrics" at port 8000. You can change the port by configuring a system
Contributor

zookeeper --> ZooKeeper
If there are similar cases, please check and refine throughout.

Contributor

You can use a different port by configuring the stats_server_port system property.


| Name | Type | Description |
|---|---|---|
| zookeeper_server_znode_count | Gauge | Number of z-nodes stored. |
Contributor

Suggested change
| zookeeper_server_znode_count | Gauge | Number of z-nodes stored. |
| zookeeper_server_znode_count | Gauge | The number of z-nodes stored. |

Contributor

You can check and refine all similar cases.


## BookKeeper

The bookkeeper metrics are exposed under "/metrics" at port 8000. You can change the port by updating `prometheusStatsHttpPort`
Contributor

bookkeeper --> BookKeeper
Check and refine all similar cases.
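
The default endpoints quoted in this review (broker on port 8080, ZooKeeper and BookKeeper on port 8000, all under `/metrics`) can be collected in a small helper. This is only a sketch: the `http` scheme, the `localhost` default, and the names `DEFAULT_METRICS_PORTS` and `metrics_url` are assumptions made for illustration.

```python
# Default ports as quoted in the reviewed text. Each can be overridden by the
# setting named in the comment.
DEFAULT_METRICS_PORTS = {
    "broker": 8080,      # `webServicePort` in broker.conf
    "zookeeper": 8000,   # `stats_server_port` system property
    "bookkeeper": 8000,  # `prometheusStatsHttpPort`
}


def metrics_url(component, host="localhost", port=None):
    """Build the /metrics URL for a component, using its default port if none is given."""
    if component not in DEFAULT_METRICS_PORTS:
        raise ValueError(f"unknown component: {component!r}")
    if port is None:
        port = DEFAULT_METRICS_PORTS[component]
    return f"http://{host}:{port}/metrics"
```

Passing an explicit `port` mirrors the case where the corresponding setting has been changed from its default.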


| Name | Type | Description |
|---|---|---|
| bookie_journal_JOURNAL_SYNC_count | Counter | The total number of journal fsync operations happening at the bookie. Label `success` used to distinguish successes and failures. |
Contributor

Suggested change
| bookie_journal_JOURNAL_SYNC_count | Counter | The total number of journal fsync operations happening at the bookie. Label `success` used to distinguish successes and failures. |
| bookie_journal_JOURNAL_SYNC_count | Counter | The total number of journal fsync operations happening at the bookie. The `success` label is used to distinguish successes and failures. |
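
To illustrate how the `success` label separates successes from failures, here is a sketch that sums the samples of one counter per `success` value. It assumes the Prometheus text format; the label values `"true"` and `"false"` used in the usage note are an assumption, and `totals_by_success` is a hypothetical helper name.

```python
import re
from collections import defaultdict


def totals_by_success(text, metric):
    """Sum the samples of `metric`, grouped by the value of its `success` label."""
    sample_re = re.compile(r'^' + re.escape(metric) + r'\{(.*)\}\s+(\S+)')
    label_re = re.compile(r'(?:^|,)\s*success="([^"]*)"')
    totals = defaultdict(float)
    for line in text.splitlines():
        match = sample_re.match(line.strip())
        if match is None:
            continue  # a different metric, a comment, or an unlabelled sample
        label = label_re.search(match.group(1))
        if label is not None:
            totals[label.group(1)] += float(match.group(2))
    return dict(totals)
```

With samples such as `bookie_journal_JOURNAL_SYNC_count{success="true"} 10` and `bookie_journal_JOURNAL_SYNC_count{success="false"} 2`, the helper returns one total per label value.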


### Namespace metrics

> Namespace metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to false.
Contributor

Suggested change
> Namespace metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to false.
> Namespace metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to `false`.


> Topic metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to true.

All the topic metrics are labelled with following labels:
Contributor

Suggested change
All the topic metrics are labelled with following labels:
All the topic metrics are labelled with the following labels:


> Subscription metrics are only exposed when `exposeTopicLevelMetricsInPrometheus` is set to true.

All the subscription metrics are labelled with following labels:
Contributor

Suggested change
All the subscription metrics are labelled with following labels:
All the subscription metrics are labelled with the following labels:

Contributor

Check other similar cases.
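
The exposure notes quoted in this review can be summarized in one place: namespace metrics appear only when `exposeTopicLevelMetricsInPrometheus` is false, topic and subscription metrics only when it is true, and consumer metrics only when `exposeConsumerLevelMetricsInPrometheus` is also true. The sketch below encodes just those stated rules; `exposed_metric_groups` is a hypothetical helper, not a Pulsar API.

```python
def exposed_metric_groups(topic_level, consumer_level):
    """Return the broker metric groups exposed for the given flag values.

    topic_level    -> exposeTopicLevelMetricsInPrometheus
    consumer_level -> exposeConsumerLevelMetricsInPrometheus
    """
    groups = []
    if not topic_level:
        # Namespace metrics are only exposed when topic-level metrics are off.
        groups.append("namespace")
    else:
        groups.extend(["topic", "subscription"])
        # Consumer metrics need both flags set to true.
        if consumer_level:
            groups.append("consumer")
    return groups
```

Note that setting only the consumer-level flag has no effect under these rules, because consumer metrics also require the topic-level flag.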

|---|---|---|
| pulsar_consumer_msg_rate_redeliver | Gauge | The total message rate for message being redelivered (messages/second). |
| pulsar_consumer_unacked_massages | Gauge | The total number of unacked messages of a consumer (messages). |
| pulsar_consumer_blocked_on_unacked_messages | Gauge | Indicate whether a consumer is blocked on unacked messages or not. <br> <ul><li>1 means the consumer is blocked on waiting unacked messages to be acked.</li><li>0 means the consumer is not blocked on waiting unacked messages to be acked.</li></ul> |
Contributor

Suggested change
| pulsar_consumer_blocked_on_unacked_messages | Gauge | Indicate whether a consumer is blocked on unacked messages or not. <br> <ul><li>1 means the consumer is blocked on waiting unacked messages to be acked.</li><li>0 means the consumer is not blocked on waiting unacked messages to be acked.</li></ul> |
| pulsar_consumer_blocked_on_unacked_messages | Gauge | Indicate whether a consumer is blocked on unacknowledged messages or not. <br> <ul><li>`1` means the consumer is blocked on waiting unacknowledged messages to be acknowledged.</li><li>`0` means the consumer is not blocked on waiting unacknowledged messages to be acknowledged.</li></ul> |

You can [set up a Prometheus instance](https://prometheus.io/) to collect all the metrics exposed at Pulsar components and set up
[Grafana](https://grafana.com/) dashboards to display the metrics and monitor your Pulsar cluster.

The example Grafana dashboards can be found:
Contributor

Suggested change
The example Grafana dashboards can be found:
The following are some Grafana dashboards examples:

Member

@wolfstudy wolfstudy left a comment

LGTM +1

@sijie
Member Author

sijie commented Aug 18, 2019

@jennifer88huang I have addressed your review comments.

Contributor

@Jennifer88huang-zz Jennifer88huang-zz left a comment

Only a tiny issue.


| Name | Type | Description |
|---|---|---|
| bookie_journal_JOURNAL_SYNC_count | Counter | The total number of journal fsync operations happening at the bookie. The `success` label used to distinguish successes and failures. |
Contributor

Suggested change
| bookie_journal_JOURNAL_SYNC_count | Counter | The total number of journal fsync operations happening at the bookie. The `success` label used to distinguish successes and failures. |
| bookie_journal_JOURNAL_SYNC_count | Counter | The total number of journal fsync operations happening at the bookie. The `success` label is used to distinguish successes and failures. |

@sijie
Member Author

sijie commented Aug 20, 2019

@jennifer88huang addressed your comments. PTAL

@codelipenghui
Contributor

run java8 tests

@codelipenghui
Contributor

run java8 tests

@sijie sijie merged commit 5631067 into apache:master Aug 21, 2019
@sijie sijie deleted the metrics_reference branch August 21, 2019 02:11
@wolfstudy wolfstudy modified the milestones: 2.5.0, 2.4.2 Nov 19, 2019
@wolfstudy
Member

Changed the milestone to 2.4.2 because of a conflict.

wolfstudy pushed a commit that referenced this pull request Nov 20, 2019
*Motivation*

Add a documentation page for metrics reference

(cherry picked from commit 5631067)