jtuple and hanm ZOOKEEPER-3098: Add additional server metrics
This patch adds several new server-side metrics and makes it easier to add new metrics in the future. It also includes a handful of other minor metrics-related changes.

Here's a high-level summary of the changes.

1. This patch extends the request latency tracked in `ServerStats` to
   track `read` and `update` latency separately. Updates are requests
   that must be voted on and can change data; reads are requests that
   can be handled locally and don't change data.

2. This patch adds the `ServerMetrics` logic and the related `AvgMinMaxCounter`
   and `SimpleCounter` classes. This code is designed to make it incredibly easy to
   add new metrics. To add a new metric you just add one line to `ServerMetrics` and
   then directly reference that new metric anywhere in the code base. The `ServerMetrics`
   logic handles creating the metric, properly adding the metric to the JSON output of
   the `/monitor` admin command, and properly resetting the metric when necessary.

   The motivation behind `ServerMetrics` is to make things easy enough that it encourages
   new metrics to be added liberally. Lack of in-depth metrics/visibility is a long-standing
   ZooKeeper weakness. At Facebook, most of our internal changes build on `ServerMetrics` and
   we have nearly 100 internal metrics at this time -- all of which we'll be upstreaming
   in the coming months as we publish more internal patches.

3. This patch adds 20 new metrics, 14 of which are handled by `ServerMetrics`.

4. This patch replaces some uses of `synchronized` in `ServerStats` with atomic operations.
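
To make items 2 and 4 concrete, here is a minimal sketch of how a one-line-per-metric registry built on atomic counters might look. It is not the code from this patch: the type names (`MetricSketch`, `SimpleCounterSketch`, `AvgMinMaxCounterSketch`, `ServerMetricsSketch`), the `avg_`/`min_`/`max_` key prefixes, and the `getAllValues`/`resetAll` helpers are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: these types are illustrative, not the patch's classes.
interface MetricSketch {
    void add(long value);
    void reset();
    Map<String, Object> values();
}

// A plain counter, e.g. for something like `commit_count`.
final class SimpleCounterSketch implements MetricSketch {
    private final String name;
    private final AtomicLong counter = new AtomicLong();

    SimpleCounterSketch(String name) { this.name = name; }

    @Override public void add(long value) { counter.addAndGet(value); }
    @Override public void reset() { counter.set(0); }
    @Override public Map<String, Object> values() {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(name, counter.get());
        return m;
    }
}

// Tracks avg/min/max with atomic operations instead of `synchronized`.
// Each field is updated atomically, but a concurrent reader may see a
// snapshot taken between two field updates; that is a simplification here.
final class AvgMinMaxCounterSketch implements MetricSketch {
    private final String name;
    private final AtomicLong total = new AtomicLong();
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong min = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    AvgMinMaxCounterSketch(String name) { this.name = name; }

    @Override public void add(long value) {
        total.addAndGet(value);
        count.incrementAndGet();
        min.accumulateAndGet(value, Math::min);
        max.accumulateAndGet(value, Math::max);
    }
    @Override public void reset() {
        total.set(0);
        count.set(0);
        min.set(Long.MAX_VALUE);
        max.set(Long.MIN_VALUE);
    }
    @Override public Map<String, Object> values() {
        long n = count.get();
        Map<String, Object> m = new LinkedHashMap<>();
        // Report zeros when nothing has been recorded yet (assumed convention).
        m.put("avg_" + name, n == 0 ? 0.0 : total.get() / (double) n);
        m.put("min_" + name, n == 0 ? 0L : min.get());
        m.put("max_" + name, n == 0 ? 0L : max.get());
        return m;
    }
}

// Adding a metric is one line here; export and reset come for free.
enum ServerMetricsSketch {
    FSYNC_TIME(new AvgMinMaxCounterSketch("fsynctime")),
    COMMIT_COUNT(new SimpleCounterSketch("commit_count"));

    private final MetricSketch metric;
    ServerMetricsSketch(MetricSketch metric) { this.metric = metric; }

    void add(long value) { metric.add(value); }

    // Flattens every registered metric into one map, ready for JSON output.
    static Map<String, Object> getAllValues() {
        Map<String, Object> all = new LinkedHashMap<>();
        for (ServerMetricsSketch m : values()) all.putAll(m.metric.values());
        return all;
    }
    static void resetAll() {
        for (ServerMetricsSketch m : values()) m.metric.reset();
    }
}
```

With a registry shaped like this, a call site would only need something like `ServerMetricsSketch.COMMIT_COUNT.add(1)`, and the output and reset paths would never have to change when a metric is added.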

Here's a list of new metrics added in this patch:

- `uptime`: time that a peer has been in a stable leading/following/observing state
- `leader_uptime`: uptime for peer in leading state
- `global_sessions`: count of global sessions
- `local_sessions`: count of local sessions
- `quorum_size`: configured ensemble size
- `synced_observers`: similar to existing `synced_followers` but for observers
- `fsynctime`: time to fsync transaction log (avg/min/max)
- `snapshottime`: time to write a snapshot (avg/min/max)
- `dbinittime`: time to reload database -- read snapshot + apply transactions (avg/min/max)
- `readlatency`: read request latency (avg/min/max)
- `updatelatency`: update request latency (avg/min/max)
- `propagation_latency`: end-to-end latency for updates, from proposal on leader to committed-to-datatree on a given host (avg/min/max)
- `follower_sync_time`: time for follower to sync with leader (avg/min/max)
- `election_time`: time between entering and leaving election (avg/min/max)
- `looking_count`: number of transitions into looking state
- `diff_count`: number of diff syncs performed
- `snap_count`: number of snap syncs performed
- `commit_count`: number of commits performed on leader
- `connection_request_count`: number of incoming client connection requests
- `bytes_received_count`: similar to existing `packets_received` but tracks bytes
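
As an illustration of the avg/min/max latency metrics above, the following sketch routes each completed request's latency into either a read or an update bucket. It assumes the caller already knows whether a request is an update; the `LatencySplitSketch` class and its `isUpdate` flag are hypothetical, and it uses the JDK's `LongSummaryStatistics` behind `synchronized` methods rather than whatever the patch itself does.

```java
import java.util.LongSummaryStatistics;

// Hypothetical sketch: separate avg/min/max tracking for reads and updates.
final class LatencySplitSketch {
    private final LongSummaryStatistics read = new LongSummaryStatistics();
    private final LongSummaryStatistics update = new LongSummaryStatistics();

    // Records one finished request. `isUpdate` is assumed to be true for
    // requests that were voted on and may have changed data.
    synchronized void record(boolean isUpdate, long startMillis, long endMillis) {
        long latency = endMillis - startMillis;
        (isUpdate ? update : read).accept(latency);
    }

    synchronized double avg(boolean isUpdate) {
        return (isUpdate ? update : read).getAverage();
    }
    synchronized long min(boolean isUpdate) {
        return (isUpdate ? update : read).getMin();
    }
    synchronized long max(boolean isUpdate) {
        return (isUpdate ? update : read).getMax();
    }
    synchronized long count(boolean isUpdate) {
        return (isUpdate ? update : read).getCount();
    }
}
```

Keeping the two buckets separate means a slow quorum round trip shows up in the update numbers without distorting the read latency that is served locally.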

Author: Joseph Blomstedt <jdb@fb.com>

Reviewers: Allan Lyu <fangmin@apache.org>, Andor Molnár <andor@apache.org>, Enrico Olivelli <eolivelli@gmail.com>, Michael Han <hanm@apache.org>

Closes #580 from jtuple/ZOOKEEPER-3098
Latest commit f4cbb68 Sep 18, 2018
admin ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
cli ZOOKEEPER-3073: fix couple of typos Jul 10, 2018
client ZOOKEEPER-2251: Add Client side packet response timeout to avoid infi… Jul 27, 2018
common ZOOKEEPER-3057: Fix IPv6 literal usage Jul 27, 2018
jmx ZOOKEEPER-3083: Remove some redundant and noisy log lines Jul 19, 2018
metrics ZOOKEEPER-3123: MetricsProvider Lifecycle in ZooKeeper Server Sep 11, 2018
server ZOOKEEPER-3098: Add additional server metrics Sep 17, 2018
util ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER-1045 implementation f… Nov 27, 2017
version/util ZOOKEEPER-3085: define exit codes in enum Aug 6, 2018
AsyncCallback.java ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
ClientCnxn.java ZOOKEEPER-1990: fix Random instances Sep 10, 2018
ClientCnxnSocket.java ZOOKEEPER-2940: Deal with maxbuffer as it relates to large requests f… Jul 11, 2018
ClientCnxnSocketNIO.java ZOOKEEPER-2139: Support multiple ZooKeeper client with different conf… May 2, 2016
ClientCnxnSocketNetty.java ZOOKEEPER-2949: using hostname and port to create SSLEngine Jan 30, 2018
ClientWatchManager.java ZOOKEEPER-1139. jenkins is reporting two warnings, fix these (phunt v… Jul 28, 2011
CreateMode.java ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
Environment.java ZOOKEEPER-2630: Use interface type instead of implementation type whe… Sep 11, 2017
JLineZNodeCompleter.java ZOOKEEPER-1408. CLI: sort output of ls command (Hartmut Lang via michim) Mar 29, 2014
KeeperException.java ZOOKEEPER-2251: Add Client side packet response timeout to avoid infi… Jul 27, 2018
Login.java ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER-1045 implementation f… Nov 27, 2017
MultiResponse.java ZOOKEEPER-1297. Add stat information to create() call (Lenni Kuff via… Dec 19, 2012
MultiTransactionRecord.java ZOOKEEPER-2169. Enable creation of nodes with TTLs. (Jordan Zimmerman… Oct 7, 2016
Op.java ZOOKEEPER-2169. Enable creation of nodes with TTLs. (Jordan Zimmerman… Oct 7, 2016
OpResult.java ZOOKEEPER-1297. Add stat information to create() call (Lenni Kuff via… Dec 19, 2012
Quotas.java ZOOKEEPER-231. Quotas in ZooKeeper. (mahadev) Feb 3, 2009
SaslClientCallbackHandler.java ZOOKEEPER-2935: [QP MutualAuth]: Port ZOOKEEPER-1045 implementation f… Nov 27, 2017
ServerAdminClient.java ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
Shell.java ZOOKEEPER-3085: define exit codes in enum Aug 6, 2018
StatsTrack.java ZOOKEEPER-231. Quotas in ZooKeeper. (mahadev) Feb 3, 2009
Testable.java ZOOKEEPER-1730. Make ZooKeeper easier to test - support simulating a … Apr 1, 2014
Transaction.java ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
Version.java ZOOKEEPER-3085: define exit codes in enum Aug 6, 2018
WatchDeregistration.java ZOOKEEPER-442. need a way to remove watches that are no longer of int… Jan 24, 2014
WatchedEvent.java ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
Watcher.java [ZOOKEEPER-2368] Send a watch event is when a client is closed Jun 20, 2018
ZKUtil.java ZOOKEEPER-1962: Add a CLI command to recursively list a znode and chi… Sep 8, 2016
ZooDefs.java ZOOKEEPER-2829: Interface usability / compatibility improvements thro… Jul 27, 2017
ZooKeeper.java ZOOKEEPER-2251: Add Client side packet response timeout to avoid infi… Jul 27, 2018
ZooKeeperMain.java ZOOKEEPER-3085: define exit codes in enum Aug 6, 2018
ZooKeeperTestable.java ZOOKEEPER-2069 Netty Support for ClientCnxnSocket (Hongchao via fpj) Dec 20, 2014