EP Stats

1 Getting Started

For introductory information on stats within membase, start with the membase wiki stats page.

2 Stats Definitions

2.1 Toplevel Stats

Stat	Description
ep_version	Version number of ep_engine.
ep_storage_age	Seconds since most recently
	stored object was initially queued.
ep_storage_age_highwat	ep_storage_age high water mark
ep_min_data_age	Minimum data age setting.
ep_queue_age_cap	Queue age cap setting.
ep_max_txn_size	Max number of updates per transaction.
ep_data_age	Seconds since most recently
	stored object was modified.
ep_data_age_highwat	ep_data_age high water mark
ep_too_young	Number of times an object was
	not stored due to being too young.
ep_too_old	Number of times an object was
	stored after being dirty too long.
ep_total_enqueued	Total number of items queued for
	persistence.
ep_total_new_items	Total number of persisted new items.
ep_total_del_items	Total number of persisted deletions.
ep_total_persisted	Total number of items persisted.
ep_item_flush_failed	Number of times an item failed to flush
	due to storage errors.
ep_item_commit_failed	Number of times a transaction failed to
	commit due to storage errors.
ep_item_begin_failed	Number of times a transaction failed to
	start due to storage errors.
ep_expired	Number of times an item was expired.
ep_item_flush_expired	Number of times an item is not flushed
	due to the expiry of the item
ep_queue_size	Number of items queued for storage.
ep_flusher_todo	Number of items remaining to be written.
ep_flusher_state	Current state of the flusher thread.
ep_commit_num	Total number of write commits.
ep_commit_time	Number of seconds of most recent commit.
ep_commit_time_total	Cumulative seconds spent committing.
ep_vbucket_del	Number of vbucket deletion events.
ep_vbucket_del_fail	Number of failed vbucket deletion events.
ep_vbucket_del_max_walltime	Max wall time (µs) spent by deleting
	a vbucket
ep_vbucket_del_total_walltime	Total wall time (µs) spent by deleting
	vbuckets
ep_vbucket_del_avg_walltime	Avg wall time (µs) spent by deleting
	a vbucket
ep_flush_preempts	Num of flush early exits for read reqs.
ep_flush_duration	Number of seconds of most recent flush.
ep_flush_duration_total	Cumulative seconds spent flushing.
ep_flush_duration_highwat	ep_flush_duration high water mark.
curr_items	Num items in active vbuckets.
curr_items_tot	Num current items including those not
	active (replica, dead and pending states)
ep_kv_size	Memory used to store keys and values.
ep_overhead	Extra memory used by rep queues, etc..
ep_max_data_size	Max amount of data allowed in memory.
ep_mem_low_wat	Low water mark for auto-evictions.
ep_mem_high_wat	High water mark for auto-evictions.
ep_total_cache_size	The total size of all items in the cache
ep_oom_errors	Number of times unrecoverable OOMs
	happened while processing operations
ep_tmp_oom_errors	Number of times temporary OOMs
	happened while processing operations
ep_bg_fetched	Number of items fetched from disk.
ep_tap_bg_fetched	Number of tap disk fetches
ep_tap_bg_fetch_requeued	Number of times a tap bg fetch task is
	requeued.
ep_num_pager_runs	Number of times we ran pager loops
	to seek additional memory.
ep_num_expiry_pager_runs	Number of times we ran expiry pager loops
	to purge expired items from memory/disk
ep_num_value_ejects	Number of times item values got ejected
	from memory to disk
ep_num_eject_replicas	Number of times replica item values got
	ejected from memory to disk
ep_num_eject_failures	Number of items that could not be ejected
ep_num_not_my_vbuckets	Number of times Not My VBucket exception
	happened during runtime
ep_warmup_thread	Warmup thread status.
ep_warmed_up	Number of items warmed up.
ep_warmup_dups	Duplicates encountered during warmup.
ep_warmup_oom	OOMs encountered during warmup.
ep_warmup_time	Time (µs) spent by warming data.
ep_tap_keepalive	Tap keepalive time.
ep_dbname	DB path.
ep_dbinit	Number of seconds to initialize DB.
ep_dbshards	Number of shards for db store
ep_db_strategy	SQLite db strategy
ep_warmup	true if warmup is enabled.
ep_io_num_read	Number of io read operations
ep_io_num_write	Number of io write operations
ep_io_read_bytes	Number of bytes read (key + values)
ep_io_write_bytes	Number of bytes written (key + values)
ep_pending_ops	Number of ops awaiting pending vbuckets
ep_pending_ops_total	Total blocked pending ops since reset
ep_pending_ops_max	Max ops seen awaiting 1 pending vbucket
ep_pending_ops_max_duration	Max time (µs) used waiting on pending
	vbuckets
ep_bg_num_samples	The number of samples included in the avg
ep_bg_min_wait	The shortest time (µs) in the wait queue
ep_bg_max_wait	The longest time (µs) in the wait queue
ep_bg_wait_avg	The average wait time (µs) for an item
	before it is serviced by the dispatcher
ep_bg_min_load	The shortest load time (µs)
ep_bg_max_load	The longest load time (µs)
ep_bg_load_avg	The average time (µs) for an item to be
	loaded from the persistence layer
ep_num_non_resident	The number of non-resident items
ep_num_active_non_resident	Number of non-resident items in active
	vbuckets.
ep_store_max_concurrency	Maximum allowed concurrency at the storage
	layer.
ep_store_max_readers	Maximum number of concurrent read-only.
	storage threads.
ep_store_max_readwrite	Maximum number of concurrent read/write
	storage threads.
ep_db_cleaner_status	Status of database cleaner that cleans up
	invalid items with old vbucket versions

2.2 Tap stats

ep_tap_total_queue	Sum of tap queue sizes on the current
	tap queues
ep_tap_total_fetched	Sum of all tap messages sent
ep_tap_bg_max_pending	The maximum number of bg jobs a tap
	connection may have
ep_tap_bg_fetched	Number of tap disk fetches
ep_tap_bg_fetch_requeued	Number of times a tap bg fetch task is
	requeued.
ep_tap_fg_fetched	Number of tap memory fetches
ep_tap_deletes	Number of tap deletion messages sent
ep_tap_throttled	Number of tap messages refused due to
	throttling.
ep_tap_keepalive	How long to keep tap connection state
	after client disconnect.
ep_tap_count	Number of tap connections.
ep_tap_bg_num_samples	The number of tap bg fetch samples
	included in the avg
ep_tap_bg_min_wait	The shortest time (µs) for a tap item
	before it is serviced by the dispatcher
ep_tap_bg_max_wait	The longest time (µs) for a tap item
	before it is serviced by the dispatcher
ep_tap_bg_wait_avg	The average wait time (µs) for a tap item
	before it is serviced by the dispatcher
ep_tap_bg_min_load	The shortest time (µs) for a tap item to
	be loaded from the persistence layer
ep_tap_bg_max_load	The longest time (µs) for a tap item to
	be loaded from the persistence layer
ep_tap_bg_load_avg	The average time (µs) for a tap item to
	be loaded from the persistence layer
ep_tap_noop_interval	The number of secs between a noop is added
	to an idle connection
ep_tap_backoff_period	The number of seconds the tap connection
	should back off after receiving ETMPFAIL

2.2.1 Per Tap Client Stats

Each stat begins with ep_tapq: followed by a unique client_id and another colon. For example, if your client is named, slave1, the qlen stat would be ep_tapq:slave1:qlen.

qlen	Queue size for the given client_id.
qlen_high_pri	High priority tap queue items.
qlen_low_pri	Low priority tap queue items.
vb_filters	Size of connection vbucket filter set.
vb_filter	The content of the vbucket filter
rec_fetched	Tap messages sent to the client.
rec_skipped	Number of messages skipped due to
	tap reconnect with a different filter
idle	True if this connection is idle.
empty	True if this connection has no items.
complete	True if backfill is complete.
has_item	True when there is a bg fetched item
	ready.
has_queued_item	True when there is a key ready to be
	looked up (may become fg or bg item)
bg_wait_for_result	True if the max number of background
	operations is started
bg_queue_size	Number of bg fetches enqueued for this
	connection.
bg_queued	Number of background fetches enqueued.
bg_result_size	Number of ready background results.
bg_results	Number of background results ready.
bg_jobs_issued	Number of background jobs started.
bg_jobs_completed	Number of background jobs completed.
bg_backlog_size	Number of items pending bg fetch.
flags	Connection flags set by the client.
connected	true if this client is connected
pending_disconnect	true if we’re hanging up on this client
paused	true if this client is blocked
pending_backfill	true if we’re still backfilling keys
	for this connection
pending_disk_backfill	true if we’re still backfilling keys
	from disk for this connection
reconnects	Number of reconnects from this client.
disconnects	Number of disconnects from this client.
backfill_age	The age of the start of the backfill.
ack_seqno	The current tap ACK sequence number.
recv_ack_seqno	Last receive tap ACK sequence number.
ack_log_size	Tap ACK backlog size.
ack_window_full	true if our tap ACK window is full.
expires	When this ACK backlog expires.
num_tap_nack	The number of negative tap acks received
num_tap_tmpfail_survivors	The number of items rescheduled due to
	a temporary nack.

2.3 Timing Stats

Timing stats provide histogram data from high resolution timers over various operations within the system.

2.3.1 General Form

As this data is multi-dimensional, some parsing may be required for machine processing. It’s somewhat human readable, but the stats script mentioned in the Getting Started section above will do fancier formatting for you.

Consider the following sample stats:

STAT disk_insert_8,16 9488
STAT disk_insert_16,32 290
STAT disk_insert_32,64 73
STAT disk_insert_64,128 86
STAT disk_insert_128,256 48
STAT disk_insert_256,512 2
STAT disk_insert_512,1024 12
STAT disk_insert_1024,2048 1

This tells you that disk_insert took 8-16µs 9,488 times, 16-32µs 290 times, and so on.

The same stats displayed through the stats CLI tool would look like this:

disk_insert (10008 total)
   8us - 16us    : ( 94.80%) 9488 ###########################################
   16us - 32us   : ( 97.70%)  290 #
   32us - 64us   : ( 98.43%)   73
   64us - 128us  : ( 99.29%)   86
   128us - 256us : ( 99.77%)   48
   256us - 512us : ( 99.79%)    2
   512us - 1ms   : ( 99.91%)   12
   1ms - 2ms     : ( 99.92%)    1

2.3.2 Available Stats

The following histograms are available from “timings” in the above form to describe when time was spent doing various things:

bg_wait	bg fetches waiting in the dispatcher queue
bg_load	bg fetches waiting for disk
bg_tap_wait	tap bg fetches waiting in the dispatcher queue
bg_tap_laod	tap bg fetches waiting for disk
pending_ops	client connections blocked for operations
	in pending vbuckets.
storage_age	Analogous to ep_storage_age in main stats.
data_age	Analogous to ep_data_age in main stats.
get_cmd	servicing get requests
store_cmd	servicing store requests
arith_cmd	servicing incr/decr requests
get_vb_cmd	servicing vbucket status requests
set_vb_cmd	servicing vbucket set state commands
del_vb_cmd	servicing vbucket deletion commands
tap_vb_set	servicing tap vbucket set state commands
tap_mutation	servicing tap mutations
notify_io	waking blocked connections
disk_insert	waiting for disk to store a new item
disk_update	waiting for disk to modify an existing item
disk_del	waiting for disk to delete an item
disk_vb_del	waiting for disk to delete a vbucket
disk_vb_chunk_del	waiting for disk to delete a vbucket chunk
disk_commit	waiting for a commit after a batch of updates
disk_invalid_item_del	Waiting for disk to delete a chunk of invalid
	items with the old vbucket version

2.4 Hash Stats

Hash stats provide information on your per-vbucket hash tables.

Requesting these stats does affect performance, so don’t do it too regularly, but it’s useful for debugging certain types of performance issues. For example, if your hash table is tuned to have too few buckets for the data load within it, the max_depth will be too large and performance will suffer.

Each stat is prefixed with vb_ followed by a number, a colon, then the individual stat name.

For example, the stat representing the size of the hash table for vbucket 0 is vb_0:size.

state	The current state of this vbucket
size	Number of hash buckets
locks	Number of locks covering hash table operations
min_depth	Minimum number of items found in a bucket
max_depth	Maximum number of items found in a bucket
reported	Number of items this hash table reports having
counted	Number of items found while walking the table
resized	Number of times the hash table resized.
mem_size	Running sum of memory used by each item.
mem_size_counted	Counted sum of current memory used by each item.

3 Details

3.1 Ages

The difference between ep_storage_age and ep_data_age is somewhat subtle, but when you consider that a given record may be updated multiple times before hitting persistence, it starts to be clearer.

ep_data_age is how old the data we actually wrote is.

ep_storage_age is how long the object has been waiting to be persisted.

3.2 Too Young

ep_too_young is incremented every time an object is encountered whose data age is more recent than is allowable for the persistence layer.

For example, if an object that was queued five minutes ago is picked off the todo queue and found to have been updated fifteen seconds ago, it will not be stored, ep_too_young will be incremented, and the key will go back on the input queue.

3.3 Too Old

ep_too_old is incremented every time an object is encountered whose queue age exceeds the ep_queue_age_cap setting.

ep_queue_age_cap generally exists as a safety net to prevent the ep_min_data_age setting from preventing persistence altogether.

3.4 Warming Up

Opening the data store is broken into three distinct phases:

3.4.1 Initializing

During the initialization phase, the server is not accepting connections or otherwise functional. This is often quick, but in a server crash can take some time to perform recovery of the underlying storage.

This time is made available via the ep_dbinit stat.

3.4.2 Warming Up

After initialization, warmup begins. At this point, the server is capable of taking new writes and responding to reads. However, only records that have been pulled out of the storage or have been updated from other clients will be available for request.

(note that records read from persistence will not overwrite new records captured from the network)

During this phase, ep_warmup_thread will report running and ep_warmed_up will be increasing as records are being read.

3.4.3 Complete

Once complete, ep_warmed_up will stop increasing and ep_warmup_thread will report complete.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stats.org

stats.org

EP Stats

1 Getting Started

2 Stats Definitions

2.1 Toplevel Stats

2.2 Tap stats

2.2.1 Per Tap Client Stats

2.3 Timing Stats

2.3.1 General Form

2.3.2 Available Stats

2.4 Hash Stats

3 Details

3.1 Ages

3.2 Too Young

3.3 Too Old

3.4 Warming Up

3.4.1 Initializing

3.4.2 Warming Up

3.4.3 Complete

Files

stats.org

Latest commit

History

stats.org

File metadata and controls

EP Stats

1 Getting Started

2 Stats Definitions

2.1 Toplevel Stats

2.2 Tap stats

2.2.1 Per Tap Client Stats

2.3 Timing Stats

2.3.1 General Form

2.3.2 Available Stats

2.4 Hash Stats

3 Details

3.1 Ages

3.2 Too Young

3.3 Too Old

3.4 Warming Up

3.4.1 Initializing

3.4.2 Warming Up

3.4.3 Complete