Summary from USENIX/LISA13 Metrics Workshop
brendangregg committed May 15, 2014
1 parent f30bbb1 commit e815ab3
Showing 8 changed files with 132 additions and 0 deletions.
13 changes: 13 additions & 0 deletions appservers/README.md
@@ -0,0 +1,13 @@
# Application Servers

* Total requests served, rate
* Latency:
    * time to serve a client
    * complete a client transaction
    * request queue time
* App error rate
* Error counts on backend H/W
* Bandwidth usage front and backend
* System load on primary application server: CPU, memory, disk, swapping
* Usage patterns:
    * which user, client time, session time, active vs idle time
24 changes: 24 additions & 0 deletions config/README.md
@@ -0,0 +1,24 @@
# Configuration

* Apps should export flags, to check for consistency
    * metadata to show the target configuration
* Versioning:
    * ldd, libraries linked against
    * time a config was applied
* Platform Type:
    * server H/W
* Cost of Configuration
    * cost of configuration upload/download
    * time to deployment: security changes (high priority), vs others
    * CPU and RAM usage during configuration
* People
    * deployment report
* Hardware
    * current hardware
    * max expected performance
* Process
    * compliance measurement of configuration: percent of systems
* Failure
    * failure of configuration deployment
    * rollbacks, rollforward: config metric didn't apply
* OS flags
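
The "compliance measurement of configuration: percent of systems" item above could be computed roughly as follows. This is a minimal sketch, not part of the workshop summary: the function name `compliance_percent`, the host names, and the version strings are all hypothetical, and it assumes each system reports the config version it last applied.

```python
def compliance_percent(applied_versions, target_version):
    """Percent of systems whose applied config version equals the target.

    applied_versions: dict mapping host name -> applied config version
    (hypothetical input shape, assumed for illustration).
    """
    if not applied_versions:
        return 0.0
    compliant = sum(1 for v in applied_versions.values() if v == target_version)
    return 100.0 * compliant / len(applied_versions)


# Hypothetical fleet: three of four hosts run the target version.
fleet = {"web1": "v42", "web2": "v42", "db1": "v41", "db2": "v42"}
print(compliance_percent(fleet, "v42"))
```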
22 changes: 22 additions & 0 deletions databases/README.md
@@ -0,0 +1,22 @@
# Databases

* Queries/sec
* # of connections
* connections/sec
* avg time per query
* cache hit rate
* avg io latency
* aggregate io
* % of query time in io
* # of locks
* # of versions (for read consistency)
* terminated connects
* SQL statements
* cache evictions
* query errors by type
* saturation: plan to execute
* queueing on pool
* change in number of executed plans
* latency of last checkpoint, and on-disk representation of the WAL (write-ahead log)
* (how much of the DB to replay)
* checkpoint times
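
Several of the items above (cache hit rate, % of query time in io, connections/sec) are ratios or rates derived from raw counters. A minimal sketch of those derivations, assuming the database exposes cumulative counters that are sampled at a fixed interval; the function and parameter names are hypothetical and do not refer to any particular database's API:

```python
def cache_hit_rate(hits, misses):
    """Fraction of cache lookups served from cache (0.0 when there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


def io_time_percent(io_seconds, query_seconds):
    """Percent of total query time that was spent waiting on I/O."""
    return 100.0 * io_seconds / query_seconds if query_seconds else 0.0


def per_second(prev_count, curr_count, interval_seconds):
    """Rate from two samples of a cumulative counter (e.g. connections/sec)."""
    return (curr_count - prev_count) / interval_seconds
```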
14 changes: 14 additions & 0 deletions distributed/README.md
@@ -0,0 +1,14 @@
# Distributed Systems

* Perceived latency: service time and queueing
* Request rate
* Error rate
* Traffic origins
* Histogram of latencies for each server, for comparisons
* Visualizations:
    * heatmaps
        * for service
        * per server
        * per backend
    * system 'flame graph'
    * visualize traffic as graph, queue time, request flow
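
The "histogram of latencies for each server" item could be built along these lines. A minimal sketch, assuming latency samples arrive as `(server, latency_ms)` pairs and fixed-width buckets are acceptable; the function name, input shape, and default bucket width are all assumptions for illustration:

```python
from collections import Counter, defaultdict


def latency_histograms(samples, bucket_ms=10):
    """Group latency samples into fixed-width buckets, one histogram per server.

    samples: iterable of (server, latency_ms) pairs (assumed input shape).
    Returns {server: {bucket_start_ms: count}}.
    """
    hist = defaultdict(Counter)
    for server, latency_ms in samples:
        # Round down to the start of the bucket this sample falls into.
        bucket = int(latency_ms // bucket_ms) * bucket_ms
        hist[server][bucket] += 1
    return {server: dict(counts) for server, counts in hist.items()}
```

Keeping one histogram per server, with identical bucket boundaries, is what makes the servers directly comparable; the same per-bucket counts sampled over time are the input a latency heatmap needs.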
12 changes: 12 additions & 0 deletions messaging/README.md
@@ -0,0 +1,12 @@
# Message Queueing

* Distribution of message latency (ns)
* Throughput
* Total number of ns
* Errors, drop, retransmits, discards
* Message fanout distribution (gain: ratio of input to output)
* For distributed message queues: see distributed systems
* Queue lengths
* Saturation: run out of space
* Resource constraints on queueing systems
* Last time of access
26 changes: 26 additions & 0 deletions network/README.md
@@ -0,0 +1,26 @@
# Network Infrastructure

* Physical Infrastructure
    * bandwidth, utilization of individual links
    * CoS/QoS rate/drops
    * L2/L3 protocol health
    * churn
    * reachability
* Per port:
    * packets/sec
    * packet size
    * buffer utilization
    * perf flow into:
        * app injection BW
        * app injection rate
        * app consumption rate
        * app consumption BW
* Component:
    * links
    * errors
    * latency
    * utilization
* Topology:
    * app to app latency
    * app to app low
    * symmetry
7 changes: 7 additions & 0 deletions resources/README.md
@@ -0,0 +1,7 @@
# Resources/Devices

* Utilization
    * per-device: eg, as a heat map for distribution over time
* Saturation
    * average queue length, or time waiting on queue
* Errors
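
Utilization and saturation as listed above can be derived from simple samples. A minimal sketch, assuming per-device busy time is available for each sampling interval and queue length is sampled periodically; both function names are hypothetical:

```python
def utilization_percent(busy_seconds, interval_seconds):
    """Percent of the interval during which the device was busy."""
    return 100.0 * busy_seconds / interval_seconds


def average_queue_length(queue_samples):
    """Mean of periodically sampled queue lengths, as a saturation measure."""
    if not queue_samples:
        return 0.0
    return sum(queue_samples) / len(queue_samples)
```

Computing `utilization_percent` per device and per interval, rather than averaging across devices, keeps the data in the shape needed for the per-device heat map mentioned above.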
14 changes: 14 additions & 0 deletions webservers/README.md
@@ -0,0 +1,14 @@
# Web Servers

* Requests: referrer, origin, UA, resp code, count
    * origin
    * response code
* Req size: distribution
* Response Size: resp code, distribution
* Response Count: resp code, counter
* Time To First Byte: resp code, distribution
* Time To Last Byte: resp code, distribution
* Active Workers: gauge
* Worker Age: gauge
* Connections: counter
* Process Metrics from host
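
The list above names three metric types: counters (response count), gauges (active workers), and distributions (time to first byte), several of them keyed by response code. A minimal sketch of how they differ, using a hypothetical `WebMetrics` class that is not tied to any real metrics library; it keeps raw samples for the distribution and uses a nearest-rank percentile, both of which are simplifying assumptions:

```python
import math
from collections import defaultdict


class WebMetrics:
    """Hypothetical in-memory holder for the three metric types."""

    def __init__(self):
        self.response_count = defaultdict(int)  # counter, per response code
        self.ttfb_ms = defaultdict(list)        # distribution, per response code
        self.active_workers = 0                 # gauge

    def record_response(self, code, ttfb_ms):
        # A counter only ever increases; a distribution keeps the samples.
        self.response_count[code] += 1
        self.ttfb_ms[code].append(ttfb_ms)

    def set_active_workers(self, n):
        # A gauge is overwritten with the current value.
        self.active_workers = n

    def ttfb_percentile(self, code, pct):
        """Nearest-rank percentile of time to first byte for one response code."""
        values = sorted(self.ttfb_ms[code])
        if not values:
            return None
        rank = max(1, math.ceil(pct / 100 * len(values)))
        return values[rank - 1]
```

Keying the distribution by response code keeps fast error responses from skewing the latency of successful requests.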
