Basics

tony2001 edited this page Sep 18, 2014 · 3 revisions

How it works

The Pinba PHP extension creates a data packet on request shutdown and sends it to the configured Pinba server. The packet is sent over UDP, so no connection has to be established and the overhead for PHP is negligible. This also means that some packets may be lost, since UDP does not guarantee delivery, but that should not bother you much.
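
To show why a UDP send is cheap, here is a minimal userland sketch of a fire-and-forget datagram. It is not the extension's code (the extension is written in C and sends a real encoded data packet); `send_datagram()` and its payload are hypothetical names used only for illustration.

```php
<?php
// Hypothetical helper: sends one datagram and returns the number of bytes written.
// No handshake takes place and no reply is awaited, so a lost packet goes unnoticed.
function send_datagram(string $host, int $port, string $payload): int
{
    // For UDP this only prepares a local socket; nothing is sent yet.
    $socket = stream_socket_client("udp://$host:$port", $errno, $errstr, 1);
    if ($socket === false) {
        return 0;
    }
    $written = fwrite($socket, $payload);
    fclose($socket);
    return $written === false ? 0 : $written;
}
```

The caller never learns whether the server received the datagram, which is exactly the trade-off described above.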

The Pinba server listens on the configured port (see pinba.port), reads and decodes arriving data packets, and adds them to the temporary pool. The temporary pool exists to avoid locking the main pool too frequently, which might slow down queries. Periodically (see pinba.stats_gathering_period) Pinba locks the main pool of records and merges in the records from the temporary pool. At the same time it drops outdated records (see pinba.stats_history) and updates indexes and base reports. Tag reports, if any, are updated as well.

Both main and temporary pools are implemented as cyclic buffers of limited size, created on startup and never re-allocated, so newer records always overwrite older ones.
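
The two pools and the periodic merge can be pictured with a small sketch. This is only an illustration in PHP, assuming a simple array-backed ring; the real pools live inside the server, and locking is omitted here. `CyclicPool` and its methods are hypothetical names.

```php
<?php
// Illustrative fixed-size cyclic buffer; not Pinba's actual implementation.
final class CyclicPool
{
    private $slots;
    private $capacity;
    private $next = 0;   // slot that the next record will be written to
    private $count = 0;  // number of live records, never more than $capacity

    public function __construct(int $capacity)
    {
        if ($capacity < 1) {
            throw new InvalidArgumentException('capacity must be at least 1');
        }
        $this->capacity = $capacity;
        $this->slots = array_fill(0, $capacity, null); // allocated once, never resized
    }

    // Stores a record; once the pool is full the oldest record is overwritten.
    public function push($record): void
    {
        $this->slots[$this->next] = $record;
        $this->next = ($this->next + 1) % $this->capacity;
        $this->count = min($this->count + 1, $this->capacity);
    }

    // Returns the live records from oldest to newest.
    public function records(): array
    {
        $start = ($this->next - $this->count + $this->capacity) % $this->capacity;
        $out = [];
        for ($i = 0; $i < $this->count; $i++) {
            $out[] = $this->slots[($start + $i) % $this->capacity];
        }
        return $out;
    }

    // Moves every record into $main and empties this pool (the periodic merge).
    public function drainInto(CyclicPool $main): void
    {
        foreach ($this->records() as $record) {
            $main->push($record);
        }
        $this->next = 0;
        $this->count = 0;
    }
}
```

With a main pool of capacity 3, merging the records a, b and later c, d leaves b, c, d in the main pool: the oldest record is silently overwritten.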

User interface

Pinba is implemented as a MySQL storage engine that uses a separate thread for collecting data. The collected data is accessible through a set of read-only tables with a predefined structure. This allows us to use SQL and avoid re-inventing the wheel (i.e. our own language).

Pinba tables can be divided into two categories:

  • raw data tables
  • reports

Raw data tables contain raw request data (surprise!). Please bear in mind that access to the raw data is relatively slow: the number of records might reach millions and there are NO indexes except for primary keys, so almost any operation requires a full table scan.
Reports were created to speed up the most frequent operations by aggregating data on the fly as new request data arrives.
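
The idea behind reports, updating an aggregate as each record arrives instead of scanning raw data at query time, can be sketched as follows. `ScriptReport` and the column names `req_count` and `req_time_total` are assumptions made for this illustration, and the sketch ignores the removal of outdated records.

```php
<?php
// Hypothetical report keyed by script_name; column names are illustrative.
final class ScriptReport
{
    private $rows = [];

    // Called once per arriving request record; each update is constant-time.
    public function add(array $request): void
    {
        $key = $request['script_name'];
        if (!isset($this->rows[$key])) {
            $this->rows[$key] = ['req_count' => 0, 'req_time_total' => 0.0];
        }
        $this->rows[$key]['req_count']++;
        $this->rows[$key]['req_time_total'] += $request['request_time'];
    }

    // Reading a row needs no scan over the raw records.
    public function row(string $scriptName): ?array
    {
        return $this->rows[$scriptName] ?? null;
    }
}
```

Reading one row is then a single lookup, whereas answering the same question from the raw data would mean visiting every stored request.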

Data

Each PHP script sends the following data on request shutdown:

  • string hostname - gethostname() result
  • int request_count - number of requests served by this process
  • string server_name - $_SERVER["SERVER_NAME"]
  • string script_name - $_SERVER["SCRIPT_NAME"]
  • int document_size - size of the response body
  • int memory_peak - memory allocation peak
  • float request_time - time spent processing the request
  • float ru_utime - resource usage (user)
  • float ru_stime - resource usage (system)
  • array timers - array of timers (optional)
  • int status - HTTP status of the current request
  • int memory_footprint - the size of the process processing the request
  • string schema - HTTP or HTTPS (if possible to detect it), NULL by default
  • array tags - request tags, useful for filtering different kinds of requests
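
To make the field list more concrete, the sketch below assembles a subset of these values in userland with standard PHP functions. The extension gathers its data internally, so this is only an approximation; `build_request_record()` and the `'unknown'` fallbacks are assumptions made for the example.

```php
<?php
// Hypothetical userland helper; the extension gathers these values internally.
function build_request_record(array $server, array $tags = []): array
{
    $usage = getrusage();
    // getrusage() reports seconds and microseconds separately.
    $toSeconds = function (string $prefix) use ($usage): float {
        return $usage["$prefix.tv_sec"] + $usage["$prefix.tv_usec"] / 1e6;
    };
    $start = $server['REQUEST_TIME_FLOAT'] ?? microtime(true);

    return [
        'hostname'     => (string) gethostname(),
        'server_name'  => $server['SERVER_NAME'] ?? 'unknown', // fallback is an assumption
        'script_name'  => $server['SCRIPT_NAME'] ?? 'unknown', // fallback is an assumption
        'memory_peak'  => memory_get_peak_usage(),
        'request_time' => microtime(true) - $start,
        'ru_utime'     => $toSeconds('ru_utime'),
        'ru_stime'     => $toSeconds('ru_stime'),
        'tags'         => $tags,
    ];
}
```

Calling `build_request_record($_SERVER, array("type" => "api"))` at the end of a script yields an array shaped like part of the list above.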

Timers

Timers are a very important part of Pinba. In fact, they were the major reason to implement it.
A timer can be started and stopped any number of times. All timers are stopped on request shutdown if they were not stopped manually before that. A timer contains the following data:

  • float value - total time between timer start and stop
  • int hit_count - total number of timer starts
  • array of tags

The last element, the array of tags, needs a detailed explanation.
Tags are used to mark timers and are in some ways similar to object properties. Each tag has a name and a value, both strings. Timer tags are used to group timers in tag reports.
Two timers with identical sets of tags are merged on request shutdown, and the hit_count of the resulting timer is set to 2.
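
The merge can be sketched in a few lines. This is an illustration only: `merge_timers()` is a hypothetical function, and summing `value` across the merged timers is an assumption based on `value` being described as a total.

```php
<?php
// Hypothetical sketch of the merge; summing `value` is an assumption.
function merge_timers(array $timers): array
{
    $merged = [];
    foreach ($timers as $timer) {
        $tags = $timer['tags'];
        ksort($tags);              // the order in which tags were given must not matter
        $key = json_encode($tags); // identical tag sets produce identical keys
        if (!isset($merged[$key])) {
            $merged[$key] = ['value' => 0.0, 'hit_count' => 0, 'tags' => $tags];
        }
        $merged[$key]['value'] += $timer['value'];
        $merged[$key]['hit_count'] += $timer['hit_count'];
    }
    return array_values($merged);
}
```

Two timers tagged group=mysql, server=dbs2 collapse into one timer with hit_count 2, while a timer with any differing tag stays separate.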
Short example:

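// Start a timer tagged with the group, server and operation being measured.
$t = pinba_timer_start(array("group"=>"mysql", "server"=>"dbs2", "operation"=>"select"));
// mysqli_query() is used here because mysql_query() was removed in PHP 7.
$result = mysqli_query($connection, "SELECT ...");
pinba_timer_stop($t);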

This code creates a timer that you can later use in tag reports. With these tags you can get the following data from the appropriate tag reports:

  • how often mysql is used and how much time is spent on mysql operations in total
  • how often dbs2 server is queried and how much time it takes
  • how often SELECTs are performed (on dbs2 and in total) and how much time they take
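
What a tag report computes for the questions above can be sketched as a grouping of timers by a single tag. The real reports are tables maintained by the server; `tag_report()` and the column names `hit_count` and `timer_value` are hypothetical and used only for this illustration.

```php
<?php
// Hypothetical aggregation of timers by the value of one tag.
function tag_report(array $timers, string $tagName): array
{
    $report = [];
    foreach ($timers as $timer) {
        if (!isset($timer['tags'][$tagName])) {
            continue; // timers without this tag do not appear in the report
        }
        $tagValue = $timer['tags'][$tagName];
        if (!isset($report[$tagValue])) {
            $report[$tagValue] = ['hit_count' => 0, 'timer_value' => 0.0];
        }
        $report[$tagValue]['hit_count'] += $timer['hit_count'];
        $report[$tagValue]['timer_value'] += $timer['value'];
    }
    return $report;
}
```

Grouping by "group" answers how often mysql is used and for how long in total; grouping by "server" or "operation" answers the other two questions.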

Raw data tables also provide access to raw timer data, so you can check for any abnormalities yourself.

Timers are set by the user through the API provided by the Pinba extension.
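
One possible way to use that API is to wrap a piece of work in a timer. The helper below is a sketch, not part of Pinba: `with_pinba_timer()` is a hypothetical name, and only `pinba_timer_start()` and `pinba_timer_stop()` come from the extension. The function_exists() check lets the same code run where the extension is not installed.

```php
<?php
// Hypothetical wrapper: runs $work inside a Pinba timer when the extension is available.
function with_pinba_timer(array $tags, callable $work)
{
    $timer = function_exists('pinba_timer_start') ? pinba_timer_start($tags) : null;
    try {
        return $work();
    } finally {
        // Stop the timer even if $work throws, so the measured time stays meaningful.
        if ($timer !== null) {
            pinba_timer_stop($timer);
        }
    }
}
```

For example, `with_pinba_timer(array("group" => "cache"), function () { return 42; })` returns 42 and, with the extension loaded, records one timer hit.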