Jon Watte edited this page Sep 24, 2012 · 1 revision

Home - Glossary

A compendium of terms and jargon used in istatd.

  • aggregation - The process of collecting multiple samples into a single quantized value. Buckets are one example of aggregation. Counter aggregate stat files are another example: they aggregate the buckets of collated stat files. Reduction is also aggregation, of raw stat file data into synthetic buckets.
  • bucket - A collection of samples aggregated together over some interval. It contains a running sum, a sum of squares, a minimum value, a maximum value, and the number of samples in the bucket. This data can then be used to calculate the average and standard deviation when requested.
  • coarse resolution - Describes stat files with large intervals and, accordingly, a lower level of aggregated detail. Contrast with fine resolution.
  • collation - The process of counting discrete event occurrences and measuring them in events/sec.
  • collated bucket - A bucket that is exactly one collation interval wide. It is contained in the collation window until it is shifted out to a collated stat file. These buckets are special in that they contain exactly 0 or 1 sample. Their sum, min, and max are forced to be the running average in events/sec, so their sum of squares is exactly equal to their sum squared (because there is exactly one sample).
  • collated counter - A stat counter that measures discrete events using the process of collation and reports its data as a rate in events/sec. The collation window determines the time range in which data can be recorded.
  • collation interval - The finest interval in a collated counter, and the period over which collation occurs.
  • collation shift - The process of moving the sliding window of collated buckets ahead. A shift aggregates the oldest value into the coarser-resolution "counter aggregate" stat files, updates the stat file with all buckets in the window, shifts each element in the window down by one (bumping the oldest element out of the window), and creates a newly initialized collated bucket at the end of the sliding window, exactly one collation interval ahead of the newest sample in the window.
  • collated stat file - A stat file that contains collated buckets.
  • collation window - A sliding window that contains a fixed buffer of collated buckets. If a sample is recorded before the oldest bucket in the window, the sample is rejected. If time is advanced, or a sample is recorded after the latest bucket in the window, collation shifts must occur until that sample falls within the window.
  • counter - May refer to a general "stat counter", or a "collated counter" which is a specific kind of stat counter.
  • counter aggregate stat file - An aggregate of collated counter buckets at a coarser resolution.
  • fine resolution - Describes stat files with small intervals and, accordingly, a higher level of precise detail. Contrast with coarse resolution.
  • flush - The process by which pending modifications to StatFile pages in memory are written to disk. This is done when pages fall out of the LRU cache, periodically through timer triggers, and also through other means.
  • gauge - A stat counter that reports periodic samples of continuous metrics into quantized buckets.
  • interval - A span of time over which each bucket aggregates individual samples. Intervals are specified in the retention policy, which is used when creating stat files and their respective buckets.
  • page - A boundary-aligned block in a stat file, which contains several buckets. Pages are used to improve read/write performance.
  • reduction - The process by which data samples are synthesized from raw data sets into simplified sets of data with a coarser resolution and fewer samples.
  • resolution - Sometimes an overloaded term used to mean interval (e.g. "ten second resolution"). When used with "coarser"/"finer", refers to the level of detail used for bucket aggregation relative to other stat files for the same counter.
  • retention policy - Defines the intervals used for new stat counters, as well as the timespan to keep old data at each interval before it will be overwritten by newer recordings. This affects bucket allocation for each stat file.
  • sample - An individual tuple of statistics recorded to istatd. Samples in the same interval get aggregated into the same bucket.
  • stat counter - A named metric which contains multiple stat files, each at a different interval defined in the retention policy. A stat counter may be either a counter or a gauge.
  • stat file - A binary file format used by istatd, consisting of a paged, fixed-size, circular file of buckets. StatFile is also a class layer that is responsible for managing the stat file format, reading/updating buckets in an LRU cache of pages, and flushing pages to disk.
  • synthetic bucket - A bucket that is created when a stat counter is selected and data needs to be reduced in complexity by creating temporary buckets of coarser resolution than the raw results. These buckets do not exist in a stat file, and do not affect anything other than presentation of data.
  • trailing stat file - Used for StatFiles with seasons/exponential moving average. Currently a work-in-progress feature.
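
The bucket, aggregation, and reduction entries above can be illustrated with a minimal Python sketch. This is not istatd's code: the `Bucket` class, its field names, and the helpers `bucket_start`, `aggregate`, and `reduce_buckets` are hypothetical. It also assumes that buckets are aligned to multiples of the interval and that the standard deviation is the population form, neither of which this glossary specifies.

```python
import math
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical in-memory bucket; names are illustrative, not istatd's."""
    count: int = 0                 # number of samples in the bucket
    total: float = 0.0             # running sum of sample values
    total_sq: float = 0.0          # running sum of squared sample values
    minimum: float = math.inf
    maximum: float = -math.inf

    def add(self, value: float) -> None:
        """Aggregate one sample into this bucket."""
        self.count += 1
        self.total += value
        self.total_sq += value * value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "Bucket") -> None:
        """Fold another bucket into this one (used for reduction)."""
        self.count += other.count
        self.total += other.total
        self.total_sq += other.total_sq
        self.minimum = min(self.minimum, other.minimum)
        self.maximum = max(self.maximum, other.maximum)

    def average(self) -> float:
        return self.total / self.count if self.count else 0.0

    def stddev(self) -> float:
        """Population standard deviation (an assumption of this sketch)."""
        if self.count == 0:
            return 0.0
        mean = self.average()
        # Clamp at zero to absorb floating-point rounding error.
        variance = max(self.total_sq / self.count - mean * mean, 0.0)
        return math.sqrt(variance)


def bucket_start(timestamp: int, interval: int) -> int:
    """Start time of the bucket holding `timestamp` (alignment is assumed)."""
    return timestamp - timestamp % interval


def aggregate(samples, interval: int) -> dict:
    """Group (timestamp, value) samples into buckets of width `interval`."""
    buckets = {}
    for timestamp, value in samples:
        start = bucket_start(timestamp, interval)
        buckets.setdefault(start, Bucket()).add(value)
    return buckets


def reduce_buckets(buckets: dict, coarse_interval: int) -> dict:
    """Merge fine buckets into temporary buckets of a coarser interval."""
    coarse = {}
    for start, bucket in buckets.items():
        key = bucket_start(start, coarse_interval)
        coarse.setdefault(key, Bucket()).merge(bucket)
    return coarse
```

For example, `aggregate([(0, 2.0), (3, 4.0), (12, 6.0)], 10)` puts the first two samples in the bucket starting at 0 and the third in the bucket starting at 10, and `reduce_buckets(..., 60)` then merges both into a single coarser bucket. Storing the sum and sum of squares rather than the samples themselves is what lets buckets be merged without revisiting raw data. A bucket holding a single sample has a sum of squares equal to its sum squared, which matches the property described for collated buckets.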