Fixing typos and formatting in the README
mattwarren committed Jun 27, 2014
1 parent 94efccb commit 0307890
Showing 1 changed file with 46 additions and 39 deletions.
85 changes: 46 additions & 39 deletions README.md
@@ -1,8 +1,8 @@
HdrHistogram: A High Dynamic Range (HDR) Histogram

This project currently includes Java, C, and C# implementations of
-HdrHistogram, all of which share common concenpts and data
-representation capabolities.
+HdrHistogram, all of which share common concepts and data
+representation capabilities.

Note: The below is an excerpt from a Histogram JavaDoc. While it
generally applies to C and C# as well, some details may vary by
@@ -13,15 +13,15 @@ library you intended to use.
HdrHistogram
----------------------------------------------

-HdrHistogram supports the recording and analyzing sampled data value
+HdrHistogram supports the recording and analyzing of sampled data value
counts across a configurable integer value range with configurable value
precision within the range. Value precision is expressed as the number of
significant digits in the value recording, and provides control over value
quantization behavior across the value range and the subsequent value
resolution at any given level.

For example, a Histogram could be configured to track the counts of
-observed integer values between 0 and 3,600,000,000 while maintaining a
+observed integer values between 0 and 3,600,000,000 while maintaining a
value precision of 3 significant digits across that range. Value
quantization within the range will thus be no larger than 1/1,000th
(or 0.1%) of any value. This example Histogram could be used to track and
@@ -38,7 +38,7 @@ Histogram form. IntHistogram and ShortHistogram, which track value counts in
int and short fields respectively, are provided for use cases where smaller
count ranges are practical and smaller overall storage is beneficial.

-HDR Histogram is designed for recoding histograms of value measurements in
+HdrHistogram is designed for recoding histograms of value measurements in
latency and performance sensitive applications. Measurements show value
recording times as low as 3-6 nanoseconds on modern (circa 2012) Intel CPUs.
AbstractHistogram maintains a fixed cost in both space and time. A
@@ -61,7 +61,7 @@ performing the same analysis directly on the potentially infinite series of
sourced data values samples.

Internally, AbstractHistogram data is maintained using a concept somewhat
-similar to that of floating point number representation: Using a an
+similar to that of floating point number representation: Using an
exponent and a (non-normalized) mantissa to support a wide dynamic range at a
high but varying (by exponent value) resolution. AbstractHistogram uses
exponentially increasing bucket value ranges (the parallel of the exponent
@@ -72,16 +72,16 @@ range and resolution are configurable, with highestTrackableValue
controlling dynamic range, and numberOfSignificantValueDigits controlling
resolution.

-An common use example of an HDR Histogram would be to record response times
+An common use example of HdrHistogram would be to record response times
in units of microseconds across a dynamic range stretching from 1 usec to
over an hour, with a good enough resolution to support later performing
-post-recording analysis on the collected data. Analysis can including
+post-recording analysis on the collected data. Analysis can include
computing, examining, and reporting of distribution by percentiles, linear
or logarithmic value buckets, mean and standard deviation, or by any other
means that can be easily added by using the various iteration techniques
supported by the Histogram.
In order to facilitate the accuracy needed for various post-recording
-analysis techniques, this example can maintain where a resolution of ~1 usec
+analysis techniques, this example can maintain a resolution of ~1 usec
or better for times ranging to ~2 msec in magnitude, while at the same time
maintaining a resolution of ~1 msec or better for times ranging to ~2 sec,
and a resolution of ~1 second or better for values up to 2,000 seconds.
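
The resolution figures in this example can be checked with a small, self-contained sketch. It assumes a simplified model of the bucket layout: with 3 significant digits, the first 2,048 values are tracked at single-unit resolution, and the unit doubles each time the value range doubles. The class and method names are hypothetical and are not part of the HdrHistogram API.

```java
// Illustrative model only; not the library's implementation.
final class ResolutionSketch {

    // Sub-bucket size: 2 * 10^digits, rounded up to a power of two.
    static long subBucketSize(int numberOfSignificantValueDigits) {
        long largestValueWithSingleUnitResolution = 2;
        for (int i = 0; i < numberOfSignificantValueDigits; i++) {
            largestValueWithSingleUnitResolution *= 10;
        }
        return Long.highestOneBit(largestValueWithSingleUnitResolution - 1) << 1;
    }

    // Size of the value step (in recorded units) around a non-negative value.
    static long resolutionAt(long value, int numberOfSignificantValueDigits) {
        long size = subBucketSize(numberOfSignificantValueDigits);
        if (value < size) {
            return 1; // the lowest range is tracked unit by unit
        }
        // Each doubling of the value range doubles the step size.
        return Long.highestOneBit(value) / (size / 2);
    }

    public static void main(String[] args) {
        long[] samplesInUsec = {1_000L, 2_000L, 2_000_000L, 2_000_000_000L};
        for (long v : samplesInUsec) {
            System.out.println(v + " usec -> step of " + resolutionAt(v, 3) + " usec");
        }
    }
}
```

Under this model, a 2 msec value resolves to 1 usec, a 2 sec value to 1,024 usec (about 1 msec), and a 2,000 sec value to 1,048,576 usec (about 1 second), which matches the figures quoted above.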
@@ -96,19 +96,20 @@ Histogram variants and internal representation
----------------------------------------------

The HdrHistogram package includes multiple implementations of the
-AbstractHistogram class:
-- Histogram, which is the commonly used Histogram form and tracks
+`AbstractHistogram` class:
+- `Histogram`, which is the commonly used Histogram form and tracks
value counts in long fields.
-- IntHistogram and ShortHistogram, which track value counts in int
+- `IntHistogram` and `ShortHistogram`, which track value counts in int
and short fields respectively, are provided for use cases where
smaller count ranges are practical and smaller overall storage
is beneficial (e.g. systems where tens of thousands of in-memory
histogram are being tracked).
-- AtomicHistogram and SynchronizedHistogram
+- `AtomicHistogram` and `SynchronizedHistogram` (see 'Synchronization
+and concurrent access' below)

Internally, data in HdrHistogram variants is maintained using a concept
-somewhat similar to that of floating point number representation: Using a
-an exponent a (non-normalized) mantissa to support a wide dynamic range at
+somewhat similar to that of floating point number representation: Using an
+exponent a (non-normalized) mantissa to support a wide dynamic range at
a high but varying (by exponent value) resolution. AbstractHistogram uses
exponentially increasing bucket value ranges (the parallel of the exponent
portion of a floating point number) with each bucket containing a fixed
@@ -140,72 +141,77 @@ Histograms supports multiple convenient forms of iterating through the
histogram data set, including linear, logarithmic, and percentile iteration
mechanisms, as well as means for iterating through each recorded value or
each possible value level. The iteration mechanisms are accessible through
-the HistogramData available through getHistogramData().
+the HistogramData available through `getHistogramData()`.
Iteration mechanisms all provide HistogramIterationValue data points along
the histogram's iterated data set, and are available for the default
(corrected) histogram data set via the following HistogramData methods:

-percentiles: An Iterable<HistogramIterationValue> through the histogram
+- `percentiles`: An `Iterable<HistogramIterationValue>` through the histogram
using a PercentileIterator
-linearBucketValues: An Iterable<HistogramIterationValue> through the
+- `linearBucketValues`: An `Iterable<HistogramIterationValue>` through the
histogram using a LinearIterator
-logarithmicBucketValues: An Iterable<HistogramIterationValue> through
+- `logarithmicBucketValues`: An `Iterable<HistogramIterationValue>` through
the histogram using a LogarithmicIterator
-recordedValues: An Iterable<HistogramIterationValue> through the
+- `recordedValues`: An `Iterable<HistogramIterationValue>` through the
histogram using a RecordedValuesIterator
-allValues: An Iterable<HistogramIterationValue> through the histogram
+- `allValues`: An `Iterable<HistogramIterationValue>` through the histogram
using an AllValuesIterator

Iteration is typically done with a for-each loop statement. E.g.:

``` java
for (HistogramIterationValue v :
histogram.getHistogramData().percentiles(ticksPerHalfDistance)) {
...
}
```

or

``` java
for (HistogramIterationValue v :
histogram.getRawHistogramData().linearBucketValues(unitsPerBucket)) {
...
}
```

The iterators associated with each iteration method are resettable, such
that a caller that would like to avoid allocating a new iterator object for
each iteration loop can re-use an iterator to repeatedly iterate through
the histogram. This iterator re-use usually takes the form of a traditional
-for loop using the Iterator's hasNext() and next() methods:
+for loop using the Iterator's `hasNext()` and `next()` methods.

-to avoid allocating a new iterator object for each iteration loop:
+So to avoid allocating a new iterator object for each iteration loop:

``` java
PercentileIterator iter =
-histogram.getHistogramData().percentiles().iterator(ticksPerHalfDistance<);
+histogram.getHistogramData().percentiles().iterator(ticksPerHalfDistance);
...
iter.reset(percentileTicksPerHalfDistance);
while (iter.hasNext()) {
HistogramIterationValue v = iter.next();
...
}

```

Equivalent Values and value ranges
----------------------------------------------

Due to the finite (and configurable) resolution of the histogram, multiple
adjacent integer data values can be "equivalent". Two values are considered
"equivalent" if samples recorded for both are always counted in a common
-total count due to the histogram's resolution level. Histogram provides
+total count due to the histogram's resolution level. HdrHistogram provides
methods for determining the lowest and highest equivalent values for any
-given value, as we as determining whether two values are equivalent, and
+given value, as well as determining whether two values are equivalent, and
for finding the next non-equivalent value for a given value (useful when
-looping through values, in order to avoid double-counting count).
+looping through values, in order to avoid a double-counting count).
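
As a rough illustration of value equivalence, the sketch below models a histogram with 3 significant digits, assuming that values below 2,048 each get their own count and that the size of an equivalent range doubles whenever the value range doubles. The class and its method names are hypothetical stand-ins for this explanation, not the library's own methods.

```java
// Illustrative model only; assumes 3 significant digits and non-negative values.
final class EquivalentValueSketch {
    static final long SUB_BUCKET_SIZE = 2048;

    // Number of adjacent integer values that share one count around `value`.
    static long rangeSizeAt(long value) {
        if (value < SUB_BUCKET_SIZE) {
            return 1;
        }
        return Long.highestOneBit(value) / (SUB_BUCKET_SIZE / 2);
    }

    static long lowestEquivalent(long value) {
        return value - (value % rangeSizeAt(value));
    }

    static long highestEquivalent(long value) {
        return lowestEquivalent(value) + rangeSizeAt(value) - 1;
    }

    static boolean equivalent(long a, long b) {
        return lowestEquivalent(a) == lowestEquivalent(b);
    }

    // First value that is counted separately from `value`.
    static long nextNonEquivalent(long value) {
        return highestEquivalent(value) + 1;
    }
}
```

In this model the values 5000 through 5003 are equivalent, so a loop that advances with `nextNonEquivalent` jumps from 5000 straight to 5004 and visits each count once.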
Corrected vs. Raw value recording calls
----------------------------------------------
In order to support a common use case needed when histogram values are used
to track response time distribution, Histogram provides for the recording
-of corrected histogram value by supporting a recordValueWithExpectedInterval()
+of corrected histogram value by supporting a `recordValueWithExpectedInterval()`
variant. This value recording form is useful in [common latency
measurement] scenarios where response times may exceed the expected interval
between issuing requests, leading to "dropped" response time measurements
@@ -260,16 +266,17 @@ fields and stats (which can be estimated as "fixed at well less than 1KB"),
the bulk of a Histogram's storage is taken up by its data value recording
counts array. The total footprint can be conservatively estimated by:
-largestValueWithSingleUnitResolution =
-2 * (10 ^ numberOfSignificantValueDigits);
-subBucketSize =
-roundedUpToNearestPowerOf2(largestValueWithSingleUnitResolution);
-
-expectedHistogramFootprintInBytes = 512 +
-({primitive type size} / 2) *
-(log2RoundedUp((highestTrackableValue) / subBucketSize) + 2) *
-subBucketSize
+``` java
+largestValueWithSingleUnitResolution =
+2 * (10 ^ numberOfSignificantValueDigits);
+subBucketSize =
+roundedUpToNearestPowerOf2(largestValueWithSingleUnitResolution);
+expectedHistogramFootprintInBytes = 512 +
+({primitive type size} / 2) *
+(log2RoundedUp((highestTrackableValue) / subBucketSize) + 2) *
+subBucketSize
+```
A conservative (high) estimate of a Histogram's footprint in bytes is
-available via the getEstimatedFootprintInBytes() method.
+available via the `getEstimatedFootprintInBytes()` method.
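
The estimate above is written as pseudocode (`^` there means exponentiation, whereas in Java it is XOR). A runnable sketch of the same arithmetic might look like the following. The class name, the helper implementations, and the choice of ceiling division for `highestTrackableValue / subBucketSize` are assumptions made for illustration; the library's own figure comes from `getEstimatedFootprintInBytes()`.

```java
// Worked version of the footprint estimate; not the library's implementation.
final class FootprintEstimate {

    // Smallest power of two that is >= v, for v >= 2.
    static long roundedUpToNearestPowerOf2(long v) {
        return Long.highestOneBit(v - 1) << 1;
    }

    // Ceiling of log2(v), for v >= 1.
    static int log2RoundedUp(long v) {
        return 64 - Long.numberOfLeadingZeros(v - 1);
    }

    static long estimateBytes(long highestTrackableValue,
                              int numberOfSignificantValueDigits,
                              int primitiveTypeSizeInBytes) {
        long largestValueWithSingleUnitResolution = 2;
        for (int i = 0; i < numberOfSignificantValueDigits; i++) {
            largestValueWithSingleUnitResolution *= 10;
        }
        long subBucketSize =
            roundedUpToNearestPowerOf2(largestValueWithSingleUnitResolution);
        // Assumption: round the ratio up so the estimate stays conservative.
        long ratio = (highestTrackableValue + subBucketSize - 1) / subBucketSize;
        // Multiply before halving so the division never truncates.
        return 512 + (primitiveTypeSizeInBytes
            * (log2RoundedUp(ratio) + 2L) * subBucketSize) / 2;
    }

    public static void main(String[] args) {
        // long counts, values up to 3,600,000,000, 3 significant digits
        System.out.println(estimateBytes(3_600_000_000L, 3, 8)); // 188928
    }
}
```

For the 3,600,000,000 / 3-digit configuration this gives a sub-bucket size of 2,048 and 21 + 2 = 23 bucket ranges, so long counts come to about 185 KB and int counts to about half of that.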
