Skip to content
An n-dimensional histogramming library
C Perl XS Perl6
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


    Math::Histogram - N-dimensional histogramming library

      use Math::Histogram;
      my @dimensions = (
        Math::Histogram::Axis->new(10, 0., 1.), # x: 10 bins between 0 and 1
        Math::Histogram::Axis->new([1, 2, 4, 8, 16]), # y: 5 bins of variable size
        Math::Histogram::Axis->new(2, -1., 1.), # z: 2 bins: [-1, 0) and [0, 1)
      my $hist = Math::Histogram->new(\@dimensions);
      # FIXME cover make_histogram here, too
      # Fill some primitive data
      while (<>) {
        my @cols = split /\s+/, $_;
        die "Invalid number of columns: " . scalar(@cols)
          if @cols != 3;
        # Insert new datum into histogram
      # Dump histogram content to screen (excluding overflow)
      for my $iz (1 .. $hist->get_axis(2)->nbins) {
        for my $iy (1 .. $hist->get_axis(1)->nbins) {
          for my $ix (1 .. $hist->get_axis(0)->nbins) {
            print $hist->get_bin_content([$ix, $iy, $iz]), " ";
          print "\n";
        print "\n";

    This Perl module wraps an n-dimensional histogramming library written in

    Beware, this is an early release. While the basic functionality is
    rather well tested, the library has not been used in production. If you
    intend to adopt it for production, please test your application well and
    get in touch with the author.

  On N-Dimensional Histogramming
    If all you are looking for is a regular one dimensional histogram, then
    consider other libraries such as Math::SimpleHisto::XS first for
    simplicity and performance. Some care has been taken to optimize the
    library for performance given a variable number of dimensions, but not
    knowing the number of dimensions statically makes for both somewhat
    inefficient algorithmic implementation as well as occasionally awkward
    APIs. For example, simply iterating through all bins of a 2D histogram
    -- a matrix -- is as simple as

      # Pseudo-code
      foreach my $ix (0..$nx-1) {
        foreach my $iy (0..$ny-1) {
          my $z = $matrix->get_bin_content([$ix, $iy]);

    If you don't know the number of dimensions statically, you need to do
    something like this (there are other ways to do it, too):

      # Pseudo-code
      my $coords = [(0) x $ndims];
      foreach my $i (0..$unrolled_total_nbins-1) {
        my $z = $ndimhisto->get_bin_content($coords);

        my $i = 0;
        while ($i < $ndims
               && $coords->[$i] >= $ndimhisto->get_axis($i)->nbins)
          $coords->[$i] = 0;

    Not pretty, eh? Not fast either. So keep that in mind: Your application
    knows the number of dimensions that you care about, this histogramming
    library does not.

    Generally speaking, a histogram object in the context of this library
    contains N axis objects (axises 0 to N-1) that define the binning of
    each dimension. Below and above its coordinate range, each axis has an
    under- and an overflow bin. When you fill a histogram with data using
    the "fill()" method, and the provided coordinates are outside the range
    of the histogram, then the data will be filled into the correct under-
    or overflow bin. For example, if you create a 2D histogram with the
    following axises:

      my $h = Math::Histogram->new([
        Math::Histogram::Axis->new(2, 0., 1.),
        Math::Histogram::Axis->new(3, 0., 3.),

      # Worst ASCII drawing ever:
      # +-+-+-+-+
      # |:|.|.|:|
      # +-+-+-+-+
      # |.| | |.|
      # +-+-+-+-+
      # |.| | |.|  ^
      # +-+-+-+-+  |
      # |.| | |.|  |
      # +-+-+-+-+  dimension 1
      # |:|.|.|:|
      # +-+-+-+-+
      #   ---> dimension 0
      # Bins marked with . are under- or overflow in one dimension.
      # Bins marked with : are under- or overflow in BOTH dimensions.

    Then you created a histogram with six regular bins: two bins in the X
    direction and three bins in the Y direction for a total of "2 * 3 = 6".
    On top of that, you get a ring of over- and underflow bins around your
    ordinary bins. In this case, there are a grand total of 14 such over-
    and underflow bins. As you increase the number of bins in your actual
    histogram, the relative number of over- and underflow bins goes down.

    You can access histogram content both by the N-dimensional bin numbers
    (so, in the 2D example, an array reference containing two integers) or
    by user coordinates (eg. an array reference of two floating point
    numbers). The module provides facilities to determine the bin in which a
    particular set of coordinates falls. The lower boundary of a bin is
    always considered part of the bin, whereas the upper boundary is not.
    Internally, the histogram data is stored in a flat array since the
    dimensionality is unknown at compile time. The linear index into this
    array is what may be referred to as the "flat" or "linear" bin number.
    In a 1D histogram, it corresponds to the bin numbers of the only axis in
    the histogram.

    Class method, constructor. Takes an array reference as first parameter.
    The array reference must contain one or more Math::Histogram::Axis
    objects that define the binning in one dimension each. The number of
    axises determines the dimensionality of the histogram.

    Returns an exact clone of the histogram.

    Returns a clone of the histogram, but without its content.

    Given a dimension number (starting at 0), returns the axis object of
    that dimension.

    Returns the number of dimensions in the histogram.

    Returns the total content of the histogram. (The sum over all bins,
    except this is cached.)

    Returns the number of fill operations that have been performed on the
    histogram so far. This is not the same as total unless all fills have a
    weight of 1.

    Given a reference to an array of coordinates, adds 1 to the content of
    the bin that the coordinates belong to.

    Same as "fill()", except that the second argument needs to be a weight,
    the number to add to the bin content (instead of incrementing by 1).

    Same as "fill()", except that the first parameter needs to be a
    reference to a nested array, each of the inner arrays containing a set
    of coordinates. In other words, this method works the same as calling
    "fill()" repeatedly for each element in the outer array:

      my @coords = (
        [0.1, 0.2],
        [3.8, -1.2],
      # Is the same as:
      $h->fill($_) for @coords;
      # Except a teeny bit faster.

    This is to "fill_w(\@coord, $weight)" what "fill_n(\@coords)" is to
    "fill(\@coord)". In other words, the first argument is the same as for
    "fill_n()", the second is an array reference containing as many weights
    as the first had coordinate sets.

    Same as "fill()", but takes an array reference containing bin numbers as
    argument (instead of a reference to an array of coordinates).

    This is to "fill_w" what "fill_bin" is to "fill".

    This is to "fill_n" what "fill_bin" is to "fill".

    This is to "fill_nw" what "fill_bin" is to "fill".

    Given a reference to an array of bin numbers, returns the content of the
    specified bin. Throws an exception when out of bounds.

    Given a reference to an array of coordinates, returns a reference to an
    array of (the same number of) bin numbers that correspond to the bin
    that the coordinates fall into.

    Given a dimension number (starting at 0), creates an N-1 dimensional
    histogram that contains the original data, but with the specified
    dimension contracted. The original histogram is untouched. Throws an
    exception if the dimension is out of bounds.

    Given a dimension number (starting at 0), cumulates along that
    dimension, modifying the input histogram. Throws an exception if the
    dimension is out of bounds. Example:

      X ->
      1 2 3  ^
      4 5 6  |
      7 8 9  Y

    Cumulated along X, the result is:

      1 3  6
      4 9  15
      7 15 24

    Cumulated along Y instead, the result is (note direction of Y axis in

      12 15 18
      11 13 15
      7  8  9

    Given another histogram, returns true if the data content is equal to
    the invocant's data. Uses your machine "DBL_EPSILON" for floating point

    Given a set of bin numbers, returns true if the bin is an under- or
    overflow bin, false otherwise. This is O(n) in the number of dimensions,
    but O(1) in the number of bins in the histogram.

    Given a linear bin number, returns true if the bin is an under- or
    overflow bin, false otherwise. This is O(1) in the number of dimensions
    and the number of bins in the histogram.

    Returns a JSON string that represents this histogram object.

    Class method. Given a JSON string as generated by "serialize()",
    recreates the histogram object that it represents. Also accepts a scalar
    reference to a JSON string.

    Math::Histogram::Axis, which is part of this distribution, implements
    the binning for a histogram in a single dimension.

    Math::SimpleHisto::XS is a fast 1D histogramming module.

    SOOT is a dynamic wrapper around the ROOT C++ library which does
    histogramming and much more. Beware, it is experimental software.

    Steffen Mueller, <>

    Copyright (C) 2012 by Steffen Mueller

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself, either Perl version 5.8.1 or, at
    your option, any later version of Perl 5 you may have available.

Something went wrong with that request. Please try again.