Skip to content

tsee/algorithm-spatialindex-strategy-medianquadtree

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NAME
    Algorithm::SpatialIndex::Strategy::MedianQuadTree - QuadTree splitting
    on bucket medians

SYNOPSIS
      use Algorithm::SpatialIndex;
      my $idx = Algorithm::SpatialIndex->new(
        strategy => 'MedianQuadTree',
      );

DESCRIPTION
    A modified quad tree implementation that I'll call Median Quad Tree
    (MQT) in this document. (Not sure if this data structure has a different
    name elsewhere.) See "ALGORITHM" below.

    For a description of the public interface, see Algorithm::SpatialIndex.
    This spatial index strategy uses the default
    "Algorithm::SpatialIndex::Strategy::QuadTree" as a base class and gets
    away with containing very little code because of that. This also means
    that most of the interface and behaviour is inherited.

ALGORITHM
    For a basic discussion of quad trees, take a look at the documentation
    of the Algorithm::QuadTree module or look it up on Wikipedia. The
    following describes how the MQT differs from a normal quad tree and some
    implementation details.

    Any x/y coordinate pair can be used to divide a rectangular area into
    four sub-rectangles. When splitting up a node of an ordinary quad tree,
    the center of the quad tree is chosen to split the node into four
    sub-nodes. This can be done either when the tree is created (before it
    is populated) with a static depth of the tree, or dynamically whenever
    the number of items associated with a node becomes too large.

    For the MQT, the point which splits a given node into four is chosen to
    be the median of all contained item coordinates in each dimension.

    This has several consequences:

    * Due to the dynamic nature of the coordinate choice when splitting a
      node in four, the MQT cannot be of a fixed depth and preallocated. It
      needs to grow dynamically as it is filled.

    * When the data in the tree is a reasonably general sample of the
      underlying distribution of data, then this algorithm will create a
      tree of evenly filled buckets, but not necessarily a well balanced
      tree. To obtain a reasonable sample of the underlying data
      distribution, it is prudent to insert items in random order.

    * Due to this behaviour, the tree will create very small nodes/bins
      where the most data is concentrated and bins of gradually increasing
      size as one moves away from the highest concentrations. A normal quad
      tree will have a similar behaviour, but due to the fixed size of the
      bins, will "converge" much more slowly.

    * If the data is inserted in order of the item coordinates, an ordinary
      quad tree is a more efficient. The MQT will make bad choices as the
      median of a contiguous subset of the ordered data will not reflect the
      overall distribution and the property of having fairly evenly filled
      buckets vanishes.

    * It is likely that polling a spatial index using an MQT is faster than
      an ordinary quad tree if the distribution of data is very different
      from uniformity. If in doubt, benchmark.

    * If the data is uniform but inserted in random order, the MQT will at
      best be equal in performance to a quad tree.

    * Filling a dynamically growing MQT has slightly more overhead than
      filling a dynamically growing quad tree due to the median calculation,
      but it is neither algorithmically slower nor necessarily slower in
      practice. Algorithmically, splitting a quad tree node will be O(n) in
      the bucket size. Splitting an MQT node will be O(n*log(n)) if a naive
      median calculation is used or also O(n) with a linear-time median
      calculation (like this implementation).

      If the MQT is filled in random order and the data is not uniformly
      distributed, it will, on average, have more evenly filled buckets and
      thus less nodes than a quad tree, thus reducing the amount of tree
      walking required while filling and later polling the index.

SEE ALSO
    Algorithm::SpatialIndex

    Algorithm::SpatialIndex::Strategy::QuadTree

    Algorithm::SpatialIndex::Strategy

    Algorithm::QuadTree

AUTHOR
    Steffen Mueller, <smueller@cpan.org>

COPYRIGHT AND LICENSE
    Copyright (C) 2010 by Steffen Mueller

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself, either Perl version 5.10.1 or, at
    your option, any later version of Perl 5 you may have available.

About

Quad tree implementation using the item coordinate median for splitting nodes

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages