Home

Jim Covert edited this page Jan 27, 2016 · 10 revisions
Clone this wiki locally

Conversant RTree

RTree is a spatial indexing strategy that involves building a tree of bounding regions that support arbitrary range searches. RTrees are efficient for geospatial data but can be extended to support any data that is amenable to range search queries.

Conversant RTree is a hyperdimensional implementation of RTree that supports data with arbitrarily large numbers of orthogonal relations.

Implementation Details

Split Type - When a node overflows

A strategy must be chosen during tree creation to determine how to handle the addition of an entry to a completely full node. Generally, two new child nodes are created, and the entries are distributed between them. The original leaf node becomes their parent, a branch node. The addition of the new entry and distribution of the existing entries into the new child nodes is determined by the "split type", of which three implementations are described below.

Linear

Choose the two entries with bounding boxes that are farthest apart and add every other entry, one at a time, to the box requiring the least area increase to include it. Ties are resolved by choosing the box with the smallest area. Ties that remain are resolved by choosing the box with the fewest entries. Any additional tie is resolved by choosing a box at random.

Quadratic

Examine all pairs of bounding boxes choose the two that would be have the greatest bounding box rectangle if put together. Add every other entry, one at a time, to either of these seed bounding boxes, minimizing area increase. Ties are resolved in the same manner as in the Linear split strategy.

Axial

Choose the dimension with the greatest range extent. In this dimension, find an axis to split on that minimizes the perimeter of the two resulting bounding boxes. Add every other entry, one at a time, to either of these seed bounding boxes, minimizing area increase. Ties are resolved in the same manner as in the Linear split strategy. Additional details can be found in R-Tree Lecture Notes, by Yufei Tao.

m, M - Minimum and maximum entries per node

"m" and "M" represent the minimum and maximum entries allowed per node. When adding an entry to a node that already has "M" entries, a split occurs in accordance with one of the splitting strategies above. After the split, each resulting child node is guaranteed to have at least "m" entries.

Default Values for m, M, and Split Type

The values of 2, 8, and AXIAL for m, M, and split type proved to yield the best tree structure with an optimal leaf-fill percentage, resulting ultimately in the fastest search time. This agrees with much of the available RTree documentation and white papers.

Choosing an appropriate m, M, and split type

For average bounding box overlap, the value 8 for M results in the fastest searches.

Getting Started

Each entry stored in the tree needs to be an instance of a type that implements the HyperRect interface. Each HyperRect instance will contain a minimum and maximum point, representing the minimum and maximum values across each dimension for that entry. These points must implement the HyperPoint interface and are used for such critical operations as "contains" and "intersects". HyperRects are instantiated through methods on an instance of the RectBuilder interface.

Implementing HyperRect

The HyperRect interface requires the following be implemented:

    HyperRect getMbr(HyperRect r);
    int getNDim();
    HyperPoint getMin();
    HyperPoint getMax();
    HyperPoint getCentroid();
    double getRange(final int d);
    boolean contains(HyperRect r);
    boolean intersects(HyperRect r);
    double cost();
    double perimeter(); 

Implementing HyperPoint

The HyperPoint interface requires the following be implemented:

    int getNDim();
    <D extends Comparable<D>> D getCoord(int d);
    double distance(HyperPoint p);
    double distance(HyperPoint p, int d);

Implementing RectBuilder

The RectBuilder interface requires the following be implemented:

    HyperRect getBBox(T entry);
    HyperRect getMbr(HyperPoint p1, HyperPoint p2);

Creating an R-Tree

    // T implements HyperRect
    RTree<T> tree = new RTree<>(new T.Builder());

Examples

Rect2D (implementing HyperRect)

/**
 * Created by jcovert on 6/15/15.
 */
public class Rect2D implements HyperRect {
    final Point min, max;

    Rect2D(final Point p) {
        min = new Point(p.x, p.y);
        max = new Point(p.x, p.y);
    }

    Rect2D(final double x1, final double y1, final double x2, final double y2) {
        min = new Point(x1, y1);
        max = new Point(x2, y2);
    }

    Rect2D(final Point p1, final Point p2) {
        final double minX, minY, maxX, maxY;
        if(p1.x < p2.x) {
            minX = p1.x;
            maxX = p2.x;
        } else {
            minX = p2.x;
            maxX = p2.x;
        }
        if(p1.y < p2.y) {
            minY = p1.y;
            maxY = p2.y;
        } else {
            minY = p2.y;
            maxY = p2.y;
        }
        min = new Point(minX, minY);
        max = new Point(maxX, maxY);
    }

    @Override
    public HyperRect getMbr(final HyperRect r) {
        final Rect2D r2 = (Rect2D)r;
        final double minX = Math.min(min.x, r2.min.x);
        final double minY = Math.min(min.y, r2.min.y);
        final double maxX = Math.max(max.x, r2.max.x);
        final double maxY = Math.max(max.y, r2.max.y);
        return new Rect2D(minX, minY, maxX, maxY);
    }

    @Override
    public int getNDim() { return 2; }

    @Override
    public HyperPoint getCentroid() {
        final double dx = min.x + (max.x - min.x)/2.0;
        final double dy = min.y + (max.y - min.y)/2.0;
        return new Point(dx, dy);
    }

    @Override
    public HyperPoint getMin() { return min; }

    @Override
    public HyperPoint getMax() { return max; }

    @Override
    public double getRange(final int d) {
        if(d == 0) { return max.x - min.x; }
        else if(d == 1) { return max.y - min.y; }
        else { throw new RuntimeException(("Invalid dimension")); }
    }

    @Override
    public boolean contains(final HyperRect r) {
        final Rect2D r2 = (Rect2D)r;
        return min.x <= r2.min.x &&
                max.x >= r2.max.x &&
                min.y <= r2.min.y &&
                max.y >= r2.max.y;
    }

    @Override
    public boolean intersects(final HyperRect r) {
        final Rect2D r2 = (Rect2D)r;
        if(min.x > r2.max.x ||
                r2.min.x > max.x ||
                min.y > r2.max.y ||
                r2.min.y > max.y) {
            return false;
        }
        return true;
    }

    @Override
    public double cost() {
        final double dx = max.x - min.x;
        final double dy = max.y - min.y;
        return Math.abs(dx) * Math.abs(dy);
    }

    public final static class Builder implements RectBuilder<Rect2D> {
        @Override
        public HyperRect getBBox(final Rect2D rect2D) { return rect2D; }

        @Override
        public HyperRect getMbr(final HyperPoint p1, final HyperPoint p2) {
            return new Rect2D(p1.getCoord(0), p1.getCoord(1), p2.getCoord(0), p2.getCoord(1));
        }
    }
}

Point (implementing HyperPoint)

/**
 * Created by jcovert on 6/15/15.
 */
public class Point implements HyperPoint {
    final double x, y;

    Point(final double x, final double y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public int getNDim() { return 2; }

    @Override
    public Double getCoord(final int d) {
        if(d==0) { return x; } 
        else if(d==1) { return y; } 
        else { throw new RuntimeException("Invalid dimension"); }
    }

    @Override
    public double distance(final HyperPoint p) {
        final Point p2 = (Point)p;
        final double dx = p2.x - x;
        final double dy = p2.y - y;
        return Math.sqrt(dx*dx + dy*dy);
    }

    @Override
    public double distance(final HyperPoint p, final int d) {
        final Point p2 = (Point)p;
        if (d == 0) { return Math.abs(p2.x - x); } 
        else if (d == 1) { return Math.abs(p2.y - y); } 
        else { throw new RuntimeException("Invalid dimension"); }
    }

    public final static class Builder implements RectBuilder<Point> {
        @Override
        public HyperRect getBBox(final Point point) { return new Rect2D(point); }

        @Override
        public HyperRect getMbr(final HyperPoint p1, final HyperPoint p2) {
            final Point point1 = (Point)p1;
            final Point point2 = (Point)p2;
            return new Rect2D(point1, point2);
        }
    }
}

SpatialSearch Interface & LockingRTree

The RTree class implements the SpatialSearch interface, which requires methods "search", "add", "remove", and "update" be implemented. Another implementation can be found in LockingRTree, which wraps all the operations in a locking mechanism. The whole tree is locked when writing to avoid inconsistent or incomplete reads and writes. The LockingRTree implementation provides both blocking and non-blocking methods.

External Links

R-Trees: A Dynamic Index Structure for Spatial Searching (Guttman, 1984)

The R*-tree: An Efficient and Robust Access Method for Points and Rectangles (Beckmann, 1990)

On Optimal Node Splitting for R-trees

R-Tree Lecture Notes (Yufei Tao, axial split)

Wikipedia R-Tree