A GeoHash implementation geared toward indexing data for performing proximity searches at different radii.
The algorithm underlying Proxihash is the GeoHash algorithm, however the hashes are returned offer the bit string as a numeric value, rather than a base-32 representation. This allows bit-precision, rather than 5-bit precision, which comes in handy for variable-distance proximity searches.
Proxihash computes for you the tileset required for a given radius search from a given point, within the specified bit precision range. This range lets you limit your index to certain precisions, which reflects the search distances you need to support.
Create a proxihash with 31 bits of precision:
proxihash = Proxihash.encode(42.6, -5.6, 31)
Get the numeric value, for storage:
proxihash.value # 939008154 proxihash.num_bits # 31
Compute the tile (bounding box) represented by the proxihash:
proxihash.tile # [42.5994873046875, 42.60498046875, -5.60302734375, -5.5975341796875]
Find neighbors at the same precision:
proxihash.neighbor(-1, 1) # tile to the south-east proxihash.neighbor( 1, -1) # tile to the north-west # Alternatively: proxihash.north # same as proxihash.neighbor( 1, 0) proxihash.east # same as proxihash.neighbor( 0, 1) proxihash.west # same as proxihash.neighbor( 0, -1) proxihash.south # same as proxihash.neighbor(-1, 0)
Find the midpoint of the tile:
proxihash.decode # [42.60223388671875, -5.60028076171875]
Note that roundtripping through
decode may not give you the same
initial point, as decoding gives you the center of the tile. It will give you
a nearby point, however. Roundtripping through
encode will give
you back the same tile.
The idea is to index your data with proxihashes at the precisions you care about. These precisions depend on the search distances you need to support.
Consider that the radius of the earth is about 40,036 km, (24,873 miles). This means 16 bits of longitude (16 + 15 = 31 bit proxihash) will give you sub-kilometer precision. Sub-mile precision requires 29 bits. These fit nicely into a 32-bit integer.
Once you've indexed all your data by their proxihash at the precisions you care
about, you probably want to answer a query like "find all users within 25 miles
(lat,lng)". To do this, you have to find the tiles to search for:
Proxihash.radius = :earth_in_miles Proxihash.search_tiles(lat, lng, 25, min_bits: 11, max_bits: 31)
The first line sets the units to use (default is kilometers).
The second returns up to 4 tiles (proxihashes): the tile that
on, plus the 3 neighbors that need to be searched. The 3 neighbors depend on
which quadrant of the center tile
(lat,lng) falls on. The tile size is the
smallest that will cover the 25 mile search circle.
The precision of the tiles (proxihashes) computed depends on two things: the distance, and the latitude.
Larger tiles (shorter proxihashes) are required for larger distances and latitudes nearer the poles; smaller tiles (longer proxihashes) for shorter distances and latitudes nearer the equator. The latitude makes a difference because tiles nearer the poles are smaller, and so a circle of a given radius can cover more tiles of the same angular size near the poles than near the equator.
There is one caveat that arises from this: Finding neighboring tiles degenerates
toward the poles. If the search circle overlaps a pole, or finding a neighboring
tile requires crossing a pole, a
PoleWrapException is raised.
:max_bits option prevents you from computing an arbitrarily precise
proxihash if you take the distance from user input. It will simply cap the
precision of the proxihashes returned. The
:min_bits option makes
search_tiles return nil if a larger tile is required.
does not return more than 4 tiles to accomodate a smaller tileset.
- Bug reports
- Patches: Fork on Github, send pull request.
- Include tests where practical.
- Leave the version alone, or bump it in a separate commit.
Copyright (c) George Ogata. See LICENSE for details.