# Hexagonal indexing

https://eng.uber.com/h3/

## Uber's H3

* Uber works with a lot of geographical data. They open-sourced H3, one of their geo-indexing libraries, which divides a 2D surface into hexagons at multiple granularities (resolutions):
  * A Python wrapper with an intuitive API is available, as are JavaScript bindings
  * Each lat/lng pair can be mapped in O(1) time to a hexagon id (hash) at a given resolution
  * Hexagon traversal is fast, O(1) (operations such as parent hexagon, child hexagons or adjacent hexagons)
  * Only 2-dimensional indexing over a sphere (planet Earth). Nothing prevents using the same method over other 2-dimensional spaces
  * In-memory indexing (the index is not stored on disk), unlike the tree-based indexes found in DBMSs such as PostGIS

<img src="h3.png"></img>
<img src="h3splitting.png"></img>

## Why Hexagons?

* Ability to naturally divide a sphere's surface
* All adjacent hexagons are equally far away (centre to centre)
* Traversal using only bitwise operations
* They are not perfect though:
  * A hexagon cannot be divided exactly into smaller hexagons
  * Hexagons alone cannot perfectly cover planet Earth (squares can); H3 fills the gaps with 12 pentagons at each resolution

<img src="hexa.png"></img>
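The "equally far" property can be checked on a flat grid. The sketch below is only an illustration on a plane, not H3's spherical grid: it assumes a pointy-top hexagon layout in axial coordinates, and the function names are made up for this example. It compares the centre-to-centre distances of a hexagon's six neighbours with those of a square cell's eight neighbours.

```python
import math

# Axial-coordinate offsets of the six neighbours of a hexagon (pointy-top layout).
HEX_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]
# Offsets of the eight neighbours of a square cell (edge and corner neighbours).
SQUARE_NEIGHBOURS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_center(q, r, size=1.0):
    """Cartesian centre of the hexagon at axial coordinates (q, r)."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)


def neighbour_distances_hex():
    """Distinct centre-to-centre distances from a hexagon to its neighbours."""
    return sorted({round(math.hypot(*hex_center(q, r)), 9) for q, r in HEX_NEIGHBOURS})


def neighbour_distances_square():
    """Distinct centre-to-centre distances from a square to its neighbours."""
    return sorted({round(math.hypot(dx, dy), 9) for dx, dy in SQUARE_NEIGHBOURS})


print(neighbour_distances_hex())     # one distinct distance
print(neighbour_distances_square())  # two distinct distances (edge and diagonal)
```

With squares, a "one step" neighbourhood mixes two distances, which is why radius-style queries are simpler to reason about on a hexagonal grid.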

## Side note: Google's S2

https://s2geometry.io/

* Can be used for 2-dimensional indexing on a sphere
* In-memory

## Installation

In [1]:
!pip install h3

Collecting h3
  Downloading h3-3.7.3-cp38-cp38-macosx_10_9_x86_64.whl (675 kB)
     |████████████████████████████████| 675 kB 2.6 MB/s eta 0:00:01
Installing collected packages: h3
Successfully installed h3-3.7.3


In [14]:
import h3

## Basic use

### Indexing

In [30]:
# Get the id of the hexagon containing a lat/lng point
h3.geo_to_h3(lat=50, lng=8, resolution=9)

'891faec4d37ffff'

In [31]:
# Get the lat/lng of the center of a hexagon from its id
h3.h3_to_geo('891faec4d37ffff')

(49.99965438361455, 8.002389024210904)

In [32]:
# Get the polygon boundary of the hexagon from its id
h3.h3_to_geo_boundary('891faec4d37ffff', geo_json=False)

((50.00023096421835, 7.999905575883487),
 (49.998562822637005, 8.000148699309813),
 (49.99798621187932, 8.002632090638492),
 (49.999077737473364, 8.004872466948562),
 (50.000745899300895, 8.004629446418397),
 (50.001322515289395, 8.0021459466788))
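The id returned above is a 64-bit integer printed as hex, and its fields can be read with plain bit operations. The following is a minimal sketch based on the published H3 index layout (4 mode bits, 4 resolution bits, 7 base-cell bits, then one 3-bit digit per resolution level). `decode_h3` is a hypothetical helper written for this note, not part of the `h3` package.

```python
def decode_h3(cell_id):
    """Split a hex-string H3 cell id into its bit fields."""
    value = int(cell_id, 16)
    resolution = (value >> 52) & 0xF  # bits 52-55
    return {
        "mode": (value >> 59) & 0xF,        # bits 59-62 (1 means a cell index)
        "resolution": resolution,
        "base_cell": (value >> 45) & 0x7F,  # bits 45-51
        # One 3-bit digit per level, from resolution 1 down to the cell's own level.
        "digits": [
            (value >> (3 * (15 - level))) & 0x7
            for level in range(1, resolution + 1)
        ],
    }


fields = decode_h3("891faec4d37ffff")
print(fields["resolution"])  # 9, the resolution requested in geo_to_h3
print(fields["digits"])      # path from the base cell down to this hexagon
```

The second hex character of the id is the resolution, which is why the ids in this notebook start with `88`, `89` or `8a`.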

### Traversal

In [33]:
# Get the id of the parent hexagon (one resolution coarser, here 9 -> 8)
h3.h3_to_parent('891faec4d37ffff')

'881faec4d3fffff'

In [34]:
# Get the ids of the child hexagons (one resolution finer, here 9 -> 10)
h3.h3_to_children('891faec4d37ffff')

{'8a1faec4d347fff',
 '8a1faec4d34ffff',
 '8a1faec4d357fff',
 '8a1faec4d35ffff',
 '8a1faec4d367fff',
 '8a1faec4d36ffff',
 '8a1faec4d377fff'}
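The parent and child ids shown above can be reproduced with bit operations alone, which is what makes traversal O(1). This is a sketch under the assumption that the cell is an ordinary hexagon: pentagon cells have only six children, and the real library handles that case. `parent_by_bits` and `children_by_bits` are illustrative names, not `h3` functions.

```python
RES_SHIFT = 52               # the resolution lives in bits 52-55
RES_MASK = 0xF << RES_SHIFT


def _digit_shift(level):
    """Bit offset of the 3-bit digit for a resolution level (1..15)."""
    return 3 * (15 - level)


def parent_by_bits(cell_id):
    """Id of the parent cell, one resolution coarser."""
    value = int(cell_id, 16)
    res = (value >> RES_SHIFT) & 0xF
    if res == 0:
        raise ValueError("resolution 0 cells have no parent")
    value |= 0x7 << _digit_shift(res)                       # mark the last digit unused (7)
    value = (value & ~RES_MASK) | ((res - 1) << RES_SHIFT)  # lower the resolution field
    return format(value, "x")


def children_by_bits(cell_id):
    """Ids of the seven children of a hexagon, one resolution finer."""
    value = int(cell_id, 16)
    res = (value >> RES_SHIFT) & 0xF
    if res == 15:
        raise ValueError("resolution 15 cells have no children")
    shift = _digit_shift(res + 1)
    # Clear the resolution field and the new digit, then write the new resolution.
    base = (value & ~RES_MASK & ~(0x7 << shift)) | ((res + 1) << RES_SHIFT)
    return {format(base | (digit << shift), "x") for digit in range(7)}


print(parent_by_bits("891faec4d37ffff"))
print(sorted(children_by_bits("891faec4d37ffff")))
```

No lookup table or tree walk is involved: the hierarchy is encoded in the id itself.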

## Examples online

https://towardsdatascience.com/fast-geospatial-indexing-with-h3-90e862482585

https://observablehq.com/@nrabinowitz/h3-radius-lookup

How to query using h3 as an index

In [419]:
import pandas as pd
import numpy as np

res = 6

## Example: Search for nearest points

* Given 10,000,000 random latitude/longitude pairs (points on Earth)
* Find the points closest to one randomly chosen point

In [420]:
# Randomly generated data points
n_points = 10_000_000
data = pd.DataFrame({
    "lat": np.random.uniform(-90, 90, n_points),
    "lng": np.random.uniform(-180, 180, n_points)
})
# Index every point by the id of the hexagon that contains it
data["hexa"] = [h3.geo_to_h3(lat, lng, res) for lat, lng in zip(data["lat"], data["lng"])]
data = data.set_index("hexa")

# Target point. To search for points near it
target = data.sample(1)

In [468]:
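def nearest_point(dat, target):
    """Return the rows of dat that lie in the smallest searched ring around target.

    dat is indexed by hexagon id and target is the hexagon id of the query
    point. The result is a set of nearby candidates, not a single point.
    """
    hexas = [target]  # start with the target's own hexagon
    radius = 1
    while True:
        near = dat.loc[dat.index.isin(hexas)]
        # The target point itself is in dat, so require more than one row
        if len(near) > 1:
            return near
        # Nothing else found yet: widen the ring and double the radius
        hexas = h3.k_ring(target, radius)
        radius *= 2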
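The search above doubles the ring radius after every miss, so it helps to know how quickly the number of scanned hexagons grows. A filled ring of radius k on a hexagonal grid holds 1 + 3k(k+1) cells, assuming no pentagon is nearby. The helper names below are made up for this sketch.

```python
def k_ring_size(k):
    """Number of cells in a filled hexagonal ring of radius k (no pentagons nearby)."""
    return 1 + 3 * k * (k + 1)


def search_schedule(steps):
    """Radii tried by a doubling search (1, 2, 4, ...) and the cells scanned at each."""
    return [(2 ** i, k_ring_size(2 ** i)) for i in range(steps)]


for radius, cells in search_schedule(5):
    print(f"k={radius:>2}: {cells} hexagons")
```

The cell count grows quadratically with the radius, so doubling reaches distant points in few iterations at the cost of scanning a larger candidate set on the last step.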

In [481]:
%%time

nearest_point(data, target.index[0])

CPU times: user 199 ms, sys: 2.93 ms, total: 202 ms
Wall time: 202 ms


| hexa | lat | lng | dist |
|---|---|---|---|
| 8620c97a7ffffff | 41.186616 | 86.245675 | 0.0 |
| 8620c97a7ffffff | 41.188211 | 86.239817 | 521.486056 |
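The `dist` column in the output is not produced by the code shown in this notebook. One plausible way to obtain such a column, and to pick the true nearest point among the candidates, is the haversine great-circle distance. The sketch below assumes a spherical Earth with a mean radius of 6,371 km; both function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two lat/lng points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rank_by_distance(target, candidates):
    """Return (distance, point) pairs for the candidates, nearest first."""
    return sorted((haversine_m(*target, *point), point) for point in candidates)


# The two rows from the output above: the target and its neighbour
target_point = (41.186616, 86.245675)
candidates = [(41.188211, 86.239817)]
for distance, point in rank_by_distance(target_point, candidates):
    print(f"{point}: {distance:.1f} m")
```

For these two rows the result should land close to the value in the `dist` column, which suggests, but does not confirm, that the column is in metres.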
