-
Notifications
You must be signed in to change notification settings - Fork 278
Spatial Index
Ardb add 2d spatial data index support in v0.7.0. This document explain the design details.
Generally, this design can be simply described as 'GeoHash + Sorted Set'. It's easy to port this solution to redis.
2d spatial point can be described as a tuple like (latitude, longitude, value). The point would be stored as an element in a sorted set.
Geohash is a latitude/longitude geocode system which could encode latitude/longitude with a precision into several bits. More precision would make geohash result more bits.
Since sorted set only accept number as the score value. We need a way to convert the geohash result to a number value.
I wrote another C99 library geohash-int to encode latitude/longitude to a 64bit integer. (Almost all geohash libray listed in Geohash wiki only give a base32 string result. That's why i write that library).
Only two steps:
-
Use geohash-int to encode latitude/longitude with 26 steps which would generate a 52bit integer value. This result would have a distance precision about 0.6 meters.
-
Use the 52bit geohash integer as the score value, and use 'ZADD' command to store the spatial data.
ZADD key
The search condition is a given coordinate and radius. For example, seach all points within a 1000m radius of longitude/latitude coordinate (120.0, 25.0).
First of all, We should estimate the geohash encoding bits by raius value first. Since geohash value represent a box, considering the edge case referenced in Geohash Wiki, to search fast, we need find the smallest geohash box with surrounding 8 geohash box that could cover all points in radius in the worst case. This is a simple table about the estimate geohash encoding bits with radius.
HashBits, Radius Meters
52, 0.5971
50, 1.1943
48, 2.3889
46, 4.7774
44, 9.5547
42, 19.1095
40, 38.2189
38, 76.4378
36, 152.8757
34, 305.751
32, 611.5028
30, 1223.0056
28, 2446.0112
26, 4892.0224
24, 9784.0449
22, 19568.0898
20, 39136.1797
18, 78272.35938
16, 156544.7188
14, 313089.4375
12, 626178.875
10, 1252357.75
8, 2504715.5
6, 5009431
4, 10018863
- Encode longitude/latitude coordinate with estimate bits by geohash-int.
- Find surrounding 8 neighbors' geohash integer value by geohash-int.
- For each geohash integer value, we generate a pair (GeoHashIneger, GeoHashIneger + 1). Then we got 9 pairs.
- For each pair, we convert it to a score range. Any integer value in the pair should be left shift to 52 bits. The shifted value is the smallest 52 bits geohash value in the geohash box represent by unshifted GeoHashInteger.
For example, if we need search points in radius 3000m, then we should encode the longitude/latitude coordinate to a integer with 26 bits, then left shift it 26 bits, then we got a 52 bits integer. - For each score range, use 'ZRANGEBYSCORE key min max WITHSCORES' to retrieve all point's value and it's score.
- For each point value and it's score, we can decode the score to a GeoHash area by geohash-int and compute the distance with given longitude/latitude , then compare the distance with given radius value to exclude the point not in radius.
v0.9 use redis's command(georadius) instead.
Syntax: GEOADD key longitude latitude value
Syntax: GEOSEARCH key LOCATION lat lon RADIUS r [ASC|DESC] [WITHCOORDINATES] [WITHDISTANCES] [GET pattern [GET pattern ...]] [LIMIT offset count]
27,000,000 spatial points with mercator coordinates saved in a sorted set.
Four-core Intel(R) Xeon(R) CPU E5520 @2.27GHz, with 16 GB RAM.
Ardb: 4 threads
All data cached in memory, about 11000 qps to search with radius 1000m, 150 points matching, 100 points matched.
All data in LevelDB without cache, about 900 qps to search with radius 1000m, 150 points matching, 100 points matched.