SmilyOrg/tinygpkg

tinygpkg

Go library for local, small, fast reverse geocoding with TWKB & GeoPackage.

📥 Get Datasets · 🐛 Report Bug · 💡 Request Feature

Table of Contents
  1. About
  2. Benchmarks
  3. Usage
  4. Contributing
  5. License
  6. Acknowledgements

About

tinygpkg is a Go library for fast, local, and small-scale geospatial processing. Currently, the main use case is local reverse geocoding using GeoPackage files that have been simplified and compressed into the Tiny Well-known Binary (TWKB) format.

The library is heavily inspired by sams96/rgeo, a Go library for local reverse geocoding. The main difference is that rgeo embeds compressed GeoJSON and builds an s2.ShapeIndex from it at initialization time, while tinygpkg uses the GeoPackage format (based on SQLite), which it queries and deserializes at query time.

This means that, for comparable datasets, tinygpkg has almost no startup cost (12 ms vs 14 s) and drastically lower runtime memory usage (27 MB vs 1.5 GB) than rgeo, at the expense of slower reverse geocoding queries (63 µs vs 500 ns). tinygpkg can also work with much larger datasets (like geoBoundaries CGAZ), as it doesn't need to index the entire dataset in memory.
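To put the startup-versus-query trade-off in perspective, the sketch below estimates how many queries a process must serve before the higher startup cost of an in-memory index pays off. It uses only the figures quoted above; the function name `breakEvenQueries` is made up for this illustration, and the estimate ignores memory usage entirely.

```go
package main

import (
	"fmt"
	"time"
)

// breakEvenQueries returns the number of queries after which a library with
// a heavy startup but fast queries has spent the same total time as a
// library with a light startup but slower queries.
func breakEvenQueries(heavyStartup, heavyQuery, lightStartup, lightQuery time.Duration) float64 {
	return float64(heavyStartup-lightStartup) / float64(lightQuery-heavyQuery)
}

func main() {
	// Figures quoted above for comparable datasets.
	rgeoStartup, rgeoQuery := 14*time.Second, 500*time.Nanosecond
	tinyStartup, tinyQuery := 12*time.Millisecond, 63*time.Microsecond

	n := breakEvenQueries(rgeoStartup, rgeoQuery, tinyStartup, tinyQuery)
	fmt.Printf("break-even after about %.0f queries\n", n)
	// prints: break-even after about 223808 queries
}
```

Under these numbers, a short-lived process or one that serves fewer than a couple of hundred thousand lookups spends less total time with the query-time approach.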

Features

  • Local - no network requests needed
  • Small - supports Tiny Well-known Binary (TWKB) in GeoPackage for smaller dataset sizes
  • Fast - low startup time (12 ms) and sub-millisecond reverse geocoding queries
  • Low memory usage - GeoPackage files are queried on-the-fly at runtime
  • Large datasets - can work with datasets that don't fit in memory
  • GeoPackage - uses the GeoPackage format for reading geospatial data
  • TWKB - supports Tiny Well-known Binary (TWKB) in GeoPackage for compressed datasets
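As a rough illustration of why TWKB is compact: the format rounds coordinates to a fixed decimal precision, stores each point as the difference from the previous one, and writes those differences as variable-length integers. The sketch below shows only that core idea, using Go's standard zigzag varint encoding. It is not a TWKB encoder (there are no headers or geometry metadata) and it is not tinygpkg's code; `point` and `encodeDeltas` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

type point struct{ X, Y float64 }

// encodeDeltas rounds each coordinate to `precision` decimal places and
// appends the difference from the previous point as a zigzag varint.
func encodeDeltas(pts []point, precision int) []byte {
	scale := math.Pow10(precision)
	var out []byte
	var prevX, prevY int64
	for _, p := range pts {
		x := int64(math.Round(p.X * scale))
		y := int64(math.Round(p.Y * scale))
		out = binary.AppendVarint(out, x-prevX)
		out = binary.AppendVarint(out, y-prevY)
		prevX, prevY = x, y
	}
	return out
}

func main() {
	// Three nearby points (longitude, latitude), kept to 3 decimal places.
	pts := []point{{2.3522, 48.8566}, {2.3530, 48.8570}, {2.3541, 48.8561}}
	enc := encodeDeltas(pts, 3)
	fmt.Printf("%d bytes encoded vs %d bytes as raw float64\n", len(enc), len(pts)*16)
	// prints: 9 bytes encoded vs 48 bytes as raw float64
}
```

Only the first point needs multi-byte values. Neighbouring vertices differ by small integers, which fit in a single byte each.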

Limitations

  • Slower queries - each query needs a database lookup, geometry deserialization, and a point-in-polygon check. This is still fast (tens of microseconds), but not as fast as sams96/rgeo, which uses s2.ShapeIndex
  • No GeoJSON - only supports GeoPackage files for now
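The last step of the query pipeline described above, the point-in-polygon check, can be sketched with the classic even-odd ray casting rule on planar coordinates. This is a generic illustration only: which algorithm tinygpkg actually uses is not stated here, and `point` and `contains` are hypothetical names.

```go
package main

import "fmt"

type point struct{ X, Y float64 }

// contains reports whether p lies inside the polygon ring using the even-odd
// ray casting rule. The ring is treated as implicitly closed.
func contains(ring []point, p point) bool {
	inside := false
	for i, j := 0, len(ring)-1; i < len(ring); j, i = i, i+1 {
		a, b := ring[i], ring[j]
		// Does the edge a-b straddle the horizontal line through p?
		if (a.Y > p.Y) != (b.Y > p.Y) {
			// X coordinate where the edge crosses that line.
			x := a.X + (p.Y-a.Y)*(b.X-a.X)/(b.Y-a.Y)
			if p.X < x {
				inside = !inside
			}
		}
	}
	return inside
}

func main() {
	square := []point{{0, 0}, {4, 0}, {4, 4}, {0, 4}}
	fmt.Println(contains(square, point{2, 2})) // true
	fmt.Println(contains(square, point{5, 2})) // false
}
```

A check like this runs once per candidate polygon, so its cost grows with the number of vertices that have to be deserialized for each query.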

Built With

Benchmarks

See a more detailed comparison below using two Natural Earth datasets.

| Benchmark - 110m countries dataset | rgeo | tinygpkg | % of rgeo |
|---|---|---|---|
| Compiled code size | 3.2 MB | 7.8 MB | 243% |
| Bundle size (code + data) | 32 MB | 8.2 MB | 26% |
| Startup time | 93 ms | 14 ms | 15% |
| Startup allocated bytes | 22 MB | 12 KB | 0.05% |
| Runtime memory usage | 25 MB | 27 MB | 108% |
| Reverse geocode time | 1.2 µs | 87 µs | 7250% |

| Benchmark - 10m cities dataset | rgeo | tinygpkg | % of rgeo |
|---|---|---|---|
| Compiled code size | 3.2 MB | 7.8 MB | 243% |
| Bundle size (code + data) | 32 MB | 11.5 MB | 35% |
| Bundle size (7z compressed) | 30 MB | 5 MB | 17% |
| Startup time | 14000 ms | 12 ms | 0.08% |
| Startup allocated bytes | 6 GB | 13 KB | 0.0002% |
| Runtime memory usage | 1.5 GB | 27 MB | 1.8% |
| Reverse geocode time | 0.5 µs | 63 µs | 12600% |

See also detailed benchmark results.

Usage

go get github.com/smilyorg/tinygpkg

See the example, shortened below.

// Open GeoPackage with dataset and column for reverse geocoding
g, err := gpkg.Open(
  "../testdata/ne_110m_admin_0_countries_s4_twkb_p3.gpkg",
  "ne_110m_admin_0_countries",
  "NAME",
)
if err != nil {
  panic(err)
}
defer g.Close()

// Reverse geocode a point given as latitude, longitude (here: Paris)
p := s2.LatLngFromDegrees(48.8566, 2.3522)
name, err := g.ReverseGeocode(context.Background(), p)
if err != nil {
  panic(err)
}
println(name)

// Output: France

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

Distributed under the MIT License. See LICENSE for more information.

Acknowledgements