Skip to content

Recipes for geospatial data preprocessing in Python.

Notifications You must be signed in to change notification settings

carlosg-m/goose-lab

Repository files navigation

goose-lab

Recipes for geospatial data processing in Python.

Efficiently generate the bounding coordinates (xmin, ymin, xmax, ymax) of a 2d regular grid in numpy. This array can be easily converted into shapely or pygeos geometries.

Intersect points with rectangles from a regular grid without a spatial index (using only NumPy). Useful for processing rasters larger than memory through reading windows or creating database indexes.

  • The dataset is the European Digital Elevation Model (EU-DEM), version 1.1 from European Union's Copernicus Programme.

  • The format is Raster data (geotiff), divided into 1000x1000km tiles, 25m resolution, accuracy of ~7m RMSE.

  • The only dependencies of this script are Numpy, Pandas, Pyproj and Rasterio.

  • The process was designed to minimize memory consumption. Tiles or chunks of the rasters are read sequentially from storage.

  • Input coordinates are assumed to be in WGS84 (latitude and longitude).

  • A distributed version of this script can be easily created with Dask for Big Data use cases.

Basic usage example:

  1. Download the raster tiles from Copernicus website.

  1. Instantiate the object CopernicusDEM with the geotiff file paths and call the get_elevation method over a Pandas dataframe with latitude and longitude columns.
airports = pd.DataFrame([['LPPT', 38.775600, -9.135400],
                         ['LPPR', 41.242100, -8.678600],
                         ['LPFR', 37.017600, -7.969700],
                         ['LPBJ', 38.063700, -7.939200]], columns=['ICAO', 'Latitude', 'Longitude'])
                         
copernicus = CopernicusDEM(raster_paths=['eu_dem_v11_E20N10.TIF', 'eu_dem_v11_E20N20.TIF'])

airports = copernicus.get_elevation(airports, lat_col='Latitude', lon_col='Longitude')

print(airports)

image