tobler is a Python package for areal interpolation, dasymetric mapping, change of support, and small area estimation. It provides a suite of tools with a simple interface for transferring data from one polygonal representation to another. Common examples include standardizing census data from different time periods to a single representation (i.e. to overcome boundary changes in successive years), or converting data collected at different spatial scales into shared units of analysis (e.g. converting zip code and neighborhood data into a regular grid).
tobler is part of the PySAL family of packages for spatial data science and provides highly performant implementations of basic and advanced interpolation methods, leveraging shapely to optimize for multicore architecture. The package name is an homage to the legendary quantitative geographer Waldo Tobler, a pioneer in geographic interpolation methods, spatial analysis, and computational social science.
tobler provides functionality for three families of spatial interpolation methods. The utility of each technique depends on the context of the problem and varies according to, e.g., data availability, the properties of the interpolated variable, and the resolution of source and target geometries. For a further explanation of different interpolation techniques, please explore some of the field's background literature.
Areal interpolation uses the area of overlapping geometries to apportion variables. This is the simplest method, with no data requirements beyond the input and output geometries; however, it is also the most susceptible to the modifiable areal unit problem.
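To make the idea concrete, here is a minimal sketch of area-weighted interpolation for an extensive variable (a count such as population). It is not tobler's API: the `Rect` type, the helper names, and the toy data are all hypothetical, and zones are simplified to axis-aligned rectangles so the example runs with the standard library alone.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon (illustrative only)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two rectangles (0.0 when they do not overlap)."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return max(width, 0.0) * max(height, 0.0)


def area_interpolate_extensive(sources, values, targets):
    """Give each target the share of each source value that matches the
    share of the source's area falling inside that target."""
    estimates = [0.0] * len(targets)
    for src, value in zip(sources, values):
        for j, tgt in enumerate(targets):
            estimates[j] += value * overlap_area(src, tgt) / src.area
    return estimates


# Two source zones with known populations, re-cut into two different targets.
sources = [Rect(0, 0, 2, 2), Rect(2, 0, 4, 2)]
population = [100.0, 300.0]
targets = [Rect(0, 0, 1, 2), Rect(1, 0, 4, 2)]

estimates = area_interpolate_extensive(sources, population, targets)
print(estimates)
```

The sketch assumes the variable is spread evenly within each source zone, which is the assumption that makes plain areal interpolation sensitive to how zone boundaries are drawn. Because the targets here cover the sources completely, the total population is preserved.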
Dasymetric interpolation uses auxiliary data to improve estimation, for example by constraining the areal interpolation to areas that are known to be inhabited. Formally, tobler adopts a binary dasymetric approach, using auxiliary data to define which land is available or unavailable for interpolation. The package can incorporate additional sources such as:
- raster data such as satellite imagery that define land types
- vector features such as roads or water bodies that define habitable or uninhabitable land
either (or both) of which may be used to help ensure that variables from the source geometries are not allocated to inappropriate areas of the target geometries. Naturally, dasymetric approaches are sensitive to the quality of ancillary data and underlying assumptions used to guide the estimation.
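The binary dasymetric idea can be sketched in the same spirit. The code below is an illustration under stated assumptions, not tobler's implementation: zones and the habitable-land mask are axis-aligned boxes given as `(xmin, ymin, xmax, ymax)` tuples, the mask boxes are assumed not to overlap one another, and every function name is hypothetical.

```python
def intersect(a, b):
    """Intersection of two (xmin, ymin, xmax, ymax) boxes, or None if disjoint."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def area(box):
    """Area of a box; None (no intersection) counts as zero."""
    return 0.0 if box is None else (box[2] - box[0]) * (box[3] - box[1])


def binary_dasymetric(sources, values, targets, habitable):
    """Area-weighted interpolation restricted to habitable land.

    Assumes the habitable boxes do not overlap each other.
    """
    estimates = [0.0] * len(targets)
    for src, value in zip(sources, values):
        # Keep only the parts of the source zone where allocation is allowed.
        pieces = [p for p in (intersect(src, h) for h in habitable) if p is not None]
        usable = sum(area(p) for p in pieces)
        if usable == 0.0:
            continue  # no habitable land in this source, so nothing can be placed
        for j, tgt in enumerate(targets):
            shared = sum(area(intersect(p, tgt)) for p in pieces)
            estimates[j] += value * shared / usable
    return estimates


# One source zone split evenly between two targets; the western strip is water.
sources = [(0, 0, 4, 2)]
population = [90.0]
targets = [(0, 0, 2, 2), (2, 0, 4, 2)]
habitable = [(1, 0, 4, 2)]

estimates = binary_dasymetric(sources, population, targets, habitable)
print(estimates)
```

Plain areal interpolation would split this population equally between the two targets, since each covers half of the source zone. Masking out the water shifts the estimate toward the target that contains more habitable land, which also shows why the result depends directly on the quality of the mask.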
Model-based interpolation uses [spatial] statistical models to estimate a relationship between the target variable and a set of covariates such as physical features, administrative designations, or demographic and architectural characteristics. Model-based approaches offer the ability to incorporate the richest set of additional data, but they can also be difficult to wield in practice because the true relationship between variables is never known. By definition, some formal assumptions of regression models are violated because the target variable is always predicted using data at a different spatial scale from the one at which the model was estimated.
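As a rough illustration of the model-based workflow, the sketch below fits a single-covariate least-squares model at the source scale and applies it at the target scale. It is a deliberately simplified stand-in, not the models tobler provides: the covariate (a count of developed-land cells per zone), the through-origin linear form, and all numbers are hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope for the model y = beta * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Source zones: the variable of interest is observed here (hypothetical data).
source_developed = [10.0, 20.0, 40.0]    # developed-land cells per source zone
source_population = [105.0, 190.0, 410.0]

# Step 1: estimate the relationship at the source scale.
beta = fit_through_origin(source_developed, source_population)

# Step 2: apply it at the target scale, where only the covariate is known.
target_developed = [5.0, 25.0, 40.0]     # developed-land cells per target zone
estimates = [beta * cells for cells in target_developed]

print(beta)
print(estimates)
```

The two steps mirror the caveat above: the coefficient is estimated on source zones but used to predict for target zones of a different size and shape. Note also that nothing in this sketch forces the predictions to sum to the observed source total.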
tobler is under active development and will continue to incorporate emerging interpolation methods as they are introduced to the field. We welcome any and all contributions. If you would like to propose an additional method for adoption, please raise an issue for discussion or open a new pull request!
$ conda env create -f environment.yml
$ conda activate tobler
$ pip install -e . --no-deps
PySAL-tobler is under active development and contributors are welcome.
If you have any suggestions, feature requests, or bug reports, please open a new issue on GitHub. To submit patches, please follow the PySAL development guidelines and open a pull request. Once your changes get merged, you’ll automatically be added to the Contributors List.
The project is licensed under the BSD license.
Award #1733705 Neighborhoods in Space-Time Contexts
Award #1831615 Scalable Geospatial Analytics for Social Science Research