
[WIP] start kriging module #140

Closed · wants to merge 2 commits

Conversation

knaaptime (Member)

this is a first draft at adding a kriging module based on pykrige. Initial explorations were pretty positive, though the quality of the interpolation obviously depends a great deal on the variogram fit
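
For reference, here is a minimal sketch of fitting and inspecting a variogram with pykrige on source polygon centroids. This is only a sketch under assumptions: `source_df` is a GeoDataFrame and `"population"` is a hypothetical column name, and none of this is the module's eventual API.

```python
# Minimal sketch (not the module's API): fit an ordinary-kriging model on
# source polygon centroids with pykrige and inspect the fitted variogram,
# since the variogram fit drives the interpolation quality.
from pykrige.ok import OrdinaryKriging

pts = source_df.geometry.centroid  # source_df: a GeoDataFrame (assumed)

model = OrdinaryKriging(
    pts.x.values,
    pts.y.values,
    source_df["population"].values,  # hypothetical column name
    variogram_model="spherical",     # could also try "linear", "exponential", ...
)

# fitted parameters of the chosen variogram model, useful for judging the fit
print(model.variogram_model_parameters)
```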

knaaptime requested a review from sjsrey on April 27, 2021
codecov-commenter commented on Apr 27, 2021

Codecov Report

Merging #140 (e3a07b6) into master (32c8525) will decrease coverage by 3.45%.
The diff coverage is 0.00%.


@@            Coverage Diff             @@
##           master     #140      +/-   ##
==========================================
- Coverage   81.25%   77.79%   -3.46%     
==========================================
  Files          17       19       +2     
  Lines         832      869      +37     
==========================================
  Hits          676      676              
- Misses        156      193      +37     
Impacted Files Coverage Δ
tobler/kriging/__init__.py 0.00% <0.00%> (ø)
tobler/kriging/kriging.py 0.00% <0.00%> (ø)


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

knaaptime (Member, Author)

Currently this is just to get started exploring the mechanics of the external libraries. The first draft takes a really naive approach, assigning the predicted value at each target_df centroid to the whole polygon. Instead, we should probably generate a geocube raster of the prediction surface, then allow both (a) averaging of pixel values inside each polygon and (b) proper block kriging.
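
A rough sketch of that raster route follows, using pykrige's grid output and rasterstats for the pixel averaging as a stand-in for a full geocube pipeline. The cell size, CRS handling, and the name `krige_zonal_mean` are placeholders, not a proposed design.

```python
# Sketch only: predict a kriged grid, then average pixels inside each
# target polygon. rasterstats stands in for a geocube-based pipeline;
# the resolution handling here is a placeholder.
import numpy as np
from pykrige.ok import OrdinaryKriging
from rasterio.transform import from_origin
from rasterstats import zonal_stats


def krige_zonal_mean(source_df, target_df, variable, cell_size=250):
    pts = source_df.geometry.centroid
    model = OrdinaryKriging(
        pts.x.values,
        pts.y.values,
        source_df[variable].values,
        variogram_model="spherical",
    )

    # build a regular grid over the target extent and predict on it
    minx, miny, maxx, maxy = target_df.total_bounds
    gridx = np.arange(minx, maxx, cell_size)
    gridy = np.arange(miny, maxy, cell_size)
    surface, _ = model.execute("grid", gridx, gridy)

    # pykrige returns rows ordered south-to-north; flip so row 0 is the
    # northern edge, matching the top-left affine transform below
    raster = np.flipud(np.asarray(surface))
    transform = from_origin(minx, maxy, cell_size, cell_size)

    # average the prediction-surface pixels that fall inside each polygon
    stats = zonal_stats(target_df, raster, affine=transform, stats="mean")
    out = target_df.copy()
    out[variable] = [s["mean"] for s in stats]
    return out
```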

sjsrey (Member) left a comment:

I think we need to consider different treatments for the extensive and intensive variables. At first glance, kriging seems more appropriate for the latter.

knaaptime (Member, Author) commented on Jun 27, 2021

agreed on both.

I've also played around a bit further, and there are a few different ways we could go about this (and maybe we provide options for more than one). The question is how we want to shoehorn the very discrete process of human geography into a continuous spatial model (though, as you said, it should work reasonably well for percentages).

  1. The approach in the current PR is the simplest (probably overly so). It estimates the model using polygon centroids from source_df as observations, then uses that model to predict values at the centroids of target_df. The issue is that, especially for extensive variables like counts, we end up way overestimating the volume of the total surface (so that implementation also includes a rescale). We don't have "control" observations in places with 0 population, so the estimated surface doesn't have the variation we need it to have.
  2. Estimate the model using source_df centroids, then predict a continuous raster, then take the average of the pixel values that fall inside each target_df polygon. I think this is closer to the spirit of block kriging, though I'm still looking for the best reference.
  3. Rasterize input_df and estimate the model using that raster, then predict a continuous raster and take the average within target_df polygons. This might help capture some of the "harder" edges between polygons that get overly smoothed in approach (1), but it also kind of inflates the data (estimating raster resolution x polygon area "observations" instead of one per polygon), so it might end up with some oddities for places with lots of heterogeneously sized polygons. It is also really computationally intensive because the training data becomes so large, so a hybrid option of sorts might be to use something like pointpats to drop random points inside each polygon and use those as observations (see the sketch after this list).
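
A sketch of the hybrid idea in (3), with a plain rejection sampler standing in for pointpats; `points_per_poly` and both function names are illustrative only, not a proposed interface.

```python
# Sketch of the "random points per polygon" hybrid from option 3:
# scatter a handful of points inside each source polygon, give each
# point the polygon's value, and krige on those instead of a full raster.
# The sampler below is simple rejection sampling; pointpats (or
# GeoPandas' point sampling) could be swapped in.
import numpy as np
from pykrige.ok import OrdinaryKriging
from shapely.geometry import Point


def sample_in_polygon(poly, n, rng):
    """Draw n uniform random points inside a (multi)polygon by rejection."""
    minx, miny, maxx, maxy = poly.bounds
    pts = []
    while len(pts) < n:
        candidate = Point(rng.uniform(minx, maxx), rng.uniform(miny, maxy))
        if poly.contains(candidate):
            pts.append(candidate)
    return pts


def point_training_set(source_df, variable, points_per_poly=10, seed=0):
    """Build x, y, z training arrays from random points labeled with polygon values."""
    rng = np.random.default_rng(seed)
    xs, ys, zs = [], [], []
    for geom, value in zip(source_df.geometry, source_df[variable]):
        for p in sample_in_polygon(geom, points_per_poly, rng):
            xs.append(p.x)
            ys.append(p.y)
            zs.append(value)
    return np.array(xs), np.array(ys), np.array(zs)


# usage sketch:
# x, y, z = point_training_set(source_df, "pct_poverty")
# model = OrdinaryKriging(x, y, z, variogram_model="spherical")
```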

knaaptime (Member, Author)

Actually, a 4th option, riffing on 3, would be to include auxiliary data to mask out uninhabited regions of source_df, then randomly drop points in the inhabited areas and assign them values from source_df, then drop random points in the uninhabited areas, assign them all 0, and estimate on that "surface" (a rough sketch follows below).
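
A sketch of that masking step, assuming a hypothetical `inhabited_mask` GeoDataFrame (e.g. derived from land-cover data) and reusing `point_training_set` from the sketch above; the overlay-based split is one possible way to do the masking, not a settled design.

```python
# Sketch of option 4: split each source polygon into inhabited and
# uninhabited parts using a hypothetical mask layer, sample points with
# source values in the inhabited parts and zero-valued points in the
# remainder, then krige on the combined set.
import geopandas
import numpy as np


def masked_training_set(source_df, inhabited_mask, variable, points_per_poly=10):
    inhabited = geopandas.overlay(source_df, inhabited_mask, how="intersection")
    uninhabited = geopandas.overlay(source_df, inhabited_mask, how="difference")

    # point_training_set() comes from the previous sketch
    x1, y1, z1 = point_training_set(inhabited, variable, points_per_poly)
    x0, y0, _ = point_training_set(uninhabited, variable, points_per_poly)

    x = np.concatenate([x1, x0])
    y = np.concatenate([y1, y0])
    z = np.concatenate([z1, np.zeros_like(x0)])  # uninhabited areas pinned to 0
    return x, y, z
```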

knaaptime deleted the branch pysal:master on May 10, 2023
knaaptime closed this on May 10, 2023