As part of the DeepDEM framework for refining DSMs, we need to calculate a global scale factor by which we will scale the input DSMs. This notebook is used to calculate this scale factor

In [1]:
# torchgeo imports
from torchgeo.datasets import RasterDataset
from torchgeo.samplers import RandomBatchGeoSampler, Units

# GIS imports
import rasterio

# misc imports
import numpy as np
from pathlib import Path

import torch  # for reproducibility
torch.manual_seed(0)

<torch._C.Generator at 0x7f1a45ba1650>

In [2]:
# user specifies path of processed rasters generated in the previous notebook
output_data_path = Path('/mnt/working/karthikv/DeepDEM/data/mt_baker/WV01_20150911_1020010042D39D00_1020010043455300/processed_rasters')
# output_data_path = Path('/mnt/working/karthikv/DeepDEM/data/mt_baker/WV03_20150930_10400100110E9600_1040010011B0B900/processed_rasters')
# output_data_path = Path('/mnt/working/karthikv/DeepDEM/data/mt_baker/WV02_20130911_1030010026900000_1030010027BE9000/processed_rasters')

dsm_file = output_data_path / 'final_asp_dsm.tif'

assert dsm_file.exists(), "DSM file not found!"

The scale factor we want to calculate is the standard deviation seen in heights calculated for our training chips, filtered for outliers (values within the 5th-95th percentile). To do this, we randomly sample chips across the training area of our input DSM.

In [3]:
CHIP_SIZES=[64, 128, 256, 512, 1024]

for CHIP_SIZE in CHIP_SIZES:
    mtbaker_asp_dem = RasterDataset(str(dsm_file))
    sampler = RandomBatchGeoSampler(mtbaker_asp_dem, size=CHIP_SIZE, units=Units.PIXELS, batch_size=32, length=5000)

    def return_sample_std(batch):
        std_values = []
        for b in batch:
            minx, maxx, miny, maxy, _, _ = b
            with rasterio.open(dsm_file) as ds:
                img = ds.read(1, window=rasterio.windows.from_bounds(minx, miny, maxx, maxy, transform=ds.transform)).flatten()
                img = np.ma.masked_where(img == ds.nodata, img)
                std_values.append(np.std(img))

        return std_values

    std_values = sum(map(return_sample_std, sampler), [])
            
    lower_percentile, upper_percentile = np.percentile(std_values,  5), np.percentile(std_values,  95)
    std_values = np.ma.masked_where((std_values < lower_percentile) & (std_values > upper_percentile), std_values)

    gsf = np.nanmean(std_values)
    print(f"Global scale factor for dataset DSM@patch size = ({CHIP_SIZE}x{CHIP_SIZE}): ", gsf)

  a = np.asanyarray(a)
  std_values = np.ma.masked_where((std_values < lower_percentile) & (std_values > upper_percentile), std_values)
  a = np.array(a, copy=copy, subok=True)


Global scale factor for dataset DSM@patch size = (64x64):  9.862463005436561
Global scale factor for dataset DSM@patch size = (128x128):  18.590830177030675
Global scale factor for dataset DSM@patch size = (256x256):  34.90491386259698
Global scale factor for dataset DSM@patch size = (512x512):  63.51072066949995
Global scale factor for dataset DSM@patch size = (1024x1024):  112.10995766644817


For the WV01 Mt Baker dataset (20150911), the DSM scale factor is about 34.81 for a chip size of (256x256) pixels, and 63.67 at (512x512)