In [4]:
from sandmining.load_observations import load_observations
from sandmining.process_annotations import process_annotations
from sandmining.dataset import PatchDataset
from sandmining.train_model import train_model
from sandmining.inference import evaluate_model
from sandmining.visualizations import *

## Load the Data

Download the .tif images, along with the annotation and river GeoJSON files, from the Google Cloud bucket, and save them in the relevant data directories.

In [2]:
load_observations()

## Preprocess Label Annotations and River Polygons

Convert the GeoJSON FeatureCollections into geopandas GeoDataFrames and set the appropriate coordinate reference system for the labels and rivers. Convert these polygon sets into boolean mask rasters matching the dimensions of the original .tif source image. rasterio's rasterize function "fills in" the polygons so that the boolean raster masks contain solid shapes instead of just boundaries.

In [3]:
process_annotations()

2 409 1224
1 409 1224
0 1231 459


## Train the Model

I picked a PyTorch implementation of U-Net for this semantic segmentation task for the following reasons. Its architecture handles multi-class segmentation, is robust to variations in image size and content, works well when training data is limited, and is versatile across a variety of segmentation tasks.

The encoder and decoder structures (downsampling and upsampling paths) allow the model to extract high-level information about image content and to combine low-level encoder features with decoder features for better localization. Skip connections carry the encoder features across to the decoder, which preserves boundary information and improves localization accuracy. They also help training by mitigating the vanishing gradient problem. Lastly, the fully convolutional layers, in conjunction with the encoder-decoder architecture, allow for pixel-wise segmentation and context aggregation.
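The encoder, decoder, and skip-connection pattern can be sketched with a deliberately tiny, single-level network. `TinyUNet`, `conv_block`, and the channel widths are hypothetical names and values chosen for illustration; this is not the pre-trained model that `train_model` uses.

```python
import torch
from torch import nn


def conv_block(in_channels, out_channels):
    # Two 3x3 convolutions; padding=1 keeps the spatial size unchanged.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Illustrative one-level U-Net (assumed sizes, randomly initialized)."""

    def __init__(self, in_channels=3, num_classes=1, width=16):
        super().__init__()
        self.encoder = conv_block(in_channels, width)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(width, width * 2)
        self.upsample = nn.ConvTranspose2d(width * 2, width, kernel_size=2, stride=2)
        self.decoder = conv_block(width * 2, width)
        self.head = nn.Conv2d(width, num_classes, kernel_size=1)

    def forward(self, x):
        skip = self.encoder(x)                        # full-resolution features
        deep = self.bottleneck(self.pool(skip))       # half-resolution context
        upsampled = self.upsample(deep)               # back to full resolution
        merged = torch.cat([upsampled, skip], dim=1)  # skip connection
        return torch.sigmoid(self.head(self.decoder(merged)))


model = TinyUNet()
with torch.no_grad():
    probabilities = model(torch.rand(2, 3, 64, 64))
print(probabilities.shape)
```

Because every layer is convolutional, the output has one probability per input pixel, and the concatenation in `forward` is what lets the decoder reuse the encoder's fine spatial detail.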

Additionally, I implemented a custom PyTorch Dataset that samples patches from the source image uniformly at random using a sliding-window algorithm. The dataset only samples patches within river bounds: a sliding-window patch is kept if at least one of its pixels lies within the river polygon.
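One way to picture this sampling logic is the sketch below. `RiverPatchSampler` is a hypothetical stand-in for `PatchDataset`, written as a plain class with the `__len__`/`__getitem__` protocol of a map-style dataset so that it only needs NumPy; the patch size, stride, and array shapes are assumptions.

```python
import numpy as np


class RiverPatchSampler:
    """Hypothetical sketch: sliding-window patches that overlap the river mask."""

    def __init__(self, image, labels, river_mask, patch_size=64, stride=64, seed=0):
        self.image = image            # assumed shape (channels, height, width)
        self.labels = labels          # assumed shape (height, width)
        self.patch_size = patch_size
        self.rng = np.random.default_rng(seed)
        height, width = river_mask.shape
        # Keep a window origin only if at least one pixel in it is river.
        self.origins = [
            (row, col)
            for row in range(0, height - patch_size + 1, stride)
            for col in range(0, width - patch_size + 1, stride)
            if river_mask[row:row + patch_size, col:col + patch_size].any()
        ]

    def __len__(self):
        return len(self.origins)

    def __getitem__(self, index):
        row, col = self.origins[index]
        size = self.patch_size
        return (
            self.image[:, row:row + size, col:col + size],
            self.labels[row:row + size, col:col + size],
        )

    def sample(self):
        # Uniform random choice among the valid windows.
        return self[int(self.rng.integers(len(self)))]


# Toy data: a 128x128 image whose river covers only the top-left corner.
image = np.zeros((3, 128, 128), dtype=np.float32)
labels = np.zeros((128, 128), dtype=bool)
river_mask = np.zeros((128, 128), dtype=bool)
river_mask[:10, :10] = True

dataset = RiverPatchSampler(image, labels, river_mask)
patch, patch_labels = dataset.sample()
print(len(dataset), patch.shape, patch_labels.shape)
```

Filtering the window origins once in the constructor keeps `__getitem__` cheap, and a shuffling data loader over the indices would give the same uniform sampling as `sample()`.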

I used a pre-trained U-Net with ImageNet weights for the encoder and a sigmoid activation function for the last layer.

In [None]:
train_model('unet')

## Evaluate the Model, Visualize Predictions

In [None]:
evaluate_model()

## Final Thoughts