# Introduction #

This notebook shows how to produce predictions from Sentinel-2 L1C imagery using any of a number of models found in the [geotrellis/deeplab-nlcd](https://github.com/geotrellis/deeplab-nlcd) repository.

Any of the "binary" (one class, yes/no) networks in the [architectures](https://github.com/geotrellis/deeplab-nlcd/tree/master/python/architectures) directory should work.  These include architectures whose names contain the substring "-binary" and networks whose names contain the substring "sentinel2".  Architectures in the former group also require a "weights.pth" file (trained with the [training script](https://github.com/geotrellis/deeplab-nlcd/blob/master/python/train.py), either by you or by someone else).  Architectures in the latter group require no weights file because they are just spectral indices with no parameters.
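
The naming rule above can be sketched as a small check.  This is only an illustration: `needs_weights` is a hypothetical helper that is not part of the repository, and testing for "sentinel2" before "-binary" is an assumption about how a name containing both substrings should be treated.  The notebook itself decides at load time by looking for a `no_weights` attribute on the model.

```python
import posixpath
from urllib.parse import urlparse


def needs_weights(architecture_uri: str) -> bool:
    """Guess from the file name whether an architecture needs a weights file.

    Hypothetical helper: it only encodes the naming rule described in the text.
    """
    name = posixpath.basename(urlparse(architecture_uri).path)
    # Assumption: "sentinel2" architectures are parameter-free spectral indices
    if 'sentinel2' in name:
        return False
    if '-binary' in name:
        return True
    raise ValueError('not a recognized binary architecture: ' + name)


print(needs_weights('python/architectures/cheaplab-regression-binary.py'))
```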

The "Manual Preparation" section shows how to produce predictions one at a time; the "Batch Preparation" section shows how to produce them in bulk.

# Manual Preparation #

## Prerequisites ##

In [0]:
!pip install rasterio

In [0]:
import copy

import numpy as np
import requests

import rasterio as rio
import torch
import torchvision

import PIL.Image

from urllib.parse import urlparse

## Mount (Colab only) ##

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

## Load Sentinel-2 L1C Imagery ##

In [0]:
imagery_path = '/content/gdrive/My Drive/data/imagery.tif'

with rio.open(imagery_path) as ds:
  profile = copy.copy(ds.profile)
  imagery_data = ds.read()

In [0]:
red = imagery_data[3,:,:] / 4500.0
green = imagery_data[2,:,:] / 4500.0
blue = imagery_data[1,:,:] / 4500.0

In [0]:
red = np.clip(red, 0.0, 1.0)
green = np.clip(green, 0.0, 1.0)
blue = np.clip(blue, 0.0, 1.0)

In [0]:
red = (red * 255).astype(np.uint8)
green = (green * 255).astype(np.uint8)
blue = (blue * 255).astype(np.uint8)

In [0]:
PIL.Image.fromarray(np.stack([red, green, blue], axis=2))
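
The scale, clip, and convert cells above can also be folded into one function.  The following is a minimal sketch under the same assumptions as those cells (red, green, and blue at band indices 3, 2, and 1, and a brightness scale of 4500); `to_rgb_preview` is a hypothetical name, and the synthetic array merely stands in for `imagery_data`.

```python
import numpy as np


def to_rgb_preview(imagery: np.ndarray, scale: float = 4500.0) -> np.ndarray:
    """Turn a (bands, height, width) array into a (height, width, 3) uint8 image."""
    # Pick red, green, blue (indices 3, 2, 1) and scale to roughly [0, 1]
    rgb = imagery[[3, 2, 1], :, :].astype(np.float64) / scale
    # Saturate anything brighter than the scale
    rgb = np.clip(rgb, 0.0, 1.0)
    # Convert to 8 bits and move the band axis last, as PIL expects
    return (rgb * 255).astype(np.uint8).transpose(1, 2, 0)


# Synthetic 13-band image standing in for imagery_data
fake = np.zeros((13, 4, 5), dtype=np.uint16)
fake[3] = 9000  # red band well above the scale, so it saturates
preview = to_rgb_preview(fake)
print(preview.shape, preview.dtype)
```

The result can be passed to `PIL.Image.fromarray` in the same way as the stacked array above.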

## Load Architecture ##

In [0]:
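def read_text(uri: str):
    # Fetch the text over HTTP(S); otherwise read it from a local file
    parsed = urlparse(uri)
    if parsed.scheme.startswith('http'):
        return requests.get(uri).text
    else:
        with open(uri, encoding='utf-8', mode='r') as f:
            return f.read()

def load_architecture(uri: str):
    # Execute the architecture file so that its definitions
    # (such as make_model) become available as globals
    arch_str = read_text(uri)
    arch_code = compile(arch_str, uri, 'exec')
    exec(arch_code, globals())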

In [0]:
# This can be replaced with other architectures
architecture = 'https://raw.githubusercontent.com/geotrellis/deeplab-nlcd/055b6f42f9042d5d443a0f93fbb7f7ae952b3706/python/architectures/cheaplab-regression-binary.py'

In [0]:
load_architecture(architecture)

## Make Model, Load Weights ##

In [0]:
backend = 'cpu' # 'cuda' can also be used
device = torch.device(backend)
band_count = 13 # number of bands in the imagery (13 for Sentinel-2 L1C)
input_stride = 1
class_count = 1
divisor = 1

model = make_model(
    band_count,
    input_stride=input_stride,
    class_count=class_count,
    divisor=divisor,
    pretrained=False,
).to(device)

In [0]:
weights = '/content/gdrive/My Drive/data/weights.pth'

if not hasattr(model, 'no_weights'):
    model.load_state_dict(torch.load(
        weights, map_location=device))

## Inference ##

In [0]:
window_size = 64
width = profile.get('width')
height = profile.get('height')
predictions = np.zeros((1, height, width), dtype=np.float32)

model.eval()

with torch.no_grad():
    for x_offset in range(0, width, window_size):
        # Shift the last window back so that it ends exactly at the right edge
        if x_offset + window_size > width:
            x_offset = width - window_size
        for y_offset in range(0, height, window_size):
            # Likewise for the bottom edge
            if y_offset + window_size > height:
                y_offset = height - window_size
            window = imagery_data[0:band_count, y_offset:y_offset+window_size, x_offset:x_offset+window_size].astype(np.float32)
            tensor = torch.from_numpy(np.stack([window], axis=0)).to(device)
            out = model(tensor)
            # Some architectures return a dict; take its '2seg' entry
            if isinstance(out, dict):
                out = out['2seg']
            out = out.cpu().numpy()
            predictions[:, y_offset:y_offset+window_size, x_offset:x_offset+window_size] = out[0]
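
The tiling logic in the loop above can be checked in isolation.  The sketch below is one way to compute the window offsets along a single axis so that the final window ends exactly on the image edge; `window_offsets` is a hypothetical helper, and it assumes the image is at least as large as the window.

```python
def window_offsets(length: int, window_size: int) -> list:
    """Offsets of windows of size window_size that together cover [0, length)."""
    if length < window_size:
        raise ValueError('image is smaller than the window')
    offsets = []
    for offset in range(0, length, window_size):
        # Shift a window that would overhang back so it ends at the edge
        offsets.append(min(offset, length - window_size))
    return offsets


print(window_offsets(150, 64))  # → [0, 64, 86]
```

With a width of 150 and a window of 64, the last window covers columns 86 through 149 and overlaps its neighbor; the overlapping pixels are simply predicted twice.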

In [0]:
pred_min = np.min(predictions)
pred_max = np.max(predictions)
pred_uint8 = (np.clip((predictions - pred_min) / (pred_max - pred_min), 0.0, 1.0) * 255).astype(np.uint8)[0]

In [0]:
PIL.Image.fromarray(pred_uint8)
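
The min-max scaling above divides by `pred_max - pred_min`, which is zero when every prediction is identical.  A guarded variant might look like the sketch below; `scale_to_uint8` is a hypothetical name, and mapping a constant raster to all zeros is an arbitrary choice made for this illustration.

```python
import numpy as np


def scale_to_uint8(values: np.ndarray) -> np.ndarray:
    """Min-max scale an array to the range 0..255 for display."""
    lo = float(np.min(values))
    hi = float(np.max(values))
    if hi == lo:
        # Constant input: nothing to stretch, so return a black image
        return np.zeros(values.shape, dtype=np.uint8)
    scaled = np.clip((values - lo) / (hi - lo), 0.0, 1.0)
    return (scaled * 255).astype(np.uint8)


example = scale_to_uint8(np.array([[0.0, 0.5], [1.0, 0.25]]))
print(example)
```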

## Save ##

In [0]:
profile.update(dtype = np.float32, count=1, compress='lzw', predictor=2)

predictions_path = '/content/gdrive/My Drive/data/prediction.tif'

with rio.open(predictions_path, 'w', **profile) as ds:
  ds.write(predictions)

# Batch Preparation #

### Step 1 ###

Clone the [`geotrellis/deeplab-nlcd`](https://github.com/geotrellis/deeplab-nlcd) repository to a local directory.

Type
```
git clone git@github.com:geotrellis/deeplab-nlcd.git
```
or similar.

### Step 2 ###

Enter the root of the repository directory.

Type
```
cd deeplab-nlcd
```
or similar.

### Step 3 ###

Start a docker container with the needed dependencies.

Type
```
docker run -it --rm -w /workdir -v $HOME/.aws:/root/.aws:ro -v $(pwd):/workdir -v $HOME/Desktop:/desktop --runtime=nvidia jamesmcclain/aws-batch-ml:9 bash
```
or similar.  This sample command line mounts the local directory `~/Desktop/` at `/desktop` inside the container; that directory is assumed to contain the imagery on which we wish to work.  We will see later that it is also possible to use imagery on S3.

You are now within the docker container.

### Step 4 ###

Build the native library needed by the Python code, if that library does not already exist:

Type
```
make -C /workdir/src/libchips
```
or similar.

### Step 5 ###

Now perform inference on imagery.

Type
```
python3 /workdir/python/inference.py --architecture https://raw.githubusercontent.com/geotrellis/deeplab-nlcd/055b6f42f9042d5d443a0f93fbb7f7ae952b3706/python/architectures/cheaplab-regression-binary.py --libchips /workdir/src/libchips/libchips.so --bands 1 2 3 4 5 6 7 8 9 10 11 12 13 --inference-img /desktop/imagery/image*.tif --weights /desktop/weights.pth --raw-prediction-img '/desktop/predictions/cheaplab/*' --classes 1 --window-size 64
```
or similar.

Note that `~/Desktop/imagery/` is assumed to contain the imagery (files with names matching the pattern `image*.tif`) and the directory `~/Desktop/predictions/cheaplab/` is assumed to exist.

Note that the single quotes around the argument to `--raw-prediction-img` are required to prevent the shell from trying to interpret the `*`.
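
If you prefer to assemble the command programmatically, the standard library's `shlex.join` (Python 3.8 or later) applies this quoting for you.  The sketch below is an illustration only: `build_inference_command` is a hypothetical helper, the image file names are made up, and the flags are copied from the command shown above rather than from the script's documentation.

```python
import shlex


def build_inference_command(architecture, images, weights, prediction_pattern,
                            bands=range(1, 14), classes=1, window_size=64,
                            libchips='/workdir/src/libchips/libchips.so'):
    """Build a shell-safe command line for python/inference.py."""
    argv = [
        'python3', '/workdir/python/inference.py',
        '--architecture', architecture,
        '--libchips', libchips,
        '--bands', *[str(band) for band in bands],
        '--inference-img', *images,
        '--weights', weights,
        # The '*' in this pattern must reach the script, not the shell
        '--raw-prediction-img', prediction_pattern,
        '--classes', str(classes),
        '--window-size', str(window_size),
    ]
    # shlex.join quotes any argument containing shell metacharacters
    return shlex.join(argv)


command = build_inference_command(
    'https://raw.githubusercontent.com/geotrellis/deeplab-nlcd/055b6f42f9042d5d443a0f93fbb7f7ae952b3706/python/architectures/cheaplab-regression-binary.py',
    ['/desktop/imagery/image1.tif', '/desktop/imagery/image2.tif'],  # made-up names
    '/desktop/weights.pth',
    '/desktop/predictions/cheaplab/*',
)
print(command)
```

The printed command lists the input images explicitly instead of relying on the shell to expand `image*.tif`, and the prediction pattern comes out wrapped in single quotes.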


You are done.  Your predictions can now be found in `~/Desktop/predictions/cheaplab/`.  You can also do predictions on remote assets on S3 by typing
```
python3 /workdir/python/inference.py --architecture https://raw.githubusercontent.com/geotrellis/deeplab-nlcd/055b6f42f9042d5d443a0f93fbb7f7ae952b3706/python/architectures/cheaplab-regression-binary.py --libchips /workdir/src/libchips/libchips.so --bands 1 2 3 4 5 6 7 8 9 10 11 12 13 --inference-img s3://my-bucket/imagery/image*.tif --weights /desktop/weights.pth --raw-prediction-img 's3://my-bucket/predictions/cheaplab/*' --classes 1 --window-size 64
```
or similar.

You can also mix and match locations: you can read remote imagery and save the predictions locally, or read local imagery and save the predictions to S3.
