# Project Shackleton - Data Curation and Deep Learning Inference

In the cells below we prepare data for the Shackleton Dashboard, then run our vehicle (YOLTv5) and road (CRESI) detection algorithms on it. Both data preparation and deep learning inference run in the freely available Amazon SageMaker Studio Lab.


-----
## 1. Create the SageMaker Studio Lab Environment

    # install yolov5
    # https://github.com/ultralytics/yolov5
    cd /home/studio-lab-user
    conda activate default
    git clone https://github.com/ultralytics/yolov5
    cd yolov5
    pip install -r requirements.txt  # install

    # update with geo packages
    conda install -c conda-forge gdal
    conda install -c conda-forge osmnx=0.12
    conda install -c conda-forge scikit-image
    conda install -c conda-forge statsmodels
    conda install -c conda-forge matplotlib
    conda install -c conda-forge ipykernel 
    pip install torchsummary
    pip install utm
    pip install numba
    pip install jinja2==2.10
    pip install geopandas==0.8
    
    # clone shackleton codebase
    git clone https://github.com/avanetten/shackleton.git

    # clone YOLTv5 and CRESI
    cd /home/studio-lab-user/shackleton/src/
    git clone https://github.com/avanetten/cresi.git
    git clone https://github.com/avanetten/yoltv5.git
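
Before moving on, it can help to confirm that the packages installed above are importable from the `default` environment. The following is a minimal sketch and not part of the Shackleton codebase: `check_packages` is a hypothetical helper, and the import names (for example `osgeo` for GDAL and `skimage` for scikit-image) are the module names those packages are assumed to expose.

```python
import importlib.util

# Import names assumed for the packages installed above
REQUIRED_MODULES = [
    "osgeo", "osmnx", "skimage", "statsmodels", "matplotlib",
    "torchsummary", "utm", "numba", "jinja2", "geopandas",
]


def check_packages(module_names):
    """Return {module_name: bool} saying whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


if __name__ == "__main__":
    # Report each module without actually importing it
    for name, found in check_packages(REQUIRED_MODULES).items():
        print(f"{name:15s} {'ok' if found else 'MISSING'}")
```

Any module reported as missing can be reinstalled with the corresponding `conda` or `pip` command from the list above.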

-----
## 2. Download Data

Because pre-trained model weights are available, we do not need the SpaceNet training data and download only the test data. For this exercise we explore SpaceNet Area of Interest (AOI) \#10: Dar Es Salaam. This city was withheld for testing in SpaceNet 5, so the pre-trained model has never been trained on it. To download the data (25 GB):

In [None]:
test_im_raw_dir = '...'
!aws s3 cp --recursive s3://spacenet-dataset/AOIs/AOI_10_Dar_Es_Salaam/PS-MS/ {test_im_raw_dir}


### Prepare Test Data

While CRESI is designed to handle images of arbitrary size and extent, for this exercise we clip the image to a smaller extent to speed up processing and ease visualization. We also convert the 8-band, 16-bit multispectral image to an 8-bit RGB image that is easier to visualize.
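
To make the 8-bit conversion concrete, here is a minimal NumPy sketch of a percentile rescale. It is an illustration under stated assumptions, not the CRESI implementation: `rescale_to_8bit` and `to_rgb_8bit` are hypothetical helpers, the input is assumed to be a `(bands, rows, cols)` array, and the band indices are assumed to be 1-indexed as in GDAL band numbering.

```python
import numpy as np


def rescale_to_8bit(band, low_pct=2, high_pct=98):
    """Stretch one band so the [low_pct, high_pct] percentile range maps to 0..255."""
    band = band.astype(np.float64)
    lo, hi = np.percentile(band, [low_pct, high_pct])
    if hi <= lo:
        # Constant band: nothing to stretch
        return np.zeros(band.shape, dtype=np.uint8)
    scaled = (band - lo) / (hi - lo)
    # Values outside the percentile range saturate at 0 or 255
    return (np.clip(scaled, 0.0, 1.0) * 255).round().astype(np.uint8)


def to_rgb_8bit(image, band_order=(5, 3, 2)):
    """Pick three bands (1-indexed) and stack them as a (rows, cols, 3) uint8 image."""
    channels = [rescale_to_8bit(image[b - 1]) for b in band_order]
    return np.stack(channels, axis=-1)


# Synthetic 8-band, 16-bit image standing in for the real data
rng = np.random.default_rng(0)
demo = rng.integers(0, 2048, size=(8, 64, 64), dtype=np.uint16)
rgb = to_rgb_8bit(demo)
print(rgb.shape, rgb.dtype)
```

Clipping at the 2nd and 98th percentiles discards extreme pixel values, so a few very bright or very dark pixels do not compress the contrast of the rest of the image.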

In [None]:
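# Clip the image extent
import os

# Clip window corners in the image's coordinate system (longitude/latitude here):
# upper-left (ulx, uly) and lower-right (lrx, lry)
ulx, uly, lrx, lry = 39.25252, -6.7580, 39.28430, -6.7880  # v0
outname = 'test1_cog_clip.tif'
# Use the first GeoTIFF found in the raw download directory
im_name = [z for z in os.listdir(test_im_raw_dir) if z.endswith('.tif')][0]
print("im_name:", im_name)
test_im_raw = os.path.join(test_im_raw_dir, im_name)
# test_im_clip_dir (the output directory) must be defined before this cell runs
test_im_clip = os.path.join(test_im_clip_dir, outname)
print("output_file:", test_im_clip)

!gdal_translate -projwin {ulx} {uly} {lrx} {lry} {test_im_raw} {test_im_clip}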

In [None]:
# Convert 16-bit multispectral test data to 8-bit RGB
import os
import matplotlib.pyplot as plt
import skimage.io

# cresi_dir, test_im_clip_dir, and test_final_dir must be defined before this cell runs
%cd {os.path.join(cresi_dir, 'cresi/data_prep/')}
import create_8bit_images

# Rescale each band between its 2nd and 98th percentiles and keep bands 5, 3, 2
create_8bit_images.dir_to_8bit(test_im_clip_dir, test_final_dir,
                              command_file_loc='',
                              rescale_type="perc",
                              percentiles=[2,98],
                              band_order=[5,3,2])

# Display our test image
fig_width, fig_height = 16, 16
im_test_name = [z for z in os.listdir(test_final_dir) if z.endswith('.tif')][0]
im_test_path = os.path.join(test_final_dir, im_test_name)
im_test = skimage.io.imread(im_test_path)

fig, ax = plt.subplots(figsize=(fig_width, fig_height))
_ = ax.imshow(im_test)
_ = ax.set_title(im_test_name)

Image stats for test1_cog_clip.tif: 
  - im.shape: (11770, 11111, 3)
  - n pixels: 130,776,470
  - Area = 11.7 km2
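
As a sanity check, the reported area can be approximated directly from the clip window used above. This is a rough sketch that assumes a spherical Earth and a locally flat patch, which is reasonable for a box only a few kilometers across; `bbox_area_km2` is a hypothetical helper, not part of the Shackleton code.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def bbox_area_km2(ulx, uly, lrx, lry):
    """Approximate the area of a small longitude/latitude box in square kilometers."""
    km_per_deg = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    # Longitude degrees shrink with the cosine of the latitude
    mid_lat = math.radians((uly + lry) / 2.0)
    width_km = abs(lrx - ulx) * km_per_deg * math.cos(mid_lat)
    height_km = abs(uly - lry) * km_per_deg
    return width_km * height_km


# Same clip window as in the gdal_translate cell
area = bbox_area_km2(39.25252, -6.7580, 39.28430, -6.7880)
print(f"Approximate clip area: {area:.1f} km^2")
```

The estimate comes out close to the 11.7 km2 listed in the image stats, which suggests the clip window and the reported area agree.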

In [None]:
# Rewrite the clipped image as a valid Cloud Optimized GeoTIFF (COG).
# The COG output driver requires GDAL 3.1 or later.

%cd $test_im_clip_dir
!gdal_translate test1_cog_clip.tif test1_realcog_clip.tif -of COG -co COMPRESS=LZW

----

## 3. Execute Inference

Open a terminal in Studio Lab and run the following commands.


### YOLTv5

    cd /home/studio-lab-user/shackleton/src/yoltv5/yoltv5
    time ./test.sh /home/studio-lab-user/shackleton/cfg/yoltv5_8class_test_studio_lab.yaml
    # Total time is < 1 min on a GPU


### CRESI

    cd /home/studio-lab-user/shackleton/src/cresi/cresi
    JSON=/home/studio-lab-user/shackleton/cfg/cresi_8class_test_studio_lab.json
    time ./test.sh $JSON
    # Total time is 2.5 minutes on a GPU

-----
## 4. Copy Results Locally

Since Studio Lab cannot run a Bokeh server, the dashboard must run locally. The relevant results folder in Studio Lab is `/home/studio-lab-user/shackleton/results/test1`. Copy this folder to your local machine, then run the dashboard according to the README.
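
One way to make the copy easier is to pack the results folder into a single archive first, then download that file (for example through the JupyterLab file browser). The sketch below is only a suggestion: `archive_results` is a hypothetical helper, and the output location `/home/studio-lab-user/test1_results` is an assumed name chosen for illustration.

```python
import os
import shutil


def archive_results(results_dir, archive_base):
    """Pack results_dir into <archive_base>.tar.gz and return the archive path."""
    results_dir = os.path.abspath(results_dir)
    parent, folder = os.path.split(results_dir)
    # Archive relative to the parent so the tarball unpacks into a single folder
    return shutil.make_archive(archive_base, "gztar", root_dir=parent, base_dir=folder)


results_dir = "/home/studio-lab-user/shackleton/results/test1"
if os.path.isdir(results_dir):
    # Assumed output name; adjust as needed
    print(archive_results(results_dir, "/home/studio-lab-user/test1_results"))
```

After downloading, unpacking the archive locally recreates the `test1` folder for the dashboard to read.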