# SpaceNet8 

SSH into a VM of your choice (I use a Paperspace Core machine that comes with ML-out-of-the-box and has `nvidia-docker` installed). **Make sure to immediately open a folder** in VSCode; opening it later might destroy the existing terminals and reset the docker container.

## Set up the folders (data, repo, and runs)


### Download SpaceNet8 repo and build docker image

The repo lives in **~/Work/SpaceNet8** (on Paperspace, `~` is the home directory `/home/paperspace/`).

On the Paperspace machine, clone our [SpaceNet8 GitHub repo](https://github.com/nesaboz/SpaceNet8.git) from inside `~/Work`:
```
git clone https://github.com/nesaboz/SpaceNet8.git
```


Build docker image (will take a few minutes):
```
sudo nvidia-docker build -t sn8/baseline:1.0 ~/Work/SpaceNet8/docker 
```
There is a way to avoid typing `sudo` every time, but it requires messing with some JSON config files. For now, just use `sudo`.


### Download SpaceNet8 data

Create the data folder with `mkdir` and run the download commands below from inside it: **~/Work/data**

Install `awscli`:
```
pip install awscli
```

Download the dataset (links are also [here](https://spacenet.ai/sn8-challenge/)). Try the download first; if it fails for lack of credentials, log into the [AWS management console](https://aws.amazon.com/console/), go to "Account / Security credentials / Create access key" to get an ACCESS_KEY and SECRET_KEY, and then configure them:
```
aws configure set aws_access_key_id ACCESS_KEY  
aws configure set aws_secret_access_key SECRET_KEY
```

Download training data:
```
aws s3 cp s3://spacenet-dataset/spacenet/SN8_floods/tarballs/Germany_Training_Public.tar.gz .
aws s3 cp s3://spacenet-dataset/spacenet/SN8_floods/tarballs/Louisiana-East_Training_Public.tar.gz .
```
Download testing data:
```
aws s3 cp s3://spacenet-dataset/spacenet/SN8_floods/tarballs/Louisiana-West_Test_Public.tar.gz .
```
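The tarballs are large, so an interrupted download is easy to miss. The helper below is a minimal sketch (the function name `check_archives` is mine, not part of the repo) that reports whether each archive exists and passes gzip's integrity test. It only checks the gzip layer, not the contents.

```shell
# Hypothetical helper: report whether each archive exists and is a valid gzip stream.
check_archives() {
    rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            # File is absent or has zero size.
            echo "MISSING  $f"
            rc=1
        elif ! gzip -t "$f" 2>/dev/null; then
            # gzip -t reads the whole file and verifies its checksum.
            echo "CORRUPT  $f"
            rc=1
        else
            echo "OK       $f"
        fi
    done
    return "$rc"
}
```

For example, from inside `~/Work/data`: `check_archives Germany_Training_Public.tar.gz Louisiana-East_Training_Public.tar.gz Louisiana-West_Test_Public.tar.gz`.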

Extract the archives, and **make sure each one ends up in its own directory**:
```
tar -xf Germany_Training_Public.tar.gz
tar -xf Louisiana-East_Training_Public.tar.gz
tar -xf Louisiana-West_Test_Public.tar.gz
```
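Whether a plain `tar -xf` gives each archive its own directory depends on how the tarball was packed, which I have not verified here. As a hedged sketch, the two helpers below (both names are made up for illustration) first list the top-level entries of an archive, and then, if needed, extract it into a directory named after the archive.

```shell
# Hypothetical helper: list the distinct top-level entries of a .tar.gz without extracting it.
top_level_entries() {
    tar -tzf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

# Hypothetical helper: extract NAME.tar.gz into ./NAME/.
# Assumption: the archive does NOT already contain a top-level NAME/ directory;
# if it does, plain `tar -xf` is enough and this would nest it one level deeper.
extract_into_own_dir() {
    mkdir -p "$(basename "$1" .tar.gz)"
    tar -xzf "$1" -C "$(basename "$1" .tar.gz)"
}
```

If `top_level_entries Germany_Training_Public.tar.gz` prints a single line `Germany_Training_Public`, the archive already has its own directory; if it prints several entries, `extract_into_own_dir Germany_Training_Public.tar.gz` keeps them together.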


### Create a runs folder

```
mkdir -p ~/Work/runs
```
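Before starting the container it is worth confirming that all three folders exist, because `docker run -v` creates a missing host path as an empty directory instead of failing. A minimal sketch, with `check_layout` being a hypothetical helper name:

```shell
# Hypothetical helper: verify that the repo, data, and runs folders exist under a base directory.
check_layout() {
    rc=0
    for d in SpaceNet8 data runs; do
        if [ -d "$1/$d" ]; then
            echo "ok       $1/$d"
        else
            echo "MISSING  $1/$d"
            rc=1
        fi
    done
    # Non-zero exit status if any folder is missing.
    return "$rc"
}
```

Run it as `check_layout ~/Work`; every line should start with `ok`.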


## Run docker container

Make sure the remote machine in VSCode already has a folder open; opening one later might destroy the existing terminals and reset the docker container.

Let's run the container and mount the three folders that we created in the previous steps:
```
sudo nvidia-docker run -v ~/Work/SpaceNet8:/tmp/SpaceNet8 -v ~/Work/data:/tmp/data -v ~/Work/runs:/tmp/runs --ipc=host -it --rm sn8/baseline:1.0 bash
```

I added `--ipc=host` to the command to avoid a shared-memory [issue](https://github.com/pytorch/pytorch#docker-image).

In general, the option `-v /host/path:/container/path` mounts a folder; more options are listed [here](https://docs.docker.com/engine/reference/commandline/run/).

The prompt should now look like `root@<container_id>:/#`, and the folders will be mounted under `/tmp`. Rename this terminal window to **Do not delete** and keep it open, since closing it shuts down the container.

To attach to the container from VSCode, install the "Remote Development" extension [pack](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack), then open the command palette (Cmd+Shift+P) and type "Attach to running container".

To list the running containers from the Paperspace machine (prefix with `sudo` if needed):
```
docker ps
```
To stop a single container, or all of them:
```
docker container stop ID_or_NAME
docker container stop $(docker container ls -aq)
```
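If you need a second shell inside the running container without VSCode, `docker exec` can open one. The sketch below assumes the image tag `sn8/baseline:1.0` from the build step and that `sudo` is required; the function names are hypothetical.

```shell
# Image tag used when building the image (assumed unchanged).
SN8_IMAGE="sn8/baseline:1.0"

# Hypothetical helper: print the ID(s) of running containers started from that image.
sn8_container_id() {
    sudo docker ps --quiet --filter "ancestor=$SN8_IMAGE"
}

# Hypothetical helper: open an extra interactive shell in the first matching container.
sn8_shell() {
    id="$(sn8_container_id | head -n 1)"
    if [ -z "$id" ]; then
        echo "no running container for $SN8_IMAGE" >&2
        return 1
    fi
    sudo docker exec -it "$id" bash
}
```

Exiting a shell opened with `sn8_shell` does not stop the container; only the original `docker run` terminal does that.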


## Data Preparation

Inside the container, from the repo root (`/tmp/SpaceNet8`), first create the intermediate data:

```
python baseline/data_prep/geojson_prep.py --root_dir /tmp/data --aoi_dirs Germany_Training_Public Louisiana-East_Training_Public
```

and then create the masks (this might take about 5 minutes):

```
python baseline/data_prep/create_masks.py --root_dir /tmp/data --aoi_dirs Germany_Training_Public Louisiana-East_Training_Public
```

Let's create a train/validation split (15% validation), written as CSVs to `/tmp/runs`:

```
python baseline/data_prep/generate_train_val_test_csvs.py --root_dir /tmp/data --aoi_dirs Germany_Training_Public Louisiana-East_Training_Public --out_csv_basename sn8_data --val_percent 0.15 --out_dir /tmp/runs
```
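To sanity-check the split, you can count the rows in the two CSVs and compare the ratio with `--val_percent`. This is a rough sketch: it assumes each CSV starts with one header line (I have not confirmed this), and `csv_rows` and `split_summary` are hypothetical names.

```shell
# Hypothetical helper: count data rows in a CSV, assuming exactly one header line.
csv_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}

# Hypothetical helper: print train/val row counts and the integer validation percentage.
# $1 = output directory, $2 = value passed as --out_csv_basename
split_summary() {
    train="$(csv_rows "$1/${2}_train.csv")"
    val="$(csv_rows "$1/${2}_val.csv")"
    total=$((train + val))
    echo "train=$train val=$val"
    if [ "$total" -gt 0 ]; then
        # Integer division, so the result is rounded down.
        echo "val_percent=$((100 * val / total))"
    fi
}
```

With the command above, `split_summary /tmp/runs sn8_data` should report a validation percentage close to 15.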

## Train/validate Foundation Features Network

Now we can train the Foundation Features network:

```
python baseline/train_foundation_features.py --train_csv /tmp/runs/sn8_data_train.csv --val_csv /tmp/runs/sn8_data_val.csv --save_dir /tmp/runs --model_name resnet34 --lr 0.0001 --batch_size 1 --n_epochs 1 --gpu 0
```

I've been running into GPU memory issues and had to reduce the batch size to 1. TODO: try a larger GPU.
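To see how much GPU memory is actually free before picking a batch size, `nvidia-smi` can print the numbers in a script-friendly form. A small sketch, assuming `nvidia-smi` is on the PATH; `gpu_memory_report` is a made-up name:

```shell
# Hypothetical helper: print used, total, and free memory (in MiB) for each GPU.
gpu_memory_report() {
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits |
    while IFS=', ' read -r used total; do
        echo "used=${used}MiB total=${total}MiB free=$((total - used))MiB"
    done
}
```

Running `gpu_memory_report` while training is in progress shows how close the current batch size is to the limit.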

### Inference with Foundation Features Network

Write prediction tiffs to be used for postprocessing and generating the submission.csv (replace the timestamped run folder in `--model_path` with the one from your own training run):
```
python baseline/foundation_eval.py --model_path /tmp/runs/resnet34_lr1.00e-04_bs1_03-05-2023-22-43/best_model.pth --in_csv /tmp/runs/sn8_data_val.csv --save_preds_dir /tmp/runs/foundation --gpu 0 --model_name resnet34
```

Write prediction .pngs for visual inspection of predictions:
```
python baseline/foundation_eval.py --model_path /tmp/runs/resnet34_lr1.00e-04_bs1_03-05-2023-22-43/best_model.pth --in_csv /tmp/runs/sn8_data_val.csv --save_fig_dir /path/to/output/foundation/pngs --gpu 0 --model_name resnet34
```
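The run folder name in the commands above is specific to my training run. Judging only from that example (`resnet34_lr1.00e-04_bs1_03-05-2023-22-43`), run folders appear to start with `<model_name>_lr`, so a helper like the hypothetical `latest_model` below can pick the most recently written `best_model.pth`. Treat the naming pattern as an assumption.

```shell
# Hypothetical helper: print the newest best_model.pth among run folders for a model.
# $1 = runs directory, $2 = model name (e.g. resnet34)
# Assumption: run folders are named "<model_name>_lr...", as in the example path.
latest_model() {
    newest=""
    for f in "$1"/"$2"_lr*/best_model.pth; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    if [ -z "$newest" ]; then
        return 1
    fi
    echo "$newest"
}
```

For example, `MODEL="$(latest_model /tmp/runs resnet34)"` and then pass `--model_path "$MODEL"` to `foundation_eval.py`.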