<a href="https://colab.research.google.com/github/magrimm/DA_live_training/blob/master/ACRE_Cascade_Starter_Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Example of using the Starter Kit
The acre-cascade-starter-kit is available on [the PAL GitHub](https://github.com/predictive-analytics-lab/acre-cascade-starter).

You're free to fork this repository and do whatever you'd like with it. This notebook just shows one example of how you might use it.

In [3]:
%%capture

# First, we download the starter kit from the repository and install it.
# We can use Python's pip package manager to do this for us.
# N.b. the %%capture command just hides all the rubbish that's logged during installation.

!pip install -U git+https://github.com/predictive-analytics-lab/acre-cascade-starter.git --no-cache-dir

# And just in case you need it....
# !pip uninstall -y acre-cascade-starter 

In [4]:
# We can check that this is indeed a GPU instance with the nvidia-smi command.
# If you're not on a GPU instance, switch via Runtime > Change runtime type.
!nvidia-smi

Sat Dec 19 17:19:30 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.45.01    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   42C    P8     9W /  70W |      0MiB / 15079MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces
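
If you'd rather check for a GPU from Python than read the table above, something like the sketch below might do. It only relies on the `nvidia-smi` tool being on the path; the helper name `gpu_summary` is ours and is not part of the starter kit.

```python
import shutil
import subprocess
from typing import Optional


def gpu_summary() -> Optional[str]:
    """Return one line per visible GPU, or None if no GPU can be queried."""
    if shutil.which("nvidia-smi") is None:
        # The tool is missing, so this is most likely a CPU-only runtime.
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU found: change the runtime type before training.")
```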

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


One of the tricky parts of Machine Learning in general is making sure you have access to the data. We've done some of the work for you and added the `AcreCascadeDataModule`, which downloads the data for you and cuts the images into smaller patches to work on. (Feel free to edit this if you'd like a different data pre-processing strategy!)
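
To give a feel for what "producing smaller patches" can mean, here is a minimal sketch that computes the pixel boxes for a grid of non-overlapping tiles. This is not the starter kit's implementation: the function `patch_boxes`, the tile size and the image size are all made up for illustration, and the real data module may crop, pad or overlap differently.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def patch_boxes(width: int, height: int, patch_size: int) -> List[Box]:
    """Tile a width x height image with boxes of at most patch_size pixels per side."""
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    boxes = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Boxes on the right and bottom edges are clipped to the image.
            right = min(left + patch_size, width)
            bottom = min(top + patch_size, height)
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical image size, chosen only so that the edge boxes get clipped.
boxes = patch_boxes(width=1000, height=600, patch_size=512)
print(boxes)
```

Each box could then be passed to an image library's crop function, for both the image and its mask, so that the two stay aligned.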

This is made a little more tricky by Colab not having persistent storage.

However, you can mount a Google Drive. This is fine for storing the data, but it's quite slow during training as you have to download each image patch.

Below are some potentially useful commands for copying some of the data from Google Drive to the Colab instance. This will make training quicker.

In [6]:
# Potentially useful command for copying the test directory to the Colab instance.
!mkdir -p /content/crops/Development_Dataset/Test_Dev/
!rsync -r --info=progress2 /content/drive/MyDrive/crops/Development_Dataset/Test_Dev/ /content/crops/Development_Dataset/Test_Dev/

rsync: change_dir "/content/drive/MyDrive/crops/Development_Dataset/Test_Dev" failed: No such file or directory (2)
              0 100%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/0)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1196) [sender=3.1.2]


In [7]:
# Potentially useful command for copying a subset of images to the Colab instance.
!mkdir -p /content/crops/Development_Dataset/Training/Patches/Images/Roseau/
!mkdir -p /content/crops/Development_Dataset/Training/Patches/Masks/Roseau/

!rsync -r --info=progress2 /content/drive/MyDrive/crops/Development_Dataset/Training/Patches/Images/Roseau/ /content/crops/Development_Dataset/Training/Patches/Images/Roseau/
!rsync -r --info=progress2 /content/drive/MyDrive/crops/Development_Dataset/Training/Patches/Masks/Roseau/ /content/crops/Development_Dataset/Training/Patches/Masks/Roseau/

rsync: change_dir "/content/drive/MyDrive/crops/Development_Dataset/Training/Patches/Images/Roseau" failed: No such file or directory (2)
              0 100%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/0)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1196) [sender=3.1.2]
rsync: change_dir "/content/drive/MyDrive/crops/Development_Dataset/Training/Patches/Masks/Roseau" failed: No such file or directory (2)
              0 100%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/0)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1196) [sender=3.1.2]


In [8]:
# Potentially useful command if you need the data lookup `data.csv`
!mkdir -p /content/crops/Development_Dataset/Training/
!rsync -r --info=progress2 /content/drive/MyDrive/crops/Development_Dataset/Training/data.csv /content/crops/Development_Dataset/Training/

rsync: change_dir "/content/drive/MyDrive/crops/Development_Dataset/Training" failed: No such file or directory (2)
              0 100%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/0)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1196) [sender=3.1.2]
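
The rsync errors above say that the source directories were not on the mounted Drive when these cells ran, so nothing was copied. If you'd like the copy step to skip missing folders quietly instead, a Python version might look like the sketch below. The helper `copy_if_present` is our own name, not something from the starter kit, and `dirs_exist_ok` assumes Python 3.8 or newer.

```python
import shutil
from pathlib import Path


def copy_if_present(src: Path, dst: Path) -> bool:
    """Copy the directory src to dst; return False when src is missing."""
    if not src.is_dir():
        print(f"Skipping: {src} does not exist")
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets the copy be re-run on top of an earlier partial copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return True


# Same paths as the rsync cells above.
drive_root = Path("/content/drive/MyDrive/crops/Development_Dataset")
local_root = Path("/content/crops/Development_Dataset")
copied = copy_if_present(drive_root / "Test_Dev", local_root / "Test_Dev")
```

Unlike rsync, `copytree` copies every file again on each run, so this is only a convenience for small subsets.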


In [9]:
import json
from pathlib import Path
import pytorch_lightning as pl
from src.data import AcreCascadeDataModule
from src.model import UNetSegModel
import torch
import gc
from src.utils import generate_timestamp


seed = None
data_dir = Path("/content/drive/MyDrive/") # Points at the mounted Google Drive, so the downloaded data is kept between sessions
output_dir = Path("/content/drive/MyDrive/")
train_batch_size=8
val_batch_size=8
val_pcnt=0.9 
num_workers=4
num_layers=1
features_start=2
lr=1e-3
bilinear=True
log_to_wandb=False
gpus=1
epochs=1
use_amp=True
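
If the list of loose variables above gets unwieldy, one option is to group them in a small dataclass that also sanity-checks the values. This is only a sketch: `RunConfig` is a name we've made up, it covers just a few of the settings, and the check on `val_pcnt` assumes it is a fraction strictly between 0 and 1, which the starter kit doesn't state here.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunConfig:
    """A hypothetical container for some of the settings used in this notebook."""

    train_batch_size: int = 8
    val_batch_size: int = 8
    val_pcnt: float = 0.9
    num_workers: int = 4
    lr: float = 1e-3
    epochs: int = 1
    use_amp: bool = True

    def __post_init__(self) -> None:
        # Assumption: val_pcnt is a fraction of the data, not a percentage.
        if not 0.0 < self.val_pcnt < 1.0:
            raise ValueError("val_pcnt should be strictly between 0 and 1")
        if self.lr <= 0:
            raise ValueError("lr should be positive")

    @property
    def precision(self) -> int:
        """Mirror the notebook's rule: 16-bit with AMP, otherwise 32-bit."""
        return 16 if self.use_amp else 32


config = RunConfig()
print(asdict(config))
```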

In [10]:
"""Main script."""

# Create a subdir within the output dir, named with a timestamp
run_dir = output_dir / generate_timestamp()
run_dir.mkdir(parents=True)
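
We don't show what `generate_timestamp` returns here, so if you want to name run directories yourself, a standard-library stand-in could look like this. The helper `make_run_dir` and the date format are our own choices, not the starter kit's, and the temporary directory is only there so the example doesn't touch your Drive.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def make_run_dir(output_dir: Path) -> Path:
    """Create and return output_dir/<YYYYmmdd-HHMMSS>."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir = output_dir / stamp
    # exist_ok=False: fail loudly rather than overwrite an earlier run.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


demo_root = Path(tempfile.mkdtemp())
demo_run_dir = make_run_dir(demo_root)
print(demo_run_dir)
```

Note that two runs started within the same second would collide with this format, so add something finer-grained if that matters to you.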



In [11]:


# Set all seeds for reproducibility
if seed is not None:
    pl.seed_everything(seed=seed)
# ------------------------
# 1 INIT DATAMODULE
# ------------------------
dm = AcreCascadeDataModule(
    data_dir=data_dir,
    train_batch_size=train_batch_size,
    val_batch_size=val_batch_size,
    val_pcnt=val_pcnt,
    num_workers=num_workers,
    download=True, # Should you download and save the data?
    teams=["Roseau"], # Just working on a subset of the data for now. To use all teams, pass None
    test_teams=None, # Produce the test set for all teams
)



In [12]:
# ------------------------
# 2 INIT LIGHTNING MODEL
# ------------------------
model = UNetSegModel(
    num_classes=dm.num_classes,
    num_layers=num_layers,
    features_start=features_start,
    lr=lr,
    bilinear=bilinear,
)


In [13]:

# ------------------------
# 3 SET LOGGER
# ------------------------
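logger = None
if log_to_wandb:
    # WandbLogger isn't imported in the cells above, so import it before use.
    from pytorch_lightning.loggers import WandbLogger

    logger = WandbLogger()
    # optional: log model topology
    logger.watch(model.net)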


In [14]:

# ------------------------
# 4 INIT TRAINER
# ------------------------
trainer = pl.Trainer(
    gpus=gpus,
    logger=logger,
    max_epochs=epochs,
    precision=16 if use_amp else 32,
    progress_bar_refresh_rate=5
)



GPU available: True, used: True
TPU available: None, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Using native 16bit precision.


In [15]:
# ------------------------
# 5 START TRAINING
# ------------------------
trainer.fit(model=model, datamodule=dm)




29a85805-2d8d-4701-a9ab-295180c89eb3: 3.50GB [03:26, 17.0MB/s]                            
Training/Bipbip/Haricot:   0%|          | 0/90 [00:00<?, ?it/s]
Training/Bipbip/Haricot:   1%|          | 1/90 [00:05<08:06,  5.46s/it]
Training/Bipbip/Haricot:   2%|▏         | 2/90 [00:14<09:36,  6.55s/it]
Training/Bipbip/Haricot:   3%|▎         | 3/90 [00:22<10:13,  7.06s/it]
Training/Bipbip/Haricot:   4%|▍         | 4/90 [00:27<09:00,  6.28s/it]
Training/Bipbip/Haricot:   6%|▌         | 5/90 [00:31<08:08,  5.74s/it]
Training/Bipbip/Haricot:   7%|▋         | 6/90 [00:36<07:35,  5.42s/it]
Training/Bipbip/Haricot:   8%|▊         | 7/90 [00:40<07:00,  5.06s/it]
Training/Bipbip/Haricot:   9%|▉         | 8/90 [00:45<06:45,  4.94s/it]
Training/Bipbip/Haricot:  10%|█         | 9/90 [00:49<06:30,  4.82s/it]
Training/Bipbip/Haricot:  11%|█         | 10/90 [00:53<06:07,  4.59s/it]
Training/Bipbip/Haricot:  12%|█▏        | 11/90 [00:58<05:52,  4.46s/it]
Training/Bipbip/Haricot:  13%|█▎        | 12/90 [01

HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validation sanity check', layout=Layout…



HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Training', layout=Layout(flex='2'), max…

HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…




1

In [16]:
# ------------------------
# 6 START TESTING
# ------------------------
trainer.test(model=model, datamodule=dm)



HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Testing', layout=Layout(flex='2'), max=…


--------------------------------------------------------------------------------
DATALOADER:0 TEST RESULTS
{'val_loss': tensor(1.5278, device='cuda:0')}
--------------------------------------------------------------------------------


[{'val_loss': 1.5277734994888306}]

In [17]:

# ------------------------
# 7 SAVE THE SUBMISSION
# ------------------------
submission_fp = run_dir / "submission.json"
with open(submission_fp, "w") as f:
    json.dump(model.submission, f)
print(f"Submission saved to {submission_fp.resolve()}")

Submission saved to /content/drive/MyDrive/7022248/submission.json
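
Before uploading, it can be worth reading the file back to make sure it is valid JSON and not empty. The sketch below assumes the submission is a JSON object (a dict); that shape is our assumption, and the demo file written here is a made-up placeholder rather than a real submission.

```python
import json
import tempfile
from pathlib import Path


def load_submission(path: Path) -> dict:
    """Read a submission file back and check that it is a non-empty JSON object."""
    with open(path) as f:
        submission = json.load(f)
    # Assumption: the submission is a dict; adjust if yours is shaped differently.
    if not isinstance(submission, dict) or not submission:
        raise ValueError(f"{path} does not contain a non-empty JSON object")
    return submission


# Placeholder content, only here so the example runs on its own.
demo_fp = Path(tempfile.mkdtemp()) / "submission.json"
demo_fp.write_text(json.dumps({"example_image": {"placeholder": True}}))

loaded = load_submission(demo_fp)
print(f"Loaded {len(loaded)} entries from {demo_fp.name}")
```

In the notebook itself you would call `load_submission(submission_fp)` on the file saved above.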
