In [1]:
import os
import fileinput

## Notebook 1: Model Training

This notebook contains code to train one model for each of the nine areas in the data. To run it, you will need to download the data from Zenodo (DOI: 10.5281/zenodo.10646992). Code to convert GeoTIFFs and shapefiles into COCO format is not included in this repository; it will be released later in a more polished package.

Cell 1: Merge COCO files for each of the individual areas into COCO files describing annotations in 8 of the 9 areas. The remaining area is set aside as the test set.

The original COCO file for area i should be located at this_repo/almorox/tiled_1024/annotations/area_i_1024_COCO.json, which is the path the merge code reads from. The data in the Zenodo repository is already structured this way.
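
Before merging, it can help to confirm that all nine per-area files are where the merge step expects them. The following is a minimal sketch and not part of the original notebook: the helper names `area_annotation_path` and `missing_area_files` are hypothetical, and the directory layout is assumed from the paths used by the merge code in Cell 1.

```python
from pathlib import Path


def area_annotation_path(area, tile_size=1024, root="almorox"):
    """Path of one per-area COCO file (layout assumed from the Cell 1 merge code)."""
    return (
        Path(root)
        / f"tiled_{tile_size}"
        / "annotations"
        / f"area_{area}_{tile_size}_COCO.json"
    )


def missing_area_files(areas=range(1, 10), tile_size=1024, root="almorox"):
    """Return the expected per-area COCO files that are not present on disk."""
    paths = [area_annotation_path(area, tile_size, root) for area in areas]
    return [path for path in paths if not path.is_file()]


# Report anything that is missing before starting the merge
for path in missing_area_files():
    print(f"Missing annotation file: {path}")
```

If the loop prints nothing, every expected input is present and the merge in Cell 1 should have what it needs.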

In [1]:
%%capture

# Merge areas into single COCO datasets for cross-validation
# ===========================================================

def merge_areas(areas: list, tile_size):
    """
    Merge the per-area COCO files listed in `areas` into one COCO file,
    e.g. areas = [1, 2, 3, 4, 5, 6, 7, 8].
    The merged file is named after the concatenated area numbers.
    """
    if len(areas) <= 1:
        raise ValueError("Specify more than one area")

    ann_dir = f"almorox/tiled_{tile_size}/annotations"
    merged_areas = str(areas[0]) + str(areas[1])

    # Merge the first two areas; both inputs are original files and are kept
    os.system(
        f"python merge.py "
        f"{ann_dir}/area_{areas[0]}_{tile_size}_COCO.json "
        f"{ann_dir}/area_{areas[1]}_{tile_size}_COCO.json "
        f"{ann_dir}/area_{merged_areas}_{tile_size}_COCO.json"
    )

    for i in range(2, len(areas)):  # Fold each remaining area into the running merge
        print(areas[i])
        previous = f"{ann_dir}/area_{merged_areas}_{tile_size}_COCO.json"
        os.system(
            f"python merge.py "
            f"{previous} "
            f"{ann_dir}/area_{areas[i]}_{tile_size}_COCO.json "
            f"{ann_dir}/area_{merged_areas + str(areas[i])}_{tile_size}_COCO.json"
        )

        # Delete the intermediate merge now that it has been superseded
        os.remove(previous)
        merged_areas += str(areas[i])

test_areas = list(range(1,10))
area_lists = [[x for x in range(1,10) if x != test_area] for test_area in test_areas]

tile_size = 1024

for areas in area_lists:
    merge_areas(areas, tile_size)


Cell 2: Generate MMDetection config files to train a Mask R-CNN model for each of the COCO files generated in Cell 1. A template version of the config file, with the placeholders TSIZE, TRAINAREAS and TESTAREA in place of the tile size and the relevant area numbers, should be placed at this_repo/almorox/configs/BASE_almo_mask_rcnn_r101_fpn_mstrain-poly_3x_coco.py.

Again, it is already in the right place in the Zenodo repository.
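
To illustrate the substitution, here is a small sketch run on a made-up template fragment. The placeholder names TSIZE, TRAINAREAS and TESTAREA are the ones the Cell 2 code replaces; the fragment itself and the helper `fill_template` are hypothetical, and the real BASE config will contain different keys.

```python
# Hypothetical template fragment; the real BASE_*.py config may look different.
TEMPLATE = """\
data_root = 'almorox/tiled_TSIZE/'
train_ann = 'annotations/area_TRAINAREAS_TSIZE_COCO.json'
test_ann = 'annotations/area_TESTAREA_TSIZE_COCO.json'
"""


def fill_template(template, tile_size, train_areas, test_area):
    """Replace the three placeholders with concrete values."""
    replacements = {
        "TSIZE": str(tile_size),
        "TRAINAREAS": "".join(map(str, train_areas)),
        "TESTAREA": str(test_area),
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


# Fold with area 1 held out: train on areas 2 to 9, test on area 1
filled = fill_template(TEMPLATE, 1024, [2, 3, 4, 5, 6, 7, 8, 9], 1)
print(filled)
```

Plain string replacement is enough here because the placeholder names do not occur inside one another and the substituted values are digits only.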

In [24]:
#Generate cfg files for each area===========================================================
train_areas_str =  [''.join(map(str, area_list)) for area_list in area_lists]
test_areas_str = [str(test_area) for test_area in test_areas]
tile_size_str = str(tile_size)

base_cfg = "almorox/configs/BASE_almo_mask_rcnn_r101_fpn_mstrain-poly_3x_coco.py" 
out_dir = "almorox/configs/areas"

os.makedirs(out_dir, exist_ok=True)

with open(base_cfg, 'r') as base_file:
    base_data = base_file.read()

for train_areas, test_area in zip(train_areas_str, test_areas_str):
    data = base_data.replace("TSIZE", tile_size_str)
    data = data.replace("TRAINAREAS", train_areas)
    data = data.replace("TESTAREA", test_area)

    # Use a separate name for the path so the file handle does not shadow it
    out_path = f"{out_dir}/TESTAREA{test_area}_almo_mask_rcnn_r101_fpn_mstrain-poly_3x_coco.py"
    with open(out_path, 'w') as out_file:
        out_file.write(data)


Cell 3: Finally, train a model for each of the configs. The outputs, including checkpoints, will be placed under this_repo/work_dirs.

The models used in the publication are included in the Zenodo repository, as is the required COCO pre-trained initialisation.
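
A training run that fails partway is easy to miss when the command's exit status is not checked. As a hedged alternative, the sketch below launches a run with `subprocess.run(..., check=True)`, which raises `CalledProcessError` on a non-zero exit code. It assumes, as in this notebook, that `train.py` takes the config path as its only positional argument; the helpers `build_train_command` and `train` are hypothetical.

```python
import subprocess
import sys


def build_train_command(config_file, train_script="train.py"):
    """Command list for one training run, using the current Python interpreter."""
    return [sys.executable, train_script, config_file]


def train(config_file, dry_run=True):
    """Print the command (dry run) or execute it, raising if training fails."""
    cmd = build_train_command(config_file)
    if dry_run:
        print(" ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)
    return cmd


config_file = "almorox/configs/areas/TESTAREA1_almo_mask_rcnn_r101_fpn_mstrain-poly_3x_coco.py"
train(config_file, dry_run=True)  # set dry_run=False to actually start training
```

Passing the command as a list avoids shell quoting issues if a path ever contains spaces.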

In [2]:
# Train the models

config_dir = "almorox/configs/areas"

# Only test area 1 is trained here; iterate over test_areas_str to train all nine models
for test_area in ['1']:
    print("==================")
    print(f"TRAINING FOR TEST AREA {test_area}")
    print("==================")
    config_file = f"{config_dir}/TESTAREA{test_area}_almo_mask_rcnn_r101_fpn_mstrain-poly_3x_coco.py"
    os.system(f"python train.py {config_file}")
