Code Overview:
This notebook lets researchers without their own GPU resources run global urban mapping with the Global Urban Mapper (GUM) model on Google Colab.
Before running it, make sure you have:
- Acquired multimodal imagery for your target study area, following the instructions at: https://github.com/LauraChow77/GlobalUrbanMapper/tree/main/gee_code
- Obtained the GUM model checkpoint and its associated configuration file, following the guidelines at: https://github.com/LauraChow77/GlobalUrbanMapper/tree/main/model
- Configured your Google Colab notebook to use a T4 GPU by going to 'Runtime' -> 'Change runtime type' -> 'Hardware accelerator' and selecting 'T4 GPU'.

Assistance:
If you encounter any issues or have questions, please do not hesitate to contact me at 22042458r@connect.polyu.hk

# Set Up

Setup Section:
This section prepares the environment for the global urban mapping task by:
- Installing necessary dependencies.
- Importing required libraries.
- Authenticating Google Drive access for file retrieval (e.g., multimodal imagery, model files) and for saving model predictions.

## Install dependencies

Please restart the runtime after executing the cell BELOW. Navigate to 'Runtime' in the menu and select 'Restart Runtime'.

In [None]:
!pip install rasterio
!pip install ftfy
!pip3 install openmim
!mim install mmengine
!mim install "mmcv>=2.0.0"
!pip install mmsegmentation

Please restart the runtime after executing the cell ABOVE. Navigate to 'Runtime' in the menu and select 'Restart Runtime'.
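
After the restart, a quick check can confirm that the packages from the install cell are visible to the new runtime. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper that is not part of this notebook, and the distribution names are simply those passed to pip and mim above.

```python
from importlib import metadata

def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Distribution names taken from the install cell (assumed to match what pip/mim installed)
wanted = ["rasterio", "ftfy", "openmim", "mmengine", "mmcv", "mmsegmentation"]
for name, version in installed_versions(wanted).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any entry prints NOT INSTALLED, re-running the install cell and restarting the runtime again is the first thing to try.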

## Import libraries

In [3]:
import ee
from google.colab import auth

import os
import rasterio
import numpy as np
from tqdm import tqdm

import torch
import torch.optim as optim
import torch.nn as nn
from torchvision import transforms
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset

from mmengine.registry import init_default_scope
from mmengine import Config
from mmseg.apis import inference_model, init_model
from mmseg.models import build_segmentor

## Mount Google Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


# Global Variables

In [4]:
# Model Inference Configuration:
# These variables are set for performing inference with the model.

BATCH_SIZE = 1
NUM_WORKERS = 16
INPUT_CHANNELS = 10
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

MODEL_CONFIG =  'INSERT_PATH_TO_GUM_CHECKPOINT_CONFIG_HERE'
CKPT_PATH = 'INSERT_PATH_TO_GUM_CHECKPOINT_HERE'
INFERENCE_DATASET_DIR = 'INSERT_PATH_TO_INFERENCE_DATASET_HERE'
PREDICTED_DATASET_DIR = 'INSERT_PATH_TO_PREDICTED_RESULTS_HERE'

data_transforms = transforms.Compose([
    transforms.ToTensor(),
])
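
The four path variables above are placeholders and must be replaced before the later cells will run. As a sketch only, a small check like the one below can catch a forgotten placeholder or a mistyped Drive path early; `find_path_problems` is a hypothetical helper, not part of the GUM code.

```python
import os

def find_path_problems(paths):
    """Return a list of human-readable problems for a {name: path} mapping."""
    problems = []
    for name, path in paths.items():
        if path.startswith("INSERT_"):
            # The notebook's placeholder strings all begin with 'INSERT_'
            problems.append(f"{name} still holds its placeholder value")
        elif not os.path.exists(path):
            problems.append(f"{name} does not exist: {path}")
    return problems

# Illustration with one placeholder and one path that certainly exists
demo = {
    "CKPT_PATH": "INSERT_PATH_TO_GUM_CHECKPOINT_HERE",
    "INFERENCE_DATASET_DIR": os.getcwd(),
}
for problem in find_path_problems(demo):
    print(problem)
```

In this notebook the call would take `MODEL_CONFIG`, `CKPT_PATH`, and `INFERENCE_DATASET_DIR`; `PREDICTED_DATASET_DIR` is best left out if the output folder has not been created yet.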

# Dataset

In [5]:
class GUM(Dataset):
    def __init__(self, root_dir, transform=None, inference_mode=False):
        """
        Args:
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied on a sample.
            inference_mode (bool, optional): Flag to indicate whether the dataset is used for inference.
        """
        self.root_dir = root_dir
        self.transform = transform
        self.inference_mode = inference_mode

        self.data = []

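        if not self.inference_mode:  # Training/Validation mode
            # NOTE: class_names is never defined in this class. This notebook only uses
            # inference_mode=True; set class_names (e.g. in a subclass) before using this branch.
            for cls_name in self.class_names: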
                cls_folder = os.path.join(root_dir, cls_name)
                for img_filename in os.listdir(cls_folder):
                    if img_filename.endswith('.tif'):
                        img_path = os.path.join(cls_folder, img_filename)
                        label_path = img_path.replace('img_stack_with_product', 'label')
                        self.data.append((img_path, label_path))
        else:  # Inference mode
            if os.path.isdir(root_dir):
                for img_filename in os.listdir(root_dir):
                    if img_filename.endswith('.tif'):
                        img_path = os.path.join(root_dir, img_filename)
                        self.data.append((img_path, None))
            else:
                raise FileNotFoundError(f"The directory {root_dir} does not exist.")

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        img_path, label = self.data[idx]

        with rasterio.open(img_path) as data:
            multimodal_data = data.read()

        multimodal_data = np.nan_to_num(multimodal_data)

        # Handle s1 data
        s1 = multimodal_data[4:8, :, :]
        clip_s1_min, clip_s1_max = -25, 0
        s1 = np.clip(s1, clip_s1_min, clip_s1_max)
        s1 = (s1 - clip_s1_min)/25
        s1 = s1.astype(np.float32)
        multimodal_data[4:8, :, :] = s1

        # Handle dem data
        slope, aspect = multimodal_data[9:10, :, :], multimodal_data[10:11, :, :]
        slope = np.clip(slope, 0, 90)
        slope = slope / 90
        aspect = np.clip(aspect, 0, 360)
        aspect = aspect / 360
        slope_aspect = np.concatenate((slope, aspect), axis=0)
        # Write the normalized slope and aspect into bands 8-9 so that the first 10 bands form the model input
        multimodal_data[8:10, :, :] = slope_aspect

        # Keep the first 10 bands and transpose from (C, H, W) to (H, W, C), the layout ToTensor expects
        img = np.transpose(multimodal_data[:10, :, :], (1, 2, 0))

        # Record the channel count while img is still (H, W, C); ToTensor moves channels to axis 0
        num_channels = img.shape[2]

        # Apply the transformations
        if self.transform:
            img = self.transform(img)  # ToTensor converts the (H, W, C) array to a (C, H, W) tensor

        file_name = os.path.basename(img_path)
        results = {}
        results['filename'] = file_name
        results['ori_filename'] = file_name
        results['img'] = img
        results['img_shape'] = img.shape
        results['ori_shape'] = img.shape
        # Set initial values for default meta_keys
        results['pad_shape'] = img.shape
        results['scale_factor'] = 1.0
        results['img_norm_cfg'] = dict(
            mean=np.zeros(num_channels, dtype=np.float32),
            std=np.ones(num_channels, dtype=np.float32),
            to_rgb=False)
        results['flip'] = False

        if self.inference_mode:
            # For inference, return the image and its metadata dict (which carries the file name)
            return img, results
        else:
            # For training/validation, return the image and the path to its label raster
            return img, label
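
The band handling inside `__getitem__` is easier to follow on a small synthetic array. The sketch below mirrors those steps with NumPy only; `normalize_bands` is a hypothetical standalone function, the input values are made up, and reading `s1` as the Sentinel-1 bands is an assumption based on the comment in the class.

```python
import numpy as np

def normalize_bands(stack):
    """Mirror the band scaling in GUM.__getitem__ for a (bands, H, W) array with at least 11 bands."""
    stack = np.nan_to_num(stack.astype(np.float32))  # astype copies, so the input is untouched
    # s1 bands (indices 4-7): clip to [-25, 0], then rescale to [0, 1]
    stack[4:8] = (np.clip(stack[4:8], -25, 0) + 25) / 25
    # Slope (index 9) and aspect (index 10): clip to their angular ranges, rescale to [0, 1]
    slope = np.clip(stack[9:10], 0, 90) / 90
    aspect = np.clip(stack[10:11], 0, 360) / 360
    # Store them at indices 8-9; whatever band 8 held before is overwritten
    stack[8:10] = np.concatenate((slope, aspect), axis=0)
    return stack[:10]  # the 10 channels passed to the model

demo = np.zeros((11, 2, 2))
demo[4:8] = -12.5   # made-up s1 value, halfway through the clip range
demo[9] = 45.0      # made-up slope in degrees
demo[10] = 180.0    # made-up aspect in degrees
out = normalize_bands(demo)
print(out.shape)                                   # (10, 2, 2)
print(out[4, 0, 0], out[8, 0, 0], out[9, 0, 0])    # 0.5 0.5 0.5
```

Bands 0-3 pass through unscaled in both the class and this sketch, so any scaling they need has to happen before the imagery reaches the notebook.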

# Main

Main Section:
This section carries out the global urban mapping task. It involves the following steps:
- Loading the necessary multimodal dataset and the pretrained model.
- Using the loaded model to predict urban areas within the provided multimodal data.

## Load data and model

In [None]:
# Prepare the multimodal dataset for prediction.
init_default_scope('mmseg')
inference_dataset = GUM(INFERENCE_DATASET_DIR, transform=data_transforms, inference_mode=True)
inference_loader = DataLoader(inference_dataset, batch_size=BATCH_SIZE, num_workers=NUM_WORKERS, shuffle=False)

# Initialize the Global Urban Mapper model for inference.
model = init_model(MODEL_CONFIG, CKPT_PATH, device=DEVICE)
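
Note that the inference branch of `GUM` keeps only file names ending in exactly `.tif`, so tiles saved as `.TIF` or `.tiff` are skipped without any warning. The sketch below reproduces that filter on a list of made-up file names; `split_by_tif_suffix` is a hypothetical helper, not part of the notebook.

```python
def split_by_tif_suffix(filenames):
    """Split names the way GUM's inference branch filters them: keep names ending in '.tif'."""
    kept = sorted(n for n in filenames if n.endswith('.tif'))
    skipped = sorted(n for n in filenames if not n.endswith('.tif'))
    return kept, skipped

# Made-up directory listing for illustration
kept, skipped = split_by_tif_suffix(["tile_0.tif", "tile_1.TIF", "tile_2.tiff", "notes.txt"])
print(kept)     # ['tile_0.tif']
print(skipped)  # ['notes.txt', 'tile_1.TIF', 'tile_2.tiff']
```

Calling it as `split_by_tif_suffix(os.listdir(INFERENCE_DATASET_DIR))` and comparing `len(kept)` with `len(inference_dataset)` is one way to confirm that every tile was picked up.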

## Prediction

In [None]:
model.eval()
with torch.no_grad():
    for inputs, img_metas in tqdm(inference_loader, desc='Inference Progress', unit='batch'):

        inputs = inputs.to(DEVICE)
        outputs = model.test_step([inputs])

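        # The default collate function turns 'filename' into a list with one entry per sample
        file_name = img_metas['filename'][0]
        input_path = os.path.join(INFERENCE_DATASET_DIR, file_name)
        os.makedirs(PREDICTED_DATASET_DIR, exist_ok=True)  # Create the output folder if it is missing
        output_path = os.path.join(PREDICTED_DATASET_DIR, file_name)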
        with rasterio.open(input_path) as src:
            profile = src.profile

        # Convert the predicted class map to a uint8 array of shape (1, H, W)
        output = outputs[0].pred_sem_seg.data.cpu().numpy().astype(np.uint8)
        if output.ndim == 2:
            output = output[np.newaxis, :, :]

        # Reuse the input profile (georeferencing and size) but store a single uint8 band
        profile.update(
            dtype=np.uint8,
            count=1
        )

        # Save the prediction under the same file name as the input image
        with rasterio.open(output_path, 'w', **profile) as dst:
            dst.write(output)
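
The shape and dtype handling before the write can be tried in isolation. The sketch below uses NumPy only on a made-up label map; `to_single_band_uint8` and `class_pixel_counts` are hypothetical helpers, and the class ids carry no assumed meaning here.

```python
import numpy as np

def to_single_band_uint8(pred):
    """Return pred as a uint8 array of shape (1, H, W), ready for a one-band raster."""
    pred = np.asarray(pred)
    if pred.ndim == 2:
        pred = pred[np.newaxis, :, :]  # add the band axis
    return pred.astype(np.uint8)

def class_pixel_counts(pred):
    """Return {class id: pixel count} for a label map."""
    values, counts = np.unique(pred, return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}

demo = np.array([[0, 1, 1],
                 [0, 0, 1]], dtype=np.int64)  # made-up 2x3 label map
band = to_single_band_uint8(demo)
print(band.shape, band.dtype)    # (1, 2, 3) uint8
print(class_pixel_counts(band))  # {0: 3, 1: 3}
```

A count dominated by a single class across every tile is a cheap signal that the input bands or the checkpoint path may be wrong.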