# Model-Assisted Labeling (MAL) Workflow with DeepLab & CVAT (Later ML Paint for comparison)

# Installing CVAT on Windows

## Step 1: Install Git for Windows

1. Download Git for Windows from [https://gitforwindows.org/](https://gitforwindows.org/).
2. Install Git, keeping all options as default.
3. Open the command prompt (`cmd`) and type the following command to check the Git version:
    ```bash
    git --version
    ```

## Step 2: Install Docker Desktop for Windows

1. Download [Docker Desktop for Windows](https://desktop.docker.com/win/main/amd64/Docker%20Desktop%20Installer.exe).
2. Double-click the downloaded `Docker Desktop Installer.exe` to run it.
3. Follow the instructions for installation, and reboot the system after installation is complete.
4. Open the command prompt and check the Docker version:
    ```bash
    docker --version
    ```
5. Check the Docker Compose version:
    ```bash
    docker compose version
    ```
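
The version checks from Steps 1 and 2 can also be scripted. The sketch below uses only the Python standard library to report which command-line tools are not on `PATH`. `find_missing_tools` is a hypothetical helper, not part of Git, Docker, or CVAT, and it assumes that a tool being on `PATH` is a good enough sign that it is installed.

```python
import shutil
from typing import Iterable, List


def find_missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# An empty list means both prerequisites were found.
print("Missing tools:", find_missing_tools(["git", "docker"]))
```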

## Step 3: Install Google Chrome

1. Download and install [Google Chrome](https://www.google.com/chrome/), as it is the only browser supported by CVAT.

## Step 4: Clone CVAT Source Code

1. Clone CVAT source code from the [GitHub repository](https://github.com/opencv/cvat):
    ```bash
    git clone https://github.com/opencv/cvat
    cd cvat
    ```
2. Alternatively, see the [installation guide](https://opencv.github.io/cvat/docs/administration/basics/installation/#how-to-get-cvat-source-code) for ways to download a specific release version.

## Step 5: Run Docker Containers for CVAT

1. Run the following command to start Docker containers. This will download the latest CVAT release and other required images:
    ```bash
    docker compose up -d
    ```
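2. Optionally, specify the CVAT version using the `CVAT_VERSION` environment variable. The inline form below is shell syntax that works in Git Bash; in `cmd`, run `set CVAT_VERSION=dev` first and then `docker compose up -d`: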
    ```bash
    CVAT_VERSION=dev docker compose up -d
    ```
3. Check the status of the containers:
    ```bash
    docker ps
    ```
4. Wait until the CVAT server is up and running. The command below follows the server log; press `Ctrl+C` to stop following it:
    ```bash
    docker logs cvat_server -f
    ```
5. Open a shell inside the `cvat_server` container:
    ```bash
    docker exec -it cvat_server bash
    ```
6. For first-time setup, create a superuser account from the shell inside the container:
    ```bash
    python3 manage.py createsuperuser
    ```
    Choose a username and password for the admin account.

## Step 6: Access CVAT in Google Chrome

1. Open Google Chrome and go to `localhost:8080`.
2. Log in with the superuser credentials created earlier.
3. You should now be able to create a new annotation task.
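
If the page does not load straight away, the server may still be starting. As a rough sketch, the hypothetical helper below (`wait_for_http` is not a CVAT tool) polls a URL with the Python standard library until it gets any HTTP response or a timeout expires. The default timeout and polling interval are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout_s` expires."""
    # Ignore any configured proxy, since the target is a local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=interval_s):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused or timed out: the server is not ready yet.
            time.sleep(interval_s)
    return False


# Example (not executed here): wait_for_http("http://localhost:8080")
```

The function returns `True` as soon as the server responds at all, so it only indicates that something is listening on the port, not that CVAT has finished its own start-up.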

## Workflow
To stop and remove the containers, run:
```bash
docker compose down
```
To start CVAT again, run:
```bash
docker compose up -d
```
Run both commands from the `cvat` directory cloned in Step 4.

In [None]:
import sys
import os
import numpy as np
import torch
import torch.nn as nn
from pathlib import Path
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm
import json
import shutil
from typing import Dict, List

# Add project to path
sys.path.insert(0, '../../')
from models import DeepLab

# Check GPU
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Device: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}\n")

# Configuration
CLASS_NAMES = ['background', 'building', 'woodland', 'water', 'road']
NUM_CLASSES = len(CLASS_NAMES)
TILE_SIZE = 512

# Define workspace directories
WORKSPACE_DIR = Path("mal_workspace")
TILES_DIR = WORKSPACE_DIR / "01_tiles"
PREDICTIONS_DIR = WORKSPACE_DIR / "02_predictions"
CORRECTED_MASKS_DIR = WORKSPACE_DIR / "03_corrected_masks"
RECONSTRUCTED_DIR = WORKSPACE_DIR / "04_reconstructed"

print(f"✓ Setup complete | Device: {device}")

In [None]:
# ===== UPDATE THESE PATHS =====
INPUT_GEOTIFF = "cvat_test/images/M-33-7-A-d-2-3.tif"  # Change this path as needed
CHECKPOINT_PATH = "experiments/Deeplab_Landcover_Edited/best_model.pth"  # Change this to your model
# ==============================

# Verify files exist
if not Path(INPUT_GEOTIFF).exists():
    print(f"⚠️  WARNING: {INPUT_GEOTIFF} not found!")
else:
    print(f"✓ Input GeoTIFF: {INPUT_GEOTIFF}")

if not Path(CHECKPOINT_PATH).exists():
    print(f"⚠️  WARNING: {CHECKPOINT_PATH} not found!")
else:
    print(f"✓ Checkpoint: {CHECKPOINT_PATH}")

print(f"\nWorkspace: {WORKSPACE_DIR}")
print(f"Classes: {CLASS_NAMES}")
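
The later cells expect `TILE_SIZE` × `TILE_SIZE` PNG tiles in `TILES_DIR`, but the tiling step itself is not shown in this notebook. As a sketch of the bookkeeping only, the hypothetical helper below lists tile offsets and file names for an image of a given size. The `tile_<row>_<col>.png` pattern is inferred from the example names in the export docstring (`tile_00000_00512.png`); whether the first number is the row or the column offset, and how edge tiles are padded, are assumptions.

```python
from typing import List, Tuple


def tile_grid(height: int, width: int, tile_size: int = 512) -> List[Tuple[int, int, str]]:
    """List (row_offset, col_offset, file_name) for every tile covering the image.

    Tiles on the bottom or right edge may extend past the image and would need padding.
    """
    tiles = []
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            tiles.append((row, col, f"tile_{row:05d}_{col:05d}.png"))
    return tiles


# A 1024 x 1000 pixel image needs a 2 x 2 grid of 512-pixel tiles.
for row, col, name in tile_grid(1024, 1000):
    print(row, col, name)
```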

In [None]:
def load_model(checkpoint_path: str) -> DeepLab:
    """Load trained DeepLab model."""
    model = DeepLab(
        num_classes=NUM_CLASSES,
        input_image_size=TILE_SIZE,
        backbone='resnet50',
        output_stride=4
    ).to(device)
    
    checkpoint = torch.load(checkpoint_path, map_location=device)
    
    if isinstance(checkpoint, dict) and 'model' in checkpoint:
        model.load_state_dict(checkpoint['model'])
    else:
        model.load_state_dict(checkpoint)
    
    model.eval()
    
    # model.eval() above already puts every submodule (including dropout and
    # batch norm layers) into inference mode, so no per-module loop is needed.
    
    print(f"✓ Model loaded from {Path(checkpoint_path).name}\n")
    return model


def normalize_image(image: np.ndarray) -> torch.Tensor:
    """Scale an HWC uint8 image to [0, 1], then standardize with ImageNet mean/std (values fall in roughly [-2.1, 2.6]); returns a CHW tensor."""
    image = image.astype(np.float32) / 255.0
    image = torch.from_numpy(image).permute(2, 0, 1)
    
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    image = (image - mean) / std
    
    return image


def predict_single_tile(model: DeepLab, image_path: str, debug: bool = False) -> np.ndarray:
    """Generate prediction mask for one tile."""
    image = np.array(Image.open(image_path).convert('RGB'))
    image_tensor = normalize_image(image).unsqueeze(0).to(device)
    
    with torch.no_grad():
        output = model(image_tensor)
        
        if debug:
            print(f"Output shape: {output.shape}")
            print(f"Output min/max: {output.min():.4f} / {output.max():.4f}")
            print(f"Class logits sample (center pixel): {output[0, :, output.shape[2] // 2, output.shape[3] // 2]}")
        
        mask = torch.argmax(output, dim=1).squeeze(0).cpu().numpy()
        
        if debug:
            print(f"Mask unique classes: {np.unique(mask)}\n")
    
    return mask.astype(np.uint8)


def generate_all_predictions(model: DeepLab, tiles_dir: str, output_dir: str) -> List[str]:
    """Generate predictions for all tiles in a directory."""
    os.makedirs(output_dir, exist_ok=True)
    
    tile_files = sorted([f for f in os.listdir(tiles_dir) if f.endswith('.png')])
    print(f"Generating predictions for {len(tile_files)} tiles...")
    
    # Test first tile with debug info
    if tile_files:
        first_tile = Path(tiles_dir) / tile_files[0]
        print(f"\n🔍 Testing first tile: {tile_files[0]}")
        test_mask = predict_single_tile(model, str(first_tile), debug=True)
    
    prediction_paths = []
    for tile_file in tqdm(tile_files, desc="Predicting"):
        tile_path = Path(tiles_dir) / tile_file
        pred_mask = predict_single_tile(model, str(tile_path))
        
        pred_name = tile_file.replace('.png', '_pred.png')
        pred_path = Path(output_dir) / pred_name
        
        Image.fromarray(pred_mask).save(pred_path)
        prediction_paths.append(str(pred_path))  # match the declared List[str] return type
    
    print(f"✓ Generated {len(prediction_paths)} predictions\n")
    return prediction_paths

print("✓ Inference functions loaded")
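
With the ImageNet statistics used in `normalize_image`, the standardized values are not exactly symmetric around zero. The short calculation below is a worked example, not part of the pipeline. It derives the reachable range from the same mean and standard deviation by standardizing the extreme inputs 0 and 1 for each channel.

```python
# ImageNet channel statistics, as used in normalize_image.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalized_range(mean=MEAN, std=STD):
    """Return (lowest, highest) value reachable after scaling to [0, 1] and standardizing."""
    lows = [(0.0 - m) / s for m, s in zip(mean, std)]
    highs = [(1.0 - m) / s for m, s in zip(mean, std)]
    return min(lows), max(highs)


low, high = normalized_range()
print(f"{low:.2f} to {high:.2f}")  # -2.12 to 2.64
```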

In [None]:
def visualize_predictions(tiles_dir: str, predictions_dir: str, num_samples: int = 4):
    """
    Display sample tiles with predictions.
    
    Shows: Original Image | Predicted Mask | Overlay
    """
    tile_files = sorted([f for f in os.listdir(tiles_dir) if f.endswith('.png')])
    indices = np.random.choice(len(tile_files), min(num_samples, len(tile_files)), replace=False)
    
    class_colors = {
        0: [0.0, 0.0, 0.0],       # background - black
        1: [1.0, 0.0, 0.0],       # building - red
        2: [0.0, 0.5, 0.0],       # woodland - green
        3: [0.0, 0.0, 1.0],       # water - blue
        4: [1.0, 1.0, 0.0],       # road - yellow
    }
    
    fig, axes = plt.subplots(len(indices), 3, figsize=(14, 4*len(indices)), squeeze=False)  # squeeze=False keeps axes 2-D even for a single row
    fig.suptitle('Sample Predictions (Image | Mask | Overlay)', fontsize=14, fontweight='bold')
    
    for row, idx in enumerate(indices):
        tile_file = tile_files[idx]
        tile_path = Path(tiles_dir) / tile_file
        pred_file = tile_file.replace('.png', '_pred.png')
        pred_path = Path(predictions_dir) / pred_file
        
        image = np.array(Image.open(tile_path).convert('RGB'))  # force 3 channels so the overlay blend works
        pred_mask = np.array(Image.open(pred_path))
        
        # Column 1: Image
        axes[row, 0].imshow(image)
        axes[row, 0].set_title(f'Image', fontsize=10)
        axes[row, 0].axis('off')
        
        # Column 2: Mask
        axes[row, 1].imshow(pred_mask, cmap='tab10', vmin=0, vmax=4)
        axes[row, 1].set_title(f'Mask (classes: {np.unique(pred_mask)})', fontsize=10)
        axes[row, 1].axis('off')
        
        # Column 3: Overlay
        mask_rgb = np.zeros((*pred_mask.shape, 3))
        for class_id, color in class_colors.items():
            mask_rgb[pred_mask == class_id] = color
        overlay = 0.65 * (image / 255.0) + 0.35 * mask_rgb
        axes[row, 2].imshow(overlay)
        axes[row, 2].set_title('Overlay', fontsize=10)
        axes[row, 2].axis('off')
    
    plt.tight_layout()
    return fig


In [None]:
def extract_cvat_labelmap(labelmap_txt_path: str) -> Dict[int, tuple]:
    """
    Extract color mapping from labelmap.txt file.
    
    Args:
        labelmap_txt_path: Path to labelmap.txt
    
    Returns:
        Dict mapping class_id to RGB tuple: {0: (0, 0, 0), 1: (250, 50, 83), ...},
        or None if the file is missing or no colors could be parsed
    """
    from pathlib import Path
    
    labelmap_path = Path(labelmap_txt_path)
    
    if not labelmap_path.exists():
        print(f"⚠️  Warning: {labelmap_path} not found")
        return None
    
    try:
        with open(labelmap_path, 'r') as f:
            labelmap_content = f.read()
        
        color_map = {}
        class_names = CLASS_NAMES  # use the model's class order so ids match the predicted masks
        
        for line in labelmap_content.strip().split('\n'):
            if not line or line.startswith('#'):
                continue
            
            # Parse: "label:color_rgb:parts:actions"
            parts = line.split(':')
            if len(parts) < 2:
                continue
            
            label = parts[0].strip()
            color_str = parts[1].strip()
            
            # Parse color: "R,G,B"
            try:
                r, g, b = map(int, color_str.split(','))
                if label in class_names:
                    class_id = class_names.index(label)
                    color_map[class_id] = (r, g, b)
                    print(f"  Loaded: class {class_id} ({label}) → RGB({r}, {g}, {b})")
            except ValueError:
                continue
        
        if not color_map:
            print(f"⚠️  Could not parse any colors from {labelmap_path}")
            return None
        
        return color_map
    except Exception as e:
        print(f"⚠️  Error reading labelmap: {e}")
        return None

print("✓ Helper functions loaded")
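
To make the `label:color_rgb:parts:actions` line format concrete, here is a minimal standalone sketch of parsing a single line. `parse_labelmap_line` is a hypothetical helper that mirrors the parsing logic of `extract_cvat_labelmap`. The sample text uses the same line layout and colors that this notebook writes to `labelmap.txt`.

```python
from typing import Optional, Tuple


def parse_labelmap_line(line: str) -> Optional[Tuple[str, Tuple[int, int, int]]]:
    """Parse one 'label:R,G,B:parts:actions' line; return None for comments or bad lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split(":")
    if len(fields) < 2:
        return None
    try:
        r, g, b = (int(v) for v in fields[1].split(","))
    except ValueError:
        # Non-numeric or wrong number of colour components.
        return None
    return fields[0].strip(), (r, g, b)


sample = """# label:color_rgb:parts:actions
background:0,0,0::
building:128,0,0::
"""
for raw in sample.splitlines():
    print(parse_labelmap_line(raw))
```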

In [None]:
def create_cvat_segmentation_mask_export(
    predictions_dir: str,
    output_zip: str,
    class_names: List[str],
    class_colors: Dict[int, tuple] = None
):
    """
    Create a CVAT-compatible Segmentation Mask ZIP file for importing predictions.

    Args:
        predictions_dir: Directory with prediction masks (*_pred.png files)
        output_zip: Path to save the ZIP file (e.g., "mal_workspace/cvat_export.zip")
        class_names: List of class names (e.g., ['background', 'building', 'woodland', 'water', 'road'])
        class_colors: Dict mapping class_id to RGB tuple for labelmap.txt
                     If None, uses default Pascal VOC colors

    Creates the ZIP structure that CVAT requires for this format
    (see https://docs.cvat.ai/docs/dataset_management/formats/ for details):
        archive.zip/
        ├── labelmap.txt
        ├── ImageSets/Segmentation/default.txt
        ├── SegmentationClass/
        │   ├── tile_00000_00000.png
        │   ├── tile_00000_00512.png
        │   └── ...
        └── SegmentationObject/
            ├── tile_00000_00000.png
            ├── tile_00000_00512.png
            └── ...
    """
    import zipfile
    import tempfile

    # Default Pascal VOC colors if not provided
    if class_colors is None:
        class_colors = {
            0: (0, 0, 0),           # background - black
            1: (128, 0, 0),         # building - maroon
            2: (0, 128, 0),         # woodland - dark green
            3: (0, 0, 128),         # water - navy blue
            4: (128, 128, 0),       # road - olive
        }

    # Create temporary directory for ZIP contents
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)

        # Create directory structure
        segmentation_class_dir = temp_path / "SegmentationClass"
        segmentation_object_dir = temp_path / "SegmentationObject"
        imageset_dir = temp_path / "ImageSets" / "Segmentation"

        segmentation_class_dir.mkdir(parents=True, exist_ok=True)
        segmentation_object_dir.mkdir(parents=True, exist_ok=True)
        imageset_dir.mkdir(parents=True, exist_ok=True)

        # ===== CREATE labelmap.txt =====
        labelmap_path = temp_path / "labelmap.txt"
        with open(labelmap_path, 'w') as f:
            for class_id, class_name in enumerate(class_names):
                color = class_colors.get(class_id, (0, 0, 0))
                f.write(f"{class_name}:{color[0]},{color[1]},{color[2]}::\n")

        print(f"Created labelmap.txt with {len(class_names)} classes")

        # ===== GET PREDICTION FILES =====
        pred_files = sorted([f for f in os.listdir(predictions_dir) if f.endswith('.png')])
        image_names = []

        print(f"Processing {len(pred_files)} prediction masks...")

        # ===== COPY MASKS TO BOTH DIRECTORIES =====
        for pred_file in tqdm(pred_files, desc="Copying masks"):
            pred_path = Path(predictions_dir) / pred_file

            # Extract base name (remove '_pred.png' or '.png')
            if pred_file.endswith('_pred.png'):
                base_name = pred_file.replace('_pred.png', '')
            else:
                base_name = pred_file.replace('.png', '')
            
            image_names.append(base_name)

            # Load mask
            mask = np.array(Image.open(pred_path))
            
            # For semantic segmentation, both directories get the same mask
            # (SegmentationClass = class per pixel, SegmentationObject = same for semantic)
            Image.fromarray(mask).save(segmentation_class_dir / f"{base_name}.png")
            Image.fromarray(mask).save(segmentation_object_dir / f"{base_name}.png")

        # ===== CREATE ImageSets/Segmentation/default.txt =====
        default_txt = imageset_dir / "default.txt"
        with open(default_txt, 'w') as f:
            for name in image_names:
                f.write(name + '\n')

        print(f"Created ImageSets/Segmentation/default.txt with {len(image_names)} entries")

        # ===== CREATE ZIP ARCHIVE =====
        Path(output_zip).parent.mkdir(parents=True, exist_ok=True)  # also works when output_zip has no directory part
        
        with zipfile.ZipFile(output_zip, 'w', zipfile.ZIP_DEFLATED) as zipf:
            # Add all files from temp directory to ZIP
            for root, dirs, files in os.walk(temp_path):
                for file in files:
                    file_path = Path(root) / file
                    # Get relative path from temp_path for ZIP archive
                    arcname = file_path.relative_to(temp_path)
                    zipf.write(file_path, arcname)

        # Print results
        zip_size_mb = Path(output_zip).stat().st_size / (1024**2)
        print(f"\n✓ CVAT Segmentation Mask export created successfully!")
        print(f"  Output ZIP: {output_zip}")
        print(f"  Size: {zip_size_mb:.1f} MB")
        print(f"  Classes: {len(class_names)}")
        print(f"  Masks: {len(image_names)}")

print("✓ CVAT export function loaded")
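
Before uploading the archive to CVAT, it can help to confirm that it has the layout described in the docstring above. The sketch below is a hypothetical check written with the standard `zipfile` module. The required entries are taken from that docstring, not from an official CVAT validator.

```python
import zipfile
from typing import List

REQUIRED_FILES = ("labelmap.txt", "ImageSets/Segmentation/default.txt")
REQUIRED_DIRS = ("SegmentationClass/", "SegmentationObject/")


def missing_archive_entries(zip_path: str) -> List[str]:
    """Return the required files and folders that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    missing = [f for f in REQUIRED_FILES if f not in names]
    # A folder counts as present only if it contains at least one file.
    missing += [
        d for d in REQUIRED_DIRS
        if not any(n.startswith(d) and n != d for n in names)
    ]
    return missing


# Example (not executed here):
# missing_archive_entries("mal_workspace/cvat_predictions_export.zip")
```

An empty list means every expected entry was found; it says nothing about whether the mask contents themselves are valid.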

In [None]:
print("=" * 60)
print("PHASE 1: GENERATE PREDICTIONS")
print("=" * 60)

# Load model
model = load_model(CHECKPOINT_PATH)

# Generate predictions
prediction_paths = generate_all_predictions(
    model=model,
    tiles_dir=str(TILES_DIR),
    output_dir=str(PREDICTIONS_DIR)
)

In [None]:
# Create CVAT-compatible export
create_cvat_segmentation_mask_export(
    predictions_dir=str(PREDICTIONS_DIR),
    output_zip="mal_workspace/cvat_predictions_export.zip",
    class_names=CLASS_NAMES,
    class_colors={
        0: (0, 0, 0),         # background - black
        1: (128, 0, 0),       # building - maroon
        2: (0, 128, 0),       # woodland - dark green
        3: (0, 0, 128),       # water - navy blue
        4: (128, 128, 0),     # road - olive
    }
)

In [None]:
print("=" * 60)
print("PHASE 2: VISUALIZE PREDICTIONS")
print("=" * 60)

fig = visualize_predictions(
    tiles_dir=str(TILES_DIR),
    predictions_dir=str(PREDICTIONS_DIR),
    num_samples=6  # Change to see more/fewer samples
)

plt.savefig(WORKSPACE_DIR / "predictions_preview.png", dpi=100, bbox_inches='tight')
plt.show()

print(f"Preview saved: {WORKSPACE_DIR / 'predictions_preview.png'}")

## Manual Annotation in CVAT

**Important**: This step happens in the CVAT web UI (`http://localhost:8080`), not in this notebook.

### Step 1: Prepare Files
- Tiles are ready in: `mal_workspace/01_tiles/`
- Predictions are ready in: `mal_workspace/02_predictions/`

### Step 2: Create CVAT Project
1. Open http://localhost:8080 in your browser
2. Click "Create a new project"
3. Name it however you want
4. Add the labels in the label constructor (important: the names must match those written to `labelmap.txt` so that CVAT can recognize the imported annotations):
   - `background` (class 0)
   - `building` (class 1)
   - `woodland` (class 2)
   - `water` (class 3)
   - `road` (class 4)
5. Submit and open

### Step 3: Create CVAT Task
1. Create a task inside the project
2. Name it however you want
3. Upload the images you want to annotate, i.e. the tiles in `mal_workspace/01_tiles/`
4. Submit and open

### Step 4: Import Initial Predictions
1. Click Menu → Upload annotations
2. Choose Segmentation mask 1.1
3. Upload the ZIP file `mal_workspace/cvat_predictions_export.zip`

### Step 5: Refine Annotations
1. For each tile:
   - Use **Brush** tool to add pixels
   - Use **Eraser** to remove pixels
2. Save changes frequently
   (A more efficient correction workflow probably exists; this is still being investigated.)

### Step 6: Export Corrected Masks
1. Click Menu → Export task dataset
2. Select "Segmentation mask 1.1" format
3. Download ZIP file
4. Extract SegmentationClass folder to: `mal_workspace/03_corrected_masks/`
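
The corrected masks may need to be converted back to class indices before they can be used for training. The sketch below assumes the exported PNGs are color-coded according to `labelmap.txt`, which should be verified on a real export. It takes a color map in the shape returned by `extract_cvat_labelmap` (class id to RGB tuple) and shows the lookup on a tiny hand-made mask. `colors_to_class_ids` is a hypothetical helper, and mapping unknown colors to class 0 is an arbitrary choice.

```python
from typing import Dict, List, Tuple

Color = Tuple[int, int, int]


def colors_to_class_ids(
    rgb_rows: List[List[Color]],
    color_map: Dict[int, Color],
    unknown_id: int = 0,
) -> List[List[int]]:
    """Map every RGB pixel to its class id; unmapped colors fall back to `unknown_id`."""
    lookup = {color: class_id for class_id, color in color_map.items()}
    return [[lookup.get(pixel, unknown_id) for pixel in row] for row in rgb_rows]


# Colors taken from the class_colors dict used for the export in this notebook.
color_map = {0: (0, 0, 0), 1: (128, 0, 0), 2: (0, 128, 0), 3: (0, 0, 128), 4: (128, 128, 0)}
tiny_mask = [
    [(0, 0, 0), (128, 0, 0)],
    [(0, 0, 128), (128, 128, 0)],
]
print(colors_to_class_ids(tiny_mask, color_map))  # [[0, 1], [3, 4]]
```

On full-size tiles the same lookup would normally be vectorized with NumPy instead of looping over pixels in Python.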