The notebook mounts Google Drive, reads train.json from the “Ursa Space 1A” folder, and converts each sample’s band_1 and band_2 lists into 75×75 float32 images, stacked as a two-channel array of shape [N, 2, 75, 75]. Labels (is_iceberg) and incidence angles are loaded alongside, with non-numeric angles coerced to NaN. A stratified 80/20 split creates the train/validation indices (1,283 and 321 samples). Per-channel statistics (mean, std) are computed on the training images only and used to z-normalize both splits; the normalized arrays, the statistics, and the raw angles are saved to a compressed artifact, sar_fullres_splits.npz, for reproducibility. When the artifact is reloaded, missing angles are imputed with the train-set median and optionally z-normalized with train statistics.

A small three-convolution baseline (SmallCNN) is trained first for eight epochs as a sanity check. The main model, IcebergVesselCNN, is a compact convolutional trunk (Conv-ReLU-MaxPool blocks → AdaptiveAvgPool to a 128-d feature) with an angle-fusion head that concatenates the (optionally normalized) angle to the pooled feature before a small MLP classifier. Training uses AdamW with cross-entropy loss, optional mixed precision (AMP, enabled only when CUDA is available), light SAR-appropriate augmentations (random horizontal/vertical flips and 90° rotations), and a ReduceLROnPlateau scheduler keyed to validation AUC, with early stopping on the same metric to limit overfitting. After each epoch, accuracy and AUC are computed on the validation set, and the checkpoint with the best validation AUC is stored as best_iceberg_vessel_cnn.pt.

Final reporting covers accuracy, ROC-AUC, and a confusion matrix ([[TN, FP], [FN, TP]]). On the 321-sample validation split the best checkpoint reaches 0.829 accuracy and 0.919 ROC-AUC on the iceberg-vs-vessel classification task.

In [13]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [1]:
from google.colab import userdata
import os

%cd /content/Ursa-Space-1A

os.environ["GITHUB_TOKEN"] = userdata.get("GITHUB_TOKEN")

# set the push URL to use your token (do this once per session)
!git remote set-url --push origin https://x-access-token:${GITHUB_TOKEN}@github.com/zainakhalil/Ursa-Space-1A.git


/content/Ursa-Space-1A


In [18]:
%cd /content/Ursa-Space-1A
!git add -A
!git commit -m "update"
!git push origin HEAD:main

/content/Ursa-Space-1A
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
Everything up-to-date


In [None]:
# 0) Be in the repo
%cd /content/Ursa-Space-1A
!git rev-parse --is-inside-work-tree

# 1) See what Git sees
!git status

# 2) List files here (confirm your notebook is actually in this folder)
!ls -lah

# 3) If your notebook is elsewhere, copy it in (pick ONE that matches your setup):

# From /content (typical Colab default):
# !cp "/content/YourNotebookName.ipynb" .

# From Drive (example path):
# !cp "/content/drive/MyDrive/Colab Notebooks/YourNotebookName.ipynb" .

# 4) Check if .gitignore is hiding notebooks
!grep -n "\.ipynb" .gitignore || echo "No .ipynb rules in .gitignore"

# If it prints a line with *.ipynb, remove/comment that line or rename the file to be tracked.

# 5) Stage, commit, push (minimal)
!git add -A
!git status
!git commit -m "Update notebook and outputs" || echo "No changes to commit"
!git push origin HEAD:main 2>/dev/null || git push origin HEAD:master


In [2]:
import pandas as pd

!ls "/content/drive/MyDrive/Ursa Space 1A/train.json"
df = pd.read_json('/content/drive/MyDrive/Ursa Space 1A/train.json')


'/content/drive/MyDrive/Ursa Space 1A/train.json'


In [3]:
import math, numpy as np, pandas as pd
from sklearn.model_selection import train_test_split
from pathlib import Path

DATA_PATH = Path("/content/drive/MyDrive/Ursa Space 1A/train.json")
ART_DIR   = Path("/content/artifacts"); ART_DIR.mkdir(parents=True, exist_ok=True)

df = pd.read_json(DATA_PATH)

def to_img(seq):
    a = np.array(seq, dtype=np.float32)
    side = int(round(math.sqrt(a.size)))
    assert side*side == a.size, f"Non-square band length {a.size}"
    return a.reshape(side, side)

# [N, 2, H, W]  (two channels: band_1, band_2)
X = np.stack([np.stack([to_img(b1), to_img(b2)], axis=0)
              for b1, b2 in zip(df["band_1"], df["band_2"])], axis=0)
y = df["is_iceberg"].astype(np.int64).to_numpy()
inc_angle = pd.to_numeric(df["inc_angle"], errors="coerce").to_numpy()

# stratified 80/20 split
idx = np.arange(len(y))
idx_tr, idx_va = train_test_split(idx, test_size=0.2, random_state=1234, stratify=y)
X_tr, X_va = X[idx_tr], X[idx_va]
y_tr, y_va = y[idx_tr], y[idx_va]
ang_tr, ang_va = inc_angle[idx_tr], inc_angle[idx_va]

# per-channel z-normalization using train stats
m = X_tr.mean(axis=(0,2,3), keepdims=True)
s = X_tr.std(axis=(0,2,3), keepdims=True) + 1e-6
X_tr = (X_tr - m)/s
X_va = (X_va - m)/s

np.savez_compressed(ART_DIR / "sar_fullres_splits.npz",
    X_train=X_tr, y_train=y_tr, X_val=X_va, y_val=y_va,
    means=m.squeeze(), stds=s.squeeze(), side=int(X.shape[-1]),
    inc_angle_train=ang_tr, inc_angle_val=ang_va
)
X_tr.shape, X_va.shape


((1283, 2, 75, 75), (321, 2, 75, 75))
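
The per-channel normalization in the cell above depends on NumPy broadcasting: reducing over axes (0, 2, 3) with keepdims=True leaves statistics of shape [1, C, 1, 1], which then broadcast against any [N, C, H, W] array. The sketch below illustrates this on synthetic data. The helper make_fake, the array sizes, the seed, and the band means/scales are all made up for illustration and are not the notebook's data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_fake(n):
    """Hypothetical stand-in for a SAR split: two bands with different scales."""
    b1 = rng.normal(-20.0, 4.0, size=(n, 8, 8))
    b2 = rng.normal(-26.0, 2.0, size=(n, 8, 8))
    return np.stack([b1, b2], axis=1).astype(np.float32)   # [n, 2, 8, 8]

X_fake_tr, X_fake_va = make_fake(200), make_fake(50)

# Statistics come from the training split only, one value per channel.
m = X_fake_tr.mean(axis=(0, 2, 3), keepdims=True)          # shape [1, 2, 1, 1]
s = X_fake_tr.std(axis=(0, 2, 3), keepdims=True) + 1e-6

Z_tr = (X_fake_tr - m) / s
Z_va = (X_fake_va - m) / s                                 # reuses train stats

print("stat shapes:", m.shape, s.shape)
print("train mean/std per channel:", Z_tr.mean(axis=(0, 2, 3)), Z_tr.std(axis=(0, 2, 3)))
print("val   mean/std per channel:", Z_va.mean(axis=(0, 2, 3)), Z_va.std(axis=(0, 2, 3)))
```

The training split ends up with mean 0 and std 1 per channel by construction. The validation split is only approximately standardized, which is expected when its statistics are never used.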

In [4]:
import numpy as np, torch, torch.nn as nn, torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from pathlib import Path

data = np.load("/content/artifacts/sar_fullres_splits.npz")
Xtr, Xva = data["X_train"], data["X_val"]
ytr, yva = data["y_train"], data["y_val"]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Xtr_t = torch.tensor(Xtr, dtype=torch.float32)
Xva_t = torch.tensor(Xva, dtype=torch.float32)
ytr_t = torch.tensor(ytr, dtype=torch.long)
yva_t = torch.tensor(yva, dtype=torch.long)

train_loader = DataLoader(TensorDataset(Xtr_t, ytr_t), batch_size=64, shuffle=True)
val_loader   = DataLoader(TensorDataset(Xva_t, yva_t), batch_size=128, shuffle=False)

class SmallCNN(nn.Module):
    def __init__(self, in_ch=2, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, num_classes)
    def forward(self, x):
        x = self.net(x).view(x.size(0), -1)
        return self.fc(x)

model = SmallCNN().to(device)
opt = optim.Adam(model.parameters(), lr=1e-3)
crit = nn.CrossEntropyLoss()

def run_epoch(dl, train=True):
    model.train() if train else model.eval()
    tot=correct=loss_sum=0
    for xb,yb in dl:
        xb,yb = xb.to(device), yb.to(device)
        if train: opt.zero_grad()
        with torch.set_grad_enabled(train):
            logits = model(xb); loss = crit(logits, yb)
            if train: loss.backward(); opt.step()
        pred = logits.argmax(1)
        correct += (pred==yb).sum().item()
        tot += yb.numel(); loss_sum += loss.item()*yb.size(0)
    return loss_sum/tot, correct/tot

for ep in range(8):
    tr_loss,tr_acc = run_epoch(train_loader, True)
    va_loss,va_acc = run_epoch(val_loader, False)
    print(f"Epoch {ep+1}: tr {tr_loss:.4f}/{tr_acc:.3f} | va {va_loss:.4f}/{va_acc:.3f}")


Epoch 1: tr 0.6639/0.567 | va 0.6381/0.623
Epoch 2: tr 0.6531/0.610 | va 0.6387/0.611
Epoch 3: tr 0.6456/0.627 | va 0.6285/0.660
Epoch 4: tr 0.6366/0.641 | va 0.6360/0.639
Epoch 5: tr 0.6320/0.659 | va 0.6081/0.670
Epoch 6: tr 0.6229/0.652 | va 0.5872/0.695
Epoch 7: tr 0.6014/0.672 | va 0.5672/0.688
Epoch 8: tr 0.5921/0.661 | va 0.5733/0.713


In [5]:
# ==================================
# Config + Small Utilities
# ==================================
from dataclasses import dataclass
from pathlib import Path
import numpy as np
import torch, torch.nn as nn, torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from sklearn.metrics import accuracy_score, roc_auc_score, confusion_matrix

@dataclass
class Config:
    # Paths
    npz_path: str = "/content/artifacts/sar_fullres_splits.npz"
    save_dir: str = "/content/artifacts"
    # Training
    batch_size: int = 64
    epochs: int = 20
    lr: float = 1e-3
    weight_decay: float = 1e-4
    # Early stopping / LR schedule
    patience: int = 5          # epochs with no AUC improvement to stop
    lr_patience: int = 2       # epochs with no AUC improvement before the LR is reduced
    lr_gamma: float = 0.5      # multiplicative LR decay factor on plateau
    # Augmentations
    hflip_p: float = 0.5
    vflip_p: float = 0.5
    rot90_p: float = 0.5       # random k*90° rotation
    # Incidence angle fusion
    use_angle: bool = True     # set False to ignore angle
    norm_angle: bool = True    # z-norm angle using train stats

cfg = Config()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Path(cfg.save_dir).mkdir(parents=True, exist_ok=True)
print("Using device:", device)


Using device: cpu


In [6]:
# ===========================
# Load NPZ + Prepare Tensors
# ===========================
D = np.load(cfg.npz_path)

Xtr = D["X_train"]          # [N, 2, 75, 75]
Xva = D["X_val"]
ytr = D["y_train"].astype(np.int64)
yva = D["y_val"].astype(np.int64)

# Angle arrays can have NaN; we'll impute with train median, then optionally z-norm
if cfg.use_angle:
    ang_tr = D["inc_angle_train"].astype(np.float32)
    ang_va = D["inc_angle_val"].astype(np.float32)

    # Impute missing angles with train median (robust/simple)
    train_med = np.nanmedian(ang_tr)
    ang_tr = np.where(np.isfinite(ang_tr), ang_tr, train_med)
    ang_va = np.where(np.isfinite(ang_va), ang_va, train_med)

    if cfg.norm_angle:
        mu, sd = ang_tr.mean(), ang_tr.std() + 1e-6
        ang_tr = (ang_tr - mu) / sd
        ang_va = (ang_va - mu) / sd
else:
    # If angle is disabled, create zeros so the loaders still output (xb, ab, yb) triples
    ang_tr = np.zeros(len(ytr), dtype=np.float32)
    ang_va = np.zeros(len(yva), dtype=np.float32)

# ---------------------------
# PyTorch Tensor Conversion
# ---------------------------
Xtr_t = torch.tensor(Xtr, dtype=torch.float32)
Xva_t = torch.tensor(Xva, dtype=torch.float32)
ytr_t = torch.tensor(ytr, dtype=torch.long)
yva_t = torch.tensor(yva, dtype=torch.long)
ang_tr_t = torch.tensor(ang_tr, dtype=torch.float32).view(-1, 1)
ang_va_t = torch.tensor(ang_va, dtype=torch.float32).view(-1, 1)

# ---------------------------
# Lightweight Augmentations
# ---------------------------
# We implement custom tensor-based flips/rotations (works for 2-channel SAR)
import random

def aug_batch(xb):
    """Apply in-place random flips/rot90 to a batch of images [B,2,H,W]."""
    B = xb.size(0)
    for i in range(B):
        if random.random() < cfg.hflip_p:
            xb[i] = torch.flip(xb[i], dims=[2])  # horizontal flip (W axis)
        if random.random() < cfg.vflip_p:
            xb[i] = torch.flip(xb[i], dims=[1])  # vertical flip (H axis)
        if random.random() < cfg.rot90_p:
            k = random.randint(1,3)
            xb[i] = torch.rot90(xb[i], k, dims=[1,2])  # rotate k*90°
    return xb
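
Flips and 90° rotations only permute pixels within each channel, so they change orientation without altering backscatter values or mixing the two bands. A quick way to convince yourself is a toy check like the one below. It uses the NumPy equivalents (np.flip, np.rot90) rather than the torch calls in aug_batch, and the array toy is a hypothetical 2×3×3 example.

```python
import numpy as np

toy = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)   # [C, H, W]

hflip = np.flip(toy, axis=2)              # mirror left-right (W axis)
vflip = np.flip(toy, axis=1)              # mirror up-down (H axis)
rot   = np.rot90(toy, k=1, axes=(1, 2))   # quarter-turn of each channel

for name, out in [("hflip", hflip), ("vflip", vflip), ("rot90", rot)]:
    same_shape = out.shape == toy.shape
    # Per-channel pixel multisets must match: nothing is created, lost, or mixed.
    same_pixels = all(
        np.array_equal(np.sort(out[c].ravel()), np.sort(toy[c].ravel()))
        for c in range(toy.shape[0])
    )
    print(name, same_shape, same_pixels)

# Four quarter-turns bring the image back to where it started.
full_turn = np.rot90(toy, k=4, axes=(1, 2))
print("identity after 4 turns:", np.array_equal(full_turn, toy))
```

The shape is preserved only because the images are square; a non-square input would swap H and W under an odd number of quarter-turns.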

In [10]:
# ---------------------------------------
# DataLoaders (always yield (xb, ab, yb))
# ---------------------------------------
from torch.utils.data import TensorDataset, DataLoader

train_ds = TensorDataset(Xtr_t, ang_tr_t, ytr_t)
val_ds   = TensorDataset(Xva_t, ang_va_t, yva_t)

train_loader = DataLoader(train_ds, batch_size=cfg.batch_size, shuffle=True,  num_workers=2)
val_loader   = DataLoader(val_ds,   batch_size=cfg.batch_size*2, shuffle=False, num_workers=2)


In [11]:
# ===========================
# CNN with Angle Fusion
# ===========================
class IcebergVesselCNN(nn.Module):
    """
    A compact CNN trunk for 2-channel SAR (band_1, band_2) + optional angle fusion.
    Trunk: (Conv -> ReLU -> MaxPool) x3 -> Conv -> ReLU -> GlobalAvgPool -> feature vector (128 dims)
    Head:  If use_angle, concat [feat, angle]; then Linear -> ReLU -> Dropout -> Linear -> logits(2)
    """
    def __init__(self, in_ch=2, use_angle=True):
        super().__init__()
        self.use_angle = use_angle
        self.trunk = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 75->37
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 37->18
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 18->9
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # -> [B,128,1,1]
        )
        feat_dim = 128
        head_in = feat_dim + (1 if use_angle else 0)
        self.head = nn.Sequential(
            nn.Linear(head_in, 64), nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(64, 2)
        )

    def forward(self, x, angle=None):
        f = self.trunk(x).view(x.size(0), -1)  # [B,128]
        if self.use_angle:
            assert angle is not None, "Angle tensor is required when use_angle=True"
            f = torch.cat([f, angle], dim=1)   # [B,129]
        return self.head(f)

model = IcebergVesselCNN(in_ch=2, use_angle=cfg.use_angle).to(device)
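
The spatial sizes noted in the trunk comments (75->37->18->9) and the overall model size can be checked by hand with the standard Conv2d/MaxPool2d output-size formulas. The pure-Python sketch below does that arithmetic; the helper names (conv_out, pool_out, conv_params, linear_params) are made up for this illustration and assume the layer settings shown in the class above with use_angle=True.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Output side length of a square Conv2d."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output side length of MaxPool2d (floor mode, no padding)."""
    return (size - kernel) // stride + 1

def conv_params(c_in, c_out, kernel=3):
    return c_in * c_out * kernel * kernel + c_out   # weights + biases

def linear_params(n_in, n_out):
    return n_in * n_out + n_out                     # weights + biases

side = 75
sides = [side]
for _ in range(3):                  # three Conv -> ReLU -> MaxPool blocks
    side = pool_out(conv_out(side))
    sides.append(side)
side = conv_out(side)               # fourth conv keeps the 9x9 map
print("spatial sizes:", sides)      # [75, 37, 18, 9]

channels = [2, 32, 64, 128, 128]
trunk = sum(conv_params(a, b) for a, b in zip(channels, channels[1:]))
head = linear_params(128 + 1, 64) + linear_params(64, 2)   # angle adds one input
total_params = trunk + head
print("parameters:", total_params)  # 248994
```

Roughly a quarter of a million parameters is small by CNN standards, which suits a training set of 1,283 images.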


In [12]:
# =============================
# Training Setup (model/opt/lr)
# =============================
from pathlib import Path
import numpy as np
import torch, torch.nn as nn, torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Model (2 SAR bands; angle fusion controlled by cfg.use_angle)
model = IcebergVesselCNN(in_ch=2, use_angle=cfg.use_angle).to(device)

# Loss / Optimizer / Scheduler
criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(model.parameters(), lr=cfg.lr, weight_decay=cfg.weight_decay)
scheduler = ReduceLROnPlateau(optimizer, mode="max", factor=cfg.lr_gamma, patience=cfg.lr_patience)

# ===========================
# AMP (Mixed Precision)
# ===========================
try:
    from torch.amp import autocast, GradScaler   # modern API
    use_amp = torch.cuda.is_available()
    scaler  = GradScaler("cuda", enabled=use_amp)   # acts as a no-op when AMP is off
except ImportError:
    # Fallback for older torch: torch.autocast accepts device_type,
    # unlike torch.cuda.amp.autocast, so the autocast calls below stay valid.
    from torch import autocast
    from torch.cuda.amp import GradScaler
    use_amp = torch.cuda.is_available()
    scaler  = GradScaler(enabled=use_amp)

# ===========================
# One Epoch (train/eval)
# ===========================
def run_epoch(loader, train=True):
    """One epoch. Returns (avg_loss, accuracy, auc)."""
    model.train() if train else model.eval()
    losses, probs_all, ys_all = [], [], []

    for xb, ab, yb in loader:
        xb, ab, yb = xb.to(device), ab.to(device), yb.to(device)

        # Apply augmentation only during training
        if train:
            xb = aug_batch(xb)

        if train:
            optimizer.zero_grad(set_to_none=True)
            with autocast(device_type='cuda', enabled=use_amp):
                logits = model(xb, ab) if cfg.use_angle else model(xb)
                loss   = criterion(logits, yb)

            if use_amp:
                scaler.scale(loss).backward()
                scaler.step(optimizer)
                scaler.update()
            else:
                loss.backward()
                optimizer.step()
        else:
            with torch.no_grad():
                with autocast(device_type='cuda', enabled=use_amp):
                    logits = model(xb, ab) if cfg.use_angle else model(xb)
                    loss   = criterion(logits, yb)

        # ---- bookkeeping
        losses.append(loss.item())
        prob = torch.softmax(logits, dim=1)[:, 1].detach().cpu().numpy()
        y_np = yb.detach().cpu().numpy()
        probs_all.append(prob); ys_all.append(y_np)

    from sklearn.metrics import accuracy_score, roc_auc_score
    y_true = np.concatenate(ys_all)
    y_prob = np.concatenate(probs_all)
    y_pred = (y_prob >= 0.5).astype(np.int64)

    acc = accuracy_score(y_true, y_pred)
    try:
        auc = roc_auc_score(y_true, y_prob)
    except Exception:
        auc = float("nan")

    return float(np.mean(losses)), acc, auc

# ===========================
# Training Loop
# ===========================
best_auc = 0.0
epochs_no_improve = 0
best_path = str(Path(cfg.save_dir) / "best_iceberg_vessel_cnn.pt")

for ep in range(1, cfg.epochs + 1):
    tr_loss, tr_acc, tr_auc = run_epoch(train_loader, train=True)
    va_loss, va_acc, va_auc = run_epoch(val_loader,   train=False)

    if np.isfinite(va_auc):
        scheduler.step(va_auc)

    print(f"Epoch {ep:02d} | "
          f"tr loss {tr_loss:.4f} acc {tr_acc:.3f} auc {tr_auc:.3f} || "
          f"va loss {va_loss:.4f} acc {va_acc:.3f} auc {va_auc:.3f}")

    if va_auc > best_auc + 1e-4:
        best_auc = va_auc
        epochs_no_improve = 0
        torch.save({"model_state": model.state_dict(), "cfg": dict(cfg.__dict__)}, best_path)
    else:
        epochs_no_improve += 1
        if epochs_no_improve >= cfg.patience:
            print(f"Early stopping at epoch {ep} (best val AUC={best_auc:.3f}).")
            break

print("Best model saved to:", best_path)


Epoch 01 | tr loss 0.6542 acc 0.576 auc 0.616 || va loss 0.6571 acc 0.620 auc 0.663
Epoch 02 | tr loss 0.6752 acc 0.595 auc 0.622 || va loss 0.6384 acc 0.614 auc 0.711
Epoch 03 | tr loss 0.6542 acc 0.588 auc 0.638 || va loss 0.6406 acc 0.688 auc 0.712
Epoch 04 | tr loss 0.6354 acc 0.631 auc 0.676 || va loss 0.6138 acc 0.676 auc 0.726
Epoch 05 | tr loss 0.6245 acc 0.652 auc 0.697 || va loss 0.5982 acc 0.692 auc 0.722
Epoch 06 | tr loss 0.6054 acc 0.680 auc 0.723 || va loss 0.5856 acc 0.664 auc 0.771
Epoch 07 | tr loss 0.5917 acc 0.669 auc 0.735 || va loss 0.5618 acc 0.710 auc 0.798
Epoch 08 | tr loss 0.5437 acc 0.701 auc 0.771 || va loss 0.4941 acc 0.741 auc 0.818
Epoch 09 | tr loss 0.4847 acc 0.759 auc 0.819 || va loss 0.4656 acc 0.788 auc 0.865
Epoch 10 | tr loss 0.4471 acc 0.791 auc 0.857 || va loss 0.4245 acc 0.785 auc 0.878
Epoch 11 | tr loss 0.4373 acc 0.804 auc 0.871 || va loss 0.4778 acc 0.769 auc 0.872
Epoch 12 | tr loss 0.4207 acc 0.805 auc 0.877 || va loss 0.4117 acc 0.819 au
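
The checkpoint and early-stopping rule in the training loop can be replayed offline on the validation AUCs printed above. The sketch below is a simplified stand-in for the loop's bookkeeping (no model, no scheduler); the function name replay_early_stopping and the plateau sequence are hypothetical. Only epochs 1-11 are used because the epoch 12 line is cut off in the log.

```python
def replay_early_stopping(val_aucs, patience=5, min_delta=1e-4):
    """Return (best_auc, best_epoch, stop_epoch); stop_epoch is None if never triggered."""
    best_auc, best_epoch, waited = 0.0, None, 0
    for epoch, auc in enumerate(val_aucs, start=1):
        if auc > best_auc + min_delta:       # same improvement test as the loop
            best_auc, best_epoch, waited = auc, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_auc, best_epoch, epoch
    return best_auc, best_epoch, None

# Validation AUCs for epochs 1-11, copied from the log above.
logged = [0.663, 0.711, 0.712, 0.726, 0.722, 0.771, 0.798, 0.818, 0.865, 0.878, 0.872]
print(replay_early_stopping(logged))     # (0.878, 10, None)

# A made-up plateau to show the stop condition firing.
plateau = [0.60, 0.70, 0.70, 0.69, 0.70, 0.70, 0.70, 0.80]
print(replay_early_stopping(plateau))    # (0.7, 2, 7)
```

In the plateau case the run stops at epoch 7 and never sees the later 0.80, which is the usual trade-off of a fixed patience window.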

In [13]:
# ===========================
# Load Best & Full Evaluation
# ===========================
ckpt = torch.load(best_path, map_location=device)
model.load_state_dict(ckpt["model_state"])
model.eval()

# Collect validation predictions for metrics & confusion matrix
all_probs, all_preds, all_true = [], [], []
with torch.no_grad():
    for xb, ab, yb in val_loader:
        xb, yb = xb.to(device), yb.to(device)
        ab = ab.to(device) if cfg.use_angle else None
        logits = model(xb, ab) if cfg.use_angle else model(xb)
        prob = torch.softmax(logits, dim=1)[:,1].cpu().numpy()
        pred = (prob >= 0.5).astype(int)
        all_probs.append(prob)
        all_preds.append(pred)
        all_true.append(yb.cpu().numpy())

y_true = np.concatenate(all_true)
y_prob = np.concatenate(all_probs)
y_pred = np.concatenate(all_preds)

acc = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_prob)
cm  = confusion_matrix(y_true, y_pred)

print(f"Validation ACC: {acc:.3f} | AUC: {auc:.3f}")
print("Confusion matrix [[TN, FP],[FN, TP]]:\n", cm)


Validation ACC: 0.829 | AUC: 0.919
Confusion matrix [[TN, FP],[FN, TP]]:
 [[131  39]
 [ 16 135]]
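
The confusion matrix carries more detail than the single accuracy figure. As a worked example, the counts printed above can be turned into the usual per-class rates, treating class 1 (is_iceberg) as the positive class. The variable names are illustrative.

```python
tn, fp, fn, tp = 131, 39, 16, 135   # from the printed matrix [[TN, FP], [FN, TP]]

total       = tn + fp + fn + tp
accuracy    = (tp + tn) / total
precision   = tp / (tp + fp)     # of predicted icebergs, how many are icebergs
recall      = tp / (tp + fn)     # of true icebergs, how many were found
specificity = tn / (tn + fp)     # of true vessels, how many were kept as vessels
f1          = 2 * precision * recall / (precision + recall)

print(f"n={total} acc={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} specificity={specificity:.3f} f1={f1:.3f}")
# n=321 acc=0.829 precision=0.776 recall=0.894 specificity=0.771 f1=0.831
```

The recomputed accuracy matches the reported 0.829. At the 0.5 threshold the model misclassifies vessels as icebergs (39 false positives) more often than it misses icebergs (16 false negatives).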
