The notebook mounts Google Drive, reads train.json from the “Ursa Space 1A” folder, and converts each sample’s band_1 and band_2 lists into 75×75 float images, stacked as a two-channel tensor [N, 2, 75, 75]. Labels (is_iceberg) and incidence angles are loaded alongside the images. A stratified 80/20 split creates train/validation indices. Per-channel statistics (mean, std) are computed on the train tensor and used to z-normalize both splits; the normalized arrays, the statistics, and the raw angles are saved to a compact artifact, sar_fullres_splits.npz, for reproducibility. When the artifact is reloaded, missing angles are imputed with the train-set median and optionally z-normalized.

The model, IcebergVesselCNN, is a compact convolutional trunk (Conv-ReLU-MaxPool blocks → AdaptiveAvgPool to a 128-d feature) with an angle-fusion head that concatenates the (optionally normalized) angle to the pooled feature before a small MLP classifier.

Training uses AdamW with cross-entropy loss, optional mixed precision (AMP), light SAR-appropriate augmentations (random horizontal/vertical flips and 90° rotations), and a ReduceLROnPlateau scheduler keyed to validation AUC, with early stopping on AUC to limit overfitting. After each epoch, accuracy and AUC are computed on the validation set and the best checkpoint is stored as best_iceberg_vessel_cnn.pt. Final reporting covers accuracy, ROC-AUC, and a confusion matrix: on the validation split the best checkpoint reaches an accuracy of 0.875 and a ROC-AUC of 0.962, with TN=142, FP=28, FN=12, TP=139.

In [12]:
# ==== GitHub setup (Colab) — safe, idempotent, and data-friendly =====================
# - Uses Colab secret: GITHUB_TOKEN (Settings → Secrets → Add secret)
# - Clones your repo without exposing the token
# - Sets a push-URL that authenticates via the secret
# - Writes a .gitignore to keep large data/artifacts out of Git
# - Leaves optional "copy & push" lines commented — un-comment when ready

from google.colab import userdata
import os, pathlib, textwrap, subprocess, sys

# ---- EDIT THESE THREE VALUES --------------------------------------------------------
GITHUB_USER  = "zainakhalil"
GITHUB_REPO  = "Ursa-Space-1A"
GITHUB_EMAIL = "zkhal4@uic.edu"     # <-- set to your GitHub email
# ------------------------------------------------------------------------------------

# 0) Load token from Colab secrets (won't print)
token = userdata.get("GITHUB_TOKEN")
if not token:
    raise RuntimeError(
        "No GITHUB_TOKEN found in Colab secrets. "
        "Add one in Colab: Settings → Secrets → New secret → key: GITHUB_TOKEN"
    )
os.environ["GITHUB_TOKEN"] = token  # used only for push URL; won't be printed

repo_https = f"https://github.com/{GITHUB_USER}/{GITHUB_REPO}.git"
push_https = f"https://{GITHUB_USER}:{token}@github.com/{GITHUB_USER}/{GITHUB_REPO}.git"
repo_dir   = pathlib.Path(GITHUB_REPO)

# 1) Clone if needed (fetch URL stays clean; push URL uses token)
if not repo_dir.exists():
    print(f"Cloning {repo_https} …")
    subprocess.run(["git", "clone", repo_https, GITHUB_REPO], check=True)
else:
    print(f"Repo already exists at: {repo_dir.resolve()}")

# 2) Configure git identity and push URL
os.chdir(repo_dir)
subprocess.run(["git", "config", "user.name", GITHUB_USER], check=True)
subprocess.run(["git", "config", "user.email", GITHUB_EMAIL], check=True)
subprocess.run(["git", "remote", "set-url", "--push", "origin", push_https], check=True)

# 3) Write a .gitignore (keeps data/artifacts/checkpoints out of Git)
gitignore = textwrap.dedent("""
# ---- data & artifacts (keep out of Git) ----
data/
artifacts/
*.npz
*.pt
*.7z
*.json

# ---- notebooks noise ----
*.ipynb_checkpoints/

# ---- OS/editor ----
.DS_Store
*.swp
""").lstrip()

gi_path = pathlib.Path(".gitignore")
if gi_path.exists():
    print(".gitignore already exists; leaving as-is.")
else:
    gi_path.write_text(gitignore)
    subprocess.run(["git", "add", ".gitignore"], check=True)
    subprocess.run(["git", "commit", "-m", "Add .gitignore for data/artifacts"], check=False)

print("\nGit setup complete. Current remotes:")
subprocess.run(["git", "remote", "-v"], check=True)


Cloning https://github.com/zainakhalil/Ursa-Space-1A.git …
.gitignore already exists; leaving as-is.

Git setup complete. Current remotes:


CompletedProcess(args=['git', 'remote', '-v'], returncode=0)
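
Because the push URL configured above embeds the token, any command that echoes the remote (for example `git remote -v`) can leak it into saved notebook output. The following is a minimal sketch of a redaction helper that could be applied before printing a URL; the name `redact_url` and the example URL are hypothetical and not part of the notebook.

```python
from urllib.parse import urlsplit, urlunsplit

def redact_url(url: str) -> str:
    """Return url with any user[:password]@ credentials replaced by '***@'."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"***@{host}", parts.path, parts.query, parts.fragment))

# Hypothetical URL with a made-up token, for illustration only
print(redact_url("https://someuser:ghp_exampletoken@github.com/someuser/some-repo.git"))
# https://***@github.com/someuser/some-repo.git
```

URLs without credentials pass through unchanged, so the helper can be applied to fetch and push URLs alike.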

In [None]:
from google.colab import userdata
import os

# Get the GitHub token from Colab Secrets
github_token = userdata.get('GITHUB_TOKEN')

# Replace with your GitHub username and repository name
github_username = 'zainakhalil'
repository_name = 'Ursa-Space-1A'

# Construct the clone URL with the token
clone_url = f'https://{github_token}@github.com/{github_username}/{repository_name}.git'

# Clone the repository
!git clone {clone_url}

In [25]:
# Stage the notebook file (only this file, not all changes)
!git add "TrainCNNV2.ipynb"

In [26]:
# Copy the notebook from Drive into the cloned repo
!cp "/content/drive/MyDrive/Colab Notebooks/TrainCNNV2.ipynb" "/content/Ursa-Space-1A/"

In [27]:
# Commit Changes
!git commit -m "Train CNN Attempt 2"

[main c4d20a7] Train CNN Attempt 2
 1 file changed, 1 insertion(+), 1 deletion(-)
 rewrite TrainCNNV2.ipynb (82%)


In [28]:
!git config --global user.email "zkhal4@uic.edu"
!git config --global user.name "zainakhalil"

In [29]:
# Push to the remote repository
# 'origin' is the default remote name, and 'main' or 'master' is typically the branch name
!git push origin main
# If your branch is named 'master', use:
# !git push origin master

remote: Invalid username or token. Password authentication is not supported for Git operations.
fatal: Authentication failed for 'https://github.com/zainakhalil/Ursa-Space-1A.git/'


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [3]:
import pandas as pd

!ls "/content/drive/MyDrive/Ursa Space 1A/train.json"
df = pd.read_json('/content/drive/MyDrive/Ursa Space 1A/train.json')


'/content/drive/MyDrive/Ursa Space 1A/train.json'


In [4]:
import math, numpy as np, pandas as pd
from sklearn.model_selection import train_test_split
from pathlib import Path

DATA_PATH = Path("/content/drive/MyDrive/Ursa Space 1A/train.json")
ART_DIR   = Path("/content/artifacts"); ART_DIR.mkdir(parents=True, exist_ok=True)

df = pd.read_json(DATA_PATH)

def to_img(seq):
    a = np.array(seq, dtype=np.float32)
    side = int(round(math.sqrt(a.size)))
    assert side*side == a.size, f"Non-square band length {a.size}"
    return a.reshape(side, side)

# [N, 2, H, W]  (two channels: band_1, band_2)
X = np.stack([np.stack([to_img(b1), to_img(b2)], axis=0)
              for b1, b2 in zip(df["band_1"], df["band_2"])], axis=0)
y = df["is_iceberg"].astype(np.int64).to_numpy()
inc_angle = pd.to_numeric(df["inc_angle"], errors="coerce").to_numpy()

# stratified 80/20 split
idx = np.arange(len(y))
idx_tr, idx_va = train_test_split(idx, test_size=0.2, random_state=1234, stratify=y)
X_tr, X_va = X[idx_tr], X[idx_va]
y_tr, y_va = y[idx_tr], y[idx_va]
ang_tr, ang_va = inc_angle[idx_tr], inc_angle[idx_va]

# per-channel z-normalization using train stats
m = X_tr.mean(axis=(0,2,3), keepdims=True)
s = X_tr.std(axis=(0,2,3), keepdims=True) + 1e-6
X_tr = (X_tr - m)/s
X_va = (X_va - m)/s

np.savez_compressed(ART_DIR / "sar_fullres_splits.npz",
    X_train=X_tr, y_train=y_tr, X_val=X_va, y_val=y_va,
    means=m.squeeze(), stds=s.squeeze(), side=int(X.shape[-1]),
    inc_angle_train=ang_tr, inc_angle_val=ang_va
)
X_tr.shape, X_va.shape


((1283, 2, 75, 75), (321, 2, 75, 75))
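
The artifact stores the train-set `means` and `stds` so that any later data can be normalized the same way. Below is a minimal sketch of that reuse, assuming per-channel statistics of shape [C] as saved above. The helper name `normalize_with_saved_stats` is hypothetical, and the arrays are random stand-ins rather than the SAR data.

```python
import io
import numpy as np

def normalize_with_saved_stats(X, means, stds):
    """Z-normalize X of shape [N, C, H, W] with per-channel stats of shape [C]."""
    m = np.asarray(means, dtype=np.float32).reshape(1, -1, 1, 1)
    s = np.asarray(stds, dtype=np.float32).reshape(1, -1, 1, 1)
    return (X - m) / s

# Random stand-in for the real image tensor (not the SAR set)
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=-20.0, scale=5.0, size=(8, 2, 75, 75)).astype(np.float32)
m = X_raw.mean(axis=(0, 2, 3))
s = X_raw.std(axis=(0, 2, 3)) + 1e-6

# Round-trip through an in-memory .npz, mirroring the keys used in the cell above
buf = io.BytesIO()
np.savez_compressed(buf, means=m, stds=s)
buf.seek(0)
D_stats = np.load(buf)

X_norm = normalize_with_saved_stats(X_raw, D_stats["means"], D_stats["stds"])
print(X_norm.mean(axis=(0, 2, 3)), X_norm.std(axis=(0, 2, 3)))
```

Applying the stored statistics, rather than recomputing them on new data, keeps held-out samples on the same scale the model saw during training.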

In [5]:
import numpy as np, torch, torch.nn as nn, torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from pathlib import Path

data = np.load("/content/artifacts/sar_fullres_splits.npz")
Xtr, Xva = data["X_train"], data["X_val"]
ytr, yva = data["y_train"], data["y_val"]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Xtr_t = torch.tensor(Xtr, dtype=torch.float32)
Xva_t = torch.tensor(Xva, dtype=torch.float32)
ytr_t = torch.tensor(ytr, dtype=torch.long)
yva_t = torch.tensor(yva, dtype=torch.long)

train_loader = DataLoader(TensorDataset(Xtr_t, ytr_t), batch_size=64, shuffle=True)
val_loader   = DataLoader(TensorDataset(Xva_t, yva_t), batch_size=128, shuffle=False)

class SmallCNN(nn.Module):
    def __init__(self, in_ch=2, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, num_classes)
    def forward(self, x):
        x = self.net(x).view(x.size(0), -1)
        return self.fc(x)

model = SmallCNN().to(device)
opt = optim.Adam(model.parameters(), lr=1e-3)
crit = nn.CrossEntropyLoss()

def run_epoch(dl, train=True):
    model.train() if train else model.eval()
    tot=correct=loss_sum=0
    for xb,yb in dl:
        xb,yb = xb.to(device), yb.to(device)
        if train: opt.zero_grad()
        with torch.set_grad_enabled(train):
            logits = model(xb); loss = crit(logits, yb)
            if train: loss.backward(); opt.step()
        pred = logits.argmax(1)
        correct += (pred==yb).sum().item()
        tot += yb.numel(); loss_sum += loss.item()*yb.size(0)
    return loss_sum/tot, correct/tot

for ep in range(8):
    tr_loss,tr_acc = run_epoch(train_loader, True)
    va_loss,va_acc = run_epoch(val_loader, False)
    print(f"Epoch {ep+1}: tr {tr_loss:.4f}/{tr_acc:.3f} | va {va_loss:.4f}/{va_acc:.3f}")


Epoch 1: tr 0.6810/0.535 | va 0.6508/0.639
Epoch 2: tr 0.6617/0.599 | va 0.6393/0.664
Epoch 3: tr 0.6517/0.627 | va 0.6335/0.651
Epoch 4: tr 0.6467/0.627 | va 0.6277/0.642
Epoch 5: tr 0.6396/0.649 | va 0.6323/0.626
Epoch 6: tr 0.6363/0.646 | va 0.7098/0.576
Epoch 7: tr 0.6781/0.612 | va 0.6370/0.651
Epoch 8: tr 0.6403/0.641 | va 0.6154/0.664


In [6]:
# ==================================
# Config Small CNN + Small Utilities
# ==================================
from dataclasses import dataclass
from pathlib import Path
import numpy as np
import torch, torch.nn as nn, torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from sklearn.metrics import accuracy_score, roc_auc_score, confusion_matrix

@dataclass
class Config:
    # Paths
    npz_path: str = "/content/artifacts/sar_fullres_splits.npz"
    save_dir: str = "/content/artifacts"
    # Training
    batch_size: int = 64
    epochs: int = 20
    lr: float = 1e-3
    weight_decay: float = 1e-4
    # Early stopping / LR schedule
    patience: int = 5          # epochs with no AUC improvement to stop
    lr_patience: int = 2       # epochs with no AUC improvement before LR is reduced
    lr_gamma: float = 0.5
    # Augmentations
    hflip_p: float = 0.5
    vflip_p: float = 0.5
    rot90_p: float = 0.5       # random k*90° rotation
    # Incidence angle fusion
    use_angle: bool = True     # set False to ignore angle
    norm_angle: bool = True    # z-norm angle using train stats

cfg = Config()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Path(cfg.save_dir).mkdir(parents=True, exist_ok=True)
print("Using device:", device)


Using device: cpu


In [7]:
# ===========================
# Load NPZ + Prepare Tensors
# ===========================
D = np.load(cfg.npz_path)

Xtr = D["X_train"]          # [N, 2, 75, 75]
Xva = D["X_val"]
ytr = D["y_train"].astype(np.int64)
yva = D["y_val"].astype(np.int64)

# Angle arrays can have NaN; we'll impute with train median, then optionally z-norm
if cfg.use_angle:
    ang_tr = D["inc_angle_train"].astype(np.float32)
    ang_va = D["inc_angle_val"].astype(np.float32)

    # Impute missing angles with train median (robust/simple)
    train_med = np.nanmedian(ang_tr)
    ang_tr = np.where(np.isfinite(ang_tr), ang_tr, train_med)
    ang_va = np.where(np.isfinite(ang_va), ang_va, train_med)

    if cfg.norm_angle:
        mu, sd = ang_tr.mean(), ang_tr.std() + 1e-6
        ang_tr = (ang_tr - mu) / sd
        ang_va = (ang_va - mu) / sd
else:
    ang_tr = np.zeros(len(ytr), dtype=np.float32)
    ang_va = np.zeros(len(yva), dtype=np.float32)

# ---------------------------
# PyTorch Tensor Conversion
# ---------------------------
Xtr_t = torch.tensor(Xtr, dtype=torch.float32)
Xva_t = torch.tensor(Xva, dtype=torch.float32)
ytr_t = torch.tensor(ytr, dtype=torch.long)
yva_t = torch.tensor(yva, dtype=torch.long)
ang_tr_t = torch.tensor(ang_tr, dtype=torch.float32).view(-1, 1)
ang_va_t = torch.tensor(ang_va, dtype=torch.float32).view(-1, 1)

# ---------------------------
# Lightweight Augmentations
# ---------------------------
# We implement custom tensor-based flips/rotations (works for 2-channel SAR)
import random

def aug_batch(xb):
    """Apply in-place random flips/rot90 to a batch of images [B,2,H,W]."""
    B = xb.size(0)
    for i in range(B):
        if random.random() < cfg.hflip_p:
            xb[i] = torch.flip(xb[i], dims=[2])  # horizontal flip (W axis)
        if random.random() < cfg.vflip_p:
            xb[i] = torch.flip(xb[i], dims=[1])  # vertical flip (H axis)
        if random.random() < cfg.rot90_p:
            k = random.randint(1,3)
            xb[i] = torch.rot90(xb[i], k, dims=[1,2])  # rotate k*90°
    return xb
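
To make the angle handling concrete, here is a small worked example of median imputation followed by z-normalization, using train statistics for both splits as in the cell above. The function name `impute_and_normalize` and the toy angle values are illustrative assumptions.

```python
import numpy as np

def impute_and_normalize(ang_train, ang_val, eps=1e-6):
    """Fill non-finite angles with the train median, then z-normalize with train stats."""
    med = np.nanmedian(ang_train)
    tr = np.where(np.isfinite(ang_train), ang_train, med)
    va = np.where(np.isfinite(ang_val), ang_val, med)
    mu, sd = tr.mean(), tr.std() + eps
    return (tr - mu) / sd, (va - mu) / sd, med

# Toy angles in degrees; NaN marks a missing value
ang_train = np.array([30.0, np.nan, 40.0, 35.0], dtype=np.float32)
ang_val = np.array([np.nan, 45.0], dtype=np.float32)

tr, va, med = impute_and_normalize(ang_train, ang_val)
print("train median:", med)
print("train:", tr)
print("val:  ", va)
```

The missing validation angle receives the train median, so after normalization it sits at the centre of the train distribution (a value of 0 in this toy case, where the median equals the mean).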


In [8]:
# ===========================
# CNN with Angle Fusion
# ===========================
class IcebergVesselCNN(nn.Module):
    """
    A compact CNN trunk for 2-channel SAR (band_1, band_2) + optional angle fusion.
    Trunk: (Conv -> ReLU -> MaxPool) x3, then Conv -> ReLU -> GlobalAvgPool -> feature vector (128 dims)
    Head:  If use_angle, concat [feat, angle] -> Linear(64) -> ReLU -> Dropout -> Linear -> logits(2)
    """
    def __init__(self, in_ch=2, use_angle=True):
        super().__init__()
        self.use_angle = use_angle
        self.trunk = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 75->37
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 37->18
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 18->9
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # -> [B,128,1,1]
        )
        feat_dim = 128
        head_in = feat_dim + (1 if use_angle else 0)
        self.head = nn.Sequential(
            nn.Linear(head_in, 64), nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(64, 2)
        )

    def forward(self, x, angle=None):
        f = self.trunk(x).view(x.size(0), -1)  # [B,128]
        if self.use_angle:
            assert angle is not None, "Angle tensor is required when use_angle=True"
            f = torch.cat([f, angle], dim=1)   # [B,129]
        return self.head(f)

model = IcebergVesselCNN(in_ch=2, use_angle=cfg.use_angle).to(device)
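
The spatial sizes noted in the trunk comments (75 -> 37 -> 18 -> 9) and the size of the model can be checked by hand. The sketch below uses the standard output-size formula for a convolution and for a max-pool with default stride and floor rounding. The helper names are hypothetical, and the parameter count assumes use_angle=True.

```python
def conv_out(n, k=3, s=1, p=1):
    """Output side length of a conv layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def pool_out(n, k=2):
    """Output side length of MaxPool2d(k): stride defaults to k, no padding, floor mode."""
    return (n - k) // k + 1

side = 75
sizes = [side]
for _ in range(3):                 # three Conv -> ReLU -> MaxPool stages
    side = pool_out(conv_out(side))
    sizes.append(side)
print(sizes)  # [75, 37, 18, 9]

def conv_params(c_in, c_out, k=3):
    return c_in * c_out * k * k + c_out      # weights + biases

def linear_params(n_in, n_out):
    return n_in * n_out + n_out

trunk = sum(conv_params(a, b) for a, b in [(2, 32), (32, 64), (64, 128), (128, 128)])
head = linear_params(128 + 1, 64) + linear_params(64, 2)   # +1 for the fused angle
print(trunk, head, trunk + head)
```

Odd side lengths lose their last row and column at each pooling step (75 -> 37), which is why the sizes are not exact halvings.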


In [10]:
import numpy as np, torch
from torch.amp import autocast, GradScaler  # new API (works on Colab)
# If your torch is older, fallback:
# from torch.cuda.amp import autocast, GradScaler

use_amp = torch.cuda.is_available()
scaler  = GradScaler('cuda', enabled=use_amp)  # disabled (no-op) when CUDA is unavailable

def run_epoch(loader, train=True):
    """One epoch. Returns (avg_loss, accuracy, auc)."""
    model.train() if train else model.eval()
    losses, preds_all, probs_all, ys_all = [], [], [], []

    for xb, ab, yb in loader:
        xb, yb = xb.to(device), yb.to(device)
        ab = ab.to(device) if cfg.use_angle else None

        if train:
            # Clear stale gradients only when a backward pass will follow
            optimizer.zero_grad(set_to_none=True)

            # ---- forward with (optional) autocast
            with autocast(device_type='cuda', enabled=use_amp):
                logits = model(xb, ab) if cfg.use_angle else model(xb)
                loss   = criterion(logits, yb)

            # ---- backward + step (AMP or not)
            if use_amp:
                scaler.scale(loss).backward()
                scaler.step(optimizer)
                scaler.update()
            else:
                loss.backward()
                optimizer.step()
        else:
            with torch.no_grad():
                logits = model(xb, ab) if cfg.use_angle else model(xb)
                loss   = criterion(logits, yb)

        # ---- bookkeeping
        losses.append(loss.item())
        prob = torch.softmax(logits, dim=1)[:, 1].detach().cpu().numpy()
        pred = (prob >= 0.5).astype(np.int64)
        y_np = yb.detach().cpu().numpy()
        probs_all.append(prob); preds_all.append(pred); ys_all.append(y_np)

    from sklearn.metrics import accuracy_score, roc_auc_score
    y_true = np.concatenate(ys_all)
    y_prob = np.concatenate(probs_all)
    y_pred = np.concatenate(preds_all)
    acc = accuracy_score(y_true, y_pred)
    try:
        auc = roc_auc_score(y_true, y_prob)
    except ValueError:
        # roc_auc_score raises ValueError when y_true contains a single class
        auc = float("nan")
    return float(np.mean(losses)), acc, auc

# -------- training loop: LR scheduling, checkpointing, and early stopping on val AUC --------
best_auc = 0.0
epochs_no_improve = 0
best_path = str(Path(cfg.save_dir) / "best_iceberg_vessel_cnn.pt")

for ep in range(1, cfg.epochs+1):
    tr_loss, tr_acc, tr_auc = run_epoch(train_loader, train=True)
    va_loss, va_acc, va_auc = run_epoch(val_loader,   train=False)

    if np.isfinite(va_auc):
        scheduler.step(va_auc)

    print(f"Epoch {ep:02d} | "
          f"tr loss {tr_loss:.4f} acc {tr_acc:.3f} auc {tr_auc:.3f} || "
          f"va loss {va_loss:.4f} acc {va_acc:.3f} auc {va_auc:.3f}")

    if va_auc > best_auc + 1e-4:
        best_auc = va_auc
        epochs_no_improve = 0
        torch.save({"model_state": model.state_dict(), "cfg": cfg.__dict__}, best_path)
    else:
        epochs_no_improve += 1
        if epochs_no_improve >= cfg.patience:
            print(f"Early stopping at epoch {ep} (best val AUC={best_auc:.3f}).")
            break

print("Best model saved to:", best_path)


Epoch 01 | tr loss 0.6630 acc 0.562 auc 0.614 || va loss 0.6458 acc 0.636 auc 0.681
Epoch 02 | tr loss 0.6473 acc 0.622 auc 0.645 || va loss 0.6273 acc 0.660 auc 0.688
Epoch 03 | tr loss 0.6220 acc 0.629 auc 0.671 || va loss 0.6159 acc 0.623 auc 0.727
Epoch 04 | tr loss 0.6261 acc 0.640 auc 0.688 || va loss 0.5832 acc 0.701 auc 0.749
Epoch 05 | tr loss 0.5788 acc 0.675 auc 0.747 || va loss 0.5449 acc 0.701 auc 0.757
Epoch 06 | tr loss 0.5368 acc 0.738 auc 0.777 || va loss 0.4808 acc 0.763 auc 0.819
Epoch 07 | tr loss 0.4963 acc 0.761 auc 0.822 || va loss 0.4691 acc 0.757 auc 0.818
Epoch 08 | tr loss 0.4333 acc 0.792 auc 0.847 || va loss 0.4113 acc 0.791 auc 0.877
Epoch 09 | tr loss 0.5107 acc 0.794 auc 0.877 || va loss 0.4542 acc 0.779 auc 0.883
Epoch 10 | tr loss 0.5172 acc 0.735 auc 0.792 || va loss 0.4569 acc 0.785 auc 0.847
Epoch 11 | tr loss 0.4566 acc 0.800 auc 0.864 || va loss 0.4034 acc 0.807 auc 0.885
Epoch 12 | tr loss 0.4465 acc 0.794 auc 0.844 || va loss 0.4257 acc 0.810 au
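
The stopping rule in the loop above can be replayed offline on the logged validation AUCs. This is a minimal sketch: the function name `replay_early_stopping` is hypothetical, the patience and threshold mirror cfg.patience=5 and the 1e-4 margin in the loop, and only epochs 1-11 are used because the epoch 12 line is truncated.

```python
def replay_early_stopping(aucs, patience=5, min_delta=1e-4):
    """Replay the stopping rule over a list of validation AUCs (epochs are 1-indexed).

    Returns (best_auc, best_epoch, stop_epoch); stop_epoch is None if the rule never fires.
    """
    best_auc, best_epoch, no_improve = 0.0, None, 0
    for ep, auc in enumerate(aucs, start=1):
        if auc > best_auc + min_delta:
            best_auc, best_epoch, no_improve = auc, ep, 0   # a checkpoint would be saved here
        else:
            no_improve += 1
            if no_improve >= patience:
                return best_auc, best_epoch, ep
    return best_auc, best_epoch, None

# Validation AUCs for epochs 1-11, copied from the log above
va_aucs = [0.681, 0.688, 0.727, 0.749, 0.757, 0.819, 0.818, 0.877, 0.883, 0.847, 0.885]
print(replay_early_stopping(va_aucs))  # (0.885, 11, None)
```

Over these eleven epochs the counter never exceeds one consecutive non-improving epoch, so the rule does not fire in the visible portion of the log.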

In [11]:
# ===========================
# Load Best & Full Evaluation
# ===========================
ckpt = torch.load(best_path, map_location=device)
model.load_state_dict(ckpt["model_state"])
model.eval()

# Collect validation predictions for metrics & confusion matrix
all_probs, all_preds, all_true = [], [], []
with torch.no_grad():
    for xb, ab, yb in val_loader:
        xb, yb = xb.to(device), yb.to(device)
        ab = ab.to(device) if cfg.use_angle else None
        logits = model(xb, ab) if cfg.use_angle else model(xb)
        prob = torch.softmax(logits, dim=1)[:,1].cpu().numpy()
        pred = (prob >= 0.5).astype(int)
        all_probs.append(prob)
        all_preds.append(pred)
        all_true.append(yb.cpu().numpy())

y_true = np.concatenate(all_true)
y_prob = np.concatenate(all_probs)
y_pred = np.concatenate(all_preds)

acc = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_prob)
cm  = confusion_matrix(y_true, y_pred)

print(f"Validation ACC: {acc:.3f} | AUC: {auc:.3f}")
print("Confusion matrix [[TN, FP],[FN, TP]]:\n", cm)


Validation ACC: 0.875 | AUC: 0.962
Confusion matrix [[TN, FP],[FN, TP]]:
 [[142  28]
 [ 12 139]]
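
The confusion matrix above also determines precision, recall, specificity, and F1, which the cell does not print. The sketch below derives them from the four counts, treating iceberg (label 1) as the positive class. The helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(tn, fp, fn, tp):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Counts taken from the printed matrix [[TN, FP], [FN, TP]]
metrics = binary_metrics(tn=142, fp=28, fn=12, tp=139)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

At the 0.5 threshold the errors are asymmetric: 28 vessels are labelled as icebergs versus 12 icebergs labelled as vessels, so recall for icebergs is higher than precision.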
