<a href="https://colab.research.google.com/github/trie0000/external/blob/main/saikyo_code_20260104_1938.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# -*- coding: utf-8 -*-
"""DL_Basic_2025_Competition_NYUv2_baseline.ipynb

Automatically generated by Colab.

Original file is located at
    https://colab.research.google.com/drive/17t7uAU0aST5aUt6sIJCqFneGlQrxFrJY

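# Deep Learning Basic Course Final Assignment: NYUv2 Semantic Segmentation

## Overview
A semantic segmentation task: given an RGB image, predict which class each pixel in the image belongs to.

### Dataset
- Dataset: NYUv2 dataset
- Training data: 795 images
- Test data: 654 images
- Input: RGB image + depth map (original image sizes vary)
- Output: 13-class segmentation map
- Evaluation metric: Mean IoU (Intersection over Union)

### Dataset details ([NYU Depth Dataset V2](https://cs.nyu.edu/~fergus/datasets/nyu_depth_v2.html))
- The images show indoor scenes and contain objects such as furniture, walls, and floors.
- A 13-class segmentation label is provided for each image.
- The data is provided in the following directory structure: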
```
data/NYUv2/
├─train/
│  ├─image/      # RGB images
│  │    000000.png
│  │    ...
│  │
│  ├─depth/      # Depth maps
│  │    000000.png
│  │    ...
│  │
│  └─label/      # 13-class segmentation (ground-truth labels)
│       000000.png
│       ...
└─test/
   ├─image/      # RGB images
   │    000000.png
   │    ...
   │
   └─depth/      # Depth maps
        000000.png
        ...
```
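
As a rough sketch of how this layout can be traversed, the snippet below pairs each RGB file with the depth file (and, where a `label/` directory exists, the label file) of the same name. The function name `list_samples` and the temporary demo tree are illustrative assumptions, not part of the distributed code.

```python
import os
import tempfile

def list_samples(root, split):
    # Pair each RGB file with the depth (and, if present, label) file of the same name.
    image_dir = os.path.join(root, split, "image")
    depth_dir = os.path.join(root, split, "depth")
    label_dir = os.path.join(root, split, "label")
    has_labels = os.path.isdir(label_dir)
    samples = []
    for name in sorted(os.listdir(image_dir)):
        if not name.endswith(".png"):
            continue
        depth_path = os.path.join(depth_dir, name)
        if not os.path.isfile(depth_path):
            raise FileNotFoundError(depth_path)
        label_path = os.path.join(label_dir, name) if has_labels else None
        samples.append((os.path.join(image_dir, name), depth_path, label_path))
    return samples

# Hypothetical stand-in tree made of empty files, only to exercise the function.
demo_root = tempfile.mkdtemp()
for split, subdirs in (("train", ("image", "depth", "label")), ("test", ("image", "depth"))):
    for sub in subdirs:
        os.makedirs(os.path.join(demo_root, split, sub))
        for name in ("000000.png", "000001.png"):
            open(os.path.join(demo_root, split, sub, name), "wb").close()

train_samples = list_samples(demo_root, "train")
test_samples = list_samples(demo_root, "test")
print(len(train_samples), len(test_samples))
```

Test samples carry `None` in the label slot because the test split has no `label/` directory.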

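### Task details
- The task is to predict, from the input RGB image and depth map, which of the 13 classes each pixel belongs to.
- Evaluation uses Mean IoU.
  - IoU is computed for each class, and the per-class values are averaged.
  - IoU is computed as:
  $$IoU = \frac{TP}{TP + FP + FN}$$
    - TP: True Positive (pixels correctly predicted as the class)
    - FP: False Positive (pixels incorrectly predicted as the class)
    - FN: False Negative (pixels of the class that were missed)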
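
The formula can be illustrated with a small pure-Python sketch. It is not the official scorer: the function name `mean_iou`, the flat-list inputs, and the choice to skip classes that appear in neither the prediction nor the target are all assumptions made for this example.

```python
def mean_iou(pred, target, num_classes=13, ignore_index=255):
    # pred and target are flat sequences of per-pixel class ids.
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # ignored pixels do not count toward any class
        if p == t:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted where it does not belong
            fn[t] += 1  # class t was missed
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:  # assumption: skip classes absent from both pred and target
            ious.append(tp[c] / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with 3 classes and one ignored pixel.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
toy_miou = mean_iou(pred, target, num_classes=3)
print(round(toy_miou, 4))
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.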

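### Preprocessing
- Input images are resized to 512×512.
- Pixel values are normalized to the range 0-1.
- Segmentation labels are integers from 0 to 12 (13 classes).
  - 255 is the ignore index (excluded from evaluation).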

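### Submission format
- Predict a class (0-12) for every pixel of each test image (RGB + Depth) and save the predictions as a NumPy array.
- File name: `submission.npy`
- Array shape: [number of test images, height, width]
- Value of each pixel: an integer from 0 to 12 (the predicted class)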
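
Before uploading, a quick sanity check along these lines may catch format mistakes. This is only a sketch: `check_submission` is a hypothetical helper, the default of 654 images follows the test-set size stated above, and the height and width are deliberately not checked because this section does not fix them.

```python
import os
import tempfile

import numpy as np

def check_submission(path, num_images=654, num_classes=13):
    # Load the saved array and verify rank, image count, dtype, and value range.
    arr = np.load(path)
    assert arr.ndim == 3, f"expected [N, H, W], got shape {arr.shape}"
    assert arr.shape[0] == num_images, f"expected {num_images} images, got {arr.shape[0]}"
    assert np.issubdtype(arr.dtype, np.integer), f"expected integer dtype, got {arr.dtype}"
    assert arr.min() >= 0 and arr.max() < num_classes, f"class ids must be in 0..{num_classes - 1}"
    return arr.shape

# Hypothetical miniature array standing in for real predictions.
demo_path = os.path.join(tempfile.mkdtemp(), "submission.npy")
np.save(demo_path, np.zeros((4, 8, 8), dtype=np.uint8))
demo_shape = check_submission(demo_path, num_images=4)
print(demo_shape)
```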

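## Ideas for improvement
- Fine-tuning a pretrained model
    - Fine-tuning a model pretrained on ImageNet or a similar dataset on this dataset can be expected to improve performance.
- Redesigning the loss function
    - Designing the loss function so that the loss is corrected according to each class's frequency makes training more robust to class imbalance.
- Image preprocessing
    - Adding data augmentation such as RandomResizedCrop / Flip / ColorJitter can be expected to improve generalization.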
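
One possible way to derive frequency-based loss weights is sketched below. Inverse-frequency weighting normalized to a mean of 1 is just one of several common schemes and is an assumption here, as are the function name `inverse_frequency_weights` and the nested-list input format.

```python
from collections import Counter

def inverse_frequency_weights(label_maps, num_classes=13, ignore_index=255):
    # label_maps: iterable of 2D label maps given as nested lists of class ids.
    counts = Counter()
    for label_map in label_maps:
        for row in label_map:
            counts.update(v for v in row if v != ignore_index)
    total = sum(counts[c] for c in range(num_classes))
    # Classes that never appear get weight 0 (an assumption of this sketch).
    raw = [total / counts[c] if counts[c] > 0 else 0.0 for c in range(num_classes)]
    present = [w for w in raw if w > 0]
    if not present:
        return raw
    # Rescale so that the weights of the classes that do appear average to 1.
    scale = len(present) / sum(present)
    return [w * scale for w in raw]

# Toy example: class 0 is common, class 1 is rare, class 2 never appears.
toy_labels = [[[0, 0, 0, 0], [0, 0, 1, 255]]]
toy_weights = inverse_frequency_weights(toy_labels, num_classes=3)
print([round(w, 4) for w in toy_weights])
```

The resulting list could be wrapped in `torch.tensor(...)` and passed as the `weight` argument of `nn.CrossEntropyLoss`, in the same way the training cell passes its hand-set weights.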

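## Conditions for meeting the completion requirement
- The baseline scores 38.2% in the performance evaluation on omnicampus. Therefore, only submissions that exceed the baseline score of 38.2% count toward the completion requirement.
- The organizers have confirmed that improvements to the baseline can raise performance to 50% or more. Use this as one target to aim for.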

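## Notes
- The final prediction model must be one **trained on the distributed training data** (fine-tuning included).
- **Inference that relies only on the knowledge of a pretrained model, without any training, is prohibited.**
(Example: only feeding inputs to an LLM such as ChatGPT and using its outputs as predictions)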

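### Use of pretrained models
Permitted
- **Using a pretrained model as a component**: You may use a pretrained model (BERT, ViT, etc.) as part of an architecture you implement yourself (feature extraction, embedding, etc.).
- **Fine-tuning**: You may fine-tune a pretrained model used for the purpose above.

Prohibited
- **Using a pretrained model built to solve the task**: Using a pretrained model that directly solves the target task, such as those provided by transformers, for inference only or with fine-tuning only is prohibited.
  - Example of a prohibited use: using a pretrained model that directly solves the VQA task on the VQA task.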

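### Data preparation
Because the data was saved to Google Drive when it was downloaded, Google Drive must be mounted before the data can be used. The archive also cannot be extracted on Drive, so copy "data.zip" under the /content directory and extract it there.
This cannot be run unless "data.zip" is placed on Google Drive. If you are able to place "data.zip" (**831MB**) on Google Drive, run "data_download.ipynb" first. If that is difficult, use the omnicampus exercise environment.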
"""

# On omnicampus, the cells up to the 4th one do not need to be run
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# After the data-download notebook saves to Google Drive, the file may take
# a while to appear, so after mounting Google Drive confirm that data.zip
# is in the directory before running this.
# Copy data.zip under /content
!cp "/content/drive/MyDrive/data.zip" "/content"

# Commented out IPython magic to ensure Python compatibility.
# List the files in the current directory
# If data.zip is shown, everything is fine
# %ls

# Extract the data
!unzip data.zip
!mkdir data
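!mv train test data/

"""In the omnicampus exercise environment, skipping the mount, zip, and copy-to-Drive steps of data_download.ipynb leaves the contents of "data.zip" already extracted. You therefore only need to make the directory that contains the data directory the current directory.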


"""

# Commented out IPython magic to ensure Python compatibility.
# For running on omnicampus
# The example below assumes the data directory is under /workspace/Segmentation/split_data_scripts/omnicampus
# %cd /workspace/Segmentation/split_data_scripts_omnicampus

# For running on omnicampus
!pip install h5py scikit-image

"""# import library"""

Mounted at /content/drive
Archive:  data.zip
  inflating: data/train/image/000600.png  
  inflating: data/train/image/000320.png  
  inflating: data/train/image/000491.png  
  inflating: data/train/image/000502.png  
  inflating: data/train/image/000129.png  
  inflating: data/train/image/000044.png  
  inflating: data/train/image/000652.png  
  inflating: data/train/image/000919.png  
  inflating: data/train/image/000528.png  
  inflating: data/train/image/000853.png  
  inflating: data/train/image/000177.png  
  inflating: data/train/image/000584.png  
  inflating: data/train/image/001319.png  
  inflating: data/train/image/000597.png  
  inflating: data/train/image/000223.png  
  inflating: data/train/image/001350.png  
  inflating: data/train/image/000404.png  
  inflating: data/train/image/000488.png  
  inflating: data/train/image/000268.png  
  inflating: data/train/image/000481.png  
  inflating: data/train/image/000341.png  
  inflating: data/train/image/000159.png  
  ...

'# import library'

In [15]:
%ls -l /content/data
%pip install transformers albumentations timm
%pip install -U albumentations

total 8
drwxr-xr-x 4 root root 4096 Jan  4 03:11 test/
drwxr-xr-x 5 root root 4096 Jan  4 03:11 train/


In [30]:
# -*- coding: utf-8 -*-
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import time
import random
import numpy as np
from PIL import Image
from tqdm import tqdm
from zipfile import ZipFile, ZIP_DEFLATED

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import optim
from torch.utils.data import DataLoader, Subset, Dataset
from torchvision import models
from torch.amp import autocast, GradScaler

import albumentations as A

# =========================
# Settings & Seed
# =========================
CLASS_NAMES = ["Bed", "Book", "Ceiling", "Chair", "Floor", "Cabinet", "Object", "Picture", "Sofa", "Desk", "TV", "Wall", "Window"]

def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
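    torch.backends.cudnn.deterministic = True
    # benchmark=True lets cuDNN auto-tune kernels, which can select different
    # algorithms between runs; keep it off when reproducibility is the goal.
    torch.backends.cudnn.benchmark = False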

# =========================
# Helper Functions
# =========================
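def estimate_height_from_depth(depth_np):
    # Rough height proxy, not a metric height: scale depth by the normalized
    # row position (0 at the top row, 1 at the bottom), then rescale to [0, 1].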
    H, W = depth_np.shape
    y_grid = np.linspace(0, 1, H).reshape(H, 1).repeat(W, axis=1).astype(np.float32)
    height_map = y_grid * depth_np
    max_val = height_map.max()
    if max_val > 0:
        height_map /= max_val
    return height_map.astype(np.float32)

# =========================
# Dataset Class (Same as original)
# =========================
class NYUv2Dataset(Dataset):
    def __init__(self, root_dir, split='train', transform=None, return_label=True):
        self.split = split
        self.transform = transform
        self.return_label = return_label
        src_split = 'train' if split in ['train', 'val'] else 'test'
        self.images_dir = os.path.join(root_dir, src_split, 'image')
        self.depths_dir = os.path.join(root_dir, src_split, 'depth')
        self.labels_dir = os.path.join(root_dir, src_split, 'label') if src_split == 'train' else None
        self.filenames = sorted([f for f in os.listdir(self.images_dir) if f.endswith('.png')])

        self.mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
        self.std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
        self.d_mean, self.d_std = 0.5, 0.25

    def __len__(self): return len(self.filenames)

    def __getitem__(self, idx):
        fname = self.filenames[idx]
        rgb = np.array(Image.open(os.path.join(self.images_dir, fname)).convert('RGB'))
        depth = np.array(Image.open(os.path.join(self.depths_dir, fname)))
        if len(depth.shape) == 3: depth = depth[:, :, 0]

        # Scale to [0, 1]: treat the file as 16-bit if any value exceeds 255, else as 8-bit.
        depth = depth.astype(np.float32) / (65535.0 if depth.max() > 255 else 255.0)
        h_map = estimate_height_from_depth(depth)

        label = np.array(Image.open(os.path.join(self.labels_dir, fname))) if self.labels_dir and self.return_label else np.zeros(depth.shape, dtype=np.int32)

        if self.transform:
            augmented = self.transform(image=rgb, depth=depth, height=h_map, mask=label)
            rgb, depth, h_map, label = augmented['image'], augmented['depth'], augmented['height'], augmented['mask']

        rgb = (rgb.astype(np.float32) / 255.0 - self.mean) / self.std
        depth = (depth - self.d_mean) / self.d_std
        h_map = (h_map - self.d_mean) / self.d_std

        rgb_t = torch.from_numpy(rgb.transpose(2, 0, 1)).float()
        depth_t = torch.from_numpy(depth).unsqueeze(0).float()
        h_t = torch.from_numpy(h_map).unsqueeze(0).float()

        input_tensor = torch.cat([rgb_t, depth_t, h_t], dim=0)
        return (input_tensor, fname) if self.split == 'test' else (input_tensor, torch.from_numpy(label).long())

# =========================
# Losses (Same as original)
# =========================
class DiceLoss(nn.Module):
    def __init__(self, n_classes=13, smooth=1e-5, ignore_index=255):
        super().__init__()
        self.n_classes = n_classes
        self.smooth = smooth
        self.ignore_index = ignore_index

    def forward(self, pred, target):
        pred = F.softmax(pred, dim=1)
        mask = (target != self.ignore_index)
        target_valid = target.clone()
        target_valid[~mask] = 0
        target_one_hot = F.one_hot(target_valid, self.n_classes).permute(0, 3, 1, 2).float()

        m_expand = mask.unsqueeze(1).expand_as(pred)
        pred, target_one_hot = pred * m_expand, target_one_hot * m_expand

        intersection = (pred * target_one_hot).sum(dim=(2, 3))
        union = pred.sum(dim=(2, 3)) + target_one_hot.sum(dim=(2, 3))
        dice = (2. * intersection + self.smooth) / (union + self.smooth)
        return 1 - dice.mean()

# =========================
# Model Definition (Same as original)
# =========================
class ConvBNReLU(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, p=1, d=1):
        super().__init__()
        self.block = nn.Sequential(nn.Conv2d(in_ch, out_ch, k, padding=p, dilation=d, bias=False), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
    def forward(self, x): return self.block(x)

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch=256):
        super().__init__()
        self.b1 = ConvBNReLU(in_ch, out_ch, 1, 0)
        self.b2 = ConvBNReLU(in_ch, out_ch, 3, 6, 6)
        self.b3 = ConvBNReLU(in_ch, out_ch, 3, 12, 12)
        self.b4 = ConvBNReLU(in_ch, out_ch, 3, 18, 18)
        self.b5 = nn.Sequential(nn.AdaptiveAvgPool2d(1), ConvBNReLU(in_ch, out_ch, 1, 0))
        self.proj = nn.Sequential(ConvBNReLU(out_ch * 5, out_ch, 1, 0), nn.Dropout(0.1))
    def forward(self, x):
        h, w = x.shape[2:]
        feat1, feat2, feat3, feat4 = self.b1(x), self.b2(x), self.b3(x), self.b4(x)
        feat5 = F.interpolate(self.b5(x), size=(h, w), mode='bilinear', align_corners=False)
        return self.proj(torch.cat([feat1, feat2, feat3, feat4, feat5], dim=1))

class ResNeXtDeepLabV3Plus(nn.Module):
    def __init__(self, num_classes=13, in_channels=5):
        super().__init__()
        backbone = models.resnext101_32x8d(weights=models.ResNeXt101_32X8D_Weights.IMAGENET1K_V1)

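        # Widen the pretrained 3-channel stem to `in_channels` inputs: the RGB filters are
        # copied, and the extra channels (depth, height) start from the RGB filter mean.
        old_conv = backbone.conv1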
        new_conv = nn.Conv2d(in_channels, old_conv.out_channels, 7, stride=2, padding=3, bias=False)
        with torch.no_grad():
            new_conv.weight[:, :3] = old_conv.weight
            new_conv.weight[:, 3:] = old_conv.weight.mean(dim=1, keepdim=True)

        self.enc0 = nn.Sequential(new_conv, backbone.bn1, backbone.relu)
        self.pool = backbone.maxpool
        self.enc1, self.enc2, self.enc3, self.enc4 = backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4

        self.aspp = ASPP(2048, 256)
        self.low_proj = nn.Sequential(nn.Conv2d(256, 48, 1, bias=False), nn.BatchNorm2d(48), nn.ReLU(inplace=True))
        self.decoder = nn.Sequential(ConvBNReLU(304, 256, 3, 1), ConvBNReLU(256, 256, 3, 1), nn.Conv2d(256, num_classes, 1))

    def forward(self, x):
        h, w = x.shape[2:]
        x = self.enc0(x)
        x = self.pool(x)
        low_level = self.enc1(x)
        x = self.enc4(self.enc3(self.enc2(low_level)))
        x = F.interpolate(self.aspp(x), size=low_level.shape[2:], mode='bilinear', align_corners=False)
        x = self.decoder(torch.cat([x, self.low_proj(low_level)], dim=1))
        return F.interpolate(x, size=(h, w), mode='bilinear', align_corners=False)

# =========================
# Main Training Loop
# =========================
def main():
    dataset_root = "/content/data"
    img_size = 768
    batch_size = 16
    epochs = 50
    lr = 1e-4
    set_seed(42)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    timestamp = time.strftime("%Y%m%d%H%M%S")
    best_model_path = f"best_model_v2_{timestamp}.pt"

    # Transforms (Same as original)
    train_aug = A.Compose([
        A.HorizontalFlip(p=0.5),
        A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.8, rotate_limit=15, p=0.7, border_mode=0),
        A.ColorJitter(p=0.5),
        A.Resize(img_size, img_size)
    ], additional_targets={'depth': 'image', 'height': 'image'})

    val_aug = A.Compose([A.Resize(img_size, img_size)], additional_targets={'depth': 'image', 'height': 'image'})

    # =====================
    # DataLoaders (★ BUG FIX: Proper train/val split)
    # =====================
    full_dataset = NYUv2Dataset(dataset_root, split='train', transform=None)
    n_total = len(full_dataset)
    n_val = int(n_total * 0.1)
    n_train = n_total - n_val

    # Reproducible split
    indices = list(range(n_total))
    random.shuffle(indices)
    train_indices = indices[:n_train]
    val_indices = indices[n_train:]

    # Create separate dataset instances with different transforms
    train_dataset = NYUv2Dataset(dataset_root, split='train', transform=train_aug)
    val_dataset = NYUv2Dataset(dataset_root, split='train', transform=val_aug)

    train_loader = DataLoader(
        Subset(train_dataset, train_indices),
        batch_size=batch_size, shuffle=True, num_workers=8, pin_memory=True, drop_last=True
    )
    # ★ FIX: Use val_indices instead of entire dataset
    val_loader = DataLoader(
        Subset(val_dataset, val_indices),
        batch_size=8, shuffle=False, num_workers=4, pin_memory=True
    )
    test_loader = DataLoader(
        NYUv2Dataset(dataset_root, split='test', transform=val_aug, return_label=False),
        batch_size=1
    )

    print(f"Train: {len(train_indices)}, Val: {len(val_indices)}, Test: {len(test_loader.dataset)}")

    # Weights (Same as original)
    weights = torch.tensor([1.0, 5.5, 0.6, 2.2, 0.6, 1.0, 1.8, 1.5, 1.5, 3.0, 3.0, 0.5, 1.0]).to(device)

    model = ResNeXtDeepLabV3Plus().to(device)
    criterion_ce = nn.CrossEntropyLoss(weight=weights, ignore_index=255)
    criterion_dice = DiceLoss()
    optimizer = optim.AdamW(model.parameters(), lr=lr, weight_decay=1e-3)
    scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    scaler = GradScaler("cuda")

    best_miou = 0.0
    for epoch in range(epochs):
        model.train()
        for x, label in tqdm(train_loader, desc=f"Epoch {epoch+1}"):
            x, label = x.to(device), label.to(device)
            optimizer.zero_grad(set_to_none=True)
            with autocast("cuda"):
                out = model(x)
                loss = 0.4 * criterion_ce(out, label) + 0.6 * criterion_dice(out, label)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
        scheduler.step()

        # Validation
        model.eval()
        cm = torch.zeros((13, 13), dtype=torch.long)
        with torch.no_grad():
            for x, label in val_loader:
                pred = model(x.to(device)).argmax(1).cpu().view(-1)
                target = label.view(-1)
                mask = (target != 255)
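                # Encode each (target, pred) pair as one index so bincount fills the
                # 13x13 confusion matrix (rows: target, columns: prediction).
                cm += torch.bincount(target[mask] * 13 + pred[mask], minlength=169).view(13, 13)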

        tp = torch.diag(cm).float()
        iou = tp / (cm.sum(0) + cm.sum(1) - tp + 1e-8)
        miou = iou.mean().item()
        print(f"Epoch {epoch+1}: mIoU={miou:.4f}")
        print(f"  Per-class: {', '.join([f'{CLASS_NAMES[i]}:{iou[i]:.3f}' for i in range(13)])}")

        if miou > best_miou:
            best_miou = miou
            torch.save(model.state_dict(), best_model_path)
            print(f"  -> New best: {best_miou:.4f}")

    # Submission
    print("Generating submission...")
    model.load_state_dict(torch.load(best_model_path))
    model.eval()
    preds = []
    with torch.no_grad():
        for x, _ in tqdm(test_loader):
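            # Test-time augmentation: sum logits over two scales and a horizontal flip,
            # accumulated at the 512x512 output resolution.
            acc = torch.zeros((1, 13, 512, 512), device=device)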
            for s in [1.0, 1.25]:
                h_s, w_s = int(img_size*s), int(img_size*s)
                x_s = F.interpolate(x.to(device), size=(h_s, w_s), mode='bilinear')
                out = model(x_s) + torch.flip(model(torch.flip(x_s, [3])), [3])
                acc += F.interpolate(out, size=(512, 512), mode='bilinear')
            preds.append(acc.argmax(1).cpu().numpy().astype(np.uint8))

    submission = np.concatenate(preds, axis=0)
    np.save("submission.npy", submission)
    with ZipFile("submission.zip", "w", ZIP_DEFLATED) as zf:
        zf.write("submission.npy")
    print(f"Final shape: {submission.shape}, Best mIoU: {best_miou:.4f}")

if __name__ == "__main__": main()

Train: 716, Val: 79, Test: 654


Epoch 1: 100%|██████████| 44/44 [00:21<00:00,  2.03it/s]


Epoch 1: mIoU=0.3188
  Per-class: Bed:0.511, Book:0.094, Ceiling:0.000, Chair:0.355, Floor:0.767, Cabinet:0.291, Object:0.363, Picture:0.376, Sofa:0.072, Desk:0.239, TV:0.080, Wall:0.635, Window:0.361
  -> New best: 0.3188


Epoch 2: 100%|██████████| 44/44 [00:20<00:00,  2.11it/s]


Epoch 2: mIoU=0.3753
  Per-class: Bed:0.520, Book:0.117, Ceiling:0.000, Chair:0.373, Floor:0.820, Cabinet:0.315, Object:0.394, Picture:0.402, Sofa:0.244, Desk:0.284, TV:0.275, Wall:0.674, Window:0.462
  -> New best: 0.3753


Epoch 3: 100%|██████████| 44/44 [00:21<00:00,  2.04it/s]


Epoch 3: mIoU=0.4302
  Per-class: Bed:0.411, Book:0.120, Ceiling:0.206, Chair:0.498, Floor:0.856, Cabinet:0.394, Object:0.364, Picture:0.445, Sofa:0.483, Desk:0.316, TV:0.270, Wall:0.714, Window:0.514
  -> New best: 0.4302


Epoch 4: 100%|██████████| 44/44 [00:21<00:00,  2.03it/s]


Epoch 4: mIoU=0.4839
  Per-class: Bed:0.737, Book:0.122, Ceiling:0.458, Chair:0.476, Floor:0.875, Cabinet:0.357, Object:0.393, Picture:0.492, Sofa:0.385, Desk:0.351, TV:0.517, Wall:0.664, Window:0.466
  -> New best: 0.4839


Epoch 5: 100%|██████████| 44/44 [00:21<00:00,  2.03it/s]


Epoch 5: mIoU=0.5636
  Per-class: Bed:0.730, Book:0.172, Ceiling:0.584, Chair:0.535, Floor:0.879, Cabinet:0.509, Object:0.507, Picture:0.502, Sofa:0.554, Desk:0.364, TV:0.633, Wall:0.773, Window:0.585
  -> New best: 0.5636


Epoch 6: 100%|██████████| 44/44 [00:21<00:00,  2.04it/s]


Epoch 6: mIoU=0.5105
  Per-class: Bed:0.615, Book:0.128, Ceiling:0.566, Chair:0.567, Floor:0.863, Cabinet:0.394, Object:0.465, Picture:0.552, Sofa:0.402, Desk:0.326, TV:0.562, Wall:0.682, Window:0.513


Epoch 7: 100%|██████████| 44/44 [00:20<00:00,  2.10it/s]


Epoch 7: mIoU=0.5270
  Per-class: Bed:0.718, Book:0.175, Ceiling:0.555, Chair:0.568, Floor:0.891, Cabinet:0.475, Object:0.476, Picture:0.575, Sofa:0.431, Desk:0.362, TV:0.344, Wall:0.740, Window:0.541


Epoch 8: 100%|██████████| 44/44 [00:21<00:00,  2.08it/s]


Epoch 8: mIoU=0.5492
  Per-class: Bed:0.661, Book:0.153, Ceiling:0.512, Chair:0.538, Floor:0.900, Cabinet:0.499, Object:0.497, Picture:0.576, Sofa:0.560, Desk:0.339, TV:0.534, Wall:0.792, Window:0.578


Epoch 9: 100%|██████████| 44/44 [00:20<00:00,  2.10it/s]


Epoch 9: mIoU=0.5484
  Per-class: Bed:0.723, Book:0.167, Ceiling:0.475, Chair:0.555, Floor:0.890, Cabinet:0.482, Object:0.465, Picture:0.473, Sofa:0.541, Desk:0.368, TV:0.649, Wall:0.770, Window:0.571


Epoch 10: 100%|██████████| 44/44 [00:21<00:00,  2.05it/s]


Epoch 10: mIoU=0.5394
  Per-class: Bed:0.748, Book:0.144, Ceiling:0.532, Chair:0.495, Floor:0.858, Cabinet:0.451, Object:0.495, Picture:0.455, Sofa:0.466, Desk:0.387, TV:0.542, Wall:0.803, Window:0.639


Epoch 11: 100%|██████████| 44/44 [00:21<00:00,  2.02it/s]


Epoch 11: mIoU=0.5795
  Per-class: Bed:0.717, Book:0.183, Ceiling:0.478, Chair:0.513, Floor:0.890, Cabinet:0.531, Object:0.464, Picture:0.563, Sofa:0.556, Desk:0.373, TV:0.741, Wall:0.815, Window:0.708
  -> New best: 0.5795


Epoch 12: 100%|██████████| 44/44 [00:21<00:00,  2.02it/s]


Epoch 12: mIoU=0.5880
  Per-class: Bed:0.765, Book:0.186, Ceiling:0.467, Chair:0.621, Floor:0.907, Cabinet:0.536, Object:0.539, Picture:0.606, Sofa:0.626, Desk:0.354, TV:0.586, Wall:0.802, Window:0.649
  -> New best: 0.5880


Epoch 13: 100%|██████████| 44/44 [00:21<00:00,  2.08it/s]


Epoch 13: mIoU=0.5725
  Per-class: Bed:0.740, Book:0.173, Ceiling:0.584, Chair:0.601, Floor:0.906, Cabinet:0.523, Object:0.505, Picture:0.522, Sofa:0.581, Desk:0.390, TV:0.520, Wall:0.800, Window:0.597


Epoch 14: 100%|██████████| 44/44 [00:21<00:00,  2.06it/s]


Epoch 14: mIoU=0.6013
  Per-class: Bed:0.773, Book:0.200, Ceiling:0.577, Chair:0.535, Floor:0.902, Cabinet:0.562, Object:0.514, Picture:0.630, Sofa:0.581, Desk:0.405, TV:0.694, Wall:0.789, Window:0.655
  -> New best: 0.6013


Epoch 15: 100%|██████████| 44/44 [00:21<00:00,  2.07it/s]


Epoch 15: mIoU=0.5824
  Per-class: Bed:0.792, Book:0.167, Ceiling:0.604, Chair:0.590, Floor:0.874, Cabinet:0.470, Object:0.534, Picture:0.501, Sofa:0.610, Desk:0.353, TV:0.587, Wall:0.801, Window:0.688


Epoch 16: 100%|██████████| 44/44 [00:21<00:00,  2.03it/s]


Epoch 16: mIoU=0.5812
  Per-class: Bed:0.712, Book:0.136, Ceiling:0.604, Chair:0.549, Floor:0.883, Cabinet:0.464, Object:0.511, Picture:0.655, Sofa:0.609, Desk:0.352, TV:0.618, Wall:0.804, Window:0.661


Epoch 17: 100%|██████████| 44/44 [00:21<00:00,  2.07it/s]


Epoch 17: mIoU=0.6382
  Per-class: Bed:0.801, Book:0.197, Ceiling:0.592, Chair:0.620, Floor:0.905, Cabinet:0.572, Object:0.565, Picture:0.638, Sofa:0.635, Desk:0.402, TV:0.762, Wall:0.842, Window:0.766
  -> New best: 0.6382


Epoch 18: 100%|██████████| 44/44 [00:20<00:00,  2.11it/s]


Epoch 18: mIoU=0.6106
  Per-class: Bed:0.728, Book:0.204, Ceiling:0.633, Chair:0.646, Floor:0.909, Cabinet:0.573, Object:0.497, Picture:0.566, Sofa:0.665, Desk:0.375, TV:0.625, Wall:0.828, Window:0.691


Epoch 19: 100%|██████████| 44/44 [00:21<00:00,  2.06it/s]


Epoch 19: mIoU=0.6492
  Per-class: Bed:0.769, Book:0.208, Ceiling:0.623, Chair:0.661, Floor:0.911, Cabinet:0.603, Object:0.556, Picture:0.656, Sofa:0.674, Desk:0.417, TV:0.764, Wall:0.858, Window:0.737
  -> New best: 0.6492


Epoch 20: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 20: mIoU=0.6205
  Per-class: Bed:0.788, Book:0.178, Ceiling:0.649, Chair:0.575, Floor:0.915, Cabinet:0.581, Object:0.546, Picture:0.631, Sofa:0.661, Desk:0.403, TV:0.562, Wall:0.851, Window:0.726


Epoch 21: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 21: mIoU=0.6298
  Per-class: Bed:0.761, Book:0.160, Ceiling:0.652, Chair:0.615, Floor:0.907, Cabinet:0.565, Object:0.551, Picture:0.663, Sofa:0.662, Desk:0.376, TV:0.680, Wall:0.855, Window:0.741


Epoch 22: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 22: mIoU=0.6361
  Per-class: Bed:0.783, Book:0.184, Ceiling:0.642, Chair:0.637, Floor:0.912, Cabinet:0.612, Object:0.562, Picture:0.612, Sofa:0.674, Desk:0.415, TV:0.636, Wall:0.856, Window:0.745


Epoch 23: 100%|██████████| 44/44 [00:20<00:00,  2.10it/s]


Epoch 23: mIoU=0.6368
  Per-class: Bed:0.777, Book:0.165, Ceiling:0.642, Chair:0.629, Floor:0.916, Cabinet:0.590, Object:0.576, Picture:0.653, Sofa:0.685, Desk:0.404, TV:0.641, Wall:0.851, Window:0.749


Epoch 24: 100%|██████████| 44/44 [00:20<00:00,  2.11it/s]


Epoch 24: mIoU=0.6333
  Per-class: Bed:0.780, Book:0.163, Ceiling:0.701, Chair:0.632, Floor:0.917, Cabinet:0.588, Object:0.578, Picture:0.668, Sofa:0.622, Desk:0.416, TV:0.606, Wall:0.843, Window:0.718


Epoch 25: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 25: mIoU=0.6500
  Per-class: Bed:0.752, Book:0.209, Ceiling:0.696, Chair:0.646, Floor:0.921, Cabinet:0.606, Object:0.542, Picture:0.680, Sofa:0.649, Desk:0.428, TV:0.681, Wall:0.867, Window:0.772
  -> New best: 0.6500


Epoch 26: 100%|██████████| 44/44 [00:20<00:00,  2.11it/s]


Epoch 26: mIoU=0.6375
  Per-class: Bed:0.773, Book:0.169, Ceiling:0.658, Chair:0.662, Floor:0.922, Cabinet:0.597, Object:0.578, Picture:0.667, Sofa:0.648, Desk:0.399, TV:0.611, Wall:0.856, Window:0.746


Epoch 27: 100%|██████████| 44/44 [00:21<00:00,  2.06it/s]


Epoch 27: mIoU=0.6440
  Per-class: Bed:0.777, Book:0.188, Ceiling:0.686, Chair:0.639, Floor:0.924, Cabinet:0.591, Object:0.576, Picture:0.643, Sofa:0.654, Desk:0.432, TV:0.655, Wall:0.858, Window:0.751


Epoch 28: 100%|██████████| 44/44 [00:21<00:00,  2.02it/s]


Epoch 28: mIoU=0.6484
  Per-class: Bed:0.760, Book:0.188, Ceiling:0.673, Chair:0.651, Floor:0.919, Cabinet:0.603, Object:0.581, Picture:0.657, Sofa:0.641, Desk:0.432, TV:0.703, Wall:0.866, Window:0.754


Epoch 29: 100%|██████████| 44/44 [00:21<00:00,  2.04it/s]


Epoch 29: mIoU=0.6522
  Per-class: Bed:0.783, Book:0.196, Ceiling:0.669, Chair:0.634, Floor:0.923, Cabinet:0.613, Object:0.584, Picture:0.660, Sofa:0.629, Desk:0.413, TV:0.749, Wall:0.863, Window:0.763
  -> New best: 0.6522


Epoch 30: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 30: mIoU=0.6495
  Per-class: Bed:0.790, Book:0.169, Ceiling:0.690, Chair:0.666, Floor:0.923, Cabinet:0.590, Object:0.566, Picture:0.679, Sofa:0.681, Desk:0.425, TV:0.642, Wall:0.857, Window:0.768


Epoch 31: 100%|██████████| 44/44 [00:20<00:00,  2.10it/s]


Epoch 31: mIoU=0.6498
  Per-class: Bed:0.777, Book:0.174, Ceiling:0.699, Chair:0.655, Floor:0.925, Cabinet:0.602, Object:0.571, Picture:0.676, Sofa:0.580, Desk:0.435, TV:0.717, Wall:0.867, Window:0.770


Epoch 32: 100%|██████████| 44/44 [00:21<00:00,  2.02it/s]


Epoch 32: mIoU=0.6401
  Per-class: Bed:0.774, Book:0.192, Ceiling:0.659, Chair:0.628, Floor:0.926, Cabinet:0.608, Object:0.573, Picture:0.671, Sofa:0.566, Desk:0.416, TV:0.691, Wall:0.861, Window:0.756


Epoch 33: 100%|██████████| 44/44 [00:21<00:00,  2.06it/s]


Epoch 33: mIoU=0.6592
  Per-class: Bed:0.788, Book:0.183, Ceiling:0.707, Chair:0.677, Floor:0.928, Cabinet:0.618, Object:0.589, Picture:0.683, Sofa:0.676, Desk:0.440, TV:0.676, Wall:0.860, Window:0.745
  -> New best: 0.6592


Epoch 34: 100%|██████████| 44/44 [00:21<00:00,  2.05it/s]


Epoch 34: mIoU=0.6477
  Per-class: Bed:0.757, Book:0.168, Ceiling:0.696, Chair:0.667, Floor:0.927, Cabinet:0.602, Object:0.585, Picture:0.685, Sofa:0.657, Desk:0.427, TV:0.644, Wall:0.862, Window:0.745


Epoch 35: 100%|██████████| 44/44 [00:21<00:00,  2.04it/s]


Epoch 35: mIoU=0.6594
  Per-class: Bed:0.788, Book:0.192, Ceiling:0.678, Chair:0.664, Floor:0.930, Cabinet:0.617, Object:0.581, Picture:0.689, Sofa:0.663, Desk:0.427, TV:0.719, Wall:0.867, Window:0.757
  -> New best: 0.6594


Epoch 36: 100%|██████████| 44/44 [00:20<00:00,  2.12it/s]


Epoch 36: mIoU=0.6648
  Per-class: Bed:0.812, Book:0.185, Ceiling:0.681, Chair:0.678, Floor:0.929, Cabinet:0.631, Object:0.580, Picture:0.699, Sofa:0.674, Desk:0.437, TV:0.717, Wall:0.868, Window:0.750
  -> New best: 0.6648


Epoch 37: 100%|██████████| 44/44 [00:21<00:00,  2.08it/s]


Epoch 37: mIoU=0.6578
  Per-class: Bed:0.791, Book:0.180, Ceiling:0.671, Chair:0.657, Floor:0.929, Cabinet:0.622, Object:0.583, Picture:0.706, Sofa:0.614, Desk:0.442, TV:0.725, Wall:0.869, Window:0.762


Epoch 38: 100%|██████████| 44/44 [00:21<00:00,  2.06it/s]


Epoch 38: mIoU=0.6655
  Per-class: Bed:0.806, Book:0.200, Ceiling:0.667, Chair:0.662, Floor:0.931, Cabinet:0.616, Object:0.585, Picture:0.699, Sofa:0.673, Desk:0.437, TV:0.744, Wall:0.873, Window:0.758
  -> New best: 0.6655


Epoch 39: 100%|██████████| 44/44 [00:21<00:00,  2.07it/s]


Epoch 39: mIoU=0.6645
  Per-class: Bed:0.801, Book:0.184, Ceiling:0.686, Chair:0.670, Floor:0.929, Cabinet:0.622, Object:0.586, Picture:0.697, Sofa:0.661, Desk:0.426, TV:0.751, Wall:0.869, Window:0.756


Epoch 40: 100%|██████████| 44/44 [00:21<00:00,  2.01it/s]


Epoch 40: mIoU=0.6689
  Per-class: Bed:0.810, Book:0.188, Ceiling:0.687, Chair:0.679, Floor:0.930, Cabinet:0.629, Object:0.590, Picture:0.698, Sofa:0.664, Desk:0.431, TV:0.767, Wall:0.868, Window:0.754
  -> New best: 0.6689


Epoch 41: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 41: mIoU=0.6652
  Per-class: Bed:0.801, Book:0.186, Ceiling:0.679, Chair:0.683, Floor:0.930, Cabinet:0.619, Object:0.580, Picture:0.698, Sofa:0.674, Desk:0.435, TV:0.733, Wall:0.870, Window:0.760


Epoch 42: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 42: mIoU=0.6635
  Per-class: Bed:0.793, Book:0.181, Ceiling:0.684, Chair:0.675, Floor:0.930, Cabinet:0.616, Object:0.588, Picture:0.703, Sofa:0.661, Desk:0.426, TV:0.751, Wall:0.867, Window:0.751


Epoch 43: 100%|██████████| 44/44 [00:20<00:00,  2.10it/s]


Epoch 43: mIoU=0.6632
  Per-class: Bed:0.805, Book:0.179, Ceiling:0.684, Chair:0.675, Floor:0.930, Cabinet:0.617, Object:0.587, Picture:0.702, Sofa:0.652, Desk:0.429, TV:0.750, Wall:0.865, Window:0.747


Epoch 44: 100%|██████████| 44/44 [00:21<00:00,  2.02it/s]


Epoch 44: mIoU=0.6683
  Per-class: Bed:0.807, Book:0.191, Ceiling:0.681, Chair:0.685, Floor:0.930, Cabinet:0.626, Object:0.591, Picture:0.705, Sofa:0.666, Desk:0.430, TV:0.752, Wall:0.867, Window:0.757


Epoch 45: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 45: mIoU=0.6657
  Per-class: Bed:0.812, Book:0.179, Ceiling:0.677, Chair:0.683, Floor:0.930, Cabinet:0.629, Object:0.591, Picture:0.701, Sofa:0.669, Desk:0.427, TV:0.733, Wall:0.868, Window:0.755


Epoch 46: 100%|██████████| 44/44 [00:21<00:00,  2.02it/s]


Epoch 46: mIoU=0.6639
  Per-class: Bed:0.807, Book:0.180, Ceiling:0.678, Chair:0.679, Floor:0.930, Cabinet:0.627, Object:0.586, Picture:0.705, Sofa:0.657, Desk:0.426, TV:0.737, Wall:0.868, Window:0.751


Epoch 47: 100%|██████████| 44/44 [00:21<00:00,  2.06it/s]


Epoch 47: mIoU=0.6658
  Per-class: Bed:0.815, Book:0.179, Ceiling:0.679, Chair:0.679, Floor:0.929, Cabinet:0.626, Object:0.589, Picture:0.710, Sofa:0.655, Desk:0.427, TV:0.742, Wall:0.868, Window:0.758


Epoch 48: 100%|██████████| 44/44 [00:21<00:00,  2.09it/s]


Epoch 48: mIoU=0.6667
  Per-class: Bed:0.814, Book:0.174, Ceiling:0.678, Chair:0.682, Floor:0.929, Cabinet:0.628, Object:0.589, Picture:0.709, Sofa:0.662, Desk:0.426, TV:0.749, Wall:0.868, Window:0.759


Epoch 49: 100%|██████████| 44/44 [00:21<00:00,  2.07it/s]


Epoch 49: mIoU=0.6677
  Per-class: Bed:0.808, Book:0.184, Ceiling:0.679, Chair:0.681, Floor:0.929, Cabinet:0.625, Object:0.591, Picture:0.708, Sofa:0.670, Desk:0.426, TV:0.746, Wall:0.868, Window:0.765


Epoch 50: 100%|██████████| 44/44 [00:21<00:00,  2.08it/s]


Epoch 50: mIoU=0.6669
  Per-class: Bed:0.817, Book:0.181, Ceiling:0.682, Chair:0.684, Floor:0.930, Cabinet:0.626, Object:0.591, Picture:0.711, Sofa:0.659, Desk:0.424, TV:0.736, Wall:0.870, Window:0.761
Generating submission...


100%|██████████| 654/654 [01:32<00:00,  7.09it/s]


Final shape: (654, 512, 512), Best mIoU: 0.6689


In [None]:
# ------------------
#    Evaluation
# ------------------

model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()

# Generate predictions
predictions = []

with torch.no_grad():
    print("Generating predictions...")
    for image, depth in tqdm(test_data):
        image, depth = image.to(device), depth.to(device)
        x = torch.cat((image, depth), dim=1)
        output = model(x)            # [Batch, num_classes, H, W]
        pred = output.argmax(dim=1)  # [Batch, H, W]
        predictions.append(pred.cpu())
predictions = torch.cat(predictions, dim=0)

predictions = predictions.cpu().numpy()
np.save('submission.npy', predictions)
print("Predictions saved to submission.npy")

"""## How to submit

Zip the following three items and submit them on Omnicampus under "Final Assignment (Segmentation)".

- `submission.npy`
- The weights used for testing, such as `model.pt` or `model_best.pt` (`.pt` extension only)
- This Colab notebook
"""

from zipfile import ZipFile, ZIP_DEFLATED

notebook_path = "/content/drive/MyDrive/Colab Notebooks/DL_Basic_2025_Competition_NYUv2_baseline.ipynb"

with ZipFile("submission.zip",
             mode="w",
             compression=ZIP_DEFLATED,
             compresslevel=9) as zf:
    zf.write("submission.npy")
    zf.write(model_path)
    zf.write(notebook_path,
             arcname="DL_Basic_2025_Competition_NYUv2_baseline.ipynb")