- S–phases = SegFormer
- A–phases = UNet-Albedo (`UNetAlbedo` in code)
- M–phases = UNet-Maps (`UNetMaps` in code)
`cond_ch = 512` ⇒ FiLM conditioning enabled from the first epoch for both U-Nets.
Class weights = 1 / √freq(class); `WeightedRandomSampler` active in every phase except the final hi-res stages.
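
The weighting rule above can be sketched in plain Python. This is a minimal illustration, assuming `freq(class)` means the class's relative frequency in the label list (raw counts would only rescale the weights); the function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def inverse_sqrt_class_weights(labels):
    """Return {class: 1 / sqrt(freq)}, with freq the class's share of `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return {c: 1.0 / math.sqrt(n / total) for c, n in counts.items()}


def per_sample_weights(labels, class_weights):
    """One weight per sample, in the same order as `labels`."""
    return [class_weights[c] for c in labels]


# Toy label list (hypothetical proportions, not taken from the datasets).
labels = ["wood"] * 64 + ["stone"] * 32 + ["metal"] * 4
class_w = inverse_sqrt_class_weights(labels)
sample_w = per_sample_weights(labels, class_w)
```

The per-sample list is the kind of input `torch.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True)` expects, and the per-class values can be passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`.
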
Phase | Dataset mix | Trainables | Crop / Feed | Augment† | Epochs | Opt & LR | Scheduler | Loss |
---|---|---|---|---|---|---|---|---|
S1 | 100 % MatSynth | enc + dec | 256 | none (first 10 ep), then flips · rot · colour · composite (30 %/15 %) | 55 | AdamW 1e‑4→1e‑5 | OneCycle | CE (+√freq) |
S2 | 75 % Mat / 25 % Sky | heads + LoRA | 512 | S1 + SkyPhotometric 0.6 + composite (25 %/10 %) | 10 | AdamW 1e‑5 | cosine‑10, η_min = 2e‑6 | CE + masked‑CE (Sky p>0.8) |
S3 | 50 % / 50 % | top‑½ enc + heads | 768 | SkyPhotometric 0.6 | 10 | AdamW 5e‑6 | cosine‑12 | same |
S4 | 50 % / 50 % | BN/LN only | 1024 | none | 2 | AdamW 3e‑6 | cosine‑restart | CE |
S5 | 50 % / 50 % | dec‑head + LoRA | 1024 | SkyPhotometric 0.5 | 8 | AdamW 1e‑6 | cosine‑8 | CE |
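
To make the cosine entries in the Scheduler column concrete, here is a small sketch of the standard cosine-annealing formula. It assumes "cosine‑10" means an annealing period of 10 epochs and that the η value in the S2 row is the floor learning rate; both readings are interpretations of the table, and `cosine_lr` is a hypothetical helper.

```python
import math


def cosine_lr(epoch, base_lr, eta_min, t_max):
    """Cosine-annealed learning rate at `epoch`, for 0 <= epoch <= t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


# S2 row, read as base LR 1e-5, floor 2e-6, period 10 epochs (assumption).
s2_schedule = [cosine_lr(e, 1e-5, 2e-6, 10) for e in range(11)]
```

This is the same closed form used by `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max` and `eta_min`: the rate starts at the base value, passes the midpoint halfway through, and ends at the floor.
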
Phase | Dataset mix | Trainables | Crop / Feed (px) | Augment† | Epochs | Optimiser & LR (per‑group) | Scheduler | Loss |
---|---|---|---|---|---|---|---|---|
A1 | 50 % Mat / 50 % Sky | full UNet + FiLM | 256 | flips · rot, SkyPhoto 0.6 | 45 | AdamW — enc 2e‑4 · dec 2e‑4 · FiLM 3e‑4 · head 2.5e‑4 | OneCycle (pct 0.2, cos, final 1e‑5) | L1 + 0.1 SSIM + 0.08 LPIPS |
A2 | 25 % Mat / 75 % Sky | decoder + FiLM | 512 | A1 aug + SkyPhoto 0.6 | 14 | AdamW 1e‑5 | cosine‑14 | same |
A3 | 100 % Sky | 1 × 1 head only | 1 024 | none | 5 | Adam 5e‑7 | Exp 0.9 | same |
Save the best A2 checkpoint; it is the encoder donor for the Maps U-Net.
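
For reference, the Loss column of the Albedo table can be written out as a single expression. This is a sketch under assumptions: $\hat{a}$ and $a$ denote predicted and target albedo (symbols introduced here), and the SSIM term is taken as $1 - \mathrm{SSIM}$ so that it decreases as similarity improves; the table itself only gives the weights.

```latex
\mathcal{L}_{A}
  = \lVert \hat{a} - a \rVert_{1}
  + 0.1\,\bigl(1 - \mathrm{SSIM}(\hat{a}, a)\bigr)
  + 0.08\,\mathrm{LPIPS}(\hat{a}, a)
```
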
Phase | Dataset | Encoder init | Trainables | Crop / Feed (px) | Epochs | Optimiser & LR | Scheduler | Core losses |
---|---|---|---|---|---|---|---|---|
M‑pre | 100 % Sky | best A2 (`strict=False`) | enc + dec + heads | 768 | 6 | AdamW: enc 2e‑5 (LLRD 0.8^d) · dec/heads 1e‑4 | cosine‑6 | Rough L1 + .05 SSIM · Metal BCE · AO L1 · Height L1 + .01 TV |
M0 | 100 % Sky | from M‑pre | enc + dec + heads | 1 024 | 8 | AdamW: enc 1e‑5 · dec/heads 5e‑5 | cosine‑8 | same |
M1 | 100 % Sky | best M0 | one head at a time | 1 024 | 5–7 | Adam 1e‑6 | Exp 0.9 | same (detach Albedo) |
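
The "Core losses" column can be spelled out per head as follows. This is one plausible reading rather than the confirmed formulation: the symbols $\hat{r}, \hat{m}, \hat{o}, \hat{h}$ (predicted roughness, metal logits, AO, height) are introduced here, the four head losses are assumed to be summed with unit weights, and the SSIM term is assumed to enter as $1 - \mathrm{SSIM}$.

```latex
\begin{aligned}
\mathcal{L}_{M} &= \mathcal{L}_{\text{rough}} + \mathcal{L}_{\text{metal}}
                 + \mathcal{L}_{\text{AO}} + \mathcal{L}_{\text{height}} \\
\mathcal{L}_{\text{rough}}  &= \lVert \hat{r} - r \rVert_{1}
                 + 0.05\,\bigl(1 - \mathrm{SSIM}(\hat{r}, r)\bigr) \\
\mathcal{L}_{\text{metal}}  &= \mathrm{BCE}\bigl(\sigma(\hat{m}),\, m\bigr) \\
\mathcal{L}_{\text{AO}}     &= \lVert \hat{o} - o \rVert_{1} \\
\mathcal{L}_{\text{height}} &= \lVert \hat{h} - h \rVert_{1}
                 + 0.01\,\mathrm{TV}(\hat{h})
\end{aligned}
```
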
Feed size (px) | Mosaic active? | Grid | Share (Mat / Sky) |
---|---|---|---|
256 | yes | 2 × 2 (128) | 30 % / 15 % |
512 | yes | 2 × 2 (256) | 25 % / 10 % |
≥ 768 | no | — | 0 % |
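
The mosaic table can be encoded as a small lookup. The sketch below assumes the tile size is always half the feed size (as in the two listed rows) and treats the share as a per-dataset probability; `mosaic_settings` and the `"mat"` / `"sky"` keys are hypothetical names.

```python
def mosaic_settings(feed_px, dataset):
    """Return (tile_px, probability) for 2x2 composite mosaics, or None when off.

    `dataset` is "mat" or "sky"; thresholds and shares follow the table above.
    """
    table = [
        (256, {"mat": 0.30, "sky": 0.15}),
        (512, {"mat": 0.25, "sky": 0.10}),
    ]
    for max_feed, share in table:
        if feed_px <= max_feed:
            # 2 x 2 grid, so each tile covers half the feed size per side.
            return feed_px // 2, share[dataset]
    return None  # larger feeds (768 px and up in the table): mosaics disabled
```
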
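# before M-pre
import math
import torch

p = 0.06  # prior metal pixel ratio
b0 = math.log(p / (1 - p))  # logit of the prior, so sigmoid(b0) == p
model.head_metal[0].bias.data.fill_(b0)  # 1×1 conv bias
# neg, pos: assumed to be non-metal / metal pixel counts; pos_weight must be a tensor
bce = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor(neg / pos))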
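
As a quick check of the arithmetic in the snippet above, the bias value and the implied positive weight can be computed without PyTorch. The helper names are hypothetical, and the `pos_weight` figure assumes the negative/positive pixel ratio matches the 6 % prior.

```python
import math


def prior_logit(p):
    """Logit of prior probability p: the bias for which sigmoid(bias) == p."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


p = 0.06
b0 = prior_logit(p)          # roughly -2.75
pos_weight = (1.0 - p) / p   # roughly 15.7, if neg/pos follows the prior (assumption)
```

With this bias the metal head predicts about 6 % metal everywhere before any training, so the first BCE steps are not spent just learning the class prior.
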
- Global safe – h/v flip, 90° rot (all phases)
- Colour-jitter – ±5 % hue/sat (MatSynth only)
- Composite mosaics – MatSynth only, % per table
- SkyPhotometric(p=0.6) – light tint, γ, grain; p = 0.5 in hi-res stages
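import torch

# ❶ Weight-transfer Maps ⇐ Albedo (strict=False tolerates missing or unexpected keys)
maps.unet.load_state_dict(albedo.unet.state_dict(), strict=False)

# ❷ LLRD parameter groups (encoder only): the last block gets base_enc_lr,
#    each earlier block is scaled down by a further factor of 0.8
param_groups = []
num_blocks = len(maps.unet.encoder)
for i, blk in enumerate(maps.unet.encoder):
    lr = base_enc_lr * (0.8 ** (num_blocks - i - 1))
    param_groups.append({"params": blk.parameters(), "lr": lr, "weight_decay": 1e-2})

# ❸ Detach albedo when feeding Maps so no gradient flows back into the Albedo net
alb = albedo(diffuse_normal, segfeat).detach()
maps_in = torch.cat([alb, normal], dim=1)  # concatenate along the channel axis
out = maps(maps_in, segfeat)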
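
To see what the LLRD rule in step ❷ produces numerically, here is a standalone sketch using the M‑pre encoder rate of 2e‑5 and the 0.8 decay. The encoder depth of four blocks is a hypothetical value chosen for illustration; `llrd_learning_rates` is not a function from the codebase.

```python
def llrd_learning_rates(base_lr, num_blocks, decay=0.8):
    """Per-block LRs: the last block gets base_lr, each earlier one is scaled by `decay`."""
    return [base_lr * decay ** (num_blocks - i - 1) for i in range(num_blocks)]


# Four encoder blocks is an assumed depth, used only to show the pattern.
lrs = llrd_learning_rates(2e-5, num_blocks=4)
# lrs is approximately [1.024e-5, 1.28e-5, 1.6e-5, 2e-5]
```

The earliest block, which holds the most generic features inherited from the Albedo encoder, moves about half as fast as the last one.
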
Category | Action & Reason |
---|---|
plastic | Drop – anachronistic. |
concrete | Merge → stone – roughness/height similar; makes SegFormer’s job easier. |
marble | If Skyrim mod pack has no marble, merge into stone; else keep (rare indoor pillars). |
plaster | Drop. |
terracotta | Very rare → drop. |
misc | Contains heterogeneous, often modern designs → drop. |
ceramic, fabric, ground, leather, metal, wood, stone | Keep. Add fur if you have ≥ 100 samples. |
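
The category decisions in the table above can be applied as a label remapping step. This is a minimal sketch: `remap_category`, `has_marble`, and `fur_samples` are hypothetical names, and dropping fur below 100 samples is an assumption (the table only says when to add it).

```python
DROP = {"plastic", "plaster", "terracotta", "misc"}
MERGE = {"concrete": "stone"}
KEEP = {"ceramic", "fabric", "ground", "leather", "metal", "wood", "stone"}


def remap_category(name, has_marble=True, fur_samples=0):
    """Return the training label for a source category, or None if it is dropped."""
    if name in DROP:
        return None
    if name in MERGE:
        return MERGE[name]
    if name == "marble":
        # Keep marble only when the target mod pack actually contains it.
        return "marble" if has_marble else "stone"
    if name == "fur":
        # Assumption: fur is skipped until at least 100 samples exist.
        return "fur" if fur_samples >= 100 else None
    if name in KEEP:
        return name
    raise KeyError(f"unknown category: {name}")
```
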
Category | Safe for ALL domains (apply blindly) | Category-Selective (only if label is known or confidence > 0.8) | Exclude / Never |
---|---|---|---|
wood | flips, 90° rot | ±10 % brightness, ±5 % hue, small grain-noise mask | heavy tint (green, purple) |
stone | flips, 90° rot | ±8 % brightness, Perlin dirt overlay | hue shift (changes mineral colour) |
metal | flips, 90° rot | subtle specular highlight sprite (white blotch α=0.15) | hue shift (turns iron blue) |
fabric / fur | flips, 90° rot | ±12 % hue/sat, small warp (elastic-grid) | specular sprite |
leather | flips, 90° rot | ±8 % hue, ±12 % brightness | specular sprite |
ground / ceramic / plaster | flips, 90° rot | ±10 % brightness, Perlin dirt | hue shift > 5 % |
misc (dropped) | — | — | — |
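
The gating rule in the table header (selective augmentations only when the label is known or the predicted confidence exceeds 0.8) can be sketched as a lookup. The operation names are placeholder strings rather than real transform classes, and `augmentations_for` is a hypothetical helper.

```python
SAFE_OPS = ["hflip", "vflip", "rot90"]

# Placeholder op names; magnitudes mirror the Category-Selective column.
SELECTIVE = {
    "wood": ["brightness_0.10", "hue_0.05", "grain_noise"],
    "stone": ["brightness_0.08", "perlin_dirt"],
    "metal": ["specular_sprite_alpha_0.15"],
    "fabric": ["hue_sat_0.12", "elastic_warp"],
    "fur": ["hue_sat_0.12", "elastic_warp"],
    "leather": ["hue_0.08", "brightness_0.12"],
    "ground": ["brightness_0.10", "perlin_dirt"],
    "ceramic": ["brightness_0.10", "perlin_dirt"],
    "plaster": ["brightness_0.10", "perlin_dirt"],
}


def augmentations_for(category, confidence, label_known=False, threshold=0.8):
    """Safe ops always apply; selective ops need a known label or confidence > threshold."""
    ops = list(SAFE_OPS)
    if (label_known or confidence > threshold) and category in SELECTIVE:
        ops += SELECTIVE[category]
    return ops
```

For example, a crop predicted as wood with confidence 0.9 gets flips, rotation, and the wood-specific jitter, while the same crop at confidence 0.5 gets only the safe geometric operations.
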