## Results Comparison
We run `run_all_inpainting.py` on both the CIFAR and CelebA-HQ datasets to identify and compare the best approaches so far:

In [1]:
import json
import os
import subprocess

import matplotlib.pyplot as plt
import numpy as np

# Inherited by the subprocesses launched below
os.environ["MKL_THREADING_LAYER"] = "GNU"

Defining the runs:

In [2]:
runs = {
    "cifar_32": {
        "cmd": [
            "python3", "../drivers/run_all_inpainting.py", "--include_nftm",
            "--train_dataset", "cifar",
            "--benchmark", "cifar",
            "--img_size", "32",
            "--out", "out_cifar32"
        ],
        "summary_path": "out_cifar32/summary.json",
    },

    "celebahq_64": {
        "cmd": [
            "python3", "../drivers/run_all_inpainting.py", "--include_nftm",
            "--train_dataset", "celebahq",
            "--benchmark", "celebahq",
            "--img_size", "64",
            "--out", "out_celebahq64"
        ],
        "summary_path": "out_celebahq64/summary.json",
    }
}
results = {}  # experiment name -> parsed summary.json

In [4]:
for name, info in runs.items():
    print(f"\n=== Running experiment: {name} ===")
    try:
        subprocess.run(info["cmd"], check=True)
    except subprocess.CalledProcessError as e:
        print(f" Experiment {name} failed with exit code {e.returncode}")
        continue

    if os.path.exists(info["summary_path"]):
        with open(info["summary_path"], "r") as f:
            results[name] = json.load(f)
        print(f" Loaded metrics: {info['summary_path']}")
    else:
        print(f" No summary.json found at {info['summary_path']}")


=== Running experiment: cifar_32 ===
Using device: cuda
Model parameters: 46479
Epoch 001: loss=0.1608, val_psnr=21.88 dB, time=11.3s
Epoch 002: loss=0.1012, val_psnr=23.49 dB, time=7.5s
Epoch 003: loss=0.0863, val_psnr=24.32 dB, time=7.4s
Epoch 004: loss=0.0780, val_psnr=24.79 dB, time=7.4s
Epoch 005: loss=0.0727, val_psnr=25.11 dB, time=7.4s
Epoch 006: loss=0.0689, val_psnr=25.48 dB, time=7.4s
Epoch 007: loss=0.0661, val_psnr=25.38 dB, time=7.4s
Epoch 008: loss=0.0638, val_psnr=25.90 dB, time=7.4s
Epoch 009: loss=0.0620, val_psnr=26.00 dB, time=7.3s
Epoch 010: loss=0.0603, val_psnr=26.16 dB, time=7.4s
Epoch 011: loss=0.0590, val_psnr=26.04 dB, time=7.4s
Epoch 012: loss=0.0578, val_psnr=26.34 dB, time=7.5s
Epoch 013: loss=0.0568, val_psnr=26.39 dB, time=7.5s
Epoch 014: loss=0.0560, val_psnr=26.43 dB, time=7.4s
Epoch 015: loss=0.0551, val_psnr=26.58 dB, time=7.4s
Epoch 016: loss=0.0545, val_psnr=26.61 dB, time=7.5s
Epoch 017: loss=0.0538, val_psnr=26.66 dB, time=7.5s
Epoch 018: loss=0



Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Training and evaluation complete.




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Checkpoint: out_cifar32/unet/ckpt.pt
Device: cuda
Parameters: 46,479
Seed: 0
Metrics:
  psnr_all: 26.7861
  psnr_miss: 23.2666
  ssim_all: 0.8807
  ssim_miss: 2.1903
  lpips_all: 0.0113
  lpips_miss: 0.0326
  fid: 17.7756
  kid: 0.0104




Running TV-L1 baseline on cifar test set with 250 iterations (device=cuda)




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[TVL1] mean PSNR_all=25.58 dB, PSNR_miss=21.65 dB
Saved metrics to out_cifar32/tvl1/metrics.json
[device] cuda | criterion=MSE
[controller] dense | params=46375




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.1267 | train PSNR 15.22 dB | eval PSNR 1..12: 14.96, 15.32, 15.66, 15.97, 16.26 ... 17.78 | ctrl=dense | final SSIM 0.4865 | final LPIPS 0.0852
[ep 02] β_train=0.310 K_train=6 | loss 0.1207 | train PSNR 15.44 dB | eval PSNR 1..12: 15.00, 15.40, 15.77, 16.12, 16.44 ... 18.14 | ctrl=dense | final SSIM 0.5174 | final LPIPS 0.0762
[ep 03] β_train=0.340 K_train=7 | loss 0.1197 | train PSNR 15.49 dB | eval PSNR 1..12: 15.04, 15.48, 15.88, 16.27, 16.62 ... 18.49 | ctrl=dense | final SSIM 0.5419 | final LPIPS 0.0687
[ep 04] β_train=0.370 K_train=8 | loss 0.1142 | train PSNR 15.71 dB | eval PSNR 1..12: 15.04, 15.51, 15.95, 16.36, 16.75 ... 18.78 | ctrl=dense | final SSIM 0.5598 | final LPIPS 0.0625
[ep 05] β_train=0.400 K_train=8 | loss 0.1111 | train PSNR 15.85 dB | eval



[viz] saved per-epoch progression → out_cifar32/nftm/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → out_cifar32/nftm/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: out_cifar32/nftm | controller=dense
[metrics] saved metrics.json & psnr_curve.npy in out_cifar32/nftm | controller=dense
[device] cuda | criterion=MSE
[controller] dense | params=46375




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.1268 | train PSNR 15.21 dB | eval PSNR 1..30: 14.96, 15.32, 15.66, 15.97, 16.27 ... 19.29 | ctrl=dense | final SSIM 0.5718 | final LPIPS 0.0551
[ep 02] β_train=0.310 K_train=6 | loss 0.1207 | train PSNR 15.44 dB | eval PSNR 1..30: 15.00, 15.40, 15.77, 16.12, 16.44 ... 19.84 | ctrl=dense | final SSIM 0.6125 | final LPIPS 0.0458
[ep 03] β_train=0.340 K_train=7 | loss 0.1197 | train PSNR 15.49 dB | eval PSNR 1..30: 15.04, 15.47, 15.88, 16.26, 16.61 ... 20.29 | ctrl=dense | final SSIM 0.6264 | final LPIPS 0.0410
[ep 04] β_train=0.370 K_train=8 | loss 0.1142 | train PSNR 15.71 dB | eval PSNR 1..30: 15.04, 15.51, 15.95, 16.36, 16.75 ... 20.81 | ctrl=dense | final SSIM 0.6662 | final LPIPS 0.0340
[ep 05] β_train=0.400 K_train=9 | loss 0.1107 | train PSNR 15.87 dB | eval



[viz] saved per-epoch progression → runs/inpainting/nftm_dense/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → runs/inpainting/nftm_dense/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: runs/inpainting/nftm_dense | controller=dense
[metrics] saved metrics.json & psnr_curve.npy in runs/inpainting/nftm_dense | controller=dense
[device] cuda | criterion=MSE
[controller] unet | base=10 | params=46843




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.1295 | train PSNR 15.12 dB | eval PSNR 1..30: 14.93, 15.26, 15.57, 15.85, 16.12 ... 18.51 | ctrl=unet | final SSIM 0.4911 | final LPIPS 0.0658
[ep 02] β_train=0.310 K_train=6 | loss 0.1228 | train PSNR 15.36 dB | eval PSNR 1..30: 14.99, 15.37, 15.73, 16.06, 16.37 ... 19.47 | ctrl=unet | final SSIM 0.5683 | final LPIPS 0.0520
[ep 03] β_train=0.340 K_train=7 | loss 0.1193 | train PSNR 15.51 dB | eval PSNR 1..30: 15.03, 15.45, 15.84, 16.21, 16.55 ... 19.96 | ctrl=unet | final SSIM 0.5936 | final LPIPS 0.0446
[ep 04] β_train=0.370 K_train=8 | loss 0.1151 | train PSNR 15.68 dB | eval PSNR 1..30: 15.02, 15.47, 15.89, 16.28, 16.64 ... 20.12 | ctrl=unet | final SSIM 0.5933 | final LPIPS 0.0426
[ep 05] β_train=0.400 K_train=9 | loss 0.1145 | train PSNR 15.71 dB | eval PSN



[viz] saved per-epoch progression → runs/inpainting/nftm_unet/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → runs/inpainting/nftm_unet/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: runs/inpainting/nftm_unet | controller=unet
[metrics] saved metrics.json & psnr_curve.npy in runs/inpainting/nftm_unet | controller=unet

[stage] train_unet
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 /gpfs/workdir/balazsk/NFTM/nftm_inpaint/train_unet.py --epochs 20 --batch_size 256 --lr 0.002 --weight_decay 0.0001 --tv_weight 0.01 --seed 0 --save_dir out_cifar32/unet --base 10 --target_params 46375 --num_workers 2 --device cuda --benchmark cifar --img_size 32
[stage] train_unet completed in 269.3s

[stage] eval_unet
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 /gpfs/workdir/balazsk/NFTM/nftm_inpaint/eval_unet.py --ckpt out_cifar32/unet/ckpt.pt --batch_size 256 --num_workers 2 --seed 0 --save_dir out_cifar32/unet_eval --d

NameError: name 'results' is not defined
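
The `NameError` above means `results` did not exist in the kernel when the loop ran. One likely cause, which is an assumption here, is that the cell defining it had not been executed in that session. A minimal sketch that avoids the dependency on cell order wraps the loop in a function that creates and returns its own dictionary. The name `run_experiments` is hypothetical and not part of the repository:

```python
import json
import os
import subprocess
import sys


def run_experiments(runs, results=None):
    """Run each experiment command and collect its summary.json.

    `runs` has the same shape as the dict defined earlier:
    {name: {"cmd": [...], "summary_path": "..."}}.
    """
    # Create the dict here so the loop never depends on another cell
    results = {} if results is None else results
    for name, info in runs.items():
        print(f"\n=== Running experiment: {name} ===")
        try:
            subprocess.run(info["cmd"], check=True)
        except subprocess.CalledProcessError as e:
            print(f"Experiment {name} failed with exit code {e.returncode}")
            continue

        if os.path.exists(info["summary_path"]):
            with open(info["summary_path"], "r") as f:
                results[name] = json.load(f)
            print(f"Loaded metrics: {info['summary_path']}")
        else:
            print(f"No summary.json found at {info['summary_path']}")
    return results
```

Calling `results = run_experiments(runs)` then leaves `results` defined even when every experiment fails.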

The first attempt ended with a `NameError`, so we run both experiments again:

In [None]:
for name, info in runs.items():
    print(f"\n=== Running experiment: {name} ===")
    try:
        subprocess.run(info["cmd"], check=True)
    except subprocess.CalledProcessError as e:
        print(f" Experiment {name} failed with exit code {e.returncode}")
        continue

    if os.path.exists(info["summary_path"]):
        with open(info["summary_path"], "r") as f:
            results[name] = json.load(f)
        print(f" Loaded metrics: {info['summary_path']}")
    else:
        print(f" No summary.json found at {info['summary_path']}")


=== Running experiment: cifar_32 ===
Using device: cuda
[Data] train_dataset=cifar, benchmark=cifar, img_size=32
[Data] train_set size=50000
Model parameters: 46479
Epoch 001: loss=0.1608, val_psnr=21.92 dB, time=10.6s
Epoch 002: loss=0.1014, val_psnr=23.37 dB, time=5.3s
Epoch 003: loss=0.0866, val_psnr=24.28 dB, time=5.2s
Epoch 004: loss=0.0782, val_psnr=24.77 dB, time=5.1s
Epoch 005: loss=0.0730, val_psnr=25.15 dB, time=5.1s
Epoch 006: loss=0.0690, val_psnr=25.49 dB, time=5.1s
Epoch 007: loss=0.0662, val_psnr=25.49 dB, time=5.2s
Epoch 008: loss=0.0639, val_psnr=25.89 dB, time=5.1s
Epoch 009: loss=0.0622, val_psnr=26.03 dB, time=5.0s
Epoch 010: loss=0.0605, val_psnr=26.09 dB, time=5.1s
Epoch 011: loss=0.0592, val_psnr=26.12 dB, time=5.1s
Epoch 012: loss=0.0579, val_psnr=26.31 dB, time=5.2s
Epoch 013: loss=0.0570, val_psnr=26.43 dB, time=5.5s
Epoch 014: loss=0.0560, val_psnr=26.42 dB, time=5.1s
Epoch 015: loss=0.0552, val_psnr=26.64 dB, time=5.5s
Epoch 016: loss=0.0545, val_psnr=26.68



Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Training and evaluation complete.




[Eval Data] benchmark=cifar, img_size=32
[Eval Data] dataset size=10000




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Checkpoint: out_cifar32/unet/ckpt.pt
Device: cuda
Parameters: 46,479
Seed: 0
Metrics:
  psnr_all: 26.6098
  psnr_miss: 23.1794
  ssim_all: 0.8714
  ssim_miss: 2.1714
  lpips_all: 0.0119
  lpips_miss: 0.0339
  fid: 19.9014
  kid: 0.0125




Running TV-L1 baseline on cifar test set with 250 iterations (device=cuda)
[TVL1 Eval Data] benchmark=cifar, img_size=32
[TVL1 Eval Data] dataset size=10000




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[TVL1] mean PSNR_all=25.55 dB, PSNR_miss=21.64 dB
Saved metrics to out_cifar32/tvl1/metrics.json
[device] cuda | criterion=MSE


Traceback (most recent call last):
  File "/gpfs/workdir/balazsk/NFTM/image_inpainting.py", line 325, in <module>
    main()
  File "/gpfs/workdir/balazsk/NFTM/image_inpainting.py", line 96, in main
    pyr_steps_eval = split_steps_eval(args.K_eval, pyr_sizes, args.pyr_steps if args.pyr_steps else None)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gpfs/workdir/balazsk/NFTM/nftm_inpaint/rollout.py", line 47, in split_steps_eval
    assert sum(steps) == K_total and len(steps) == len(sizes), "pyr_steps must match pyramid and sum to K_eval"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: pyr_steps must match pyramid and sum to K_eval



[stage] train_unet
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 /gpfs/workdir/balazsk/NFTM/nftm_inpaint/train_unet.py --epochs 20 --batch_size 256 --lr 0.002 --weight_decay 0.0001 --tv_weight 0.01 --seed 0 --save_dir out_cifar32/unet --base 10 --target_params 46375 --num_workers 2 --device cuda --benchmark cifar --img_size 32 --train_dataset cifar
[stage] train_unet completed in 221.6s

[stage] eval_unet
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 /gpfs/workdir/balazsk/NFTM/nftm_inpaint/eval_unet.py --ckpt out_cifar32/unet/ckpt.pt --batch_size 256 --num_workers 2 --seed 0 --save_dir out_cifar32/unet_eval --device cuda --benchmark cifar --img_size 32
[stage] eval_unet completed in 195.6s

[stage] tvl1_baseline
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 -m baselines.inpainting.tvl1_baseline --iters 250 --lam 80.0 --tvw 0.1 --tau_p 0.25 --tau_d 0.25 --batch_size 256 --num_workers 2 --seed 0 --save_dir out_cifar32/tvl1 --device cuda --benchmark cifar --img_size 



Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Training and evaluation complete.




[Eval Data] benchmark=celebahq, img_size=64
[Eval Data] dataset size=6000




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Checkpoint: out_celebahq64/unet/ckpt.pt
Device: cuda
Parameters: 46,479
Seed: 0
Metrics:
  psnr_all: 27.2737
  psnr_miss: 24.3688
  ssim_all: 0.8697
  ssim_miss: 2.2628
  lpips_all: 0.0428
  lpips_miss: 0.1019
  fid: 50.1084
  kid: 0.0540




Running TV-L1 baseline on celebahq test set with 250 iterations (device=cuda)
[TVL1 Eval Data] benchmark=celebahq, img_size=64
[TVL1 Eval Data] dataset size=6000




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[TVL1] mean PSNR_all=26.69 dB, PSNR_miss=22.79 dB
Saved metrics to out_celebahq64/tvl1/metrics.json
[device] cuda | criterion=MSE
[Data] train_dataset=celebahq, benchmark=celebahq, img_size=64
[Data] train_set size=24000, test_set size=6000
[controller] dense | params=46375




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.0306 | train PSNR 21.55 dB | eval PSNR 1..30: 18.67, 17.63, 15.86, 16.52, 16.18 ... 17.96 | ctrl=dense | final SSIM 0.4942 | final LPIPS 0.3079
[ep 02] β_train=0.310 K_train=6 | loss 0.0275 | train PSNR 22.03 dB | eval PSNR 1..30: 18.85, 18.39, 16.72, 17.37, 16.95 ... 18.08 | ctrl=dense | final SSIM 0.5007 | final LPIPS 0.3118
[ep 03] β_train=0.340 K_train=7 | loss 0.0230 | train PSNR 22.86 dB | eval PSNR 1..30: 19.06, 16.11, 13.71, 14.34, 13.91 ... 14.91 | ctrl=dense | final SSIM 0.3624 | final LPIPS 0.4081
[ep 04] β_train=0.370 K_train=8 | loss 0.0200 | train PSNR 23.54 dB | eval PSNR 1..30: 18.89, 15.49, 13.20, 14.07, 13.81 ... 15.15 | ctrl=dense | final SSIM 0.3947 | final LPIPS 0.3772
[ep 05] β_train=0.400 K_train=9 | loss 0.0193 | train PSNR 23.73 dB | eval
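
The `AssertionError` in the cifar_32 run above states the constraint directly: `pyr_steps` needs one entry per pyramid level, and the entries must sum to `K_eval`. The sketch below checks that constraint before launching a run. The helper `check_pyr_steps` is hypothetical and is not the repository's `split_steps_eval`. The first call uses the values from the logged `nftm_pyramid` command (`--K_eval 30 --pyramid 16,32,64 --pyr_steps 3,10,17`). The second call uses `K_eval=12` purely as an illustrative mismatch, since the log does not show which arguments triggered the failure:

```python
def parse_int_list(text):
    """Parse a comma-separated string such as "16,32,64" into a list of ints."""
    return [int(tok) for tok in text.split(",") if tok.strip()]


def check_pyr_steps(k_eval, pyramid, pyr_steps):
    """Return (ok, message) for the constraint named in the assertion message."""
    sizes = parse_int_list(pyramid)
    steps = parse_int_list(pyr_steps)
    if len(steps) != len(sizes):
        return False, f"{len(steps)} step counts for {len(sizes)} pyramid levels"
    if sum(steps) != k_eval:
        return False, f"steps sum to {sum(steps)}, but K_eval is {k_eval}"
    return True, "ok"


# Values from the logged nftm_pyramid command: 3 + 10 + 17 = 30
print(check_pyr_steps(30, "16,32,64", "3,10,17"))

# Illustrative mismatch only: the same steps cannot sum to 12
print(check_pyr_steps(12, "16,32,64", "3,10,17"))
```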

CelebA-HQ run on its own:

In [3]:
runs = {
    "celebahq_64": {
        "cmd": [
            "python3", "../drivers/run_all_inpainting.py", "--include_nftm",
            "--train_dataset", "celebahq",
            "--benchmark", "celebahq",
            "--img_size", "64",
            "--out", "out_celebahq64"
        ],
        "summary_path": "out_celebahq64/summary.json",
    }
}

for name, info in runs.items():
    print(f"\n=== Running experiment: {name} ===")
    try:
        subprocess.run(info["cmd"], check=True)
    except subprocess.CalledProcessError as e:
        print(f" Experiment {name} failed with exit code {e.returncode}")
        continue


=== Running experiment: celebahq_64 ===
[device] cuda | criterion=MSE
[Data] train_dataset=celebahq, benchmark=celebahq, img_size=64
[Data] train_set size=24000, test_set size=6000
[controller] dense | params=46375




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.0306 | train PSNR 21.55 dB | eval PSNR 1..30: 18.67, 17.63, 15.86, 16.52, 16.18 ... 17.96 | ctrl=dense | final SSIM 0.4943 | final LPIPS 0.3077
[ep 02] β_train=0.310 K_train=6 | loss 0.0275 | train PSNR 22.03 dB | eval PSNR 1..30: 18.85, 18.39, 16.72, 17.37, 16.96 ... 18.09 | ctrl=dense | final SSIM 0.5011 | final LPIPS 0.3113
[ep 03] β_train=0.340 K_train=7 | loss 0.0230 | train PSNR 22.86 dB | eval PSNR 1..30: 19.06, 16.05, 13.63, 14.26, 13.82 ... 14.81 | ctrl=dense | final SSIM 0.3602 | final LPIPS 0.4090
[ep 04] β_train=0.370 K_train=8 | loss 0.0201 | train PSNR 23.53 dB | eval PSNR 1..30: 18.90, 15.49, 13.26, 14.09, 13.82 ... 15.21 | ctrl=dense | final SSIM 0.3814 | final LPIPS 0.3938
[ep 05] β_train=0.400 K_train=9 | loss 0.0193 | train PSNR 23.74 dB | eval



[viz] saved per-epoch progression → out_celebahq64/nftm_dense_pyramid/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → out_celebahq64/nftm_dense_pyramid/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: out_celebahq64/nftm_dense_pyramid | controller=dense
[metrics] saved metrics.json & psnr_curve.npy in out_celebahq64/nftm_dense_pyramid | controller=dense
Using device: cuda
[Data] train_dataset=celebahq, benchmark=celebahq, img_size=64
[Data] train_set size=24000
Model parameters: 46479
Epoch 001: loss=0.1840, val_psnr=21.55 dB, time=264.4s
Epoch 002: loss=0.1017, val_psnr=23.26 dB, time=270.9s
Epoch 003: loss=0.0875, val_psnr=23.99 dB, time=288.2s
Epoch 004: loss=0.0793, val_psnr=24.79 dB, time=221.8s
Epoch 005: loss=0.0738, val_psnr=25.14 dB, time=259.8s
Epoch 006: loss=0.0701, val_psnr=25.50 dB, time=215.1s
Epoch 007: loss=0.0668, val_psnr=25.78 dB, time=161.6s
Epoch 008: loss=0.0644, val_psnr=26.02 dB, time=101.0s
Epo



Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Training and evaluation complete.




[Eval Data] benchmark=celebahq, img_size=64
[Eval Data] dataset size=6000




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
Checkpoint: out_celebahq64/unet/ckpt.pt
Device: cuda
Parameters: 46,479
Seed: 0
Metrics:
  psnr_all: 27.2716
  psnr_miss: 24.3384
  ssim_all: 0.8685
  ssim_miss: 2.2602
  lpips_all: 0.0434
  lpips_miss: 0.1049
  fid: 51.1562
  kid: 0.0550




Running TV-L1 baseline on celebahq test set with 250 iterations (device=cuda)
[TVL1 Eval Data] benchmark=celebahq, img_size=64
[TVL1 Eval Data] dataset size=6000




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[TVL1] mean PSNR_all=26.69 dB, PSNR_miss=22.79 dB
Saved metrics to out_celebahq64/tvl1/metrics.json
[device] cuda | criterion=MSE
[Data] train_dataset=celebahq, benchmark=celebahq, img_size=64
[Data] train_set size=24000, test_set size=6000
[controller] dense | params=46375




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.1584 | train PSNR 14.21 dB | eval PSNR 1..12: 13.98, 14.30, 14.60, 14.87, 15.13 ... 16.42 | ctrl=dense | final SSIM 0.3568 | final LPIPS 0.5252
[ep 02] β_train=0.310 K_train=6 | loss 0.1498 | train PSNR 14.47 dB | eval PSNR 1..12: 14.05, 14.41, 14.76, 15.08, 15.38 ... 16.99 | ctrl=dense | final SSIM 0.4131 | final LPIPS 0.4645
[ep 03] β_train=0.340 K_train=7 | loss 0.1460 | train PSNR 14.59 dB | eval PSNR 1..12: 14.08, 14.48, 14.85, 15.21, 15.54 ... 17.31 | ctrl=dense | final SSIM 0.4360 | final LPIPS 0.4290
[ep 04] β_train=0.370 K_train=8 | loss 0.1420 | train PSNR 14.72 dB | eval PSNR 1..12: 14.08, 14.52, 14.92, 15.31, 15.67 ... 17.59 | ctrl=dense | final SSIM 0.4545 | final LPIPS 0.4013
[ep 05] β_train=0.400 K_train=8 | loss 0.1436 | train PSNR 14.67 dB | eval



[viz] saved per-epoch progression → out_celebahq64/nftm/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → out_celebahq64/nftm/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: out_celebahq64/nftm | controller=dense
[metrics] saved metrics.json & psnr_curve.npy in out_celebahq64/nftm | controller=dense
[device] cuda | criterion=MSE
[Data] train_dataset=celebahq, benchmark=celebahq, img_size=64
[Data] train_set size=24000, test_set size=6000
[controller] dense | params=46375




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.1584 | train PSNR 14.21 dB | eval PSNR 1..30: 13.98, 14.30, 14.60, 14.87, 15.13 ... 17.58 | ctrl=dense | final SSIM 0.3972 | final LPIPS 0.4302
[ep 02] β_train=0.310 K_train=6 | loss 0.1498 | train PSNR 14.47 dB | eval PSNR 1..30: 14.05, 14.41, 14.76, 15.08, 15.38 ... 18.59 | ctrl=dense | final SSIM 0.4891 | final LPIPS 0.3380
[ep 03] β_train=0.340 K_train=7 | loss 0.1460 | train PSNR 14.59 dB | eval PSNR 1..30: 14.08, 14.48, 14.85, 15.21, 15.54 ... 19.09 | ctrl=dense | final SSIM 0.5227 | final LPIPS 0.2916
[ep 04] β_train=0.370 K_train=8 | loss 0.1420 | train PSNR 14.72 dB | eval PSNR 1..30: 14.08, 14.52, 14.92, 15.31, 15.67 ... 19.50 | ctrl=dense | final SSIM 0.5478 | final LPIPS 0.2587
[ep 05] β_train=0.400 K_train=9 | loss 0.1425 | train PSNR 14.72 dB | eval



[viz] saved per-epoch progression → runs/inpainting/nftm_dense/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → runs/inpainting/nftm_dense/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: runs/inpainting/nftm_dense | controller=dense
[metrics] saved metrics.json & psnr_curve.npy in runs/inpainting/nftm_dense | controller=dense
[device] cuda | criterion=MSE
[Data] train_dataset=celebahq, benchmark=celebahq, img_size=64
[Data] train_set size=24000, test_set size=6000
[controller] unet | base=10 | params=46843




Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /gpfs/users/balazsk/.conda/envs/nftm/lib/python3.11/site-packages/lpips/weights/v0.1/alex.pth
[ep 01] β_train=0.280 K_train=5 | loss 0.1578 | train PSNR 14.23 dB | eval PSNR 1..30: 13.96, 14.25, 14.52, 14.76, 14.98 ... 16.77 | ctrl=unet | final SSIM 0.3278 | final LPIPS 0.4898
[ep 02] β_train=0.310 K_train=6 | loss 0.1533 | train PSNR 14.36 dB | eval PSNR 1..30: 14.03, 14.38, 14.71, 15.01, 15.30 ... 18.01 | ctrl=unet | final SSIM 0.4168 | final LPIPS 0.3845
[ep 03] β_train=0.340 K_train=7 | loss 0.1458 | train PSNR 14.59 dB | eval PSNR 1..30: 14.07, 14.47, 14.84, 15.19, 15.51 ... 18.86 | ctrl=unet | final SSIM 0.4829 | final LPIPS 0.3219
[ep 04] β_train=0.370 K_train=8 | loss 0.1426 | train PSNR 14.71 dB | eval PSNR 1..30: 14.08, 14.51, 14.91, 15.29, 15.65 ... 19.43 | ctrl=unet | final SSIM 0.5188 | final LPIPS 0.2812
[ep 05] β_train=0.400 K_train=9 | loss 0.1399 | train PSNR 14.81 dB | eval PSN



[viz] saved per-epoch progression → runs/inpainting/nftm_unet/final/progress_epoch_final.png
[gif] saved reconstruction GIF (GT top row, recon bottom) → runs/inpainting/nftm_unet/final/progress_epoch_final.gif
[done] checkpoints and plots saved under: runs/inpainting/nftm_unet | controller=unet
[metrics] saved metrics.json & psnr_curve.npy in runs/inpainting/nftm_unet | controller=unet

[stage] nftm_pyramid
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 /gpfs/workdir/balazsk/NFTM/image_inpainting.py --controller dense --save_dir out_celebahq64/nftm_dense_pyramid --epochs 30 --K_train 20 --K_eval 30 --seed 0 --device cuda --save_metrics --train_dataset celebahq --benchmark celebahq --img_size 64 --pyramid 16,32,64 --pyr_steps 3,10,17
[stage] nftm_pyramid completed in 7295.9s

[stage] train_unet
>>> /gpfs/users/balazsk/.conda/envs/nftm/bin/python3 /gpfs/workdir/balazsk/NFTM/nftm_inpaint/train_unet.py --epochs 20 --batch_size 256 --lr 0.002 --weight_decay 0.0001 --tv_weight 0.01 --s
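
The CelebA-HQ cell above runs the driver but never reads `summary.json` into `results`, and the pyramid stage alone took 7295.9s. Loading the summaries that are already on disk avoids rerunning the experiments. This is a minimal sketch: the helper name `load_summaries` is hypothetical, and it assumes the driver has written the two summary files to the `summary_path` locations given in `runs`:

```python
import json
from pathlib import Path


def load_summaries(summary_paths):
    """Load existing summary.json files; skip any that are missing."""
    loaded = {}
    for name, path in summary_paths.items():
        p = Path(path)
        if not p.is_file():
            print(f"No summary found at {p}; skipping {name}")
            continue
        with p.open("r") as f:
            loaded[name] = json.load(f)
    return loaded


# Paths taken from the "summary_path" entries of the runs dict
results = load_summaries({
    "cifar_32": "out_cifar32/summary.json",
    "celebahq_64": "out_celebahq64/summary.json",
})
```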

In [None]:
import matplotlib.pyplot as plt
import numpy as np

def plot_metrics(exp_name, summary):
    """Create one comparison plot per metric for all models in a summary.json."""

    # Extract model names, skipping the "meta" entry
    models = [m for m in summary.keys() if m != "meta"]

    # Enforce a consistent order when present; unknown models go last
    order = ["unet", "tvl1", "nftm", "dense", "unet_controller"]
    models = sorted(models, key=lambda m: order.index(m) if m in order else len(order))

    # Metrics to display
    metrics = ["psnr_all", "ssim_all", "lpips_all", "fid", "kid"]

    # Plot each metric separately
    for metric in metrics:
        values = []
        labels = []

        for m in models:
            val = summary[m].get(metric)
            if val is not None:
                values.append(val)
                labels.append(m)

        if not values:
            continue

        plt.figure(figsize=(7, 4))

        # LPIPS/FID/KID -> lower is better (red); PSNR/SSIM -> higher is better (blue)
        color = "tab:red" if metric in ["lpips_all", "fid", "kid"] else "tab:blue"

        # One bar per model (assumed chart type)
        plt.bar(labels, values, color=color)
        plt.title(f"{exp_name}: {metric}")
        plt.ylabel(metric)
        plt.tight_layout()
        plt.show()

for name, summary in results.items():
    print(f"\n### Visualizing: {name} ###")
    plot_metrics(name, summary)
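
Since the goal is to identify the best approach, a small helper that picks the winning model per metric can complement the plots. This is a sketch under assumptions: `best_per_metric` is a hypothetical name, and the summary layout (`{model: {metric: value}, "meta": ...}`) is inferred from how `plot_metrics` reads it, not confirmed against the driver. The demo values are the CIFAR-32 numbers printed in the second run's log:

```python
# True -> higher is better, False -> lower is better
HIGHER_IS_BETTER = {
    "psnr_all": True,
    "ssim_all": True,
    "lpips_all": False,
    "fid": False,
    "kid": False,
}


def best_per_metric(summary):
    """Return {metric: (model, value)} for the best model on each metric."""
    best = {}
    for metric, higher in HIGHER_IS_BETTER.items():
        # Collect the models that actually report this metric
        scored = {
            model: vals[metric]
            for model, vals in summary.items()
            if model != "meta" and isinstance(vals, dict) and vals.get(metric) is not None
        }
        if not scored:
            continue
        pick = max if higher else min
        winner = pick(scored, key=scored.get)
        best[metric] = (winner, scored[winner])
    return best


# CIFAR-32 values copied from the log output of the second run
demo_summary = {
    "meta": {},
    "unet": {"psnr_all": 26.6098, "ssim_all": 0.8714, "lpips_all": 0.0119,
             "fid": 19.9014, "kid": 0.0125},
    "tvl1": {"psnr_all": 25.55},
}

for metric, (model, value) in best_per_metric(demo_summary).items():
    print(f"{metric}: {model} ({value})")
```

With a full `results` dict, calling `best_per_metric(results[name])` for each experiment gives a compact text summary alongside the bar charts.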