<a href="https://colab.research.google.com/github/ysb06/dgm-2024-vae-diffusion/blob/main/notebooks/sample.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DiffuseVAE in Google Colab

An example of running the project in Google Colab.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
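
The mount cell above only works inside Colab. As a minimal sketch, the guard below mounts Drive only when the `google.colab` package can be imported, so the same notebook can also be opened locally. The helper name `in_colab` is hypothetical and is not part of the project.

```python
def in_colab() -> bool:
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only present on Colab runtimes)
        return True
    except ImportError:
        return False


if in_colab():
    # Same call as in the cell above; skipped outside Colab.
    from google.colab import drive

    drive.mount("/content/drive")
else:
    print("Not running in Colab; skipping the Drive mount.")
```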


## Git Clone

Issue a Personal Access Token on GitHub, then clone the repository. The token can be reused, so store it somewhere safe.

### How to issue a Personal Access Token

1. In GitHub, open Settings and click Developer settings (at the very bottom)
2. Select Personal access tokens
3. Select Fine-grained tokens or Tokens (classic)
4. Set the name, expiration date, permissions, and so on, then click Generate token
    - For fine-grained tokens, the minimum permission is Contents set to Read and write
    - For classic tokens, the minimum permission is repo only
5. Copy the key and use it

In [2]:
!git clone https://(Github Personal Access Token)@github.com/ysb06/dgm-2024-vae-diffusion.git

Cloning into 'dgm-2024-vae-diffusion'...
remote: Enumerating objects: 330, done.
remote: Counting objects: 100% (142/142), done.
remote: Compressing objects: 100% (100/100), done.
remote: Total 330 (delta 60), reused 96 (delta 32), pack-reused 188 (from 1)
Receiving objects: 100% (330/330), 51.93 MiB | 27.67 MiB/s, done.
Resolving deltas: 100% (125/125), done.
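
Pasting the token straight into the clone command leaves it visible in the saved notebook. One possible alternative, sketched below, prompts for the token with `getpass` and builds the same URL shape as the cell above. The helper names `build_clone_url` and `clone_with_prompt` are hypothetical; the owner and repository defaults are taken from the clone command above.

```python
import subprocess
from getpass import getpass
from urllib.parse import quote


def build_clone_url(
    token: str,
    owner: str = "ysb06",
    repo: str = "dgm-2024-vae-diffusion",
) -> str:
    """Build an HTTPS clone URL that embeds a personal access token."""
    # Percent-encode the token so unusual characters cannot break the URL.
    return f"https://{quote(token, safe='')}@github.com/{owner}/{repo}.git"


def clone_with_prompt() -> None:
    """Ask for the token without echoing it, then run `git clone`."""
    token = getpass("GitHub Personal Access Token: ")
    subprocess.run(["git", "clone", build_clone_url(token)], check=True)
```

Calling `clone_with_prompt()` in a cell asks for the token interactively. Note that git still stores the URL, including the token, in the clone's `.git/config`.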


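## Install the project package

After running the cell below in Google Colab, restart the runtime session. (Use restart only; do not disconnect and delete the runtime, which would discard the cloned repository and the installation.)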

In [3]:
%pip install -e dgm-2024-vae-diffusion

Obtaining file:///content/dgm-2024-vae-diffusion
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting lightning>=2.4.0 (from dgm-2024-vae-diffusion==0.1.0)
  Downloading lightning-2.4.0-py3-none-any.whl.metadata (38 kB)
Collecting hydra-core>=1.3.2 (from dgm-2024-vae-diffusion==0.1.0)
  Downloading hydra_core-1.3.2-py3-none-any.whl.metadata (5.5 kB)
Collecting lmdb>=1.5.1 (from dgm-2024-vae-diffusion==0.1.0)
  Downloading lmdb-1.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.1 kB)
Collecting torch-fidelity>=0.3.0 (from dgm-2024-vae-diffusion==0.1.0)
  Downloading torch_fidelity-0.3.0-py3-none-any.whl.metadata (2.0 kB)
Collecting omegaconf<2.4,>=2.2 (from hydra-core>=1.3.2->dgm-2024-vae-diffusion==0.1.0)
  Downloading omegaconf-2.3.0-py3-none-any.wh
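
After the restart, a quick way to confirm that the editable install is visible is to ask Python whether the project's modules can be located. This is a minimal sketch: `missing_modules` is a hypothetical helper, and `baseline` and `hybrid_vd` are the module names imported later in this notebook.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages are importable in the current session.
print(missing_modules(["baseline", "hybrid_vd"]))
```

If either name is listed, the session was probably not restarted after the install, or the install ran in a different runtime.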

## Example: Running baseline training

### Load the hyperparameters

In [1]:
from hydra import initialize, compose
from omegaconf import DictConfig


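def load_config(
    path: str = "dgm-2024-vae-diffusion/src/baseline/configs",
    name: str = "config",
) -> DictConfig:
    """Compose the Hydra config `name` found in the directory `path`.

    `path` must be a relative path; Hydra's `initialize` rejects absolute ones.
    """
    with initialize(config_path=path, version_base=None):
        config = compose(config_name=name)
    return config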

In [None]:
config = load_config()
print(config)

### Modify the config

In [None]:
# Unlike running in VSCode, the current working directory here is the folder that contains the notebook file
config.dataset.ddpm.data.root = "./dgm-2024-vae-diffusion/datasets"
config.dataset.ddpm.training.vae_chkpt_path = "./dgm-2024-vae-diffusion/outputs/vae.pt"
config.dataset.ddpm.training.results_dir = "./dgm-2024-vae-diffusion/outputs"
config.dataset.vae.data.root = "./dgm-2024-vae-diffusion/datasets"
config.dataset.vae.training.results_dir = "./dgm-2024-vae-diffusion/outputs"

config
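
The paths set above point at a dataset folder and an output folder inside the cloned repository. Whether the training code creates these folders itself is not shown here, so as a precaution they can be created up front. The sketch below uses a hypothetical helper, `ensure_dirs`, built only on the standard library.

```python
import os
from typing import List


def ensure_dirs(*paths: str) -> List[str]:
    """Create each directory if it is missing and return absolute paths."""
    created = []
    for path in paths:
        # exist_ok=True makes the call a no-op for existing directories.
        os.makedirs(path, exist_ok=True)
        created.append(os.path.abspath(path))
    return created
```

For the values used in this notebook, the call would be `ensure_dirs(config.dataset.vae.data.root, config.dataset.vae.training.results_dir)`.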

In [None]:
import baseline.train_ae as vae_trainer
import baseline.train_ddpm as ddpm_trainer

vae_trainer.train(config)
# ddpm_trainer.train(config)

## Running the custom hybrid DiffuseVAE

In [2]:
import os

config_root = os.path.join("dgm-2024-vae-diffusion", "src/hybrid_vd/configs")

In [None]:
# config_root = os.path.join("../..", config_root)

In [3]:
config = load_config(config_root, "train")
print(config)

{'vae': {'model': {'input_res': 32, 'enc_block_str': '32x7,32d2,32t16,16x4,16d2,16t8,8x4,8d2,8t4,4x3,4d4,4t1,1x3', 'enc_channel_str': '32:64,16:128,8:256,4:256,1:512', 'dec_block_str': '1x1,1u4,1t4,4x2,4u2,4t8,8x3,8u2,8t16,16x7,16u2,16t32,32x15', 'dec_channel_str': '32:64,16:128,8:256,4:256,1:512', 'lr': 0.0001, 'alpha': 1.0}}, 'ddpm': {'decoder': {'in_channels': 3, 'model_channels': 128, 'out_channels': 3, 'num_res_blocks': 2, 'attention_resolutions': '16,', 'channel_mult': '1,2,2,2', 'use_checkpoint': False, 'dropout': 0.3, 'num_heads': 8, 'z_dim': 512, 'use_scale_shift_norm': False, 'use_z': False}, 'model': {'beta_1': 0.0001, 'beta_2': 0.02, 'T': 1000}, 'wrapper': {'lr': 0.0002, 'cfd_rate': 0.0, 'n_anneal_steps': 5000, 'loss': 'l2', 'conditional': True, 'grad_clip_val': 1.0, 'z_cond': False}}, 'dataset': {'name': 'cifar10', 'root': './datasets', 'image_size': 32, 'norm': False, 'flip': False}, 'dataloader': {'batch_size': 128, 'num_workers': 2, 'pin_memory': True, 'shuffle': True, 
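
The `ddpm.model` entries in the printed config (`beta_1`, `beta_2`, `T`) look like the endpoints and length of a diffusion noise schedule. The output does not say which schedule the project uses, so the sketch below assumes a linear one, as in the original DDPM formulation, purely to illustrate what the three numbers would mean. The function names are hypothetical.

```python
from typing import List


def linear_beta_schedule(beta_1: float, beta_2: float, T: int) -> List[float]:
    """Linearly interpolate T noise variances from beta_1 to beta_2."""
    if T == 1:
        return [beta_1]
    step = (beta_2 - beta_1) / (T - 1)
    return [beta_1 + step * t for t in range(T)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """Running product of (1 - beta_t): the signal fraction left at step t."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


# Values copied from the printed config above; the linear form is an assumption.
betas = linear_beta_schedule(beta_1=0.0001, beta_2=0.02, T=1000)
alpha_bars = cumulative_alphas(betas)
print(f"beta range: {betas[0]:.4g} to {betas[-1]:.4g}")
print(f"final cumulative alpha: {alpha_bars[-1]:.3g}")
```

Under this assumption the cumulative product shrinks toward zero by the last step, meaning almost none of the original image signal remains after `T` noising steps.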

In [None]:
config.dataset.root = os.path.join("..", config.dataset.root)
config.trainer.default_root_dir = os.path.join("..", config.trainer.default_root_dir)
config.results_dir = os.path.join("..", config.results_dir)
config.ckpt_path = os.path.join("..", config.ckpt_path) if config.ckpt_path is not None else None
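
The cell above repeats the same `os.path.join("..", ...)` pattern for every path and special-cases `None`. A small helper can express that once. This is a sketch with a hypothetical name, `prefix_path`; it also leaves absolute paths untouched, which the original cell does not do explicitly.

```python
import os
from typing import Optional


def prefix_path(path: Optional[str], base: str = "..") -> Optional[str]:
    """Prepend `base` to a relative path; pass None and absolute paths through."""
    if path is None or os.path.isabs(path):
        return path
    # normpath collapses redundant parts such as ".././datasets".
    return os.path.normpath(os.path.join(base, path))
```

With it, each assignment becomes a single call, for example `config.ckpt_path = prefix_path(config.ckpt_path)`.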

In [4]:
config.dataset.root = "/content/drive/MyDrive/Colab Notebooks/data"
config.trainer.default_root_dir = "/content/drive/MyDrive/Colab Notebooks/outputs/DiffuseVAE"
config.results_dir = "/content/drive/MyDrive/Colab Notebooks/outputs/DiffuseVAE"
config.ckpt_path = "/content/drive/MyDrive/Colab Notebooks/outputs/DiffuseVAE/checkpoints/diffuse_vae-epoch=100-loss=0.0000.ckpt"
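
The checkpoint path above is hard-coded, so it will not exist on a fresh Drive. The sketch below falls back to `None` when the file is missing. It assumes that `None` means "no checkpoint", which the `is not None` guard in the preceding cell suggests but which is not confirmed here. The helper name `existing_checkpoint` is hypothetical.

```python
import os
from typing import Optional


def existing_checkpoint(path: Optional[str]) -> Optional[str]:
    """Return `path` if it points to an existing file, otherwise None."""
    if path is not None and os.path.isfile(path):
        return path
    return None
```

Usage would be `config.ckpt_path = existing_checkpoint(config.ckpt_path)` before calling the trainer.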

In [5]:
import hybrid_vd.train as hybrid_trainer

hybrid_trainer.train(config)

INFO: Seed set to 42
INFO:lightning.fabric.utilities.seed:Seed set to 42


Files already downloaded and verified


INFO: GPU available: True (cuda), used: True
INFO:lightning.pytorch.utilities.rank_zero:GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO:lightning.pytorch.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO: HPU available: False, using: 0 HPUs
INFO:lightning.pytorch.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO: You are using a CUDA device ('NVIDIA A100-SXM4-40GB') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
INFO:lightning.pytorch.utilities.rank_zero:You are using a CUDA device ('NVIDIA A100-SXM4-40GB') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for p

Training: |          | 0/? [00:00<?, ?it/s]

INFO: 
Detected KeyboardInterrupt, attempting graceful shutdown ...
INFO:lightning.pytorch.utilities.rank_zero:
Detected KeyboardInterrupt, attempting graceful shutdown ...


NameError: name 'exit' is not defined