<a href="https://colab.research.google.com/github/PEBpung/CNN_wandb/blob/master/WandB_pytorch_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🚀 Install, Import, and Log In

In [1]:
import numpy as np
import torch
import torch.nn as nn
from torchvision import datasets
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
from tqdm.notebook import tqdm

device = "cuda" if torch.cuda.is_available() else "cpu"

### 0️⃣ Step 0: Install W&B
To use W&B in Colab, you first need to install the `wandb` module.

In [2]:
%%capture
!pip install wandb --upgrade

### 1️⃣ Step 1: Log in to W&B

You need to log in before you can use the W&B web service.

In [3]:
import wandb

wandb.login()

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: asdqwdasdqwdzxcxzv (use `wandb login --relogin` to force relogin)


True

### 2️⃣ Step 2: Set up the config, then define `wandb.init`
Before training the model, we need to prepare three things:  
1. Set up the hyperparameter config
2. Define the dataloader
3. Define the model

### 1) Set up the config

In [15]:
config = {
    'epochs': 5,
    'classes': 10,
    'batch_size': 128,
    'kernels': [16, 32],
    'weight_decay': 0.0005,
    'learning_rate': 1e-3,
    'dataset': 'MNIST',
    'architecture': 'CNN',
    'seed': 42
}
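
The training code further down reads hyperparameters as attributes (`config.epochs`), which a plain dict does not support. Inside `run`, the dict is replaced by `wandb.config`, which does. For quick local checks without a W&B run, one possible stand-in is `types.SimpleNamespace` from the standard library. The names `example_config` and `local_config` below are hypothetical and not part of the notebook.

```python
from types import SimpleNamespace

# A subset of the notebook's config, repeated here so the sketch is self-contained
example_config = {
    'epochs': 5,
    'batch_size': 128,
    'kernels': [16, 32],
    'learning_rate': 1e-3,
}

# Hypothetical stand-in for wandb.config: exposes dict keys as attributes
local_config = SimpleNamespace(**example_config)

print(local_config.epochs)        # attribute access, as used in train()
print(local_config.kernels[-1])   # channel count of the last conv layer
```

This only mimics attribute access. It does not sync anything to W&B.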

### 2) Define the dataloader

In [16]:
def make_loader(batch_size, train=True):
    # Download MNIST if needed and convert each image to a tensor
    full_dataset = datasets.MNIST(root='./data/MNIST', train=train,
                                  download=True, transform=transforms.ToTensor())

    # Shuffle only the training split; evaluation does not depend on order
    loader = DataLoader(dataset=full_dataset,
                        batch_size=batch_size,
                        shuffle=train,
                        pin_memory=True, num_workers=2)
    return loader
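
Because `DataLoader` keeps the final partial batch by default (`drop_last=False`), the number of batches per epoch is the dataset size divided by the batch size, rounded up. The sketch below works through that arithmetic for MNIST's 60,000 training and 10,000 test images with the batch size of 128 from the config. The helper `batches_per_epoch` is hypothetical and exists only for illustration.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Return (number of batches, size of the last batch), assuming drop_last=False."""
    n_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (n_batches - 1) * batch_size
    return n_batches, last_batch

train_batches = batches_per_epoch(60000, 128)  # MNIST training split
test_batches = batches_per_epoch(10000, 128)   # MNIST test split
print(train_batches)  # (469, 96)
print(test_batches)   # (79, 16)
```

The first number is what `len(loader)` returns, which the training loop uses as the divisor when it averages the per-batch losses.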

### 3) Define the model

In [28]:
class ConvNet(nn.Module):
    def __init__(self, kernels, classes=10):
        super(ConvNet, self).__init__()
        
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, kernels[0], kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(kernels[0], kernels[1], kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Linear(7 * 7 * kernels[-1], classes)
        
    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out
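
The `7 * 7` in the final linear layer follows from the layer settings. A convolution or pooling layer maps a side length `n` to `floor((n + 2 * padding - kernel_size) / stride) + 1`. The sketch below applies that formula to a 28x28 MNIST image. The helper `out_size` is hypothetical and only does the arithmetic, with no PyTorch involved.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer along one dimension."""
    return (n + 2 * padding - kernel_size) // stride + 1

kernels = [16, 32]   # channel counts, as in the config
size = 28            # MNIST images are 28x28

for _ in range(2):   # layer1 and layer2 use the same conv/pool settings
    size = out_size(size, kernel_size=5, stride=1, padding=2)  # conv keeps the size
    size = out_size(size, kernel_size=2, stride=2)             # max-pool halves it

in_features = size * size * kernels[-1]
print(size, in_features)  # 7 1568
```

If the input resolution or the pooling settings change, the `7 * 7` factor in `self.fc` has to change with them.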

### 3️⃣ Step 3: Track gradients with `wandb.watch` and `wandb.log`
- **wandb.watch** collects gradient and topology information from the model so that it can be visualized.
- **wandb.log** sends whatever values you want to visualize to W&B.

Together, these two calls let you visualize gradients and parameters.   
Call wandb.watch at the start of `train`, before the training loop begins.  
Call wandb.log right before printing the training log, passing it the metric and the epoch.
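
To get a feel for the quantities involved, the sketch below computes per-parameter gradient norms by hand after a single backward pass. It does not call W&B, and it is not how `wandb.watch` is implemented: with `log="all"`, W&B records histograms of gradients and parameters rather than single norms. The tiny `nn.Linear` model and the random data are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

toy_model = nn.Linear(4, 2)          # hypothetical stand-in for ConvNet
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(8, 4)           # 8 random samples with 4 features each
labels = torch.randint(0, 2, (8,))   # random class labels in {0, 1}

loss = criterion(toy_model(inputs), labels)
loss.backward()                      # populates .grad on every parameter

# One scalar per parameter tensor: the L2 norm of its gradient
grad_norms = {name: p.grad.norm().item()
              for name, p in toy_model.named_parameters()}
print(grad_norms)
```

Values like these could be passed to `wandb.log` as an ordinary metrics dict if a scalar summary is preferred over histograms.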

In [29]:
def train(model, loader, criterion, optimizer, config):
    # Ask W&B to record gradients and parameters every 10 batches
    wandb.watch(model, criterion, log="all", log_freq=10)

    model.train()
    example_ct = 0  # number of examples seen so far
    for epoch in tqdm(range(config.epochs)):
        cumu_loss = 0
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)

            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)
            cumu_loss += loss.item()

            # Backward pass and parameter update
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            example_ct += len(images)

        # Mean of the per-batch losses for this epoch
        avg_loss = cumu_loss / len(loader)
        wandb.log({"loss": avg_loss}, step=epoch)
        print(f"TRAIN: EPOCH {epoch + 1:04d} / {config.epochs:04d} | Epoch LOSS {avg_loss:.4f}")

### 4️⃣ Optional Step 4: Save the model with `wandb.save`

In [30]:
def test(model, test_loader):
    model.eval()

    # Evaluate without tracking gradients
    with torch.no_grad():
        correct = 0
        total = len(test_loader.dataset)
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)

            outputs = model(images)
            # The index of the largest logit is the predicted class
            pred = outputs.max(1, keepdim=True)[1]
            correct += pred.eq(labels.view_as(pred)).sum().item()

        print(f"Accuracy of the model on the {total} " +
              f"test images: {100 * correct / total}%")

        wandb.log({"test_accuracy": correct / total})

    # Save the model in the exchangeable ONNX format
    torch.onnx.export(model, images, "model.onnx")
    wandb.save("model.onnx")
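
The accuracy count in `test` relies on `outputs.max(1, keepdim=True)[1]`, which returns the index of the largest logit in each row. The sketch below runs that step on a small hand-written batch and compares it with the equivalent `argmax` form. The logits and labels are made up for illustration.

```python
import torch

# Hypothetical logits for 4 samples and 3 classes
outputs = torch.tensor([[0.1, 2.0, 0.3],
                        [1.5, 0.2, 0.1],
                        [0.0, 0.1, 3.0],
                        [0.9, 0.8, 0.7]])
labels = torch.tensor([1, 0, 2, 1])

pred_keepdim = outputs.max(1, keepdim=True)[1]  # shape (4, 1), as in test()
pred_argmax = outputs.argmax(dim=1)             # shape (4,), same indices

correct_a = pred_keepdim.eq(labels.view_as(pred_keepdim)).sum().item()
correct_b = (pred_argmax == labels).sum().item()

print(correct_a, correct_b)        # 3 3
print(correct_a / len(labels))     # 0.75
```

Only the last sample is misclassified: its largest logit is at index 0 while its label is 1.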

## 🏃‍♀️ Run W&B

In [31]:
def run(config=None):
    wandb.init(project='test-pytorch', entity='pebpung', config=config)
      
    config = wandb.config

    train_loader = make_loader(batch_size=config.batch_size, train=True)
    test_loader = make_loader(batch_size=config.batch_size, train=False)

    model = ConvNet(config.kernels, config.classes).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=config.learning_rate)

    train(model, train_loader, criterion, optimizer, config)
    test(model, test_loader)
    return model
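
The config carries a `seed` entry, but none of the code shown here applies it. If reproducible runs are wanted, one common approach is to seed the Python, NumPy, and PyTorch random number generators before building the loaders and the model. The helper `set_seed` below is hypothetical and not part of the notebook, and even with it, results may still vary across hardware and library versions.

```python
import random

import numpy as np
import torch

def set_seed(seed):
    """Hypothetical helper: seed the Python, NumPy, and PyTorch RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

# Seeding twice with the same value should reproduce the same random draw
set_seed(42)
first_draw = torch.rand(3)
set_seed(42)
second_draw = torch.rand(3)

print(torch.equal(first_draw, second_draw))  # True
```

In this notebook, a call such as `set_seed(config.seed)` would fit at the top of `run`, right after `config = wandb.config`.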

In [32]:
model = run(config)


Run history:
epoch          ▁▁▁▁▁▁▁▁▃▃▃▃▃▃▃▃▅▅▅▅▅▅▅▅▆▆▆▆▆▆▆▆████████
loss           █▃▃▂▃▂▃▄▂▂▂▂▂▂▂▂▁▂▁▁▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
test_accuracy  ▁

Run summary:
epoch          4.0
loss           0.03358
test_accuracy  0.9886



TRAIN: EPOCH 0001 / 0005 | Epoch LOSS 0.2709
TRAIN: EPOCH 0002 / 0005 | Epoch LOSS 0.0666
TRAIN: EPOCH 0003 / 0005 | Epoch LOSS 0.0487
TRAIN: EPOCH 0004 / 0005 | Epoch LOSS 0.0387
TRAIN: EPOCH 0005 / 0005 | Epoch LOSS 0.0326
Accuracy of the model on the 10000 test images: 99.05%
