# Data Augmentation

## Chapter Outline
* [Loading the Data](#Loading-the-File-Paths)
* [Building and Training the Model](#Building-and-Training-the-Model)
* [Brightness](#Brightness)
* [Contrast](#Contrast)
* [Hue](#Hue)
* [Saturation](#Saturation)
* [Flip](#Flip)
* [Rotation](#Rotation)
* [Crop](#Crop:-RandomResizedCrop)
* [Zoom](#Zoom:-RandomAffine)
* [RandomHeight](#RandomHeight)
* [RandomWidth](#RandomWidth)
* [Random Translation](#Random-Translation:-RandomAffine)
* [Putting It All Together](#Putting-It-All-Together)
* [Speeding Up Data Loading](#Speeding-Up-Data-Loading)


## Importing Packages


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2
import glob  # find paths that match a pattern
from PIL import Image

import torch
import torchvision.transforms as T

## Loading the File Paths

In [None]:
# Download and unzip the data
!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/cat_dog.zip
!unzip -q cat_dog.zip

In [None]:
print(glob.glob('*'))  # list everything in the current folder

In [None]:
print(glob.glob('cat_dog/*'))  # list everything directly inside cat_dog

In [None]:
print(glob.glob('cat_dog/*/*')[:5])  # first five entries two levels below cat_dog

In [None]:
# Build a dictionary that holds the file paths and labels
data_dict = {'file_name': [], 'type': []}
# Use only the .jpg files in the train folder
for i in glob.glob('cat_dog/train/*.jpg'):
    # i looks like cat_dog/train/cat.11996.jpg
    data_dict['file_name'].append(i)
    # Take the first three characters of the file name to determine the class
    animal = i.split('/')[-1][:3]
    if animal == 'cat':
        data_dict['type'].append(0)
    elif animal == 'dog':
        data_dict['type'].append(1)
    else:
        print(i)
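
The slicing above relies on every class name being exactly three characters long. As a minimal sketch, assuming the same `<class>.<id>.jpg` naming, the label could also be derived from the text before the first dot. `label_from_path` and `CLASS_TO_LABEL` are hypothetical names introduced only for this illustration:

```python
from pathlib import PurePosixPath

# Hypothetical mapping, mirroring the 0 = cat / 1 = dog convention used above
CLASS_TO_LABEL = {'cat': 0, 'dog': 1}


def label_from_path(path):
    """Return the integer label for a file named like 'cat.11996.jpg', or None."""
    file_name = PurePosixPath(path).name        # e.g. 'cat.11996.jpg'
    class_name = file_name.split('.')[0]        # text before the first dot
    return CLASS_TO_LABEL.get(class_name)       # None for unexpected names


print(label_from_path('cat_dog/train/cat.11996.jpg'))
print(label_from_path('cat_dog/train/dog.42.jpg'))
```

Returning `None` for unknown names plays the same role as the `print(i)` branch above: unexpected files are surfaced instead of being silently mislabeled.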

In [None]:
# Convert the dictionary into a DataFrame
datalist = pd.DataFrame(data_dict)

In [None]:
datalist.head()

## Building a Dataset to Load the Data

In [None]:
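class ImageDataset(torch.utils.data.Dataset):
    def __init__(self, df, transform):
        self.df = df                # columns: file_name, type
        self.transform = transform  # applied to every image when it is read

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        img_path = self.df.iloc[idx, 0]   # column 0: file path
        img = Image.open(img_path)
        img = self.transform(img)         # random transforms are re-sampled on each access
        label = self.df.iloc[idx, 1]      # column 1: label (0 = cat, 1 = dog)
        return img, label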

transform = T.Compose([
    T.Resize((256, 256)),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)

In [None]:
def plot_dataset(dataset):
    plt.figure(figsize=(13, 7))
    for i in range(8):
        img, label = dataset[i]
        plt.subplot(2, 4, i+1)
        plt.imshow(img.permute(1, 2, 0))
        plt.title(f"Label: {label}")
    plt.show()

plot_dataset(dataset)

---
# Data Augmentation

## torchvision.transforms

- Pixel values: brightness, contrast, hue, saturation, quality
- Shape: crop, flip, rotation, zoom, height, width

---

# Augmentation: ColorJitter



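* ## Brightness

 * brightness: jitter strength. A non-negative float b draws the factor uniformly from [max(0, 1 - b), 1 + b]; a (min, max) pair sets the range directly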

In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.ColorJitter(brightness=0.4),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
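
As a rough sketch of what the `brightness` argument means: to my understanding of the torchvision documentation, a float `b` makes `ColorJitter` draw a factor uniformly from `[max(0, 1 - b), 1 + b]` and multiply the pixel values by it, clipping to the valid range. The helper names below are hypothetical, and the code works on plain Python lists rather than real images:

```python
import random


def jitter_range(value):
    """Range of factors for a float argument: [max(0, 1 - value), 1 + value]."""
    return (max(0.0, 1.0 - value), 1.0 + value)


def adjust_brightness(pixels, factor):
    """Scale pixel values in [0, 1] by factor and clip the result to [0, 1]."""
    return [min(1.0, max(0.0, p * factor)) for p in pixels]


low, high = jitter_range(0.4)           # roughly (0.6, 1.4)
factor = random.uniform(low, high)      # one random draw per image
print(adjust_brightness([0.1, 0.5, 0.9], factor))
```

A factor below 1 darkens the image, a factor above 1 brightens it, and a factor of exactly 1 leaves it unchanged.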

[(back...)](#Chapter-Outline)

* ## Contrast

 * contrast=0.3: the factor is drawn uniformly from [1 - 0.3, 1 + 0.3] = [0.7, 1.3]

In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.ColorJitter(contrast=0.3),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
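
Contrast differs from brightness in that it scales each pixel's distance from the image's mean gray level instead of its distance from black. A minimal sketch on a list of grayscale values, assuming the blend formula `factor * pixel + (1 - factor) * mean` (my reading of how torchvision implements it; `adjust_contrast` here is a stand-in, not the library function):

```python
def adjust_contrast(pixels, factor):
    """Blend each grayscale pixel in [0, 1] with the mean of all pixels."""
    mean = sum(pixels) / len(pixels)
    blended = [factor * p + (1.0 - factor) * mean for p in pixels]
    return [min(1.0, max(0.0, v)) for v in blended]   # clip to [0, 1]


pixels = [0.2, 0.5, 0.8]
print(adjust_contrast(pixels, 0.7))   # values pulled toward the mean
print(adjust_contrast(pixels, 1.3))   # values pushed away from the mean
```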

* ## Hue

 * hue
    * float: 0 <= hue <= 0.5; the shift is drawn uniformly from [-hue, hue]
    * (min, max): -0.5 <= min <= max <= 0.5

In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.ColorJitter(hue=0.4),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
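
Hue jitter can be pictured as rotating a color around the HSV color wheel. The sketch below uses the standard-library `colorsys` module on a single RGB pixel. `shift_hue` is a hypothetical helper, and the wrap-around `(h + factor) % 1.0` is an assumption about how the shift is applied:

```python
import colorsys


def shift_hue(rgb, factor):
    """Shift the hue of one RGB pixel (channels in [0, 1]) by factor in [-0.5, 0.5]."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h + factor) % 1.0              # hue is cyclic, so wrap around
    return colorsys.hsv_to_rgb(h, s, v)


red = (1.0, 0.0, 0.0)
print(shift_hue(red, 0.0))    # unchanged
print(shift_hue(red, 0.5))    # opposite side of the color wheel
```

Because a shift of 0.5 already reaches the opposite side of the wheel, large `hue` values such as 0.4 can produce very unnatural colors.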

[(back...)](#Chapter-Outline)

* ## Saturation

 * saturation:
    * float: the factor is drawn from [max(0, 1 - saturation), 1 + saturation]
    * (min, max): the factor is drawn from [min, max]

In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.ColorJitter(saturation=0.5),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
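
Saturation works much like contrast, except that each pixel is blended with its own grayscale value instead of the image mean. A sketch under that assumption, using the common luma weights 0.299, 0.587 and 0.114 (`adjust_saturation` is a stand-in name, not the library function):

```python
def adjust_saturation(rgb, factor):
    """Blend one RGB pixel (channels in [0, 1]) with its grayscale value."""
    r, g, b = rgb
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    blended = [factor * c + (1.0 - factor) * gray for c in rgb]
    return tuple(min(1.0, max(0.0, c)) for c in blended)


orange = (1.0, 0.5, 0.0)
print(adjust_saturation(orange, 0.0))   # fully desaturated: all channels equal
print(adjust_saturation(orange, 1.0))   # unchanged
print(adjust_saturation(orange, 1.5))   # more vivid, clipped to [0, 1]
```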

[(back...)](#Chapter-Outline)

* ## Flip

 * RandomHorizontalFlip: flips the image horizontally with probability p
 * RandomVerticalFlip: flips the image vertically with probability p


In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.RandomHorizontalFlip(p=0.5),
    T.RandomVerticalFlip(p=0.5),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
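
Both flips only reorder pixels: a horizontal flip reverses every row, a vertical flip reverses the order of the rows, and the argument is the probability of applying the flip. A small sketch on a nested list standing in for an image (the helper names are hypothetical):

```python
import random


def hflip(image):
    """Mirror left-right: reverse the pixels within each row."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror top-bottom: reverse the order of the rows."""
    return image[::-1]


def random_flip(image, p=0.5, rng=random):
    """Apply each flip independently with probability p."""
    if rng.random() < p:
        image = hflip(image)
    if rng.random() < p:
        image = vflip(image)
    return image


image = [[1, 2, 3],
         [4, 5, 6]]
print(hflip(image))   # [[3, 2, 1], [6, 5, 4]]
print(vflip(image))   # [[4, 5, 6], [1, 2, 3]]
print(random_flip(image))
```

Vertical flips are worth a second look for photos of animals, since upside-down cats and dogs rarely appear in real data.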

[(back...)](#Chapter-Outline)

* ## Rotation

 * degrees: a single number d means the range (-d, +d); a (min, max) pair sets the range directly


In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.RandomRotation(degrees=(-20, 30)),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
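
The two accepted forms of `degrees` can be summarized with a small sketch. `resolve_degrees` is a hypothetical helper that mimics the documented behavior: a single number `d` stands for the range `(-d, +d)`, while a pair is taken as `(min, max)`:

```python
import random


def resolve_degrees(degrees):
    """Turn a number d into (-d, +d); pass a (min, max) pair through unchanged."""
    if isinstance(degrees, (int, float)):
        if degrees < 0:
            raise ValueError('a single number must be non-negative')
        return (-float(degrees), float(degrees))
    low, high = degrees
    return (float(low), float(high))


print(resolve_degrees(15))          # (-15.0, 15.0)
print(resolve_degrees((-20, 30)))   # (-20.0, 30.0)

low, high = resolve_degrees((-20, 30))
angle = random.uniform(low, high)   # one angle is drawn per image
print(angle)
```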

[(back...)](#Chapter-Outline)

* ## Crop: RandomResizedCrop

 * size: output size
 * scale: range of the crop area, as a fraction of the original image area


In [None]:
transform = T.Compose([
    T.RandomResizedCrop(size=(256, 256),
                        scale=(0.5, 1.0)),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
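
Note that `scale` refers to the fraction of the original image area that the crop covers, not to the side length. As a sketch of the size computation, assuming the crop area is `scale * width * height` and `ratio` is the crop's width divided by its height (`crop_size` is a hypothetical helper; the real transform also samples the aspect ratio and the crop position at random):

```python
import math


def crop_size(width, height, scale, ratio=1.0):
    """Width and height of a crop covering `scale` of the image area."""
    target_area = scale * width * height
    crop_w = round(math.sqrt(target_area * ratio))
    crop_h = round(math.sqrt(target_area / ratio))
    return crop_w, crop_h


print(crop_size(256, 256, scale=0.5))   # (181, 181): half the area, about 71% of each side
print(crop_size(256, 256, scale=1.0))   # (256, 256): the whole image
```

The crop is then resized to `size`, so with `scale=(0.5, 1.0)` the subject appears at most about 1.4 times larger than in the original.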

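[(back...)](#Chapter-Outline)

* ## Zoom: RandomAffine

 * scale: range of the zoom factor; values below 1 shrink the content, values above 1 enlarge it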

In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.RandomAffine(degrees=0,
                   scale=(0.5, 1.5)),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)

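[(back...)](#Chapter-Outline)

* ## Random Translation: RandomAffine
 * translate=(a, b): maximum horizontal and vertical shift, as fractions of the image width and height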

In [None]:
transform = T.Compose([
    T.Resize((256, 256)),
    T.RandomAffine(
        degrees=0,
        translate=(0.1, 0.2),  # shift up to ±10% of the width and ±20% of the height
    ),
    T.ToTensor(),
])
dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)
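
`translate=(a, b)` is given as fractions of the image size. A sketch of how that maps to pixel offsets, assuming the horizontal shift is drawn from `[-a * width, a * width]` and the vertical shift from `[-b * height, b * height]` (`max_shift` and `sample_shift` are hypothetical helpers):

```python
import random


def max_shift(size, translate):
    """Largest absolute shift in pixels for an image of size (width, height)."""
    width, height = size
    a, b = translate
    return (a * width, b * height)


def sample_shift(size, translate, rng=random):
    """Draw one (dx, dy) offset, rounded to whole pixels."""
    max_dx, max_dy = max_shift(size, translate)
    dx = round(rng.uniform(-max_dx, max_dx))
    dy = round(rng.uniform(-max_dy, max_dy))
    return dx, dy


print(max_shift((256, 256), (0.1, 0.2)))    # about (25.6, 51.2) pixels
print(sample_shift((256, 256), (0.1, 0.2)))
```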

[(back...)](#Chapter-Outline)

# Putting It All Together

In [None]:
transform = T.Compose([
    T.ColorJitter(
        brightness=0.4,
        contrast=0.3,
        hue=0.4,
        saturation=0.5,
    ),
    T.RandomHorizontalFlip(0.5),
    T.RandomVerticalFlip(0.5),
    T.RandomAffine(
        degrees=15,
        scale=(0.5, 1.5),
        translate=(0.1, 0.2),  # shift up to ±10% of the width and ±20% of the height
    ),
    T.RandomResizedCrop(
        size=(256, 256),
        scale=(0.5, 1.0)
    ),
    T.ToTensor(),
])

dataset = ImageDataset(datalist, transform)
plot_dataset(dataset)

* ## Speeding Up Data Loading

In [None]:
from tqdm.auto import tqdm

In [None]:
subset = torch.utils.data.Subset(dataset, list(range(1000)))

batch_size = 64

dataloader = torch.utils.data.DataLoader(
    subset,
    batch_size=batch_size,
)

dataloader_fast = torch.utils.data.DataLoader(
    subset,
    batch_size=batch_size,
    num_workers=2,  # num_workers > 0 can be used on non-Windows operating systems
)

- Dataset-level optimizations for data loading are covered in 1_Custom_dataset.ipynb in DL Part4

In [None]:
len(subset)

In [None]:
# Measure the time for one full pass over a dataloader
import time
def calculate_time(dataloader):
    start_time = time.time()
    for x, y in tqdm(dataloader):
        pass
    print(time.time()-start_time)

In [None]:
calculate_time(dataloader)
calculate_time(dataloader_fast)
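
With 1000 samples and a batch size of 64, each pass yields 16 batches, the last one holding only 40 samples. The sketch below checks that arithmetic and shows a variant of the timing helper that returns the elapsed time instead of printing it. `num_batches` and `time_iteration` are hypothetical names, and `time.perf_counter` is used because it is intended for measuring short durations:

```python
import math
import time


def num_batches(num_samples, batch_size):
    """Batches per pass when the last, smaller batch is kept."""
    return math.ceil(num_samples / batch_size)


def time_iteration(iterable):
    """Iterate once over any iterable; return (elapsed seconds, item count)."""
    start = time.perf_counter()
    count = 0
    for _ in iterable:
        count += 1
    return time.perf_counter() - start, count


print(num_batches(1000, 64))                 # 16
elapsed, count = time_iteration(range(16))   # a DataLoader could be passed instead
print(f'{count} items in {elapsed:.6f} s')
```

Returning the elapsed time makes it easy to compare the two loaders directly, for example by dividing one result by the other.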

# Supplement: torchvision transforms v2

In [None]:
import torchvision.transforms.v2 as T2
# torchvision.transforms.v2
transform = T2.Compose([
    T2.ColorJitter(
        brightness=0.4,
        contrast=0.3,
        hue=0.4,
        saturation=0.5,
    ),
    T2.RandomHorizontalFlip(0.5),
    T2.RandomVerticalFlip(0.5),
    T2.RandomAffine(
        degrees=15,
        scale=(0.5, 1.5),
        translate=(0.1, 0.2),  # shift up to ±10% of the width and ±20% of the height
    ),
    T2.RandomResizedCrop(
        size=(256, 256),
        scale=(0.5, 1.0)
    ),
    T2.ToImage(),
    T2.ToDtype(torch.float32, scale=True)
])