
In [1]:
!pip install d2l==1.0.0-alpha1.post0 --quiet

## 14.1 Image Augmentation

### 14.1.1 Common Image Augmentation Methods

* `torchvision`: This package consists of popular datasets, model architectures, and common image transformations for computer vision.
* `torchvision.transforms`: Contains common image transformations.

* Class `torchvision.transforms.RandomHorizontalFlip(p=0.5)`: Horizontally flips the given image with probability `p`.
* `p(float)`: Probability of the image being flipped. Default value is 0.5.

In [2]:
import torchvision.transforms as T

aug = T.RandomHorizontalFlip()


* `forward(img)`: `img(PIL Image or Tensor)`

* Class `torchvision.transforms.RandomVerticalFlip(p=0.5)`: Vertically flips the given image with probability `p`.
* `p(float)`: Probability of the image being flipped. Default value is 0.5.

In [3]:
aug = T.RandomVerticalFlip()

* `forward(img)`: `img(PIL Image or Tensor)`

* Class `torchvision.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(3/4, 4/3))`: Crops a random portion of the image and resizes it to a given size.
* A crop of the original image is made. The crop has a random area ($H \times W$) and a random aspect ratio ($W / H$). This crop is finally resized to the given size.
* `size(int or sequence)`: Expected output size of the crop. An `int` gives a square output `(size, size)`.
* `scale(tuple of floats)`: Specifies the lower and upper bounds for the random area of the crop, before resizing. The scale is defined with respect to the area of the original image.
* `ratio(tuple of floats)`: Lower and upper bounds for the random aspect ratio (width / height) of the crop, before resizing.

In [4]:
# Crop 10% to 100% of the image area with aspect ratio in [0.5, 2], then resize to 200 x 200
aug = T.RandomResizedCrop((200, 200), scale=(0.1, 1), ratio=(0.5, 2))

* `forward(img)`: `img(PIL Image or Tensor)`

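* Class `torchvision.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)`: Randomly changes the brightness, contrast, saturation, and hue of an image.
* `brightness(float or tuple of floats)`: How much to jitter brightness. For a float value, the brightness factor is chosen uniformly from `[max(0, 1 - brightness), 1 + brightness]`.
* `contrast(float or tuple of floats)`: How much to jitter contrast. For a float value, the contrast factor is chosen uniformly from `[max(0, 1 - contrast), 1 + contrast]`.
* `saturation(float or tuple of floats)`: How much to jitter saturation. For a float value, the saturation factor is chosen uniformly from `[max(0, 1 - saturation), 1 + saturation]`.
* `hue(float or tuple of floats)`: How much to jitter hue. For a float value, the hue factor is chosen uniformly from `[-hue, hue]`, with `0 <= hue <= 0.5`.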

In [5]:
# Jitter brightness only: the factor is drawn from [0.5, 1.5]
aug = T.ColorJitter(brightness=0.5, contrast=0, saturation=0, hue=0)

* `forward(img)`: `img(PIL Image or Tensor)`

### 14.1.2 Training with Image Augmentation

* `torchvision.datasets`: This package contains many built-in datasets as well as utility classes for building your own datasets.
* Class `torchvision.datasets.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)`: The CIFAR-10 dataset.
* `root(string)`: Root directory of the dataset where the directory `cifar-10-batches-py` exists or will be saved to if `download` is set to True.
* `train(bool, optional)`: If True, creates the dataset from the training set, otherwise from the test set.
* `transform(optional)`: A function/transform that takes in a PIL image and returns a transformed version.
* `download(bool, optional)`: If True, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again.
* `__getitem__(index) -> Tuple[Any, Any]`: Returns (image, target) where target is the index of the target class.

* Class `torchvision.transforms.ToTensor`: Converts a PIL image or ndarray to a tensor and scales the values accordingly.
* Converts a PIL image or numpy.ndarray ($H \times W \times C$) in the range [0, 255] to a torch.FloatTensor of shape ($C \times H \times W$) in the range [0.0, 1.0].

In [6]:
aug = T.ToTensor()

* Class `torchvision.transforms.Compose(transforms)`: Compose several transforms together.

In [7]:
import torch
aug = T.Compose([
    T.CenterCrop(10),
    T.PILToTensor(),
    T.ConvertImageDtype(torch.float)
])

* Class `torch.utils.data.DataLoader(dataset, batch_size, shuffle, num_workers)`: Data loader. Combines a dataset and a sampler, and provides an iterable over the given dataset.

* **dataset**: dataset from which to load the data.
* **batch_size (int, optional)**: How many samples per batch to load (default=1).
* **shuffle (bool, optional)**: Set to `True` to have the data reshuffled at every epoch (default=`False`).
* **num_workers (int, optional)**: How many subprocesses to use for data loading. `0` means that the data will be loaded in the main process (default=`0`).
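
The parameters above can be tried on a small in-memory dataset. This is a minimal sketch, assuming random tensors wrapped in a `TensorDataset` rather than real images; the sizes are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten synthetic 3x32x32 "images" with integer labels 0..9.
features = torch.rand(10, 3, 32, 32)
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# num_workers=0 keeps loading in the main process, which is the simplest setup.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

batch_sizes = [X.shape[0] for X, y in loader]
print(batch_sizes)  # [4, 4, 2]
```

The last batch is smaller because 10 is not a multiple of 4. With `shuffle=True`, each pass over `loader` visits the samples in a new random order.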

* Class `torch.nn.DataParallel(module, device_ids=None)`: Implements data parallelism at the module level.
* **module**: module to be parallelized.
* **device_ids (list of int or torch.device)**: CUDA devices (default: all devices).
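
A wrapped module is called exactly like the original one. The sketch below assumes a toy `nn.Linear` layer in place of a real network and falls back to the CPU when no GPU is visible, in which case the wrapper simply forwards the call to the underlying module.

```python
import torch
from torch import nn

# Use every visible GPU; with no GPU the wrapper simply calls the module.
devices = list(range(torch.cuda.device_count()))
device = torch.device(f"cuda:{devices[0]}" if devices else "cpu")

net = nn.Linear(4, 2)
# device_ids=None would also select all GPUs; it is spelled out here for clarity.
net = nn.DataParallel(net, device_ids=devices or None).to(device)

X = torch.rand(8, 4, device=device)  # the batch is split along dim 0 across GPUs
out = net(X)
print(out.shape)  # torch.Size([8, 2])
```

The original layer stays accessible as `net.module`, which is useful when saving weights or inspecting parameters.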