## PyTorch API & Library
- https://pytorch.org/vision/stable/

![pytorch_api_library](img/pytorch_api_library.png)

## TorchVision Package Reference

### 1) torchvision.datasets
- Face
    - CelebA (face dataset)
    

- Image Classification
    - CIFAR (object dataset)
    - ImageNet
    - MNIST
        - EMNIST (letters + digits)
        - Fashion-MNIST
        - KMNIST (classical Japanese characters)
        - QMNIST
    - LSUN (Scene classification)
    

- Segmentation
    - Cityscapes (urban street-scene dataset, for segmentation)
    

- Object Detection
    - COCO, VOC (Captioning and Detection)
    

- Image Description
    - Flickr
    

- Etc
    - FakeData(Random Noise Data)
    - HMDB51 (action classification dataset)
    - Kinetics-400 / Omniglot / PhotoTour / Places365
    - SBD / SBU / STL10 / SVHN / UCF101 / USPS

<hr>

### 2) torchvision.io
- Performs I/O operations for reading and writing videos and images.

- Video (function-based)
    - read_video, read_video_timestamps, write_video


- Fine-grained video API (class-based)
    - Class : VideoReader


- Image
    - read_image, decode_image, encode_jpeg, write_jpeg, encode_png, write_png ...

<hr>

### 3) torchvision.models
- import torchvision.models as models
    - e.g. mobilenet_v3_large = models.mobilenet_v3_large()
    - e.g. inception = models.inception_v3()
    
    
#### Classification
- AlexNet, VGG, ResNet, SqueezeNet, DenseNet
- Inception v3, GoogLeNet, ShuffleNet v2
- MobileNetV2, MobileNetV3
- ResNeXt, Wide ResNet
- MNASNet


#### Quantized Models
- INT8 quantized models
- e.g. resnet50 = models.quantization.resnet50()
- e.g. shufflenet_v2_x1_5 = models.quantization.shufflenet_v2_x1_5()


#### Semantic Segmentation
- FCN ResNet50, ResNet101
- DeepLabV3 ResNet50, ResNet101, MobileNetV3-Large
- LR-ASPP MobileNetV3-Large


#### Object Detection
- Faster R-CNN ResNet-50 FPN, Faster R-CNN MobileNetV3-Large FPN, Faster R-CNN MobileNetV3-Large 320 FPN
- RetinaNet ResNet-50 FPN, Mask R-CNN ResNet-50 FPN


#### Keypoint R-CNN
- Keypoint R-CNN ResNet-50 FPN


#### Video classification
- ResNet 3D 18, ResNet MC 18, ResNet (2+1)D


<hr>

### 4) torchvision.ops
- Implements operators specific to computer vision.

- API Method (torchvision.ops.)
    - nms, batched_nms
    - remove_small_boxes, clip_boxes_to_image
    - box_convert : converts boxes between formats (xyxy, xywh, cxcywh)
    - box_area, box_iou, roi_align, ps_roi_align
    - roi_pool, ps_roi_pool
    - sigmoid_focal_loss

<hr>

### 5) torchvision.transforms
- class : torchvision.transforms.Compose(transforms)
- transforms (classes)
    - CenterCrop(), RandomCrop()
    - ToTensor()
    - Normalize(), Resize(), Scale() (deprecated in favor of Resize()), Pad() (padding)
    - RandomHorizontalFlip(), RandomVerticalFlip(), RandomPerspective()
    - RandomAffine(), RandomRotation()
    - ColorJitter(), RandomErasing(), GaussianBlur()
    
- Final example
    - 1) transforms.Compose([transforms.CenterCrop(10), transforms.ToTensor()])
    - 2) transforms = torch.nn.Sequential(transforms.CenterCrop(10), transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)))
    
<hr>


### 6) torchvision.utils
- torchvision.utils.make_grid : Make a grid of images.
- torchvision.utils.save_image : Save a given Tensor into an image file
- torchvision.utils.draw_bounding_boxes : Draws bounding boxes on a given image

<hr>