## Models in torchvision.models that raised errors:
 - inception_v3:
     - RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (5 x 5). Kernel size can't be greater than actual input size

 - googlenet:
     - AttributeError: 'GoogLeNetOutputs' object has no attribute 'log_softmax'
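
Both errors are consistent with how torchvision builds these two models, so a small sketch may help when reproducing them. It assumes the inception_v3 run used 224 x 224 inputs (the image size used elsewhere in these notes) and that googlenet was called in training mode with its auxiliary classifiers enabled. The layer table is hand-copied from the torchvision definition and should be treated as an approximation; `aux_input_size`, `main_logits` and `FakeGoogLeNetOutputs` are hypothetical names used only for illustration.

```python
from collections import namedtuple


def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv/pool layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of the size-changing layers that precede the 5x5
# convolution in the inception_v3 auxiliary classifier (assumed layout).
INCEPTION_PATH_TO_AUX = [
    (3, 2, 0),  # Conv2d_1a_3x3
    (3, 1, 0),  # Conv2d_2a_3x3
    (3, 1, 1),  # Conv2d_2b_3x3
    (3, 2, 0),  # maxpool1
    (3, 1, 0),  # Conv2d_4a_3x3
    (3, 2, 0),  # maxpool2
    (3, 2, 0),  # Mixed_6a
    (5, 3, 0),  # average pool inside AuxLogits
]


def aux_input_size(image_size):
    """Side length of the feature map fed to the auxiliary 5x5 convolution."""
    size = image_size
    for kernel, stride, padding in INCEPTION_PATH_TO_AUX:
        size = conv_out(size, kernel, stride, padding)
    return size


def main_logits(output):
    """Return the main logits whether the model gave a tensor or a namedtuple."""
    return getattr(output, "logits", output)


# Stand-in for the namedtuple googlenet returns in training mode.
FakeGoogLeNetOutputs = namedtuple(
    "FakeGoogLeNetOutputs", ["logits", "aux_logits2", "aux_logits1"])

print(aux_input_size(224), aux_input_size(299))
print(main_logits(FakeGoogLeNetOutputs([0.1, 0.9], None, None)))
```

Under these assumptions a 224 x 224 input leaves a 3 x 3 map for the 5 x 5 kernel, which matches the RuntimeError, while 299 x 299 leaves 5 x 5. For googlenet, applying the loss to `main_logits(output)` instead of the raw output avoids calling `log_softmax` on the namedtuple.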


## Summary of models and dataset in Mobile AI
 - in https://ai-benchmark.com/tests.html

#### Recognition task:
- Inception-V3, 346 x 346, ILSVRC 2012, inference: recognition
- MobileNet-V3, 512 x 512,
    classification: ImageNet, MobileNet V3
    detection: MS COCO (replacement for the backbone feature extractor in SSDLite)
    Semantic Segmentation: Cityscapes, R-ASPP and Lite R-ASPP as head
- alexnet
- vgg
- resnet

#### Object Detection:
 - YOLOv5-Tiny, 416 x 416,
    dataset: train with ImageNet, detect with: MS COCO

#### Semantic Segmentation:
 -  DeepLab-V3+ , 1024 x 1024,
    dataset:
        employ ImageNet-1k pretrained ResNet-101 or modified aligned Xception to extract dense feature maps
        test: PASCAL VOC 2012 and Cityscapes datasets
            backbone: {ResNet-101:513 x 513, xception: 299 x 299}

## Switching JSON between horizontal and vertical layout
  - Shortcut: Ctrl + Alt + L

### Dataset summary:
    - Recognition: ImageNet (ILSVRC 2012), CIFAR-10
    - Detection: COCO, ImageNet (ILSVRC 2012)
    - Semantic Segmentation: Cityscapes, PASCAL VOC 2012 (R-ASPP and Lite R-ASPP are segmentation heads, not datasets)
    - Super Resolution: DIV2K, Set5, Set14, BSD100, Urban100, Manga109

In [10]:
# check models 
import torchvision.models as models
model_names = sorted(name for name in models.__dict__
                     if name.islower() and not name.startswith("__")
                     and callable(models.__dict__[name]))


In [23]:
# Cityscapes dataset
from torchvision.datasets import Cityscapes as cityscapes  # extracts the dataset archives on the first run
import os
root ='/home/royliu/Documents/datasets/'
data_dir ='cityscapes'

data_dir = os.path.join(root, data_dir)
dataset = cityscapes(data_dir, split = 'train', mode ='fine', target_type = 'semantic')
img, smnt = dataset[0]
dataset


Dataset Cityscapes
    Number of datapoints: 2975
    Root location: /home/royliu/Documents/datasets/cityscapes
    Split: train
    Mode: gtFine
    Type: ['semantic']

In [None]:
# MS COCO detection dataset (class signature, kept as a comment for reference)
# torchvision.datasets.CocoDetection(root, annFile, transform=None, target_transform=None, transforms=None)

In [None]:
{'deeplabv3_resnet50':'Cityscapes', }

{'deeplabv3_resnet50':'torchvision.models.segmentation.deeplabv3_resnet50'}



In [2]:
# Model names taken from https://pytorch.org/vision/stable/models.html
model_list_all = ['alexnet', 'convnext_base', 'densenet121', 'densenet201', 'efficientnet_v2_l',
                  'googlenet', 'inception_v3', 'mnasnet0_5', 'mobilenet_v2', 'mobilenet_v3_small',
                  'regnet_y_400mf', 'resnet18', 'resnet50', 'resnet152', 'shufflenet_v2_x1_0',
                  'squeezenet1_0', 'squeezenet1_1', 'vgg11', 'vgg16', 'vgg19', 'vit_b_16']



In [21]:
import json
args_train, args_infer = [] ,[]
config_list = ['alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16']


# training
img_size_list = [224]
batch_size_list_train = [64,128,256,512,1024]
train_config_list = config_list
for image_size in img_size_list:
    for model in train_config_list:
        for batch_size in batch_size_list_train:
            new_args = {'arch': model,'workers': 1, 'epochs': 3, 'batch_size': batch_size, 'image_size':image_size, 'device': 'cuda'}
            args_train.append(new_args)

# inference
img_size_list = [224]
batch_size_list_infer = [1]
infer_config_list = ['alexnet','mobilenet_v2','resnet152', 'vgg16']
for image_size in img_size_list:
    for model in infer_config_list:
        for batch_size in batch_size_list_infer:
            # use the loop variable so batch_size_list_infer controls the value
            new_args = {'arch': model,'workers': 1, 'batch_size': batch_size, 'image_size':image_size, 'device': 'cuda'}
            args_infer.append(new_args)
        
tr = json.dumps(args_train)
inf= json.dumps(args_infer)
inf

'[{"arch": "alexnet", "workers": 1, "batch_size": 1, "image_size": 224, "device": "cuda"}, {"arch": "mobilenet_v2", "workers": 1, "batch_size": 1, "image_size": 224, "device": "cuda"}, {"arch": "resnet152", "workers": 1, "batch_size": 1, "image_size": 224, "device": "cuda"}, {"arch": "vgg16", "workers": 1, "batch_size": 1, "image_size": 224, "device": "cuda"}]'

## Experiments record:
### by 12_21_2022 :

    - training
            - img_size_list = [224]
            - batch_size_list = [64, 128,256,512]
            - config_list = ['alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16']
    - inference
            - img_size_list = [224]
            - batch_size_list = [1]
            - config_list = ['alexnet','mobilenet_v2','resnet152', 'vgg16']


### by 12_22_2022 :

    - training
            - img_size_list = [224]
            - batch_size_list = [1024]
            - config_list = ['alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16']
    - inference
            - img_size_list = [224]
            - batch_size_list = [1]
            - config_list = ['alexnet']


## 01/17/2023
### training:
  - 'deeplab_v3'
       batch size: 2,4,6,8
  - 'alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16'
       batch size: 64,128,256,512,1024

### Inference:
 - yolov5
 - 'alexnet','densenet201','mobilenet_v2','resnet152', 'vgg16'
 

## 01/20/2023 - 01/23/2023
### training:
    - 'deeplab_v3','alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16'
    - batch size: 2,4,8,16,32,64,128,256,512,1024,1
    - image size 224

### Inference:
    - 'yolo_v5s', 'alexnet','densenet201','mobilenet_v2','resnet152', 'vgg16'
    - batch size: 1
    - image size: 224
    
  #### except:
    train:
    - 'shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16'
    - batch size: 2,4,8,16,32,64,128,256,512,1024,1
    - image size 224
    
    infer:
    - 'vgg16'
    - batch size: 1
    - image size: 224
  


## new: 01/23/2023 - 01/24/2023
### training:
    - 'deeplab_v3','alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16'
    - batch size: 2,4,8,16,32,64,128,256,512,1024
    - image size 224

### Inference:
    -  'yolo_v5s', 'alexnet','densenet201','mobilenet_v2'
    - batch size: 1
    - image size: 448
    

  

In [1]:
## 20230118
import json
args_train, args_infer = [] ,[]
config_list = ['deeplab_v3', 'alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16']


# training
img_size_list = [224]
batch_size_list_train = [2,4,8,16,32,64,128,256,512,1024,1] ## deeplab_v3 is  2, 4, 6, 8 instead
train_config_list = config_list
for image_size in img_size_list:
    for model in train_config_list:
        for batch_size in batch_size_list_train:
            new_args = {'arch': model,'workers': 1, 'epochs': 3, 'batch_size': batch_size, 'image_size':image_size, 'device': 'cuda'}
            args_train.append(new_args)

tr = json.dumps(args_train)
tr

'[{"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 2, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 4, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 8, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 16, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 32, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 64, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 128, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 256, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs": 3, "batch_size": 512, "image_size": 224, "device": "cuda"}, {"arch": "deeplab_v3", "workers": 1, "epochs
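
The comment in the training cell notes that deeplab_v3 uses batch sizes 2, 4, 6, 8 rather than the shared list, but the loop applies the shared list to every model. One possible way to encode that exception is a per-model override table, sketched below. `BATCH_SIZE_OVERRIDES` and `build_train_args` are hypothetical names, and the argument keys simply mirror the dictionaries built in the cell above.

```python
import json
from itertools import product

# Shared batch sizes, in the same order as the training cell.
DEFAULT_BATCH_SIZES = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 1]

# Per-model exceptions; the deeplab_v3 values come from the cell's comment.
BATCH_SIZE_OVERRIDES = {'deeplab_v3': [2, 4, 6, 8]}


def build_train_args(archs, image_sizes, epochs=3, workers=1, device='cuda'):
    """Build one argument dict per (image size, model, batch size) combination."""
    args = []
    for image_size, arch in product(image_sizes, archs):
        # Fall back to the shared list when a model has no override.
        for batch_size in BATCH_SIZE_OVERRIDES.get(arch, DEFAULT_BATCH_SIZES):
            args.append({'arch': arch, 'workers': workers, 'epochs': epochs,
                         'batch_size': batch_size, 'image_size': image_size,
                         'device': device})
    return args


args_train = build_train_args(['deeplab_v3', 'alexnet'], [224])
print(json.dumps(args_train[:2]))
```

Keeping the exception in a dictionary means the model list and the loop stay unchanged when another model needs its own batch sizes.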

In [5]:
## 20230118
import json
args_train, args_infer = [] ,[]
# config_list = ['yolo_v5', 'alexnet','densenet201','mobilenet_v2','resnet152','shufflenet_v2_x1_0', 'squeezenet1_0', 'vgg16']


# inference
img_size_list = [448]
batch_size_list_infer = [1]
infer_config_list = ['yolo_v5s','alexnet','densenet201','mobilenet_v2']
for image_size in img_size_list:
    for model in infer_config_list:
        for batch_size in batch_size_list_infer:
            # use the loop variable so batch_size_list_infer controls the value
            new_args = {'arch': model,'workers': 1, 'batch_size': batch_size, 'image_size':image_size, 'device': 'cuda'}
            args_infer.append(new_args)
            
inf= json.dumps(args_infer)
inf

'[{"arch": "yolo_v5s", "workers": 1, "batch_size": 1, "image_size": 448, "device": "cuda"}, {"arch": "alexnet", "workers": 1, "batch_size": 1, "image_size": 448, "device": "cuda"}, {"arch": "densenet201", "workers": 1, "batch_size": 1, "image_size": 448, "device": "cuda"}, {"arch": "mobilenet_v2", "workers": 1, "batch_size": 1, "image_size": 448, "device": "cuda"}]'

In [19]:
import time
def date_time():
    s_l = time.localtime(time.time())
    date = time.strftime("%Y%m%d", s_l)
    tm = time.strftime("%H%M%S", s_l)
    # print(date, tm )
    return date, tm

date, tm = date_time()
tm


'122415'

## Others

In [22]:
file = 'config.json'
with open(file) as f:
    config = json.load(f)
args_train, args_infer= config['train'],  config['infer']
len(args_train)

36
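
The cell above reads `config.json` and expects the top-level keys `train` and `infer`. The notes do not show how that file was written, so the following is only a sketch of one way the generated lists could be saved in that layout and read back. `save_config` and `load_config` are hypothetical helpers, and the example writes to a temporary directory rather than the real file.

```python
import json
import os
import tempfile


def save_config(path, args_train, args_infer):
    """Write both argument lists under the keys the loading cell expects."""
    with open(path, 'w') as f:
        json.dump({'train': args_train, 'infer': args_infer}, f, indent=2)


def load_config(path):
    """Read the file back and return (train args, inference args)."""
    with open(path) as f:
        config = json.load(f)
    return config['train'], config['infer']


# Small illustrative lists using the same keys as the notebook cells.
example_train = [{'arch': 'alexnet', 'workers': 1, 'epochs': 3,
                  'batch_size': 64, 'image_size': 224, 'device': 'cuda'}]
example_infer = [{'arch': 'alexnet', 'workers': 1,
                  'batch_size': 1, 'image_size': 224, 'device': 'cuda'}]

with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, 'config.json')
    save_config(config_path, example_train, example_infer)
    loaded_train, loaded_infer = load_config(config_path)

print(len(loaded_train), len(loaded_infer))
```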

## Summary of models and dataset in Mobile AI
 - in https://ai-benchmark.com/tests.html

#### Recognition task:
- MobileNet-V2, 224 x 224, ImageNet + MS COCO
    train: ImageNet, detection: MS COCO (with a modified version of the Single Shot Detector (SSD))
- Inception-V3, 346 x 346, ILSVRC 2012, inference: recognition
- MobileNet-V3, 512 x 512,
    classification: ImageNet, MobileNet V3
    detection: MS COCO (replacement for the backbone feature extractor in SSDLite)
    Semantic Segmentation: Cityscapes, R-ASPP and Lite R-ASPP as head
- EfficientNet-B4, 380 x 380, CIFAR-10, subset of the CamSDD dataset
- Inception-V3, 346 x 346, NA


 - alexnet
 - vgg
 - resnet

#### Object Detection:
 - YOLOv4-Tiny, 416 x 416,
    dataset: train with ImageNet, detect with: MS COCO

 - CRNN / Bi-LSTM , 64 x 200
    train: synthetic dataset (Synth) released by Jaderberg et al.
    test: ICDAR 2003 (IC03), ICDAR 2013 (IC13), IIIT 5k-word (IIIT5k), and Street View Text (SVT).

#### Semantic Segmentation:
 -  DeepLab-V3+ , 1024 x 1024,
    dataset:
        employ ImageNet-1k pretrained ResNet-101 or modified aligned Xception to extract dense feature maps
        test: PASCAL VOC 2012 and Cityscapes datasets
            backbone: {ResNet-101:513 x 513, xception: 299 x 299}

#### Super resolution:
 - ESRGAN, 512 x 512,
    dataset:
        train: DIV2K dataset (PIRM2018-SR Challenge), Flickr2K,  OutdoorSceneTraining (OST)
        evaluate: Set5 , Set14, BSD100, Urban100
 - SRGAN, 1024 x 1024
    dataset:
        train: a random sample of 350 thousand images from the ImageNet database
        evaluate: Set5 , Set14 and BSD100

#### Video Super-Resolution:
 - XLSR, 1080 x 1920
    dataset:
        train: DIV2K
        evaluate: DIV2K 100 test images, Set5, Set14, BSD100, Manga109, Urban100
 - VSR, 2160 x 3840, dataset:  {train and test: DIV2K}
