# Top Down Experiments

These experiments measure the performance of the top-down strategy. If you are running an experiment to test the validity of a particular design space, please make sure to load the appropriate OFA network before doing so. The cells corresponding to each design space are marked with "DESIGN SPACE: <NAME>" at the top of the cell.
     
We do not recommend running any of the experiments, as they may take a while (each result is averaged over 10 runs). Instead, run the cell under the DEMO section to see the top-down strategy in action. Before running it, make sure you run all cells in the PREP section.

# PREP

In [50]:
import os
import torch
import torch.nn as nn
from torchvision import transforms, datasets
import numpy as np
import time
import random
import math
import copy
from matplotlib import pyplot as plt

In [51]:
import ofa
import ofa.model_zoo
import ofa.tutorial
import ofa.nas.accuracy_predictor.arch_encoder
import ofa.nas.search_algorithm
import ofa.nas.efficiency_predictor.latency_lookup_table

In [52]:
random_seed = 1
random.seed(random_seed)
np.random.seed(random_seed)
torch.manual_seed(random_seed)
print('Successfully imported all packages and configured random seed to %d!'%random_seed)

cuda_available = torch.cuda.is_available()
if cuda_available:
    torch.backends.cudnn.enabled = True
    torch.backends.cudnn.benchmark = True
    torch.cuda.manual_seed(random_seed)
    print('Using GPU.')
else:
    print('Using CPU.')

Successfully imported all packages and configured random seed to 1!
Using CPU.


In [53]:
ofa_network = ofa.model_zoo.ofa_net('ofa_mbv3_d234_e346_k357_w1.2', pretrained=True)
#ofa_network = ofa.model_zoo.ofa_net('ofa_resnet50', pretrained=True)
print('The OFA Network is ready.')

The OFA Network is ready.


In [54]:
target_hardware = 'note10'
latency_table = ofa.tutorial.LatencyTable(device=target_hardware)
print('The Latency lookup table on %s is ready!' % target_hardware)

Downloading: "https://hanlab.mit.edu/files/OnceForAll/tutorial/latency_table@note10/160_lookup_table.yaml" to /Users/vidhur2k/.hancai/latency_tools/160_lookup_table.yaml


Built latency table for image size: 160.


Downloading: "https://hanlab.mit.edu/files/OnceForAll/tutorial/latency_table@note10/176_lookup_table.yaml" to /Users/vidhur2k/.hancai/latency_tools/176_lookup_table.yaml


Built latency table for image size: 176.


Downloading: "https://hanlab.mit.edu/files/OnceForAll/tutorial/latency_table@note10/192_lookup_table.yaml" to /Users/vidhur2k/.hancai/latency_tools/192_lookup_table.yaml


Built latency table for image size: 192.


Downloading: "https://hanlab.mit.edu/files/OnceForAll/tutorial/latency_table@note10/208_lookup_table.yaml" to /Users/vidhur2k/.hancai/latency_tools/208_lookup_table.yaml


Built latency table for image size: 208.


Downloading: "https://hanlab.mit.edu/files/OnceForAll/tutorial/latency_table@note10/224_lookup_table.yaml" to /Users/vidhur2k/.hancai/latency_tools/224_lookup_table.yaml


Built latency table for image size: 224.
The Latency lookup table on note10 is ready!


In [55]:
if cuda_available:
    # path to the ImageNet dataset
    print("Please input the path to the ImageNet dataset.\n")
    imagenet_data_path = input()
    #imagenet_data_path = 'C:\School\once-for-all-master\imgnet'

    # if 'imagenet_data_path' is empty, download a subset of ImageNet containing 2000 images (~250M) for test
    if not os.path.isdir(imagenet_data_path):
        os.makedirs(imagenet_data_path, exist_ok=True)
        ofa.utils.download_url('https://hanlab.mit.edu/files/OnceForAll/ofa_cvpr_tutorial/imagenet_1k.zip', model_dir='data')
        ! cd data && unzip imagenet_1k 1>/dev/null && cd ..
        ! cp -r data/imagenet_1k/* $imagenet_data_path
        ! rm -rf data
        print('%s is empty. Download a subset of ImageNet for test.' % imagenet_data_path)

    print('The ImageNet dataset files are ready.')
else:
    print('Since GPU is not found in the environment, we skip all scripts related to ImageNet evaluation.')

Since GPU is not found in the environment, we skip all scripts related to ImageNet evaluation.


In [56]:
if cuda_available:
    # The following function builds the data transforms for test
    def build_val_transform(size):
        return transforms.Compose([
            transforms.Resize(int(math.ceil(size / 0.875))),
            transforms.CenterCrop(size),
            transforms.ToTensor(),
            transforms.Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]
            ),
        ])

    data_loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(
            root=os.path.join(imagenet_data_path, 'val'),
            transform=build_val_transform(224)
        ),
        batch_size=250,  # test batch size
        shuffle=True,
        num_workers=16,  # number of workers for the data loader
        pin_memory=True,
        drop_last=False,
    )
    print('The ImageNet dataloader is ready.')
else:
    data_loader = None
    print('Since GPU is not found in the environment, we skip all scripts related to ImageNet evaluation.')

Since GPU is not found in the environment, we skip all scripts related to ImageNet evaluation.
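
The `Resize` call in `build_val_transform` scales the shorter image side to `ceil(size / 0.875)` before the center crop, so the crop keeps the central 87.5% of the resized image. A minimal sketch of that arithmetic for the image sizes listed in the latency-table log above (the helper name `resize_side` is ours, for illustration only):

```python
import math

def resize_side(crop_size, crop_ratio=0.875):
    """Shorter-side length passed to Resize before CenterCrop(crop_size)."""
    return int(math.ceil(crop_size / crop_ratio))

# Image sizes that appear in the latency-table output above.
image_sizes = [160, 176, 192, 208, 224]
resize_sides = {size: resize_side(size) for size in image_sizes}

for size, side in resize_sides.items():
    print(f"crop {size} -> resize shorter side to {side}")
```

The data loader in this notebook only uses `build_val_transform(224)`, which resizes to 256 and then crops to 224.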


In [57]:
#accuracy_predictor = ofa.nas.accuracy_predictor.AccuracyPredictor(
accuracy_predictor = ofa.tutorial.AccuracyPredictor(
    pretrained=True,
    device='cuda:0' if cuda_available else 'cpu'
)

print('The accuracy predictor is ready!')

The accuracy predictor is ready!


# Experiments

In [58]:
def run_top_down_evolutionary_search(latency_constraint):
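    # latency_constraint: tuple of latency limits in ms, e.g. (30, 25, 20); the experiments below pass them in descending order
    P = 100  # The size of population in each generation
    N = 500  # How many generations of population to be searched
    N2 = 100  # Generations searched for each additional constraint (matches the 100-step progress bars in the output)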
    r = 0.25  # The ratio of networks that are used as parents for next generation
    params = {
        'constraint_type': target_hardware, # Latency-constrained search on the target hardware
        'efficiency_constraint': latency_constraint,
        'mutate_prob': 0.1, # The probability of mutation in evolutionary search
        'mutation_ratio': 0.5, # The ratio of networks that are generated through mutation in generation n >= 2.
        'efficiency_predictor': latency_table, # To use a predefined efficiency predictor.
        'accuracy_predictor': accuracy_predictor, # To use a predefined accuracy predictor.
        'population_size': P,
        'max_time_budget': N,
        'max_time_budget2': N2,
        'parent_ratio': r,
    }

    # build the evolution finder
    finder = ofa.tutorial.EvolutionFinder(**params)

    # start searching
    result_lis = []
    st = time.time()
    best_valids, best_info = finder.run_evolution_search_multi_mixed()
    for i in range(len(latency_constraint)):
        result_lis.append(best_info[i])
        print('Found best architecture on %s with latency <= %.2f ms  '
              'It achieves %.2f%s predicted accuracy with %.2f ms latency on %s.' %
              (target_hardware, latency_constraint[i],  best_info[i][0] * 100, '%', best_info[i][-1], target_hardware))

        # visualize the architecture of the searched sub-net
        _, net_config, latency = best_info[i]
        ofa_network.set_active_subnet(ks=net_config['ks'], d=net_config['d'], e=net_config['e'])
        print('Architecture of the searched sub-net:')
        print(ofa_network.module_str)
    ed = time.time()
    print("Time:", ed-st)
    return ed-st
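
The internals of `run_evolution_search_multi_mixed` are not shown in this notebook. Judging from the progress bars in the outputs further down, the first (loosest) constraint is searched for `max_time_budget` generations and each further constraint for `max_time_budget2` generations. The toy sketch below illustrates one way such a schedule could work, assuming each tighter constraint starts from the previous population. That warm start is our assumption, and the toy latency and accuracy functions and all helper names are hypothetical and not part of `ofa`.

```python
import random

def toy_latency(arch):
    return float(sum(arch))  # pretend every unit of width costs 1 ms

def toy_accuracy(arch):
    return sum(x ** 0.5 for x in arch)  # diminishing returns per layer

def mutate(arch, rng, prob=0.1, choices=(1, 2, 3, 4, 5, 6)):
    # Resample each entry independently with probability `prob`.
    return [rng.choice(choices) if rng.random() < prob else x for x in arch]

def shrink_to_fit(arch, limit):
    # Greedily shrink the largest entry until the latency limit holds.
    arch = list(arch)
    while toy_latency(arch) > limit:
        i = max(range(len(arch)), key=lambda j: arch[j])
        arch[i] -= 1
    return arch

def evolve(population, limit, generations, rng, parent_ratio=0.25):
    size = len(population)
    # Warm start (assumption): repair the inherited population for the new limit.
    population = [shrink_to_fit(a, limit) for a in population]
    for _ in range(generations):
        population.sort(key=toy_accuracy, reverse=True)
        parents = population[: max(1, int(size * parent_ratio))]
        children = []
        while len(parents) + len(children) < size:
            child = mutate(rng.choice(parents), rng)
            if toy_latency(child) <= limit:  # keep only feasible children
                children.append(child)
        population = parents + children
    population.sort(key=toy_accuracy, reverse=True)
    return population

def top_down_search(limits, n_first=50, n_next=10, pop_size=20, n_layers=8, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(1, 6) for _ in range(n_layers)] for _ in range(pop_size)]
    best = {}
    for k, limit in enumerate(limits):
        # Full budget for the loosest limit, short budget for each tighter one.
        generations = n_first if k == 0 else n_next
        population = evolve(population, limit, generations, rng)
        best[limit] = population[0]
    return best

best = top_down_search((35, 30, 25, 20, 15))
for limit, arch in best.items():
    print(limit, arch, toy_latency(arch))
```

Under this reading, the cost of adding a constraint is governed by the short budget (`n_next` here, `N2` in the cell above) rather than by a full search.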

In [41]:
# DESIGN SPACE: MobileNetV3
ofa_network = ofa.model_zoo.ofa_net('ofa_mbv3_d234_e346_k357_w1.2', pretrained=True)
#ofa_network = ofa.model_zoo.ofa_net('ofa_resnet50', pretrained=True)
print('The OFA Network is ready.')

The OFA Network is ready.


In [18]:
# DESIGN SPACE: ResNet50D
ofa_network = ofa.model_zoo.ofa_net('ofa_resnet50', pretrained=True)
print('The OFA Network is ready.')

The OFA Network is ready.


In [20]:
# DESIGN SPACE: ProxylessNAS
ofa_network = ofa.model_zoo.ofa_net('ofa_proxyless_d234_e346_k357_w1.3', pretrained=True)
#ofa_network = ofa.model_zoo.ofa_net('ofa_resnet50', pretrained=True)
print('The OFA Network is ready.')

The OFA Network is ready.


## 1. Running time for k Latency Constraints

### MobileNetV3

In [11]:
latency_constraints = (35, 30, 25, 20, 15)
times = {}
for i in range(len(latency_constraints)):
    latency_constraint = latency_constraints[:i+1]
    temp_times = 0
    for j in range(10):
        temp_times += run_top_down_evolutionary_search(latency_constraint)
    times[latency_constraint] = temp_times / 10
latency_constraints = (60, 55, 50, 45, 40, 35, 30, 25, 20, 15)
temp_times = 0
for j in range(10):
    temp_times += run_top_down_evolutionary_search(latency_constraints)
times[latency_constraints] = temp_times / 10
times

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:15<00:00, 32.66it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:16, 29.76it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.24% predicted accuracy with 34.63 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E4.0, K3), Identity)
(SE(O48, E3.0, K5), None)
(SE(O48, E6.0, K3), Identity)
(SE(O48, E4.0, K7), Identity)
(SE(O48, E6.0, K7), Identity)
((O96, E4.0, K5), None)
((O96, E3.0, K5), Identity)
((O96, E4.0, K3), Identity)
((O96, E6.0, K7), Identity)
(SE(O136, E4.0, K7), None)
(SE(O136, E6.0, K3), Identity)
(SE(O136, E6.0, K3), Identity)
(SE(O136, E6.0, K3), Identity)
(SE(O192, E6.0, K5), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E4.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 15.352033138275146


Searching with note10 constraint (35): 100%|██████████| 500/500 [00:17<00:00, 28.27it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 27.84it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:19, 24.92it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 82.86% predicted accuracy with 34.99 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K5), None)
((O32, E4.0, K3), Identity)
(SE(O48, E3.0, K5), None)
(SE(O48, E4.0, K3), Identity)
(SE(O48, E4.0, K5), Identity)
((O96, E4.0, K5), None)
((O96, E3.0, K5), Identity)
((O96, E3.0, K5), Identity)
(SE(O136, E4.0, K5), None)
(SE(O136, E4.0, K5), Identity)
(SE(O136, E3.0, K5), Identity)
(SE(O192, E6.0, K5), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.19% predicted accuracy with 29.97 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K5), None)
((O32, E4.0, K3), Identity)
(SE(O48, E3.0, K5), None)
(

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:19<00:00, 26.19it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 26.00it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:03<00:00, 26.83it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:17, 29.16it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.37% predicted accuracy with 34.90 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E4.0, K5), None)
(SE(O48, E4.0, K3), Identity)
(SE(O48, E4.0, K3), Identity)
((O96, E4.0, K7), None)
((O96, E4.0, K7), Identity)
((O96, E4.0, K3), Identity)
(SE(O136, E6.0, K7), None)
(SE(O136, E4.0, K5), Identity)
(SE(O136, E4.0, K3), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E6.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.38% predicted accuracy with 29.90 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E4.0, K5), None)
(

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:18<00:00, 27.69it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 26.18it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:03<00:00, 25.73it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:03<00:00, 25.56it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:18, 26.82it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.27% predicted accuracy with 34.73 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E4.0, K5), None)
(SE(O48, E4.0, K5), Identity)
(SE(O48, E4.0, K3), Identity)
((O96, E4.0, K5), None)
((O96, E4.0, K7), Identity)
((O96, E3.0, K3), Identity)
(SE(O136, E4.0, K3), None)
(SE(O136, E4.0, K5), Identity)
(SE(O136, E4.0, K3), Identity)
(SE(O192, E4.0, K5), None)
(SE(O192, E4.0, K5), Identity)
(SE(O192, E4.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.12% predicted accuracy with 29.88 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E4.0, K5), None)
(

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:19<00:00, 25.56it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 26.64it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:03<00:00, 26.66it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:04<00:00, 24.58it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:04<00:00, 24.09it/s]
Searching with note10 constraint (60):   1%|          | 4/500 [00:00<00:16, 30.70it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.20% predicted accuracy with 34.68 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), None)
((O32, E4.0, K3), Identity)
((O32, E3.0, K5), Identity)
(SE(O48, E4.0, K5), None)
(SE(O48, E4.0, K3), Identity)
(SE(O48, E4.0, K3), Identity)
((O96, E4.0, K7), None)
((O96, E3.0, K5), Identity)
((O96, E3.0, K7), Identity)
(SE(O136, E3.0, K7), None)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E6.0, K3), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E3.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.55% predicted accuracy with 29.70 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), No

Searching with note10 constraint (60): 100%|██████████| 500/500 [00:17<00:00, 28.36it/s]
Searching with note10 constraint (55): 100%|██████████| 100/100 [00:03<00:00, 28.49it/s]
Searching with note10 constraint (50): 100%|██████████| 100/100 [00:03<00:00, 28.73it/s]
Searching with note10 constraint (45): 100%|██████████| 100/100 [00:03<00:00, 28.60it/s]
Searching with note10 constraint (40): 100%|██████████| 100/100 [00:03<00:00, 27.35it/s]
Searching with note10 constraint (35): 100%|██████████| 100/100 [00:03<00:00, 25.16it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 21.45it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 20.81it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:05<00:00, 19.11it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:05<00:00, 19.65it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 84.71% predicted accuracy with 58.13 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E6.0, K5), None)
((O32, E3.0, K5), Identity)
((O32, E6.0, K3), Identity)
(SE(O48, E6.0, K3), None)
(SE(O48, E6.0, K3), Identity)
(SE(O48, E4.0, K7), Identity)
(SE(O48, E3.0, K3), Identity)
((O96, E4.0, K7), None)
((O96, E4.0, K7), Identity)
((O96, E4.0, K3), Identity)
((O96, E6.0, K7), Identity)
(SE(O136, E6.0, K7), None)
(SE(O136, E4.0, K7), Identity)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E3.0, K3), Identity)
(SE(O192, E6.0, K7), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E3.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Found best architecture on note10 with latency <= 55.00 ms  It achieves 85.00% predicted accuracy with 52.83 ms latency on note10.
Architecture of the searched 




{(35,): 15.352033138275146,
 (35, 30): 21.316713094711304,
 (35, 30, 25): 26.708261013031006,
 (35, 30, 25, 20): 29.721980810165405,
 (35, 30, 25, 20, 15): 35.34088611602783,
 (60, 55, 50, 45, 40, 35, 30, 25, 20, 15): 55.60786724090576}
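
One way to read the averaged timings above is to look at the marginal cost of each additional constraint, and to compare it with a rough baseline in which every constraint is searched independently at the single-constraint cost. That baseline is an assumption for illustration and was not measured in this notebook. The 10-constraint run also starts from 60 ms rather than 35 ms, so using the `(35,)` time as its base is only approximate.

```python
# Averaged MobileNetV3 timings in seconds, copied from the output above.
times = {
    (35,): 15.352033138275146,
    (35, 30): 21.316713094711304,
    (35, 30, 25): 26.708261013031006,
    (35, 30, 25, 20): 29.721980810165405,
    (35, 30, 25, 20, 15): 35.34088611602783,
    (60, 55, 50, 45, 40, 35, 30, 25, 20, 15): 55.60786724090576,
}

single = times[(35,)]  # measured cost of a single-constraint search
summary = []
for constraints, seconds in times.items():
    k = len(constraints)
    # Assumed baseline: k independent searches at the single-constraint cost.
    baseline = k * single
    # Average extra seconds per constraint beyond the first.
    marginal = (seconds - single) / (k - 1) if k > 1 else 0.0
    summary.append((k, seconds, marginal, baseline))
    print(f"k={k:2d}  measured={seconds:6.2f}s  "
          f"marginal={marginal:5.2f}s/constraint  assumed baseline={baseline:6.2f}s")
```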

### ResNet50D

In [13]:
latency_constraints = (35, 30, 25, 20, 15)
times = {}
for i in range(len(latency_constraints)):
    latency_constraint = latency_constraints[:i+1]
    temp_times = 0
    for j in range(10):
        temp_times += run_top_down_evolutionary_search(latency_constraint)
    times[latency_constraint] = temp_times / 10
latency_constraints = (60, 55, 50, 45, 40, 35, 30, 25, 20, 15)
temp_times = 0
for j in range(10):
    temp_times += run_top_down_evolutionary_search(latency_constraints)
times[latency_constraints] = temp_times / 10
times

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:15<00:00, 31.56it/s]
Searching with note10 constraint (35):   1%|          | 4/500 [00:00<00:17, 28.57it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.49% predicted accuracy with 34.92 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
(DyConv(O32, K3, S1), Identity)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->1024->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->1536->256_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->12288->2048_S2, avg

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:18<00:00, 26.57it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 24.50it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:18, 26.31it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.28% predicted accuracy with 34.46 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->768->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S1, Identity)
(3x3_BottleneckConv_in->3072->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->8192->2048_S2, avgpool_conv)
(3x3_BottleneckConv_in->

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:19<00:00, 26.08it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 25.41it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:03<00:00, 25.35it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:17, 28.97it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.52% predicted accuracy with 34.94 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->1024->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->3072->512_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->8192->2048_S2, avgpool_conv)
(3x3_BottleneckConv_

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:19<00:00, 25.97it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 25.43it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:03<00:00, 26.11it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:04<00:00, 24.91it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:17, 29.04it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.32% predicted accuracy with 35.00 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
(DyConv(O32, K3, S1), Identity)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->1024->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->1536->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->2048_S2, avg

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:18<00:00, 27.48it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 25.37it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 23.22it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:03<00:00, 25.12it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:04<00:00, 24.17it/s]
Searching with note10 constraint (60):   1%|          | 4/500 [00:00<00:16, 30.68it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.67% predicted accuracy with 34.60 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
(DyConv(O32, K3, S1), Identity)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->1024->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S1, Identity)
(3x3_BottleneckConv_in->1536->512_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S1, Identity)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->8192->2048_S2, avg

Searching with note10 constraint (60): 100%|██████████| 500/500 [00:18<00:00, 27.39it/s]
Searching with note10 constraint (55): 100%|██████████| 100/100 [00:03<00:00, 27.61it/s]
Searching with note10 constraint (50): 100%|██████████| 100/100 [00:03<00:00, 27.67it/s]
Searching with note10 constraint (45): 100%|██████████| 100/100 [00:04<00:00, 24.73it/s]
Searching with note10 constraint (40): 100%|██████████| 100/100 [00:04<00:00, 24.83it/s]
Searching with note10 constraint (35): 100%|██████████| 100/100 [00:05<00:00, 19.60it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:05<00:00, 19.85it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:05<00:00, 19.71it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:05<00:00, 19.20it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:05<00:00, 19.07it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 84.81% predicted accuracy with 59.73 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->1024->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->4096->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->12288->2048_S2, avgpool_conv)
(3x3_BottleneckConv_i




{(35,): 15.879708766937256,
 (35, 30): 22.936019897460938,
 (35, 30, 25): 27.097497940063477,
 (35, 30, 25, 20): 31.084476947784424,
 (35, 30, 25, 20, 15): 34.617053747177124,
 (60, 55, 50, 45, 40, 35, 30, 25, 20, 15): 59.29659080505371}

### ProxylessNAS

In [15]:
latency_constraints = (35, 30, 25, 20, 15)
times = {}
for i in range(len(latency_constraints)):
    latency_constraint = latency_constraints[:i+1]
    temp_times = 0
    for j in range(10):
        temp_times += run_top_down_evolutionary_search(latency_constraint)
    times[latency_constraint] = temp_times / 10
latency_constraints = (60, 55, 50, 45, 40, 35, 30, 25, 20, 15)
temp_times = 0
for j in range(10):
    temp_times += run_top_down_evolutionary_search(latency_constraints)
times[latency_constraints] = temp_times / 10
times

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:16<00:00, 30.53it/s]
Searching with note10 constraint (35):   1%|          | 4/500 [00:00<00:16, 30.01it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.55% predicted accuracy with 34.90 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E6.0, K3), Identity)
((O32, E6.0, K3), Identity)
((O56, E4.0, K5), None)
((O56, E6.0, K5), Identity)
((O56, E3.0, K7), Identity)
((O56, E6.0, K3), Identity)
((O104, E4.0, K5), None)
((O104, E3.0, K7), Identity)
((O104, E4.0, K3), Identity)
((O104, E3.0, K7), Identity)
((O128, E4.0, K7), None)
((O128, E4.0, K7), Identity)
((O128, E4.0, K3), Identity)
((O128, E4.0, K7), Identity)
((O248, E6.0, K3), None)
((O248, E6.0, K3), Identity)
((O248, E6.0, K5), Identity)
((O248, E4.0, K5), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Time: 16.41508412361145


Searching with note10 constraint (35): 100%|██████████| 500/500 [00:20<00:00, 23.91it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 24.97it/s]
Searching with note10 constraint (35):   1%|          | 3/500 [00:00<00:18, 27.36it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.22% predicted accuracy with 34.93 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E4.0, K3), Identity)
((O56, E4.0, K3), None)
((O56, E4.0, K5), Identity)
((O56, E3.0, K7), Identity)
((O104, E4.0, K7), None)
((O104, E3.0, K7), Identity)
((O104, E6.0, K3), Identity)
((O128, E3.0, K7), None)
((O128, E4.0, K7), Identity)
((O128, E4.0, K3), Identity)
((O248, E6.0, K3), None)
((O248, E6.0, K3), Identity)
((O248, E3.0, K5), Identity)
((O248, E3.0, K5), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.38% predicted accuracy with 29.94 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E4.0, K3), Identity)
((O56, E3.0, K5), None)

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:19<00:00, 25.75it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 20.55it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 20.79it/s]
Searching with note10 constraint (35):   0%|          | 2/500 [00:00<00:31, 15.57it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.40% predicted accuracy with 34.82 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E4.0, K3), Identity)
((O32, E4.0, K3), Identity)
((O56, E4.0, K3), None)
((O56, E4.0, K5), Identity)
((O56, E3.0, K5), Identity)
((O104, E4.0, K5), None)
((O104, E3.0, K5), Identity)
((O104, E4.0, K3), Identity)
((O104, E4.0, K7), Identity)
((O128, E4.0, K7), None)
((O128, E4.0, K7), Identity)
((O128, E4.0, K3), Identity)
((O128, E4.0, K3), Identity)
((O248, E6.0, K7), None)
((O248, E6.0, K5), Identity)
((O248, E4.0, K5), Identity)
((O248, E4.0, K5), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.81% predicted accuracy with 29.93 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:22<00:00, 21.82it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:05<00:00, 16.67it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:06<00:00, 16.15it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:06<00:00, 16.30it/s]
Searching with note10 constraint (35):   0%|          | 0/500 [00:00<?, ?it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 82.91% predicted accuracy with 34.93 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
((O56, E4.0, K5), None)
((O56, E4.0, K7), Identity)
((O104, E4.0, K5), None)
((O104, E4.0, K5), Identity)
((O128, E6.0, K3), None)
((O128, E4.0, K5), Identity)
((O128, E6.0, K5), Identity)
((O248, E6.0, K3), None)
((O248, E6.0, K5), Identity)
((O248, E6.0, K7), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.37% predicted accuracy with 29.88 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
((O56, E4.0, K5), None)
((O56, E4.0, K3), Identity)
((O104, E4.0, K5), None)
((O104, E3.0, K5), Identity)
((O

Searching with note10 constraint (35): 100%|██████████| 500/500 [00:21<00:00, 23.02it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 26.53it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 24.76it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:03<00:00, 25.56it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:04<00:00, 22.66it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.42% predicted accuracy with 34.43 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E3.0, K5), Identity)
((O32, E4.0, K3), Identity)
((O56, E4.0, K5), None)
((O56, E3.0, K5), Identity)
((O56, E4.0, K3), Identity)
((O104, E4.0, K5), None)
((O104, E6.0, K5), Identity)
((O104, E4.0, K3), Identity)
((O128, E6.0, K3), None)
((O128, E4.0, K5), Identity)
((O128, E4.0, K5), Identity)
((O248, E6.0, K5), None)
((O248, E6.0, K7), Identity)
((O248, E6.0, K5), Identity)
((O248, E4.0, K7), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.64% predicted accuracy with 29.62 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Ident

Searching with note10 constraint (60): 100%|██████████| 500/500 [00:17<00:00, 27.87it/s]
Searching with note10 constraint (55): 100%|██████████| 100/100 [00:04<00:00, 24.11it/s]
Searching with note10 constraint (50): 100%|██████████| 100/100 [00:04<00:00, 24.56it/s]
Searching with note10 constraint (45): 100%|██████████| 100/100 [00:04<00:00, 22.49it/s]
Searching with note10 constraint (40): 100%|██████████| 100/100 [00:04<00:00, 24.02it/s]
Searching with note10 constraint (35): 100%|██████████| 100/100 [00:04<00:00, 23.57it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 21.25it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 20.89it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:05<00:00, 18.06it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:05<00:00, 19.54it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 85.23% predicted accuracy with 58.17 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K5), None)
((O32, E3.0, K5), Identity)
((O32, E4.0, K5), Identity)
((O32, E3.0, K5), Identity)
((O56, E3.0, K3), None)
((O56, E4.0, K3), Identity)
((O56, E4.0, K7), Identity)
((O56, E4.0, K3), Identity)
((O104, E4.0, K3), None)
((O104, E3.0, K7), Identity)
((O104, E4.0, K3), Identity)
((O104, E6.0, K5), Identity)
((O128, E4.0, K5), None)
((O128, E6.0, K7), Identity)
((O128, E6.0, K7), Identity)
((O128, E6.0, K5), Identity)
((O248, E6.0, K7), None)
((O248, E6.0, K3), Identity)
((O248, E6.0, K5), Identity)
((O248, E4.0, K3), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Found best architecture on note10 with latency <= 55.00 ms  It achieves 84.85% predicted accuracy with 54.98 ms latency on note10.
Architecture of the search




{(35,): 16.41508412361145,
 (35, 30): 24.95524525642395,
 (35, 30, 25): 29.147094011306763,
 (35, 30, 25, 20): 41.33411478996277,
 (35, 30, 25, 20, 15): 37.9296932220459,
 (60, 55, 50, 45, 40, 35, 30, 25, 20, 15): 59.22681903839111}
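
One way to read the timing dictionary above is as an amortized cost: divide each total by the number of constraints in the chain. The sketch below copies the values from that output. The names `chain_times` and `seconds_per_constraint` are introduced here for illustration and are not part of the notebook.

```python
# Total search time in seconds for each chain of latency constraints,
# copied from the dictionary printed above.
chain_times = {
    (35,): 16.41508412361145,
    (35, 30): 24.95524525642395,
    (35, 30, 25): 29.147094011306763,
    (35, 30, 25, 20): 41.33411478996277,
    (35, 30, 25, 20, 15): 37.9296932220459,
    (60, 55, 50, 45, 40, 35, 30, 25, 20, 15): 59.22681903839111,
}

# Amortized cost: total time divided by the number of constraints searched.
seconds_per_constraint = {
    chain: total / len(chain) for chain, total in chain_times.items()
}

for chain, avg in seconds_per_constraint.items():
    print(f"{len(chain):2d} constraint(s): {avg:.2f} s per constraint")
```

In these runs the amortized cost falls from about 16.4 s for a single constraint to about 5.9 s per constraint for the ten-constraint chain. Each chain was timed only once here, so the figures are noisy: the five-constraint chain finished faster than the four-constraint one.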

## 2. Accuracy of Discovered Subnetworks

### MobileNetV3

In [17]:
latency_constraints = (60, 55, 50, 45, 40, 35, 30, 25, 20, 15)
run_top_down_evolutionary_search(latency_constraints)

Searching with note10 constraint (60): 100%|██████████| 500/500 [00:19<00:00, 25.53it/s]
Searching with note10 constraint (55): 100%|██████████| 100/100 [00:04<00:00, 21.49it/s]
Searching with note10 constraint (50): 100%|██████████| 100/100 [00:04<00:00, 21.03it/s]
Searching with note10 constraint (45): 100%|██████████| 100/100 [00:04<00:00, 21.41it/s]
Searching with note10 constraint (40): 100%|██████████| 100/100 [00:04<00:00, 20.22it/s]
Searching with note10 constraint (35): 100%|██████████| 100/100 [00:04<00:00, 21.12it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 22.65it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 23.31it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:04<00:00, 23.84it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:04<00:00, 23.03it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 84.74% predicted accuracy with 59.66 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E6.0, K5), None)
((O32, E4.0, K3), Identity)
((O32, E4.0, K5), Identity)
(SE(O48, E6.0, K3), None)
(SE(O48, E6.0, K5), Identity)
(SE(O48, E4.0, K7), Identity)
(SE(O48, E4.0, K7), Identity)
((O96, E6.0, K3), None)
((O96, E3.0, K5), Identity)
((O96, E4.0, K3), Identity)
((O96, E6.0, K7), Identity)
(SE(O136, E6.0, K5), None)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E3.0, K3), Identity)
(SE(O192, E6.0, K7), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E3.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Found best architecture on note10 with latency <= 55.00 ms  It achieves 84.38% predicted accuracy with 54.67 ms latency on note10.
Architecture of the searched 




60.65964221954346

### ResNet50D

In [19]:
latency_constraints = (60, 55, 50, 45, 40, 35, 30, 25, 20, 15)
run_top_down_evolutionary_search(latency_constraints)

Searching with note10 constraint (60): 100%|██████████| 500/500 [00:14<00:00, 33.48it/s]
Searching with note10 constraint (55): 100%|██████████| 100/100 [00:03<00:00, 30.38it/s]
Searching with note10 constraint (50): 100%|██████████| 100/100 [00:03<00:00, 30.76it/s]
Searching with note10 constraint (45): 100%|██████████| 100/100 [00:03<00:00, 28.13it/s]
Searching with note10 constraint (40): 100%|██████████| 100/100 [00:03<00:00, 26.98it/s]
Searching with note10 constraint (35): 100%|██████████| 100/100 [00:03<00:00, 27.23it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:04<00:00, 23.74it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 24.79it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:04<00:00, 23.01it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:05<00:00, 18.45it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 84.65% predicted accuracy with 59.59 ms latency on note10.
Architecture of the searched sub-net:
DyConv(O32, K3, S2)
DyConv(O64, K3, S1)
max_pooling(ks=3, stride=2)
(3x3_BottleneckConv_in->768->256_S1, avgpool_conv)
(3x3_BottleneckConv_in->1536->256_S1, Identity)
(3x3_BottleneckConv_in->768->256_S1, Identity)
(3x3_BottleneckConv_in->1024->256_S1, Identity)
(3x3_BottleneckConv_in->3072->512_S2, avgpool_conv)
(3x3_BottleneckConv_in->3072->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->2048->512_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S2, avgpool_conv)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->6144->1024_S1, Identity)
(3x3_BottleneckConv_in->8192->2048_S2, avgpool_conv)
(3x3_BottleneckConv_in




50.48656511306763

### ProxylessNAS

In [21]:
latency_constraints = (60, 55, 50, 45, 40, 35, 30, 25, 20, 15)
run_top_down_evolutionary_search(latency_constraints)

Searching with note10 constraint (60): 100%|██████████| 500/500 [00:14<00:00, 34.00it/s]
Searching with note10 constraint (55): 100%|██████████| 100/100 [00:03<00:00, 29.39it/s]
Searching with note10 constraint (50): 100%|██████████| 100/100 [00:03<00:00, 26.69it/s]
Searching with note10 constraint (45): 100%|██████████| 100/100 [00:03<00:00, 25.61it/s]
Searching with note10 constraint (40): 100%|██████████| 100/100 [00:04<00:00, 23.98it/s]
Searching with note10 constraint (35): 100%|██████████| 100/100 [00:04<00:00, 24.79it/s]
Searching with note10 constraint (30): 100%|██████████| 100/100 [00:03<00:00, 25.09it/s]
Searching with note10 constraint (25): 100%|██████████| 100/100 [00:04<00:00, 21.85it/s]
Searching with note10 constraint (20): 100%|██████████| 100/100 [00:04<00:00, 20.07it/s]
Searching with note10 constraint (15): 100%|██████████| 100/100 [00:04<00:00, 20.42it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 85.08% predicted accuracy with 53.89 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O40_RELU6_BN
(3x3_MBConv1_RELU6_O24_BN, None)
((O32, E4.0, K3), None)
((O32, E3.0, K5), Identity)
((O32, E4.0, K3), Identity)
((O32, E4.0, K3), Identity)
((O56, E4.0, K7), None)
((O56, E6.0, K3), Identity)
((O56, E4.0, K3), Identity)
((O56, E3.0, K5), Identity)
((O104, E4.0, K7), None)
((O104, E3.0, K7), Identity)
((O104, E6.0, K3), Identity)
((O104, E4.0, K7), Identity)
((O128, E6.0, K5), None)
((O128, E6.0, K5), Identity)
((O128, E6.0, K7), Identity)
((O128, E6.0, K3), Identity)
((O248, E6.0, K5), None)
((O248, E6.0, K3), Identity)
((O248, E6.0, K5), Identity)
((O248, E6.0, K7), Identity)
((O416, E6.0, K7), None)
1x1_Conv_O1664_RELU6_BN
1664x1000_Linear

Found best architecture on note10 with latency <= 55.00 ms  It achieves 85.08% predicted accuracy with 53.89 ms latency on note10.
Architecture of the search




52.4783821105957

## Cost of Finding a Subnetwork for a Single Latency Constraint

In [59]:
times = {}
latency_constraints = (15, 20, 25, 30, 35, 40, 45, 50, 55, 60)

In [60]:
# Search each latency constraint on its own (passed as a one-element tuple)
# and record how long that search takes.
for lc in latency_constraints:
    times[(lc,)] = run_top_down_evolutionary_search((lc,))

Searching with note10 constraint (15): 100%|██████████| 500/500 [00:17<00:00, 27.89it/s]


Found best architecture on note10 with latency <= 15.00 ms  It achieves 78.37% predicted accuracy with 14.97 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E3.0, K5), None)
(SE(O48, E4.0, K3), Identity)
((O96, E4.0, K5), None)
((O96, E4.0, K3), Identity)
(SE(O136, E3.0, K3), None)
(SE(O136, E3.0, K5), Identity)
(SE(O136, E4.0, K3), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E3.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 26.57952380180359


Searching with note10 constraint (20): 100%|██████████| 500/500 [00:19<00:00, 25.02it/s]

Found best architecture on note10 with latency <= 20.00 ms  It achieves 80.67% predicted accuracy with 19.93 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E4.0, K3), None)
(SE(O48, E4.0, K3), Identity)
((O96, E4.0, K5), None)
((O96, E4.0, K5), Identity)
((O96, E3.0, K5), Identity)
((O96, E3.0, K3), Identity)
(SE(O136, E4.0, K3), None)
(SE(O136, E4.0, K5), Identity)
(SE(O136, E3.0, K3), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E4.0, K3), Identity)
(SE(O192, E3.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 20.25872278213501


Searching with note10 constraint (25): 100%|██████████| 500/500 [00:20<00:00, 24.19it/s]

Found best architecture on note10 with latency <= 25.00 ms  It achieves 81.83% predicted accuracy with 24.99 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E3.0, K3), None)
((O32, E4.0, K3), Identity)
(SE(O48, E6.0, K7), None)
(SE(O48, E6.0, K5), Identity)
(SE(O48, E4.0, K3), Identity)
((O96, E4.0, K7), None)
((O96, E4.0, K5), Identity)
((O96, E3.0, K3), Identity)
((O96, E3.0, K3), Identity)
(SE(O136, E4.0, K5), None)
(SE(O136, E4.0, K7), Identity)
(SE(O136, E4.0, K3), Identity)
(SE(O136, E4.0, K5), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E4.0, K5), Identity)
(SE(O192, E3.0, K3), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 20.75973391532898


Searching with note10 constraint (30): 100%|██████████| 500/500 [00:18<00:00, 26.79it/s]

Found best architecture on note10 with latency <= 30.00 ms  It achieves 82.63% predicted accuracy with 29.94 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K5), None)
((O32, E3.0, K3), Identity)
(SE(O48, E4.0, K5), None)
(SE(O48, E4.0, K5), Identity)
(SE(O48, E6.0, K3), Identity)
((O96, E6.0, K7), None)
((O96, E4.0, K5), Identity)
((O96, E3.0, K3), Identity)
(SE(O136, E6.0, K7), None)
(SE(O136, E4.0, K3), Identity)
(SE(O136, E6.0, K3), Identity)
(SE(O136, E4.0, K7), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E3.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 18.716851949691772


Searching with note10 constraint (35): 100%|██████████| 500/500 [00:19<00:00, 25.24it/s]

Found best architecture on note10 with latency <= 35.00 ms  It achieves 83.21% predicted accuracy with 34.86 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E6.0, K5), None)
(SE(O48, E6.0, K3), Identity)
(SE(O48, E4.0, K3), Identity)
(SE(O48, E3.0, K3), Identity)
((O96, E4.0, K5), None)
((O96, E3.0, K7), Identity)
((O96, E4.0, K3), Identity)
(SE(O136, E6.0, K5), None)
(SE(O136, E4.0, K3), Identity)
(SE(O136, E6.0, K7), Identity)
(SE(O136, E4.0, K5), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 19.841947078704834


Searching with note10 constraint (40): 100%|██████████| 500/500 [00:21<00:00, 23.29it/s]

Found best architecture on note10 with latency <= 40.00 ms  It achieves 83.36% predicted accuracy with 39.78 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
(SE(O48, E6.0, K7), None)
(SE(O48, E3.0, K7), Identity)
(SE(O48, E6.0, K7), Identity)
((O96, E6.0, K7), None)
((O96, E6.0, K5), Identity)
((O96, E4.0, K5), Identity)
((O96, E6.0, K5), Identity)
(SE(O136, E6.0, K3), None)
(SE(O136, E4.0, K7), Identity)
(SE(O136, E3.0, K5), Identity)
(SE(O136, E4.0, K5), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E4.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 21.50108790397644


Searching with note10 constraint (45): 100%|██████████| 500/500 [00:21<00:00, 23.79it/s]

Found best architecture on note10 with latency <= 45.00 ms  It achieves 84.16% predicted accuracy with 44.72 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K5), None)
((O32, E3.0, K5), Identity)
(SE(O48, E6.0, K5), None)
(SE(O48, E6.0, K7), Identity)
(SE(O48, E4.0, K7), Identity)
(SE(O48, E3.0, K5), Identity)
((O96, E4.0, K3), None)
((O96, E3.0, K7), Identity)
((O96, E6.0, K3), Identity)
((O96, E6.0, K7), Identity)
(SE(O136, E6.0, K7), None)
(SE(O136, E6.0, K7), Identity)
(SE(O136, E4.0, K7), Identity)
(SE(O136, E4.0, K3), Identity)
(SE(O192, E6.0, K5), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E4.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 21.04682493209839


Searching with note10 constraint (50): 100%|██████████| 500/500 [00:17<00:00, 27.88it/s]

Found best architecture on note10 with latency <= 50.00 ms  It achieves 84.29% predicted accuracy with 49.36 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E3.0, K7), Identity)
((O32, E6.0, K3), Identity)
(SE(O48, E6.0, K5), None)
(SE(O48, E6.0, K7), Identity)
(SE(O48, E4.0, K3), Identity)
(SE(O48, E6.0, K3), Identity)
((O96, E6.0, K5), None)
((O96, E4.0, K7), Identity)
((O96, E6.0, K5), Identity)
((O96, E3.0, K7), Identity)
(SE(O136, E6.0, K7), None)
(SE(O136, E6.0, K7), Identity)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E4.0, K7), Identity)
(SE(O192, E6.0, K5), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E4.0, K5), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 17.964640855789185


Searching with note10 constraint (55): 100%|██████████| 500/500 [00:19<00:00, 25.85it/s]

Found best architecture on note10 with latency <= 55.00 ms  It achieves 84.97% predicted accuracy with 54.94 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K3), None)
((O32, E3.0, K3), Identity)
((O32, E3.0, K5), Identity)
((O32, E3.0, K5), Identity)
(SE(O48, E6.0, K7), None)
(SE(O48, E6.0, K3), Identity)
(SE(O48, E4.0, K7), Identity)
(SE(O48, E4.0, K7), Identity)
((O96, E3.0, K7), None)
((O96, E3.0, K5), Identity)
((O96, E4.0, K3), Identity)
((O96, E6.0, K3), Identity)
(SE(O136, E4.0, K5), None)
(SE(O136, E6.0, K7), Identity)
(SE(O136, E6.0, K7), Identity)
(SE(O136, E6.0, K5), Identity)
(SE(O192, E6.0, K3), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E6.0, K3), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 19.373605012893677


Searching with note10 constraint (60): 100%|██████████| 500/500 [00:19<00:00, 25.43it/s]

Found best architecture on note10 with latency <= 60.00 ms  It achieves 85.04% predicted accuracy with 59.02 ms latency on note10.
Architecture of the searched sub-net:
3x3_Conv_O24_H_SWISH_BN
(3x3_MBConv1_RELU_O24_BN, Identity)
((O32, E4.0, K5), None)
((O32, E3.0, K5), Identity)
((O32, E6.0, K3), Identity)
((O32, E4.0, K5), Identity)
(SE(O48, E6.0, K7), None)
(SE(O48, E6.0, K3), Identity)
(SE(O48, E4.0, K5), Identity)
(SE(O48, E6.0, K5), Identity)
((O96, E4.0, K7), None)
((O96, E6.0, K3), Identity)
((O96, E6.0, K3), Identity)
((O96, E6.0, K7), Identity)
(SE(O136, E6.0, K3), None)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E6.0, K5), Identity)
(SE(O136, E4.0, K3), Identity)
(SE(O192, E6.0, K5), None)
(SE(O192, E6.0, K3), Identity)
(SE(O192, E6.0, K5), Identity)
(SE(O192, E6.0, K7), Identity)
1x1_Conv_O1152_H_SWISH_BN
1x1_Conv_O1536_H_SWISH
1536x1000_Linear

Time: 19.693042278289795





In [61]:
times.values()

dict_values([26.57952380180359, 20.25872278213501, 20.75973391532898, 18.716851949691772, 19.841947078704834, 21.50108790397644, 21.04682493209839, 17.964640855789185, 19.373605012893677, 19.693042278289795])
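
To put these per-constraint timings in context, they can be summed and compared with the single top-down run over the same ten constraints. The sketch below assumes the 60.66 s reported under "### MobileNetV3" is the right comparison point. The printed architectures (H_SWISH layers, SE blocks) suggest the same design space, but the notebook does not state this. The variable names are illustrative.

```python
# Time in seconds for each independent single-constraint search,
# copied from times.values() above (constraints 15, 20, ..., 60).
single_times = [
    26.57952380180359, 20.25872278213501, 20.75973391532898,
    18.716851949691772, 19.841947078704834, 21.50108790397644,
    21.04682493209839, 17.964640855789185, 19.373605012893677,
    19.693042278289795,
]

# Assumption: time of the ten-constraint top-down run reported in section 2.
top_down_total = 60.65964221954346

independent_total = sum(single_times)
ratio = independent_total / top_down_total

print(f"independent searches: {independent_total:.1f} s")
print(f"top-down chain:       {top_down_total:.1f} s")
print(f"ratio:                {ratio:.2f}x")
```

Under that assumption, ten independent searches take roughly 205.7 s in total against about 60.7 s for the top-down chain, a ratio of about 3.4. Each figure comes from a single timing, so the ratio is indicative only.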

# DEMO

Run the cell below to see the top-down strategy in action!

In [None]:
latency_constraints = (35, 30, 25, 20, 15)
run_top_down_evolutionary_search(latency_constraints)
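
The progress bars in the logs above show 500 iterations for the first (loosest) constraint and 100 for each tighter one. One plausible reading is that each search is warm-started from the previous search's population. This is an assumption, not something taken from the notebook's code. The toy sketch below illustrates that idea only. It does not use the OFA predictors, and every name in it (`toy_latency`, `toy_accuracy`, `evolve`, `toy_top_down_search`, the population size, the mutation probability) is hypothetical.

```python
import random

# Toy stand-ins, NOT the OFA predictors: an "architecture" is a list of
# 5 stage widths, latency is their sum, and accuracy has diminishing returns.
CHOICES = (2, 4, 6, 8, 10, 12)
NUM_STAGES = 5


def toy_latency(arch):
    return sum(arch)


def toy_accuracy(arch):
    return sum(w ** 0.5 for w in arch)


def mutate(arch, rng, prob=0.3):
    # Resample each stage width independently with probability `prob`.
    return [rng.choice(CHOICES) if rng.random() < prob else w for w in arch]


def evolve(seed_population, constraint, iterations, rng, pop_size=20):
    # Warm start: keep the seeds that already satisfy the new constraint,
    # then top up with random feasible samples (rejection sampling).
    population = [a for a in seed_population if toy_latency(a) <= constraint]
    population = population[:pop_size]
    while len(population) < pop_size:
        cand = [rng.choice(CHOICES) for _ in range(NUM_STAGES)]
        if toy_latency(cand) <= constraint:
            population.append(cand)

    for _ in range(iterations):
        # Keep the top quarter as parents and refill with feasible mutants.
        population.sort(key=toy_accuracy, reverse=True)
        parents = population[: pop_size // 4]
        children = []
        while len(children) < pop_size - len(parents):
            child = mutate(rng.choice(parents), rng)
            if toy_latency(child) <= constraint:
                children.append(child)
        population = parents + children

    population.sort(key=toy_accuracy, reverse=True)
    return population


def toy_top_down_search(constraints, first_iters=500, later_iters=100, seed=1):
    # Visit constraints from loosest to tightest, reusing the population.
    rng = random.Random(seed)
    population, best = [], {}
    for i, constraint in enumerate(constraints):
        iters = first_iters if i == 0 else later_iters
        population = evolve(population, constraint, iters, rng)
        best[constraint] = population[0]
    return best


best = toy_top_down_search((35, 30, 25, 20, 15))
for constraint, arch in best.items():
    print(constraint, arch, toy_latency(arch))
```

The design choice this illustrates is that later searches get a smaller iteration budget because they start from a population the previous search has already refined, instead of starting from random samples.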