Automated machine learning (AutoML) provides methods to search for a well-performing neural architecture and good hyperparameter settings for a given task.

One way to think of machine learning algorithms is that they automate the process of learning relationships between given inputs and outputs. In traditional software engineering, we would have to explicitly code these relationships as functions that take an input and return an output. In machine learning, models find such functions for us. Although this automates part of the work, a lot remains to be done by hand. Besides mining and cleaning data, here are a few routine tasks that must be performed to obtain those functions:
- Choosing a machine learning model (or a model family and then a model)
- Deciding the model architecture (especially in the case of deep learning)
- Choosing hyperparameters
- Adjusting hyperparameters based on validation set performance
- Trying different models (or model families)

These are the kinds of tasks that call for a human machine learning expert. Most of these steps are manual and either take a lot of time or require considerable expertise to shorten that time. Meanwhile, there are far fewer machine learning experts than are needed to create and deploy the machine learning models that are increasingly popular, valuable, and useful across both industry and academia.

This is where AutoML comes to the rescue. AutoML has become a discipline within the field of machine learning that aims to automate the previously listed steps and beyond.
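
To make the manual routine concrete, here is a minimal sketch of the loop that AutoML tools automate: sample hyperparameters, train, score on a validation set, and keep the best configuration. It uses only the Python standard library and a toy linear-regression problem. The helper names (`make_data`, `train`, `mse`, `random_search`) and the search ranges are illustrative assumptions, not part of any AutoML library.

```python
import random

def make_data(n, rng):
    # Synthetic data: y = 3x + 1 plus a little Gaussian noise.
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3 * x + 1 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

def train(xs, ys, lr, epochs):
    # Fit y = w*x + b by full-batch gradient descent on the mean squared error.
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(params, xs, ys):
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def random_search(n_trials=20, seed=0):
    rng = random.Random(seed)
    x_train, y_train = make_data(200, rng)
    x_val, y_val = make_data(50, rng)
    best = None
    for _ in range(n_trials):
        # Sample one hyperparameter configuration (learning rate on a log scale).
        config = {"lr": 10 ** rng.uniform(-4, -0.5),
                  "epochs": rng.choice([10, 50, 100])}
        params = train(x_train, y_train, **config)
        # Judge the configuration on held-out data, not on the training set.
        val_loss = mse(params, x_val, y_val)
        if best is None or val_loss < best[0]:
            best = (val_loss, config)
    return best

best_loss, best_config = random_search()
print(best_loss, best_config)
```

Random search is the simplest possible strategy. Tools such as Auto-PyTorch replace the sampling step with smarter search and extend the configuration to cover the architecture as well.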

In this section, we will look at Auto-PyTorch, an AutoML tool for PyTorch that performs both neural architecture search and hyperparameter search. We will first load the dataset, then define an Auto-PyTorch model search instance, and finally run the model search routine, which will provide us with a best-performing model.


We will also look at another AutoML tool called Optuna that performs hyperparameter search for a PyTorch model.

In [None]:
!pip install git+https://github.com/shukon/HpBandSter.git
!pip install autoPyTorch==0.0.2
!pip install torchviz==0.0.1
!pip install configspace==0.4.12

In [None]:
import torch
from torchviz import make_dot
from torchvision import datasets, transforms
from autoPyTorch import AutoNetClassification

import matplotlib.pyplot as plt
import numpy as np

In [None]:
train_ds = datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1302,), (0.3069,))]))

test_ds = datasets.MNIST('../data', train=False, 
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1302,), (0.3069,))]))

In [None]:
# Flatten each 28x28 image into a 784-dimensional feature vector
X_train = train_ds.data.numpy().reshape(-1, 28*28)
X_test = test_ds.data.numpy().reshape(-1, 28*28)
# Labels as plain NumPy arrays
y_train = train_ds.targets.numpy()
y_test = test_ds.targets.numpy()
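
The reshape above turns a stack of 28x28 images into a table with one row per image and 784 columns. The sketch below illustrates this on a small random array that stands in for `train_ds.data.numpy()`; the name `fake_images` and the batch of 5 images are assumptions made for illustration. Note also that, as far as we can tell, the `.data` attribute holds the raw `uint8` pixels, so the `ToTensor` and `Normalize` transforms defined earlier are not applied to these arrays.

```python
import numpy as np

# Stand-in for train_ds.data.numpy(): 5 random grayscale images, 28x28, uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the number of rows (one per image).
flat = fake_images.reshape(-1, 28 * 28)

print(flat.shape)  # (5, 784)
# Row-major layout: column 28 of the flat row is the first pixel of image row 1.
print(flat[0, 28] == fake_images[0, 1, 0])  # True
```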

In [None]:
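# Configure the Auto-PyTorch search
autoPyTorch = AutoNetClassification("tiny_cs",  # config preset
                                    log_level='info',
                                    max_runtime=2000,  # limit on the overall search
                                    min_budget=100,    # smallest budget given to a configuration
                                    max_budget=1500)   # largest budget given to a configuration

# Run the search, holding out 10% of the training data for validation
autoPyTorch.fit(X_train, y_train, validation_split=0.1)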

09:19:15 [AutoNet] Start bohb
09:19:15 WORKER: start listening for jobs
09:19:15 DISPATCHER: started the 'discover_worker' thread
09:19:15 DISPATCHER: started the 'job_runner' thread
09:19:15 DISPATCHER: Pyro daemon running on 172.28.0.2:41259
09:19:15 DISPATCHER: discovered new worker, hpbandster.run_0.worker.b20b68280430.581.-1140550406272896
09:19:15 HBMASTER: adjusted queue size to (0, 1)
09:19:15 DISPATCHER: A new worker triggered discover_worker
09:19:15 HBMASTER: starting run at 1631265555.806195
09:19:15 WORKER: start processing job (0, 0, 0)


True


09:19:15 Fit optimization pipeline
09:19:15 [AutoNet] No validation set given and either no cross validator given or budget too low for CV. Continue by splitting 0.1 of training data.
09:19:15 [AutoNet] CV split 0 of 1
09:19:15 Reduced initial budget 166.5209392706553 to cv budget 166.51660958925882 compensate for 0.004329681396484375
09:21:52 Finished train with budget 166.51660958925882: Preprocessing took 14s, Training took 142s, Wrap up took 0s. Total time consumption in s: 156
09:21:52 [AutoNet] Done with current split!
09:21:52 Aggregate the results across the splits
09:21:52 Process 1 additional result(s)
09:21:52 Training ['shapedresnet'] with budget 166.66666666666666 resulted in optimize-metric-loss: -86.61666666666666 took 157.09570217132568 seconds
09:21:52 WORKER: registered result for job (0, 0, 0) with dispatcher
09:21:52 WORKER: start processing job (0, 0, 1)
09:21:52 Fit optimization pipeline
09:21:53 [AutoNet] No validation set given and either no cross validator give

{'budget': 166.66666666666666,
 'info': {'loss': 0.1887385893130192,
  'lr': 2.8504234646506757e-05,
  'lr_scheduler_converged': 1.0,
  'train_accuracy': 96.6925925925926,
  'val_accuracy': 96.71666666666667},
 'loss': -95.71666666666667,
 'optimized_hyperparameter_config': {'CreateDataLoader:batch_size': 125,
  'Imputation:strategy': 'median',
  'InitializationSelector:initialization_method': 'default',
  'InitializationSelector:initializer:initialize_bias': 'No',
  'LearningrateSchedulerSelector:cosine_annealing:T_max': 10,
  'LearningrateSchedulerSelector:cosine_annealing:T_mult': 2,
  'LearningrateSchedulerSelector:lr_scheduler': 'cosine_annealing',
  'LossModuleSelector:loss_module': 'cross_entropy_weighted',
  'NetworkSelector:network': 'shapedresnet',
  'NetworkSelector:shapedresnet:activation': 'relu',
  'NetworkSelector:shapedresnet:blocks_per_group': 3,
  'NetworkSelector:shapedresnet:max_units': 60,
  'NetworkSelector:shapedresnet:num_groups': 1,
  'NetworkSelector:shapedres
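
The budget of roughly 166.67 reported in the log and in the result above follows from `min_budget=100` and `max_budget=1500`. The sketch below reproduces that value with a successive-halving style budget ladder, as we understand the BOHB optimizer to compute it. The halving factor `eta = 3` is an assumption (it is not set anywhere in this section), and `successive_halving_budgets` is a hypothetical helper written for illustration, not an Auto-PyTorch or HpBandSter function.

```python
import math

def successive_halving_budgets(min_budget, max_budget, eta=3):
    # Number of rungs that fit between min_budget and max_budget
    # when each rung multiplies the budget by eta.
    n_rungs = -int(math.log(min_budget / max_budget) / math.log(eta)) + 1
    # Budgets grow geometrically and end exactly at max_budget.
    return [max_budget / eta ** (n_rungs - 1 - i) for i in range(n_rungs)]

budgets = successive_halving_budgets(100, 1500)
print(budgets)  # roughly [166.67, 500.0, 1500.0]
```

Under these assumptions, the smallest budget actually used is 1500 / 9, not the requested minimum of 100. Many configurations are tried at that cheap budget, and only the most promising ones are promoted to the larger budgets.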

In [None]:
y_pred = autoPyTorch.predict(X_test)
print("Accuracy score", np.mean(y_pred.reshape(-1) == y_test))

Accuracy score 0.9691


In [None]:
pytorch_model = autoPyTorch.get_pytorch_model()
print(pytorch_model)

Sequential(
  (0): Linear(in_features=100, out_features=100, bias=True)
  (1): Sequential(
    (0): ResBlock(
      (layers): Sequential(
        (0): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): ReLU()
        (2): Linear(in_features=100, out_features=100, bias=True)
        (3): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): ReLU()
        (5): Linear(in_features=100, out_features=100, bias=True)
      )
    )
    (1): ResBlock(
      (layers): Sequential(
        (0): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (1): ReLU()
        (2): Linear(in_features=100, out_features=100, bias=True)
        (3): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4): ReLU()
        (5): Linear(in_features=100, out_features=100, bias=True)
      )
    )
    (2): ResBlock(
      (layers): Sequential(
        (0): BatchNo

In [None]:
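# Pass a random dummy input through the model so torchviz can trace the graph
x = torch.randn(1, pytorch_model[0].in_features)
y = pytorch_model(x)
arch = make_dot(y.mean(), params=dict(pytorch_model.named_parameters()))
# Render the computation graph to a PDF file without opening a viewer
arch.format = "pdf"
arch.filename = "convnet_arch"
arch.render(view=False)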

'convnet_arch.pdf'

In [None]:
autoPyTorch.get_hyperparameter_search_space()

Configuration space object:
  Hyperparameters:
    CreateDataLoader:batch_size, Type: Constant, Value: 125
    Imputation:strategy, Type: Categorical, Choices: {median}, Default: median
    InitializationSelector:initialization_method, Type: Categorical, Choices: {default}, Default: default
    InitializationSelector:initializer:initialize_bias, Type: Constant, Value: No
    LearningrateSchedulerSelector:cosine_annealing:T_max, Type: Constant, Value: 10
    LearningrateSchedulerSelector:cosine_annealing:T_mult, Type: Constant, Value: 2
    LearningrateSchedulerSelector:lr_scheduler, Type: Categorical, Choices: {cosine_annealing}, Default: cosine_annealing
    LossModuleSelector:loss_module, Type: Categorical, Choices: {cross_entropy_weighted}, Default: cross_entropy_weighted
    NetworkSelector:network, Type: Categorical, Choices: {shapedresnet}, Default: shapedresnet
    NetworkSelector:shapedresnet:activation, Type: Constant, Value: relu
    NetworkSelector:shapedresnet:blocks_per_gr