# Neural Architecture Search with DARTS

In this example you will deploy a Katib experiment with the Differentiable Architecture Search (DARTS) algorithm using a Jupyter Notebook and the Katib SDK. Your Kubernetes cluster must have at least one GPU for this example.

You can read more about how we use DARTS in Katib [here](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1beta1/nas/darts).

The notebook shows how to create an experiment, get it, check its status, and delete it.

# Install required package

In [1]:
pip install kubeflow-katib

Defaulting to user installation because normal site-packages is not writeable
Collecting kubeflow-katib
  Downloading kubeflow_katib-0.0.5-py3-none-any.whl (112 kB)
     |████████████████████████████████| 112 kB 18.7 MB/s eta 0:00:01
Collecting table-logger>=0.3.5
  Downloading table_logger-0.3.6-py3-none-any.whl (14 kB)
Installing collected packages: table-logger, kubeflow-katib
Successfully installed kubeflow-katib-0.0.5 table-logger-0.3.6
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


## Restart the Notebook kernel to use the SDK package

In [2]:
from IPython.display import display_html
display_html("<script>Jupyter.notebook.kernel.restart()</script>",raw=True)
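
After the kernel restarts, it can be useful to confirm that the package is visible to the new kernel. The snippet below is a minimal sketch that uses only the standard library; the helper name `installed_version` is an assumption made for illustration and is not part of the Katib SDK.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install above succeeded, otherwise None.
print(installed_version("kubeflow-katib"))
```

If this prints `None`, the package was most likely installed for a different Python interpreter than the one the kernel runs.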

## Import required packages

In [1]:
from kubeflow.katib import KatibClient
from kubernetes.client import V1ObjectMeta
from kubeflow.katib import V1beta1Experiment
from kubeflow.katib import V1beta1AlgorithmSpec
from kubeflow.katib import V1beta1AlgorithmSetting
from kubeflow.katib import V1beta1ObjectiveSpec
from kubeflow.katib import V1beta1MetricsCollectorSpec
from kubeflow.katib import V1beta1CollectorSpec
from kubeflow.katib import V1beta1SourceSpec
from kubeflow.katib import V1beta1FilterSpec
from kubeflow.katib import V1beta1FeasibleSpace
from kubeflow.katib import V1beta1ExperimentSpec
from kubeflow.katib import V1beta1NasConfig
from kubeflow.katib import V1beta1GraphConfig
from kubeflow.katib import V1beta1Operation
from kubeflow.katib import V1beta1ParameterSpec
from kubeflow.katib import V1beta1TrialTemplate
from kubeflow.katib import V1beta1TrialParameterSpec

## Define experiment

You have to create an experiment object before deploying it. This experiment is similar to [this](https://github.com/kubeflow/katib/blob/master/examples/v1beta1/nas/darts-example-gpu.yaml) example.

You can read more about the DARTS algorithm settings [here](https://www.kubeflow.org/docs/components/hyperparameter-tuning/experiment/#differentiable-architecture-search-darts).

In [2]:
# Experiment metadata
namespace = "anonymous"
experiment_name = "darts-example"

metadata = V1ObjectMeta(
    name=experiment_name,
    namespace=namespace
)


# Algorithm specification
algorithm_spec=V1beta1AlgorithmSpec(
    algorithm_name="darts",
    algorithm_settings=[
        V1beta1AlgorithmSetting(
            name="num_epochs",
            value="2"
        ),
        V1beta1AlgorithmSetting(
            name="stem_multiplier",
            value="1"
        ),
        V1beta1AlgorithmSetting(
            name="init_channels",
            value="4"
        ),
        V1beta1AlgorithmSetting(
            name="num_nodes",
            value="3"
        ),
        
    ]
)

# Objective specification. For DARTS, the goal is omitted.
objective_spec=V1beta1ObjectiveSpec(
    type="maximize",
    objective_metric_name="Best-Genotype",
)

# Metrics collector specification.
# Specify the metrics format so that the Genotype can be parsed from the training container output.
metrics_collector_spec=V1beta1MetricsCollectorSpec(
    collector=V1beta1CollectorSpec(
        kind="StdOut"
    ),
    source=V1beta1SourceSpec(
        filter=V1beta1FilterSpec(
            metrics_format=[
                "([\\w-]+)=(Genotype.*)"
            ]
        )
    )
)

# Configuration for the Neural Network (NN)
# This NN has 2 layers and 5 operation types with different parameters.
nas_config=V1beta1NasConfig(
    graph_config=V1beta1GraphConfig(
        num_layers=2
    ),
    operations=[
        V1beta1Operation(
            operation_type="separable_convolution",
            parameters=[
                V1beta1ParameterSpec(
                    name="filter_size",
                    parameter_type="categorical",
                    feasible_space=V1beta1FeasibleSpace(
                        list=["3"]
                    ),
                )
            ]
        ),
        V1beta1Operation(
            operation_type="dilated_convolution",
            parameters=[
                V1beta1ParameterSpec(
                    name="filter_size",
                    parameter_type="categorical",
                    feasible_space=V1beta1FeasibleSpace(
                        list=["3", "5"]
                    ),
                )
            ]
        ),
        V1beta1Operation(
            operation_type="avg_pooling",
            parameters=[
                V1beta1ParameterSpec(
                    name="filter_size",
                    parameter_type="categorical",
                    feasible_space=V1beta1FeasibleSpace(
                        list=["3"]
                    ),
                )
            ]
        ),
        V1beta1Operation(
            operation_type="max_pooling",
            parameters=[
                V1beta1ParameterSpec(
                    name="filter_size",
                    parameter_type="categorical",
                    feasible_space=V1beta1FeasibleSpace(
                        list=["3"]
                    ),
                )
            ]
        ),
        V1beta1Operation(
            operation_type="skip_connection",
        ),
    ]
)


# JSON trial template specification
trial_spec={
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "training-container",
                        "image": "docker.io/kubeflowkatib/darts-cnn-cifar10:v1beta1-e294a90",
                        "command": [
                            'python3',
                            'run_trial.py',
                            '--algorithm-settings="${trialParameters.algorithmSettings}"',
                            '--search-space="${trialParameters.searchSpace}"',
                            '--num-layers="${trialParameters.numberLayers}"'
                        ],
                        # Training container requires 1 GPU
                        "resources": {
                            "limits": {
                                "nvidia.com/gpu": 1
                            }
                        }
                    }
                ],
                "restartPolicy": "Never"
            }
        }
    }
}

# Template with trial parameters and trial spec
# Set retain to True to save trial resources after completion.
trial_template=V1beta1TrialTemplate(
    retain=True,
    trial_parameters=[
        V1beta1TrialParameterSpec(
            name="algorithmSettings",
            description="Algorithm settings of DARTS Experiment",
            reference="algorithm-settings"
        ),
        V1beta1TrialParameterSpec(
            name="searchSpace",
            description="Search Space of DARTS Experiment",
            reference="search-space"
        ),
        V1beta1TrialParameterSpec(
            name="numberLayers",
            description="Number of Neural Network layers",
            reference="num-layers"
        ),
    ],
    trial_spec=trial_spec
)


# Experiment object
experiment = V1beta1Experiment(
    api_version="kubeflow.org/v1beta1",
    kind="Experiment",
    metadata=metadata,
    spec=V1beta1ExperimentSpec(
        max_trial_count=1,
        parallel_trial_count=1,
        max_failed_trial_count=1,
        algorithm=algorithm_spec,
        objective=objective_spec,
        metrics_collector_spec=metrics_collector_spec,
        nas_config=nas_config,
        trial_template=trial_template,
    )
)
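
The `metrics_format` regular expression above has two capture groups: the metric name and the metric value. The following sketch checks the pattern with Python's `re` module against a made-up log line. The log line is hypothetical, and Katib's metrics collector performs its own matching, so treat this only as a quick way to see what the pattern captures.

```python
import re

# Same pattern as in metrics_format above, written as a raw string.
METRICS_FORMAT = r"([\w-]+)=(Genotype.*)"

# Hypothetical log line; a real trial prints a much longer Genotype.
log_line = "Best-Genotype=Genotype(normal=[[('skip_connection',0)]],normal_concat=range(2,5))"

match = re.search(METRICS_FORMAT, log_line)
metric_name, metric_value = match.groups()

print(metric_name)   # first group: the metric name
print(metric_value)  # second group: everything from "Genotype" to the end of the line
```

The first group has to equal `objective_metric_name` (`Best-Genotype` here) for the value to be recorded as the objective metric.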

You can print the experiment's info to verify it before submission.

In [3]:
# Print trial template container info
print(experiment.spec.trial_template.trial_spec["spec"]["template"]["spec"]["containers"][0])

{'name': 'training-container', 'image': 'docker.io/kubeflowkatib/darts-cnn-cifar10:v1beta1-e294a90', 'command': ['python3', 'run_trial.py', '--algorithm-settings="${trialParameters.algorithmSettings}"', '--search-space="${trialParameters.searchSpace}"', '--num-layers="${trialParameters.numberLayers}"'], 'resources': {'limits': {'nvidia.com/gpu': 1}}}
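
The `${trialParameters.<name>}` placeholders in the command are filled in by Katib when it creates a trial. The sketch below only illustrates the idea with plain string replacement; `render_command` is a hypothetical helper, not a Katib function, and the value `"2"` is an assumed example.

```python
def render_command(command, values):
    """Replace every ${trialParameters.<name>} placeholder with its value."""
    rendered = []
    for argument in command:
        for name, value in values.items():
            argument = argument.replace("${trialParameters.%s}" % name, value)
        rendered.append(argument)
    return rendered


# Shortened copy of the container command from the trial spec.
command = [
    "python3",
    "run_trial.py",
    '--num-layers="${trialParameters.numberLayers}"',
]

# Hypothetical value standing in for what Katib would supply.
print(render_command(command, {"numberLayers": "2"}))
```

Each placeholder name must match the `name` of a `V1beta1TrialParameterSpec` in the trial template, otherwise there is nothing to substitute.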


# Create experiment

You have to create a Katib client to use the SDK.

TODO (andreyvelich): Current experiment link for NAS is incorrect.

In [4]:
# Create client
kclient = KatibClient()

# Create experiment
kclient.create_experiment(experiment, namespace=namespace)

{'apiVersion': 'kubeflow.org/v1beta1',
 'kind': 'Experiment',
 'metadata': {'creationTimestamp': '2020-09-14T23:21:51Z',
  'generation': 1,
  'name': 'darts-example',
  'namespace': 'anonymous',
  'resourceVersion': '127105457',
  'selfLink': '/apis/kubeflow.org/v1beta1/namespaces/anonymous/experiments/darts-example',
  'uid': 'b6b73ce9-e1a5-4d7f-8569-88da89285671'},
 'spec': {'algorithm': {'algorithmName': 'darts',
   'algorithmSettings': [{'name': 'num_epochs', 'value': '2'},
    {'name': 'stem_multiplier', 'value': '1'},
    {'name': 'init_channels', 'value': '4'},
    {'name': 'num_nodes', 'value': '3'}]},
  'maxFailedTrialCount': 1,
  'maxTrialCount': 1,
  'metricsCollectorSpec': {'collector': {'kind': 'StdOut'},
   'source': {'filter': {'metricsFormat': ['([\\w-]+)=(Genotype.*)']}}},
  'nasConfig': {'graphConfig': {'numLayers': 2},
   'operations': [{'operationType': 'separable_convolution',
     'parameters': [{'feasibleSpace': {'list': ['3']},
       'name': 'filter_size',
    

# Get experiment

You can get the experiment by name and read the data you need from the returned object.

In [7]:
exp = kclient.get_experiment(name=experiment_name, namespace=namespace)
print(exp)
print("-----------------\n")

# Get last status
print(exp["status"]["conditions"][-1])

{'apiVersion': 'kubeflow.org/v1beta1', 'kind': 'Experiment', 'metadata': {'creationTimestamp': '2020-09-14T23:21:51Z', 'finalizers': ['update-prometheus-metrics'], 'generation': 1, 'name': 'darts-example', 'namespace': 'anonymous', 'resourceVersion': '127105807', 'selfLink': '/apis/kubeflow.org/v1beta1/namespaces/anonymous/experiments/darts-example', 'uid': 'b6b73ce9-e1a5-4d7f-8569-88da89285671'}, 'spec': {'algorithm': {'algorithmName': 'darts', 'algorithmSettings': [{'name': 'num_epochs', 'value': '2'}, {'name': 'stem_multiplier', 'value': '1'}, {'name': 'init_channels', 'value': '4'}, {'name': 'num_nodes', 'value': '3'}]}, 'maxFailedTrialCount': 1, 'maxTrialCount': 1, 'metricsCollectorSpec': {'collector': {'kind': 'StdOut'}, 'source': {'filter': {'metricsFormat': ['([\\w-]+)=(Genotype.*)']}}}, 'nasConfig': {'graphConfig': {'numLayers': 2}, 'operations': [{'operationType': 'separable_convolution', 'parameters': [{'feasibleSpace': {'list': ['3']}, 'name': 'filter_size', 'parameterType'
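
Indexing `exp["status"]["conditions"][-1]` directly raises an error if the status has not been populated yet, which can happen right after creation. The helper below is a minimal sketch of a more defensive lookup. `latest_condition` is a hypothetical name, and the example dictionary is a hand-written stand-in for the object returned by `get_experiment`, not real cluster output.

```python
def latest_condition(experiment, condition_type=None):
    """Return the most recent condition, optionally filtered by type, or None."""
    conditions = experiment.get("status", {}).get("conditions", [])
    if condition_type is not None:
        conditions = [c for c in conditions if c.get("type") == condition_type]
    return conditions[-1] if conditions else None


# Hand-written stand-in for an experiment object.
example = {
    "status": {
        "conditions": [
            {"type": "Created", "status": "True"},
            {"type": "Running", "status": "False"},
            {"type": "Succeeded", "status": "True"},
        ]
    }
}

print(latest_condition(example))
print(latest_condition({}))  # no status yet
```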

# Get current experiment status

You can check the current experiment status.

In [9]:
kclient.get_experiment_status(name=experiment_name, namespace=namespace)

'Succeeded'

You can check whether the experiment has succeeded.

In [10]:
kclient.is_experiment_succeeded(name=experiment_name, namespace=namespace)

True
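
A DARTS trial can take a while, so the status checks above are often wrapped in a polling loop. The sketch below shows one way to do that. `wait_for_status` is a hypothetical helper, and the fake status source stands in for the cluster so that the example runs anywhere.

```python
import time


def wait_for_status(get_status, done=("Succeeded", "Failed"), timeout=600, interval=10):
    """Call get_status() until it returns a value in `done` or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("last observed status: %r" % (status,))
        time.sleep(interval)


# Stand-in for the cluster: yields a fixed sequence of statuses.
fake_statuses = iter(["Created", "Running", "Succeeded"])
print(wait_for_status(lambda: next(fake_statuses), interval=0))
```

Against a real cluster, `get_status` could be `lambda: kclient.get_experiment_status(name=experiment_name, namespace=namespace)`, with `timeout` and `interval` chosen to suit the training time.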

# Get best Genotype

Currently, the best Genotype is stored in the optimal trial. The latest Genotype reported by that trial is the best one.

Check the trial logs to get more information about the training process.

In [11]:
opt_trial = kclient.get_optimal_hyperparameters(name=experiment_name, namespace=namespace)

best_genotype = opt_trial["currentOptimalTrial"]["observation"]["metrics"][0]["latest"]
print(best_genotype)

Genotype(normal=[[('separable_convolution_3x3',0),('dilated_convolution_3x3',1)],[('dilated_convolution_5x5',1),('dilated_convolution_3x3',2)],[('dilated_convolution_5x5',2),('separable_convolution_3x3',3)]],normal_concat=range(2,5),reduce=[[('separable_convolution_3x3',1),('separable_convolution_3x3',0)],[('max_pooling_3x3',2),('max_pooling_3x3',1)],[('max_pooling_3x3',2),('max_pooling_3x3',3)]],reduce_concat=range(2,5))
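
The Genotype is returned as a string. To work with it as data, one option is to parse it with the standard `ast` module rather than `eval`. The sketch below assumes the string always has the `Genotype(key=value, ...)` shape shown above; `parse_genotype` is a hypothetical helper, and the input is a shortened copy of the printed output.

```python
import ast


def parse_genotype(text):
    """Parse a 'Genotype(...)' string into a dict without using eval()."""
    call = ast.parse(text, mode="eval").body
    if not (isinstance(call, ast.Call) and getattr(call.func, "id", None) == "Genotype"):
        raise ValueError("expected a Genotype(...) expression")
    parsed = {}
    for keyword in call.keywords:
        value = keyword.value
        if isinstance(value, ast.Call) and getattr(value.func, "id", None) == "range":
            # range(a, b) is not a literal, so rebuild it from its literal arguments.
            parsed[keyword.arg] = range(*(ast.literal_eval(arg) for arg in value.args))
        else:
            parsed[keyword.arg] = ast.literal_eval(value)
    return parsed


# Shortened copy of the Genotype printed above.
text = (
    "Genotype(normal=[[('separable_convolution_3x3',0),('dilated_convolution_3x3',1)]],"
    "normal_concat=range(2,5),"
    "reduce=[[('max_pooling_3x3',2),('max_pooling_3x3',1)]],"
    "reduce_concat=range(2,5))"
)

genotype = parse_genotype(text)
print(genotype["normal"][0])
print(list(genotype["normal_concat"]))
```

Each inner list holds the `(operation, input index)` pairs of one node, so the parsed result can be iterated to list the selected operations.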


# Delete experiments

You can delete the experiment by name.

In [12]:
kclient.delete_experiment(name=experiment_name, namespace=namespace)

{'apiVersion': 'kubeflow.org/v1beta1',
 'kind': 'Experiment',
 'metadata': {'creationTimestamp': '2020-09-14T23:21:51Z',
  'deletionGracePeriodSeconds': 0,
  'deletionTimestamp': '2020-09-14T23:27:32Z',
  'finalizers': ['update-prometheus-metrics'],
  'generation': 2,
  'name': 'darts-example',
  'namespace': 'anonymous',
  'resourceVersion': '127108134',
  'selfLink': '/apis/kubeflow.org/v1beta1/namespaces/anonymous/experiments/darts-example',
  'uid': 'b6b73ce9-e1a5-4d7f-8569-88da89285671'},
 'spec': {'algorithm': {'algorithmName': 'darts',
   'algorithmSettings': [{'name': 'num_epochs', 'value': '2'},
    {'name': 'stem_multiplier', 'value': '1'},
    {'name': 'init_channels', 'value': '4'},
    {'name': 'num_nodes', 'value': '3'}]},
  'maxFailedTrialCount': 1,
  'maxTrialCount': 1,
  'metricsCollectorSpec': {'collector': {'kind': 'StdOut'},
   'source': {'filter': {'metricsFormat': ['([\\w-]+)=(Genotype.*)']}}},
  'nasConfig': {'graphConfig': {'numLayers': 2},
   'operations': [{'o