# HyperParameter tuning using CMA-ES

In this example, you will deploy three Katib Experiments that use the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm, working from a Jupyter Notebook with the Katib SDK. The Experiments differ in their resume policies.

Reference documentation:
- https://www.kubeflow.org/docs/components/katib/experiment/#cmaes
- https://www.kubeflow.org/docs/components/katib/resume-experiment/

The notebook shows how to create an Experiment, get it, check its status and delete it.

In [1]:
# Install required package (Katib SDK).
!pip install kubeflow-katib==0.12.0



## Import required packages

In [17]:
import copy

from kubeflow.katib import KatibClient
from kubernetes.client import V1ObjectMeta
from kubeflow.katib import V1beta1Experiment
from kubeflow.katib import V1beta1AlgorithmSpec
from kubeflow.katib import V1beta1ObjectiveSpec
from kubeflow.katib import V1beta1FeasibleSpace
from kubeflow.katib import V1beta1ExperimentSpec
from kubeflow.katib import V1beta1ParameterSpec
from kubeflow.katib import V1beta1TrialTemplate
from kubeflow.katib import V1beta1TrialParameterSpec

## Define your Experiment

You have to create your Experiment object before deploying it. This Experiment is similar to the [CMA-ES example](https://github.com/kubeflow/katib/blob/master/examples/v1beta1/hp-tuning/cmaes.yaml) in the Katib repository.

In [22]:
# Experiment name and namespace.
namespace = "kubeflow-user-example-com"
experiment_name = "autogl-cora-suggestion"

metadata = V1ObjectMeta(
    name=experiment_name,
    namespace=namespace
)

# Algorithm specification.
algorithm_spec=V1beta1AlgorithmSpec(
    algorithm_name="cmaes"
)

# Objective specification.
objective_spec=V1beta1ObjectiveSpec(
    type="maximize",
    goal=0.95,
    objective_metric_name="Validation-accuracy",
    additional_metric_names=["Train-accuracy"]
)

# Experiment search space. In this example we tune the learning rate, the number of layers and the activation function.
parameters=[
    V1beta1ParameterSpec(
        name="lr",
        parameter_type="double",
        feasible_space=V1beta1FeasibleSpace(
            min="0.01",
            max="0.06"
        ),
    ),
    V1beta1ParameterSpec(
        name="num-layers",
        parameter_type="int",
        feasible_space=V1beta1FeasibleSpace(
            min="2",
            max="5"
        ),
    ),
    V1beta1ParameterSpec(
        name="activation",
        parameter_type="categorical",
        feasible_space=V1beta1FeasibleSpace(
            list=["leaky_relu","relu","elu","tanh"]
        ),
    )
]



# JSON template specification for the Trial's Worker Kubernetes Job.
trial_spec={
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "metadata": {
                "annotations": {
                    "sidecar.istio.io/inject": "false"
                }
            },
            "spec": {
                "containers": [
                    {
                        "name": "training-container",
                        "image": "docker.io/skirpichenko/tigergraph:autogl-dev",
                        "command": ["/home/user/TigerAutoGL/containers/autogl/run.sh",
                            "--batch-size=64",
                            "--lr=${trialParameters.learningRate}",
                            "--nlayers=${trialParameters.numberLayers}",
                            "--activation=${trialParameters.activation}"
                        ]
                    }
                ],
                "restartPolicy": "Never"
            }
        }
    }
}

# Configure parameters for the Trial template.
trial_template=V1beta1TrialTemplate(
    primary_container_name="training-container",
    trial_parameters=[
        V1beta1TrialParameterSpec(
            name="learningRate",
            description="Learning rate for the training model",
            reference="lr"
        ),
        V1beta1TrialParameterSpec(
            name="numberLayers",
            description="Number of training model layers",
            reference="num-layers"
        ),
        V1beta1TrialParameterSpec(
            name="activation",
            description="NN activation function",
            reference="activation"
        ),
    ],
    trial_spec=trial_spec
)


# Experiment object.
experiment = V1beta1Experiment(
    api_version="kubeflow.org/v1beta1",
    kind="Experiment",
    metadata=metadata,
    spec=V1beta1ExperimentSpec(
        max_trial_count=20,
        parallel_trial_count=3,
        max_failed_trial_count=3,
        algorithm=algorithm_spec,
        objective=objective_spec,
        parameters=parameters,
        trial_template=trial_template,
    )
)
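
Each `${trialParameters.<name>}` placeholder in the trial spec is tied to a trial parameter, whose `reference` in turn names a parameter in the search space. A typo in any of these names tends to show up only once Trials start, so a quick local check can help. The following is a minimal sketch and not part of the Katib SDK: `check_trial_template` is a hypothetical helper that works on plain dicts mirroring the objects defined above.

```python
import json
import re

# Matches placeholders such as ${trialParameters.learningRate}.
PLACEHOLDER = re.compile(r"\$\{trialParameters\.([A-Za-z0-9_\-]+)\}")


def check_trial_template(trial_spec, trial_parameters, search_space_names):
    """Return a list of problems; an empty list means the names are consistent.

    trial_spec: the Job template as a dict.
    trial_parameters: mapping of trial parameter name -> referenced search-space name.
    search_space_names: names of the parameters in the Experiment search space.
    """
    problems = []
    # Serialize the spec so placeholders are found at any nesting depth.
    used = set(PLACEHOLDER.findall(json.dumps(trial_spec)))
    for name in sorted(used - set(trial_parameters)):
        problems.append(f"placeholder '{name}' has no trial parameter")
    for name in sorted(set(trial_parameters) - used):
        problems.append(f"trial parameter '{name}' is never used in the spec")
    for name, ref in sorted(trial_parameters.items()):
        if ref not in search_space_names:
            problems.append(f"trial parameter '{name}' references unknown '{ref}'")
    return problems


# Trimmed copy of the command from the trial spec above.
command = [
    "--lr=${trialParameters.learningRate}",
    "--nlayers=${trialParameters.numberLayers}",
    "--activation=${trialParameters.activation}",
]
spec = {"spec": {"template": {"spec": {"containers": [{"command": command}]}}}}
refs = {"learningRate": "lr", "numberLayers": "num-layers", "activation": "activation"}

print(check_trial_template(spec, refs, {"lr", "num-layers", "activation"}))
```

An empty list means every placeholder, trial parameter and search-space name lines up.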

In [23]:
print(experiment.to_str())

{'api_version': 'kubeflow.org/v1beta1',
 'kind': 'Experiment',
 'metadata': {'annotations': None,
              'cluster_name': None,
              'creation_timestamp': None,
              'deletion_grace_period_seconds': None,
              'deletion_timestamp': None,
              'finalizers': None,
              'generate_name': None,
              'generation': None,
              'labels': None,
              'managed_fields': None,
              'name': 'autogl-cora-suggestion',
              'namespace': 'kubeflow-user-example-com',
              'owner_references': None,
              'resource_version': None,
              'self_link': None,
              'uid': None},
 'spec': {'algorithm': {'algorithm_name': 'cmaes', 'algorithm_settings': None},
          'early_stopping': None,
          'max_failed_trial_count': 3,
          'max_trial_count': 20,
          'metrics_collector_spec': None,
          'nas_config': None,
          'objective': {'additional_metric_names': ['

## Create your Experiment

You have to create a Katib client to use the SDK.

In [24]:
# Create client.
kclient = KatibClient()

# Create your Experiment.
kclient.create_experiment(experiment, namespace=namespace)

{'apiVersion': 'kubeflow.org/v1beta1',
 'kind': 'Experiment',
 'metadata': {'creationTimestamp': '2022-04-06T16:17:07Z',
  'generation': 1,
  'managedFields': [{'apiVersion': 'kubeflow.org/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:spec': {'.': {},
      'f:algorithm': {'.': {}, 'f:algorithmName': {}},
      'f:maxFailedTrialCount': {},
      'f:maxTrialCount': {},
      'f:objective': {'.': {},
       'f:additionalMetricNames': {},
       'f:goal': {},
       'f:objectiveMetricName': {},
       'f:type': {}},
      'f:parallelTrialCount': {},
      'f:parameters': {},
      'f:trialTemplate': {'.': {},
       'f:primaryContainerName': {},
       'f:trialParameters': {},
       'f:trialSpec': {'.': {},
        'f:apiVersion': {},
        'f:kind': {},
        'f:spec': {'.': {},
         'f:template': {'.': {},
          'f:metadata': {'.': {},
           'f:annotations': {'.': {}, 'f:sidecar.istio.io/inject': {}}},
          'f:spec': {'.': {}, 'f:containers': {}, 'f:

Create other Experiments.
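
One way to derive Experiments with different resume policies is to deep-copy a base Experiment and change only its name and resume policy. The sketch below does this on a plain dict standing in for the Experiment manifest, so it runs without a cluster. The helper `with_resume_policy` and the Experiment names are hypothetical; the policy values `Never`, `LongRunning` and `FromVolume` are the ones described in the resume-experiment documentation linked at the top.

```python
import copy

RESUME_POLICIES = ("Never", "LongRunning", "FromVolume")


def with_resume_policy(base, name, policy):
    """Return a renamed deep copy of `base` with spec.resumePolicy set."""
    if policy not in RESUME_POLICIES:
        raise ValueError(f"unknown resume policy: {policy!r}")
    variant = copy.deepcopy(base)  # leave the base manifest untouched
    variant["metadata"]["name"] = name
    variant["spec"]["resumePolicy"] = policy
    return variant


# Heavily trimmed stand-in for the Experiment defined above.
base = {
    "apiVersion": "kubeflow.org/v1beta1",
    "kind": "Experiment",
    "metadata": {"name": "base-experiment", "namespace": "kubeflow-user-example-com"},
    "spec": {"maxTrialCount": 20, "algorithm": {"algorithmName": "cmaes"}},
}

# Hypothetical names, chosen only for illustration.
never = with_resume_policy(base, "example-never-resume", "Never")
from_volume = with_resume_policy(base, "example-from-volume-resume", "FromVolume")

print(never["metadata"]["name"], never["spec"]["resumePolicy"])
print(from_volume["metadata"]["name"], from_volume["spec"]["resumePolicy"])
```

With SDK objects, the same idea should apply by deep-copying `experiment` and setting the name and resume policy on the copy before calling `create_experiment`.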

## Get your Experiment

You can get your Experiment by name and read the fields you need from the returned dict.

In [40]:
exp = kclient.get_experiment(name=experiment_name, namespace=namespace)
print(exp)
# For a more readable view, pretty-print the same dict:
# import json; print(json.dumps(exp, indent=3))
print("-----------------\n")

# Get the max trial count and latest status.
print(exp["spec"]["maxTrialCount"])
print(exp["status"]["conditions"][-1])

{'apiVersion': 'kubeflow.org/v1beta1', 'kind': 'Experiment', 'metadata': {'creationTimestamp': '2022-03-18T10:18:34Z', 'finalizers': ['update-prometheus-metrics'], 'generation': 1, 'managedFields': [{'apiVersion': 'kubeflow.org/v1beta1', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:spec': {'.': {}, 'f:algorithm': {'.': {}, 'f:algorithmName': {}}, 'f:maxFailedTrialCount': {}, 'f:maxTrialCount': {}, 'f:objective': {'.': {}, 'f:additionalMetricNames': {}, 'f:goal': {}, 'f:objectiveMetricName': {}, 'f:type': {}}, 'f:parallelTrialCount': {}, 'f:parameters': {}, 'f:trialTemplate': {'.': {}, 'f:primaryContainerName': {}, 'f:trialParameters': {}, 'f:trialSpec': {'.': {}, 'f:apiVersion': {}, 'f:kind': {}, 'f:spec': {'.': {}, 'f:template': {'.': {}, 'f:metadata': {'.': {}, 'f:annotations': {'.': {}, 'f:sidecar.istio.io/inject': {}}}, 'f:spec': {'.': {}, 'f:containers': {}, 'f:restartPolicy': {}}}}}}}}, 'manager': 'OpenAPI-Generator', 'operation': 'Update', 'time': '2022-03-18T10:18:34Z'}, {'apiVers
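
Indexing `exp["status"]["conditions"][-1]` directly may raise a `KeyError` or `IndexError` if the Experiment has no status conditions yet, for example right after creation. A small defensive accessor avoids that. This is a sketch: `latest_condition` is a hypothetical helper, and the sample dicts are trimmed stand-ins for the real response.

```python
def latest_condition(resource):
    """Return the most recent status condition of a resource dict, or None."""
    conditions = (resource.get("status") or {}).get("conditions") or []
    return conditions[-1] if conditions else None


# Hypothetical, heavily trimmed responses used only to exercise the helper.
sample = {
    "spec": {"maxTrialCount": 20},
    "status": {
        "conditions": [
            {"type": "Created", "status": "True"},
            {"type": "Running", "status": "True"},
        ]
    },
}
fresh = {"spec": {"maxTrialCount": 20}}  # no status reported yet

print(latest_condition(sample)["type"])
print(latest_condition(fresh))
```

The same helper can be applied to each entry of `exp_list["items"]` or to a Suggestion dict, since both expose `status.conditions` in the cells of this notebook.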

## Get all Experiments

You can get a list of the current Experiments.

In [34]:
# Get the names of all Experiments in the namespace.
exp_list = kclient.get_experiment(namespace=namespace)

for exp in exp_list["items"]:
    print(exp["metadata"]["name"])

autogl-02
autogl-08
autogl-cora
cmaes-example
cmaes-example-14
cmaes-example-20
cmaes-example-200
cmaes-example-test


## Get the current Experiment status

You can check the current Experiment status.

In [35]:
kclient.get_experiment_status(name=experiment_name, namespace=namespace)

'Failed'

You can check whether your Experiment has succeeded.

In [20]:
kclient.is_experiment_succeeded(name=experiment_name, namespace=namespace)

False
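
Because the status changes while Trials run, it is often convenient to poll until the Experiment reaches a final state. The loop below is a minimal sketch under stated assumptions: `wait_for_experiment` is a hypothetical helper, the status getter is passed in as a callable so the example runs without a cluster, and `Succeeded` and `Failed` are assumed to be the only final statuses.

```python
import time

# Assumed final statuses; extend this set if your Katib version reports others.
TERMINAL_STATUSES = {"Succeeded", "Failed"}


def wait_for_experiment(get_status, timeout_s=600.0, interval_s=10.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"experiment still '{status}' after {timeout_s} s")
        sleep(interval_s)


# Stand-in for the cluster: a fixed sequence of statuses and a no-op sleep.
statuses = iter(["Created", "Running", "Running", "Succeeded"])
result = wait_for_experiment(lambda: next(statuses), sleep=lambda seconds: None)
print(result)
```

Against a real cluster, the getter could be `lambda: kclient.get_experiment_status(name=experiment_name, namespace=namespace)`, keeping the default `sleep` and `clock`.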

## List of the current Trials

You can get a list of the current Trials with their latest status.

In [36]:
# Trial list.
kclient.list_trials(name=experiment_name, namespace=namespace)

[]

## Get the optimal HyperParameters

You can get the current optimal Trial from your Experiment. For each metric you can see the max, min and latest value.

In [13]:
# Optimal HPs.
kclient.get_optimal_hyperparameters(name=experiment_name, namespace=namespace)

{'currentOptimalTrial': {'bestTrialName': 'cmaes-example-8crn89vg',
  'observation': {'metrics': [{'latest': '0.991588',
     'max': '0.991588',
     'min': '0.925057',
     'name': 'Train-accuracy'},
    {'latest': '0.978205',
     'max': '0.980096',
     'min': '0.954817',
     'name': 'Validation-accuracy'}]},
  'parameterAssignments': [{'name': 'lr', 'value': '0.04511033252270099'},
   {'name': 'num-layers', 'value': '3'},
   {'name': 'optimizer', 'value': 'sgd'}]}}
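
All values in this response are strings, so they usually need converting before being reused, for example to retrain with the best configuration. The sketch below flattens the structure shown in the output above. `parse_optimal` and `coerce` are hypothetical helpers, and the metric values are assumed to be numeric strings.

```python
def coerce(value):
    """Convert a string to int or float when possible, else return it unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def parse_optimal(result):
    """Return (best trial name, parameters dict, metrics dict) from the response."""
    trial = result["currentOptimalTrial"]
    params = {p["name"]: coerce(p["value"]) for p in trial["parameterAssignments"]}
    metrics = {
        m["name"]: {key: float(m[key]) for key in ("latest", "max", "min")}
        for m in trial["observation"]["metrics"]
    }
    return trial["bestTrialName"], params, metrics


# Sample copied from the output above.
result = {
    "currentOptimalTrial": {
        "bestTrialName": "cmaes-example-8crn89vg",
        "observation": {
            "metrics": [
                {"latest": "0.991588", "max": "0.991588", "min": "0.925057",
                 "name": "Train-accuracy"},
                {"latest": "0.978205", "max": "0.980096", "min": "0.954817",
                 "name": "Validation-accuracy"},
            ]
        },
        "parameterAssignments": [
            {"name": "lr", "value": "0.04511033252270099"},
            {"name": "num-layers", "value": "3"},
            {"name": "optimizer", "value": "sgd"},
        ],
    }
}

best_name, best_params, best_metrics = parse_optimal(result)
print(best_name)
print(best_params)
print(best_metrics["Validation-accuracy"]["max"])
```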

## Status for the Suggestion objects

You can check the Suggestion object status for more information about the resume status.

For an Experiment with the FromVolume resume policy, you should be able to see the created PVC.

In [14]:
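# Note: experiment_never_resume_name and experiment_from_volume_resume_name are not
# defined earlier in this notebook; set them to the names of your Experiments first.

# Get the current Suggestion status for the never resume Experiment.
suggestion = kclient.get_suggestion(name=experiment_never_resume_name, namespace=namespace)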

print(suggestion["status"]["conditions"][-1]["message"])
print("-----------------")

# Get the current Suggestion status for the from volume Experiment.
suggestion = kclient.get_suggestion(name=experiment_from_volume_resume_name, namespace=namespace)

print(suggestion["status"]["conditions"][-1]["message"])

Suggestion is succeeded, can't be restarted
-----------------
Suggestion is succeeded, suggestion volume is not deleted, can be restarted


## Delete your Experiments

You can delete your Experiments.

In [38]:
experiment_name, namespace

('autogl-cora', 'kubeflow-user-example-com')

In [39]:
kclient.delete_experiment(name=experiment_name, namespace=namespace)
#kclient.delete_experiment(name=experiment_never_resume_name, namespace=namespace)
#kclient.delete_experiment(name=experiment_from_volume_resume_name, namespace=namespace)

{'apiVersion': 'kubeflow.org/v1beta1',
 'kind': 'Experiment',
 'metadata': {'creationTimestamp': '2022-03-29T11:05:45Z',
  'deletionGracePeriodSeconds': 0,
  'deletionTimestamp': '2022-03-29T11:29:52Z',
  'finalizers': ['update-prometheus-metrics'],
  'generation': 2,
  'managedFields': [{'apiVersion': 'kubeflow.org/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:spec': {'.': {},
      'f:algorithm': {'.': {}, 'f:algorithmName': {}},
      'f:maxFailedTrialCount': {},
      'f:maxTrialCount': {},
      'f:objective': {'.': {},
       'f:additionalMetricNames': {},
       'f:goal': {},
       'f:objectiveMetricName': {},
       'f:type': {}},
      'f:parallelTrialCount': {},
      'f:parameters': {},
      'f:trialTemplate': {'.': {},
       'f:primaryContainerName': {},
       'f:trialParameters': {},
       'f:trialSpec': {'.': {},
        'f:apiVersion': {},
        'f:kind': {},
        'f:spec': {'.': {},
         'f:template': {'.': {},
          'f:metadata': {'.': {