# Hyperparameter Tuning with Katib
This notebook demonstrates hyperparameter tuning with Katib on TigerGraph's [ML Workbench Cloud](https://ml.tgcloud.io). It uses the Cora dataset and the GraphSAGE model from previous tutorials, so we recommend going through those tutorials first. Here we show how to optimize the hyperparameters of the GraphSAGE model automatically.

## Data <a name="data_processing"></a>

Here we assume the Cora dataset has already been ingested into a TigerGraph database. If not, please refer to the tutorial on data ingestion first.

## Training Script

The next step is to prepare the training script. This script should contain the whole training process, from model definition to the actual training loop. The hyperparameters to tune should be exposed as input arguments of the script, and the script should log the model's performance metrics so that Katib can read them and compare trials.

An example training script `train.py` is included in this folder.
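To make the two requirements above concrete, here is a minimal sketch of the script's outer shell. It is not the `train.py` shipped in this folder. The argument names mirror the flags used in the trial specification later in this notebook, the default values are illustrative assumptions, and `dummy_epoch` is a hypothetical placeholder for the real GraphSAGE training step. The metric lines assume Katib's default StdOut metrics collector, which picks up lines of the form `name=value`.

```python
import argparse


def parse_args(argv=None):
    """Expose every hyperparameter as a command-line argument."""
    parser = argparse.ArgumentParser(description="Sketch of a Katib-friendly training script")
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--embed", type=int, default=64)
    parser.add_argument("--layers", type=int, default=2)
    # Defaults below are illustrative assumptions, not values from train.py.
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--dropout", type=float, default=0.5)
    parser.add_argument("--l2", type=float, default=0.0005)
    return parser.parse_args(argv)


def format_metrics(**metrics):
    """Render metrics as one `name=value` line each (assumed StdOut collector format)."""
    return "\n".join(f"{name}={value:.4f}" for name, value in metrics.items())


def dummy_epoch(args, epoch):
    """Hypothetical placeholder: a real script would train and evaluate the model here."""
    return 0.0, 0.0  # (train accuracy, validation accuracy), not real results


def main(argv=None, run_epoch=dummy_epoch):
    args = parse_args(argv)
    for epoch in range(args.epochs):
        train_acc, valid_acc = run_epoch(args, epoch)
        # Metric names must match the objective and additional metric names
        # declared in the experiment's objective specification.
        print(format_metrics(train_accuracy=train_acc, valid_accuracy=valid_acc))

# In a real script, call main() under an `if __name__ == "__main__":` guard.
```

Calling `main(["--lr=0.05", "--dropout=0.3", "--l2=0.001"])` would then print one `train_accuracy=...` and one `valid_accuracy=...` line per epoch.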

## Docker Image

To run the training script with Katib, we need to containerize it by building a Docker image. An example `Dockerfile` is included in this folder.

To build the image, run the command below in your terminal. You will need Docker installed.

`docker build -t katib-example:latest .` 

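Then tag the image and push it to a repository that you have access to, replacing `tigergraphml` in the commands below with your own repository. Use the same image name in the trial specification later in this notebook.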

`docker tag katib-example:latest tigergraphml/katib-example:latest`

`docker push tigergraphml/katib-example:latest`


## Experiment

Finally, we start the tuning process. First, we are going to generate the specs for this experiment. Then we call Katib's API to create and launch the experiment. 

The code below has to be run on the [ML Workbench Cloud](https://ml.tgcloud.io).

In [2]:
!pip install kubeflow-katib

Collecting kubeflow-katib
  Downloading kubeflow_katib-0.13.0-py3-none-any.whl (89 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.3/89.3 kB 12.2 MB/s eta 0:00:00
Installing collected packages: kubeflow-katib
Successfully installed kubeflow-katib-0.13.0


In [1]:
import copy

from kubeflow.katib import KatibClient
from kubernetes.client import V1ObjectMeta
from kubeflow.katib import V1beta1Experiment
from kubeflow.katib import V1beta1AlgorithmSpec
from kubeflow.katib import V1beta1ObjectiveSpec
from kubeflow.katib import V1beta1FeasibleSpace
from kubeflow.katib import V1beta1ExperimentSpec
from kubeflow.katib import V1beta1ParameterSpec
from kubeflow.katib import V1beta1TrialTemplate
from kubeflow.katib import V1beta1TrialParameterSpec

Please read the comments in the code below carefully and make changes to the code as you see fit.

In [18]:
# Experiment name and namespace.
namespace = "bill-shi" # Change to your namespace
experiment_name = "katib-example"

metadata = V1ObjectMeta(
    name=experiment_name,
    namespace=namespace
)

# Algorithm specification.
algorithm_spec=V1beta1AlgorithmSpec(
    algorithm_name="cmaes" # Change to the algorithm you want
)

# Objective specification.
objective_spec=V1beta1ObjectiveSpec(
    type="maximize",
    goal=0.99,
    objective_metric_name="valid_accuracy",
    additional_metric_names=["train_accuracy"]
)

# Experiment search space. In this example we tune the learning rate, dropout and L2 regularization.
parameters=[
    V1beta1ParameterSpec(
        name="lr",
        parameter_type="double",
        feasible_space=V1beta1FeasibleSpace(
            min="0.001",
            max="0.1"
        ),
    ),
    V1beta1ParameterSpec(
        name="dropout",
        parameter_type="double",
        feasible_space=V1beta1FeasibleSpace(
            min="0",
            max="0.6"
        ),
    ),
    V1beta1ParameterSpec(
        name="l2",
        parameter_type="double",
        feasible_space=V1beta1FeasibleSpace(
            min="0.0001",
            max="0.1"
        ),
    ),
]



# JSON template specification for the Trial's Worker Kubernetes Job.
trial_spec={
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "metadata": {
                "annotations": {
                    "sidecar.istio.io/inject": "false"
                }
            },
            "spec": {
                "containers": [
                    {
                        "name": "training-container",
                        "image": "docker.io/tigergraphml/katib-example:latest",
                        "command": [
                            "python3",
                            "/opt/mlwb/train.py",
                            "--epochs=20",
                            "--embed=64",
                            "--layers=2",
                            "--lr=${trialParameters.learningRate}",
                            "--dropout=${trialParameters.dropout}",
                            "--l2=${trialParameters.l2}"
                        ]
                    }
                ],
                "restartPolicy": "Never"
            }
        }
    }
}

# Configure parameters for the Trial template.
trial_template=V1beta1TrialTemplate(
    primary_container_name="training-container",
    trial_parameters=[
        V1beta1TrialParameterSpec(
            name="learningRate",
            description="Learning rate for the training model",
            reference="lr"
        ),
        V1beta1TrialParameterSpec(
            name="dropout",
            description="Dropout",
            reference="dropout"
        ),
        V1beta1TrialParameterSpec(
            name="l2",
            description="Weight of L2 regularization",
            reference="l2"
        )
    ],
    trial_spec=trial_spec
)


# Experiment object.
experiment = V1beta1Experiment(
    api_version="kubeflow.org/v1beta1",
    kind="Experiment",
    metadata=metadata,
    spec=V1beta1ExperimentSpec(
        max_trial_count=10,
        parallel_trial_count=2,
        max_failed_trial_count=3,
        algorithm=algorithm_spec,
        objective=objective_spec,
        parameters=parameters,
        trial_template=trial_template,
    )
)
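The specification above ties three sets of names together: the `${trialParameters.<name>}` placeholders in the container command, the `name` of each trial parameter, and the `reference` that points back to a search-space parameter. A mismatch among them is easy to introduce when editing the cell. The following is a minimal sketch of a consistency check, written against plain Python containers so it runs without a cluster. The helper `check_trial_template` is hypothetical and not part of the Katib SDK, and the sample values simply repeat the ones used above.

```python
import re

# Matches placeholders such as ${trialParameters.learningRate}.
PLACEHOLDER = re.compile(r"\$\{trialParameters\.([A-Za-z0-9_\-]+)\}")


def check_trial_template(command, trial_parameters, search_space):
    """Return a list of problems; an empty list means the names are consistent.

    command: list of strings forming the container command.
    trial_parameters: dict mapping trial parameter name -> referenced search-space name.
    search_space: collection of search-space parameter names.
    """
    problems = []
    used = {name for arg in command for name in PLACEHOLDER.findall(arg)}
    for name in sorted(used - set(trial_parameters)):
        problems.append(f"placeholder '{name}' has no trial parameter")
    for name in sorted(set(trial_parameters) - used):
        problems.append(f"trial parameter '{name}' is never used in the command")
    for name, ref in sorted(trial_parameters.items()):
        if ref not in search_space:
            problems.append(f"trial parameter '{name}' references unknown parameter '{ref}'")
    return problems


# Sample values copied from the specification in this notebook.
command = [
    "python3", "/opt/mlwb/train.py", "--epochs=20", "--embed=64", "--layers=2",
    "--lr=${trialParameters.learningRate}",
    "--dropout=${trialParameters.dropout}",
    "--l2=${trialParameters.l2}",
]
trial_parameters = {"learningRate": "lr", "dropout": "dropout", "l2": "l2"}
search_space = {"lr", "dropout", "l2"}

problems = check_trial_template(command, trial_parameters, search_space)
```

With the objects defined in the cell above, the inputs could be derived as `trial_spec["spec"]["template"]["spec"]["containers"][0]["command"]`, `{p.name: p.reference for p in trial_template.trial_parameters}` and `{p.name for p in parameters}`, assuming the SDK model classes expose those attributes under the same names as their constructor arguments.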

In [19]:
# Create client.
kclient = KatibClient()

# Create your Experiment.
kclient.create_experiment(experiment, namespace=namespace)

{'apiVersion': 'kubeflow.org/v1beta1',
 'kind': 'Experiment',
 'metadata': {'creationTimestamp': '2022-08-11T18:13:03Z',
  'generation': 1,
  'managedFields': [{'apiVersion': 'kubeflow.org/v1beta1',
    'fieldsType': 'FieldsV1',
    'fieldsV1': {'f:spec': {'.': {},
      'f:algorithm': {'.': {}, 'f:algorithmName': {}},
      'f:maxFailedTrialCount': {},
      'f:maxTrialCount': {},
      'f:objective': {'.': {},
       'f:additionalMetricNames': {},
       'f:goal': {},
       'f:objectiveMetricName': {},
       'f:type': {}},
      'f:parallelTrialCount': {},
      'f:parameters': {},
      'f:trialTemplate': {'.': {},
       'f:primaryContainerName': {},
       'f:trialParameters': {},
       'f:trialSpec': {'.': {},
        'f:apiVersion': {},
        'f:kind': {},
        'f:spec': {'.': {},
         'f:template': {'.': {},
          'f:metadata': {'.': {},
           'f:annotations': {'.': {}, 'f:sidecar.istio.io/inject': {}}},
          'f:spec': {'.': {}, 'f:containers': {}, 'f: