
Optimization

Although parameter sweeps are helpful, they are far from efficient for optimizing over a set of hyperparameters. To address this, we introduce an optimization mode that runs the workflow, samples new sets of hyperparameters from the results, and continues iterating.

Obtaining cost / score

Since cost and objective score per step are critical for workflow optimization, we provide a method to obtain the cost in terms of time and allow the user to specify how the score is obtained. Both values are stored in DataJoint, which is launched in a Docker container with the docker-compose command. The score specification takes place in the workflow definition. An example, taken from the xbrain workflow, can be seen below.

membrane_classify:
    run: ../../../../saber/xbrain/tools/membrane_classify_nos3.cwl
    in:
        input: boss_pull/pull_output
        output_name: membrane_classify_output_name
        classifier: classifier
        ram_amount: ram_amount
        num_threads: num_threads
    out: [membrane_probability_map]
    hints:
        saber:
            score_format: '{} Average OOB: {score}'

If you are familiar with Python, you might recognize the score format as a format string. This is the syntax used by the Python parse package. The format is matched against the logs of the tool it is placed under. In this example, the score format would match the following line in the membrane_classify stdout stream:

INFO lazyflow.classifiers.parallelVigraRfLazyflowClassifier: Training complete. Average OOB: 0.005116627377897277

returning the score 0.005116627377897277. Notice that {} functions as a wildcard, allowing you to ignore parts of the line that don't matter (such as INFO and the name of the class emitting it). {score} is required, as that is the identifier SABER uses to extract the score. The score value must be castable to a float.
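As a rough illustration, this is how the parse package matches such a format string against a log line (a standalone sketch; the exact extraction code inside SABER may differ):

import parse

line = ('INFO lazyflow.classifiers.parallelVigraRfLazyflowClassifier: '
        'Training complete. Average OOB: 0.005116627377897277')
result = parse.parse('{} Average OOB: {score}', line)
if result is not None:
    score = float(result['score'])  # 0.005116627377897277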

Optimizing over a set of parameters

The command format for optimization is

conduit optimize [-h] [--output OUTPUT] [--retry] [--continue_on_failure]
                 cwl job parameterize

This uses a parameterize file similar to the one used for parameter sweeps. The difference here is that all input parameters are saved into an output CSV.

Optional parameters:

--output: (default=optiout.csv) where to store final CSV file
--retry: (default=False) flag to retry failed tasks once
--continue_on_failure: (default=False) flag to continue DAG runs even when a task fails
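
For example, a hypothetical invocation (the workflow, job, and parameterize file names below are placeholders):

conduit optimize workflow.cwl job.yml parameterize.yml --output results.csv --retry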

Specifying a sampling method in your workflow

Parameter sampling strategies can be specified in the parameterize YAML file. The optimize-specific sampler block and its method key denote which sampling strategy to use. Additional parameters required by a particular sampler can be added as key-value pairs under the same block. Below is an example of this format in a parameterize file:

# Sampling Method and Parameters 
sampler:
  method: batch-grid
  batch_size: 10

# Workflow parameters to search over
x:
  range:
      start: 5000
      stop: 6000
      step: 500
  parameters:
      min: xmin
      max: xmax
  steps:
      - boss_pull

Currently, we have added the following sampling strategies:

Sampling Strategy    method Name    Required Parameters
Random               random         max_iterations
Grid Search          grid           None
Batch Grid Search    batch-grid     batch_size
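
For instance, following the same pattern as the batch-grid example above, a sampler block for the random strategy would look something like this (the max_iterations value of 50 is only illustrative):

sampler:
  method: random
  max_iterations: 50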

Writing a sampler class

The abstract sampling class can be found in conduit/utils/parameterization.py. Essentially, Samplers are classes that implement an update method and a sample method.

The update method takes results and updates the sampler's internal storage with the next set of jobs to run based on those results. Results are provided from DataJoint as a dictionary whose keys are parameter names / score identifiers and whose values are the corresponding parameter/score values.

The sample method yields a set of parameters. It is designed to be iterated over until it yields nothing more, and then the Sampler is updated with new results.
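
Below is a minimal standalone sketch of what such a sampler might look like. It does not subclass the actual abstract class in conduit/utils/parameterization.py, and the class name, constructor arguments, and result-dictionary layout are assumptions for illustration only:

import random

class RandomWindowSampler:
    """Illustrative sampler: proposes random values for a single parameter
    and remembers the best-scoring parameter set seen so far."""

    def __init__(self, low, high, batch_size=5):
        self.low = low
        self.high = high
        self.batch_size = batch_size
        self.best = None      # best (params, score) observed so far
        self.pending = []     # next set of jobs to run

    def update(self, results):
        # results: dict whose keys are parameter names / score identifiers,
        # e.g. {'x': 5500, 'score': 0.0051}
        score = float(results['score'])
        params = {k: v for k, v in results.items() if k != 'score'}
        if self.best is None or score < self.best[1]:
            self.best = (params, score)
        # Queue the next batch of candidate parameter sets.
        self.pending = [{'x': random.uniform(self.low, self.high)}
                        for _ in range(self.batch_size)]

    def sample(self):
        # Yield queued parameter sets; the caller iterates until this
        # generator is exhausted, runs the jobs, then calls update().
        while self.pending:
            yield self.pending.pop()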