
Json Parameters File

arturluis edited this page Feb 9, 2022 · 11 revisions

Before running HyperMapper, first set up a JSON configuration file.
The fields of this JSON file are defined here.

How to set up a JSON input parameters file

Follow this example.

Syntax

  • String parameters: [string][default value].
  • Integer parameters: [integer][min value][max value][default value].
  • Optional parameters: [option1|option2|option3][default option].
  • Array parameters: [array][type][default array].
  • Object parameters: [object].

All JSON files start with an opening curly bracket "{" and end with a closing curly bracket "}":

{
json fields here
}

Mandatory parameters:

  • application_name: [string]["application"].
    Name of the application, used for printing and output file naming purposes. Example:
    "application_name":"branin"
  • optimization_objectives: [array][string][].
    List defining the names and number of the objectives HyperMapper will optimize. HyperMapper automatically infers whether the application is a mono- or multi-objective optimization problem. Example:
    "optimization_objectives": ["Value", "Energy"]
  • input_parameters: [object][].
    The input search space object defining the search parameters.
    • parameter name: The name of the parameter being declared.
      • parameter_type: [string][].
        The parameter type. HyperMapper provides four types of parameters: real, integer, ordinal and categorical.
      • values: [array][integer|float|string][].
        The values the parameter can take: a [min, max] range for real and integer parameters, or an explicit list of values for ordinal and categorical parameters.
      • parameter_default: [integer|float|string][].
        The default value of the parameter.
      • prior: [array|string][].
        The prior distribution over the parameter's values (e.g. "gaussian").
        Example:
    "input_parameters" : {
          "x1": {
              "parameter_type" : "real",
              "values" : [-5, 10],
              "parameter_default" : 0,
              "prior": "gaussian"
          }
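Putting the mandatory fields together, a minimal single-objective configuration might look like the sketch below (the branin name and the parameter bounds are illustrative):

```json
{
    "application_name": "branin",
    "optimization_objectives": ["Value"],
    "input_parameters": {
        "x1": {
            "parameter_type": "real",
            "values": [-5, 10],
            "parameter_default": 0
        },
        "x2": {
            "parameter_type": "real",
            "values": [0, 15]
        }
    }
}
```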

Optional parameters:

  • hypermapper_mode: [object].
    The mode in which HyperMapper runs: exhaustive, client-server, or default.

    • mode: ["exhaustive"|"client-server"|"default"]["default"]
      • "mode": "exhaustive", required: ["exhaustive_search_file"]
        • exhaustive_search_file: [string] [].
          File containing the exhaustive search; this is usually not available because the space is often too big to search exhaustively. Example:
          "exhaustive_search_file": "exhaustive_search_file.csv"
        • real_pareto_file: [string][].
          File containing the real Pareto points. Example:
          "real_pareto_file": "real_pareto_file.csv"
      • "mode": "client-server".
      • "mode": "default".

    Example 1:

    "hypermapper_mode": { "mode": "default" }

    Example 2:

    "hypermapper_mode": {
            "mode": "exhaustive",
            "exhaustive_search_file": "example_scenarios/spatial/BlackScholes_exhaustive_search_data.csv",
            "real_pareto_file": "./example_scenarios/spatial/BlackScholes_pareto_real.csv"
    }
  • log_file: [string][hypermapper_logfile.log].
    An optional log file. This is very handy in client-server mode, where prints cannot be displayed; instead, they are written to the log. The location is relative to run_directory.

    "log_file": "log_BlackScholes.log"
  • max_number_of_predictions: [integer][10000][][1000000].
    A number greater than 10k. Max number of predictions that the HyperMapper internal model can perform. We set a max number to limit the execution time of HyperMapper. Usually, a bigger number gives better accuracy but slower results.

    "max_number_of_predictions": 1000000
  • optimization_iterations: [integer][1][5000][5]. Usually a number between 1 and 10. Max number of optimization iterations that HyperMapper can internally perform. We set a max number to limit the execution time of HyperMapper. Usually, a bigger number gives better accuracy but slower results.

    "optimization_iterations": 5
  • time_budget: [float][-1][][-1]. Max number of minutes that HyperMapper is allowed to run for. If -1, the runtime will not be limited.

    "time_budget": 60
  • number_of_repetitions: [integer][1][][1] Usually a number between 1 and 5. The number of times HyperMapper runs a single sample of the search space. For statistical significance, it may be useful to run a sample several times. The mean or the median of the multiple runs is taken by HyperMapper. Execution time is negatively affected by a high number of repetitions. This feature is not yet implemented.

    "number_of_repetitions": 3
  • models: [object]. HyperMapper supports Random Forest and Gaussian Process models.

    • model: [string]["random_forest"|"gaussian_process"]["random_forest"].
      • number_of_trees: [integer][1][1000][10]. Number of trees in the Random Forest model.
      • max_features: [float][0][1][0.5]. Percentage of the features to be used when fitting the Random Forest model.
      • bagging: [boolean][false]. Whether to use bagging when fitting the Random Forest model.
      • min_samples_split: [integer][2][][5]. Minimum number of samples required to split a node when fitting the Random Forest model.
    "models": {
            "model": "random_forest",
            "number_of_trees": 20
    }
  • output_image: [object].
    Info used by the plot script to plot the results of the HyperMapper search.

    • output_image_pdf_file: [string]["output_image.pdf"].
      Output image containing the Pareto and the exploration of HyperMapper.
    • optimization_objectives_labels_image_pdf: [array][string]["x", "y"].
      The labels of the objectives HyperMapper will optimize. These are used in the plot script.
    • image_xlog: [boolean][false].
      The x-axis of the image will be plotted on a log scale if set to true.
    • image_ylog: [boolean][false].
      The y-axis of the image will be plotted on a log scale if set to true.
    • objective_1_max: [integer].
      If present, this max value enables the plot to show axis 1 as a percentage; the value is used to compute the percentage.
    • objective_2_max: [integer].
      If present, this max value enables the plot to show axis 2 as a percentage; the value is used to compute the percentage.
      Example:
    "output_image": {
            "output_image_pdf_file": "BlackScholes_output_image.pdf",
            "optimization_objectives_labels_image_pdf": ["Logic Utilization (%)", "Cycles (log)"],
            "image_xlog": false,
            "image_ylog": true,
            "objective_1_max": 262400
        }
  • feasible_output: [object]. This is the feasible/non-feasible output flag, i.e. the validity or feasibility bit (true, false) of one sample of the space. This is an output of the code being optimized.

    • name: [string]["Valid"] Name of the validity bit.
      Example:
      "name": "Valid"
    • true_value: []["true"].
      The value that indicates that the sample is valid.
      Example 1:
      "true_value": "true"
      Example 2:
      "true_value": 1
      Example 3:
      "true_value": true
    • false_value: []["false"].
      The value that indicates that the sample is invalid. Example 1:
      "false_value": "false"
      Example 2:
      "false_value": 0
      Example 3:
      "false_value": false
    • enable_feasible_predictor: [][true].
      Enables a classifier (the predictor) that predicts which samples of the space are feasible (i.e. valid). This, in turn, helps focus the search on feasible areas, optimizing the number of samples that are actually run. This field has a negative impact on the speed of HyperMapper but a positive impact on the final result. The feasibility model is used only during optimization; it does not affect the design of experiment phase.
      Example:
      "enable_feasible_predictor": true
    • enable_feasible_predictor_grid_search_on_recall_and_precision: [][false].
      Enables a grid search cross-validation on the classifier (the predictor). This is useful for development purposes to check whether the classifier is classifying the samples correctly. An external dataset has to be provided (in the JSON field feasible_predictor_grid_search_validation_file) to run the cross-validation.
      Example:
      "enable_feasible_predictor_grid_search_on_recall_and_precision": false
    • feasible_predictor_grid_search_validation_file: []["/home/lnardi/spatial-lang/results/apps_classification_test_set/BlackScholes.csv"]
      Provides the cross-validation dataset for the enable_feasible_predictor_grid_search_on_recall_and_precision field of the JSON.
      Example:
      "feasible_predictor_grid_search_validation_file": "apps_classification_test_set/BlackScholes.csv"

    Ensemble example:

    "feasible_output": {
          "name": "Valid",
          "true_value": "true",
          "false_value": "false"
      }
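Building on the ensemble example, a feasible_output block that also turns on the feasibility predictor could look like this sketch:

```json
"feasible_output": {
      "name": "Valid",
      "true_value": "true",
      "false_value": "false",
      "enable_feasible_predictor": true
  }
```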
  • timestamp: [string]["Timestamp"].
    Name of the timestamp variable; this is a float representing seconds since the Linux epoch. This is useful to track the progress of new samples over time and for comparison with approaches other than HyperMapper.
    Example:

    "timestamp": "Timestamp",
  • evaluations_per_optimization_iteration: [integer][1][][1].
    Defines a cap on how many runs are performed in one optimization iteration. For now, HyperMapper supports only one run per optimization iteration. Example:

    "evaluations_per_optimization_iteration": 1
  • run_directory: [string][""].
    Relative path from where HyperMapper is launched. The result files will be saved here.
    Example:

    "run_directory": "spatial_data"
  • output_data_file: [string]["output_data_file.csv"].
    The output file containing all the points explored by HyperMapper.
    Example:

    "output_data_file": "BlackScholes_output_data.csv"
  • output_pareto_file: [string]["output_pareto_file.csv"].
    The output file containing the Pareto points explored by HyperMapper.
    Example:

    "output_pareto_file": "BlackScholes_output_pareto.csv"
  • design_of_experiment: [object].
    Before starting the optimization phase, HyperMapper samples the space using one of the following methods: random sampling, standard latin hypercube, k latin hypercube, or grid_search. The type of sampling and the number of samples are declared here. If grid_search is used, the number_of_samples field is ignored and all possible parameter combinations are sampled instead; HyperMapper will also automatically set the number of optimization iterations to 0.

    • doe_type: ["random sampling"|"standard latin hypercube"|"k latin hypercube"|"grid_search"]["random sampling"]. Type of warmup sampling.
      Example:
      "doe_type": "standard latin hypercube"
    • number_of_samples: [integer][10] Number of samples used to warmup the active learning phase.
      Example:
      "number_of_samples": 100

    Ensemble example:

    "design_of_experiment": {
        "doe_type": "random sampling",
        "number_of_samples": 1000
      }
  • acquisition_function: ["UCB"|"TS"|"EI"]["EI"].
    The acquisition function that HyperMapper will use during optimization. Only applicable with the "random_scalarizations" optimization method. Example:

    "acquisition_function": "EI"
  • scalarization_method: ["linear"|"tchebyshev"|"modified_tchebyshev"]["tchebyshev"].
    The scalarizing function that HyperMapper will use during optimization. Linear and modified_tchebyshev are implemented as presented in https://arxiv.org/pdf/1805.12168.pdf, while tchebyshev is implemented as presented in https://www.cs.bham.ac.uk/~jdk/parego/ParEGO-TR3.pdf. Example:

    "scalarization_method": "tchebyshev"
  • weight_sampling: ["flat"|"bounding_box"]["flat"].
    How to sample the weights used in the random scalarizations approach. Flat means weights are sampled from a uniform distribution. Bounding box means weights will be sampled so that HyperMapper prioritizes objective values within the limits specified in 'bounding_box_limits'. Example:

    "weight_sampling": "flat"
  • bounding_box_limits: [array][0, 1].
    An array of integer with the limits of the bounding boxes, either two elements or two elements per objective. Limits should be given in the same order as optimization_objectives. If only two elements are provided, the same bounds will be used for all objectives. This field is ignored if the 'weight_sampling' method is not 'bounding_box'.", Example:

    "bounding_box_limits": [0, 1]
  • optimization_method: ["bayesian_optimization"|"local_search"|"evolutionary_optimization"]["bayesian_optimization"].
    Which method HyperMapper should use for optimization. All methods support multi-objective optimization. Example:

    "optimization_method": "bayesian_optimization"
  • local_search_starting_points: [integer][10].
    Number of starting points for the multi-start local search. Example:

    "local_search_starting_points": 10
  • local_search_random_points: [integer][10000].
    Number of random points sampled for the multi-start local search. Used by the "local_search" and "random_scalarizations" optimizers. HyperMapper will sample this number of points twice, once uniformly and once from the prior, if provided. If no prior is provided, HyperMapper will random sample uniformly twice. Example:

    "local_search_starting_points": 10000
  • local_search_evaluation_limit: [integer][-1][][-1].
    The maximum number of function evaluations the local search can perform. If -1, the number of function evaluations will not be limited. Example:

    "local_search_evaluation_limit": -1
  • local_search_scalarization_weights: [array][float][1].
    Weights to use in the scalarization of the optimization objectives. Must match the number of objectives. The sum of the weights should be 1; if it is not, HyperMapper will normalize them to 1. Example:

    "local_search_evaluation_limit": [1]
  • print_parameter_importance: [boolean][false]. Whether to print parameter importances after each optimization iteration. Can only be used with the random_forest model. Example:

    "print_parameter_importance": true
  • normalize_inputs: [boolean][false]. Whether to normalize inputs before fitting the random forest or gaussian process model. Example:

    "normalize_inputs": false
  • epsilon_greedy_threshold: [float][0][1][0.1]. Hyperparameter for the epsilon greedy component of HyperMapper. If greater than 0, HyperMapper will randomly sample a configuration to evaluate instead of doing Bayesian Optimization with probability equal to the value of the hyperparameter. Example:

    "epsilon_greedy_threshold": 0.1
  • model_posterior_weight: [integer][10]. Weight given to the model versus the prior in HyperMapper's posterior computation. Only applicable to prior_guided_optimization. Example:

    "model_posterior_weight": 10
  • model_good_quantile: [float][0.05]. Defines the quantile of points HyperMapper's model will consider 'good'. Only applicable to prior_guided_optimization. Example:

    "model_good_quantile": 0.05
  • prior_estimation_file: [string][samples.csv]. A CSV file containing a set of points used to estimate a prior via kernel density estimation. Example:

    "prior_estimation_file": "branin_output_samples.csv"
  • prior_estimation_quantile: [float][0.1]. Defines the quantile of points HyperMapper's model will consider 'good' when estimating a KDE prior. Example:

    "prior_estimation_quantile": 0.1
  • estimate_multivariate_priors: [boolean][false]. Whether to estimate a multivariate KDE prior for all input parameters. If true, the individual priors from each parameter will be ignored. Requires the 'prior_estimation_file' field and can only be used with real parameters. Example:

    "estimate_multivariate_priors": true
  • resume_optimization: [boolean][true]. Whether to resume optimization from a previous state instead of starting a new optimization. Requires the resume_optimization_data field. Example:

    "resume_optimization": true
  • resume_optimization_data: [string]["output_samples.csv"]. File with csv data of a previous optimization run to use to resume optimization. Example:

    "resume_optimization_data": "branin_output_samples.csv"
  • bandwidth_parameter: [integer][0]. Parameter used in the bandwidth selection. We use Scott's rule, but replace the hardcoded 4 with this parameter. Example:

    "bandwidth_parameter": 0
  • bandwidth_n_factor: [integer][100]. Parameter used in the bandwidth selection. We use Scott's rule, but multiply n by this factor. Example:

    "bandwidth_n_factor": 100
  • custom_gaussian_prior_means: [array][float][0]. Means for the custom Gaussian prior. Array must have size 1 or match the number of input parameters. If only one element is passed, the same mean will be used for all parameters. Can only be used with real parameters. Example:

    "custom_gaussian_prior_means": [3.1415]
  • custom_gaussian_prior_stds: [array][float][0]. Standard deviations for the custom Gaussian prior. Array must have size 1 or match the number of input parameters. If only one element is passed, the same std will be used for all parameters. If -1, the std will be half of the input parameter's range. Can only be used with real parameters. Example:

    "custom_gaussian_prior_stds": [1]
  • evolution_population_size: [integer][50]. Number of points the Evolutionary Algorithm keeps track of. Only used with evolutionary_optimization. Example:

    "evolution_population_size": 100
  • evolution_generations: [integer][150]. Number of iterations through the evolutionary loop. Only used with evolutionary_optimization. Example:

    "evolution_generations": 200
  • mutation_rate: [integer][1]. Number of parameters to mutate in each generation. Only used with evolutionary_optimization. Example:

    "mutation_rate": 2
  • evolution_crossover: [boolean][false]. Whether to use crossover. Only used with evolutionary_optimization. Example:

    "evolution_crossover": true
  • regularize_evolution: [boolean][false]. Whether to regularize (remove the oldest) the evolution. Only used with evolutionary_optimization. Example:

    "regularize_evolution": true
  • batch_size: [integer][2]. Number of samples to pick for tournament selection. If using crossover, must be at least three. Only used with evolutionary_optimization. Example:

    "batch_size": 3
  • profiling: [boolean][false]. Run a profiling run of HyperMapper, displaying the time allocation among different parts of the application. Example:

    "profiling": true
  • profiling_file: [string]["profiles/profile.csv"]. The name of the profiling output file. Example:

    "profiling_file": "profiling_data.csv"
  • append_profiles: [boolean][false]. For profiling runs, whether to append output of the next profiling run to the same file as the previous one. Example:

    "append_profiles": true
  • noise: [boolean][true]. Whether the function is assumed to be noisy or not. Example:

    "noise": false
  • number_of_cpus: [integer][0][][0]. This is the number of CPUs to use. If 0, HyperMapper decides by querying the system; otherwise, it forces the number of CPUs to this number.

    "number_of_cpus": 4
  • print_best: [boolean|string]["auto"]. Whether to print out the best point found after optimization. By default, it will be printed when running in default mode and not printed when running in client-server mode. The best point is always written to the logfile. Example:

    "print_best": true
  • print_posterior_best: [boolean][false]. Whether to print out the best point according to HyperMapper's posterior model's mean after optimization. The best point is computed using HyperMapper's local search. Example:

    "print_posterior_best": true