Pipeline and hyperparameter structure improvement #204

Open
sarahmish opened this issue Mar 4, 2021 · 0 comments
Labels
enhancement Improvements on the current features

sarahmish commented Mar 4, 2021

Here is a suggestion to improve how pipeline hyperparameters are stored. Currently we have many files named in the style pipeline_name/pipeline_name_dataset.json, each recording the hyperparameter changes a pipeline needs for one dataset when benchmarking. To reduce the number of files in each pipeline folder, we suggest the following:

Under each pipeline folder (pipeline_name/), we can have a benchmark-meta.json file that lists all the datasets and their corresponding hyperparameters. The hyperparameters for a dataset can be given as either:

  • a dictionary mapping each primitive to its hyperparameter changes
  • a path to a file that contains the hyperparameter changes, which is useful when a complex set of hyperparameters needs to be changed.

For example:

{
    "datasets": [
        {
            "name": "artificialwithanomaly",
            "hyperparameters": {
                "mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
                    "interval": 600
                }
            }
        },
        {
            "name": "smap",
            "path": "./orion/pipelines/verified/pipeline_name/pipeline_name_smap.json"
        }
    ]
}
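
To make the proposal concrete, here is a minimal sketch of how such a file could be read. The function name `resolve_hyperparameters` and the `base_dir` argument are hypothetical and not part of the Orion API. The sketch also assumes two behaviours the proposal does not specify: a dataset that is not listed gets no overrides, and a relative "path" is resolved against `base_dir`.

```python
import json
from pathlib import Path


def resolve_hyperparameters(meta, dataset_name, base_dir="."):
    """Return the hyperparameter overrides for one dataset.

    `meta` is the parsed content of benchmark-meta.json.
    Returns an empty dict when the dataset has no overrides (assumption).
    """
    for entry in meta.get("datasets", []):
        if entry.get("name") != dataset_name:
            continue
        # Inline form: the overrides are stored directly in the entry.
        if "hyperparameters" in entry:
            return entry["hyperparameters"]
        # Path form: the overrides live in a separate JSON file.
        if "path" in entry:
            with open(Path(base_dir) / entry["path"]) as f:
                return json.load(f)
        return {}
    return {}


# Inline entry taken from the example above.
example_meta = {
    "datasets": [
        {
            "name": "artificialwithanomaly",
            "hyperparameters": {
                "mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
                    "interval": 600
                }
            },
        }
    ]
}

print(resolve_hyperparameters(example_meta, "artificialwithanomaly"))
```

Both forms return the same kind of dictionary, so the benchmarking code would not need to know which form a dataset entry uses.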

Pipelines and templates
The Orion API does not technically differentiate between pipelines and templates. It can even get confusing if we display them as pipelines but store them only as templates. For the lstm_dt and lstm pipelines (dynamic threshold and fixed threshold, respectively), I suggest having two separate pipeline folders to eliminate the confusion. This would result in:

  • lstm_dt folder
  • lstm folder

sarahmish added the enhancement (Improvements on the current features) label Mar 4, 2021