
Feature Request: Conditional Parameter Constraints #2197

Closed
CCranney opened this issue Feb 13, 2024 · 5 comments

@CCranney

Hi,

I'm using Ax to perform a neural architecture search (NAS). I'm trying to design a search space that could range from simple neural networks to more complicated configurations. This includes potentially removing layers from the possible model (likely by inserting a harmless "passthrough" layer in place of the removed layer).

In doing so, I've realized there may be a problem with having other parameters that depend on now-removed layers. For instance, if I remove layer 1, that layer will have parameters like number of neurons, activation function, etc. that are now moot. I am not intimately familiar with the optimization algorithms used to select a model, but I would assume having "dead" parameters may cause problems (correct me if I am wrong). It may draw false connections between, say, the number of neurons of a defunct layer and the learning rate.

Looking through your documentation, Ax supports three types of parameter constraints, all dealing with floats or ints. It does not allow for straight conditional comparisons. Would it be possible to have a constraint that, for instance, sets the number of neurons in layer 1 to 0 and the related activation function to a null function if layer 1 is set to be removed? Or would you have any recommendations on how to optimize search space design to account for such possibilities?

As an alternative, I'm thinking of designing multiple search spaces to account for these possibilities without accidentally making "dead" parameters. However, it would be nice to be able to account for all of that in a single search space.

@CCranney
Author

As a correction, you would not set the neurons in layer 1 to 0, but rather to the length of the previous layer.

@mpolson64 mpolson64 self-assigned this Feb 13, 2024
@mpolson64
Contributor

@CCranney this is possible using something we call a "HierarchicalSearchSpace" -- we developed this with situations exactly like yours in mind, though we haven't written a tutorial showing off this functionality quite yet. Assuming you're using AxClient you would set up your optimization as follows:

ax_client.create_experiment(
    name="nas_example",
    parameters=[
        {
            "name": "num_layers",
            "type": "choice",
            "values": [1, 2, 3],
            "is_ordered": True,
            "dependents": {
                1: ["num_neurons_1_1"],
                2: ["num_neurons_2_1", "num_neurons_2_2"],
                3: ["num_neurons_3_1", "num_neurons_3_2", "num_neurons_3_3"],
            },
        },
        {
            "name": "num_neurons_1_1",
            "type": "range",
            "bounds": [1, 8],
        },
        {
            "name": "num_neurons_2_1",
            "type": "range",
            "bounds": [1, 8],
        },
        {
            "name": "num_neurons_2_2",
            "type": "range",
            "bounds": [1, 8],
        },
        {
            "name": "num_neurons_3_1",
            "type": "range",
            "bounds": [1, 8],
        },
        {
            "name": "num_neurons_3_2",
            "type": "range",
            "bounds": [1, 8],
        },
        {
            "name": "num_neurons_3_3",
            "type": "range",
            "bounds": [1, 8],
        },
    ],
    objectives={"loss": ObjectiveProperties(minimize=True)},
)

Notice how there is an extra option "dependents" on our choice parameter that maps some value to a list of parameters -- this tells Ax to only generate values for those parameters if that value is chosen. Calling ax_client.get_next_trial() will yield results like {'num_layers': 2, 'num_neurons_2_1': 3, 'num_neurons_2_2': 5} and {'num_layers': 1, 'num_neurons_1_1': 7}.
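As a rough illustration of those semantics in plain Python (not Ax internals; the `dependents` map is copied from the experiment config above), only the parameters reachable from the chosen `num_layers` value end up in a trial's parameterization:

```python
# Illustrative sketch only: given the "dependents" map from the example
# above, keep num_layers plus only the neuron counts its value activates.
dependents = {
    1: ["num_neurons_1_1"],
    2: ["num_neurons_2_1", "num_neurons_2_2"],
    3: ["num_neurons_3_1", "num_neurons_3_2", "num_neurons_3_3"],
}

def active_parameters(flat_params):
    """Filter a flat parameterization down to the active subtree."""
    num_layers = flat_params["num_layers"]
    active = {"num_layers": num_layers}
    for name in dependents[num_layers]:
        active[name] = flat_params[name]
    return active

# A draw over every parameter collapses to the active subtree:
flat = {
    "num_layers": 2,
    "num_neurons_1_1": 7,
    "num_neurons_2_1": 3,
    "num_neurons_2_2": 5,
    "num_neurons_3_1": 4,
    "num_neurons_3_2": 6,
    "num_neurons_3_3": 2,
}
print(active_parameters(flat))
# {'num_layers': 2, 'num_neurons_2_1': 3, 'num_neurons_2_2': 5}
```

This mirrors the shape of the `get_next_trial()` outputs quoted in the comment; Ax itself handles this filtering for you.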

Tree-shaped search spaces like this have been an active area of research for our team, and I'm excited about how we can take advantage of this structure to optimize more efficiently. Currently, by default, we actually just flatten the search space under the hood and use our SAAS model (this works shockingly well even with the "dead" parameters!), but as our research develops we will update Ax to always use SOTA methodology, and our model selection heuristics will opt users into the improved methodology.

I hope this was helpful and don't hesitate to reopen this task if you have any follow-up questions!

@Runyu-Zhang

> (quoting @mpolson64's HierarchicalSearchSpace reply above in full)

Immensely helpful! Testing it now.

@CCranney
Author

CCranney commented Feb 23, 2024

Thank you for your comments! I'm going to try to implement this using the ChoiceParameter class as used in the tutorial I referenced above, which I see also has a dependents option. I'm pretty new to Ax, so I'm not familiar with how to use ax_client in code.

Can I ask what the difference is between ax_client.create_experiment function and the ax.core.Experiment class? It looks like they serve similar functions, but I'm not seeing the distinction. Is there a potential problem with using the ChoiceParameter class instead of what you described that I should be aware of?

@mpolson64
Contributor

@CCranney There is no issue using ChoiceParameter directly -- go ahead and do so if you would prefer.

AxClient and its create_experiment method come from our "Service API" which is an ask-tell interface for using Ax. In this setup we:

  1. Initialize an AxClient and configure our experiment with ax_client.create_experiment
  2. Call ax_client.get_next_trial to generate candidate parameterizations
  3. Evaluate the parameterization however we want outside of Ax (in your case train and eval the NN)
  4. Call ax_client.complete_trial to save data to the experiment
  5. Repeat 2-4

In general we recommend most users use Ax through this API rather than dealing with the Experiment and GenerationStrategy directly because it can be quite a bit simpler, but should someone want/need to use the ax.core abstractions directly they should feel free to do so.
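The five-step loop above can be sketched with a toy stand-in client (purely illustrative: `ToyAskTellClient` is a made-up class that draws random parameterizations, whereas the real AxClient uses Bayesian optimization; the method names echo the Service API described in the comment):

```python
import random

class ToyAskTellClient:
    """Minimal stand-in for the ask-tell pattern AxClient provides.
    Draws random parameterizations; real Ax models the search space."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.trials = {}       # trial_index -> parameterization
        self.results = {}      # trial_index -> evaluation data
        self._next_index = 0

    def get_next_trial(self):
        # Step 2: generate a candidate parameterization. Only the
        # parameters active for the chosen num_layers are emitted,
        # matching the hierarchical example earlier in the thread.
        num_layers = self.rng.choice([1, 2, 3])
        params = {"num_layers": num_layers}
        for i in range(1, num_layers + 1):
            params[f"num_neurons_{num_layers}_{i}"] = self.rng.randint(1, 8)
        index = self._next_index
        self._next_index += 1
        self.trials[index] = params
        return params, index

    def complete_trial(self, trial_index, raw_data):
        # Step 4: save the evaluation data back to the experiment.
        self.results[trial_index] = raw_data

def evaluate(params):
    # Step 3: stand-in for training/evaluating the NN outside Ax.
    return {"loss": sum(v for k, v in params.items() if k != "num_layers")}

client = ToyAskTellClient()
for _ in range(5):             # Step 5: repeat steps 2-4
    params, index = client.get_next_trial()
    client.complete_trial(index, evaluate(params))
```

With the real Service API you would replace the toy client with `AxClient` and the `evaluate` stub with your training run; the loop's shape stays the same.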
