Setting search space step size in Ax Service API #2371

trevor-haas · 2024-04-17T19:22:45Z

Hello Ax team,

First off I would like to extend a big thank you for all your hard work and this amazing piece of software. I am coming from Emukit, and while it was nice, the GPy dependencies have made it a mess to use. Hence my switch to Ax. And I must say, WOW... this is light years ahead, congrats on all the progress!

Now onto my problem...
This is a multi objective bo problem. I am trying to understand the best approach for setting up my search space. The search space includes machine settings such as flow rate and wipe, the upper and lower limits to each of these are provided from domain knowledge. The objectives are the quality of parts the machine produces, scored visually and subjectively (assume a large amount of noise, I have mechanisms to help mitigate the subjectiveness).

My question is, in practice a flow rate of 100% and 100.05% will have no effect on the objectives. My idea was to setup step sizes based on domain knowledge to limit the size of the search space. I did this possibly a hacky way (see below). However, after reading a couple threads (#849) it seems that the model will function better with continuous parameters and I might be shooting my self in the foot using this approach. Ive come to realize the service api does a lot for you under the hood (took me almost a day to figure out how to specify qNEHVI as I thought the MOO model used EHVI but it turns out it uses qNEHVI under the hood for multi objective problems anyways...🤦‍♂️), so im not sure how the rounding works under the hood, if I can control that or not.

Id like to learn more about if its possible to do a step size with the service API. What the trade offs are? And maybe a suggestion on how to implement this while staying within the service API.

Below is my current setup

# Define parameter space
# Flowrate
s1_upper = 110
s1_lower = 90
s1_step = 1
s1_group = [str(round(x,0)) for x in np.arange(s1_lower, s1_upper + s1_step, s1_step)]

# Wipe
s2_upper = 1
s2_lower = 0
s2_step = 0.5
s2_group = [str(round(x,2)) for x in np.arange(s2_lower, s2_upper + s2_step, s2_step)]

# there are more settings, 7 total, each with different step sizes and such

# Define experiment with above param space
ax_client_modified = AxClient(generation_strategy=gs)
ax_client_modified.create_experiment(
    name="moo_experiment",
    parameters=[
        {
            "name": "flowrate", 
            "type": "choice", 
            "values": s1_group,
            "value_type": "float"
        },
        {
            "name": "wipe", 
            "type": "choice", 
            "values": s2_group,
            "value_type": "float"
        },
    ],
    objectives={
        # threshold to denote the lowest possible score thats acceptable
        "Y1": ObjectiveProperties(minimize=False, threshold=8),
        "Y2": ObjectiveProperties(minimize=False, threshold=8),
    },
    # Sets max parallelism to 5 for all steps of the generation strategy.
    choose_generation_strategy_kwargs={"max_parallelism_override": 5},
)

Thanks in advance for any help or intuition about the problem/solution.

The text was updated successfully, but these errors were encountered:

cheeseheist · 2024-04-17T19:54:05Z

Was just about to ask a separate question, but I think it will fit in this discussion. I'm running into cases where say I want three parallel trials. Often, the generator provides three nearly identical recommendations. As @trevor-haas described, it will be negligibly different in a real experiment with noise in the inputs. Any best practices to get the BO to provide more uniqueness between subsequent parallel recommendations, such as the step-size approach mentioned? In my case, I'm just doing single objective optimization.

Thanks for letting me tag along to your question! If it doesn't seem related, feel free to ignore me and I can make a separate thread!

Balandat · 2024-04-21T17:11:16Z

@trevor-haas thanks for the accolades, happy to hear that you're enjoying Ax. And thanks for contributing to making it better by engaging with the community.

As you said in general you won't help the optimization much by artificially discretizing a continuous parameter - but you may shoot yourself in the foot if the optimal setting happens to be between two of the pre-specified values (this can be mitigated with domain knowledge of how the function behaves in which case you might just not get any benefit).

Under the hood we will model a discretized floating point parameter in the same way as we model a continuous one (we infer that the values are ordered). The main difference is that the acquisition function optimization will happen on the discrete values. This often turns out to be harder than continuous optimization since we can't use gradient information on a discrete search space, so this is another downside. This is particularly pronounced if there are many steps and many parameters due to the combinatorial explosion of choices.

A valid (and I would argue maybe the only valid) reason to "pre-discretize" your search space is if you are indeed restricted in what parameters you can choose (maybe the flow rate settings are incremental by some step size). In this case you'll want to let the optimization to know that, but that doesn't make the problem easier.

That said, if you did want to do that, the code you have above seems to achieve that just fine.

Balandat · 2024-04-21T17:31:02Z

Often, the generator provides three nearly identical recommendations. As @trevor-haas described, it will be negligibly different in a real experiment with noise in the inputs. Any best practices to get the BO to provide more uniqueness between subsequent parallel recommendations, such as the step-size approach mentioned? In my case, I'm just doing single objective optimization.

@cheeseheist, I'd like to understand whether this is in fact a problem - if the outcomes are noisy and the model is relatively confident in what the region with the optimal parameter settings is, then evaluating multiple configurations close together may be the appropriate strategy. Note that the underlying GP surrogate model will be able to use multiple noisy observations in the same area to better infer the noise level (if not provided) and better estimate the latent function.

cheeseheist · 2024-04-22T13:26:47Z

Often, the generator provides three nearly identical recommendations. As @trevor-haas described, it will be negligibly different in a real experiment with noise in the inputs. Any best practices to get the BO to provide more uniqueness between subsequent parallel recommendations, such as the step-size approach mentioned? In my case, I'm just doing single objective optimization.

@cheeseheist, I'd like to understand whether this is in fact a problem - if the outcomes are noisy and the model is relatively confident in what the region with the optimal parameter settings is, then evaluating multiple configurations close together may be the appropriate strategy. Note that the underlying GP surrogate model will be able to use multiple noisy observations in the same area to better infer the noise level (if not provided) and better estimate the latent function.

Thanks for the response @Balandat. I have found a work around and this issue may have stemmed from my lack of understanding in how multiple subsequent generator calls work. Previously, I was doing multiple subsequent model.gen(n=1) calls to get multiple suggestions as opposed to say a single model.gen(n=3) call. The reason I was avoiding the second case was that I read in the documentation that these batch trials should only be used if they will be evaluated simultaneously, which they aren't really in my case. When I do a single model.gen(n=3) call, the recommendations are quite different and unique (which is what I'm looking for). When I do three model.gen(n=1) calls, they aren't that unique, but are slightly different. Perhaps this isn't surprising? I'm not sure how the model tracks multiple subsequent gen calls.

Anyway, my work around is to do a single model.gen(n=3) call and then to pull the parameters out of these and make three new individual trials. So, what the model sees in terms of trials are three separate trials rather than one batch trial, but I can use the batch generation to get more unique recommendations. So, I think I'm good to go (unless you have concerns with this approach) but would still be curious if consecutive model.gen(n=1) calls should behave better or if that just won't work without provided new data and retraining. I guess I assumed it would either have some memory and do effectively the same thing as model.gen(n=3) if I did it three times, or it would give an identical suggestion each time, but neither of those seem to be the case.

Balandat · 2024-04-22T13:39:49Z

@cheeseheist Ah I see when you are generating points in sequence you are not properly accounting for these "pending points". Can you share some code how exactly you're using the API?

Note that the dev API when you call ModelBridge.gen() will just return a GeneratorRun with a parameter suggestion, but in the Dev API you are responsible for tracking those if you do want to make that suggestion a candidate arm for a trial. The easiest way to do this is to call experiment.new_trial() and pass in the GeneratorRun. Then, in the next call to ModelBridge.gen(), Ax will know to account for these parameters as "pending". The reason this is done in the Dev API is that it gives flexibility to generate different suggestions with different settings (e.g modified search spaces or optimization configs) from which the user can choose which ones to evaluate (rather than pre-committing to making everything into a trial).

It's also possible to manually pass in pending observations into the ModelBridge.gen() call here: https://www.internalfb.com/code/fbsource/[3e4a0e5bf97b59b364ac7e098fd85def9205a5ac]/fbcode/ax/modelbridge/base.py?lines=741, though if you do want to add the previously generated suggestions as trials you should do that instead.

Note that when using the AxClient ask/tell interface, we will automatically track previously generated suggestions under the hood as trials after calling AxClient.get_next_trial(), so you don't have to worry about it there.

trevor-haas · 2024-04-22T14:58:00Z

@Balandat, this makes sense. I am only able to input certain values into the machine, ie 100% 100.1% 100.2% etc. So a value of 100.15% wouldn't be useful to me. I think due to the noise I can get away with using a continuous search space and just round the values on my own later in the pipeline. Thanks for the response.

I also have questions similar to that of @cheeseheist regarding the sample size and doing batches, however only with the service api. Please let me know if I should put this in another issue as I know we are straying away from the original question. I know this was also talked about I think in 2019 and the conclusion was that the service api doesn't support batch trials, so I've decided to just loop through ax_client_modified.get_next_trial() as many times as I want and manage the batches on my own. I also came across this which makes me think the way I am approaching it is my only option.

My problem calls for generating a new trial to run on the machine and Id like to include a couple arms to make the process more efficient (its expensive and time consuming to setup the machine for each trial). All the trials are processed by the machine one after the other without human intervention. Then at the end, all the arms are evaluated at the same time.

Im wondering if its okay to do the batch generation the way I have described for this application. From my understanding of the docs, this should be done with a batch trial as all arms will be evaluated at the same time, not multiple one armed trials. Im also not sure if this really even matters that much, could it just be best practices? Or would it substantially reduce the performance of the model?

Thanks for all your help!

Balandat · 2024-04-22T15:32:21Z

@trevor-haas this approach should work fine. For the "evaluate multiple configurations together to avoid overhead" scenario it is fine to manually group a set of individual trials together and use the standard AxClient API. Where it really matters to use a "batch trial" is where the trial deployment has to be done concurrently b/c the underlying data generation process may change over time (e.g. if you run online A/B tests with multiple treatment groups).

cheeseheist · 2024-04-22T19:23:34Z

@cheeseheist Ah I see when you are generating points in sequence you are not properly accounting for these "pending points". Can you share some code how exactly you're using the API?

Note that the dev API when you call ModelBridge.gen() will just return a GeneratorRun with a parameter suggestion, but in the Dev API you are responsible for tracking those if you do want to make that suggestion a candidate arm for a trial. The easiest way to do this is to call experiment.new_trial() and pass in the GeneratorRun. Then, in the next call to ModelBridge.gen(), Ax will know to account for these parameters as "pending". The reason this is done in the Dev API is that it gives flexibility to generate different suggestions with different settings (e.g modified search spaces or optimization configs) from which the user can choose which ones to evaluate (rather than pre-committing to making everything into a trial).

It's also possible to manually pass in pending observations into the ModelBridge.gen() call here: https://www.internalfb.com/code/fbsource/[3e4a0e5bf97b59b364ac7e098fd85def9205a5ac]/fbcode/ax/modelbridge/base.py?lines=741, though if you do want to add the previously generated suggestions as trials you should do that instead.

Note that when using the AxClient ask/tell interface, we will automatically track previously generated suggestions under the hood as trials after calling AxClient.get_next_trial(), so you don't have to worry about it there.

Here is a code snippet. I am doing generally what you suggested, but perhaps the issue is that I'm marking them complete immediately? The reason this is done is because my metric basically prompts a user for the output within the metric (currently playing around with this for offline optimization of experiments) and I have to mark it complete before the exp.fetch_data() works to trigger the metric and prompt the user. If there is a better way that I should be attaching the data to the trials and then accessing the trials and marking them complete, let me know. Though honestly, now I'm playing with my old code where I was generating them sequentially and it seems to be doing just fine, so maybe there was a separate bug I inadvertently fixed or it was all in my head :).

for j in range(BATCH_SIZE): 
    generator_run = gpei.gen(n=1)    
    trial = exp.new_trial(generator_run=generator_run)  
    trial.run()  
    trial.mark_completed()  
exp.fetch_data()

Balandat · 2024-04-23T13:15:55Z

Actually, I need to correct myself: If you use the bare Modelbridge, creating and attaching the trial to the experiment is not enough. You have to manually pass in the pending_observations.

If you use AxClient this is done for you (code pointer, but this is not the case with barebones Dev API.

Internally, this uses get_pending_observation_features to extract that input, so if you do want to use the Dev API you could use that and manually pass in the pending_observations. Of course if you call gen(n=n) with n>1 then that will also properly account for generating a batch of candidates.

I am doing generally what you suggested, but perhaps the issue is that I'm marking them complete immediately?

Given the above, marking the trial as COMPLETED is not the issue here. You can look into the guts of get_pending_observation_features to see that this returns the features for trials as pending that are completed but do not have any data attached.

trevor-haas · 2024-04-24T15:38:41Z

Thank you everyone for the help!

mgarrard · 2024-07-24T20:24:04Z

@trevor-haas closing due to inactivity, please feel free to re-open if you need additional support or start a new issue. :)

esantorella self-assigned this Apr 18, 2024

esantorella mentioned this issue Apr 22, 2024

Arms from previous batch keep appearing in new batches #2385

Closed

esantorella assigned Balandat and unassigned esantorella Apr 22, 2024

Abrikosoff mentioned this issue May 16, 2024

Restrict RangeParameter to a certain stepsize (or grid) #849

Closed

mgarrard closed this as completed Jul 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Setting search space step size in Ax Service API #2371

Setting search space step size in Ax Service API #2371

trevor-haas commented Apr 17, 2024

cheeseheist commented Apr 17, 2024

Balandat commented Apr 21, 2024

Balandat commented Apr 21, 2024

cheeseheist commented Apr 22, 2024

Balandat commented Apr 22, 2024

trevor-haas commented Apr 22, 2024

Balandat commented Apr 22, 2024

cheeseheist commented Apr 22, 2024

Balandat commented Apr 23, 2024

trevor-haas commented Apr 24, 2024

mgarrard commented Jul 24, 2024

Setting search space step size in Ax Service API #2371

Setting search space step size in Ax Service API #2371

Comments

trevor-haas commented Apr 17, 2024

cheeseheist commented Apr 17, 2024

Balandat commented Apr 21, 2024

Balandat commented Apr 21, 2024

cheeseheist commented Apr 22, 2024

Balandat commented Apr 22, 2024

trevor-haas commented Apr 22, 2024

Balandat commented Apr 22, 2024

cheeseheist commented Apr 22, 2024

Balandat commented Apr 23, 2024

trevor-haas commented Apr 24, 2024

mgarrard commented Jul 24, 2024