
How to save and load experiment/model from optimize #87

Closed
HanGuo97 opened this issue May 11, 2019 · 20 comments

@HanGuo97

Hi, from the documentation, the optimize function returns a (best_parameters, values, experiment, model) tuple. I'm wondering what the best practices are for saving these values (e.g., for visualization on a different machine)? Also, is it possible to interrupt an optimization and later resume it from the saved state in the optimize API? Thanks!

@lena-kashtelyan
Contributor

lena-kashtelyan commented May 13, 2019

Hi, @HanGuo97!

  1. We are working on the ability to save the model and stop/resume optimization through the Loop API (the one that provides the optimize function), but those are not yet included in the current version of Ax.

  2. For best parameters and values, I don't think there is yet a best practice per se, since those are just mappings. However, we should add an ability to easily retrieve those from the experiment –– thank you for pointing it out.

  3. For saving the experiment, you can make use of the JSON save or SQA save_experiment functionality (see the short save sketch after the plotting snippet below).

  4. Finally, re: visualization on a different machine, you can easily restore the best-objective-per-iteration plot from the experiment you saved and reloaded, like so (this example is taken from the Loop API tutorial):

import numpy as np

from ax import load
from ax.plot.trace import optimization_trace_single_method
from ax.utils.measurement.synthetic_functions import hartmann6
from ax.utils.notebook.plotting import render

experiment = load("experiment.json")  # `load` docs: https://ax.dev/api/index.html#ax.load

best_objectives = np.array([[trial.objective_mean for trial in experiment.trials.values()]])
best_objective_plot = optimization_trace_single_method(
    y=np.minimum.accumulate(best_objectives, axis=1),
    optimum=hartmann6.fmin,
    title="Model performance vs. # of iterations",
    ylabel="Hartmann6",
)
render(best_objective_plot)
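For point 3 above, a minimal sketch of the JSON save side (using the save helper exported at the top level of ax; the SQA route uses save_experiment with a SQL backend instead):

from ax import save

save(experiment, "experiment.json")  # writes the experiment to a local JSON file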

For the response surface contour plots, we will provide an ability to reload those plots soon.

@lena-kashtelyan lena-kashtelyan self-assigned this May 13, 2019
@lena-kashtelyan lena-kashtelyan added the enhancement and question labels May 13, 2019
@HanGuo97
Author

Thanks for the response!

@ksanjeevan

@lena-kashtelyan any updates on this front? Specifically, on point 2? How can one obtain the best parameters and values from just an experiment object? Thanks!

@theinlinaung2010

@lena-kashtelyan Have you got any updates on these features? I'm trying to do the same things.

@lena-kashtelyan
Contributor

lena-kashtelyan commented Jul 13, 2020

Hi, @ksanjeevan, @theinlinaung2010! Sorry to have missed your tags; the issue was closed, so we weren't notified in time. In the future, feel free to reopen the issue if there is a follow-up, to make sure we get back to you as soon as we can!

For 2), you can use get_best_parameters utility to find the best point on the experiment. You can also use exp_to_df to view the trials in your experiment as a convenient dataframe. Let me know if those don't fully address your concern!
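A rough sketch of what that can look like; the exact import paths and signatures of these utilities have moved around between Ax versions, so treat this as an approximation rather than the canonical usage:

from ax.service.utils.best_point import get_best_parameters  # path/signature may differ by Ax version
from ax.service.utils.report_utils import exp_to_df          # path may differ by Ax version

best = get_best_parameters(experiment)  # best parameterization found on the experiment
df = exp_to_df(experiment)              # one row per trial, as a pandas DataFrame
print(best)
print(df.head())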

To use storage functionality for experiment and models used in optimization, I would recommend using our Service API that is well integrated with our storage layer (you can store locally to a .json file or to an SQL backend). You can check out our API comparison code snippets and the Service API tutorial to see how to get started and how to leverage storage.
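For the Service API route, a minimal sketch of the local JSON storage (assuming ax_client is an existing AxClient; setting one up is not shown here):

from ax.service.ax_client import AxClient

# Assuming `ax_client` is an AxClient on which create_experiment and some
# get_next_trial / complete_trial calls have already been made:
ax_client.save_to_json_file("ax_client_snapshot.json")  # store locally to a JSON file
restored_client = AxClient.load_from_json_file("ax_client_snapshot.json")  # reload later, even on another machine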

Let us know if you have further questions, I'll keep the issue open for now.

@lena-kashtelyan
Contributor

Closing since it appears that my last answer resolved the questions. @ksanjeevan, @theinlinaung2010, feel free to reopen if you have follow-ups!

@nwrim

nwrim commented Apr 7, 2021

Hi @lena-kashtelyan, I was searching the web to see if we can explicitly save the models, and this looked like the most similar issue. I was wondering if we can locally save models such as those returned by ax.modelbridge.factory.get_GPEI (e.g., if we ran m = ax.modelbridge.factory.get_GPEI(experiment=experiment, data=ax.Data(data)), we want to save m locally so that we can load it back at a later stage). We are not using the Service API, which seems to make saving much easier. This is because we do not have an evaluation function - we just have people do some task, get the results, and load and feed the data to the model.

If the model cannot be saved, is there a way to preserve the predict() functionality of the model? For context, we are running human-subject experiments, and sometimes an experiment takes weeks to run before we actually get the data. We want to compare the results with the predictions the model made at that point (similar to the cross-validation plot in the model example, but across batches rather than CV folds). Do you have any suggestions?

Let me know if anything is unclear, or if you want me to open a separate issue!

@ldworkin
Contributor

ldworkin commented Apr 7, 2021

Hi @nwrim ! The easiest way to accomplish what you want is probably to save the data that you're using to fit the model (rather than the model itself), and then you can refit it whenever you want. If you want to take advantage of Ax's storage to do so, you would use experiment.attach_data() followed by save_experiment. Then you can use load_experiment and experiment.lookup_data_for_trial or experiment.lookup_data_for_ts at a later point to grab that data again and refit the model. Does that sound like it would help?
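A rough sketch of that flow, assuming new_data is an ax.Data object built from your observed results and experiment already exists (the names here are illustrative):

from ax import load, save
from ax.modelbridge.factory import get_GPEI

experiment.attach_data(new_data)     # attach the data you collected for the trial(s)
save(experiment, "experiment.json")  # persist the experiment, data included

# later, possibly on another machine:
experiment = load("experiment.json")
model = get_GPEI(experiment=experiment, data=experiment.fetch_data())  # refit rather than reload the model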

@ldworkin ldworkin reopened this Apr 7, 2021
@lena-kashtelyan
Contributor

lena-kashtelyan commented Apr 7, 2021

We are not using the Service API, which seems to make saving much easier. This is because we do not have an evaluation function - we just have people do some task, get the results, and load and feed the data to the model.

Just wanted to quickly chime in that you don't need to have an evaluation function to use the Service API –– it's actually a perfect fit for the case of "do some task, get results, feed them back to Ax model". The AxClient.get_next_trial (get parameters to try) then AxClient.complete_trial (feed data back to Ax model) pattern is made to serve this exact purpose.
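A minimal sketch of that pattern for a human-in-the-loop setup (the parameter names, objective, and result values below are made up for illustration):

from ax.service.ax_client import AxClient

ax_client = AxClient()
ax_client.create_experiment(
    name="human_subject_study",
    parameters=[
        {"name": "x1", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 1.0]},
    ],
    objective_name="score",  # kwarg as of the Ax versions discussed in this thread; newer versions use `objectives`
)

# Get a suggested configuration, hand it to the task, then feed the observed result back.
parameters, trial_index = ax_client.get_next_trial()
# ... run the human-subject task with `parameters`, collect the outcome ...
ax_client.complete_trial(trial_index=trial_index, raw_data={"score": (0.73, 0.05)})  # (mean, SEM)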

So if you wanted to leverage the Service API and its convenient storage setup, you totally could! An 'evaluation function' is just a stub for the purposes of the tutorial, to show that one can do whatever one needs with the parameter configuration in a given trial to get the data to log back to Ax : ) This seems to have been a source of confusion for multiple folks, so we'll think about how to make it clearer in the tutorial.

Even with the Service API, though, what happens under the hood is not storage of the model but what @ldworkin said above: we re-fit the model to new data whenever data becomes available and only store information about the model settings (and a bit of its state in some cases). So it would just be a convenience wrapper around that same functionality.

@nwrim

nwrim commented Apr 7, 2021

Hi @ldworkin and @lena-kashtelyan! Thanks for the response! I really appreciate it.

The reason I wanted to store the model, rather than the experiment and data themselves, is that the model seemed to generate different suggestions from .gen() even when I fit it with the exact same data. Based on this, I guessed that the model makes slightly different predictions due to some stochastic process (we are using the GPEI model) even though we fit it with the exact same data. We wanted to preserve the model because we want to release what it predicted at the final stage. It is good to know that model saving is not supported - I will think about how to approach this better.

As for the Service API, the main reason I did not use it was that I thought it was not possible to add arms that the model did not suggest (i.e., arms other than those obtained through AxClient.get_next_trial, as you mentioned) to the next batch. Is this possible through the Service API? In that case, I think I should consider switching my pipeline to the Service API, since it looks much easier to use!

Thanks again for the response!

@Balandat
Contributor

Balandat commented Apr 7, 2021

I guessed that the model makes slightly different predictions due to some stochastic process (we are using the GPEI model) even though we fit it with the exact same data. We wanted to preserve the model because we want to release what it predicted at the final stage. It is good to know that model saving is not supported - I will think about how to approach this better.

Are you concerned about the model predictions being reproducible or the candidates that the model generates? In either case, it should generally be possible to make that deterministic by fixing the random seed in the proper places (so long as you pass in the same data of course). If that would be useful we could give you some pointers on how to achieve this.

@nwrim

nwrim commented Apr 7, 2021

Hi @Balandat! Yes, we are essentially concerned about reproducibility, since we will likely have to release data from all steps (predictions, candidates, etc.). It would be great if you could point us to how to seed the models. I used the get_sobol model with a seed for the first generation of random arms so that it is reproducible, but I did not see a seed option in other models like get_GPEI, which are the actual BayesOpt models we want to use.

@lena-kashtelyan
Contributor

lena-kashtelyan commented Apr 12, 2021

As for the Service API, the main reason I did not use it was that I thought it was not possible to add arms that the model did not suggest (i.e., arms other than those obtained through AxClient.get_next_trial, as you mentioned) to the next batch. Is this possible through the Service API? In that case, I think I should consider switching my pipeline to the Service API, since it looks much easier to use!

AxClient.attach_trial should be what you are looking for, I think, @nwrim! See example here: https://ax.dev/tutorials/gpei_hartmann_service.html#Special-Cases.

You can also pass the random_seed keyword argument to AxClient to specify the random seed for Sobol, but it will not fix the seed for GPEI. I think @Balandat can suggest a way to fix the GPEI seed by fixing the seed in PyTorch?
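For reference, a short sketch of that keyword argument (it covers only the Sobol step):

from ax.service.ax_client import AxClient

ax_client = AxClient(random_seed=239)  # seeds Sobol generation; does not make GPEI candidate generation deterministic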

@Balandat
Contributor

So the brute-force approach would be to just set torch.manual_seed before calling get_next_trial. This will probably work but could have some unwanted effects on other torch things going on under the hood.

For the dev API, it's possible to pass the args down to the ModelBridge's gen method as

model_gen_kwargs: {"optimizer_kwargs": {"options": {"seed": 1234}}}

(lots of dicts, I know). I don't think we have this exposed in the Service API at this point; it may make sense to add an optional seed kwarg to get_next_trial so that it's easy to get deterministic behavior.
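A sketch of the brute-force option, assuming an existing AxClient named ax_client:

import torch

# Fix the global torch seed right before candidate generation; note this affects
# all torch RNG usage in the process, not just Ax's acquisition optimization.
torch.manual_seed(1234)
parameters, trial_index = ax_client.get_next_trial()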

@lena-kashtelyan
Contributor

lena-kashtelyan commented Apr 22, 2021

@nwrim, the approach @Balandat suggests above is actually also possible in the Service API, but one would have to manually construct the generation strategy and pass it to AxClient. Here is how:

from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient

gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=5,  # Number of trials to generate from Sobol
            min_trials_observed=3,  # Number of Sobol trials that must be completed with data before the next step
            model_kwargs={"seed": 239},  # Fix the Sobol seed
        ),
        GenerationStep(
            model=Models.BOTORCH,
            num_trials=-1,
            max_parallelism=None,  # Can set a limit on parallelism in this step if desired
            model_gen_kwargs={"optimizer_kwargs": {"options": {"seed": 239}}},  # Fix the acquisition optimizer seed
        ),
    ]
)

ax_client = AxClient(..., generation_strategy=gs)

ax_client.create_experiment(...)  # This function will no longer auto-set the generation strategy

@nwrim

nwrim commented Apr 22, 2021

This is so, so helpful! I will try out the suggested approaches, both in the dev API and the Service API, and let you know if anything does not pan out. Please feel free to close the issue!

@lena-kashtelyan
Contributor

Okay! Don't hesitate to reopen if you run into any issues or have follow-up questions : )

@mickelindahl

This is how I recreated a model from experiment data. I did not need to call experiment.attach_data(); I only used the save and load functions (from ax import save, load):

from ax import load
from ax.modelbridge.factory import get_GPEI
from ax.plot.contour import plot_contour
from ax.utils.notebook.plotting import render

experiment = load("experiment.json")
m = get_GPEI(experiment, experiment.fetch_data())

# Visualize:
render(plot_contour(model=m, param_x='lr', param_y='momentum', metric_name='accuracy'))

@sgbaird
Contributor

sgbaird commented Nov 11, 2022

@lena-kashtelyan I have a large model that takes a while to fit, since it has ~2000 datapoints across 3 objectives with ~22 parameters (a few are categorical) and two parameter constraints. Do you have any suggestions for saving and reloading the model without refitting (Service API)? Something hacky or non-portable to other machines would be fine. For example, do you know if using pickle would work? If not, I'll probably give it a try soon.

@Runyu-Zhang

@lena-kashtelyan I have a large model that takes a while to fit, since it has ~2000 datapoints across 3 objectives with ~22 parameters (a few are categorical) and two parameter constraints. Do you have any suggestions for saving and reloading the model without refitting (Service API)? Something hacky or non-portable to other machines would be fine. For example, do you know if using pickle would work? If not, I'll probably give it a try soon.

Kindly asking whether there is a solution for this.
