
Implement ENTMOOT in Bofire #278

Merged · 43 commits · Mar 15, 2024

Conversation

@TobyBoyne (Collaborator)

Initial progress on converting a BoFire problem to an ENTMOOT optimisation problem.

@TobyBoyne (Collaborator, Author)

As shown in strategies/entmoot/entmoot_example.py, the ask() method now matches the syntax for other strategies.

Is there any functionality that is missing? If not, I will add tests (especially for converting between features/constraints) and better documentation.
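
For readers of this thread, here is a minimal sketch of that ask/tell flow. It is not copied from the example file; it assumes the strategy data model ends up being named EntingStrategy, uses the built-in Himmelblau benchmark to get a domain, and requires the entmoot extras to be installed.

import bofire.strategies.api as strategies
from bofire.benchmarks.single import Himmelblau
from bofire.data_models.strategies.api import EntingStrategy

benchmark = Himmelblau()
domain = benchmark.domain

# a handful of initial experiments to train the tree model on
experiments = benchmark.f(domain.inputs.sample(10), return_complete=True)

data_model = EntingStrategy(domain=domain)
strategy = strategies.map(data_model)

strategy.tell(experiments)                    # hand over the training data
candidates = strategy.ask(candidate_count=1)  # propose the next experiment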

TobyBoyne marked this pull request as ready for review on September 7, 2023 14:25
jduerholt self-requested a review on September 7, 2023 17:18

@jduerholt (Contributor) left a comment

Hi @TobyBoyne,

thank you very much for your PR and for contributing to BoFire. I left some high-level comments. Feel free to question and discuss them. I am looking forward to being able to use ENTMOOT in BoFire.

Best,

Johannes

Review threads (outdated, resolved) on:
bofire/data_models/strategies/api.py
bofire/data_models/strategies/enting.py (4 threads)
bofire/strategies/entmoot/entmoot_example.ipynb
bofire/strategies/entmoot/entmoot_benchmarks.py

@TobyBoyne (Collaborator, Author)

Hi @jduerholt,

Thank you for the review - lots of helpful feedback!

I still haven't addressed the issue of type hinting the parameters - I started looking into the surrogates, but I'm not sure I understand what function they serve. I can implement the types as a simple class that inherits from BaseModel, but I'd rather use surrogates if that is the proper way to do things. As mentioned in the comments, I can't use any of the provided surrogates since they don't have the same attributes (ENTMOOT uses lgbm, and xgboost is not yet supported).

Kind regards,
Toby

@TobyBoyne (Collaborator, Author)

For reference, here is a mock-up of what the enting_params looks like. Note that the parameters in TrainParams are defined by lgbm, and I can't find a comprehensive type covering all possible options.

from typing import Literal

from pydantic import BaseModel


class UncParams(BaseModel):
    beta: float = 1.96  # > 0
    acq_sense: Literal["exploration", "penalty"] = "exploration"
    dist_trafo: Literal["normal", "standard"] = "normal"
    dist_metric: Literal["euclidean_squared", "l1", "l2"] = "euclidean_squared"
    cat_metric: Literal["overlap", "of", "goodall4"] = "overlap"


class TrainParams(BaseModel):
    # these are determined by lightgbm;
    # the typing in the package uses Dict[str, Any],
    # the defaults used by ENTMOOT are shown below
    objective: str = "regression"
    metric: str = "rmse"
    boosting: str = "gbdt"
    num_boost_round: int = 100
    max_depth: int = 3
    min_data_in_leaf: int = 1
    min_data_per_group: int = 1
    verbose: int = -1


class TreeTrainParams(BaseModel):
    train_lib: Literal["lgbm"] = "lgbm"
    train_params: TrainParams = TrainParams()


class EntingParams(BaseModel):
    unc_params: UncParams = UncParams()
    tree_train_parameters: TreeTrainParams = TreeTrainParams()

@jduerholt (Contributor)

Thanks @TobyBoyne, I was not at home over the weekend. I will have a detailed look tomorrow. Best, Johannes

@jduerholt (Contributor)

Hi Toby,

I would opt for something like this:

from typing import Annotated, Any, Dict, List, Literal, Optional

from pydantic import Field, validator


class LGBMSurrogate(Surrogate, TrainableSurrogate):
    type: Literal["LGBMSurrogate"] = "LGBMSurrogate"
    objective: str = "regression"
    metric: str = "rmse"
    boosting: str = "gbdt"
    num_boost_round: Annotated[int, Field(ge=1)] = 100
    max_depth: Annotated[int, Field(ge=1)] = 3
    min_data_in_leaf: Annotated[int, Field(ge=1)] = 1
    min_data_per_group: Annotated[int, Field(ge=1)] = 1
    # verbose: int = -1


class EntingSurrogates(Surrogates):
    # Surrogates is a new common base class for both BotorchSurrogates and EntingSurrogates
    surrogates: List[LGBMSurrogate]

    @validator("surrogates")
    def validate_surrogates(cls, v):
        ...


class EntingStrategy(PredictiveStrategy):
    type: Literal["EntingStrategy"] = "EntingStrategy"
    # unc params
    beta: Annotated[float, Field(gt=0)] = 1.96
    acq_sense: Literal["exploration", "penalty"] = "exploration"
    dist_trafo: Literal["normal", "standard"] = "normal"
    dist_metric: Literal["euclidean_squared", "l1", "l2"] = "euclidean_squared"
    cat_metric: Literal["overlap", "of", "goodall4"] = "overlap"
    # surrogate params
    surrogate_specs: Optional[EntingSurrogates] = None
    # solver params
    solver_params: Dict[str, Any] = {}  # do we have any defaults here?

In BoFire we have surrogates and strategies: surrogates are regression models that can be used to model output values of experiments, and (predictive) strategies use surrogates for optimization. The 100% solution would be to implement a new surrogate model called LGBMSurrogate, which can be used standalone. Via the parameters of the surrogate data model one could then modify the hyperparameters of the LGBM model used within the Enting strategy.

We do the same for the botorch based strategies, where different model types and hyperparameters can also be used within the same optimization for different outputs. Even subsets of outputs can be used. I looked a bit into the Entmoot code, and from my understanding the model parameters are used globally, meaning that for every output that should be optimized, a model with the same hyperparameters and input features is trained. One could guarantee this behavior with a validator in the class EntingSurrogates; see the sketch below. As soon as more freedom is allowed in Entmoot, one could also relax the validators. Maybe one could also derive the EntingSurrogates together with the BotorchSurrogates from a common base class. I will try to come up with a PR for such a common base class.
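
A possible shape for that validator (just a sketch - the field names and the comparison logic are assumptions, not BoFire's actual implementation):

from typing import List

from pydantic import validator


class EntingSurrogates(Surrogates):
    surrogates: List[LGBMSurrogate]

    @validator("surrogates")
    def validate_surrogates(cls, v):
        # entmoot currently trains one model per output with shared
        # hyperparameters and inputs, so enforce that all listed surrogates
        # agree on both
        reference = v[0]
        for surrogate in v[1:]:
            if surrogate.inputs != reference.inputs:
                raise ValueError("All surrogates must share the same input features.")
            if surrogate.dict(exclude={"inputs", "outputs"}) != reference.dict(exclude={"inputs", "outputs"}):
                raise ValueError("All surrogates must share the same hyperparameters.")
        return v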

What do you think? Is this somehow clear?

Best,

Johannes

@jduerholt (Contributor)

Here is the first version of the mentioned PR: #283

@TobyBoyne (Collaborator, Author)

Hi @jduerholt,

Yes, I think that all makes sense - thank you for the explanation. So to confirm, as an example, the _fit() logic should be moved to LGBMSurrogate, but the _ask() method should remain in the strategy? I'm looking through BotorchStrategy._ask() and can't find any references to the surrogate - is the surrogate not used in this method?

Also, in terms of moving forward - do I have write permissions for the feature/surrogates branch/PR? If so, should I implement the EntingSurrogates there, and if not, should I wait for that pull request to be accepted before creating the EntingSurrogates?

Kind regards,
Toby

@spiralulam commented Sep 12, 2023 via email

@jduerholt (Contributor)

I'm looking through BotorchStrategy._ask() and can't find any references to the surrogate - is the surrogate not used in this method?

In botorch, the model is attached to the acquisition function object. For this reason we first get an acquisition function object in _ask and then use this object, which has the model inside, for the optimization:

acqfs = self._get_acqfs(candidate_count)

and here:

acqf = get_acquisition_function(

The self.model object is created here on the basis of the surrogates:

self.model = self.surrogate_specs.compatibilize(  # type: ignore

In the call to compatibilize, the individual surrogates are compiled into one botorch ModelList object.

So to confirm, as an example, the _fit() logic should be moved to LGBMSurrogate, but the _ask() method should remain in the strategy?

You would have it in both the surrogate and the strategy: you can call the surrogate's _fit from the strategy's _fit (a rough sketch follows below). But I am not sure if this is currently needed, as Entmoot does not currently support different surrogate specs for different outputs. So you could do it independently from each other for now.
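
A rough sketch of that delegation (method names and signatures are simplified here, not BoFire's exact interfaces):

import pandas as pd


class EntingStrategy(PredictiveStrategy):
    def _fit(self, experiments: pd.DataFrame):
        # the strategy itself does no training; it hands the experiments to
        # its LGBM surrogate(s), which own the lightgbm training logic
        for surrogate in self.surrogate_specs.surrogates:
            surrogate.fit(experiments)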

Also, in terms of moving forward - do I have write permissions for the feature/surrogates branch/PR? If so, should I implement the EntingSurrogates there, and if not, should I wait for that pull request to be accepted before creating the EntingSurrogates?

I will try to get that PR through by tomorrow at the latest, but in terms of moving forward, I would recommend that you just create a new branch/PR in which you first implement only the LGBMSurrogate (including tests), as this is the first thing that is needed. In the meantime, I will finalize my PR and you can then implement the EntingSurrogates and the EntingStrategy on top of both.

For the LGBMSurrogate, you can draw inspiration from the XGBSurrogate.

What do you think?

@jduerholt (Contributor)

When we have the EntingStrategy included, we will also refactor the directory structure for the botorch based strategies into strategies/predictives/botorch/.

@TobyBoyne (Collaborator, Author)

I would recommend that you just create a new branch/PR in which you first implement only the LGBMSurrogate (including tests), as this is the first thing that is needed

Happy to work on that! Thanks for all the reviewing and help, very much appreciated!

@TobyBoyne (Collaborator, Author)

I've made an initial commit for the surrogate. However, in order to implement the regression in #285, I need the utils in this PR - any thoughts on how I can overcome that?

@TobyBoyne (Collaborator, Author)

I've been working on implementing changes in the Entmoot codebase. Since this was last discussed, there are a few things that have changed:

  1. We have discussed that it is not necessary to split this into a surrogate. This PR contains a working strategy implementation, and splitting it into a separate surrogate is potential future work.
  2. The typing issue has been partially resolved with the addition of EntingParams. This has been implemented in entmoot - see that repo for more information. The solver still accepts an untyped Dict, but that's somewhat out of our hands, since Pyomo doesn't provide an exhaustive list of arguments.
  3. Other changes include matching entmoot's Constraints to bofire's, and creating an interface for maximisation problems.

TobyBoyne changed the title from "Convert features and constraints to entmoot" to "Implement ENTMOOT in Bofire" on Nov 7, 2023

@jduerholt (Contributor)

Hi @TobyBoyne,

I will do a proper review this evening. Thank you already!

Best,

Johannes

@jduerholt (Contributor) left a comment

We are converging! The biggest obstacle from my perspective is currently a problem with the entmoot installation in the test pipeline. But this looks to me like an entmoot problem and not a bofire problem (see my comments).

Best,

Johannes


import numpy as np
import pandas as pd
import pyomo.environ as pyo

Contributor:

Also the pyomo import has to be moved into the try except.
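
For context, the optional-import pattern being referred to looks roughly like this (the entmoot import paths shown are illustrative, not necessarily the ones used in this PR):

import warnings

try:
    import pyomo.environ as pyo
    from entmoot.models.enting import Enting
    from entmoot.problem_config import ProblemConfig
except ImportError:
    # entmoot or one of its solver dependencies is missing; the strategy
    # module can still be imported, but mapping the strategy will fail
    warnings.warn("entmoot not installed; EntingStrategy is unavailable.")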

Contributor:

Furthermore, there is something strange with the entmoot installation. Looking at the logs of the failing test runs, it installs entmoot but not pyomo, lgbm and gurobipy, which are entmoot's dependencies. I then tried to install entmoot locally on my laptop into an environment without it, and there it also does not install the dependencies. This looks to me like a problem with entmoot. Any idea?

Contributor:

The solver dependencies need to be installed manually.
https://entmoot.readthedocs.io/en/master/installation.html

Contributor:

Ok, then we have to put it into our dependencies for the entmoot option in setup.py, because the strategy, as it is now implemented, is based on pyomo.

Collaborator Author:

I have added these dependencies to setup.py in BoFire. However, these dependencies definitely do need to be fixed on entmoot's side - for example, lightgbm is not installed despite being required, and entmoot should probably just install pyomo anyway.

# This ensures that new points are far from pending candidates
real_experiments = self.experiments
if self.candidates is not None:
    for i in range(len(self.candidates)):

Contributor:

Ok, makes sense. I have to think further about it, but for now we leave it as it is and will merge it like this. So for now, regard this as resolved ;)

preds = self.predict(candidate)
candidate = pd.concat((candidate, preds), axis=1)
as_experiment = self._fantasy_as_experiment(candidate)
self.tell(as_experiment)

Contributor:

In your case here, just using _fit should be fine.

assert (input_values != 0).sum().sum() <= allowed_k


@pytest.mark.slow

Contributor:

This makes sense. Could you check whether gurobi is installed, in the same way that you check that entmoot is installed, and then add another skipif? That would be a bit cleaner; a possible pattern is sketched below. If that's not possible, then we keep it.
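
One way such a skipif could look (the availability check and the test name here are hypothetical, not the PR's actual code):

import importlib.util

import pytest

GUROBI_AVAILABLE = importlib.util.find_spec("gurobipy") is not None


@pytest.mark.skipif(not GUROBI_AVAILABLE, reason="requires gurobipy")
def test_ask_with_gurobi_solver():
    # note: having gurobipy installed does not guarantee a usable license,
    # which is the separate issue discussed further down in this thread
    ...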

def test_propose_optimal_point(common_args):
    # regression test, ensure that a good point is proposed
    np.random.seed(42)
    random.seed(42)

Contributor:

I would prefer passing the seed explicitly to entmoot, by calling strategy._get_seed() and then passing the int to entmoot. As a general note, it would also be better in entmoot to not set the seeds globally but to use numpy's rng functionality, as sketched below.
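
As an illustration of that last point, a local generator keeps the seed scoped instead of mutating global state (strategy._get_seed() is the BoFire method mentioned above; the rest is a sketch):

import numpy as np

seed = strategy._get_seed()            # explicit, reproducible seed from the strategy
rng = np.random.default_rng(seed)      # local RNG instead of np.random.seed(seed)
perturbation = rng.standard_normal(3)  # example draw from the local generator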

@TobyBoyne (Collaborator, Author)

Addressed all above comments. I also raised an issue on ENTMOOT about random state because I think you are definitely correct! cog-imperial/entmoot#31

TobyBoyne requested a review from jduerholt on March 8, 2024 14:20

@jduerholt (Contributor) left a comment

Looks overall very good to me. But somehow tests are still failing. One can see in the test logs that entmoot gets installed, but somehow the stuff in the try/except block fails, so that nothing gets imported. No idea why.

I will try to set the permissions for you such that the test pipeline is triggered directly, without needing us for this, so you can iterate faster.

Review thread (outdated, resolved) on bofire/strategies/predictives/enting.py

@TobyBoyne (Collaborator, Author)

TobyBoyne commented Mar 11, 2024

Two sources of issues that I've found so far:

Entmoot requires gurobipy, even if it isn't being used. This is an issue with Entmoot (cog-imperial/entmoot#32), but I've added it to setup.py for now

pyright isn't happy with domain_to_problem_config. This is because the functions only accept a subset of the possible feature types. To me, the solution would be to make the type hinting in _bofire_feat_to_entmoot less specific. However, this undoes a change that @jduerholt recommended earlier in this PR, so I just wanted to flag it so that you are aware I'm reverting it!

Edit: Actually, domain.inputs.get() returns a list of type AnyFeature, not AnyInput. Is there any way around this?

@TobyBoyne (Collaborator, Author)

I've made some progress. There are quite a few # type: ignores in enting.py which may not all be necessary, but which I used to get around pyright and the above issue about the return types of output.get() and input.get().

However, I'm still getting issues with running the pipeline. On my machine, if I remove the license file, the tests that require Gurobi are skipped. However, the pipeline seems to still be attempting to run these, and then having issues. Any ideas?

(Note that pyomo.common.errors.ApplicationError: Solver (gurobi) did not exit normally error is raised when there is no license file).

@jduerholt (Contributor)

Using type: ignore when using the get methods is fine; Pyright is not able to work with this dynamic behavior. Currently, I cannot start the pipeline - can you resolve the merge conflicts? They came in due to a change by Behrang ;) I will then have a look again at the pipeline and the tests.

@jduerholt (Contributor)

Found the results from the latest pipeline execution here: https://github.com/TobyBoyne/bofire/actions/runs/8236200085/job/22522063543, and I also tried your catch for gurobi locally (I just installed it via pip) without having a license and got this:

[screenshot of the resulting error]

Any ideas?

@jduerholt (Contributor)

Here it is stated that it comes with a trial license that can be used only for small problems: https://pypi.org/project/gurobipy/

@jduerholt (Contributor)

Hi @TobyBoyne,

I thought about it; to get it implemented quickly, just mark the tests in which gurobi is needed as slow and we are done ;)

Please also resolve the merge conflicts and then we can finally get it in.

Best,

Johannes

@TobyBoyne (Collaborator, Author)

Hi @jduerholt, I've added this in and solved merge conflicts. Looks like the tests are all passing...?

Hopefully this should be good to go! Thank you for all of the reviews! Looking forward to the next PR :)

@jduerholt (Contributor) left a comment

Hi @TobyBoyne,

looks good to me. Thanks for all your patience!

@bertiqwerty @R-M-Lee also ready to merge from your side?

Best,

Johannes

@bertiqwerty (Contributor)

Thanks a lot, Toby and Johannes!

bertiqwerty merged commit 86f1220 into experimental-design:main on Mar 15, 2024
10 checks passed

@R-M-Lee (Contributor)

R-M-Lee commented Mar 18, 2024

@jduerholt, @TobyBoyne Thanks to both of you for getting this through. Are we happy with the treatment of the optional entmoot dependencies now, or was this more of a short-term fix and we still have some open to-dos?

TobyBoyne mentioned this pull request on Mar 25, 2024