
Implement ENTMOOT in Bofire #278

Merged · 43 commits · Mar 15, 2024

Conversation

@TobyBoyne (Collaborator)

Initial progress on converting a BoFire problem to an ENTMOOT optimisation problem.

@TobyBoyne (Collaborator, Author)

As shown in strategies/entmoot/entmoot_example.py, the ask() method now matches the syntax for other strategies.

Is there any functionality that is missing? If not, I will add tests (especially for converting between features/constraints) and better documentation.
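
For readers of this thread, here is a minimal sketch of that ask/tell flow. It is not copied from the example file; it assumes the strategy data model ends up being named EntingStrategy, uses the built-in Himmelblau benchmark to get a domain, and requires the entmoot extras to be installed.

import bofire.strategies.api as strategies
from bofire.benchmarks.single import Himmelblau
from bofire.data_models.strategies.api import EntingStrategy

benchmark = Himmelblau()
domain = benchmark.domain

# a handful of initial experiments to train the tree model on
experiments = benchmark.f(domain.inputs.sample(10), return_complete=True)

data_model = EntingStrategy(domain=domain)
strategy = strategies.map(data_model)

strategy.tell(experiments)                    # hand over the training data
candidates = strategy.ask(candidate_count=1)  # propose the next experiment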

TobyBoyne marked this pull request as ready for review on September 7, 2023 14:25
jduerholt self-requested a review on September 7, 2023 17:18

@jduerholt (Contributor) left a comment

Hi @TobyBoyne,

thank you very much for your PR and for contributing to BoFire. I left some high-level comments. Feel free to question and discuss them. I am looking forward to being able to use ENTMOOT in BoFire.

Best,

Johannes

Review threads (outdated, resolved) on:
bofire/data_models/strategies/api.py
bofire/data_models/strategies/enting.py (4 threads)
bofire/strategies/entmoot/entmoot_example.ipynb
bofire/strategies/entmoot/entmoot_benchmarks.py

@TobyBoyne (Collaborator, Author)

Hi @jduerholt,

Thank you for the review - lots of helpful feedback!

I still haven't addressed the issue of type hinting the parameters - I started looking into the surrogates, but I'm not sure I understand what function they serve. I can implement the types as a simple class that inherits from BaseModel, but I'd rather use surrogates if that is the proper way to do things. As mentioned in the comments, I can't use any of the provided surrogates since they don't have the same attributes (ENTMOOT uses lgbm, and xgboost is not yet supported).

Kind regards,
Toby

@TobyBoyne (Collaborator, Author)

For reference, here is a mock-up of what the enting_params looks like. Note that the parameters in TrainParams are defined by lgbm, and I can't find a comprehensive type covering all possible options.

from typing import Literal

from pydantic import BaseModel


class UncParams(BaseModel):
    beta: float = 1.96  # > 0
    acq_sense: Literal["exploration", "penalty"] = "exploration"
    dist_trafo: Literal["normal", "standard"] = "normal"
    dist_metric: Literal["euclidean_squared", "l1", "l2"] = "euclidean_squared"
    cat_metric: Literal["overlap", "of", "goodall4"] = "overlap"


class TrainParams(BaseModel):
    # these are determined by lightgbm;
    # the typing in the package uses Dict[str, Any],
    # the defaults used by ENTMOOT are shown below
    objective: str = "regression"
    metric: str = "rmse"
    boosting: str = "gbdt"
    num_boost_round: int = 100
    max_depth: int = 3
    min_data_in_leaf: int = 1
    min_data_per_group: int = 1
    verbose: int = -1


class TreeTrainParams(BaseModel):
    train_lib: Literal["lgbm"] = "lgbm"
    train_params: TrainParams = TrainParams()


class EntingParams(BaseModel):
    unc_params: UncParams = UncParams()
    tree_train_parameters: TreeTrainParams = TreeTrainParams()

@jduerholt (Contributor)

Thanks @TobyBoyne, I was not at home over the weekend. I will have a detailed look tomorrow. Best, Johannes

@jduerholt (Contributor)

Hi Toby,

I would opt for something like this:

from typing import Annotated, Any, Dict, List, Literal, Optional

from pydantic import Field, validator


class LGBMSurrogate(Surrogate, TrainableSurrogate):
    type: Literal["LGBMSurrogate"] = "LGBMSurrogate"
    objective: str = "regression"
    metric: str = "rmse"
    boosting: str = "gbdt"
    num_boost_round: Annotated[int, Field(ge=1)] = 100
    max_depth: Annotated[int, Field(ge=1)] = 3
    min_data_in_leaf: Annotated[int, Field(ge=1)] = 1
    min_data_per_group: Annotated[int, Field(ge=1)] = 1
    # verbose: int = -1


class EntingSurrogates(Surrogates):
    # Surrogates is a new common base class for both BotorchSurrogates and EntingSurrogates
    surrogates: List[LGBMSurrogate]

    @validator("surrogates")
    def validate_surrogates(cls, v):
        ...


class EntingStrategy(PredictiveStrategy):
    type: Literal["EntingStrategy"] = "EntingStrategy"
    # unc params
    beta: Annotated[float, Field(gt=0)] = 1.96
    acq_sense: Literal["exploration", "penalty"] = "exploration"
    dist_trafo: Literal["normal", "standard"] = "normal"
    dist_metric: Literal["euclidean_squared", "l1", "l2"] = "euclidean_squared"
    cat_metric: Literal["overlap", "of", "goodall4"] = "overlap"
    # surrogate params
    surrogate_specs: Optional[EntingSurrogates] = None
    # solver params
    solver_params: Dict[str, Any] = {}  # do we have any defaults here?

In BoFire we have surrogates and strategies: surrogates are regression models that can be used to model output values of experiments, and (predictive) strategies use surrogates for optimization. The 100% solution would be to implement a new surrogate model called LGBMSurrogate, which can be used standalone. Via the parameters of the surrogate data model one could then modify the hyperparameters of the LGBM model used within the Enting strategy.

We do the same for the botorch based strategies, where different model types and hyperparameters can also be used within the same optimization for different outputs. Even subsets of outputs can be used. I looked a bit into the Entmoot code, and from my understanding the model parameters are used globally, meaning that for every output that should be optimized, a model with the same hyperparameters and input features is trained. One could guarantee this behavior with a validator in the class EntingSurrogates; see the sketch below. As soon as more freedom is allowed in Entmoot, one could also relax the validators. Maybe one could also derive the EntingSurrogates together with the BotorchSurrogates from a common base class. I will try to come up with a PR for such a common base class.
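
A possible shape for that validator (just a sketch - the field names and the comparison logic are assumptions, not BoFire's actual implementation):

from typing import List

from pydantic import validator


class EntingSurrogates(Surrogates):
    surrogates: List[LGBMSurrogate]

    @validator("surrogates")
    def validate_surrogates(cls, v):
        # entmoot currently trains one model per output with shared
        # hyperparameters and inputs, so enforce that all listed surrogates
        # agree on both
        reference = v[0]
        for surrogate in v[1:]:
            if surrogate.inputs != reference.inputs:
                raise ValueError("All surrogates must share the same input features.")
            if surrogate.dict(exclude={"inputs", "outputs"}) != reference.dict(exclude={"inputs", "outputs"}):
                raise ValueError("All surrogates must share the same hyperparameters.")
        return v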

What do you think? Is this somehow clear?

Best,

Johannes

@jduerholt (Contributor)

Here is the first version of the mentioned PR: #283

@TobyBoyne (Collaborator, Author)

Hi @jduerholt,

Yes, I think that all makes sense - thank you for the explanation. So to confirm, as an example, the _fit() logic should be moved to LGBMSurrogate, but the _ask() method should remain in the strategy? I'm looking through BotorchStrategy._ask() and can't find any references to the surrogate - is the surrogate not used in this method?

Also, in terms of moving forward - do I have write permissions for the feature/surrogates branch/PR? If so, should I implement the EntingSurrogates there, and if not, should I wait for that pull request to be accepted before creating the EntingSurrogates?

Kind regards,
Toby

@spiralulam commented Sep 12, 2023 via email

@jduerholt (Contributor)

I'm looking through BotorchStrategy._ask() and can't find any references to the surrogate - is the surrogate not used in this method?

In botorch, the model is attached to the acquisition function object. For this reason we first get an acquisition function object in _ask and then use this object, which has the model inside, for the optimization:

acqfs = self._get_acqfs(candidate_count)

and here:

acqf = get_acquisition_function(

The self.model object is created here on the basis of the surrogates:

self.model = self.surrogate_specs.compatibilize(  # type: ignore

In the call to compatibilize, the individual surrogates are compiled into one botorch ModelList object.

So to confirm, as an example, the _fit() logic should be moved to LGBMSurrogate, but the _ask() method should remain in the strategy?

You would have it in both the surrogate and the strategy: you can call the surrogate's _fit from the strategy's _fit (a rough sketch follows below). But I am not sure if this is currently needed, as Entmoot does not currently support different surrogate specs for different outputs. So you could do it independently from each other for now.
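
A rough sketch of that delegation (method names and signatures are simplified here, not BoFire's exact interfaces):

import pandas as pd


class EntingStrategy(PredictiveStrategy):
    def _fit(self, experiments: pd.DataFrame):
        # the strategy itself does no training; it hands the experiments to
        # its LGBM surrogate(s), which own the lightgbm training logic
        for surrogate in self.surrogate_specs.surrogates:
            surrogate.fit(experiments)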

Also, in terms of moving forward - do I have write permissions for the feature/surrogates branch/PR? If so, should I implement the EntingSurrogates there, and if not, should I wait for that pull request to be accepted before creating the EntingSurrogates?

I will try to get that PR through by tomorrow at the latest, but in terms of moving forward, I would recommend that you just create a new branch/PR in which you first implement only the LGBMSurrogate (including tests), as this is the first thing that is needed. In the meantime, I will finalize my PR and you can then implement the EntingSurrogates and the EntingStrategy on top of both.

For the LGBMSurrogate, you can draw inspiration from the XGBSurrogate.

What do you think?

@jduerholt (Contributor)

When we have the EntingStrategy included, we will also refactor the directory structure for the botorch based strategies into strategies/predictives/botorch/.

@TobyBoyne (Collaborator, Author)

I would recommend that you just create a new branch/PR in which you first implement only the LGBMSurrogate (including tests), as this is the first thing that is needed

Happy to work on that! Thanks for all the reviewing and help, very much appreciated!

@TobyBoyne (Collaborator, Author)

I've made an initial commit for the surrogate. However, in order to implement the regression in #285, I need the utils in this PR - any thoughts on how I can overcome that?

@TobyBoyne (Collaborator, Author)

I've been working on implementing changes in the Entmoot codebase. Since this was last discussed, there are a few things that have changed:

  1. We have discussed that it is not necessary to split this into a surrogate. This PR contains a working strategy implementation, and splitting it into a separate surrogate is potential future work.
  2. The typing issue has been partially resolved with the addition of EntingParams. This has been implemented in entmoot - see that repo for more information. The solver still accepts an untyped Dict, but that's somewhat out of our hands, since Pyomo doesn't provide an exhaustive list of arguments.
  3. Other changes include matching entmoot's Constraints to bofire's, and creating an interface for maximisation problems.

TobyBoyne changed the title from "Convert features and constraints to entmoot" to "Implement ENTMOOT in Bofire" on Nov 7, 2023

@jduerholt (Contributor)

Hi @TobyBoyne,

I will do a proper review this evening. Thank you already!

Best,

Johannes

@jduerholt (Contributor) left a comment

We are converging! The biggest obstacle from my perspective is currently a problem with the entmoot installation in the test pipeline. But this looks to me like an entmoot problem and not a bofire problem (see my comments).

Best,

Johannes


import numpy as np
import pandas as pd
import pyomo.environ as pyo

Contributor:

Also the pyomo import has to be moved into the try except.
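
For context, the optional-import pattern being referred to looks roughly like this (the entmoot import paths shown are illustrative, not necessarily the ones used in this PR):

import warnings

try:
    import pyomo.environ as pyo
    from entmoot.models.enting import Enting
    from entmoot.problem_config import ProblemConfig
except ImportError:
    # entmoot or one of its solver dependencies is missing; the strategy
    # module can still be imported, but mapping the strategy will fail
    warnings.warn("entmoot not installed; EntingStrategy is unavailable.")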

Contributor:

Furthermore, there is something strange with the entmoot installation. Looking at the logs of the failing test runs, it installs entmoot but not pyomo, lgbm and gurobipy, which are entmoot's dependencies. I then tried to install entmoot locally on my laptop into an environment without it, and there it also does not install the dependencies. This looks to me like a problem with entmoot. Any idea?

Contributor:

The solver dependencies need to be installed manually.
https://entmoot.readthedocs.io/en/master/installation.html

Contributor:

Ok, then we have to put it into our dependencies for the entmoot option in setup.py, because the strategy, as it is now implemented, is based on pyomo.

Collaborator Author:

I have added these dependencies to setup.py in BoFire. However, these dependencies definitely do need to be fixed on entmoot's side - for example, lightgbm is not installed despite being required, and entmoot should probably just install pyomo anyway.

# This ensures that new points are far from pending candidates
real_experiments = self.experiments
if self.candidates is not None:
    for i in range(len(self.candidates)):

Contributor:

Ok, makes sense. I have to think further about it, but for now we leave it as it is and will merge it like this. So for now, regard this as resolved ;)

preds = self.predict(candidate)
candidate = pd.concat((candidate, preds), axis=1)
as_experiment = self._fantasy_as_experiment(candidate)
self.tell(as_experiment)

Contributor:

In your case here, just using _fit should be fine.

assert (input_values != 0).sum().sum() <= allowed_k


@pytest.mark.slow

Contributor:

This makes sense. Could you check whether gurobi is installed, in the same way that you check that entmoot is installed, and then add another skipif? That would be a bit cleaner; a possible pattern is sketched below. If that's not possible, then we keep it.
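
One way such a skipif could look (the availability check and the test name here are hypothetical, not the PR's actual code):

import importlib.util

import pytest

GUROBI_AVAILABLE = importlib.util.find_spec("gurobipy") is not None


@pytest.mark.skipif(not GUROBI_AVAILABLE, reason="requires gurobipy")
def test_ask_with_gurobi_solver():
    # note: having gurobipy installed does not guarantee a usable license,
    # which is the separate issue discussed further down in this thread
    ...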

def test_propose_optimal_point(common_args):
    # regression test, ensure that a good point is proposed
    np.random.seed(42)
    random.seed(42)

Contributor:

I would prefer passing the seed explicitly to entmoot, by calling strategy._get_seed() and then passing the int to entmoot. As a general note, it would also be better in entmoot to not set the seeds globally but to use numpy's rng functionality, as sketched below.
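
As an illustration of that last point, a local generator keeps the seed scoped instead of mutating global state (strategy._get_seed() is the BoFire method mentioned above; the rest is a sketch):

import numpy as np

seed = strategy._get_seed()            # explicit, reproducible seed from the strategy
rng = np.random.default_rng(seed)      # local RNG instead of np.random.seed(seed)
perturbation = rng.standard_normal(3)  # example draw from the local generator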

@TobyBoyne (Collaborator, Author)

Addressed all above comments. I also raised an issue on ENTMOOT about random state because I think you are definitely correct! cog-imperial/entmoot#31

TobyBoyne requested a review from jduerholt on March 8, 2024 14:20

@jduerholt (Contributor) left a comment

Looks overall very good to me. But somehow tests are still failing. One can see in the test logs that entmoot gets installed, but somehow the stuff in the try/except block fails, so that nothing gets imported. No idea why.

I will try to set the permissions for you such that the test pipeline is triggered directly, without needing us for this, so you can iterate faster.

Review thread (outdated, resolved) on bofire/strategies/predictives/enting.py

@TobyBoyne (Collaborator, Author)

TobyBoyne commented Mar 11, 2024

Two sources of issues that I've found so far:

Entmoot requires gurobipy, even if it isn't being used. This is an issue with Entmoot (cog-imperial/entmoot#32), but I've added it to setup.py for now

pyright isn't happy with domain_to_problem_config. This is because the functions only accept a subset of the possible feature types. To me, the solution would be to make the type hinting in _bofire_feat_to_entmoot less specific. However, this undoes a change that @jduerholt recommended earlier in this PR, so I just wanted to flag it so that you are aware I'm reverting it!

Edit: Actually, domain.inputs.get() returns a list of type AnyFeature, not AnyInput. Is there any way around this?

@TobyBoyne (Collaborator, Author)

I've made some progress. There are quite a few # type: ignores in enting.py which may not all be necessary, but which I used to get around pyright and the above issue about the return types of output.get() and input.get().

However, I'm still getting issues with running the pipeline. On my machine, if I remove the license file, the tests that require Gurobi are skipped. However, the pipeline seems to still be attempting to run these, and then having issues. Any ideas?

(Note that pyomo.common.errors.ApplicationError: Solver (gurobi) did not exit normally error is raised when there is no license file).

@jduerholt (Contributor)

Using type: ignore when using the get methods is fine; Pyright is not able to work with this dynamic behavior. Currently, I cannot start the pipeline - can you resolve the merge conflicts? They came in due to a change by Behrang ;) I will then have a look again at the pipeline and the tests.

@jduerholt (Contributor)

Found the results from the latest pipeline execution here: https://github.com/TobyBoyne/bofire/actions/runs/8236200085/job/22522063543, and I also tried your catch for gurobi locally (I just installed it via pip) without having a license and got this:

[screenshot of the resulting error]

Any ideas?

@jduerholt (Contributor)

Here it is stated that it comes with a trial license that can be used only for small problems: https://pypi.org/project/gurobipy/

@jduerholt (Contributor)

Hi @TobyBoyne,

I thought about it; to get it implemented quickly, just mark the tests in which gurobi is needed as slow and we are done ;)

Please also resolve the merge conflicts and then we can finally get it in.

Best,

Johannes

@TobyBoyne (Collaborator, Author)

Hi @jduerholt, I've added this in and solved merge conflicts. Looks like the tests are all passing...?

Hopefully this should be good to go! Thank you for all of the reviews! Looking forward to the next PR :)

@jduerholt (Contributor) left a comment

Hi @TobyBoyne,

looks good to me. Thanks for all your patience!

@bertiqwerty @R-M-Lee also ready to merge from your side?

Best,

Johannes

@bertiqwerty (Contributor)

Thanks a lot, Toby and Johannes!

bertiqwerty merged commit 86f1220 into experimental-design:main on Mar 15, 2024
10 checks passed

@R-M-Lee (Contributor)

R-M-Lee commented Mar 18, 2024

@jduerholt, @TobyBoyne Thanks to both of you for getting this through. Are we happy with the treatment of the optional entmoot dependencies now, or was this more of a short-term fix and we still have some open to-dos?

TobyBoyne mentioned this pull request on Mar 25, 2024