Assertion error when accessing the Random Forest object after optimization #1092

Open
jsalva9 opened this issue Jan 29, 2024 · 2 comments

@jsalva9

jsalva9 commented Jan 29, 2024

Description

After running SMAC on an instance or set of instances, I would like to access the final state of the surrogate model (the Random Forest) to study the distributions it estimates for each parameter. In essence, I just want to call the predict_marginalized(X) method of smac.model.random_forest.RandomForest.

The problem is that calling this method raises an assertion error (see the Actual Results section).

If there is another way to get the same information (the learned distribution or contribution of each parameter to the cost), please feel free to share it :)

Steps/Code to Reproduce

Here is the code to reproduce the assertion error.

from typing import Dict, Any
from ConfigSpace.configuration_space import ConfigurationSpace
from ConfigSpace.configuration import Configuration
from ConfigSpace.hyperparameters import UniformFloatHyperparameter as Float
from smac.facade import HyperparameterOptimizationFacade
from smac.scenario import Scenario

import numpy as np


class QuadraticFunction:
    @property
    def configspace(self) -> ConfigurationSpace:
        cs = ConfigurationSpace(seed=0)
        x = Float(name='x', lower=0, upper=5, default_value=2)
        cs.add_hyperparameters([x])

        return cs

    def train(self, config: Configuration, seed: int = 0) -> float:
        """Returns the y value of a quadratic function with a minimum we know to be at x=0."""
        x = config["x"]
        return x**2

def main():
    model = QuadraticFunction()

    # Scenario object specifying the optimization "environment"
    scenario = Scenario(model.configspace, name='test_example', deterministic=True, n_trials=100)

    # Now we use SMAC to find the best hyperparameters
    smac = HyperparameterOptimizationFacade(
        scenario,
        model.train,  
        overwrite=True,
    )

    incumbent = smac.optimize()
    print(f'incumbent: {incumbent}')
    print(f'default cost: {model.train(model.configspace.get_default_configuration())}')
    print(f'cost: {model.train(incumbent)}')


    # Predict using the Random Forest regressor on 10 samples
    smac.get_model(scenario).predict_marginalized(np.random.rand(10,1))

    

if __name__ == '__main__':
    main()

Expected Results

I would expect to be able to call predict_marginalized on the RF object without an assertion error.

Actual Results

I get an assertion error. It seems that the underlying Random Forest object (self._rf) is not available at prediction time.

  File "/home/jssoler/repositories/hyperparam-opt/src/example.py", line 51, in <module>
    main()
  File "/home/jssoler/repositories/hyperparam-opt/src/example.py", line 46, in main
    smac.get_model(scenario).predict_marginalized(np.random.rand(10,1))
  File "/home/jssoler/repositories/hyperparam-opt/hyperparam_opt_env/lib/python3.10/site-packages/smac/model/random_forest/random_forest.py", line 258, in predict_marginalized
    mean_, var = self.predict(X)
  File "/home/jssoler/repositories/hyperparam-opt/hyperparam_opt_env/lib/python3.10/site-packages/smac/model/abstract_model.py", line 221, in predict
    mean, var = self._predict(X, covariance_type)
  File "/home/jssoler/repositories/hyperparam-opt/hyperparam_opt_env/lib/python3.10/site-packages/smac/model/random_forest/random_forest.py", line 199, in _predict
    assert self._rf is not None
AssertionError

Versions

smac 2.0.2

@alexandertornede
Contributor

Hi,

thanks for reporting this!

@dengdifan: Can you please look into this once you have time? Thanks!

@dengdifan
Contributor

Hi,
smac.get_model creates a new surrogate model instance that has not been trained on the observed data: https://github.com/automl/SMAC3/blob/main/smac/facade/hyperparameter_optimization_facade.py#L24

If you need the model that belongs to the SMAC instance, you can access it directly via smac._model: https://github.com/automl/SMAC3/blob/main/smac/facade/abstract_facade.py#L161

If you have any further doubts, please let me know.
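
To make the difference concrete, here is a minimal toy sketch of the pattern described above. It does not use SMAC at all: ToyForest and ToyFacade are hypothetical stand-ins, and only the names get_model, _model and _rf are borrowed from the thread. The assumption being illustrated is that get_model acts as a factory that builds a new, untrained model on every call, while the facade keeps its own model in _model and trains that one during optimization.

```python
from typing import List, Optional, Tuple


class ToyForest:
    """Hypothetical stand-in for a surrogate model; this is not SMAC's RandomForest."""

    def __init__(self) -> None:
        self._rf: Optional[float] = None  # stays None until train() is called

    def train(self, costs: List[float]) -> None:
        # "Fit" the toy model by storing the mean of the observed costs.
        self._rf = sum(costs) / len(costs)

    def predict(self, xs: List[float]) -> Tuple[List[float], List[float]]:
        # Same kind of guard as the assertion in the traceback above.
        assert self._rf is not None
        return [self._rf for _ in xs], [0.0 for _ in xs]


class ToyFacade:
    """Hypothetical facade mirroring the get_model / _model split."""

    def __init__(self) -> None:
        self._model = self.get_model()

    @staticmethod
    def get_model() -> ToyForest:
        # Factory: every call returns a new model that has seen no data.
        return ToyForest()

    def optimize(self) -> None:
        # Only the facade's own model is trained on the observations.
        self._model.train([4.0, 1.0, 0.25])


facade = ToyFacade()
facade.optimize()

# The facade's own model was trained, so prediction works.
mean, var = facade._model.predict([0.1, 0.2])
print(mean)

# A model obtained from the factory is untrained, so the guard fails.
try:
    facade.get_model().predict([0.1, 0.2])
except AssertionError:
    print("fresh model from get_model() is untrained")
```

Applied to the script in the issue, this would presumably mean replacing smac.get_model(scenario) with smac._model after smac.optimize() has returned. Note that the leading underscore marks _model as a private attribute, so it may change between releases.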

@dengdifan dengdifan removed the bug label Feb 26, 2024