This repository has been archived by the owner on Feb 23, 2023. It is now read-only.

Non-numeric categorical variables #161

Open
fdtomasi opened this issue Feb 14, 2018 · 3 comments

Comments

@fdtomasi
Member

I would like to use non-numerical categorical variables in GPyOpt.
From the error, I understand such variables are being treated in the same way as DiscreteVariables, but it is not possible to convert them to float.

Minimal example

from sklearn.svm import SVC
from GPyOpt.methods.bayesian_optimization import BayesianOptimization

domain = [{'name': 'kernel', 'type': 'categorical', 'domain': ('linear', 'poly')}]
BayesianOptimization(SVC(), domain=domain)

Output

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-d90c79264e40> in <module>()
      2 
      3 domain = [{'name': 'kernel', 'type': 'categorical', 'domain': ('linear', 'poly')}]
----> 4 BayesianOptimization(SVC(), domain=domain)

/home/fede/src/GPyOpt/GPyOpt/methods/bayesian_optimization.pyc in __init__(self, f, domain, constraints, cost_withGradients, model_type, X, Y, initial_design_numdata, initial_design_type, acquisition_type, normalize_Y, exact_feval, acquisition_optimizer_type, model_update_interval, evaluator_type, batch_size, num_cores, verbosity, verbosity_model, maximize, de_duplication, **kwargs)
    115         self.initial_design_type  = initial_design_type
    116         self.initial_design_numdata = initial_design_numdata
--> 117         self._init_design_chooser()
    118 
    119         # --- CHOOSE the model type. If an instance of a GPyOpt model is passed (possibly user defined), it is used.

/home/fede/src/GPyOpt/GPyOpt/methods/bayesian_optimization.pyc in _init_design_chooser(self)
    189         # Case 1:
    190         if self.X is None:
--> 191             self.X = initial_design(self.initial_design_type, self.space, self.initial_design_numdata)
    192             self.Y, _ = self.objective.evaluate(self.X)
    193         # Case 2

/home/fede/src/GPyOpt/GPyOpt/experiment_design/__init__.pyc in initial_design(design_name, space, init_points_count)
     18         raise ValueError('Unknown design type: ' + design_name)
     19 
---> 20     return design.get_samples(init_points_count)

/home/fede/src/GPyOpt/GPyOpt/experiment_design/random_design.pyc in get_samples(self, init_points_count)
     17             return self.get_samples_with_constraints(init_points_count)
     18         else:
---> 19             return self.get_samples_without_constraints(init_points_count)
     20 
     21     def get_samples_with_constraints(self, init_points_count):

/home/fede/src/GPyOpt/GPyOpt/experiment_design/random_design.pyc in get_samples_without_constraints(self, init_points_count)
     57         samples = np.empty((init_points_count, self.space.dimensionality))
     58 
---> 59         self.fill_noncontinous_variables(samples)
     60 
     61         if self.space.has_continuous():

/home/fede/src/GPyOpt/GPyOpt/experiment_design/random_design.pyc in fill_noncontinous_variables(self, samples)
     44             if isinstance(var, DiscreteVariable) or isinstance(var, CategoricalVariable) :
     45                 sample_var = np.atleast_2d(np.random.choice(var.domain, init_points_count))
---> 46                 samples[:,idx] = sample_var.flatten()
     47 
     48             # sample in the case of bandit variables

ValueError: could not convert string to float: linear
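The failing line in the traceback can be reproduced outside GPyOpt. The snippet below is a minimal sketch, assuming only what the traceback shows: a float array created with `np.empty` is filled from `np.random.choice(var.domain, ...)`. The helper name `try_fill` is hypothetical and exists only for this illustration.

```python
import numpy as np


def try_fill(domain, n_points=3):
    """Mimic the assignment from the traceback for a single variable.

    Returns (samples, None) on success or (None, error_message) on failure.
    """
    # Float array, like `samples` in get_samples_without_constraints.
    samples = np.empty((n_points, 1))
    # Same sampling call as in fill_noncontinous_variables.
    sample_var = np.atleast_2d(np.random.choice(domain, n_points))
    try:
        samples[:, 0] = sample_var.flatten()
    except ValueError as exc:
        return None, str(exc)
    return samples, None


# String domain: NumPy cannot cast the sampled strings into the float array.
_, error = try_fill(('linear', 'poly'))
print(error)

# Numeric domain: the same assignment succeeds.
samples, error = try_fill((0, 1))
print(samples.shape, error)
```

The string domain fails at the assignment because the target array has a float dtype, while a numeric domain passes through the same code path without error.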
@joshring

Write a wrapper function that calls SVC internally, and pass GPyOpt numeric codes that the wrapper maps back to the attribute values you care about.

This way the optimisation method does not need to know the details of the arguments of the function being optimised.
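A minimal sketch of such a wrapper follows. It assumes that GPyOpt calls the objective with a 2D array holding one row per point and expects a column of values back, and that the categorical variable reaches the objective as its integer code; neither is confirmed in this thread. The names `make_objective` and `dummy_score` are hypothetical, and `dummy_score` is a stand-in for a real model score.

```python
import numpy as np

# String options from the issue; the optimiser only ever sees their indices.
KERNELS = ('linear', 'poly')

# Same dict layout as in the issue, but with integer codes as the domain.
domain = [{'name': 'kernel', 'type': 'categorical',
           'domain': tuple(range(len(KERNELS)))}]


def make_objective(score_fn, choices=KERNELS):
    """Wrap score_fn, which takes a kernel name and returns a value to minimise."""
    def objective(x):
        x = np.atleast_2d(x)
        # Column 0 holds the integer code of the kernel for each row.
        values = [score_fn(choices[int(round(row[0]))]) for row in x]
        # Return one value per row as an (n, 1) column.
        return np.array(values, dtype=float).reshape(-1, 1)
    return objective


def dummy_score(kernel):
    # Hypothetical stand-in; a real score might be a cross-validation error.
    return {'linear': 0.25, 'poly': 0.5}[kernel]


objective = make_objective(dummy_score)
print(objective(np.array([[0.0], [1.0]])))
```

In practice `score_fn` could build `SVC(kernel=kernel)` and return an error measure on held-out data, and `objective` would then be passed as the first argument to `BayesianOptimization` together with `domain=domain`.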

@hmanz

hmanz commented Mar 22, 2018

I'm having the same issue when trying to use categorical variables with strings in their domain. @joshring If I understood correctly, you are suggesting writing a wrapper function so that integers (as with a discrete variable) can be used instead of strings.

This brings me to my next question: are discrete variables fitted with a GP, or are they assumed to be categorical with no local correlation?
FYI, this question is similar to the one in this link, but nobody responded to it.

@ahundt

ahundt commented Mar 29, 2018

This might do what you are looking for: #175
