Gaussian Process for Integer and Categorical Dimensions? #580
If the search space is purely categorical then we use a special kernel. Beyond that, take a look at the transform for categorical dimensions. Personally, I use tree-based optimisers instead of a GP-based optimiser when the space has lots of integers/categories. |
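The transform for categorical dimensions mentioned above is, in essence, a one-hot encoding. A minimal pure-Python sketch of the idea (this is an illustration, not skopt's actual implementation; the function names here are made up):

```python
# Sketch of one-hot encoding for a categorical dimension.
# skopt's real transform lives in skopt.space; this only shows the idea.

def one_hot(value, categories):
    """Encode `value` as a one-hot vector over `categories`."""
    vec = [0.0] * len(categories)
    vec[categories.index(value)] = 1.0
    return vec

def one_hot_inverse(vec, categories):
    """Decode by picking the category with the largest weight."""
    return categories[max(range(len(vec)), key=lambda i: vec[i])]

cats = ['a', 'b', 'c', 'd']
print(one_hot('c', cats))                           # [0.0, 0.0, 1.0, 0.0]
print(one_hot_inverse([0.2, 0.2, 0.2, 0.4], cats))  # 'd'
```

Note that the inverse is not unique: any real-valued vector decodes to whichever category carries the most weight, which is how a GP working in the continuous encoded space can still propose a discrete category.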
I would like to understand this so I can explain it in my tutorial. For Integer dimensions, it appears from the comment (yay!) in the code that they are converted to real values. Is it documented somewhere that all dimensions are converted/transformed internally to real values, or is this transformation only done for Gaussian Processes? For Categorical dimensions, I managed to track down the one-hot transform. But a categorical dimension can have a very non-smooth impact on the objective function. Is the Gaussian Process really able to handle such non-smooth dimensions? Or is that perhaps why you prefer to use tree-based search? |
How each optimisation method handles the encoding/transforming of dimensions is considered an implementation detail. As a result you need to read the code to find out how it is done. This has the advantage that we can change how it works without having to deprecate behaviour first. Your description of what happens for a GP is correct:

In [8]: import numpy as np

In [9]: from skopt.utils import normalize_dimensions

In [10]: space = normalize_dimensions([(4, 13), ('a', 'b', 'c', 'd'), (34., 42.23)])

In [11]: space.rvs(random_state=1)
Out[11]: [[8, 'c', 34.000941304746746]]

In [13]: space.transform(space.rvs(random_state=1))
Out[13]:
array([[ 4.44444444e-01, 0.00000000e+00, 0.00000000e+00,
         1.00000000e+00, 0.00000000e+00, 1.14374817e-04]])

In [14]: space.inverse_transform(np.array([[0.444444444444444445, 0.2, 0.2, 0.2,
    ...:                                    0.4, 1.14374817e-4]]))
Out[14]: [[8, 'd', 34.000941304743911]]
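To make the numbers in the session above concrete, here is the arithmetic done by hand (a sketch of the min-max scaling, not skopt's code):

```python
# Integer dimension (4, 13): min-max normalised to [0, 1].
low, high = 4, 13
x = 8
t = (x - low) / (high - low)
print(t)  # 0.4444..., the first transformed column above

# The inverse transform maps back to the nearest integer in range.
x_back = int(round(t * (high - low) + low))
print(x_back)  # 8

# Real dimension (34.0, 42.23): the same min-max scaling.
r = 34.000941304746746
print((r - 34.0) / (42.23 - 34.0))  # ~1.1437e-04, the last column above
```

So after transformation the GP only ever sees points in the unit cube: integers become evenly spaced real values, categories become one-hot columns, and reals are rescaled.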
Correct. To learn more you can start reading papers by Frank Hutter like for example https://www.cs.ubc.ca/~hutter/papers/10-TR-SMAC.pdf (section 4). I think they were the people who first used trees instead of GPs. I think the best way to understand the trade-offs and caveats of using GPs vs trees is to watch some of the many videos from the various summer schools on black-box optimisation and GPs (google for Nando de Freitas and Sheffield summer school). |
Thank you. It might be useful to add some of this information to doc-strings. |
@Hvass-Labs if you are still interested in the subject, have a look at this very illustrative paper. They modify the kernel to use the information about the integerness of a variable - look at sections 3.1 and 3.2 and at Figure 1 on page 7. |
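The idea from those sections can be sketched very simply: round the integer inputs before they enter the kernel, so the surrogate is exactly constant between consecutive integers. A minimal NumPy illustration (function names and length-scale are my own, not the paper's code):

```python
import numpy as np

def rbf(x1, x2, length_scale=1.0):
    """Standard squared-exponential kernel on real inputs."""
    return np.exp(-((x1 - x2) ** 2) / (2 * length_scale ** 2))

def rbf_integer(x1, x2, length_scale=1.0):
    """Integer-aware variant: round inputs first, so the kernel cannot
    distinguish points that map to the same integer value."""
    return rbf(np.round(x1), np.round(x2), length_scale)

print(rbf(4.3, 4.4))          # < 1.0: the plain kernel sees a difference
print(rbf_integer(4.3, 4.4))  # 1.0: both inputs round to 4
```

This keeps the GP machinery unchanged while making its predictions piecewise constant along the integer dimension, which matches how the objective actually behaves.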
One of the many things that I really like about skopt is that I can mix real-valued, integer and categorical dimensions in the search-space. This is REALLY powerful!
However, I'm not an expert on Gaussian Processes (GP) and Bayesian Optimization (BO), so can you explain to me (and the rest of the users) how you actually do this internally in skopt? All the lectures I have watched on GP and BO always assume a single real-valued variable. How do you model integers and categorical variables using a GP?