This repository has been archived by the owner on Feb 28, 2024. It is now read-only.

[MRG] Cleanup skopt/parameter.py with docs and some minor changes #75

Merged
merged 10 commits on Jun 10, 2016

Conversation

MechCoder
Member

Added docs and some minor cosmetics.
Removed prior for now so that we can compare our results with a randomized search where we have some prior knowledge about the candidates. (I understand it might be useful but YAGNI)

@MechCoder
Member Author

@AlexanderFabisch @fabianp Since you have used the library to an extent, we would like to hear your thoughts on this. The idea is to support function calls like this at some point:

gp_optimize(func, [Real(2, 4), ["red", "green"]])

bounds will be either a list of Distribution objects or a list of lists. If it is a list of lists, the type of each parameter will be inferred.
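As a rough illustration of the inference idea (this is a hypothetical sketch, not the PR's actual code; `infer_dimension` is an invented name), a plain list entry could be classified by the types of its elements:

```python
import numbers

def infer_dimension(spec):
    """Guess the dimension type from a plain list specification."""
    if all(isinstance(v, str) for v in spec):
        # A list of strings is treated as a categorical parameter.
        return ("categorical", list(spec))
    if all(isinstance(v, numbers.Integral) for v in spec):
        # All-integer bounds define an integer range.
        return ("integer", min(spec), max(spec))
    # Otherwise fall back to a real-valued range.
    return ("real", float(min(spec)), float(max(spec)))

print(infer_dimension(["red", "green"]))  # ('categorical', ['red', 'green'])
print(infer_dimension([2, 4]))            # ('integer', 2, 4)
```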

(I am just cleaning up the work of @betatim done here #70)

@AlexanderFabisch
Contributor

That's a good idea. Here are some suggestions:

  • Maybe the argument name bounds is a little bit misleading for [Real(2, 4), ["red", "green"]]. You could introduce an additional argument param_distributions.
  • Real should actually be UniformReal. There could be NormalReal or LogReal as well (similar to hyperopt).

However, my main focus at the moment is the optimization of real vectors so I actually don't really care about this use case.

@AlexanderFabisch
Contributor

By the way, is it possible to set a seed in scipy.stats distributions? It would be much better if you could simply pass a seed for the random number generator in the function call, as in sklearn.

@betatim
Member

betatim commented Jun 3, 2016

I would vote to keep it Real (hahaha...) and support non-uniform priors via Real(..., prior=dist.norm(..)) or Real(..., prior=dist.exp(..)) etc. At least that was my thinking behind having a prior parameter. What do you think?

To me it seems worth keeping the prior and the transform separate. The prior represents where you think likely good values are, and the transform takes care of turning the values into something that is "nice" for the optimiser to handle (one-hot encoding, similar ranges).
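A minimal sketch of this separation, with hypothetical names and defaults (not the PR's actual implementation): the prior is sampled in the original space, and the transform warps the samples afterwards for the optimiser.

```python
import numpy as np
from scipy import stats

class Real:
    """Illustrative only: prior and transform kept as separate concerns."""

    def __init__(self, low, high, prior=None, transform=None):
        self.low, self.high = low, high
        # Default prior: uniform over [low, high], expressed in the
        # original (human-readable) space.
        self.prior = prior if prior is not None else stats.uniform(low, high - low)
        # Default transform: identity (no warping for the optimiser).
        self.transform = transform if transform is not None else (lambda x: x)

    def rvs(self, n_samples=1, random_state=None):
        # Sample from the prior in the original space, then warp the
        # values into the space the optimiser works in.
        samples = self.prior.rvs(size=n_samples, random_state=random_state)
        return self.transform(samples)

dim = Real(2, 4)
print(dim.rvs(3, random_state=0))  # three values in [2, 4]
```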

@fabianp

fabianp commented Jun 3, 2016

Hi, thanks for the heads up.

In general I'm skeptical about the need for such a framework (but I could be wrong). In the use cases I'm familiar with, you never need more than to set bounds and/or optionally transform to log space, so this might be overkill. I know RandomizedSearchCV takes a distribution argument, but I've never used it for anything besides uniform sampling (sometimes in log space).

@MechCoder
Member Author

Thanks a lot for your feedback.

@fabianp By skeptical, are you skeptical about the use of priors or the support for categorical parameters? A common use case would be the type of scaling used or the kernel in SVM (I remember the paper on randomized search had something on those lines).

I also think it would be okay to support priors via arguments later.

@AlexanderFabisch Yes, we would pass a random seed to the function call which will then be passed internally to the rvs method and then sampled (as being done currently at https://github.com/scikit-optimize/scikit-optimize/blob/master/skopt/gp_opt.py#L121). Were you meaning this or something else?

@MechCoder
Member Author

@betatim Please review and if you are happy, do merge. I can address the remaining items in follow-up PRs.

@AlexanderFabisch
Contributor

@AlexanderFabisch Yes, we would pass a random seed to the function call which will then be passed internally to the rvs method and then sampled (as being done currently at https://github.com/scikit-optimize/scikit-optimize/blob/master/skopt/gp_opt.py#L121). Were you meaning this or something else?

That is what I meant.

@glouppe
Member

glouppe commented Jun 6, 2016

+1 for decoupling the domain/type of a variable (i.e., real, integer, categorical) from how it should be converted before being fed to the optimizer.

Overall, things should be very simple by default, e.g. Real(2, 4) (no transform, uniform distribution), but extendable in a plug-and-play manner with optional arguments, e.g. Real(2, 4, transformer="log", prior="gaussian").

@glouppe
Member

glouppe commented Jun 6, 2016

Also I agree that we should start with something very simple, i.e. supporting real, integer and categorical types, all assumed to be uniformly distributed, with an optional log transform for the real values. This should cover 80-90% of the use cases.

@glouppe
Member

glouppe commented Jun 6, 2016

Is this ready for review? I am confused as this PR is adding parameters.py while parameter.py (without an s) is already there.

@glouppe
Member

glouppe commented Jun 6, 2016

Could you instead base your PR on master? It is otherwise difficult to see what you are actually changing. Not sure what was wrong with Tim's proposal. (Seems fine to me)

@MechCoder MechCoder force-pushed the cleanup branch 2 times, most recently from 8eaf198 to 544ef4f on June 7, 2016 07:36
@@ -151,6 +178,8 @@ def __init__(self, low, high, prior='uniform', transformer='identity'):

        if transformer == 'identity':
            self.transformer = Identity()
        elif transformer == 'log':
            self.transformer = Log()
Member

Does it make sense to allow a log transform for integers? Not sure we should support it; at the least, the inverse transform should cast back to integers.

Member Author

Maybe not, or at least I can't think of a quick use case.

@MechCoder
Member Author

There are still tests to be added.

@MechCoder MechCoder changed the title Cleanup Real with docs and some minor changes Cleanup skopt/parameter.py with docs and some minor changes Jun 7, 2016
@MechCoder
Member Author

Check out the bug in the last commit :P

@betatim
Member

betatim commented Jun 7, 2016

Woah! Thanks for catching this.

    @abc.abstractmethod
    def rvs(self, n_samples=None, random_state=None):
        """
        Sample points randomly.
Member

Change to "Randomly sample points from the original space" to explain that the samples are not in the warped space.

@MechCoder
Member Author

Great, I'll change it then.

@betatim
Member

betatim commented Jun 8, 2016

My thinking behind splitting the prior and transformer was that there are two problems that need solving:

  1. how to transform values that humans like (e.g. strings as category labels) into values that optimisers like (e.g. one-hot encoded categories)
  2. how to express where I believe the most likely/useful values of a parameter lie (e.g. Gaussian around 5123 with a std of 42)

Personally I find it easier to think about the prior in the original space and then let the computer transform it to the warped space for the optimiser: gauss(5123, std=42) plus StandardScaler() to get it into a sensible range for the optimiser. Maybe this is overkill/just my weird brain :)

@MechCoder
Member Author

Yes, which is why it makes sense to keep sampling in the warped space just for the uniform prior.

@glouppe
Member

glouppe commented Jun 8, 2016

Proposal:

  1. We hide the transformation mechanism. This is internal and users should not have to deal with it.

  2. We define two hardcoded priors for now, "uniform" and "log-uniform". In the latter case, everything is done internally so that the optimizer deals with values in log space, which are then converted back to the original space.

@MechCoder
Member Author

@glouppe @betatim I've done as you have suggested. The tests should be the best way to understand the new API.

@MechCoder MechCoder changed the title [WIP] Cleanup skopt/parameter.py with docs and some minor changes [MRG] Cleanup skopt/parameter.py with docs and some minor changes Jun 9, 2016
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.utils import check_random_state
from sklearn.utils.fixes import sp_version


class Identity(TransformerMixin):
Member Author

Removed because TransformerMixin assumes the input to transform is 2-D.

@MechCoder
Member Author

I have tried to keep the API as simple as possible.

@fabianp

fabianp commented Jun 9, 2016

I like where this is going :-)


    def transform(self, values):
        return np.log(values)

    @abc.abstractmethod
Member

Do we really need abc and all that? Wouldn't a simple raise NotImplementedError be enough?

(This is an open question, I have no strong opinion against one or the other.)

Member

I am -0 on abc

Member Author

The difference is that making it an abc prevents an instance from being created if it does not implement those methods, while raising a NotImplementedError allows the instance to be created.

(I am not a purist and I don't mind removing them)
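The difference can be shown in a few lines; the class names here are invented for illustration:

```python
import abc

class WithABC(abc.ABC):
    @abc.abstractmethod
    def rvs(self, n_samples=None, random_state=None):
        ...

class WithRaise:
    def rvs(self, n_samples=None, random_state=None):
        raise NotImplementedError

# The abc version fails as soon as you try to instantiate a class
# that has not implemented rvs:
try:
    WithABC()
except TypeError:
    print("instantiation blocked")

# The NotImplementedError version happily creates the instance; the
# failure only surfaces later, when rvs is actually called.
broken = WithRaise()
try:
    broken.rvs()
except NotImplementedError:
    print("failure deferred until the call")
```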

@glouppe
Member

glouppe commented Jun 9, 2016

Thanks Manoj, this is looking good!

I think we should add one last test showcasing how non-trivial grids can be defined, e.g. by calling sample_points on grids defined in different but equivalent formats (e.g. Real(0, 1) and (0., 1.)) and checking that, for the same random_state, the same points are yielded.

@glouppe
Member

glouppe commented Jun 9, 2016

For ease of use, we should also accept triples for Real, where the third argument is the prior string. That is:

(0., 1.) -> Real(0., 1., prior="uniform")
(0., 1., "log-uniform") -> Real(0., 1., prior="log-uniform")
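A sketch of how this shorthand could be normalised (the `Real` dataclass and `normalize_dimension` helper here are hypothetical stand-ins, not the library's code):

```python
from dataclasses import dataclass

# Minimal stand-in for the Real class, for illustration only.
@dataclass
class Real:
    low: float
    high: float
    prior: str = "uniform"

def normalize_dimension(dim):
    """Expand the tuple shorthand into a Real object; pass through otherwise."""
    if isinstance(dim, tuple):
        return Real(*dim)
    return dim

print(normalize_dimension((0., 1.)))                 # Real(low=0.0, high=1.0, prior='uniform')
print(normalize_dimension((0., 1., "log-uniform")))  # Real(low=0.0, high=1.0, prior='log-uniform')
```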

-        return self._rvs.rvs(size=n_samples, random_state=random_state)
+        random_vals = self._rvs.rvs(size=n_samples, random_state=random_state)
+        if self.prior == "log-uniform":
+            return np.exp(random_vals)
Member

np.exp(x) -> 10**x to match the np.log10 done earlier.
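The point of the review comment: the inverse transform must match the forward one. A self-contained sketch, with an invented helper name, of log-uniform sampling that is consistent with a base-10 forward transform:

```python
import numpy as np
from scipy import stats

def log_uniform_rvs(low, high, n_samples, random_state=None):
    # Sample uniformly in log10 space...
    log_low, log_high = np.log10(low), np.log10(high)
    log_samples = stats.uniform(log_low, log_high - log_low).rvs(
        size=n_samples, random_state=random_state)
    # ...and map back with 10**x, the inverse of np.log10 (np.exp would
    # only be correct if np.log had been used for the forward transform).
    return 10 ** log_samples

samples = log_uniform_rvs(1e-4, 1.0, 1000, random_state=0)
print(samples.min() >= 1e-4 and samples.max() <= 1.0)  # True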

@MechCoder
Member Author

All right, fixed both the inverse_transform and the tests. We can introduce the tuple shorthand in another PR. Please merge if happy.

@glouppe
Member

glouppe commented Jun 10, 2016

This is great, thank you! Waiting for Travis green light, and then I'll merge.

@glouppe
Member

glouppe commented Jun 10, 2016

I'll address the shorthand.

@glouppe glouppe merged commit ce9b9bc into scikit-optimize:master Jun 10, 2016
@MechCoder
Member Author

Yay! We should also refactor the existing minimize code to make use of the new API.

@glouppe
Member

glouppe commented Jun 10, 2016

I just made #81 with a few thrown ideas.

@betatim
Member

betatim commented Jun 10, 2016

Whoop!
