Update Suggest API #510

cthoyt · 2019-09-07T09:33:11Z

There is already a function in the Trial class for suggesting a discrete uniformly distributed float (Trial.suggest_discrete_uniform). It would be nice to have the same option that returns integers instead of floats.

My use case is for network representation learning - I want to suggest the embedding dimension in intervals of 50 between 50-350 (e.g. 50, 100, 150, ... , 300, 350).\

Currently, I can accomplish this by dividing out my interval then suggesting an integer and multiplying it like:

embeddings_dim = 50 * trial.suggest_int('embeddings_dim', 1, 7)

But this loses the power of tracking the value within optuna. Alternatively I could use the following, but the the data type is tracked incorrectly.

embeddings_dim = int(trial.suggest_discrete_uniform('embeddings_dim', 50, 350, q=7))

The text was updated successfully, but these errors were encountered:

sile · 2019-10-01T05:14:21Z

Sorry for the late reply.
we think your proposal is reasonable.
However, instead of adding suggest_discrete_int method, we're planning to reorganize Suggest API as follows:

(a) [NEW_METHOD] Trial.suggest_float(name: str, low: float, high: float, log: bool, q: Optional[float])
- sugegst_float(name, low, high, log=False, q=None) is equivalent to suggest_uniform(name, low, high)
- suggest_float(name, low, high, log=True, q=None) is equivalent to suggest_loguniform(name, low, high)
- suggest_float(name, low, high, log=False, q=0.1) is equivalent to suggest_discrete_uniform(name, low, high, q=0.1)
(b) [MODIFIED] Trial.suggest_int(name: str, low: int, high: int, q: Optional[int]) (
- suggest_int(name, low, high, q=10) is equivalent to Trial.suggest_discret_int(name, low, high, q=10)
(c) [NO_CHANGE] Trial.suggest_categorical(name: str, choices: List[Union[str, float]])
Other existing suggest methods will be deprecated.

Note that the details of the API design is still under consideration (e.g., q may be renamed to a more descriptive name).

If you have any thought about the above new API design (draft), please let us know.

cthoyt · 2019-10-01T07:27:37Z

Looks good to me! Maybe rather than giving log as a Boolean, you might use a keyword argument called “distribution” in case you’d ever like to use something more exotic

sile · 2019-10-02T00:47:44Z

Thank you for your suggestion.
We will consider the idea and create a draft PR within a month (maybe).

cthoyt · 2019-10-02T11:07:46Z

Would you like me to rename this issue, then?

sile · 2019-10-03T00:57:21Z

IMO, either is OK. Feel free to change the title if you want to do it.

hayata-yamamoto · 2020-02-04T05:56:57Z

@sile
Do you have any progress in this?
I'm still looking forward to implementing this feature.

sile · 2020-02-04T06:58:47Z

@hayata-yamamoto

Thank you for looking forward to this feature. But, unfortunately, there has been almost no progress since last year. Though Optuna dev team will be still busy for the coming months, we will keep in mind that this is a wanted feature and try to implement it if possible.

hayata-yamamoto · 2020-02-05T00:04:35Z

@sile
Oh..., I appreciate your informing.

I wonder which solution is likely to be implemented.
If design or approarch were in agreement, contributors (including me) could make this feature instead of core developers, I afraid.

sile · 2020-02-07T01:21:31Z

Hi @hayata-yamamoto, sorry for the delayed response.

To clarify the design and approach, I created a contribution-welcome issue #902 for, so-called, suggest_discrete_int() method. I'd really appreciate it if you (or other contributors) could implement the feature.

hayata-yamamoto · 2020-02-07T06:07:40Z

@sile
Thanks! I'm aiming to develop this feature.

sile · 2020-02-07T06:18:52Z

Great!

kasparthommen · 2020-02-18T11:51:06Z

Hi all,

I think the new suggest_int method should also have a log flag (or, as suggested in another comment, a distribution parameter accepting strings).

For example,

suggest_int(1, 16, log=True)  # or, equivalently: suggest_int(1, 16, distribution='log')

would produce values between 1 and 4 equally likely as values between 4 and 16.

Model parameters that benefit from discrete log-uniform distributions are e.g.:

The number of neighbors in KNN
The number of layers in a neural network
The number of components in a GMM
The number of clusters in K-means
The number of components to retain in a PCA

My current workaround is to round/truncate log-uniform float values via suggest_loguniform, but that's not great.

cthoyt · 2020-02-18T16:33:27Z

@kasparthommen it would be nice to specify the log scale as well.

suggest_int(1, 16, distribution='log', log_base=2)

Such that [1, 2, 4, 8, 16] are generated. Otherwise, it might be assumed the base is 10 and then only [1, 10] are generated

As a question for all: would these 5 values be generated with equal probability? Or is it weighted?

kasparthommen · 2020-02-18T16:45:04Z

Such that [1, 2, 4, 8, 16] are generated.

No, all values between 1 and 16 will be generated, but the probability decreases for larger values.

hayata-yamamoto · 2020-02-19T00:21:31Z

It looks a good suggestion, but should it be implemented as another function?
Comparing other suggest_ functions, I'm afraid suggest_int could be a bit huge than other functions. This difference could lead to users' confusion, I guess.

So, suggest_int should be implemented as corresponded to suggest_discrete_uniform and suggest_uniform. It's seemed that suggest_log_int should be developed apart from suggest_int.

Would it be possible to tell your thoughts, @sile @c-bata ?

kasparthommen · 2020-02-19T07:52:19Z

It looks a good suggestion, but should it be implemented as another function?
Comparing other suggest_ functions, I'm afraid suggest_int could be a bit huge than other functions. This difference could lead to users' confusion, I guess.

My suggestion to add the log parameter to suggest_int does not make it any more complicated than suggest_float IMHO. Can you elaborate why you think it would become a bit "huge"?

So, suggest_int should be implemented as corresponded to suggest_discrete_uniform

I don't really care how it is implemented underneath as long as it provides the functionality I have suggested :-)

It's seemed that suggest_log_int should be developed apart from suggest_int.

I don't agree - I think we should adopt @sile's proposal (2nd post) and extend the suggest_int method a bit.

hayata-yamamoto · 2020-02-19T09:12:16Z

Can you elaborate why you think it would become a bit "huge"?

I'm sorry. This is my wrong :(

I think we should adopt @sile's proposal (2nd post) and extend the suggest_int method a bit.

If adding log paramter to suggest_int is planned, it should be done.
I don't know whether there is a plan or not.

sile · 2020-02-19T10:24:32Z

When I wrote #510 (comment), I didn't plan to add a log-scale option to suggest_int method, but it sounds good idea.

kasparthommen · 2020-02-19T16:39:11Z

@sile Cool, appreciated!

himkt · 2020-03-10T10:42:33Z

Hi, I'd like to work on this issue if it is possible. 😸
(If it's possible, I'll start from implementing methods listed in (a).)

sile · 2020-03-11T01:01:55Z

Sure. Please go for it! > @himkt

github-actions · 2020-04-18T15:01:38Z

This issue has not seen any recent activity.

himkt · 2020-04-21T18:06:19Z

@sile

#510 (comment)

suggest_float is introduced in optuna now
suggest_float supports uniform/loguniform/discreteuniform

Is there something we have to work on in this PR?
The discussions about suggest_int have ended?

sile · 2020-04-22T00:17:34Z

@himkt Thanks!
I think we might be able to address #510 (comment) ( adding log option to suggest_int) within this issue.
What do you think of it?

himkt · 2020-04-23T01:33:57Z

@sile
Thank you for your clarification.
OK, I will work on adding log to suggest_int later.

sile · 2020-04-23T01:41:05Z

@himkt Cool! Thanks!

nzw0301 · 2020-05-01T13:07:59Z

@himkt Could I help you implement suggest_int?

I guess a new suggest_int looks like suggest_int(name: str, low: int, high: int, log: bool, base: int, step: int).
In addition, is it better to implement the following functions?

suggest_loguniform_int(name: str, low: int, high: int, base: int, step: int)
suggest_uniform_int(name: str, low: int, high: int)

as a counterpart of suggest_loguniform and suggest_uniform in suggest_float, respectively:

suggest_int(name, low, high, log=True, base, step) is equivalent to suggest_loguniform_int(name, low, high, base, step)
suggest_int(name, low, high, log=False, base, step=1) is equivalent to suggest_uniform_int(name, low, high)

himkt · 2020-05-01T13:58:33Z

Hi @nzw0301, thank you for your interest!
To be honest, your suggestion sounds helpful to me since I'm a little bit busy now.

I guess a new suggest_int looks like suggest_int(name: str, low: int, high: int, log: bool, base: int, step: int).

I basically agree with you but I think suggest_int doesn't have to support base, which is consistent with suggest_loguniform.

In addition, is it better to implement the following functions?

suggest_loguniform_int(name: str, low: int, high: int, base: int, step: int)

suggest_uniform_int(name: str, low: int, high: int)

I think what we have to do is implementing suggest_intloguniform because suggest_uniform_int is the same as suggest_int (It needs to implement the distribution IntLogUniformDistribution as well).

@nzw0301 What do you think?

@sile @cthoyt @hayata-yamamoto @kasparthommen
Please let me know if I've said something wrong. 🙇

nzw0301 · 2020-05-02T07:05:01Z

Thank you @himkt for your kind reply.

suggest_int doesn't have to support base

You are right! My understanding was wrong.

implementing suggest_intloguniform because suggest_uniform_int is the same as suggest_int (It needs to implement the distribution IntLogUniformDistribution as well).

It sounds fine.

sile · 2020-05-02T08:28:21Z

Thanks, @nzw0301 and @himkt !

IMO, it's sufficient only to extend suggest_int and we don't have to implement suggest_intloguniform (or suggest_uniform_int).
Now, there is suggest_loguniform but it's just a historical reason. It has not been determined yet, but we might deprecate the method (and suggest_uniform) in favor of suggest_float in the future.

I think suggest_int doesn't have to support base, which is consistent with suggest_loguniform.

Agree.

nzw0301 · 2020-05-02T19:42:43Z

@sile Thank you for your reply! I understood.

I'll create a PR to support log argument of suggest_int since @himkt seems too busy.

hayata-yamamoto · 2020-05-03T05:31:51Z

Wonderful!

hayata-yamamoto · 2020-05-06T02:30:18Z

@himkt @nzw0301

I basically agree with you but I think suggest_int doesn't have to support base, which is consistent with suggest_loguniform.

I understand this comment and it's reasonable, I guess.
Anyway, do you have any test case of suggest_int(high, low, step, log=True)?
Since https://github.com/optuna/optuna/pull/1201/files seems not to have test codes yet (I wrote this comment at 2020/05/06), I would like you to share supposed test cases.

To be honest, it's easy to understand if this function has a base argument. This is because it has already shown at #510 (comment)

On the other hand, if you only use natural logarithm to generate samples, it may be hard to understand at least for me. I'm afraid that I cannot remember whether natural logarithm can generate an integer value or not. If other developers have the same feeling, adding base can help us to use this function.

nzw0301 · 2020-05-06T06:49:41Z

@hayata-yamamoto

Add tests

I haven't created tests. Thank you for your kind comment.

`base` argument

I don't think so. Even if base is introduced, suggest_int doesn't return such values. suggest_int with log=True should be a counterpart of suggest_float for consistency. In addition, our implementations will be the almost same as automl/ConfigSpace.

Therefore, we may introduce power: bool and base: int arguments as another PR if optuna supports #510 (comment).

@himkt What do you think?

hayata-yamamoto · 2020-05-06T07:04:50Z

I don't think so. Even if base is introduced, suggest_int doesn't return such values. suggest_int with log=True should be a counterpart of suggest_float for consistency. In addition, our implementations will be the almost same as automl/ConfigSpace.

I've got. That's cool.
I'm looking forward :)

himkt · 2020-05-06T07:18:32Z

@hayata-yamamoto Thank you for sharing your concerns.
And thank you @nzw0301 for clarifying, I agree with you.

Let me put a small note about the option for the logarithm scale.
As #510 (comment) says, all values between low and high will be generated in
suggest_int with log option. The probability decreases for larger values.

A histogram may look like here.

kasparthommen · 2020-05-06T07:40:27Z

The implemention for a log-uniformly distributed integer is actually quite simple, it's int(round(sample_continuous/step))*step where sample_continuous comes from a continuous log-uniform distribution between low-step/2and high+step/2. Here's a script that demonstrates this:

import matplotlib.pyplot as plt
import numpy as np

# this function only implements the log-uniform case
def suggest_int_loguniform(low: int, high: int, step: int=1) -> int:
    # input validation - please replace by optuna's standard assertion/checking mechanisms
    assert low > 0
    assert high >= low
    assert step >= 0

    # adjust 'high' if necessary
    high -= (high - low) % step  # e.g. if low = 1, high = 6, step = 2 then adjust to high = 5

    # define corresponding continuous range and validate that as well
    uniform_low = low - step/2
    uniform_high = high + step/2
    assert uniform_low > 0

    # generate continuous log-uniform sample
    log_low = np.log(uniform_low)
    log_high = np.log(uniform_high)
    sample_continuous = np.exp(np.random.uniform(log_low, log_high))  # <-- the actual value between log_low and log_high needs to come from the sampler, not np.random.uniform

    # convert to int
    sample_int = int(round(sample_continuous/step))*step
    return sample_int


# histogram A: 1...16
np.random.seed(42)
values = []
for i in range(100000):
    values.append(suggest_int_loguniform(1, 16))

plt.figure()
plt.hist(values, bins=50)
plt.show()

# histogram B: 2...16, step=2
values = []
for i in range(100000):
    values.append(suggest_int_loguniform(2, 16, 2))

plt.figure()
plt.hist(values, bins=50)
plt.show()

github-actions · 2020-05-20T23:05:15Z

This issue has not seen any recent activity.

github-actions · 2020-06-07T23:08:45Z

This issue has not seen any recent activity.

sile · 2020-06-11T09:25:56Z

Through the following PRs, the update of the Suggest API has been completed. All of the features will be available by the next major release (v2.0.0) planned for July.
Many thanks to all the contributors involved in this issue! Great work!

Added step to suggest_int: Add step parameter on IntUniformDistribution #910
Introduced suggest_float: Add initial implementation of suggest_float. #1021, Reorder arguments of Trial.suggest_float. #1292
Added step to suggest_float: Support DiscreteUniformDistribution in suggest_float. #1081, suggest_float with step in multi_objective #1205
Added log to suggest_int: Implement log argument for suggest_int. #1201, Implement log argument for suggest_int of ChainerMN integration. #1275, log argument for suggest_int of skopt integration. #1277, Match suggest_float and suggest_int specifications on step and log arguments. #1329

cthoyt changed the title ~~Add suggest_discrete_int~~ Add Trial.suggest_discrete_int Sep 7, 2019

cthoyt changed the title ~~Add Trial.suggest_discrete_int~~ Update Suggest API Oct 3, 2019

toshihikoyanase mentioned this issue Feb 4, 2020

Is it possible to add suggest_discrete_int? #888

Closed

sile mentioned this issue Feb 7, 2020

Add step option to Trial.suggest_int() method. #902

Closed

hayata-yamamoto mentioned this issue Feb 10, 2020

Add step parameter on IntUniformDistribution #910

Merged

himkt mentioned this issue Mar 12, 2020

Add initial implementation of suggest_float. #1021

Merged

toshihikoyanase mentioned this issue Mar 31, 2020

Support discrete parameters by Trial.suggest_float(). #1069

Closed

github-actions bot added the stale Exempt from stale bot labeling. label Apr 18, 2020

himkt mentioned this issue Apr 20, 2020

Support DiscreteUniformDistribution in suggest_float. #1081

Merged

sile removed the stale Exempt from stale bot labeling. label Apr 22, 2020

nzw0301 mentioned this issue May 3, 2020

Implement log argument for suggest_int. #1201

Merged

toshihikoyanase mentioned this issue May 14, 2020

Equivalent of hyperopt's normal/lognormal parameters #1200

Closed

github-actions bot added the stale Exempt from stale bot labeling. label May 20, 2020

hvy removed the stale Exempt from stale bot labeling. label May 22, 2020

github-actions bot added the stale Exempt from stale bot labeling. label Jun 7, 2020

HideakiImamura removed the stale Exempt from stale bot labeling. label Jun 8, 2020

sile closed this as completed Jun 11, 2020

himkt mentioned this issue Mar 7, 2022

Support discrete log distributions #3335

Closed

Update Suggest API #510

Update Suggest API #510

Comments

cthoyt commented Sep 7, 2019

sile commented Oct 1, 2019

cthoyt commented Oct 1, 2019

sile commented Oct 2, 2019 • edited

cthoyt commented Oct 2, 2019

sile commented Oct 3, 2019

hayata-yamamoto commented Feb 4, 2020

sile commented Feb 4, 2020

hayata-yamamoto commented Feb 5, 2020 • edited

sile commented Feb 7, 2020

hayata-yamamoto commented Feb 7, 2020 • edited

sile commented Feb 7, 2020

kasparthommen commented Feb 18, 2020

cthoyt commented Feb 18, 2020 • edited

kasparthommen commented Feb 18, 2020

hayata-yamamoto commented Feb 19, 2020 • edited

kasparthommen commented Feb 19, 2020

hayata-yamamoto commented Feb 19, 2020 • edited

sile commented Feb 19, 2020

kasparthommen commented Feb 19, 2020

himkt commented Mar 10, 2020

sile commented Mar 11, 2020

github-actions bot commented Apr 18, 2020

himkt commented Apr 21, 2020

sile commented Apr 22, 2020 • edited

himkt commented Apr 23, 2020

sile commented Apr 23, 2020

nzw0301 commented May 1, 2020 • edited

himkt commented May 1, 2020

nzw0301 commented May 2, 2020

sile commented May 2, 2020 • edited

nzw0301 commented May 2, 2020

hayata-yamamoto commented May 3, 2020

hayata-yamamoto commented May 6, 2020 • edited

nzw0301 commented May 6, 2020 • edited

Add tests

base argument

hayata-yamamoto commented May 6, 2020

himkt commented May 6, 2020

kasparthommen commented May 6, 2020 • edited

github-actions bot commented May 20, 2020

github-actions bot commented Jun 7, 2020

sile commented Jun 11, 2020

sile commented Oct 2, 2019 •

edited

hayata-yamamoto commented Feb 5, 2020 •

edited

hayata-yamamoto commented Feb 7, 2020 •

edited

cthoyt commented Feb 18, 2020 •

edited

hayata-yamamoto commented Feb 19, 2020 •

edited

hayata-yamamoto commented Feb 19, 2020 •

edited

sile commented Apr 22, 2020 •

edited

nzw0301 commented May 1, 2020 •

edited

sile commented May 2, 2020 •

edited

hayata-yamamoto commented May 6, 2020 •

edited

nzw0301 commented May 6, 2020 •

edited

`base` argument

kasparthommen commented May 6, 2020 •

edited