
Optuna Suggests the Same Parameter Values in a lot of Trials (Duplicate Trials that Waste Time and Budget) #2021

Open
deathes opened this issue Nov 14, 2020 · 51 comments

Comments

@deathes

deathes commented Nov 14, 2020

Optuna's TPESampler and RandomSampler suggest the same integer values (and possibly the same float and loguniform values) for a parameter more than once for some reason. I couldn't find a way to stop it from suggesting the same values over and over again. Out of 100 trials, quite a few are just duplicates; the count of unique suggested values ends up around 80-90 out of 100 trials. If I include more parameters for tuning, say 3, I even see all 3 of them getting the same values a few times within 100 trials.

It's like this:

[I 2020-11-14 14:44:05,320] Trial 8 finished with value: 45910.54012028659 and parameters: {'min_data_in_leaf': 75}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:07,876] Trial 9 finished with value: 45910.54012028659 and parameters: {'min_data_in_leaf': 75}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:10,447] Trial 10 finished with value: 45831.75933279074 and parameters: {'min_data_in_leaf': 43}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:13,502] Trial 11 finished with value: 46125.39810101329 and parameters: {'min_data_in_leaf': 4}. Best is trial 4 with value: 45805.19030897498.
[I 2020-11-14 14:44:16,547] Trial 12 finished with value: 45910.54012028659 and parameters: {'min_data_in_leaf': 75}. Best is trial 4 with value: 45805.19030897498.

Example code below:

import lightgbm as lgb
import numpy as np
import optuna
from optuna.samplers import TPESampler
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import StratifiedKFold

def lgb_optuna(trial):

    rmse = []

    params = {
        "seed": 42,
        "objective": "regression",
        "metric": "rmse",
        "verbosity": -1,
        "boosting": "gbdt",
        "num_iterations":  1000,
        'min_data_in_leaf':  trial.suggest_int('min_data_in_leaf', 1, 100)
    }

    cv = StratifiedKFold(n_splits=5, shuffle=False)  # random_state is only meaningful when shuffle=True
    for train_index, test_index in cv.split(tfd_train, tfd_train[:,-1]):
        X_train, X_test = tfd_train[train_index], tfd_train[test_index]
        y_train = X_train[:,-2].copy()
        y_test = X_test[:,-2].copy()
        
        dtrain = lgb.Dataset(X_train[:,:-2], label=y_train)
        dtest = lgb.Dataset(X_test[:,:-2], label=y_test)
    
        booster_gbm = lgb.train(params, dtrain, valid_sets=dtest, verbose_eval=False)

        y_predictions = booster_gbm.predict(X_test[:,:-2])
        final_mse = mean_squared_error(y_test, y_predictions)
        final_rmse = np.sqrt(final_mse)
        rmse.append(final_rmse)

    return np.mean(rmse)

study = optuna.create_study(sampler=TPESampler(seed=42), direction='minimize') 
study.optimize(lgb_optuna, n_trials=100) 
@crcrpar crcrpar added and then removed the bug label Nov 15, 2020
@crcrpar
Contributor

crcrpar commented Nov 16, 2020

Ideally, if the suggested parameters have already been suggested in a past trial, the trial should be skipped to speed up hyperparameter optimization, right?

Reproduced output: https://gist.github.com/crcrpar/aba308a1350bb4986276a6c87cf256cb

@deathes
Author

deathes commented Nov 17, 2020

Ideally, if the suggested parameters have already been suggested in a past trial, the trial should be skipped to speed up hyperparameter optimization, right?

Reproduced output: https://gist.github.com/crcrpar/aba308a1350bb4986276a6c87cf256cb

Exactly. If a trial's parameters have already been suggested, it should be skipped to speed up hyperparameter optimization. Or better yet, they shouldn't have been suggested in the first place: the sampler should check the history of suggested hyperparameter values before suggesting a new one.

@crcrpar
Contributor

crcrpar commented Nov 18, 2020

By its nature, TPESampler tends to sample similar values as the number of trials increases, since its Bayesian optimization narrows the search space over time. So I think it's hard to completely avoid suggesting the same values.

That being said, we can deal with this case by checking the suggested values inside the objective function and, if they are not new, terminating the trial.

def objective(trial):
    some_param = trial.suggest_int("some_param", 0, 100)
    # `param_history` records previously suggested values; see the linked gist below for the full code.
    if some_param in param_history["some_param"]:
        raise optuna.exceptions.TrialPruned()
    ...
    # define your evaluation
    return value

The full code with this hack and its result is available: https://gist.github.com/crcrpar/c6d6fdf8112280654884353d6e68f6bb.

Note that this solution might not be general enough to apply to other cases.

@HideakiImamura would have some better ideas.

@hvy
Member

hvy commented Nov 18, 2020

Yes, and pruned past trials are treated differently by different samplers, so it might or might not work. You could also just fail the trials by raising other exceptions or returning an invalid value such as None. The issue is that the sampler is likely to sample similar values in the next trial; TPESampler, for instance, would. As for TPESampler and RandomSampler, these behaviors are by design.

Just based on the code snippet above, let me clarify that if you're working with a discrete search space and your trial budget (n_trials) is larger than or equal to the size of the space, you could also use the GridSampler. You'll avoid re-evaluating the objective function with the same configuration (although it's not 100% guaranteed in distributed settings due to timing issues).
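
A minimal sketch of that GridSampler suggestion, using an illustrative discrete search space and a stand-in objective (the real objective would be the cross-validated RMSE above):

import optuna

# Hypothetical discrete search space sized to match the trial budget.
search_space = {"min_data_in_leaf": list(range(1, 101))}

def objective(trial):
    min_data_in_leaf = trial.suggest_int("min_data_in_leaf", 1, 100)
    # Stand-in for the real cross-validated RMSE computation.
    return (min_data_in_leaf - 50) ** 2

study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.GridSampler(search_space),
)
# With n_trials equal to the grid size, each value is evaluated once
# (not strictly guaranteed under distributed execution, as noted above).
study.optimize(objective, n_trials=100)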

However, to properly address this issue we probably have to work on a new sampler based on a different algorithm.

@github-actions
Contributor

github-actions bot commented Dec 2, 2020

This issue has not seen any recent activity.

@github-actions github-actions bot added the stale label Dec 2, 2020
@aisu-wata0

If anyone cares enough about this in the meantime, I made a class for this.
It prunes trials whose parameters are all equal to those of some existing past trial. You can also allow repeats if you want, and instead of pruning you can return the average value of those past trials, which is useful for nondeterministic objective functions.
Duplicates of running trials can happen with multiprocessing, so I tried to take care of that too. A running trial may still be pruned later, and I don't want trials to be registered as repeats of pruned ones because of the "average value of past trials" feature, so I register them as unfinished_repeats and check up on them during future prunes.

If you really want to prune repeats of pruned trials (or FAIL ones), you can add the PRUNED state to the should_compare_states constructor argument.
If you want to allow some repeats, use the repeats_max argument; the default is no repeats (0).

https://gist.github.com/bfs15/24045ab5e8ad007b4a09f708adfe359f

@github-actions github-actions bot removed the stale label Dec 21, 2020
@github-actions
Contributor

github-actions bot commented Jan 4, 2021

This issue has not seen any recent activity.

@github-actions github-actions bot added the stale label Jan 4, 2021
@github-actions
Contributor

This issue was closed automatically because it had not seen any recent activity. If you want to discuss it, you can reopen it freely.

@cowwoc
Contributor

cowwoc commented Aug 26, 2021

This should really be the default implementation out-of-the-box. Can this issue be reopened?

@not522
Member

not522 commented Aug 31, 2021

Re-opening is welcome.
By the way, it is difficult to exclude duplicated trials as the default behavior. In cases like the following, the parameters can only be checked after all suggestions have been made, so we need to check them manually as above.

import optuna

def objective1(trial):
    x = trial.suggest_int("x", 0, 0)
    return x

def objective2(trial):
    x = trial.suggest_int("x", 0, 0)
    # Here the parameters look identical to objective1's so far, but the trial should not be pruned.

    y = trial.suggest_int("y", 0, 0)
    # We should check after all parameters have been suggested.

    return x + y

study = optuna.create_study()
study.optimize(objective1, n_trials=1)
study.optimize(objective2, n_trials=1)

@cowwoc
Contributor

cowwoc commented Aug 31, 2021

@not522 Maybe we could have a built-in pruner that rejects duplicates and rolls back the number of trials by one automatically.

@jmrichardson

+1 vote to add this functionality

@cowwoc
Contributor

cowwoc commented Sep 30, 2021

Thinking about this further, is this really a problem? While it is quite possible that the same parameter value will get suggested multiple times it seems extremely unlikely that all parameter values will be equal. So long as at least one parameter is different, isn't it worth running the trial?

@jmrichardson

Thinking about this further, is this really a problem? While it is quite possible that the same parameter value will get suggested multiple times it seems extremely unlikely that all parameter values will be equal. So long as at least one parameter is different, isn't it worth running the trial?

IMO, yes this is a problem worth resolving. Here are my concerns:

  • I have many use cases where Optuna optimizes a single parameter, so the trials are often repeats of the same value.
  • Having to code a workaround to handle repeats shifts the burden to user code, which introduces complexity.
  • Unfortunately, the way my project is coded, the parameter suggestions are contained in an eval method, which makes it impossible to intercept them before the function executes. Therefore, the function will run and consume resources.

I don't see any value in Optuna running the same trial multiple times. In my humble opinion, it just makes sense to handle within the lib.

@gpogorzelski

I think @jmrichardson is right. In my work, after training for many iterations I collect, say, the 10 best-performing models and evaluate them further (do they cluster the data? etc.). Out of those 10 I select the final candidate. With a lot of trials I tend to get all 10 models with the same parameters. Obviously I implemented a workaround for this, but I believe it should be handled within the library.

@blazespinnaker

blazespinnaker commented Aug 5, 2022

This one is a bit tricky to do with distributed Optuna, as we'll need to query the storage for the historical parameters. I was using it for feature selection, and it's a very serious problem: the sampler gets stuck in exploitation and wastes a crazy amount of cycles. I was very surprised to see this behavior and had assumed that the sampler naturally handled this use case.

Considering the goal of Optuna is to "efficiently search large spaces", this seems like it should be a high-priority bug.

I think a good way to handle this might be to compute a stable hash of the parameters and put it into the storage.
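
For illustration, a rough sketch of that idea: compute a stable hash of the parameters and record it as a trial user attribute, so other workers can see it through the shared storage. The function and attribute names here are made up, and the check only narrows the race window in distributed runs rather than eliminating it.

import hashlib
import json

import optuna

def param_hash(params: dict) -> str:
    # Stable across processes: sort keys and serialize before hashing.
    return hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()

def objective(trial):
    x = trial.suggest_int("x", 0, 100)

    h = param_hash(trial.params)
    seen = {
        t.user_attrs.get("param_hash")
        for t in trial.study.get_trials(deepcopy=False)
        if t.number != trial.number
    }
    trial.set_user_attr("param_hash", h)  # visible to other workers via the storage
    if h in seen:
        raise optuna.TrialPruned()  # skip re-evaluating a duplicate

    return (x - 42) ** 2  # stand-in objective

study = optuna.create_study()
study.optimize(objective, n_trials=50)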

@ezhang7423

Hi all, what's the current status of this? I'm very interested in a tighter integration of this feature into optuna.

@not522
Member

not522 commented Oct 18, 2022

In black-box optimization, the objective function may be "noisy", i.e., its values include randomness (e.g., https://arxiv.org/abs/2105.08195). For such problems, re-evaluating the objective function can be informative. Please note that samplers don't know whether the objective function is noisy or not.
When the objective function is deterministic, you can make it return the same value as a previous trial with the same parameters, as in the following example.

import optuna
from optuna.trial import TrialState

def objective(trial):
    x = trial.suggest_int("x", 0, 1)
    y = trial.suggest_int("y", 0, 1)

    for previous_trial in trial.study.trials:
        if previous_trial.state == TrialState.COMPLETE and trial.params == previous_trial.params:
            print(f"Duplicated trial: {trial.params}, return {previous_trial.value}")
            return previous_trial.value

    return x + y

study = optuna.create_study()
study.optimize(objective, n_trials=10)

@teneon

teneon commented Nov 1, 2022

This "bug" is the first thing I noticed when I started using/testing Optuna. If you are optimizing just one or two parameters, it will almost surely select exactly the same one (or two) values multiple times across your trials. Imagine you are training a huge neural network and each trial takes a lot of time; now out of 100 trials, you have repeated e.g. 30% of them. That is such a waste of time and resources. Hopefully it gets implemented/fixed soon.

@louis-she
Contributor

louis-she commented Nov 4, 2022

In black-box optimization, the objective function may be "noisy", i.e., its values include randomness (e.g., https://arxiv.org/abs/2105.08195). For such problems, re-evaluating the objective function can be informative. Please note that samplers don't know whether the objective function is noisy or not. When the objective function is deterministic, you can make it return the same value as a previous trial with the same parameters, as in the following example.

import optuna
from optuna.trial import TrialState

def objective(trial):
    x = trial.suggest_int("x", 0, 1)
    y = trial.suggest_int("y", 0, 1)

    for previous_trial in trial.study.trials:
        if previous_trial.state == TrialState.COMPLETE and trial.params == previous_trial.params:
            print(f"Duplicated trial: {trial.params}, return {previous_trial.value}")
            return previous_trial.value

    return x + y

study = optuna.create_study()
study.optimize(objective, n_trials=10)

Parameter tuning makes much more sense when the objective function is deterministic, because if it is not, why not just run the same code repeatedly without any auto-tuning tools?

I think there should at least be an option to control this behavior.

@michaelwayman

+1. I am here because I was running a study to determine which initializer worked best. Each trial takes quite a while, so when I saw that all 3 trials ran the same initializer I was very confused. My assumption was that Optuna would use a different categorical value for the initializer in each trial, especially since it was the only parameter.
The comments here even gave a few good suggestions. I will open a PR myself if I think it has a chance of being merged.

@optuna is there a solution I could open a PR for that you would find acceptable?

@contramundum53
Member

contramundum53 commented Jan 19, 2023

I think it is important for the TPE algorithm to "see" the duplicate trials to converge to the desired solution. However, if you know that the objective function is deterministic, you can memoize it to save time. #4352 provides a helper function for this.

@cowwoc
Contributor

cowwoc commented Jan 19, 2023

@contramundum53 if I understand correctly, you are caching the output of the last input. I would expect caching of multiple values. Ideally, I'd want to cache the X most frequent inputs where X could be infinity.

@contramundum53
Member

@cowwoc No, Optuna keeps the entire history of previous evaluations, and #4352 simply searches that history for an exact match. This is equivalent to the X = infinity case.

@contramundum53
Member

Sorry if the name last_value_if_duplicate is misleading; it indicates that if there happen to be multiple identical inputs (still possible with distributed optimization), it picks the last one.

@cowwoc
Contributor

cowwoc commented Jan 19, 2023

Sorry if the name last_value_if_duplicate is misleading; it indicates that if there happen to be multiple identical inputs (still possible with distributed optimization), it picks the last one.

Right, that's what confused me. Maybe it should be called last_duplicate_value? And actually, if this mechanism is in play, then why are duplicate values being recorded at all? Shouldn't only the first duplicate get logged?

@github-actions github-actions bot removed the stale label Jan 19, 2023
@contramundum53
Member

contramundum53 commented Jan 26, 2023

In a distributed environment, it is possible that multiple trials with the same parameters are sampled "at the same time", before their results are told back to the study. With some more engineering, we could also detect duplicate RUNNING trials and fail the second ones, but I haven't implemented that in the above PR (and I was not sure that failing the duplicate trial would be correct in that case).

@camall3n

This is a huge issue for distributed optimization. I'm seeing 10 workers, each running identical params, and each getting the same result, then all 10 workers sample another set of identical params and run those with the same result, etc.

@not522
Member

not522 commented Jan 27, 2023

This is a huge issue for distributed optimization. I'm seeing 10 workers, each running identical params, and each getting the same result, then all 10 workers sample another set of identical params and run those with the same result, etc.

That could be a different problem from the one we have been discussing. I guess you fixed the samplers' seed, so the samplers suggest exactly the same parameters. I recommend seed=None for distributed optimization to avoid this problem.

@camall3n

camall3n commented Jan 27, 2023

The situation I described was with seed=None, but I take your point. Even with seed=None, if you create the study before spawning the workers, all workers will use a copy of the parent process state, which means their studies will have the same seed. But if you wait for the workers to create the study, they'll try to create it at the same time and clobber each other.

I was able to resolve the issue by calling create_study() once in the parent process, then making additional create_study(load_if_exists=True) calls, one in each worker, and initializing each worker's sampler with a different seed.

Here's the gist: https://gist.github.com/camall3n/4b0272e547b0e5df05777ebed446b69f
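
For reference, a minimal sketch of that pattern; the storage URL, study name, rank variable, and objective below are placeholders:

import optuna

storage = "sqlite:///example.db"  # shared storage (DB or journal file) for all workers
study_name = "distributed-hpo"

# Parent process: create the study exactly once.
optuna.create_study(study_name=study_name, storage=storage, direction="minimize")

# Each worker (rank = 0, 1, 2, ...): load the same study with a distinct sampler seed.
def run_worker(rank: int) -> None:
    study = optuna.create_study(
        study_name=study_name,
        storage=storage,
        direction="minimize",
        sampler=optuna.samplers.TPESampler(seed=1000 + rank),
        load_if_exists=True,
    )
    study.optimize(lambda trial: (trial.suggest_int("x", 0, 100) - 42) ** 2, n_trials=25)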

@gabriel-vanzandycke

gabriel-vanzandycke commented Jan 27, 2023

It drives me crazy that the same point in the hyper-parameter space is sampled multiple times (see trials 18, 20, 21 and 22):

|     17 | [0.31024930747922436, 0.9739130434782609] | 2023-01-27 15:52:51 | 2023-01-27 15:59:02 | 0:06:11.199076 | {'min_distance_cm': 100, 'min_distance_px': 50}  | COMPLETE |
|     18 | [0.55625, 0.8811881188118812]             | 2023-01-27 15:53:00 | 2023-01-27 16:01:19 | 0:08:18.946978 | {'min_distance_cm': 50, 'min_distance_px': 100}  | COMPLETE |
|     19 | [0.31024930747922436, 0.9739130434782609] | 2023-01-27 15:55:47 | 2023-01-27 16:01:53 | 0:06:06.252989 | {'min_distance_cm': 100, 'min_distance_px': 50}  | COMPLETE |
|     20 | [0.55625, 0.8811881188118812]             | 2023-01-27 15:59:04 | 2023-01-27 16:04:59 | 0:05:55.289485 | {'min_distance_cm': 50, 'min_distance_px': 100}  | COMPLETE |
|     21 | [0.55625, 0.8811881188118812]             | 2023-01-27 15:59:39 | 2023-01-27 16:05:55 | 0:06:16.187209 | {'min_distance_cm': 50, 'min_distance_px': 100}  | COMPLETE |
|     22 | [0.55625, 0.8811881188118812]             | 2023-01-27 16:01:21 | 2023-01-27 16:09:17 | 0:07:55.919076 | {'min_distance_cm': 50, 'min_distance_px': 100}  | COMPLETE |
|     23 | [0.20555555555555555, 0.9568965517241379] | 2023-01-27 16:01:54 | 2023-01-27 16:07:51 | 0:05:56.547381 | {'min_distance_cm': 50, 'min_distance_px': 25}   | COMPLETE |

I'm using GridSampler({"min_distance_cm": range(50, 101, 25), "min_distance_px": range(25, 101, 25)}) as suggested by hvy.

(Note: ignore the trial indices relative to the search-space size; several trials failed before number 17.)

@contramundum53
Member

contramundum53 commented Jan 31, 2023

@gabriel-vanzandycke Which version of Optuna are you using? This problem with GridSampler is supposed to be fixed by #2783 (released in v2.9), except for a few trials at the end.

@classner

Hi!

Great discussion! I would very much argue we need a solution for this problem, ideally within Optuna. I totally see that there is no one-size-fits-all solution, and users will have to make a choice about this (with sensible defaults, ideally). Ax, for example, has a flag that lets you specify whether the experiment should be regarded as deterministic.

It is important to find a solution because the phenomenon of repeated experiment values is quite prevalent from what I can see, and it burns very real time and money. I am using Optuna through the plugin for hydra, hence I don't have an overview of the trials to fall back on recent results (the experiments themselves are not aware of the hyperparameter optimization process), so a configurable solution from within Optuna would be amazing to have!

@AliAkhgari

Use this pruner:

import optuna
from optuna import logging
from optuna.pruners import BasePruner

_logger = logging.get_logger(__name__)


class DuplicateIterationPruner(BasePruner):
    """
    DuplicateIterationPruner

    Pruner to detect duplicate trials based on their parameters.

    This pruner identifies and prunes trials that have the same set of parameters
    as a previously completed trial.
    """

    def prune(
        self, study: "optuna.study.Study", trial: "optuna.trial.FrozenTrial"
    ) -> bool:
        completed_trials = study.get_trials(states=[optuna.trial.TrialState.COMPLETE])

        for completed_trial in completed_trials:
            if completed_trial.params == trial.params:
                return True

        return False
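
One possible way to wire it up, assuming the DuplicateIterationPruner class above is in scope; the pruner is consulted from inside the objective via trial.should_prune() once all parameters have been suggested:

import optuna

def objective(trial):
    x = trial.suggest_int("x", 0, 10)
    y = trial.suggest_int("y", 0, 10)

    # Ask the study's pruner (DuplicateIterationPruner) only after all
    # parameters have been suggested, so the full parameter set is compared.
    if trial.should_prune():
        raise optuna.TrialPruned()

    return (x - 3) ** 2 + (y - 7) ** 2  # stand-in objective

study = optuna.create_study(pruner=DuplicateIterationPruner())
study.optimize(objective, n_trials=50)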

@contramundum53
Member

contramundum53 commented Sep 4, 2023

@classner In case you cannot modify the code of the objective function and you cannot use pruners, you can write a sampler wrapper that hooks sample_relative and retries it until a new set of parameters is sampled.
It only works for samplers that use sample_relative rather than sample_independent (e.g. TPESampler with multivariate=True, QMCSampler, CmaEsSampler, BoTorchSampler), and you cannot use a dynamic search space, but that's all we can do given the current interface.
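
A rough sketch of such a wrapper under those constraints; the class name, retry logic, and retry limit are my own, not an Optuna API, and only parameters sampled jointly via sample_relative are de-duplicated:

import optuna
from optuna.samplers import BaseSampler
from optuna.trial import TrialState

class RetryOnDuplicateSampler(BaseSampler):
    """Hypothetical wrapper: re-draws from the wrapped sampler's sample_relative
    until the joint sample differs from every completed trial (or retries run out)."""

    def __init__(self, base_sampler: BaseSampler, max_retries: int = 20) -> None:
        self._base = base_sampler
        self._max_retries = max_retries

    def infer_relative_search_space(self, study, trial):
        return self._base.infer_relative_search_space(study, trial)

    def sample_relative(self, study, trial, search_space):
        seen = {
            tuple(sorted(t.params.items()))
            for t in study.get_trials(deepcopy=False, states=(TrialState.COMPLETE,))
        }
        params = self._base.sample_relative(study, trial, search_space)
        for _ in range(self._max_retries):
            if tuple(sorted(params.items())) not in seen:
                break
            params = self._base.sample_relative(study, trial, search_space)
        return params

    def sample_independent(self, study, trial, param_name, param_distribution):
        # Parameters outside the relative search space pass through unchanged,
        # so dynamic search spaces are not de-duplicated (see the caveat above).
        return self._base.sample_independent(study, trial, param_name, param_distribution)

study = optuna.create_study(
    sampler=RetryOnDuplicateSampler(optuna.samplers.TPESampler(multivariate=True))
)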

@artpumpkin

+1

@tniveej

tniveej commented Jan 5, 2024

Hi!

Great discussion! I would very much argue we need a solution for this problem, ideally within Optuna. I totally see that there is no one-size-fits-all solution, and users will have to make a choice about this (with sensible defaults, ideally). Ax, for example, has a flag that lets you specify whether the experiment should be regarded as deterministic.

It is important to find a solution because the phenomenon of repeated experiment values is quite prevalent from what I can see, and it burns very real time and money. I am using Optuna through the plugin for hydra, hence I don't have an overview of the trials to fall back on recent results (the experiments themselves are not aware of the hyperparameter optimization process), so a configurable solution from within Optuna would be amazing to have!

I agree with this. I use Optuna through Ray and also find it hard to implement such changes.

@contramundum53
Member

contramundum53 commented Jan 10, 2024

@tniveej I think you can write a sampler wrapper that removes duplicate parameters, which you can then plug into the sampler argument of Ray's OptunaSearch class.
I'll write one when I have time.
(For hydra, you might need to add the path to PYTHONPATH so that hydra sees your sampler.)

@drozzy

drozzy commented Jan 10, 2024

In RL you have stochastic policies and stochastic environments.
The value of the objective function can vary wildly between runs, even with exactly the same hyperparameters.

@Loccret

Loccret commented Jan 24, 2024

Hello everyone,

I understand that, mathematically, Bayesian optimization can converge to a local optimum. Does this also occur in Optuna?
Additionally, if convergence is observed (say, the same hyperparameters are chosen around 10 times), is it possible to switch to random sampling to potentially locate a better point than the current best?

@contramundum53
Member

contramundum53 commented Jan 24, 2024

TPE can converge to a local optimum, but it can, and very often does, sample the same parameters multiple times long before convergence (if all parameters are discrete).
I think you can switch to random sampling after convergence (or, even better, restart) if you observe 10 identical parameter sets in a row.
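
One way to act on that, sketched as a callback that stops the study once the last N completed trials share identical parameters; the study can then be reloaded from its storage with a RandomSampler (storage URL, study name, and objective are placeholders):

import optuna
from optuna.trial import TrialState

N_CONSECUTIVE = 10  # threshold suggested above

def stop_when_stuck(study: optuna.Study, trial: optuna.trial.FrozenTrial) -> None:
    completed = study.get_trials(deepcopy=False, states=(TrialState.COMPLETE,))
    if len(completed) < N_CONSECUTIVE:
        return
    last = completed[-N_CONSECUTIVE:]
    if all(t.params == last[0].params for t in last):
        study.stop()  # let the caller resume with a different sampler

def objective(trial):
    return (trial.suggest_int("x", 0, 5) - 3) ** 2  # stand-in objective

storage = "sqlite:///example.db"
study = optuna.create_study(study_name="tpe-run", storage=storage, direction="minimize")
study.optimize(objective, n_trials=100, callbacks=[stop_when_stuck])

# Resume the same study with random sampling (or BruteForceSampler) to keep exploring.
study = optuna.load_study(study_name="tpe-run", storage=storage,
                          sampler=optuna.samplers.RandomSampler())
study.optimize(objective, n_trials=50)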

@contramundum53
Member

contramundum53 commented Jan 24, 2024

A decent implementation of Gaussian-process-based Bayesian optimization would do that much less often (if the objective function is deterministic and the sampler correctly infers so). However, BoTorchSampler currently does not perform well if you have a categorical parameter.

We are implementing a Gaussian process sampler ourselves right now (#5185).

nhuet added a commit to nhuet/discrete-optimization that referenced this issue Mar 5, 2024
…d their hyperparameters

- update OptunaPruningSingleFitCallback `report_nb_steps` param
   - rename into `optuna_report_nb_steps` to identify it more simply in
     kwargs
   - give it a default value if user does not define it in optuna script

- add a full example with
   - solvers_to_test: list of classes of solver to test
   - kwargs_fixed_by_solver: kwargs to pass to __init__, init_model, and solve,
      except for hyperparameters
   - solvers_by_name: mapping string -> solver class (by default using
     solver_class.__name__ but should be overriden if same names for
     different solvers)
   - problem: defining the problem to solve
   - using default objectives by default, and deducing from it
     - direction ("minimize" or "maximize")
     - objective name to display in optuna dashboard
   - objective(trial) function: can be let as is
   - study_name: name given to the study
   - storage_path: path to the file used to log the study (can be the
     same for several studies). For easy parallelisation, can be a NFS
     path. A JournalFileStorage will be created at that path if none
      exists yet.
   - optuna_nb_trials: number of trials to be executed by optuna
      if relaunched, another batch of same number of trials will be added
      if parallelized, each node/process will launch this same number of trials
   - seed: fixed to get reproducible results. But should be None for
     parallelization to avoid having same trials for each node/process.
   - duplicate trials
      - failed trials: by default, they are ignored by optuna, so we
        explicitly prune following trials with exactly the same
        hyperparameters
      - complete trials: TPESampler can (and will when converging on
	categorical hyperparameters) suggest duplicate trials. In that
        case we raise a warning and simply return the previously
        computed fit, as we are fully deterministic. See https://optuna.readthedocs.io/en/stable/faq.html#how-can-i-ignore-duplicated-samples
        and also optuna/optuna#2021

The optuna results can be monitored by

	optuna-dashboard optuna-journal.log

(if storage_path let as is)
g-poveda pushed a commit to airbus/discrete-optimization that referenced this issue Mar 5, 2024
…d their hyperparameters

@contramundum53
Member

contramundum53 commented Apr 4, 2024

In v3.6, GPSampler(deterministic_objective=True) should solve this problem.

(If it still suggests duplicate parameters, it is due to precision loss in GP.
It is GP's way of saying "I don't think any other parameter can improve the objective function". You should switch to BruteForceSampler after that.)
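
For reference, a minimal example of that (GPSampler and its deterministic_objective flag exist from Optuna v3.6; the direction is illustrative):

import optuna

study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.GPSampler(deterministic_objective=True),
)
# If duplicates still appear near the end, optuna.samplers.BruteForceSampler
# can finish off a small discrete space, as suggested above.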

@oasidorshin

I recently opened another issue about this (I didn't know about this one), and wanted to ask why the solution is still user-added boilerplate code from the FAQ. It would be much easier to implement this as another parameter for the sampler, i.e. the previously mentioned last_value_if_duplicate (#4352).

I want to mention that in machine learning there is a huge problem of copy-pasted boilerplate code for everything, and right now, instead of a simple parameter, users have to remember to insert several lines into their objective function.

I suggest reconsidering this decision, given how many users are asking for a built-in solution in this issue. Thank you in advance! @contramundum53 @nzw0301

@not522
Member

not522 commented May 14, 2024

Thank you for your feedback. We see this issue as a high-demand problem with a lot of interest. However, how to resolve this issue needs a lot of discussion and we have yet to do anything other than put a work-around in the FAQ. #4352 is an option, but in exchange for easier implementation, it compromises the simplicity of the API.
Please feel free to send us your requests, but please understand that we are very careful about adding APIs so as not to lose the simplicity of Optuna.
