Add a super quick f-ANOVA algorithm named PED-ANOVA #5212

nabenabe0928 · 2024-01-29T14:56:11Z

Motivation

Since f-ANOVA is quite slow when there are many trials, I implemented a fast f-ANOVA algorithm.

Paper: PED-ANOVA: Efficiently Quantifying Hyperparameter Importance in Arbitrary Subspaces

Description of the changes

I added a super quick f-ANOVA algorithm.
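For readers unfamiliar with the idea, the paper scores each hyperparameter by the Pearson divergence between the one-dimensional density of the top-performing trials and a baseline density. The following is a minimal sketch of that idea only, not the code in this PR: it swaps in plain histograms for the Scott-bandwidth Parzen estimator, and `histogram_pdf`, `pearson_divergence`, and `toy_importances` are hypothetical names chosen for illustration. The scores are normalized to sum to one purely for readability.

```python
import random


def histogram_pdf(values, low, high, n_steps, eps=1e-12):
    """Discretize [low, high] into n_steps bins and return bin probabilities."""
    counts = [eps] * n_steps  # tiny constant keeps every bin strictly positive
    width = (high - low) / n_steps
    for v in values:
        idx = min(int((v - low) / width), n_steps - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def pearson_divergence(p, q):
    """Discrete Pearson divergence: sum_i q_i * (p_i / q_i - 1)^2."""
    return sum(q_i * (p_i / q_i - 1.0) ** 2 for p_i, q_i in zip(p, q))


def toy_importances(trials, bounds, quantile=0.1, n_steps=20):
    """trials: list of (params, value) pairs; a lower value is better."""
    ranked = sorted(trials, key=lambda t: t[1])
    top = ranked[: max(2, int(len(ranked) * quantile))]
    raw = {}
    for name, (low, high) in bounds.items():
        # Density of the best trials versus density of all trials (assumed baseline).
        p_top = histogram_pdf([p[name] for p, _ in top], low, high, n_steps)
        p_all = histogram_pdf([p[name] for p, _ in ranked], low, high, n_steps)
        raw[name] = pearson_divergence(p_top, p_all)
    total = sum(raw.values())
    return {name: score / total for name, score in raw.items()}


# Toy data: x0 dominates the objective, x1 barely matters.
rng = random.Random(0)
bounds = {"x0": (-5.0, 5.0), "x1": (-5.0, 5.0)}
trials = []
for _ in range(500):
    params = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
    trials.append((params, params["x0"] ** 2 + 0.01 * params["x1"] ** 2))

importances = toy_importances(trials, bounds)
print(importances)
```

Each parameter needs only one sort and two one-dimensional density estimates, with no tree ensemble to fit, which gives some intuition for why this approach scales well with the number of trials.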

I compared the runtimes of PED-ANOVA with the Cython implementation of f-ANOVA:

| Setting | f-ANOVA | PED-ANOVA |
|---|---|---|
| dim=2, n_trials=100 | 0.06 | 0.04 |
| dim=2, n_trials=1000 | 0.7 | 0.01 |
| dim=2, n_trials=10000 | 16.09 | 0.04 |
| dim=8, n_trials=100 | 0.09 | 0.02 |
| dim=8, n_trials=1000 | 1.6 | 0.02 |
| dim=8, n_trials=10000 | 92.7 | 0.1 |
| dim=32, n_trials=100 | 0.2 | 0.06 |
| dim=32, n_trials=1000 | 2.3 | 0.1 |
| dim=32, n_trials=10000 | 93.1 | 0.44 |
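
To put the runtimes above in perspective, the speedup ratios can be computed directly from the reported numbers. This is only a small helper script over the figures in the table; the `runtimes` and `speedups` names are illustrative, and the ratios are unitless.

```python
# (dim, n_trials) -> (f-ANOVA runtime, PED-ANOVA runtime), copied from the table.
runtimes = {
    (2, 100): (0.06, 0.04),
    (2, 1000): (0.7, 0.01),
    (2, 10000): (16.09, 0.04),
    (8, 100): (0.09, 0.02),
    (8, 1000): (1.6, 0.02),
    (8, 10000): (92.7, 0.1),
    (32, 100): (0.2, 0.06),
    (32, 1000): (2.3, 0.1),
    (32, 10000): (93.1, 0.44),
}

# Speedup = f-ANOVA runtime divided by PED-ANOVA runtime.
speedups = {key: fanova / ped for key, (fanova, ped) in runtimes.items()}

for (dim, n_trials), ratio in speedups.items():
    print(f"dim={dim:>2}, n_trials={n_trials:>5}: {ratio:7.1f}x")
```

The gap widens as `n_trials` grows: at 100 trials the two methods are within a small factor of each other, while at 10000 trials the ratio reaches the hundreds.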

* Separate a file for efficient Parzen estimator
* Raise value error when there are 2< samples better than baseline
* Fix an import error and remove zero weights from Parzen estimator
* Rename step to n_steps
* Add a small number to categorical weights to avoid numerical errors
* Rename efficient pe to scott pe
* Add documentation for PED-ANOVA
* Keep only quantile filters for now

eukaryo · 2024-01-30T07:38:25Z

Just a quick note: I wrote a simple example and confirmed that the current PR works as expected.

import time

import optuna


def objective(trial):
    xs = [trial.suggest_float(f"x_{d}", -5.0, 5.0) for d in range(4)]
    norm = sum([x_i * x_i for x_i in xs]) ** 0.5
    if norm <= 1:
        ws = [5.0**a for a in [-3, 0, -1, -2]]
    else:
        ws = [5.0**a for a in [0, -1, -2, -3]]
    return sum([ws[i] * xs[i] * xs[i] for i in range(4)])


sampler = optuna.samplers.TPESampler(seed=12345)
study = optuna.create_study(directions=["minimize"], sampler=sampler)
study.optimize(objective, n_trials=100)
print("")
print(f"{study.best_trial.params=}")

t1 = time.time()
print("")
print(f"{optuna.importance.FanovaImportanceEvaluator(seed=12345).evaluate(study)=}")
t2 = time.time()
print("")
print(
    f"{optuna.importance.PedAnovaImportanceEvaluator(is_lower_better=True).evaluate(study)=}"
)
t3 = time.time()
print("")
print(f"(not fast) F-Anova consumed {t2-t1} seconds.")
print(f"         PED-Anova consumed {t3-t2} seconds.")

codecov · 2024-02-01T13:07:06Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (60498c5) 89.38% compared to head (17006ee) 89.37%.
Report is 189 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5212      +/-   ##
==========================================
- Coverage   89.38%   89.37%   -0.01%     
==========================================
  Files         206      209       +3     
  Lines       15097    13119    -1978     
==========================================
- Hits        13494    11725    -1769     
+ Misses       1603     1394     -209

nabenabe0928 · 2024-02-15T07:33:17Z

Add the documentation here.

optuna/importance/_ped_anova/evaluator.py

tests/importance_tests/test_importance_evaluators.py

tests/importance_tests/pedanova_tests/test_scott_parzen_estimator.py

nabenabe0928 · 2024-02-15T10:05:54Z

Follow-up: add a tutorial showing how `evaluate_on_local` leads to different results.
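
Until such a tutorial exists, here is a rough, self-contained illustration of why the choice of baseline can matter. It assumes that the local variant compares the top trials against the density of all observed trials, while the global variant compares them against a uniform density over the specified search space; that reading is an assumption for this sketch, and the helper names and data are hypothetical rather than taken from the PR.

```python
def histogram_pdf(values, low, high, n_steps, eps=1e-12):
    """Discretize [low, high] into n_steps bins and return bin probabilities."""
    counts = [eps] * n_steps  # tiny constant keeps every bin strictly positive
    width = (high - low) / n_steps
    for v in values:
        idx = min(int((v - low) / width), n_steps - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def pearson_divergence(p, q):
    """Discrete Pearson divergence: sum_i q_i * (p_i / q_i - 1)^2."""
    return sum(q_i * (p_i / q_i - 1.0) ** 2 for p_i, q_i in zip(p, q))


# Specified search space is [-5, 5], but the trials only cover [0, 5),
# e.g. because the search space was narrowed during optimization.
low, high, n_steps = -5.0, 5.0, 10
all_values = [0.05 + 0.1 * k for k in range(50)]
top_values = [v for v in all_values if v < 1.0]  # pretend these are the best trials

p_top = histogram_pdf(top_values, low, high, n_steps)

# Assumed "local" baseline: density of every observed trial.
local_baseline = histogram_pdf(all_values, low, high, n_steps)
# Assumed "global" baseline: uniform density over the specified search space.
global_baseline = [1.0 / n_steps] * n_steps

local_score = pearson_divergence(p_top, local_baseline)
global_score = pearson_divergence(p_top, global_baseline)
print(f"local baseline:  {local_score:.2f}")
print(f"global baseline: {global_score:.2f}")
```

In this toy setup the global score is larger because the top trials look far more concentrated relative to the full range than relative to the region that was actually explored.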

HideakiImamura

Thanks for the update. Almost LGTM! I left a minor comment about the file name. PTAL.

tests/importance_tests/pedanova_tests/test_pedanova_evaluator.py

nabenabe0928 · 2024-02-19T07:32:43Z

Handle #5212 (comment) and experimental warning

HideakiImamura · 2024-02-19T07:58:21Z

optuna/importance/_ped_anova/evaluator.py

+            specified search_space. `evaluate_on_local=True` is especially useful when users
+            modify search space during optimization.
+
+        Example:


Could you please reduce the indentation by one level? In the generated document, the example code is included in the arguments section.
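
As an illustration of the requested layout, the sketch below shows a Google-style docstring in which `Example:` sits at the same indentation level as `Args:`, so that documentation tooling treats it as its own section instead of part of an argument description. `ToyEvaluator` and `section_indent` are hypothetical names for this illustration and are not part of the PR.

```python
import inspect


class ToyEvaluator:
    """Hypothetical evaluator, used only to illustrate docstring layout.

    Args:
        evaluate_on_local:
            Placeholder description. ``evaluate_on_local=True`` is especially
            useful when users modify search space during optimization.

    Example:
        A usage snippet placed here is rendered as its own section.
    """

    def __init__(self, evaluate_on_local: bool = True) -> None:
        self.evaluate_on_local = evaluate_on_local


def section_indent(doc: str, header: str) -> int:
    """Return the number of leading spaces before a section header line."""
    for line in doc.splitlines():
        if line.strip() == header:
            return len(line) - len(line.lstrip())
    raise ValueError(f"section {header!r} not found")


# inspect.getdoc strips the common indentation, so sibling sections compare equal.
doc = inspect.getdoc(ToyEvaluator)
args_indent = section_indent(doc, "Args:")
example_indent = section_indent(doc, "Example:")
print(args_indent == example_indent)
```

If `Example:` were indented one level deeper, it would line up with the argument names and be parsed as part of the `Args:` section, which matches the problem described in the review comment.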

eukaryo · 2024-02-19T08:49:00Z

LGTM!

HideakiImamura

LGTM.

HideakiImamura · 2024-02-20T00:47:47Z

Thank you for your long-running work! Let me merge this PR.

nabenabe0928 assigned eukaryo Jan 29, 2024

github-actions bot added the optuna.importance label (Related to the `optuna.importance` submodule. This is automatically labeled by github-actions.) Jan 29, 2024

Remove most of advanced setups

3d17362

nabenabe0928 force-pushed the feat/add-ped-anova branch from ed635c1 to 3d17362 Compare February 1, 2024 12:34

nabenabe0928 added 4 commits February 1, 2024 14:25

Refactor get grids and their indices

c60cf66

Remove filter directory

ee372ca

Refactor quantile filter

3b7c665

Add custom filter

8f35675

nabenabe0928 assigned HideakiImamura Feb 2, 2024

nabenabe0928 added 3 commits February 2, 2024 04:01

Add tests for PED-ANOVA

74b71b9

Bundle tests

6a0987f

Add tests for the PED-ANOVA arguments

96e6fba

nabenabe0928 force-pushed the feat/add-ped-anova branch from c0f7c93 to 96e6fba Compare February 2, 2024 04:27

Invert common tests for fanova and mean decrease

925b6dc

nabenabe0928 force-pushed the feat/add-ped-anova branch from 7b75dde to 925b6dc Compare February 2, 2024 04:31

nabenabe0928 added 12 commits February 2, 2024 06:05

Fix the argument of np.quantile

56e878b

Add TODO comments np.quantile

40662e1

Remove an edgecase of init in quantile filter

2c2ab29

Add some tests for quantile filters

50344a0

Fix formatting errors

633fff2

Rename files

a822d21

Add quantile filter

598ffcb

Apply mamu's reviews

1f100a3

Fix tests

681e25f

Fix ScottParzenEstimator

e13e838

Bundle the tests

120b86b

Fix mypy error

ff9a3e2

nabenabe0928 added 2 commits February 14, 2024 09:00

Cover all the forking

144ae9d

Remove mean decrease impurity test for non single

446966b

nabenabe0928 added 6 commits February 15, 2024 10:23

Add the rename suggestion by takizawa

901946a

Rename test_quantile_filter to test_evaluator

abc071c

Bundle create_trials for single and multi obj

12eee7e

Move ped-anova tests to test_evaluator.py

e429b44

Use parametrize for evaluator tests

67d4251

Remove match in pytest.raises

b690702

nabenabe0928 added 2 commits February 15, 2024 11:08

Add tests for distributions with step

1acdc12

Rename test_evaluator to test_pedanova_evaluator to avoid conflict

57a2a1a

HideakiImamura added the feature label (Change that does not break compatibility, but affects the public interfaces.) Feb 16, 2024

HideakiImamura added this to the v3.6.0 milestone Feb 16, 2024

HideakiImamura reviewed Feb 16, 2024

Apply mamu's suggestion

dbe7ad8

nabenabe0928 added 2 commits February 19, 2024 08:35

Add PED-ANOVA in docs/source

2c12f7a

Add experimental warning to PED-ANOVA class

da1fe66

HideakiImamura reviewed Feb 19, 2024

Fix examples in doc

17006ee

eukaryo approved these changes Feb 19, 2024

eukaryo removed their assignment Feb 19, 2024

HideakiImamura approved these changes Feb 20, 2024

HideakiImamura merged commit beacca8 into optuna:master Feb 20, 2024
20 checks passed

nabenabe0928 mentioned this pull request Feb 22, 2024

Add PED-ANOVA as the first option for importance evaluator optuna/optuna-dashboard#811

Merged

1 task

nabenabe0928 deleted the feat/add-ped-anova branch April 11, 2024 17:46
