
Slow Execution of MOBSTER Algorithm for Tabulated Benchmarks #712

Closed
piiiiiiie opened this issue Jun 13, 2023 · 4 comments


@piiiiiiie

Hello,

Thanks for your well-developed syne-tune.

I was trying to run the MOBSTER algorithm on the tabulated benchmarks (e.g., "lcbench") using the simulator backend; my launching script is attached below for reference. The running time is much longer than expected: wall-clock times of 250 s and 116 s were reported for the "Fashion-MNIST" and "christine" datasets, respectively, whereas ASHA takes only 0.25 s and 0.13 s for the same benchmarks (datasets) and settings. I was wondering whether my launching script has any issues.

Based on my understanding, the tabulated benchmarks are designed to accelerate experiments and reduce their cost by bypassing the actual training procedure, so the wall-clock time reported should be the program's execution time on my local machine. This leads to another question: how can I obtain the execution time of the tabulated benchmarks (including the real training part)?

Thanks for your attention.

Launching script:

import logging
import itertools
from tqdm import tqdm

from benchmarking.commons.benchmark_definitions.lcbench import lcbench_benchmark
from syne_tune.blackbox_repository import BlackboxRepositoryBackend
from syne_tune.backend.simulator_backend.simulator_callback import SimulatorCallback
from syne_tune.optimizer.baselines import MOBSTER
from syne_tune import Tuner, StoppingCriterion


if __name__ == "__main__":
    logging.getLogger().setLevel(logging.WARNING)

    random_seed = 31415927
    dataset_name_list = ["Fashion-MNIST", "airlines", "albert", "covertype", "christine"]
    n_workers_list = [4]
    max_wallclock_time_list = [3600]
    combinations = list(
        itertools.product(dataset_name_list, n_workers_list, max_wallclock_time_list)
    )
    print(combinations)

    for dataset_name, n_workers, max_wallclock_time in tqdm(combinations):
        benchmark = lcbench_benchmark(dataset_name)

        max_resource_attr = benchmark.max_resource_attr
        trial_backend = BlackboxRepositoryBackend(
            elapsed_time_attr=benchmark.elapsed_time_attr,
            max_resource_attr=max_resource_attr,
            blackbox_name=benchmark.blackbox_name,
            dataset=dataset_name,
        )

        blackbox = trial_backend.blackbox
        restrict_configurations = blackbox.all_configurations()
        scheduler = MOBSTER(
            config_space=blackbox.configuration_space_with_max_resource_attr(
                max_resource_attr
            ),
            max_resource_attr=max_resource_attr,
            resource_attr=blackbox.fidelity_name(),
            mode=benchmark.mode,
            metric=benchmark.metric,
            random_seed=random_seed,
            search_options=dict(restrict_configurations=restrict_configurations),
        )

        stop_criterion = StoppingCriterion(max_wallclock_time=max_wallclock_time)

        # Printing the status during tuning takes a lot of time, and so does
        # storing results.
        print_update_interval = 700
        results_update_interval = 300
        # It is important to set ``sleep_time`` to 0 here (mandatory for simulator
        # backend)
        tuner = Tuner(
            trial_backend=trial_backend,
            scheduler=scheduler,
            stop_criterion=stop_criterion,
            n_workers=n_workers,
            sleep_time=0,
            results_update_interval=results_update_interval,
            print_update_interval=print_update_interval,
            # This callback is required in order to make things work with the
            # simulator callback. It makes sure that results are stored with
            # simulated time (rather than real time), and that the time_keeper
            # is advanced properly whenever the tuner loop sleeps
            callbacks=[SimulatorCallback()],
        )
        tuner.run()
@mseeger
Collaborator

mseeger commented Jun 13, 2023

Hello,

I don't think your script has any issues.

The real-world runtime depends on two factors:

  • Training time for trials
  • Decision-making time of the HPO algorithms

The simulator backend on a tabulated benchmark eliminates the first factor, but the second factor remains. While simple methods like RandomSearch and ASHA have very cheap decision making (just random sampling), BayesianOptimization and MOBSTER need a substantial amount of time for decision making.

You can decrease this decision-making time in various ways, as described in our docs. But be careful: what matters in the end are the results plotted against wall-clock time, where each trial takes as long as recorded in the tabulated benchmark.

The fact that simulations run faster for one method than for another has no real significance for comparing them. BO often does much better than RS, unless trials are extremely cheap.
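As a sketch of what reducing decision-making time could look like: model-based searchers like MOBSTER typically accept extra `search_options` controlling how often and on how much data the surrogate model is fit. The option names below (`max_size_data_for_model`, `opt_skip_period`) are my best understanding of Syne Tune's GP-searcher options and should be verified against the docs before use:

```python
# Hypothetical sketch: extra search_options to pass to MOBSTER to cut
# decision-making time. Option names are assumptions; check the Syne Tune
# documentation for the GP-based searchers.
cheaper_search_options = dict(
    # Subsample the data the GP surrogate is fit on once it exceeds this size:
    max_size_data_for_model=250,
    # Re-optimize the surrogate's hyperparameters only every 3rd update:
    opt_skip_period=3,
)
```

In the launching script above, such a dict would be merged into the `search_options` argument of the `MOBSTER` constructor. Both knobs trade some decision quality for speed, which is why results should still be judged against simulated wall-clock time.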

@mseeger
Collaborator

mseeger commented Jun 13, 2023

Regarding your second question: the simulated wall-clock time for an experiment is what is logged in the results, in the column named ST_TUNER_TIME. This is why, if you plot the results, you get the correct curves (as if the trials really took the time in the table).
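A minimal sketch of how such logged results could be inspected. The tiny DataFrame below stands in for the results a tuner writes out; only the column name `st_tuner_time` (which I assume is the value of the ST_TUNER_TIME constant) comes from the discussion above, and the metric values are made up for illustration:

```python
import pandas as pd

# Stand-in for results logged by the tuner; the numbers are made up for
# illustration. In a real run, this DataFrame would be loaded from the
# experiment's results files instead.
results = pd.DataFrame(
    {
        # Simulated wall-clock time in seconds, as if trials really ran:
        "st_tuner_time": [10.0, 60.0, 120.0, 300.0],
        "metric_valid_error": [0.40, 0.31, 0.28, 0.25],
    }
)

# Best metric value seen up to each point in simulated time -- this is
# the curve one would plot against wall-clock time:
results["best_so_far"] = results["metric_valid_error"].cummin()

# Total simulated wall-clock time of the experiment:
total_time = results["st_tuner_time"].iloc[-1]
print(total_time)                        # 300.0
print(results["best_so_far"].tolist())   # [0.4, 0.31, 0.28, 0.25]
```

Because the time axis is the simulated one, curves produced this way are comparable across methods even though the simulations themselves finished in very different real-world times.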

@mseeger
Collaborator

mseeger commented Jun 13, 2023

BTW: I strongly encourage you to look at this tutorial:
https://syne-tune.readthedocs.io/en/latest/tutorials/benchmarking/README.html

Using this is much simpler than writing launcher scripts. At the moment, benchmarking is only available when installing from source, not from pip, but we'll change this soon.

@piiiiiiie
Author

Thank you for your thorough explanations! I am studying it and will reach out again by opening a new issue if I have further questions.
