
Slow Execution of MOBSTER Algorithm for Tabulated Benchmarks #712

Closed
piiiiiiie opened this issue Jun 13, 2023 · 4 comments


@piiiiiiie

Hello,

Thanks for your well-developed syne-tune.

I was trying to run the MOBSTER algorithm on the tabulated benchmarks (e.g., "lcbench") using the simulator backend; my launching script is attached below for reference. The running time is much longer than expected: wall-clock times of 250 s and 116 s were reported for the "Fashion-MNIST" and "christine" datasets, respectively, whereas ASHA takes only 0.25 s and 0.13 s for the same benchmarks (datasets) and settings. I was wondering whether my launching script has any issues.

Based on my understanding, the tabulated benchmarks are designed to accelerate experiments and reduce their cost by bypassing the actual training procedure, so the wall-clock time reported should be the program's execution time on my local machine. This leads to another question: how can I obtain the execution time of the tabulated benchmarks (including the real training part)?

Thanks for your attention.

Launching script:

import logging
import itertools
from tqdm import tqdm

from benchmarking.commons.benchmark_definitions.lcbench import lcbench_benchmark
from syne_tune.blackbox_repository import BlackboxRepositoryBackend
from syne_tune.backend.simulator_backend.simulator_callback import SimulatorCallback
from syne_tune.optimizer.baselines import MOBSTER
from syne_tune import Tuner, StoppingCriterion


if __name__ == "__main__":
    logging.getLogger().setLevel(logging.WARNING)

    random_seed = 31415927
    dataset_name_list = ["Fashion-MNIST", "airlines", "albert", "covertype", "christine"]
    n_workers_list = [4]
    max_wallclock_time_list = [3600]
    combinations = list(
        itertools.product(dataset_name_list, n_workers_list, max_wallclock_time_list)
    )
    print(combinations)

    for dataset_name, n_workers, max_wallclock_time in tqdm(combinations):
        benchmark = lcbench_benchmark(dataset_name)

        max_resource_attr = benchmark.max_resource_attr
        trial_backend = BlackboxRepositoryBackend(
            elapsed_time_attr=benchmark.elapsed_time_attr,
            max_resource_attr=max_resource_attr,
            blackbox_name=benchmark.blackbox_name,
            dataset=dataset_name,
        )

        blackbox = trial_backend.blackbox
        restrict_configurations = blackbox.all_configurations()
        scheduler = MOBSTER(
            config_space=blackbox.configuration_space_with_max_resource_attr(
                max_resource_attr
            ),
            max_resource_attr=max_resource_attr,
            resource_attr=blackbox.fidelity_name(),
            mode=benchmark.mode,
            metric=benchmark.metric,
            random_seed=random_seed,
            search_options=dict(restrict_configurations=restrict_configurations),
        )

        stop_criterion = StoppingCriterion(max_wallclock_time=max_wallclock_time)

        # Printing the status during tuning takes a lot of time, and so does
        # storing results.
        print_update_interval = 700
        results_update_interval = 300
        # It is important to set ``sleep_time`` to 0 here (mandatory for simulator
        # backend)
        tuner = Tuner(
            trial_backend=trial_backend,
            scheduler=scheduler,
            stop_criterion=stop_criterion,
            n_workers=n_workers,
            sleep_time=0,
            results_update_interval=results_update_interval,
            print_update_interval=print_update_interval,
            # This callback is required in order to make things work with the
            # simulator callback. It makes sure that results are stored with
            # simulated time (rather than real time), and that the time_keeper
            # is advanced properly whenever the tuner loop sleeps
            callbacks=[SimulatorCallback()],
        )
        tuner.run()
@mseeger
Collaborator

mseeger commented Jun 13, 2023

Hello,

I don't think your script has any issues.

The real-world runtime depends on two factors:

  • Training time for trials
  • Decision-making time of the HPO algorithms

The simulator backend on a tabulated benchmark eliminates the first factor, but the second factor remains. While simple methods like RandomSearch and ASHA have very cheap decision making (just random sampling), BayesianOptimization and MOBSTER need a substantial amount of time for decision making.

You can decrease this decision-making time in various ways, as described in our docs. But be careful: what matters in the end are the results plotted against wall-clock time, where each trial takes as long as recorded in the tabulated benchmark.

The fact that simulations run faster for one method than for another has no real significance for comparing them. BO often does much better than RS, unless trials are extremely cheap.
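As a sketch of what reducing decision-making time could look like: model-based searchers like MOBSTER typically accept extra `search_options` controlling how often and on how much data the surrogate model is fit. The option names below (`max_size_data_for_model`, `opt_skip_period`) are my best understanding of Syne Tune's GP-searcher options and should be verified against the docs before use:

```python
# Hypothetical sketch: extra search_options to pass to MOBSTER to cut
# decision-making time. Option names are assumptions; check the Syne Tune
# documentation for the GP-based searchers.
cheaper_search_options = dict(
    # Subsample the data the GP surrogate is fit on once it exceeds this size:
    max_size_data_for_model=250,
    # Re-optimize the surrogate's hyperparameters only every 3rd update:
    opt_skip_period=3,
)
```

In the launching script above, such a dict would be merged into the `search_options` argument of the `MOBSTER` constructor. Both knobs trade some decision quality for speed, which is why results should still be judged against simulated wall-clock time.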

@mseeger
Collaborator

mseeger commented Jun 13, 2023

Regarding your second question: the simulated wall-clock time for an experiment is what is logged in the results, in the column named ST_TUNER_TIME. This is why, if you plot the results, you get the correct curves (as if the trials really took the time in the table).
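A minimal sketch of how such logged results could be inspected. The tiny DataFrame below stands in for the results a tuner writes out; only the column name `st_tuner_time` (which I assume is the value of the ST_TUNER_TIME constant) comes from the discussion above, and the metric values are made up for illustration:

```python
import pandas as pd

# Stand-in for results logged by the tuner; the numbers are made up for
# illustration. In a real run, this DataFrame would be loaded from the
# experiment's results files instead.
results = pd.DataFrame(
    {
        # Simulated wall-clock time in seconds, as if trials really ran:
        "st_tuner_time": [10.0, 60.0, 120.0, 300.0],
        "metric_valid_error": [0.40, 0.31, 0.28, 0.25],
    }
)

# Best metric value seen up to each point in simulated time -- this is
# the curve one would plot against wall-clock time:
results["best_so_far"] = results["metric_valid_error"].cummin()

# Total simulated wall-clock time of the experiment:
total_time = results["st_tuner_time"].iloc[-1]
print(total_time)                        # 300.0
print(results["best_so_far"].tolist())   # [0.4, 0.31, 0.28, 0.25]
```

Because the time axis is the simulated one, curves produced this way are comparable across methods even though the simulations themselves finished in very different real-world times.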

@mseeger
Collaborator

mseeger commented Jun 13, 2023

BTW: I strongly encourage you to look at this tutorial:
https://syne-tune.readthedocs.io/en/latest/tutorials/benchmarking/README.html

Using this is much simpler than writing launcher scripts. At the moment, benchmarking is only available when installing from source, not from pip, but we'll change this soon.

@piiiiiiie
Author

Thank you for your thorough explanations! I am studying it and will reach out again by opening a new issue if I have further questions.
