Hard-code Joblib backend to loky (#423)
jan-matthis committed Feb 19, 2020
1 parent 6ed1267 commit 45ba5e3
Showing 4 changed files with 19 additions and 36 deletions.
@@ -8,15 +8,6 @@ hydra:
# maximum number of concurrently running jobs. if -1, all CPUs are used
n_jobs: -1

# allows to hard-code backend, otherwise inferred based on prefer and require
backend: null

# processes or threads, soft hint to choose backend
prefer: processes

# null or sharedmem, sharedmem will select thread-based backend
require: null

# if greater than zero, prints progress messages
verbose: 0

@@ -33,7 +24,7 @@ hydra:
temp_folder: null

# thresholds size of arrays that triggers automated memmapping
max_nbytes: 1M
max_nbytes: null

# memmapping mode for numpy arrays passed to workers
mmap_mode: r
@@ -4,7 +4,7 @@
from typing import Any, Dict, List, Optional, Sequence

from joblib import Parallel, delayed # type: ignore
from omegaconf import DictConfig, open_dict
from omegaconf import DictConfig, OmegaConf, open_dict

from hydra.core.config_loader import ConfigLoader
from hydra.core.config_search_path import ConfigSearchPath
@@ -76,20 +76,25 @@ def launch(self, job_overrides: Sequence[Sequence[str]]) -> Sequence[JobReturn]:
configure_log(self.config.hydra.hydra_logging, self.config.hydra.verbose)
sweep_dir = Path(str(self.config.hydra.sweep.dir))
sweep_dir.mkdir(parents=True, exist_ok=True)

# Joblib's backend is hard-coded to loky since the threading
# backend is incompatible with Hydra
joblib_keywords = OmegaConf.to_container(self.joblib, resolve=True)
joblib_keywords["backend"] = "loky"

log.info(
"Joblib.Parallel({}) is launching {} jobs".format(
",".join([f"{k}={v}" for k, v in self.joblib.items()]),
",".join([f"{k}={v}" for k, v in joblib_keywords.items()]),
len(job_overrides),
)
)
log.info("Launching jobs, sweep output dir : {}".format(sweep_dir))

singleton_state = Singleton.get_state()

for idx, overrides in enumerate(job_overrides):
log.info("\t#{} : {}".format(idx, " ".join(filter_overrides(overrides))))

runs = Parallel(**self.joblib)(
singleton_state = Singleton.get_state()

runs = Parallel(**joblib_keywords)(
delayed(execute_job)(
idx,
overrides,
@@ -116,9 +121,6 @@ def execute_job(
singleton_state: Dict[Any, Any],
) -> JobReturn:
"""Calls `run_job` in parallel
Note that Joblib's default backend runs isolated Python processes, see
https://joblib.readthedocs.io/en/latest/parallel.html#shared-memory-semantics
"""
setup_globals()
Singleton.set_state(singleton_state)
2 changes: 1 addition & 1 deletion plugins/hydra_joblib_launcher/setup.py
@@ -5,7 +5,7 @@
LONG_DESC = fh.read()
setup(
name="hydra-joblib-launcher",
version="1.0.1",
version="1.0.2",
author="Jan-Matthis Lueckmann, Omry Yadan",
author_email="mail@jan-matthis.de, omry@fb.com",
description="Joblib Launcher for Hydra apps",
22 changes: 6 additions & 16 deletions website/docs/plugins/joblib_launcher.md
@@ -18,7 +18,7 @@ defaults:
- hydra/launcher: joblib
```

By default, process-based parallelism using all available CPU cores is used. By overriding the default configuration, it is e.g. possible to switch to thread-based parallelism and limit the number of parallel executions.
By default, process-based parallelism using all available CPU cores is used. By overriding the default configuration, it is e.g. possible to limit the number of parallel executions.

The default configuration packaged with the plugin is:

@@ -33,15 +33,6 @@ hydra:
# maximum number of concurrently running jobs. if -1, all CPUs are used
n_jobs: -1

# allows to hard-code backend, otherwise inferred based on prefer and require
backend: null

# processes or threads, soft hint to choose backend
prefer: processes

# null or sharedmem, sharedmem will select thread-based backend
require: null

# if greater than zero, prints progress messages
verbose: 0

@@ -58,21 +49,20 @@ hydra:
temp_folder: null

# thresholds size of arrays that triggers automated memmapping
max_nbytes: 1M
max_nbytes: null

# memmapping mode for numpy arrays passed to workers
mmap_mode: r
```

`n_jobs` defaults to -1 (all available CPUs) and `prefer` defaults to `processes`. Depending on the application, `threads` might be a good alternative. All arguments specified in `joblib` are passed to `Joblib.Parallel` (see [`Joblib.Parallel` documentation](https://joblib.readthedocs.io/en/latest/parallel.html) for full details).
`n_jobs` defaults to -1 (all available CPUs). See [`Joblib.Parallel` documentation](https://joblib.readthedocs.io/en/latest/parallel.html) for full details on arguments. Note that the backend is hard-coded to use process-based parallelism (Joblib's loky backend), since thread-based parallelism is incompatible with Hydra.
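
As a rough illustration of the backend override described above, the sketch below copies a set of user settings and forces `backend` to `loky` before building the launch message. It uses a plain dictionary in place of the OmegaConf node, and `build_joblib_keywords`, `describe`, and `user_config` are hypothetical names chosen for this example; they are not part of the plugin.

```python
from typing import Any, Dict


def build_joblib_keywords(joblib_config: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the user-supplied settings and force the loky backend."""
    keywords = dict(joblib_config)  # shallow copy; the input stays untouched
    keywords["backend"] = "loky"  # replaces any backend the user supplied
    return keywords


def describe(keywords: Dict[str, Any], num_jobs: int) -> str:
    """Format a launch message in the style of the launcher's log line."""
    args = ",".join(f"{k}={v}" for k, v in keywords.items())
    return "Joblib.Parallel({}) is launching {} jobs".format(args, num_jobs)


# Hypothetical user settings; a plain dict stands in for the OmegaConf node.
user_config = {"n_jobs": -1, "verbose": 0, "max_nbytes": None}
keywords = build_joblib_keywords(user_config)
message = describe(keywords, 5)
print(message)
```

Because the override is applied to a copy, a `backend` value set in the configuration would be replaced only in the keywords handed to `Joblib.Parallel`, while the original settings stay unchanged.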

An [example application](https://github.com/facebookresearch/hydra/tree/master/plugins/hydra_joblib_launcher/example) using this launcher is provided in `plugins/hydra_joblib_launcher/example`. Starting the app with `python my_app.py --multirun task=1,2,3,4,5` will launch five parallel executions:

```text
$ python example/my_app.py --multirun task=1,2,3,4,5
[HYDRA] Sweep output dir : multirun/2020-02-05/13-59-56
[HYDRA] Joblib.Parallel(n_jobs=-1,backend=None,prefer=processes,require=None,verbose=0,timeout=None,pre_dispatch=2*n_jobs,batch_size=auto,temp_folder=None,max_nbytes=1M,mmap_mode=r) is launching 5 jobs
[HYDRA] Sweep output dir : multirun/2020-02-05/13-59-56
$ python my_app.py --multirun task=1,2,3,4,5
[HYDRA] Joblib.Parallel(n_jobs=-1,verbose=0,timeout=None,pre_dispatch=2*n_jobs,batch_size=auto,temp_folder=None,max_nbytes=None,mmap_mode=r,backend=loky) is launching 5 jobs
[HYDRA] Launching jobs, sweep output dir : multirun/2020-02-18/10-00-00
[__main__][INFO] - Process ID 14336 executing task 2 ...
[__main__][INFO] - Process ID 14333 executing task 1 ...
[__main__][INFO] - Process ID 14334 executing task 3 ...