Commit

Beaker workspace performance (#328)
* Improve performance of `BeakerWorkspace`

* bump beaker-py

Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>
epwalsh and dirkgr committed Jul 5, 2022
1 parent f43e5ea commit a6b0be9
Showing 3 changed files with 5 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Improved `Step.ensure_result()` such that the step's result doesn't have to be read from the cache.
 - Fixed an issue with the output from `MulticoreExecutor` such that it's now consistent with the default `Executor` for steps that were found in the cache.
 - One of our error messages referred to a configuration file that no longer exists.
+- Improved performance of `BeakerWorkspace`.


 ## [v0.9.1](https://github.com/allenai/tango/releases/tag/v0.9.1) - 2022-06-24
2 changes: 1 addition & 1 deletion requirements.txt
@@ -35,7 +35,7 @@ pytorch-lightning>=1.6,<1.7 # needed by: pytorch_lightning
 transformers>=4.12.3 # needed by: transformers
 sentencepiece>=0.1.96 # needed by: transformers
 fairscale==0.4.6 # needed by: fairscale
-beaker-py>=1.3.0,<2.0.0 # needed by: beaker
+beaker-py>=1.6.2,<2.0.0 # needed by: beaker

 # sacremoses should be a dependency of transformers, but it is missing, so we add it manually.
 sacremoses # needed by: transformers
6 changes: 3 additions & 3 deletions tango/integrations/beaker/workspace.py
@@ -46,7 +46,7 @@ class BeakerWorkspace(Workspace):

     def __init__(self, workspace: str, **kwargs):
         super().__init__()
-        self.beaker = Beaker.from_env(default_workspace=workspace, **kwargs)
+        self.beaker = Beaker.from_env(default_workspace=workspace, session=True, **kwargs)
         self.cache = BeakerStepCache(beaker=self.beaker)
         self.steps_dir = tango_cache_dir() / "beaker_workspace"
         self.locks: Dict[Step, BeakerStepLock] = {}
@@ -233,7 +233,7 @@ def registered_runs(self) -> Dict[str, Run]:
         runs: Dict[str, Run] = {}

         with concurrent.futures.ThreadPoolExecutor(
-            max_workers=3, thread_name_prefix="BeakerWorkspace.registered_runs()-"
+            max_workers=9, thread_name_prefix="BeakerWorkspace.registered_runs()-"
         ) as executor:
             run_futures = []
             for dataset in self.beaker.workspace.datasets(
@@ -295,7 +295,7 @@ def _get_run_from_dataset(self, dataset: Dataset) -> Optional[Run]:
         import concurrent.futures

         with concurrent.futures.ThreadPoolExecutor(
-            max_workers=3, thread_name_prefix="BeakerWorkspace._get_run_from_dataset()-"
+            max_workers=9, thread_name_prefix="BeakerWorkspace._get_run_from_dataset()-"
         ) as executor:
             step_info_futures = []
             for step_name, unique_id in steps_info.items():
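
Note on the `max_workers` change from 3 to 9: the commit does not say why 9 was chosen, but the general idea is that the work submitted to these pools is presumably network-bound, so more threads can overlap more waiting. The following is a minimal, standalone sketch of that pattern, not the code from `workspace.py`. `fetch_run`, `fetch_all`, and `timed` are hypothetical names, and `time.sleep` stands in for a remote API call.

```python
import concurrent.futures
import time
from typing import Dict, List


def fetch_run(name: str) -> str:
    # Hypothetical stand-in for a network-bound API call:
    # the thread just waits, as it would on a remote response.
    time.sleep(0.05)
    return f"run:{name}"


def fetch_all(names: List[str], max_workers: int) -> Dict[str, str]:
    """Fetch every name concurrently using a bounded thread pool."""
    results: Dict[str, str] = {}
    with concurrent.futures.ThreadPoolExecutor(
        max_workers=max_workers, thread_name_prefix="sketch-"
    ) as executor:
        # Map each future back to the name it was submitted for.
        futures = {executor.submit(fetch_run, name): name for name in names}
        for future in concurrent.futures.as_completed(futures):
            results[futures[future]] = future.result()
    return results


def timed(max_workers: int) -> float:
    """Return the wall-clock seconds needed to fetch 18 simulated items."""
    names = [f"dataset-{i}" for i in range(18)]
    start = time.perf_counter()
    fetch_all(names, max_workers)
    return time.perf_counter() - start
```

With 18 simulated calls, a pool of 3 needs about six rounds of waiting while a pool of 9 needs about two, so `timed(9)` should come out noticeably lower than `timed(3)`. Real gains depend on server-side limits and latency, which this sketch does not model.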
