Skip to content

[Bug] Directories created awkwardly early #704

@liamhuber

Description

@liamhuber

With the single node executor using caching, I don't actually get a directory on my file system until I absolutely need it:

import pathlib
import shutil

from executorlib import SingleNodeExecutor

def foo(x):
    return x + 1

dirname = "not_created_until_submit"
with SingleNodeExecutor(cache_directory=dirname) as exe:
    print("Directory exists before submitting?", pathlib.Path(dirname).is_dir())
    f = exe.submit(foo, 1, resource_dict={"cache_key": "my_foo"})
    print("Result", f.result())
    print("Directory exists after submitting?", pathlib.Path(dirname).is_dir())
    shutil.rmtree(dirname)
>>> Directory exists before submitting? False
>>> Result 2
>>> Directory exists after submitting? True

But with the SlurmClusterExecutor and FluxClusterExecutor when they are in their branch to invoke executorlib.task_scheduler.file.task_scheduler.create_file_executor the cache directory is created at initialization:

import pathlib
import shutil

from executorlib import SlurmClusterExecutor

def foo(x):
    return x + 1

dirname = "gets_created_at_init"
with SlurmClusterExecutor(cache_directory=dirname, resource_dict={"partition": "s.cmfe"}) as exe:
    print("Directory exists before submitting?", pathlib.Path(dirname).is_dir())
    f = exe.submit(foo, 1, resource_dict={"cache_key": "my_foo"})
    print("Result", f.result())
    print("Directory exists after submitting?", pathlib.Path(dirname).is_dir())
    shutil.rmtree(dirname)
>>> Directory exists before submitting? True
>>> Result 2
>>> Directory exists after submitting? True

IMO it would be mildly preferable to delay creating the directory until it's needed, i.e. by wrapping the os.makedirs(cache_directory_path, exist_ok=True) call into whatever function is actually writing the *_i.h5 file. This is a bit prettier, more symmetric with the SingleNodeExecutor, and prevents useless folders from being created (in case the user is silly and never uses the instantiated executor) and prevents them from being prematurely deleted (there is rather some gap between the creation and writing the *_i.h5 file, so a very ambitious user/script may delete it in this small gap).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions