(Internal) Task and data creation wrapper(s) #249

Open
eirrgang opened this issue Nov 3, 2022 · 0 comments
Labels
infrastructure: development infrastructure or lower level implementation details
workflow metadata: scalems workflow state, data references, task tracking, and the associated nonvolatile backing store
eirrgang commented Nov 3, 2022

Internal asynchronous and dispatched task management.

In some cases, we have to track the relationships among scalems tasks, Python threads, RADICAL Pilot (RP) tasks, and other resource handles. We have started to record these as entries in the metadata FileStore, but we are maintaining the entries manually.

We see groups of lines like

filestore.add_task(task_identity, **task_metadata)
task: rp.Task = await asyncio.create_task(to_thread(task_manager.submit_tasks, td))

with other surrounding boilerplate.

For maintainability, robustness, and convenience, we should add a wrapper to apply the current task management protocol with something like

task = runtime.to_thread(command)

See also scalems.radical.runtime.rp_task(), which is currently a module-level function that is not connected to instances of Runtime or FileStore.
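One way such a wrapper could look is sketched below. This is a minimal illustration, not the scalems implementation: `RecordingStore` and `RuntimeSketch` are hypothetical stand-ins for `FileStore` and `Runtime`, and dispatch uses the standard library `asyncio.to_thread` rather than the RP task manager. The point is only that the bookkeeping call and the dispatch live in one place.

```python
import asyncio
from typing import Any, Callable, Dict


class RecordingStore:
    """Hypothetical stand-in for the metadata FileStore (not the scalems class)."""

    def __init__(self) -> None:
        self.tasks: Dict[str, Dict[str, Any]] = {}

    def add_task(self, task_identity: str, **task_metadata: Any) -> None:
        if task_identity in self.tasks:
            raise ValueError(f"duplicate task: {task_identity}")
        self.tasks[task_identity] = dict(task_metadata)


class RuntimeSketch:
    """Hypothetical wrapper that pairs the bookkeeping with the dispatch."""

    def __init__(self, store: RecordingStore) -> None:
        self._store = store

    async def to_thread(
        self,
        task_identity: str,
        func: Callable[..., Any],
        *args: Any,
        **task_metadata: Any,
    ) -> Any:
        # Record the task first, so a dispatched call is never untracked.
        self._store.add_task(task_identity, **task_metadata)
        # Run the blocking call in a worker thread and await its result.
        return await asyncio.to_thread(func, *args)


async def main() -> None:
    store = RecordingStore()
    runtime = RuntimeSketch(store)
    result = await runtime.to_thread("task-1", sum, [1, 2, 3], kind="demo")
    print(result, store.tasks)


if __name__ == "__main__":
    asyncio.run(main())
```

Because the store entry is written inside the wrapper, call sites cannot forget it, and any change to the protocol (for example, recording completion or failure) would only need to be made here.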

Generated files

We want to robustly identify filesystem objects that are used internally and to defer file management to the scalems framework. For generated or temporary files, we see code like the following.

    tmp_base = filestore.directory
    # Stage the file in a temporary directory inside the filestore directory.
    with tempfile.TemporaryDirectory(dir=tmp_base) as tmp_dir:
        file_path = os.path.join(tmp_dir, file_name)
        with open(file_path, 'w') as fh:
            json.dump(data_structure, fh, default=object_encoder, indent=2)
        file_description = describe_file(file_path, mode='r')
        # Register the file before the temporary directory is cleaned up.
        handle: FileReference = await asyncio.create_task(
            filestore.add_file(file_description),
            name='add-file')

The above snippet creates a file in a temporary directory, writes JSON to it, then adds it to the filestore as a workflow artifact.

We could instead write directly to a file-like object or filehandle wrapper that finalizes the editing handle into a FileReference when the editing context is closed.
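A rough sketch of that idea follows, using an async context manager. All names here are hypothetical: `StoreSketch` stands in for `FileStore`, its `add_file()` takes a plain path rather than a file description, and the "reference" it returns is just the final path. The file is registered only if the editing block exits without an exception.

```python
import asyncio
import contextlib
import json
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import IO, AsyncIterator, List, Optional


class StoreSketch:
    """Hypothetical stand-in for FileStore (not the scalems class)."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory
        self.files: List[Path] = []

    async def add_file(self, path: Path) -> Path:
        # A real store would fingerprint and catalog the file.
        # Here we only move it into the store directory.
        target = self.directory / path.name
        os.replace(path, target)
        self.files.append(target)
        return target


@dataclass
class PendingFile:
    """Editing handle; `reference` is set once the context closes."""

    fh: IO[str]
    reference: Optional[Path] = None


@contextlib.asynccontextmanager
async def new_file(store: StoreSketch, name: str) -> AsyncIterator[PendingFile]:
    with tempfile.TemporaryDirectory(dir=store.directory) as tmp_dir:
        path = Path(tmp_dir) / name
        with open(path, "w") as fh:
            pending = PendingFile(fh)
            yield pending
        # Reached only if the caller's block did not raise; the file is closed.
        pending.reference = await store.add_file(path)


async def main() -> None:
    with tempfile.TemporaryDirectory() as root:
        store = StoreSketch(Path(root))
        async with new_file(store, "data.json") as pending:
            json.dump({"a": 1}, pending.fh, indent=2)
        print(pending.reference)


if __name__ == "__main__":
    asyncio.run(main())
```

With this shape, the caller never handles the temporary path, and a failed write leaves nothing behind in the store, since the temporary directory is removed and `add_file()` is skipped.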

@eirrgang eirrgang added the workflow metadata scalems workflow state, data references, task tracking, and the associated nonvolatile backing store label Nov 3, 2022
@eirrgang eirrgang added this to the maintenance milestone Nov 3, 2022
@eirrgang eirrgang changed the title Task creation wrapper(s) (Internal) Task and data creation wrapper(s) Nov 4, 2022
@eirrgang eirrgang added the infrastructure development infrastructure or lower level implementation details label Dec 14, 2022
Projects
Status: 🔖 On Deck