💡 What is a "project" and how do we use them? #163

Open
liamhuber opened this issue Jan 12, 2024 · 0 comments

@pmrv, @JNmpi, and I briefly discussed the similarities and differences between a "project" (e.g. the tinybase ProjectAdapter) and Workflow.

Workflow is a dynamic and flexible object used while you're developing your workflow (once it has crystallized you can turn it into a Macro). Workflow is also the parent-most object in the graph context, i.e. it is not intrinsically aware of any other graphs. In the current implementation, the semantic path of a workflow always starts with the workflow label, and there is a perfect 1:1 correspondence between filesystem directories and the semantic path.

@pmrv pointed out that there are times when you may be jointly developing two or more different "workflows", which are related by some data connection, but where you don't necessarily want to always be re-running the upstream part of the process while you're modifying and playing with some downstream component. One day you might jam them all into a single big workflow that runs top to bottom, but in the moment it can be helpful to keep different development chunks separated.

A "project" may then bring:

  • The ability to show a semantic connection between multiple workflows
  • The ability to specify a difference between semantic location and storage location
  • Tools to easily grab output of owned workflows and make it available to other workflows
    • e.g. shallow de-serialization of just the workflow output level from storage
  • A place to specify generic behaviour, e.g. what type of backend to use for storage (HDF, S3, ...)
  • A connection to the database
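As a rough illustration of the "grab output of owned workflows" item, here is a minimal sketch in which a downstream workflow reuses stored upstream output without re-running the upstream part. Everything in it is an assumption made for illustration: `OutputSharingProject`, the `outputs.json` layout, and the placeholder `run_upstream`/`run_downstream` functions are hypothetical and are not part of tinybase or the existing Workflow.

```python
import json
import tempfile
from pathlib import Path


class OutputSharingProject:
    """Hypothetical container; only illustrates output hand-off between workflows."""

    def __init__(self, storage_root):
        self.storage_root = Path(storage_root)

    def _output_file(self, workflow_label):
        # Assumed layout: <storage_root>/<workflow_label>/outputs.json
        return self.storage_root / workflow_label / "outputs.json"

    def save_output(self, workflow_label, outputs):
        path = self._output_file(workflow_label)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(outputs))

    def load_output(self, workflow_label):
        # Shallow load: only the output-level file is read, never node state
        return json.loads(self._output_file(workflow_label).read_text())


def run_upstream():
    # Placeholder for an expensive upstream workflow
    return {"energy": -3.5, "n_atoms": 4}


def run_downstream(upstream_outputs):
    # Placeholder for the downstream workflow still under development
    return {"energy_per_atom": upstream_outputs["energy"] / upstream_outputs["n_atoms"]}


with tempfile.TemporaryDirectory() as tmp:
    pr = OutputSharingProject(tmp)
    pr.save_output("upstream", run_upstream())  # run the upstream part once
    # Later iterations only reload the stored output
    result = run_downstream(pr.load_output("upstream"))
```

With an HDF or S3 back-end the same idea would presumably read only the output group or object rather than a JSON file.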

In our conversation, the question was whether this fundamentally required a separate Project class, or if extension of the existing Workflow behaviour would be sufficient. We came to the tentative conclusion that Workflow could simply be more empowered. E.g. this pseudocode:

pr = Project(
    semantic_path="test/subdir",
    storage_root="/usr/some/other/place",
    storage_type="hdf",
)
wf = pr.Workflow("foo")

Could be equivalent to this:

wf = Workflow(
    "test/subdir/foo", 
    storage_root="/usr/some/other/place", 
    storage_path="foo",
    storage_type="hdf"
)

In both cases the resulting workflow has the same semantic path (wf.semantic_root / "test/subdir/foo"), a storage location that differs from it ("/usr/some/other/place/foo"), and the same storage back-end (hdf). In the former case, because the full semantic path was given to the project, wf.semantic_root would simply be empty. More generally, one can imagine in the latter case that wf.semantic_path == wf.semantic_root / wf.label and wf.storage_location == wf.storage_root / wf.label (unless a storage_path is given explicitly, as in the example above), where the default for both semantic_root and storage_root is just cwd(), but could otherwise be provided at instantiation or set in a config file.

This is all just some pseudocode, but it shows there is no obvious reason an extra Project class is needed -- the separation of semantic and filesystem paths can be handled right inside Workflow.
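To make the equivalence concrete, here is a minimal runnable sketch under stated assumptions. `SketchWorkflow` and `SketchProject` are hypothetical stand-ins, not the real Workflow or the tinybase project classes, and the fallback rules (cwd for both roots, the label when no storage_path is given) are just one reading of the description above.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class SketchWorkflow:
    """Hypothetical stand-in for Workflow with separate semantic/storage paths."""

    label: str
    semantic_root: Path | None = None  # None -> fall back to cwd
    storage_root: Path | None = None   # None -> fall back to cwd
    storage_path: str | None = None    # None -> fall back to the label
    storage_type: str = "hdf"

    @property
    def semantic_path(self) -> Path:
        root = Path.cwd() if self.semantic_root is None else Path(self.semantic_root)
        return root / self.label

    @property
    def storage_location(self) -> Path:
        root = Path.cwd() if self.storage_root is None else Path(self.storage_root)
        return root / (self.storage_path or self.label)


@dataclass
class SketchProject:
    """Hypothetical project: nothing more than defaults for new workflows."""

    semantic_path: str
    storage_root: Path
    storage_type: str = "hdf"

    def workflow(self, label: str) -> SketchWorkflow:
        # Forward the project's settings; the project itself holds no other state
        return SketchWorkflow(
            label=f"{self.semantic_path}/{label}",
            storage_root=self.storage_root,
            storage_path=label,
            storage_type=self.storage_type,
        )


via_project = SketchProject("test/subdir", Path("/usr/some/other/place")).workflow("foo")
direct = SketchWorkflow(
    "test/subdir/foo",
    storage_root=Path("/usr/some/other/place"),
    storage_path="foo",
    storage_type="hdf",
)
```

In this sketch `via_project == direct` holds, since the project only forwards arguments that the workflow could have been given directly.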

Similarly, if we have a database interface (singleton?), this can be slapped onto Workflow and given useful shortcuts just the way Creator is (Workflow.create::Creator(), Workflow.register::Creator().register, ...).
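A sketch of what "slapping" a singleton database interface onto the workflow class might look like, assuming an in-memory registry. `SketchDatabase` and `DbAwareWorkflow` are hypothetical names for illustration; no real database interface or existing Workflow attribute is implied.

```python
class SketchDatabase:
    """Hypothetical in-memory stand-in for a database interface (singleton)."""

    _instance = None

    def __new__(cls):
        # Create the single shared instance on first use, then always reuse it
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.records = {}
        return cls._instance

    def register(self, semantic_path, storage_location):
        self.records[semantic_path] = storage_location


class DbAwareWorkflow:
    """Hypothetical workflow exposing the database as a class-level shortcut."""

    # Shared by every workflow, similar in spirit to the Creator shortcuts
    database = SketchDatabase()

    def __init__(self, label):
        self.label = label

    def save(self, storage_location):
        # Record where this workflow was stored so others can look it up
        self.database.register(self.label, storage_location)


foo = DbAwareWorkflow("foo")
bar = DbAwareWorkflow("bar")
foo.save("/usr/some/other/place/foo")
```

Because the database is a singleton, `bar.database` sees the record written through `foo`.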

So the tentative plan is to bring tinybase here from contrib (#161), and then slowly start merging in the project and/or job capabilities we need from there into Workflow and/or Node.
