
A first pass at storage #160

Merged
merged 203 commits into from
Feb 17, 2024

Conversation

@liamhuber (Member) commented Jan 10, 2024

@samwaseda, @JNmpi -- saving and loading is working! Needs this branch, and the tinydata branch on contrib.

@pmrv, before I dig deep and start fleshing out tests and examples, I would love to get some high-level feedback from you about the architecture and how I'm (probably mis)using tinybase storage tools.

Design

Uses tinybase.storage.H5ioStorage to save and load nodes (individual function nodes, macros, and workflows). It is written against this branch in contrib, so failing tests are no surprise!

Node (and some of its children) and DataChannel define methods for storing their data in, and restoring it from, an H5ioStorage instance (the DataChannel routine is bespoke -- see the discussion below).

The expectation is that loading happens from an already-live instance of a Node; that means that it has already instantiated its child DataChannel objects and they are live too -- so we are really just populating attributes of live objects.
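As a minimal sketch of that idea -- using a plain dict where the real code reads an H5ioStorage group, and hypothetical names throughout:

class DataChannel:
    def __init__(self, label, owner):
        self.label = label
        self.owner = owner  # live back-reference to the owning node
        self.value = None

class Node:
    def __init__(self, label):
        self.label = label
        self.inputs = {"x": DataChannel("x", owner=self)}  # children go live here

    def from_storage(self, storage):
        # Nothing is constructed here -- the node and its channels already
        # exist, so loading just writes stored data back onto live objects
        for label, channel in self.inputs.items():
            channel.value = storage["inputs"][label]

node = Node("my_node")
node.from_storage({"inputs": {"x": 42}})  # populates the live channel
assert node.inputs["x"].value == 42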

Information about channel connections is stored at the Composite level, and these are recreated after all children are available.
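Sketched out, with illustrative helper names and channel-panel shapes (and a dict standing in for the HDF group), the connection round-trip looks roughly like this:

def save_connections(composite, storage):
    # Connections are recorded as plain string labels, never as objects
    storage["connections"] = [
        (out.owner.label, out.label, inp.owner.label, inp.label)
        for node in composite.children.values()
        for inp in node.inputs.values()
        for out in inp.connections
    ]

def restore_connections(composite, storage):
    # Runs only after all children (and their channels) are live again
    for out_node, out_label, in_node, in_label in storage["connections"]:
        upstream = composite.children[out_node].outputs[out_label]
        composite.children[in_node].inputs[in_label].connect(upstream)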

Macros instantiate their children at their own instantiation time, but Workflows need to look at the stored data (package identifier and node class), re-register packages, and re-instantiate their child nodes. Then these nodes can be restored from HDF as usual.
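A rough sketch of that Workflow-specific step (storage keys and helper names are assumptions, and the real registration goes through the package mechanism rather than a bare import):

from importlib import import_module

def restore_workflow_children(wf, storage):
    for label, child_data in storage["nodes"].items():
        # The stored package identifier and class name are enough to get the
        # node class back without the user re-registering anything by hand
        module = import_module(child_data["package_identifier"])
        node_class = getattr(module, child_data["class_name"])
        node = node_class(label=label, parent=wf)  # re-instantiate, live
        node.from_storage(child_data)  # then restore it like any other node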

We delay interaction with pyiron_contrib until the last possible moment, as the import still takes forever.
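I.e., the import is pushed down into the method that actually needs it (the exact import path here is an assumption):

class Node:
    def save(self):
        # Paid only when saving, so `import pyiron_workflow` stays snappy
        from pyiron_contrib.tinybase.storage import H5ioStorage
        ...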

Features

  • save() and load() interface available at the Node level -- works for function nodes, macros, and workflows
  • Newly created nodes will try to load themselves from file, if a matching save file exists

It works like this:

from pyiron_workflow import Workflow

Workflow.register("pyiron_workflow.node_library.atomistics", "atomistics")

wf = Workflow("ev_curve", overwrite_save=True)

wf.structure = wf.create.atomistics.task.Bulk("Al")
wf.calculator = wf.create.atomistics.calculator.Emt()

wf.ev = wf.create.atomistics.macro.EnergyVolumeCurve(
    structure=wf.structure, 
    calculator=wf.calculator,
)

out = wf()

wf.save()
out.ev__result_dict["bulkmodul_eq"]
>>> 39.494306059591686

Then we can come back later, in a fresh python interpreter (with the same cwd so we can find the save file), and do this:

from pyiron_workflow import Workflow

wf = Workflow("ev_curve")

wf.outputs.ev__result_dict.value["bulkmodul_eq"]
>>> 39.494306059591686

Note that at load-time we didn't need to register any packages -- everything we needed was found in the stored data and registered automatically.

Shortcomings

  • Children of Composite and children of Function need to have macro_creator and node_function defined as class methods, not passed at instantiation, or I'm 99% sure saving/loading will break. This is not too bad, since all the nodes created by the decorators fulfill this automatically, and that's both our preferred and most common way of doing things. In fact it's arguably a non-issue: if you define a new Node instance and save it under the same name, you'll indeed get garbage, but IMO that is just a special case of changing the source code between saving and loading. If you set up the same node instance both times, it saves and loads just fine because it has the necessary function again anyhow (see the sketch after this list).
  • If you change source code for any nodes between saving and loading, all bets are off
  • To be saved, all of a Workflow's child nodes need to be created from a package (wf.create....) -- this is so they get a package_identifier and the loading workflow can re-instantiate them. This is not too much of a pain right now since our package tracking is pretty sloppy -- a package is just a module -- so you just need to move nodes from your Jupyter notebook over to a .py file somewhere in your python path and register them as usual prior to saving.
  • There is no filtering of any type right now -- all the IO data gets saved to full depth
  • If you modify a Macro instance in your notebook (e.g. by changing its internal connections, or replacing an internal node), you'll still be able to save it, but loading it will probably just break
  • The interaction with the storage uses a lot of by-hand string keys that need to stay in sync between saving and loading, and what gets saved is all manually enumerated; both of these feel like very brittle implementation choices to me
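To illustrate the first shortcoming above, here is the difference in miniature (everything here is a hypothetical stand-in for the real decorator machinery):

class Function:
    node_function = None  # decorator-built subclasses define this on the class

    def __init__(self, node_function=None):
        if node_function is not None:
            # Passed at instantiation: this function lives only in the current
            # interpreter session and cannot be recovered from a save file
            self.node_function = node_function

def as_function_node(func):
    # The decorator bakes the function into the *class* itself, so simply
    # re-importing/re-registering the module rebuilds a functional instance
    # before any stored data is loaded into it
    return type(func.__name__, (Function,), {"node_function": staticmethod(func)})

@as_function_node
def AddOne(x):
    return x + 1

safe = AddOne()                    # restorable: the class carries its function
risky = Function(lambda x: x + 1)  # saves fine, but loads as garbage in a new session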

TODO:

  • High-level design review (@pmrv)
  • Add more functionality
  • TESTS
  • Docs
  • examples
  • ~~Wait for tinybase tools used to be available on conda~~ Git-copy it to this repo instead (and later, per above)

Closes #153

@liamhuber liamhuber requested a review from pmrv January 10, 2024 22:26

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_workflow/simple_storage

@liamhuber liamhuber added format_black trigger the Black formatting bot and removed format_black trigger the Black formatting bot labels Jan 10, 2024
@pmrv (Contributor) left a comment

I didn't have time today to read in depth, but will do so before our meeting tomorrow.

One thing that occurs to me right now: to_storage/from_storage is very similar to the Storable interface. I guess implementing restore would be more difficult in your setup because you wanted the objects to be live before populating from storage, but it would make other things much easier (no need to keep track of class names in storage, it already saves a simple version tag). If this is a blocker I would rather discuss what needs to change in Storable to enable it to be used here.

@liamhuber (Member, Author)

> I didn't have time today to read in depth, but will do so before our meeting tomorrow.

Super. Indeed, if the academic agenda permits, spending time talking about this stuff live would be both enjoyable and helpful!

> One thing that occurs to me right now: to_storage/from_storage is very similar to the Storable interface. I guess implementing restore would be more difficult in your setup because you wanted the objects to be live before populating from storage, but it would make other things much easier (no need to keep track of class names in storage, it already saves a simple version tag). If this is a blocker I would rather discuss what needs to change in Storable to enable it to be used here.

So there are two fundamental objects we're storing: DataChannel and Node. The channels are owned by nodes, and take their parent node as an arg at instantiation time. This sort of reciprocal link is super convenient when everything is alive, and making it a mandatory arg is perfectly sensible under those conditions, but it's hell for storage. So, for DataChannel I think we're stuck defining some sort of bespoke data-extraction routine. Thankfully this is so far quite simple.
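In miniature, the problem is that a Storable-style classmethod constructor has nothing to pass for the mandatory owner, whereas the bespoke in-place route side-steps it entirely (illustrative names again):

class DataChannel:
    def __init__(self, label, owner):
        self.label = label
        self.owner = owner  # mandatory reciprocal link -- great live, hell for storage
        self.value = None

    @classmethod
    def restore(cls, storage):
        # A construct-from-storage route dead-ends here: there is no live
        # owner available to satisfy the constructor
        raise TypeError("cannot rebuild a channel without its owning node")

    def from_storage(self, storage):
        # The bespoke route: `self` is already alive and already knows its
        # owner, so only the data needs to be read back
        self.value = storage["value"]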

For Node I tend to agree with you. I did try sticking Storable into the MRO, but ran into two problems, both of which are probably resolvable:

  1. I just couldn't get it to do what I wanted. I really suspect this is from lack of trying/insufficiently deep understanding of Storable and surrounding classes, and probably not a fundamental problem. We're doing all sorts of nonsense with decorators and dynamically generated classes, but I'm still cautiously optimistic the only real problem was my idiocy.
  2. Importing from contrib is too damned slow, and if Node inherits from it directly then I have to import at boot time and can't delay it until I go out of my way to call .save(). This is resolvable by getting tinybase out on its own somewhere, so it is only a short-term barrier. But I don't want to merge and release a solution that requires the "pyiron" boot time, as right now getting up and running with workflows is pretty snappy.

codacy-production bot commented Jan 18, 2024

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation: -2.76% (target: -1.00%)
Diff coverage: 35.48%

Coverage variation details:
                                   Coverable lines   Covered lines   Coverage
Common ancestor commit (cc93042)   2591              2213            85.41%
Head commit (eda307a)              2744 (+153)       2268 (+55)      82.65% (-2.76%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details:
                      Coverable lines   Covered lines   Diff coverage
Pull request (#160)   155               55              35.48%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%


@liamhuber liamhuber added the format_black trigger the Black formatting bot label Feb 15, 2024
@liamhuber (Member, Author)

Could not solve for environment specs
The following package could not be installed
└─ pyiron_contrib 0.1.15**  does not exist (perhaps a typo or a missing channel).

It does though. Something on the internet must just not have updated yet. I'll try again in a few minutes.

@liamhuber (Member, Author)

> Could not solve for environment specs
> The following package could not be installed
> └─ pyiron_contrib 0.1.15**  does not exist (perhaps a typo or a missing channel).
>
> It does though. Something on the internet must just not have updated yet. I'll try again in a few minutes.

0.1.15 is the most recent version showing on anaconda.org, but 0.1.14 is the most recent on my local machine using conda search -c conda-forge -f pyiron_contrib, so I guess conda is just updating different parts of itself at different times and this is not GitHub's fault at all.

@liamhuber (Member, Author)

The "rerun failed jobs" button for the CI report and "rerun job" for individual jobs are both doing nothing, so I am going to close and reopen...

@liamhuber (Member, Author) commented Feb 16, 2024

There's some sort of pathing issue going on in the CI environment compared to my local machine; everything fails at the first node instantiation with errors of the form:

======================================================================
ERROR: test_run_data_tree (test_node.TestNode)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/share/miniconda3/envs/my-env/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'start'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/tests/unit/test_node.py", line 64, in setUp
    self.n1 = ANode("start", x=0)
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/snippets/has_post.py", line 16, in __call__
    post(instance, *args, **kwargs)
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/node.py", line 362, in __post__
    do_load = self.storage.has_contents
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/storage.py", line 101, in has_contents
    has_contents = self._tinybase_storage_is_there or self._h5io_storage_is_there
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/storage.py", line 150, in _tinybase_storage_is_there
    if os.path.isfile(self._tinybase_storage_file_path):
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/storage.py", line 133, in _tinybase_storage_file_path
    self.node.graph_root.working_directory.path
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/node.py", line 788, in working_directory
    self._working_directory = DirectoryObject(self.label)
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/snippets/files.py", line 41, in __init__
    self.create()
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/pyiron_workflow/snippets/files.py", line 44, in create
    self.path.mkdir(parents=True, exist_ok=True)
  File "/usr/share/miniconda3/envs/my-env/lib/python3.10/pathlib.py", line 1180, in mkdir
    self.mkdir(mode, parents=False, exist_ok=exist_ok)
  File "/usr/share/miniconda3/envs/my-env/lib/python3.10/pathlib.py", line 1175, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'start'

@liamhuber (Member, Author)

Ok, the tests were failing on my local machine as well. It was an issue with the NodeJob stuff interacting with the storage stuff, plus bad merges that I missed because the CI was out and I wasn't careful enough.

It looks like I'll also need to patch all the storage to run only on >=3.11 after all.
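Something along these lines, i.e. gating at the test level (the exact mechanism is an assumption):

import sys
import unittest

@unittest.skipUnless(
    sys.version_info >= (3, 11), "storage is only expected to work on >=3.11"
)
class TestStorage(unittest.TestCase):
    def test_round_trip(self):
        ...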

codacy-production bot

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation: +1.31% (target: -1.00%)
Diff coverage: 90.86%

Coverage variation details:
                                   Coverable lines   Covered lines   Coverage
Common ancestor commit (344d1c4)   2596              2218            85.44%
Head commit (2a887dc)              3065 (+469)       2659 (+441)     86.75% (+1.31%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details:
                      Coverable lines   Covered lines   Diff coverage
Pull request (#160)   514               467             90.86%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%


@liamhuber liamhuber marked this pull request as ready for review February 17, 2024 06:06
@liamhuber liamhuber merged commit 7fcae5f into main Feb 17, 2024
15 of 16 checks passed
@liamhuber liamhuber deleted the simple_storage branch February 17, 2024 06:06
Labels
format_black trigger the Black formatting bot

Successfully merging this pull request may close these issues: Storage: node and channel data should be able to be stored in HDF