Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tinybase: New Input/Output, Renamed Executors, Storage/Directory brokering by Project #949

Merged
merged 47 commits into from
Feb 2, 2024

Conversation

pmrv
Copy link
Contributor

@pmrv pmrv commented Jan 8, 2024

Supersedes #948 and #936.

  • Input and output classes are based on dataclasses now; this also changed how the Output is generated by tasks. Previously they got an output instance injected and were expected to populated it, now they are expected to return it. This simplified some of the "workflow" nodes like the SeriesTask and makes it e.g. possible to run Murnaghan calculations with static, minimizing and molecular dynamics quite generically.
  • Executors are called Submitters now and work with anything that follows the concurrent.futures interface. In the future I'll likely completely remove them.
  • I added a new storage interface based on h5io directly and added new project implementations that e.g. store everything in a single file.
  • Tasks are now aware of their working directory and ask for one from a Project. This allows the project to fully control filesystem access of its jobs.

@pmrv pmrv added enhancement New feature or request format_black Invoke a black formatting commit labels Jan 8, 2024
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

github-actions bot commented Jan 8, 2024

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_contrib/tinydata

@pmrv
Copy link
Contributor Author

pmrv commented Jan 8, 2024

@liamhuber The details of the bug I mentioned are described here. Since in the snippet we discussed you used getstate/setstate on your own outside of h5io, it might work for you anyway though.

@pmrv
Copy link
Contributor Author

pmrv commented Jan 8, 2024

The error now is related to pip/setuptools, googling around this can be caused when some dependencies are not compatible to the latest version of setuptools. In fact downgrading it seems to resolve the issue locally for me, but no idea what "not compatible" actually means in this context and how to fix it for the CI.

@liamhuber I should hope that if you just check this out locally and install the dependencies with conda/mamba it should work. :|

.binder/environment.yml Outdated Show resolved Hide resolved
@liamhuber liamhuber mentioned this pull request Jan 15, 2024
@liamhuber
Copy link
Member

Question: Why are the windows tests so slow?

Windows takes 12 minutes and counting, while other OSs take <5 minutes. I compared the windows log to a couple ubuntu logs, and the actual tests themselves (55 OK + 1 skipped in all cases) do take longer, but we're talking 8s instead of 5s -- insignificant.

I didn't see where the actual time is being eaten, but it must be somewhere in the env setup since the tests themselves are OK.

Ultimately, I find this acceptable. The windows tests are annoyingly slow, but it's something like 3x not 10x, and the windows operation times are not horrible (at least in what is unit tested). It's a bit puzzling, but I can live with it.

@liamhuber
Copy link
Member

liamhuber commented Jan 16, 2024

Question: why do the notebooks hang?

These are running 20m and going right now and I've seen them go longer, so I believe it's a full hang. I'm updating my local env and will try the notebooks out locally and edit this comment with whatever data I get.

EDIT:

Unfortunately they all fail. I can't say for sure what is code failure and what is just a mismatch in actual vs expected environment (which could be because I didn't make sure everything's pointed to the specified version, or because tinybase is missing some specifications).

The good news is that I did discover that the ASE notebook is not just failing, but actually hanging! In the short term I'll stack a branch with that notebook excluded, so at least we'll be able to see how the rest fare.

Results of running the notebooks locally:

ASE:

ASE

Basic:

basic

LAMMPS:

lammps

Shell:

shell

I actually very quickly played with this, and the trouble is that sh.input.check_ready() fails because the working directory hasn't been specified. So this looks like a legit problem with the code/example, but at least sounds very easy to fix.

TinyJob:

Head

tinyjob_head

Tail

tinyjob_tail

@liamhuber liamhuber mentioned this pull request Jan 16, 2024
@liamhuber
Copy link
Member

Omitting the ASE notebook over on #963 did indeed allow the rest of the notebook tests to complete ok, and quickly.

The Basic, Lammps, and Shell notebooks all fail exactly as shown above. Most of these look to me like they're just hiccups in either the code or example calls, and that providing missing data, arguments, etc. will clean things up.

Interestingly, the TinyBase notebook ran totally fine on the CI. I don't have any knee-jerk response to the error message I got and am not sure why this is. It is also not high-priority for me to figure it out.

Regarding the ASE notebook, the pickling error above smells a lot like pickle's difficulties with serializing stuff defined inside a jupyter notebook.

@pmrv, I don't plan to patch any of these problems imminently, but if I do get to it then it will be in a stacked PR and I'll open it as a draft early in the process so we don't duplicate any work. In the meantime I don't want to migrate the submodule from here to the workflows while the demos are broken.
Immediately, I will just proceed on the current path of delaying the contrib import until save-time, and may spoof Storable as we discussed. In the near-term, we have the option of migrating just the bits that are needed right now (i.e. the storage module) and migrating the rest (database interface abstractions, etc.) later as needed. I have mixed feelings about this as (a) I want the stuff, but (b) breaking it apart has a good chance of causing more headaches than it solves. Opinion?

@pmrv
Copy link
Contributor Author

pmrv commented Jan 17, 2024

Thanks for the detailed check! Sounds like I was careless and only checked the working directory changes on TinyJob.ipynb. Will update next week when we're back.

pmrv and others added 4 commits January 26, 2024 16:36
We previously set the AbstractTask.context.working_directory in execute automatically,
if not set before.  But this can break re-usability of a task, so do not do it for now.
@pmrv
Copy link
Contributor Author

pmrv commented Jan 27, 2024

Input Notebook: notebooks/tinybase/Shell.ipynb

AttributeError                            Traceback (most recent call last)
Cell In[5], line 1
----> 1 out.stdout

AttributeError: 'NoneType' object has no attribute 'stdout'

@liamhuber
Copy link
Member

...
Input Notebook:  notebooks/tinybase/Shell.ipynb
...
---------------------------------------------------------------------------
Exception encountered at "In [5]":
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[5], line 1
----> 1 out.stdout
...

@liamhuber
Copy link
Member

...
Input Notebook: notebooks/tinybase/Shell.ipynb
...

Exception encountered at "In [5]":

AttributeError Traceback (most recent call last)
Cell In[5], line 1
----> 1 out.stdout
...

This was because the shell task now requires the working_directory input, so the task never ran because its input was incomplete

@liamhuber liamhuber merged commit 6efcef4 into main Feb 2, 2024
15 of 16 checks passed
@liamhuber liamhuber deleted the tinydata branch February 2, 2024 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request format_black Invoke a black formatting commit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants