Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s.resource_list sometimes grows without bounds #452

Closed
pmrv opened this issue Nov 30, 2021 · 6 comments · Fixed by #453
Closed

s.resource_list sometimes grows without bounds #452

pmrv opened this issue Nov 30, 2021 · 6 comments · Fixed by #453
Labels
bug Something isn't working

Comments

@pmrv
Copy link
Contributor

pmrv commented Nov 30, 2021

Summary

I have a long running kernel (~1000 cells, a few thousand jobs started) and noticed that simply creating new jobs starts to get really slow (~3min for a Vasp job), because somewhere Settings.resource_list gets spurious entries of the iprpy path, like this

['/cmmc/u/zora/pyiron/resources',
 '/cmmc/u/system/SLES12/soft/pyiron/dev/pyiron-resources-cmmc',
 '/cmmc/u/system/SLES12/soft/pyiron/dev/anaconda3/share/pyiron',
 '/u/system/SLES12/soft/pyiron/dev/anaconda3/share/iprpy',
 '/u/system/SLES12/soft/pyiron/dev/anaconda3/share/iprpy',
... # 2000 more entries
]

pyiron Version and Platform

I'm on the cluster on pyiron_base version 331642e8, pyiron_atomistic at #444.

I can't reproduce it for normal Vasp jobs, but I suppose one of the job classes adds the iprpy path without checking whether it is really there and that then accumulates. I'll have a grep later to see if I can find it.

@pmrv pmrv added the bug Something isn't working label Nov 30, 2021
@liamhuber
Copy link
Member

Ahh, good catch. I bet this line should rather read if os.path.exists(res_path) and res_path not in self.resource_paths:

btw, you mean Settings.resource_paths, right?

@pmrv
Copy link
Contributor Author

pmrv commented Nov 30, 2021

Ahh, good catch. I bet this line should rather read if os.path.exists(res_path) and res_path not in self.resource_paths:

I see that this happens for Lammps jobs (each job adds 2-3 entries), but not for Vasp, so it must be simulation code specific somewhere. So while this probably doesn't cause this issue, we should still fix this one in base.

I think the offending line is here

resource_path_lst += [os.path.join(env[conda_var], "share", "iprpy")]

and will make a PR.

btw, you mean Settings.resource_paths, right?

Ah, yes. :/

@liamhuber
Copy link
Member

So while this probably doesn't cause this issue, we should still fix this one in base.

Actually I spoke too quickly. I was worried that repeat calls of state.update would see the conda paths appended repeatedly, but actually update starts by giving us a clean slate, so it's safe to append each time anyhow.

I think the offending line is here

Yep, I think you're spot on. I actually really don't like this line. First, appending to the config deep in the bowels of another module is dangerous (and we see the result of that now here). Second, I worry that we'll get OS dependent behaviour since this code doesn't force the path to be posix but the configuration is expecting posix paths.

The entire config paradigm right now is an all-or-nothing setup, so tweaking a value like this does not play well with the rest of the code. What are the odds we can just gut this section entirely and force people to include the path in their config if they really want it?

@pmrv
Copy link
Contributor Author

pmrv commented Dec 1, 2021

I think the offending line is here

Yep, I think you're spot on. I actually really don't like this line. First, appending to the config deep in the bowels of another module is dangerous (and we see the result of that now here). Second, I worry that we'll get OS dependent behaviour since this code doesn't force the path to be posix but the configuration is expecting posix paths.

The entire config paradigm right now is an all-or-nothing setup, so tweaking a value like this does not play well with the rest of the code. What are the odds we can just gut this section entirely and force people to include the path in their config if they really want it?

I'd guess having to define the iprpy path manually on every install is going to be a pain. We could merge this bit to be on lammps module import maybe and offer better API from base, so that downstream module cannot mess with the resource path directly, only call a function that adds a path (and then converts/checks if it's in there already).

@pmrv
Copy link
Contributor Author

pmrv commented Dec 1, 2021

I'll merge the fix now, but we should talk about this issue again on Monday.

@liamhuber
Copy link
Member

I'd guess having to define the iprpy path manually on every install is going to be a pain.

Unfortunately, I do agree here...

...and offer better API from base...

I'll merge the fix now, but we should talk about this issue again on Monday.

These sound like good plans. We definitely need to discuss it as it's a distinct paradigm shift from our all-or-nothing updates. At the end of the day I'm not really in love with our current setup, and allowing a more construction-based approach where we layer changes together (thus allowing appending and/or updating individual changes) is a-ok by me.

@pmrv pmrv closed this as completed in #453 Dec 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants