Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Adding a nnpdf backend #57

Merged
merged 11 commits into from
Feb 10, 2020
Merged

[WIP] Adding a nnpdf backend #57

merged 11 commits into from
Feb 10, 2020

Conversation

scarlehoff
Copy link
Owner

@scarlehoff scarlehoff commented Jan 26, 2020

Nowadays I'm only using slurm but I am using it in several different systems so might as well add it as a public backend (even if I think I will be the only one using it for now :__)

Todo:

  • Check that the input works
  • Generate the runfolder before sending the job
  • Check the input and try to guess the necessary memory requirement

# Fail without piety
raise ValueError("There are replicas in the range considered")

def include_arguments(self, current_arguments):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check I understand - if you ever wanted to run NNPDF with Arc or DIRAC, would your include_arguments lead to command-line arguments for your executable like --memsize X?

@marianheil and I were discussing that supporting arbitrary arguments across the board might be quite useful, especially if more programs are going to be added to pyHepGrid, so we don't have to add all programs' variables to all users' headers. But it looks like they'd be ignored for SLURM, unless specifically added as a {key} to the .sh template, but for Arc/Dirac they'd be parsed and passed as additional --X Y command-line flags.

Ideally, for those programs we intend to be able to run on all the Backends, each runcard should lead to the same command-line arguments being passed regardless of which Backend it's being submitted to.

Any idea how we could both manage that, and arbitrary arguments? (Probably related to #17 )

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To check I understand - if you ever wanted to run NNPDF with Arc or DIRAC, would your include_arguments lead to command-line arguments for your executable like --memsize X?

Yes. Although it will fail because I'm trusting in the existence of some arguments that are there only for slurm, but yes, that's the idea.

@marianheil and I were discussing that supporting arbitrary arguments across the board might be quite useful, especially if more programs are going to be added to pyHepGrid, so we don't have to add all programs' variables to all users' headers. But it looks like they'd be ignored for SLURM, unless specifically added as a {key} to the .sh template, but for Arc/Dirac they'd be parsed and passed as additional --X Y command-line flags.

There are a few things that are going to be always backend dependent. In principle it might be possible to do it more general but disentangling all of that would be quite a task. I've tried in the past and got bored before achieving anything :__

In any case, this is particularly true for slurm. I remember we had some problems when giving options through command line arguments so we had to have them hard-coded on the scripts (which lead to having to have several different scripts). Don't remember whether this was on the IPPP batch.

Any idea how we could both manage that, and arbitrary arguments? (Probably related to #17 )

I'd say #58 is not a bad compromise: each of the programs must decide/know how to treat the arguments.

Allow loading custom runmode without modifying `pyHepGrid` Code
This should allow submission/testing from any folder
@marianheil
Copy link
Collaborator

What is the status on here? Is this ready to merge? This MR includes a few extra features which would be nice to have in master.

@scarlehoff
Copy link
Owner Author

Ah, feel free to merge. This is in working condition.
There are a few things I wanted to add but I'm in a conference this week so minimum of 1-2 weeks until I work on this again. I'll push it to a new branch when I do.

util.spCall(["./{0}".format(header.runfile)] + nnlojob_args)
runfile = os.path.basename(header.runfile)
util.spCall(["chmod","+x",runfile])
util.spCall(["./{0}".format(runfile)] + nnlojob_args)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: what happens if header.runfile is not in the local directory? (or is it always?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly what should changed. Before to run test mode you had to have the runfile in your current directory. Now you can also specify an absolute path. In the lines you highlighted we only get the file name the copy happens in:

shutil.copyfile(header.runfile, os.path.join(header.sandbox_dir,
os.path.basename(header.runfile)))

For the actual submission this already worked before, since we just pass header.runfile to Arc/Dirac/Slurm and they can handle absolute and relative paths.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does the os.chdir in setup() persist after it exits?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with "persist"? Within python yes, but it doesn't change your "main shell" from which you are running pyHepGrid. Nothing change in this regard. We also don't have to change back in pyHepGrid after run_test, since pyHepGrid finishes afterwards.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, within python - I assumed it would basically work like a local variable in the setup scope, but it looks like it does persist.

Copy link
Collaborator

@jcwhitehead jcwhitehead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@jcwhitehead jcwhitehead merged commit 9aca48c into master Feb 10, 2020
@jcwhitehead jcwhitehead deleted the nnpdfrun branch February 10, 2020 18:09
@marianheil marianheil added the enhancement New feature or request label Feb 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants