New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scripts: New -continue implementation #713

Open
Lestropie opened this Issue Jul 9, 2016 · 3 comments

Comments

Projects
None yet
2 participants
@Lestropie
Copy link
Member

Lestropie commented Jul 9, 2016

As first mentioned in #672.

The current implementation of the -continue option is relatively simple. It just tells the script to use an existing directory rather than creating a new one, and to skip runCommand() calls until a particular string (most likely a file name) appears in the command string, after which point it runs all script commands as normal.

It may now be possible to provide a more robust solution:

  1. A new script, called e.g. resume, would be provided. The sole input parameter to this script would be the location of the temporary directory of a previous script execution.
  2. From the contents of this directory, this script would be able to determine the script that was executed (along with all of its arguments), and the original working directory where the script was invoked.
  3. It would then invoke the relevant script using a command-line flag hidden from users, that would configure that script accordingly.
  4. The invoked script would look at the contents of the command log, and skip any runCommand() calls that were successfully completed during the previous execution. This would also allow users to erase lines from this log that they wish to be redone.
  5. The -force option would be automatically added to any MRtrix command calls, just in case a user has deleted entries from the command log as in point 4 but not erased the relevant created files.

Another little thing that could be added here would be to write the current MRtrix3 version to a text file in the temporary directory, and warn the user if the script is resumed with a mismatched version of MRtrix3.

@maxpietsch

This comment has been minimized.

Copy link
Member

maxpietsch commented Jul 9, 2016

The invoked script would look at the contents of the command log, and skip any runCommand() calls that were successfully completed during the previous execution. This would also allow users to erase lines from this log that they wish to be redone.

The population_template script overwrites the linear transformation files for each iteration. I guess we'll need to change this to make resume work?

It would be useful if resume could pass down changed parameters to the script. I was thinking of splitting population_template into two independent sections: linear,non-linear. This allows people to first run the linear registration, inspect (or manually fix) the result and then resume with the non-linear registration. One could also use this to tweak parameters that do not affect the first part of the script (such as non-linear registration parameters).

@Lestropie

This comment has been minimized.

Copy link
Member Author

Lestropie commented Jul 9, 2016

The population_template script overwrites the linear transformation files for each iteration. I guess we'll need to change this to make resume work?

Yes, the fundamental assumption with the current implementation of -continue is that each file is new, so just looking for that file name in the command string is adequate to trigger disabling the command skip. These proposed changes would be more robust to that since it'd be looking to match the entire command string with entries in the command log from the previous execution, rather than just a substring.

It would be useful if resume could pass down changed parameters to the script.

I presume by this that you mean overriding command-line options set in the initial execution of the script with new ones? I think that should be possible. It would however make comparison of command strings with the contents of the prior command log error-prone.

I was thinking of splitting population_template into two independent sections: linear,non-linear. This allows people to first run the linear registration, inspect (or manually fix) the result and then resume with the non-linear registration.

This could potentially be dealt with by introducing some sort of pause mechanism rather than splitting up scripts? In the case of running on an HPC, the pause would instead disable cleanup and terminate, with the second half being triggered using the continue mechanism. The disadvantage of the continue mechanism (both current and future) is that it's only based on runCommand() calls, and doesn't log / can't trigger on internal Python functions.

One could also use this to tweak parameters that do not affect the first part of the script (such as non-linear registration parameters).

I don't quite see how checking the results of the first half would then cause you to change parameter options that only influence the second half...?

@maxpietsch

This comment has been minimized.

Copy link
Member

maxpietsch commented Jul 9, 2016

The disadvantage of the continue mechanism (both current and future) is that it's only based on runCommand() calls, and doesn't log / can't trigger on internal Python functions.

That should be fine if no python command overwrites a file and if we resort to repeating the last successful runCommand() to make sure the following python code is also rerun.

I don't quite see how checking the results of the first half would then cause you to change parameter options that only influence the second half...?

If one wants to only run the linear registration to check the output or manually overwrite a linear transformation file that could be done with i.e. population_template -run affine -tempdir tmp. Then one could just resume tmp -run syn to finish the template generation. Your suggestion to pause the script after the linear registration or at suspected failure of mrregister (already implemented in the registration_linear_tweaks branch) would also work for this but the advantage of the resume call would be that if one wants to tweak a parameter of the non-linear part, one does not need to rerun the linear registration for each trial:

population_template -run affine -tempdir tmp
cp -r  tmp tmp_trial1
resume  tmp_trial1 -run syn -syn_parameter 1
cp -r  tmp tmp_trial2
resume  tmp_trial2 -run syn -syn_parameter 2

It would however make comparison of command strings with the contents of the prior command log error-prone

It would work if the parameter affects only runCommand(): Just continue at the first runCommand() with the updated parameter that does not match the log. But you are right, I did not think of the case where the parameter change affects the python code that might be well before the last matching runCommand(). This could be solved by adding lines to the log that indicate independent sections of the script and re-running everything since the last unaffected section. Not sure if it is worth the trouble, though.

maxpietsch added a commit that referenced this issue Jul 20, 2016

population_template: added option to pause script if linear registrat…
…ion result is implausible. separate linear transformation folders for each level as discussed in #713
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment