Resume and reboot modes #74
That Travis build failed because the
I think we can safely force the use of python2.7, on the assumption that it will be installed everywhere that we want to run.
The design sounds sane. Now I will inspect the code.
# This script is executed at the end of each multiuser runlevel.
#

nohup sudo -H -u krun python krun.py --resume --reboot /home/krun/krun/examples/example.krun
Do we really want nohup? That means all krun output is going to disk twice.
I'm not sure, but the script needs to `exit 0` to work with the init framework (I need to test this later today).
Usually `nohup` will write stdout to `nohup.out`, you see.
@@ -86,6 +86,30 @@ $ PYTHONPATH=../ ../krun.py example.krun
You should see a log scroll past, and results will be stored in the file:
`../krun/examples/example_results.json.bz2`.

## Running in reboot and resume modes

krun can resume an interrupted benchmark by passing in the `--resume` flag:
Say something about the granularity of this feature? I.e. executions.
This now works on my machine, with some caveats:

If you use

yes, but if I run krun from the command line as a non-root user it asks me for a sudo password. I don't think it is doing an
This is a basic check that benchmarking has been resumed on "the same" platform that the benchmark was started on.
Resume mode removes jobs from the schedule that have already been executed and adds old data to the set of results.
Log name either based on current time (ordinary run) or mtime of config file (resume mode).
Information provided in audits differs between platforms.
ETA emails are sent. Fixed existing error in documentation.
Appends logs to /var/log/rc.local.log. Linux only.
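The log-naming rule from the commit messages above (current time for an ordinary run, config file mtime for a resumed run) might look something like the sketch below. The function name `log_filename`, the timestamp format, and the `.log` suffix are assumptions made for illustration, not taken from krun itself.

```python
import os
import time


def log_filename(config_path, resume):
    """Build a log file name for a run (hypothetical helper).

    Ordinary run: stamp the name with the current time.
    Resume mode:  stamp it with the config file's mtime, so every
                  resumed session maps to the same log file.
    """
    if resume:
        timestamp = os.path.getmtime(config_path)
    else:
        timestamp = time.time()
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(timestamp))
    base = os.path.splitext(os.path.basename(config_path))[0]
    return "%s_%s.log" % (base, stamp)
```

Using the config file's mtime means the name stays stable across reboots, as long as the config file is not modified in between.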
if len(self) == 0:
    debug("krun started with an empty queue of jobs")

if not resume:
Logic here is wrong?
…ally starts Krun on boot. Improvements to error messages: if the output file does not exist, don't tell the user it isn't a regular file. Only wait for network when --started-by-init. --dry-run now simulates time.sleep.
…r. These show that a --reboot makes progress through the schedule, and should help prevent an infinite reboot loop.
This PR adds two new command-line switches to krun: `--resume` and `--reboot`.

In resume mode, krun looks for an existing set of results. If one is found, krun first checks that the current platform is (approximately) the same as the platform detailed in the results file. If this test passes, the schedule is built and executions which have already been run are removed from the job queue. Old results are added to the current job scheduler, which means the JSON results file can be dumped (rather than appended to), as before.
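As a rough illustration of the resume steps described above, here is a minimal sketch. It is not krun's actual code: the helper names `platform_matches` and `resume_schedule`, the audit keys, and the layout of the results dictionary are all assumptions.

```python
from collections import deque


def platform_matches(current_audit, saved_audit, keys=("uname", "cpuinfo")):
    """Approximate check: compare only a few audit fields (hypothetical keys)."""
    return all(current_audit.get(k) == saved_audit.get(k) for k in keys)


def resume_schedule(schedule, old_results, current_audit):
    """Drop already-run executions from the schedule and carry over old data.

    schedule:     iterable of job keys, one entry per execution
    old_results:  {"audit": {...}, "data": {job_key: [execution_result, ...]}}
    Returns (remaining_jobs, results), where results starts with the old data.
    """
    if not platform_matches(current_audit, old_results["audit"]):
        raise RuntimeError("results were collected on a different platform")

    # How many executions of each job have already completed.
    done = {key: len(execs) for key, execs in old_results["data"].items()}

    remaining = deque()
    for job in schedule:
        if done.get(job, 0) > 0:
            done[job] -= 1  # this execution already ran; skip it
        else:
            remaining.append(job)

    # Old results stay in memory so the whole file can be dumped again.
    results = {key: list(execs) for key, execs in old_results["data"].items()}
    return remaining, results
```

Because the old data is merged into the in-memory results, each later dump rewrites the complete file, which matches the "dumped rather than appended to" behaviour described in the PR.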
Under reboot mode, every time an execution finishes, krun runs a reboot command which is defined with the platform definition. (Currently krun just prints the command out, to ease testing; this needs to be fixed before merging.)
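The reboot step could be sketched as below. This is an illustration under assumptions: the function `finish_execution`, its parameters, and the example command are hypothetical, and skipping the reboot when the queue is empty is an assumed safeguard rather than something this PR states.

```python
import subprocess


def finish_execution(remaining_jobs, reboot_cmd, dry_run=True):
    """Decide what to do after one execution has finished in reboot mode.

    remaining_jobs: number of executions still in the queue
    reboot_cmd:     platform-specific command as a list (hypothetical example:
                    ["sudo", "reboot"])
    dry_run:        if True, only print the command, as the PR currently does
    Returns the command that was (or would be) run, or None if nothing is left.
    """
    if remaining_jobs == 0:
        # Assumed safeguard: with an empty queue there is no reason to reboot.
        return None
    if dry_run:
        print("would run: " + " ".join(reboot_cmd))
    else:
        subprocess.call(reboot_cmd)
    return reboot_cmd
```

Keeping the dry-run path as the default makes the behaviour easy to exercise in tests without actually rebooting the machine.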
JSON-related code has been refactored into `krun/util.py`. A very basic test suite has been added to `krun/tests/`. The documentation in `examples/README.md` has been updated.

Different platforms have different conventions for starting a program on boot. An example `rc.local` file has been added to `etc/`.

Fixes #41
Fixes #54