add 'continue' functionality (as opposed to the already available 'stop') #203

boegel · 2012-08-29T14:33:34Z

(old internal ticket 241)

Since we have a way to stop at a certain step, it would be nice if we then could continue also.

This way you can debug each step of the configure, build, install and create module process without having to do all of the previous steps again and again.

JensTimmerman · 2012-08-29T16:41:33Z

I think this should be solved by creating the -devel modulefile after each step?

The argument against a continue function is that eventually the .eb file will be committed to the repo and the software installed, but there is no knowing what actually happened in between, and as such the build is not reproducible.

see issue #109

boegel · 2012-08-29T20:57:55Z

Agreed. The step-wise devel modules would offer almost everything a continue function would.

Not completely though: devel modules can only set environment variables, not do things like create/adjust files, create directories, run commands, etc.

But nevertheless, you have a good argument not to implement it; we don't want people to be able to implement easyblocks that still require human intervention and are thus incomplete. Keeping this closed.

fgeorgatos · 2013-12-03T16:05:35Z

continue may be feasible if the intermediate states of a build sequence are saved in a "tar" file or something; that would allow a clean restart without side effects (i.e. it is a form of checkpointing of the build process)
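
To make the tar idea concrete, here is a minimal sketch of snapshotting and restoring a build directory with Python's standard `tarfile` module. The function names `snapshot_builddir` and `restore_builddir` are hypothetical and not part of EasyBuild; the sketch only covers files on disk, not in-memory state or environment variables.

```python
import tarfile
from pathlib import Path


def snapshot_builddir(builddir, archive):
    """Pack the contents of a build directory into a gzipped tar archive.

    Hypothetical helper, not an EasyBuild function.
    """
    builddir = Path(builddir)
    with tarfile.open(archive, "w:gz") as tar:
        for entry in sorted(builddir.iterdir()):
            # Store paths relative to the build directory.
            tar.add(entry, arcname=entry.name)
    return Path(archive)


def restore_builddir(archive, builddir):
    """Unpack a snapshot into a (possibly new) build directory."""
    builddir = Path(builddir)
    builddir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(builddir)
    return builddir
```

A snapshot taken right after a successful step could then be restored before retrying the next one, at the cost of archiving the whole directory each time.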

boegel · 2013-12-03T16:11:00Z

Reopening this, since @wpoely86 was asking for this.

Are devel modules a solution in case one wants to use eb --continue after fixing a bug in the install step, i.e. continue without redoing the build step?

wpoely86 · 2013-12-03T16:12:02Z

@fgeorgatos That would require catching the exceptions.

The use case in which I would like continue functionality is making an easyblock for highly non-standard build systems. It's annoying that you have to start again from scratch if something doesn't work in the install_step.

fgeorgatos · 2013-12-03T22:04:13Z

@wpoely86;
if I understand correctly what you described, a CRIU checkpoint [1] right before the install step might help;
the question is then, how to allow modifications in the install step independently of the rest... I'm puzzled on this.

[1] http://en.wikipedia.org/wiki/CRIU # v1.0 was just released, but the need for a 3.11 kernel is not encouraging; ok, maybe we'll find a better direction...

wpoely86 · 2013-12-04T09:19:36Z

Yeah, something like that but not that complex. I think CRIU is a bit overkill and it's not intended for our purposes. I see no way to change the Python script after a checkpoint.

Anyway, what I want is not that complex (Don't shoot me if it turns out to be very complex 😉):
Restart in the step that failed using all previous successful steps. That would mean: keep the current builddir, keep all mktemp generated paths and files. And store all current variables in the easyblock and the easyconfig. That would do it, no?

So, we would need to store the current status and all file paths etc. in a file before executing a step in the block.
I would only do this if a certain option has been activated (--debug maybe?) and give the checkpoint file/dir at the beginning of the log output.
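
A rough sketch of that idea, assuming the state that matters fits in a JSON-serializable dict. Everything here (`STEPS`, `save_checkpoint`, `run_steps`, the step names) is hypothetical and not EasyBuild's actual API; a real easyblock holds much more state than this.

```python
import json
from pathlib import Path

# Hypothetical step names, for illustration only.
STEPS = ["configure", "build", "install", "module"]


def save_checkpoint(path, step, state):
    """Record which step is about to run, plus the state needed to resume."""
    Path(path).write_text(json.dumps({"next_step": step, "state": state}))


def run_steps(step_funcs, state, checkpoint, resume=False):
    """Run the steps in order; with resume=True, start at the recorded step."""
    start = 0
    if resume and Path(checkpoint).exists():
        saved = json.loads(Path(checkpoint).read_text())
        state.update(saved["state"])
        start = STEPS.index(saved["next_step"])
    for name in STEPS[start:]:
        # Checkpoint *before* the step, so a failure resumes at this step
        # with the state left by the previous successful ones.
        save_checkpoint(checkpoint, name, state)
        step_funcs[name](state)
    return state
```

If the install step raises, the checkpoint file still names `install` as the next step, so a second call with `resume=True` skips configure and build. This only works if the build directory and temporary files are also left in place between runs.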

What do you think? Totally crazy?

boegel · 2013-12-31T14:26:36Z

@wpoely86: I could see some use for that, but making it work might involve quite a bit of work here and there. The current codebase is totally unaware of this restart feature, so you might need to make sure stuff sticks around rather than being cleaned up, etc.
It's not totally crazy imho, but it will involve quite a bit of work I think, and the usefulness may be limited (i.e. it will likely only work in a couple of specific cases, etc.)

JensTimmerman closed this as completed Aug 29, 2012

boegel reopened this Dec 3, 2013

boegel mentioned this issue Jan 2, 2016

Allow a build to restart in the middle #1531

Closed

citibeth mentioned this issue Jan 3, 2016

generated devel module needs work #109

Open

boegel removed the wontfix label Oct 30, 2017
