reprod easyconfig does not work for VMD, Tau, MrBayes (and possibly others) #1574

Open
mboisson opened this issue Nov 12, 2018 · 12 comments

@mboisson
Contributor

I am rebuilding a lot of software packages using the .eb file contained in the reprod folder. VMD fails to rebuild with the following error:

== 2018-11-12 12:38:14,171 run.py:192 INFO running cmd:  ./configure  LINUXAMD64 LP64 IMD PTHREADS FLTK TK COLVARS NOSILENT TCL OPENGL MESA NETCDF  PYTHON  ICC  LINUXAMD64 LP64 IMD PTHREADS FLTK TK COLVARS NOSILENT TCL OPENGL MESA NETCDF  PYTHON  ICC
== 2018-11-12 12:38:14,336 build_log.py:158 ERROR EasyBuild crashed with an error (at ?:124 in __init__): cmd " ./configure  LINUXAMD64 LP64 IMD PTHREADS FLTK TK COLVARS NOSILENT TCL OPENGL MESA NETCDF  PYTHON  ICC  LINUXAMD64 LP64 IMD PTHREADS FLTK TK COLVARS NOSILENT TCL OPENGL MESA NETCDF  PYTHON  ICC " exited with exit code 255 and output:
Unknown option 'LINUXAMD64'

Note that the list of arguments appears twice. The culprit seems to be that the .eb file in the reprod folder already contains the list in its configopts, so it ends up duplicated during the rebuild. The .eb file stored in the ebfiles_repo does not contain this variable:

-bash-4.2$ grep configopts /cvmfs/soft.computecanada.ca/easybuild/ebfiles_repo/VMD/VMD-1.9.3-iimkl-2018.3-Python-2.7.14.eb
-bash-4.2$ grep configopts /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/intel2018.3/vmd/1.9.3/easybuild/reprod/VMD-1.9.3-iimkl-2018.3-Python-2.7.14.eb
configopts = " LINUXAMD64 LP64 IMD PTHREADS FLTK TK COLVARS NOSILENT TCL OPENGL MESA NETCDF  PYTHON  ICC "
@mboisson
Contributor Author

Mmm, I think this is actually a problem in the EasyConfig.update method. It appends to the values without checking if what it's appending is already present.
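
To illustrate the behaviour described here, below is a minimal sketch using a toy stand-in class, not EasyBuild's actual `EasyConfig`. The class name `ToyConfig`, its `update` method and the `allow_duplicate` parameter are all assumptions made up for this example; the real method may well behave differently.

```python
class ToyConfig:
    """Toy stand-in for an easyconfig; NOT EasyBuild's real EasyConfig class."""

    def __init__(self, **params):
        self.params = dict(params)

    def update(self, key, value, allow_duplicate=True):
        """Append a string value to a string parameter.

        With allow_duplicate=False (hypothetical flag), the value is skipped
        when it already occurs in the parameter (naive substring check).
        """
        current = self.params.get(key, "")
        if not allow_duplicate and value.strip() in current:
            return current
        self.params[key] = (current + " " + value.strip()).strip()
        return self.params[key]


opts = "LINUXAMD64 LP64 IMD"

# Fresh easyconfig: configopts starts empty, the options are added once.
fresh = ToyConfig(configopts="")
fresh.update("configopts", opts)

# Reprod easyconfig: configopts already holds the options,
# so a blind append duplicates them.
reprod = ToyConfig(configopts=opts)
reprod.update("configopts", opts)

# Same reprod easyconfig, but with the duplicate check enabled.
checked = ToyConfig(configopts=opts)
checked.update("configopts", opts, allow_duplicate=False)
```

A substring check like this is only a rough guard: it would also skip a value that legitimately needs to appear twice, so it is a sketch of the idea rather than a proposed fix.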

@mboisson
Contributor Author

There are other recipes for which the reprod recipe does not actually work.

For MrBayes, the src subfolder is appended to the start_dir twice:

== FAILED: Installation ended unsuccessfully (build directory: /dev/shm/ebuser/avx512/MrBayes/3.2.6/iompi-2018.3.312): build failed (first 300 chars): Failed to change to correct source dir /dev/shm/ebuser/avx512/MrBayes/3.2.6/iompi-2018.3.312/mrbayes-3.2.6/src/src: [Errno 2] No such file or directory: '/dev/shm/ebuser/avx512/MrBayes/3.2.6/iompi-2018.3.312/mrbayes-3.2.6/src/src'

For Tau, the duplicated arguments to the configure script also cause a problem:

== FAILED: Installation ended unsuccessfully (build directory: /dev/shm/ebuser/avx512/TAU/2.27.1/gompi-2018.3.312): build failed (first 300 chars): Specifying additional configure options for TAU is not supported (yet)

@mboisson mboisson changed the title VMD's reprod easyconfig does not work reprod easyconfig does not work for VMD, Tau, MrBayes (and possibly others) Nov 13, 2018
@boegel
Member

boegel commented Nov 13, 2018

cc @ocaisa

@boegel boegel added this to the 3.x milestone Nov 13, 2018
@ocaisa
Member

ocaisa commented Nov 13, 2018

@boegel I saw it :) but I think it is a very hard one to fix. It would seem like the dump method is going to need a ton more logic. We need a good handle on what can actually modify an easyconfig and make the dump method account for that. For example, anything done by an easyblock shouldn't appear in the dump (since with easybuilders/easybuild-framework#2653 they will also get archived) but we still want all modifications done by hooks, the command line and dependency resolution (any more?).

@mboisson
Contributor Author

So far, there is only a handful of recipes for which this is problematic (and I recompiled about 400 different combinations of (software package,version,toolchain)), so we could just fix those easyblocks.

However, issue easybuilders/easybuild-framework#2658 is much more general and has basically the same root cause.

What if a copy of the EasyConfig were kept internally, containing only

  1. what is read from the easyconfig file
  2. what is done by the hooks

and not what is modified by the easyblock? This copy would then be the one that gets dumped.
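
The proposal can be sketched roughly as follows. This is a toy model, not EasyBuild code: the class `ToyEasyConfig` and the methods `set_from_hook`, `set_from_easyblock` and `dump` are hypothetical names invented for illustration, as is the `preconfigopts` value used in the example.

```python
import copy


class ToyEasyConfig:
    """Toy model of the proposal; NOT EasyBuild's real EasyConfig class."""

    def __init__(self, parsed_params):
        # Working copy: easyblocks are free to modify this one.
        self.params = copy.deepcopy(parsed_params)
        # Internal copy: only parsed file contents and hook changes go here.
        self._dump_params = copy.deepcopy(parsed_params)

    def set_from_hook(self, key, value):
        """Change made by a hook: recorded in both copies (hypothetical)."""
        self.params[key] = value
        self._dump_params[key] = copy.deepcopy(value)

    def set_from_easyblock(self, key, value):
        """Change made by an easyblock: working copy only (hypothetical)."""
        self.params[key] = value

    def dump(self):
        """Return what would be written to the reprod easyconfig."""
        return copy.deepcopy(self._dump_params)


ec = ToyEasyConfig({"name": "VMD", "configopts": ""})
ec.set_from_hook("preconfigopts", "export FOO=1 && ")
ec.set_from_easyblock("configopts", "LINUXAMD64 LP64 IMD")
dumped = ec.dump()
```

Here `dumped` keeps the hook's change but still has an empty `configopts`, so a rebuild would not see the easyblock's options twice. The sketch assumes every change can be attributed cleanly to either a hook or an easyblock, which may not hold in practice.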

@ocaisa
Member

ocaisa commented Nov 14, 2018

@mboisson I don't think that is really possible: the hooks have the keys to the kingdom and can be called before or after any of the steps of the easyblock. I was thinking the only way to do it would be to also record the hooks in the reprod directory.

This still doesn't solve easybuilders/easybuild-framework#2658, but there you have something very specific. In that case (--module-only --force), the easiest thing to do is to create another layer of reprod dir inside the existing one; the process to reproduce is then reasonably straightforward. This is very easy to do with easybuilders/easybuild-framework#2653.

@mboisson
Contributor Author

@ocaisa, so how come the hooks are not recorded in the ebfiles_repo repository if they can be called before the easyconfig is done processing its own internal changes?

@ocaisa
Member

ocaisa commented Nov 14, 2018

Basically because the hooks are a fairly recent phenomenon and the implications haven't been thought about too much. The reprod dir was pretty much covering things as it happened, but with this issue we now see that the initial approach may not be good enough.

@mboisson
Contributor Author

Another example I stumbled upon is FFTW. The FFTW easyblock converts configopts from a string into a list, but the reprod recipe already contains a list. We therefore run into the issue that the easyblock tries to concatenate a list with a string.
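
The type mismatch can be reproduced in isolation. The snippet below is a toy illustration and not the FFTW easyblock's code: the helper `as_list` is a hypothetical name, and the option strings are placeholders chosen only for the example.

```python
def as_list(value):
    """Return configopts as a list, whether it was given as a string or a list."""
    if isinstance(value, str):
        return [value]
    return list(value)


extra = "--enable-sse2"  # placeholder option, for illustration only

# Original easyconfig: configopts is a string, so string concatenation works.
from_original = "--with-pic" + " " + extra

# Reprod easyconfig: configopts was dumped as a list, so the same code fails.
try:
    ["--with-pic"] + " " + extra
    failed = False
except TypeError:
    failed = True

# Normalising first gives the same list shape for both kinds of input.
normalised = [as_list("--with-pic"), as_list(["--with-pic"])]
```

Normalising the value before use is one way an easyblock could accept both the original and the reprod form; whether that is the right place for a fix is a separate question.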

@mboisson
Contributor Author

NWChem also has a problem, but it's trickier: the start_dir is set to a temporary directory, which is then saved in the reprod recipe. That temporary directory obviously does not exist when you try to reproduce the build.

@mboisson
Contributor Author

Bundle modules also don't seem to like having sources included in their easyconfig. The FALCON recipe did not rebuild with the reprod recipe.

@ocaisa
Member

ocaisa commented Nov 16, 2018

All these problems reinforce my idea of doing reproducibility slightly differently and avoiding trying to solve these issues one by one. I don't see how you can sanitize out all possible problems, since the easyblocks can do anything to the instance. It's good to have a list of currently failing cases so we can work on creating relevant tests.

@boegel boegel modified the milestones: 3.x, 4.x Feb 20, 2020