Specify version number for installed software #32
@arisp99 I tried specifying versions before, but it did not work well. In almost all cases, if you have such a big software list and try to specify versions, they will not be compatible. If versions are not specified, mamba does a good job of resolving dependencies and installing the packages. The idea that specifying versions makes each build exactly the same, and hence more reproducible, is very appealing, but it did not work for me before. What do you think? Do the advantages outweigh the troubles in your opinion?
So, with a new build you initially let conda resolve, but when you freeze a version, can you lock the versions of the underlying code (or at least dump a list that we then add so someone could force the versions)? This would provide real reproducibility. The other option is to be able to build proprietary stuff like bcl2fastq outside and then add it to another directory.
That is a very valid point @aydemiro. I hadn't thought of that. But you are correct, I can totally see instances when you want to update a package and then you have dependency conflicts. To be honest, I am usually more of a fan of updating software, as there are new features and bug fixes that can be useful. Thinking about what @JeffAndBailey proposed, I see that you can install packages using a We could even have a check in the definition file to see if Some quick questions thinking about this more:
Let's see if we can download and build bcl2fastq externally and install it as a working version with any needed libraries or accessory files. If that is possible, then our fixed builds sans bcl2fastq will be fine for reproducibility.
I was planning to move the conda installation to an environment based system where we have an environment.yml file for the base environment in the repository, instead of listing all packages without the versions in the definition file. We can then employ something like this:
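A minimal sketch of such an environment-based install, assuming an unpinned `environment.yml` at the repository root. The file contents, channel list, and package names below are placeholders rather than the project's actual software list, and the helper names `write_env_file` and `install_and_freeze` are hypothetical.

```shell
#!/bin/sh
# Write a hypothetical, unpinned environment file.
# The channels and package names are placeholders, not the real MIPTools list.
write_env_file() {
    cat > "$1" <<'EOF'
name: base
channels:
  - conda-forge
  - bioconda
dependencies:
  - samtools
  - bwa
EOF
}

# Install from the file, then record the versions the solver chose.
# Assumes mamba and conda are on PATH, as they would be inside the build.
install_and_freeze() {
    mamba env update --name base --file "$1"
    conda env export --name base > "$2"
}

write_env_file environment.yml
# Inside the container build one would then run, for example:
# install_and_freeze environment.yml environment_versioned.yml
```

Leaving the input file unpinned lets the solver find a compatible set of versions, while the exported file records what was actually installed so that a later build can be forced to the same versions.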
As for the bcl2fastq issue, I agree that we should explore building the software outside and providing the binary to the container as a binding. However, this is a compiled C++ program, and how to create a portable binary is beyond my capabilities at the moment. Nick is probably the best person to consult on this.
This seems similar to just using a
Yes! Yes looks right. So hashing this out a bit further, in our %files
    # could be either requirements or environment
    environment* /opt/conda
Then as you write:
Lastly, in the
    cp ${SINGULARITY_ROOTFS}/opt/conda/environment_versioned.yml environment_versioned.yml
I agree with all this re the bcl2fastq installation. It would be awesome if you could just plop the binary into the container. I think that it makes sense to address this as a separate issue for now, as it seems a bit complex... For now, let's try to finalize if we want a
I explored this question a bit more and it seems that an
After installation, we save an `environment_versioned.yml` file that contains all the installed versions of our software.
I have now configured MIPTools to install mamba packages using an environment file. In the definition file, we first check to see if an One important thing to note is that we are actually unable to copy files to the host during the building of our container. The singularity exec miptools.sif cat /opt/environment_versioned.yml > environment_versioned.yml and that if this
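The check described above can be sketched as a small shell helper. This assumes the check is for a pinned `environment_versioned.yml`, which, when present, takes priority over the unpinned `environment.yml`. The function name `select_env_file` and the directory argument are hypothetical and not taken from the definition file.

```shell
#!/bin/sh
# Print the environment file to install from.
# Assumption: a pinned file, when present, wins over the unpinned one.
select_env_file() {
    dir="$1"
    if [ -f "$dir/environment_versioned.yml" ]; then
        printf '%s\n' "$dir/environment_versioned.yml"
    else
        printf '%s\n' "$dir/environment.yml"
    fi
}

# Possible use during the build (not executed here):
# mamba env update --name base --file "$(select_env_file /opt/conda)"
```

With this arrangement, a first build resolves versions freely; once the versioned file has been copied out of the container and placed next to the definition file, later builds reuse the same versions.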
@aydemiro and @JeffAndBailey, if you have no additional comments, I will go ahead and merge this PR early next week.
This PR specifies the version number for most of the installed software in the MIPTools container.
Closes #31.
Checklist for software
* Installed via apt-get
* Installed via git
  * msa2vcf (through jvarkit)
  * vt
  * MIPWrangler
  * elucidator
* Installed via wget
  * conda (via miniconda)
    With miniconda, the version of conda is automatically updated. However, as we use mamba as our package manager, it is fine to leave this as is. We do specify the version number for mamba.
* Installed via install.packages()
  * magrittr
  * McCOILR
  * rehh
* Installed via conda
  * mamba
* Installed via mamba