Adding support for GNU Guix for software installations #2778

Closed
rekado opened this issue Aug 12, 2016 · 12 comments

Comments

@rekado

rekado commented Aug 12, 2016

Hi,

I'm a contributor to GNU Guix (https://gnu.org/s/guix), a functional package manager with a focus on reproducibility. In the past year I've been packaging a large number of bioinformatics tools for GNU Guix.

We have a local instance of Galaxy at our institute and would like to combine it with Guix to handle software installations for Galaxy.

Guix allows regular users to install software in different varieties or versions into separate independent profiles. We use this at the institute to create user-managed software environments per user, per workflow, per project, or per group. Software environments are reproducible and portable (among systems of the same architecture). I think Guix is very well suited for systems like Galaxy where it makes sense to quickly spin up independent environments for any given workflow.

I would like to help in getting support for Guix into Galaxy. I previously looked at the code that handles dependency resolution in Galaxy, but I think only very little of this would be needed when using Guix (as Guix package expressions already describe their dependencies exactly).

If you are open to the idea of making Guix an optional software management backend, I'd be very happy to hear your opinion on how to best get this started.
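As a rough illustration of the per-workflow profiles described above, here is a minimal Python sketch that only builds the commands and does not run them. It assumes the documented `guix package -p PROFILE -i PACKAGE...` invocation and the `etc/profile` file that Guix generates inside each profile; the profile path, the package names, and both function names are hypothetical.

```python
import shlex


def guix_install_command(profile_dir, packages):
    """Build the argv that installs packages into a dedicated Guix profile."""
    if not packages:
        raise ValueError("at least one package is required")
    # -p selects the target profile instead of the user's default one;
    # -i lists the packages to install into it.
    return ["guix", "package", "-p", str(profile_dir), "-i", *packages]


def activation_snippet(profile_dir):
    """Return shell lines that load the environment variables of a profile."""
    quoted = shlex.quote(str(profile_dir))
    return f'GUIX_PROFILE={quoted}\n. "$GUIX_PROFILE/etc/profile"'


if __name__ == "__main__":
    # Hypothetical profile location and package names, for illustration only.
    profile = "/srv/galaxy/profiles/rnaseq-workflow"
    print(shlex.join(guix_install_command(profile, ["samtools", "bowtie"])))
    print(activation_snippet(profile))
```

Because each workflow gets its own profile directory, two workflows could use different versions of the same tool without interfering with each other.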

@mvdbeek
Member

mvdbeek commented Aug 12, 2016

Hi @rekado,

That sounds like a good plan. The way to go would be to create a new dependency resolver.
Take a look at the Conda dependency resolver, which is currently the most advanced resolver in the Galaxy codebase.

@bgruening
Member

Hi @rekado,

Nice to see you here! As @mvdbeek already mentioned, Galaxy supports many different dependency resolvers, e.g. Conda (our new default), Docker, or the traditional one (to which you are probably referring).

We also looked at Guix (1.5 years ago) but decided to go with Conda for various reasons. Is it still true that you need to run a privileged server to install packages as a user?

If you want to implement a resolver please have a look at this PR, implementing the Conda resolver: #1345
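To give a feel for what a resolver does, here is a minimal, self-contained Python sketch. It does not use Galaxy's real resolver base class or plugin registration (see the Conda resolver PR for those); the class names, the method signature, and the one-profile-per-requirement directory layout are all assumptions made for illustration. The only Guix-specific fact it relies on is that a profile exposes its environment through `etc/profile`.

```python
import os
from dataclasses import dataclass


@dataclass
class ResolvedDependency:
    """Result of a successful lookup: where the requirement's profile lives."""
    name: str
    profile: str

    def shell_commands(self):
        # Lines a job script would run before the tool command.
        return f'GUIX_PROFILE="{self.profile}"; . "$GUIX_PROFILE/etc/profile"'


class GuixProfileResolver:
    """Hypothetical resolver: maps a requirement to a pre-built Guix profile."""

    def __init__(self, profiles_root):
        self.profiles_root = profiles_root

    def resolve(self, name, version=None):
        # Assumed layout: <profiles_root>/<name>-<version> or <profiles_root>/<name>.
        dirname = f"{name}-{version}" if version else name
        profile = os.path.join(self.profiles_root, dirname)
        # Treat the directory as a usable profile only if etc/profile exists.
        if os.path.isfile(os.path.join(profile, "etc", "profile")):
            return ResolvedDependency(name, profile)
        return None  # not found: a later resolver in the chain may still match
```

Returning `None` for a miss mirrors the general idea of trying several resolvers in order; how Galaxy actually signals an unresolved dependency should be checked against the codebase.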

@rekado
Author

rekado commented Aug 12, 2016

Thanks for the hints. I'll take a look at the Conda resolver and see if I can cook up a minimal working resolver for Guix.

Yes, Guix uses a daemon that runs as root in order to create isolated build environments via chroot (the build processes run as unprivileged, dedicated build users). The daemon manages writes to the store, an append-only immutable cache. Users talk to the daemon via RPCs.

With user namespace support in recent kernels we might be able to drop this requirement eventually (though some GNU/Linux distributions have disabled this feature). Although it is possible to run the Guix daemon as an unprivileged user, the lack of chroot means that certain guarantees cannot be made about the build environment, which has a negative effect on reproducibility (e.g. the build environment leaking into the build artifacts).

@bgruening
Member

@rekado sounds bad :(
I really like this project, but this privileged server is one of the main reasons we cannot use it to enable reproducibility. We need to transfer workflows/tools between instances. If we require a server running as root, this will be a major hurdle for most of the Galaxy servers I know of. It sounds a bit like the same problem that we have with Docker and HPC. But for Docker we found a solution: we generate the same dependencies as both Conda packages and Docker containers, which puts the admin in charge of choosing between Conda and Docker.

@rekado
Author

rekado commented Aug 12, 2016

I don't really understand why a daemon running as root would be a major hurdle. Admins who configure a Galaxy instance surely have root access? Galaxy itself won't have to run as root. It's merely the chroot-spawning daemon that needs to be root (as chroot is not available for anyone else but root on Linux).

To me this isn't very different from an HTTP daemon that needs to be root in order to be able to listen on port 80.

@bgruening
Member

bgruening commented Aug 12, 2016

> I don't really understand why a daemon running as root would be a major hurdle. Admins who configure a Galaxy instance surely have root access?

No, they don't. And they don't need to. If you look at the Conda PR, everything happens in user space; even the package manager is installed by Galaxy in user space if needed.

> Galaxy itself won't have to run as root. It's merely the chroot-spawning daemon that needs to be root (as chroot is not available for anyone else but root on Linux).

I know, but this needs to be set up by an admin, and that is not guaranteed. Same problem as with Docker. I know Galaxy admins who have been waiting for months to get some changes made to their PostgreSQL database.

@rekado
Author

rekado commented Aug 12, 2016

I see that this would be a problem in those cases :-/

Do you think it would be less of a problem if the daemon itself were to run as an unprivileged user and shelled out to a tiny setuid helper that would spawn the chroot environment when needed? This would only require an admin to be okay with the concept of setuid programs (such as "ping").

@bgruening
Member

Really not sure, @rekado. I discussed this at BOSC last year and we did not find a nice solution. Conda is really nice in this regard. Let's see what others have to say.
Ideally, you would create a resolver that installs all Guix components and manages them itself; only then can we easily exchange tools between instances.

@zimoun

zimoun commented Dec 12, 2018

Dear,

I am not sure I understand your explanations, @bgruening.

If admins who configure a Galaxy instance do not have any root access, how is the HTTP/remote access dance done? Presumably they ask someone who has root access to configure the network and related services. In that case, why not also ask them to install, for example, Guix?

If I have misunderstood something about the configuration of Galaxy, sorry; I will dig deeper into the documentation.

In any case, the package manager Conda is not bit-for-bit reproducible, and it does not provide any mechanism to ensure that two distinct installs are exactly the same. Not to mention how Conda is bootstrapped. ;-)
In other words, nothing ensures that the same pipeline run on two distinct Galaxy instances will produce the same result(*).

Therefore, integrating Guix (as proposed by @rekado) or Nix as a package provider seems like a path that would pay off, doesn't it?

Thank you in advance for any comments.

All the best,
simon

(*) modulo the random seed of some algorithms, which should be fixed somehow.
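To make "exactly the same" concrete: one simple way to check whether two installation trees are identical bit for bit is to compare content digests. The Python sketch below is only an illustration under stated assumptions; it is not a feature of Conda, Guix, or Galaxy, and the function names are hypothetical. It hashes relative file paths, file contents, and symlink targets, and deliberately ignores timestamps, permissions, and empty directories.

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 digest over relative paths and file contents under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the traversal order so the digest is deterministic
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            rel = os.path.relpath(path, root)
            digest.update(rel.encode("utf-8") + b"\0")
            if os.path.islink(path):
                # Hash the link target rather than following the link.
                digest.update(os.readlink(path).encode("utf-8"))
            else:
                with open(path, "rb") as handle:
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


def same_install(root_a, root_b):
    """True if both trees contain the same files with identical contents."""
    return tree_digest(root_a) == tree_digest(root_b)
```

Two environments built from the same recipe on different machines would pass this check only if the builds are fully deterministic, which is the property under discussion here.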

@mvdbeek
Member

mvdbeek commented Dec 12, 2018

@zimoun We will review whatever is proposed, but I would appreciate it if you could just take @bgruening's and my word for it that a root or setuid requirement for tool dependencies is a pretty large hurdle for most Galaxy admins, not (only) for the Galaxy instance itself but also on the usually shared HPC environment.

We're generating Docker and Singularity images for all major tools from the Conda chain, so this is getting us pretty far in terms of reproducibility for those that care about this. This is described in https://www.biorxiv.org/content/early/2017/10/27/207092 and https://www.biorxiv.org/content/early/2017/10/11/200683

@zimoun

zimoun commented Dec 12, 2018

Thank you for your quick comment.

I understand that deployment is an issue because of the root or setuid requirement. I experience this myself every day when I ask the IT people. :-)
However, from my experience, it is more about human relationships than technical details. That's why I was asking. :-)
Well, your feedback is more than relevant, since you deal with a large user base.

Thank you for your two pointers.
I already knew the one about Bioconda and I am reading the other one.
And thank you for the channel if you are involved.

(troll inside: "reproducibility for those that care about this", but isn't science all about reproducibility? ;-)

@hexylena
Member

hexylena commented Aug 15, 2019

I think we've answered the questions, and hopefully they will contribute a Guix resolver for the sites that are able to use it :)
