conda_auto_install and Collections cause install/remove/install loop #5655

Open
blankenberg opened this issue Mar 7, 2018 · 5 comments

Comments

@blankenberg
Member

If conda_auto_install is set to true, a dependency isn't installed but is available in conda, and you run a tool over a collection in batch mode, then every individual job attempts to install the dependency, remove the 'failed' installation, and reinstall it, all simultaneously(-ish).

Not only is this quite slow and resource intensive (thousands of items in a collection...), but it also leads to failures, and it may result in an unversioned dependency being installed and used for a particular job (I did not closely examine the logs, just killed all jobs and datasets and rebooted Galaxy, so the unversioned-dependency case is not confirmed).
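
One way to picture avoiding the per-job install race is a per-dependency lock, so that the first job performs the install and the rest of the batch waits and reuses its outcome. The following is only a minimal sketch, not Galaxy's code: `InstallCoordinator` and `install_func` are hypothetical names, and it only coordinates threads inside a single process, so it would not by itself cover several job handler processes.

```python
import threading


class InstallCoordinator:
    """Hypothetical helper: run at most one install per dependency key."""

    def __init__(self, install_func):
        # install_func(key) is an assumed callable that performs the real
        # install (for example a conda call) and returns its outcome.
        self._install_func = install_func
        self._guard = threading.Lock()   # protects creation of per-key locks
        self._locks = {}                 # key -> lock for that dependency
        self._results = {}               # key -> outcome of the single install

    def ensure_installed(self, key):
        # Look up (or create) the lock that belongs to this dependency.
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        # Only one thread per key gets past this point at a time.
        with lock:
            if key not in self._results:
                # First caller does the work; later callers skip it.
                self._results[key] = self._install_func(key)
            return self._results[key]
```

With this shape, a batch of thousands of jobs that all need the same dependency would trigger one install attempt instead of thousands. A failed attempt is also remembered, so a real version would need some policy for retrying.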

@blankenberg
Member Author

blankenberg commented Mar 7, 2018

In a similar vein, it would be helpful if there were some sort of small, timed local cache for the conda install check. A lifetime of minutes would be fine in most cases.

For example, when a specific version of a package isn't available but an unversioned one is, Galaxy attempts to conda install the versioned one at each execution: it runs the search, reports that it can't find the package, and falls back to the unversioned one. This check seems to take between 5 seconds and a minute. The response time is probably made worse by the check running many times simultaneously or in quick succession, and it greatly slows down the scheduling of jobs to a cluster.

One example is the fastq-join tool:
conda create -y --override-channels --channel iuc --channel bioconda --channel conda-forge --channel defaults --name __ea-utils@1.1.2-806 ea-utils=1.1.2-806
which doesn't exist at that version.
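
The timed cache suggested above could look roughly like the sketch below. It is an illustration only: `TimedLookupCache`, `lookup_func`, and the 300-second default are assumptions, not existing Galaxy settings. The point is that negative answers ("this exact version does not exist") are remembered for a few minutes, so the slow conda search is not repeated for every job.

```python
import time


class TimedLookupCache:
    """Hypothetical cache that remembers lookup results for a short time."""

    def __init__(self, lookup_func, ttl_seconds=300.0, clock=time.monotonic):
        # lookup_func(spec) is an assumed callable wrapping the slow check,
        # e.g. "can conda find ea-utils=1.1.2-806 in the configured channels?"
        self._lookup = lookup_func
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry can be tested
        self._entries = {}           # spec -> (timestamp, cached result)

    def get(self, spec):
        now = self._clock()
        entry = self._entries.get(spec)
        if entry is not None and now - entry[0] < self._ttl:
            # Still fresh: reuse the stored answer, including a negative one.
            return entry[1]
        # Missing or expired: run the slow lookup once and store the answer.
        result = self._lookup(spec)
        self._entries[spec] = (now, result)
        return result
```

A short lifetime keeps the cache from hiding a package that gets published later, while still absorbing the burst of identical checks a large collection produces.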

@mvdbeek
Member

mvdbeek commented Mar 7, 2018

Yeah, that's why I have been going around saying don't use conda_auto_install if you can avoid it. What would be great in many ways is a single queue-type thread for such things (and maybe another one for Tool Shed installs). The queue could even be smart and cache / bundle requests.

The problem I see is that we'll only want a single unique queue, and we need to be able to communicate with it from all web and job handlers. I guess uWSGI mules could work for this type of thing?
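
To make the "single queue that bundles requests" idea concrete, here is a minimal in-process sketch. Everything in it is hypothetical (`InstallQueue`, `install_func`), and it deliberately ignores the hard part raised above, namely reaching one queue from all web and job handler processes. It only shows the bundling: duplicate requests for the same dependency share one future and one install.

```python
import queue
import threading
from concurrent.futures import Future


class InstallQueue:
    """Hypothetical single-worker queue that bundles duplicate requests."""

    def __init__(self, install_func):
        # install_func(key) is an assumed callable doing the real install.
        self._install_func = install_func
        self._requests = queue.Queue()
        self._futures = {}               # key -> Future shared by all callers
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def request(self, key):
        """Ask for an install; duplicates get the already pending Future."""
        with self._lock:
            future = self._futures.get(key)
            if future is None:
                future = Future()
                self._futures[key] = future
                self._requests.put(key)
            return future

    def _run(self):
        # Single consumer: installs happen strictly one at a time.
        while True:
            key = self._requests.get()
            future = self._futures[key]
            try:
                future.set_result(self._install_func(key))
            except Exception as exc:
                future.set_exception(exc)
```

A caller would do something like `InstallQueue(install_func).request("ea-utils").result()` and block until the one shared install finishes. Results are kept forever in this sketch, so a real version would want expiry or explicit invalidation.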

@dannon
Member

dannon commented Mar 7, 2018

@mvdbeek We could declare a single queue using the existing queue_workers and have any available worker consume out of it.

@mvdbeek
Member

mvdbeek commented Mar 7, 2018

Like those that pick up jobs from the database, or the communication via kombu? I think the kombu communication doesn't work in all circumstances (uWSGI web handlers communicating with job handlers, but maybe that's better now?).

@dannon
Member

dannon commented Mar 7, 2018

Right, using that infrastructure. Processes need to be named correctly for the dynamically created specific-to-process queues, but I think that's something we've worked on a lot lately. That said, we're talking about an even simpler queue, here -- a single named queue that many things work from. That should 100% work right now, if we wanted to set one up.

3 participants