Add --use_cached_dependency_manager option #612

Merged

Conversation

abretaud
Contributor

A small PR to support the new --use_cached_dependency_manager option introduced in galaxyproject/galaxy/pull/3106

@jmchilton
Member

Would there be any harm in just making this the default for Planemo and not having an option for it?

Ping @mvdbeek

@mvdbeek
Member

mvdbeek commented Dec 14, 2016

I don't think so, since it falls back to the old behaviour anyway if no cached environment exists.
The problem with the current PR is that you still need to set up the cache (probably during planemo conda_install).
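
The fallback described above can be sketched roughly as follows. This is only an illustration, not Galaxy's actual code: `cache_key`, `resolve_environment`, the `cache_dir` layout, and the hashing scheme are all hypothetical names and choices made for the example.

```python
import hashlib
import os


def cache_key(requirements):
    """Build a stable key from (name, version) pairs (hypothetical scheme)."""
    canonical = ",".join("%s=%s" % (name, version)
                         for name, version in sorted(requirements))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def resolve_environment(requirements, cache_dir, use_cache=True):
    """Return ("cached", path) if a cached env exists, else ("uncached", None)."""
    if use_cache:
        candidate = os.path.join(cache_dir, cache_key(requirements))
        if os.path.isdir(candidate):
            return "cached", candidate
    # No cached environment (or caching disabled): old behaviour applies.
    return "uncached", None
```

With this shape, enabling the option is harmless when nothing has been cached yet, but it also has no effect until something populates `cache_dir`.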

@mvdbeek
Member

mvdbeek commented Dec 14, 2016

Though this should probably work with --conda_auto_install :)

@abretaud
Contributor Author

Damn, I should have tested it more. I can't get this to work on my setup.

@abretaud
Contributor Author

I'm stuck on this strange error. Does it ring a bell for anyone?

galaxy.tools.deps.conda_util DEBUG 2016-12-14 17:43:43,096 Executing command: /home/abretaud/miniconda2/bin/conda clean --tarballs -y
requests.packages.urllib3.connectionpool DEBUG 2016-12-14 17:43:43,166 "GET /api/histories/2891970512fa2d5a?key=412adcd2d1a1516f35d3620d02d251b7 HTTP/1.1" 200 None
requests.packages.urllib3.connectionpool INFO 2016-12-14 17:43:43,392 Starting new HTTP connection (1): localhost
galaxy.jobs.runners DEBUG 2016-12-14 17:43:43,403 (1) command is: mkdir -p working; cd working; /tmp/tmpZ1uwOZ/job_working_directory/000/1/tool_script.sh; return_code=$?; cd '/tmp/tmpZ1uwOZ/job_working_directory/000/1'; [ "$CONDA_DEFAULT_ENV" = "/tmp/tmpZ1uwOZ/job_working_directory/000/1/conda-metadata-env" ] || . /home/abretaud/miniconda2/bin/activate '/tmp/tmpZ1uwOZ/job_working_directory/000/1/conda-metadata-env' > conda_activate.log 2>&1 ; python "/tmp/tmpZ1uwOZ/job_working_directory/000/1/set_metadata_4d9gCq.py" "/tmp/tmpZ1uwOZ/tmp/tmpg2Hkos" "/tmp/tmpZ1uwOZ/job_working_directory/000/1/working/galaxy.json" "/tmp/tmpZ1uwOZ/job_working_directory/000/1/metadata_in_HistoryDatasetAssociation_1_pRCG7I,/tmp/tmpZ1uwOZ/job_working_directory/000/1/metadata_kwds_HistoryDatasetAssociation_1_P5UlU8,/tmp/tmpZ1uwOZ/job_working_directory/000/1/metadata_out_HistoryDatasetAssociation_1_JTiZRv,/tmp/tmpZ1uwOZ/job_working_directory/000/1/metadata_results_HistoryDatasetAssociation_1_OWufZz,/tmp/tmpZ1uwOZ/files/000/dataset_1.dat,/tmp/tmpZ1uwOZ/job_working_directory/000/1/metadata_override_HistoryDatasetAssociation_1_tHHlma" 5242880; sh -c "exit $return_code"
galaxy.jobs.runners.local DEBUG 2016-12-14 17:43:43,435 (1) executing job script: /tmp/tmpZ1uwOZ/job_working_directory/000/1/galaxy_1.sh
galaxy.jobs DEBUG 2016-12-14 17:43:43,452 (1) Persisting job destination (destination id: upload_dest)
requests.packages.urllib3.connectionpool DEBUG 2016-12-14 17:43:43,550 "GET /api/histories/2891970512fa2d5a?key=412adcd2d1a1516f35d3620d02d251b7 HTTP/1.1" 200 None
galaxy.jobs.runners.local DEBUG 2016-12-14 17:43:44,934 execution finished: /tmp/tmpZ1uwOZ/job_working_directory/000/1/galaxy_1.sh
requests.packages.urllib3.connectionpool INFO 2016-12-14 17:43:44,938 Starting new HTTP connection (1): localhost
galaxy.jobs.output_checker DEBUG 2016-12-14 17:43:44,946 Tool produced standard error failing job - [Traceback (most recent call last):
  File "/tmp/tmpZ1uwOZ/galaxy-dev/tools/data_source/upload.py", line 17, in <module>
    from six.moves.urllib.request import urlopen
ImportError: No module named six.moves.urllib.request
Traceback (most recent call]
galaxy.jobs DEBUG 2016-12-14 17:43:45,034 (1) setting dataset 1 state to ERROR

I launch planemo like this, with a clean planemo virtualenv (code from this PR) and an empty ~/.planemo:

planemo test --conda_dependency_resolution --conda_copy_dependencies --use_cached_dependency_manager --galaxy_branch release_16.10 --conda_auto_install  --no_cleanup .

@mvdbeek
Member

mvdbeek commented Dec 14, 2016

Sounds familiar indeed. I'm afraid the upload tool depends on samtools, so conda_auto_install installs __samtools@_uv_, and that conda env then takes precedence over Galaxy's virtualenv.
Any idea what to do here, @jmchilton?

@mvdbeek
Member

mvdbeek commented Dec 14, 2016

Possibly related:
galaxyproject/galaxy#3238

I think the core issue is how to deal with tools that expect Galaxy's PYTHONPATH to come first.
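
To make the ordering concern concrete, here is a small sketch of how the first matching entry on a search path decides which copy of a module gets imported. It uses the standard library's `importlib.machinery.PathFinder`; the helper names and the stub module are made up for the illustration, and it does not reproduce the actual Galaxy job environment.

```python
import importlib.machinery
import os
import tempfile


def first_provider(module_name, search_path):
    """Return the directory on search_path that would supply module_name."""
    spec = importlib.machinery.PathFinder.find_spec(module_name, search_path)
    return os.path.dirname(spec.origin) if spec else None


def make_dir_with_module(module_name):
    """Create a temp dir containing an empty module file (illustration only)."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, module_name + ".py"), "w"):
        pass
    return directory
```

If a directory standing in for the conda env is listed before the one standing in for Galaxy's virtualenv, `first_provider` returns the conda one, which is the precedence problem described here.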

@jmchilton
Member

We've never observed this breaking before though, right? I don't think the samtools recipe in Conda currently depends on Python at all, so why would activating the environment affect upload? I don't get it at all... I'll let you know if I think of something though.

@abretaud
Contributor Author

Ok, the strange error is gone. I think I had messed up my conda install.
The option still doesn't do anything, but I think I know why. To be continued!

@abretaud
Contributor Author

I just fixed the way the option is passed to Galaxy. I'm not sure it's in the best place in the code though.

Now, since the cache is not built on tool install, the planemo option has no effect. The --conda_auto_install option doesn't seem to change anything.
@mvdbeek you mentioned creating the cache on conda_install, but do you think it would be feasible to add the dep to the cache on first use?
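
For readers following along, passing a command-line option through to Galaxy could look roughly like the sketch below. It is not the code from this PR: the function names are hypothetical, and the environment-variable prefix is an assumption about how configuration overrides reach Galaxy.

```python
def galaxy_properties_from_options(**kwds):
    """Map (hypothetical) planemo keyword options to Galaxy config properties."""
    props = {}
    if kwds.get("use_cached_dependency_manager"):
        props["use_cached_dependency_manager"] = "true"
    if kwds.get("conda_auto_install"):
        props["conda_auto_install"] = "true"
    return props


def as_env_overrides(props, prefix="GALAXY_CONFIG_OVERRIDE_"):
    """Turn properties into environment variables (prefix is an assumption)."""
    return {prefix + key.upper(): value for key, value in props.items()}
```

Only options that are actually set end up in the result, so Galaxy's own defaults apply to everything else.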

@mvdbeek
Member

mvdbeek commented Dec 20, 2016

> @mvdbeek you mentioned creating the cache on conda_install, but do you think it would be feasible to add the dep to the cache on first use?

I'm divided on that issue, maybe that is a good idea, but my thinking is that (at least by default), you wouldn't want to have a long-running and slightly fragile process blocking other conda activities.
My other concern is that you could end up in a cache building "loop" if building the cache doesn't work for some reason (similar to the conda_auto_install option constantly trying to install unavailable conda dependencies).
I think it would be best for now to change the planemo conda_install command to create these cached environments, but I haven't looked into what planemo conda_install is actually doing.
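
One way to avoid the retry loop mentioned above would be to remember failed builds and stop retrying them. The sketch below is purely illustrative: `GuardedCacheBuilder` and its `build` callback are hypothetical and do not correspond to anything in Galaxy or Planemo.

```python
class GuardedCacheBuilder:
    """Build cached environments at most `max_attempts` times per key."""

    def __init__(self, build, max_attempts=1):
        self._build = build              # callable(key) -> bool, True on success
        self._max_attempts = max_attempts
        self._failures = {}              # key -> number of failed attempts
        self._built = set()              # keys with a usable cached environment

    def ensure(self, key):
        """Return True if a cached environment for `key` is available."""
        if key in self._built:
            return True
        if self._failures.get(key, 0) >= self._max_attempts:
            # Give up: the caller falls back to uncached resolution.
            return False
        if self._build(key):
            self._built.add(key)
            return True
        self._failures[key] = self._failures.get(key, 0) + 1
        return False
```

A failed build then costs at most `max_attempts` tries per dependency, after which jobs simply run without the cache.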

@abretaud
Contributor Author

Ok, I have an ugly hack for conda_install to create the cache, but the question now is: where can I store the cache? The galaxy instance does not exist yet when we launch conda_install...

@jmchilton
Member

I feel like if the cache is enabled and auto_install is on (however it is turned on), the dependencies should be installed into the cache automatically. If this is not how it works, I guess that should be a Galaxy issue?

I understand the worry about a flaky process and loops and so on, but aren't those all the same worries one would have with auto_install regardless? Does adding the cache make it worse?

@abretaud
Contributor Author

> I understand the worry about a flaky process and loops and so on, but aren't those all the same worries one would have with auto_install regardless? Does adding the cache make it worse?

Oh yes, I think you're right. I don't think caching makes it any worse than not caching.

Here's a PR for galaxy:
galaxyproject/galaxy#3348

@jmchilton jmchilton merged commit 06ea300 into galaxyproject:master Jan 25, 2017
@jmchilton
Member

Sorry for the lag on this. I forgot that dealing with the Galaxy side alone wasn't enough to finish this off.
