Dependency resolver documentation fixes
nsoranzo committed Jan 25, 2017
1 parent 3eaad9a commit a43d70e
Showing 2 changed files with 55 additions and 56 deletions.
75 changes: 37 additions & 38 deletions doc/source/admin/conda_faq.rst
@@ -41,22 +41,22 @@ features that led to this decision:
Below we answer some common questions (collected by Lance Parsons):


1. How do I enable Conda dependency resolution for Galaxy jobs?
***************************************************************
1. How do I enable Conda dependency resolution for Galaxy tools?
****************************************************************

The short answer is that as of 17.01, Galaxy should install Conda the first
time it starts up and be configured to use it by default.

The long answer is that Galaxy's dependency job resolution is managed via
The long answer is that Galaxy's tool dependency resolution is managed via
``dependency_resolvers_conf.xml`` configuration file. This configuration
file is discussed in detail in the :ref:`Dependency Resolvers <dependency_resolvers>`
documentation. Most Galaxy administrators will be using Galaxy's default dependency
resolvers configuration file (``config/dependency_resolvers_conf.xml.sample``). With
release 16.04, Galaxy has enabled Conda dependency resolution by default when
Conda was already installed on the system. As of 17.01, Galaxy will also install
Conda as needed when starting up. Having Conda enabled in ``dependency_resolvers_conf.xml``
means that Galaxy can look for job dependencies using the Conda system when it
attempts to run tools.
means that Galaxy can look for tool dependencies using the Conda system when it
attempts to run a job.

Note that the order of resolvers in the file matters and the ``<tool_shed_packages />``
entry should remain first. This means that tools that have specified Tool Shed packages
@@ -65,30 +65,27 @@ as their dependencies will work without a change.
The most common configuration settings related to Conda are listed in Table 1.
See `galaxy.ini.sample`_ for the complete list.

+--------------------------+--------------------------+---------------------------+
| Setting                  | Default setting          | Meaning                   |
+--------------------------+--------------------------+---------------------------+
| ``conda_auto_init``      | True                     | Set to True to instruct   |
|                          |                          | Galaxy to install Conda   |
|                          |                          | (the package manager)     |
|                          |                          | automatically if it       |
|                          |                          | cannot find a local copy  |
|                          |                          | already on the system.    |
+--------------------------+--------------------------+---------------------------+
| ``conda_auto_install``   | False                    | Set to True to instruct   |
|                          |                          | Galaxy to look for and    |
|                          |                          | install Conda packages    |
|                          |                          | for missing tool          |
|                          |                          | dependencies before       |
|                          |                          | running a job.            |
+--------------------------+--------------------------+---------------------------+
| ``conda_prefix``         | <tool\_dependency\_dir>/ | the location              |
|                          | \_conda                  | on the                    |
|                          |                          | filesystem where Conda    |
|                          |                          | packages and              |
|                          |                          | environments are          |
|                          |                          | installed                 |
+--------------------------+--------------------------+---------------------------+
+-------------------------+------------------------------------+---------------------------+
| Setting                 | Default setting                    | Meaning                   |
+-------------------------+------------------------------------+---------------------------+
| ``conda_auto_init``     | ``True``                           | If ``True``, Galaxy will  |
|                         |                                    | try to install Conda      |
|                         |                                    | (the package manager)     |
|                         |                                    | automatically if it       |
|                         |                                    | cannot find a local copy  |
|                         |                                    | already on the system     |
+-------------------------+------------------------------------+---------------------------+
| ``conda_auto_install``  | ``False``                          | If ``True``, Galaxy will  |
|                         |                                    | look for and install      |
|                         |                                    | Conda packages for        |
|                         |                                    | missing tool dependencies |
|                         |                                    | before running a job      |
+-------------------------+------------------------------------+---------------------------+
| ``conda_prefix``        | ``<tool\_dependency\_dir>/_conda`` | The location on the       |
|                         |                                    | filesystem where Conda    |
|                         |                                    | packages and environments |
|                         |                                    | are installed             |
+-------------------------+------------------------------------+---------------------------+

*Table 1: Commonly used configuration options for Conda in Galaxy.*
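
To make the defaults in Table 1 concrete, the following minimal sketch reads the three settings from an ini file and applies the documented defaults when a setting is absent. It is an illustration only and not Galaxy's own configuration loader: the helper name ``conda_settings``, the ``[app:main]`` section name and the sample path ``/srv/galaxy/dependencies`` are assumptions made for the example.

```python
import configparser
import posixpath

# Hypothetical galaxy.ini fragment; the path and section name are assumptions.
SAMPLE_INI = """
[app:main]
tool_dependency_dir = /srv/galaxy/dependencies
conda_auto_install = True
"""


def conda_settings(ini_text, section="app:main"):
    """Return the Conda settings from Table 1, falling back to the documented defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    tool_dependency_dir = parser.get(section, "tool_dependency_dir")
    return {
        # Default: True
        "conda_auto_init": parser.getboolean(section, "conda_auto_init", fallback=True),
        # Default: False
        "conda_auto_install": parser.getboolean(section, "conda_auto_install", fallback=False),
        # Default: <tool_dependency_dir>/_conda
        "conda_prefix": parser.get(
            section, "conda_prefix", fallback=posixpath.join(tool_dependency_dir, "_conda")
        ),
    }


if __name__ == "__main__":
    for key, value in conda_settings(SAMPLE_INI).items():
        print(key, "=", value)
```

Only ``conda_auto_install`` is set explicitly in the sample, so the other two values come from the defaults listed in the table.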

@@ -164,8 +161,8 @@ This depends on your ``galaxy.ini`` setting. Starting with release 16.07, Galaxy
can automatically install the Conda package manager for you if you have enabled
``conda_auto_init``. Galaxy can then install Trinity along with its dependencies
using one of the methods listed in question 2 above. In particular, if
``conda_auto_install`` is True and Trinity is not installed yet, Galaxy will try
to install it via Conda when a Trinity job is launched.
``conda_auto_install`` is ``True`` and Trinity is not installed yet, Galaxy will
try to install it via Conda when a Trinity job is launched.

With release 16.07 you can see which dependencies are being used
in the “Manage installed tools” section of the Admin panel and you can select
@@ -179,7 +176,7 @@ dependency resolvers configuration with regards to what will actually be used du
the tool execution.

To check if Galaxy has created a Trinity environment, have a look at folders under
``<tool_dependency_dir>/_conda/envs/``(or ``<conda_prefix>/envs`` if you have changed `conda_prefix` in your galaxy.ini file).
``<tool_dependency_dir>/_conda/envs/`` (or ``<conda_prefix>/envs`` if you have changed ``conda_prefix`` in your galaxy.ini file).
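
The directory check described above can also be scripted. This is a minimal sketch and not part of Galaxy: the helper names ``list_conda_envs`` and ``find_envs_for`` are hypothetical, and matching an environment by a case-insensitive substring of its directory name is an assumption about how the environments are named.

```python
from pathlib import Path


def list_conda_envs(conda_prefix):
    """Return the names of the directories under <conda_prefix>/envs, sorted."""
    envs_dir = Path(conda_prefix) / "envs"
    if not envs_dir.is_dir():
        # No envs directory yet, so nothing has been installed under this prefix.
        return []
    return sorted(p.name for p in envs_dir.iterdir() if p.is_dir())


def find_envs_for(conda_prefix, package):
    """Return environment names containing the package name (assumed naming scheme)."""
    needle = package.lower()
    return [name for name in list_conda_envs(conda_prefix) if needle in name.lower()]
```

For example, ``find_envs_for("<tool_dependency_dir>/_conda", "trinity")`` with the placeholder replaced by the real path returns an empty list when no matching environment has been created yet.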

We recommend to use Conda on a tool-per-tool basis, by unchecking the checkbox
for TS dependencies during the tool installation, and for tools where there
@@ -331,13 +328,15 @@ message appears in your logs:
You can also use: $ conda clean --lock
First, you may wish to enable cached dependencies. This can be done by setting
``use_cached_dependency_manager`` in ``galaxy.ini``. Many jobs will create a
per job Conda environment with just the dependencies needed for that job installed.
``use_cached_dependency_manager`` to ``True`` in ``galaxy.ini``. Without this
option, many jobs will create a per-job Conda environment with just the
dependencies needed for that job installed.
This will be placed on the filesystem containg the job working directory. This
is an expensive operation and Conda doesn't always link environments correctly
across filesystems. Enabling this job caching will create a cache for each required
combination of requirements in the directory specified by ``tool_dependency_cache_dir``
in ``galaxy.ini`` (defaulting to ``<tool_dependency_dir>/_cache``).
across filesystems. Enabling this dependency caching will create a cache
directory for each required combination of requirements inside the directory
specified by ``tool_dependency_cache_dir`` in ``galaxy.ini`` (defaulting to
``<tool_dependency_dir>/_cache``).

The cached dependency manager was added to the 16.10 release of Galaxy (see
`Pull Request #3106`_). In 17.01 Galaxy was updated to build the cached dependencies
@@ -364,7 +363,7 @@ YAML ``condarc`` file, this should be created by Galaxy and placed in
newer version of Conda than shipped with Galaxy as of 17.01. See the question below
on upgrading Conda if you must use this trick.

Alternatively, copying can be used when creating environments instead links (either
Alternatively, copying can be used when creating environments instead of links (either
symbolic or hard). To enable this set ``conda_copy_dependencies`` to ``True`` in
``galaxy.ini``. This requires at least version 16.07 of Galaxy.

36 changes: 18 additions & 18 deletions doc/source/admin/dependency_resolvers.rst
@@ -11,7 +11,7 @@ job uses includes commands, such as changes to the ``PATH`` environment variable
resolvers*. There is a default dependency resolver configuration but administrators can provide their own configuration
using the ``dependency_resolvers_conf.xml`` configuration file in the Galaxy ``config/`` directory.

The binding between tool XML and the tools they need to run is specified in the tool XML using ``requirement``
The binding between tool XML and the tools they need to run is specified in the tool XML using ``<requirement>``
tags, for example

.. code-block:: xml
@@ -45,7 +45,7 @@ The default configuration of dependency resolvers is equivalent to the following
</dependency_resolvers>
This default dependency resolver configuration contains five items. First, the *tool shed dependency resolver* is used,
then the *Galaxy packages dependency resolver* is used (initially looking for packages by name and version string and then looking for the package just by name), and finally it checks Conda for a versioned or unversioned match.
then the *Galaxy packages dependency resolver* is used (initially looking for packages by name and version string and then looking for the package just by name), and finally it checks *Conda* for a versioned or unversioned match.
The default configuration thus prefers packages installed from the Galaxy Tool Shed using legacy ``tool_dependencies.xml``
files, before trying to find a "Galaxy package" satisfying the specific version the dependency requires before
falling back to looking for a Galaxy package with merely the correct name, and then looking for Conda recipes with
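
Because resolvers are tried in the order they appear in the file, it can help to print that order for a given configuration. The sketch below is illustrative only: the XML string is reconstructed from the five items described above and should be treated as an assumption, and ``resolver_order`` is a hypothetical helper, not a Galaxy function.

```python
import xml.etree.ElementTree as ET

# Assumed default configuration, reconstructed from the five items described in the text.
DEFAULT_CONF = """
<dependency_resolvers>
  <tool_shed_packages />
  <galaxy_packages />
  <galaxy_packages versionless="true" />
  <conda />
  <conda versionless="true" />
</dependency_resolvers>
"""


def resolver_order(conf_xml):
    """Return (resolver tag, versionless flag) pairs in the order they are listed."""
    root = ET.fromstring(conf_xml)
    order = []
    for element in root:
        # A missing versionless attribute is treated as false.
        versionless = element.get("versionless", "false").lower() == "true"
        order.append((element.tag, versionless))
    return order


if __name__ == "__main__":
    for position, (tag, versionless) in enumerate(resolver_order(DEFAULT_CONF), start=1):
        print(position, tag, "versionless" if versionless else "versioned")
```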
@@ -144,17 +144,17 @@ modulepath
value used for MODULEPATH environment variable, used to locate modules

versionless
whether to resolve tools using a version string or not (default: *false*)
whether to resolve tools using a version string or not (default: ``false``)

find_by
whether to use the ``DirectoryModuleChecker`` or ``AvailModuleChecker`` (permissable values are "directory" or "avail",
default is "avail")
whether to use the ``DirectoryModuleChecker`` or ``AvailModuleChecker`` (permissable values are ``directory`` or ``avail``,
default is ``avail``)

prefetch
in the AvailModuleChecker prefetch module info with ``module avail`` (default: true)
in the AvailModuleChecker prefetch module info with ``module avail`` (default: ``true``)

default_indicator
what indicate to the AvailModuleChecker that a module is the default version (default: "(default)"). Note
what indicate to the AvailModuleChecker that a module is the default version (default: ``(default)``). Note
that the first module found is considered the default when no version is used by the resolver, so
the sort order of modules matters.

@@ -163,7 +163,7 @@ of the ``module avail`` command for the name of the dependency. If it is configu
or is looking for a package with no version specified, it accepts any module whose name matches and is a bare word
or the first module whose name matched. For this reason, the default version of the module should be the first one
listed, something that can be achieved by tagging it with a word that appears first in sort order, for example the
string "(default)" (yielding a module name like ``bedtools/(default)``). So when looking for ``bedtools`` in
string ``(default)`` (yielding a module name like ``bedtools/(default)``). So when looking for ``bedtools`` in
versionless mode the search would match the first module called ``bedtools``, and in versioned mode the search would
only match if a module named ``bedtools/2.20.1`` was present (assuming you're looking for ``bedtools/2.20.1``).
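
The matching rule in the paragraph above can be sketched as follows. This is a simplified model and not the actual ``AvailModuleChecker`` code: ``match_module`` is a hypothetical function, and the input list stands in for the module names reported by ``module avail``, in the order they are listed.

```python
def match_module(available, name, version=None, versionless=False):
    """Return the first entry of ``available`` that satisfies the request, or None.

    ``available`` holds strings such as "bedtools/2.20.1" or the bare word "bedtools".
    """
    for entry in available:
        module_name, _, module_version = entry.partition("/")
        if module_name != name:
            continue
        if versionless or version is None:
            # Versionless lookup: the first module with a matching name wins,
            # which is why the sort order of the modules matters.
            return entry
        if module_version == version:
            # Versioned lookup: only an exact name/version match is accepted.
            return entry
    return None
```

With the list ``["bedtools/(default)", "bedtools/2.20.1"]``, a versionless lookup for ``bedtools`` picks ``bedtools/(default)`` because it is listed first, while a versioned lookup for ``2.20.1`` picks ``bedtools/2.20.1``.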

@@ -202,36 +202,36 @@ For a very detailed discussion of Conda dependency resolution, check out the
:ref:`Conda FAQ <conda_faq>`.

prefix
The conda_prefix used to locate dependencies in (defaults to ``<tool_dependency_dir>/_conda``).
The conda_prefix used to locate dependencies in (default: ``<tool_dependency_dir>/_conda``).

exec
The conda executable to use, it will default to the one on the
PATH (if available) and then to ``<conda_prefix>/bin/conda``.

versionless
whether to resolve tools using a version string or not (defaults to *False*).
whether to resolve tools using a version string or not (default: ``False``).

debug
Pass debug flag to conda commands (default: *False*).
Pass debug flag to conda commands (default: ``False``).

ensure_channels
conda channels to enable by default. See
http://conda.pydata.org/docs/custom-channels.html for more
information about channels. This defaults to ``iuc,bioconda,r,defaults,conda-forge``.
This order should be consistent with `Bioconda perscribed order <https://github.com/bioconda/bioconda-recipes/blob/master/config.yml#L8>`__
This order should be consistent with `Bioconda prescribed order <https://github.com/bioconda/bioconda-recipes/blob/master/config.yml#L8>`__
if it includes ``bioconda``.

auto_install
Set to True to instruct Galaxy to look for and install missing tool
dependencies before each job runs (defaults to *False*).
If ``True``, Galaxy will look for and install missing tool
dependencies before running a job (default: ``False``).

auto_init
Set to True to instruct Galaxy to install conda from the web
automatically if it cannot find a local copy and conda_exec is not
configured. This defaults to *True* as of Galaxy 17.01.
If ``True``, Galaxy will try to install Conda from the web
automatically if it cannot find a local copy and ``conda_exec`` is not
configured. This defaults to ``True`` as of Galaxy 17.01.

copy_dependencies
Set to ``True`` to instruct Galaxy to copy dependencies over instead of symbolically
If ``True``, Galaxy will copy dependencies over instead of symbolically
linking them when creating per job environments. This should be considered somewhat
deprecated because Conda will do this as needed for newer versions of Conda - such
as the version targeted with Galaxy 17.01+.
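
The lookup order described for ``exec`` above (first the ``conda`` on the ``PATH``, then ``<conda_prefix>/bin/conda``) can be sketched as below. This is an assumption-labelled illustration, not Galaxy's implementation: ``find_conda_exec`` is a hypothetical helper, and the ``which`` parameter exists only so the lookup can be replaced in tests.

```python
import posixpath
import shutil


def find_conda_exec(conda_prefix, which=shutil.which):
    """Return the conda executable: the one on the PATH if any, else <conda_prefix>/bin/conda."""
    on_path = which("conda")
    if on_path:
        return on_path
    # Fall back to the executable inside the Conda prefix.
    return posixpath.join(conda_prefix, "bin", "conda")
```

The fallback path is returned without checking that the file exists, so a caller would still need to verify it before running it.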
