New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cached dependency manager #3106

Merged
merged 6 commits into from Nov 23, 2016

Conversation

Projects
None yet
8 participants
@mvdbeek
Copy link
Member

mvdbeek commented Oct 28, 2016

Similar to #2986, implements a mechanism that allows tool dependencies to be cached.

If the use_cached_dependency_manager option is set to True in galaxy.ini, we build a hash of the combination of a tools' requirements, and store the resulting environment in a directory specified by the
tool_dependency_cache_dir option in galaxy.ini.

The approach here is only caching (combinations of) conda dependencies when installing new tools through the ToolShed.
When a dependency combination is not yet cached, conda create will be invoked with tool_dependency_cache_dir/<hash> folder, and otherwise conda install with tool_dependency_cache_dir/<hash>.

If a __name@version dependency exists, but not a cached environment at job runtime, that environment will be used. So to benefit from cached environments for already installed tools, those cached environment need to be created. Currently this can only be done by (re-)installing tools.

ping @abretaud @jmchilton

@mvdbeek mvdbeek changed the title Add cached dependency manager [WIP] Add cached dependency manager Oct 28, 2016

@mvdbeek mvdbeek force-pushed the mvdbeek:cached_dep_manager branch 3 times, most recently from 63451ef to 3fab2b8 Oct 28, 2016

@@ -213,6 +213,19 @@ paste.app_factory = galaxy.web.buildapp:app_factory
# if it cannot find a local copy and conda_exec is not configured.
#conda_auto_init = False

# Certain dependency resolvers (namely conda) take a considerable amount of

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

s/conda/Conda/

@@ -213,6 +213,19 @@ paste.app_factory = galaxy.web.buildapp:app_factory
# if it cannot find a local copy and conda_exec is not configured.
#conda_auto_init = False

# Certain dependency resolvers (namely conda) take a considerable amount of
# time to build an isolated job environment in the job_working_directory. Set
# the following option to true to cache the dependencies in a folder. This

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

s/true/True/

# the following option to true to cache the dependencies in a folder. This
# option is beta and should only be used if you experience long waiting times
# before a job is actually submitted to your cluster. If you activate this
# option, and you install new dependencies you may need to clear out old cached

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

Move comma after dependencies.


# By default the tool_dependency_cache_dir is the _cache directory
# of the tool dependency directory
#tool_dependency_cache_dir = tool_dependency_dir/_cache

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

s/tool_dependency_dir/<tool_dependency_dir>/

This comment has been minimized.

@mvdbeek

mvdbeek Oct 28, 2016

Member

hmm, so

tool_dependency_cache_dir = _cache

?
I would like to indicate that by default we store the cache in tool_dependency_dir/_cache.
Maybe

tool_dependency_cache_dir = <tool_dependency_dir>/_conda

would be better?

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

I guess you saw my comment before I edited it to add quotes, GitHub hides everything between the angle brackets.
Anyway, I suggested to use

<tool_dependency_dir>/_cache

This comment has been minimized.

@mvdbeek

mvdbeek Oct 28, 2016

Member

sorry, forgot about that GH quirk :)

@@ -213,6 +213,19 @@ paste.app_factory = galaxy.web.buildapp:app_factory
# if it cannot find a local copy and conda_exec is not configured.
#conda_auto_init = False

# Certain dependency resolvers (namely Conda) take a considerable amount of
# time to build an isolated job environment in the job_working_directory if the
# job working diretory is on a network share. Set the following option to True

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

s/diretory/directory/

# while other resolvers may not create the hashed_requirements_dir
os.mkdir(hashed_requirements_dir)
with open(os.path.join(hashed_requirements_dir, 'dep_commands.sh'), 'w') as cmds_f:
[cmds_f.write("%s\n" % line) for line in commands]

This comment has been minimized.

@nsoranzo

nsoranzo Oct 28, 2016

Member

This looks "perlish":

for line in commands:
    cmds_f.write("%s\n" % line)
# used if you experience long waiting times before a job is actually submitted
# to your cluster. If you activate this option and install or remove dependencies,
# you may need to clear out old cached environments
#use_cached_dependency_manager = False

This comment has been minimized.

@bgruening

bgruening Oct 29, 2016

Member

Should we name this something like conda?
People could be confused if they are using Docker.

This comment has been minimized.

@mvdbeek

mvdbeek Oct 29, 2016

Member

In principle this works for all non-container resolver types, except that you get the biggest benefits with conda. But maybe I should find another angle and do the caching only for conda ... I kind of like inheriting from DependencyManager though.
Let's see if I can come up with an elegant way to do this for conda only. That would reduce the number of environments that would be cached.

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Oct 29, 2016

In my testing this is working pretty well:
If I uninstall a dependency, it is not being used from the cache (because the resolver doesn't find it), if I reinstall it it is being picked up from the cache, if I downgrade a dependency inside the __<name>@<version> environment, the downgraded version will be used (conda install into the pre-existing hash dir) etc. So whatever happens, we do not diverge from the packages that would be chosen by the dependency resolver.

@mvdbeek mvdbeek changed the title [WIP] Add cached dependency manager Add cached dependency manager Oct 29, 2016

@mvdbeek mvdbeek removed the status/WIP label Oct 29, 2016

@mvdbeek mvdbeek added this to the 17.01 milestone Oct 29, 2016

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Oct 31, 2016

Sorry I didn't respond sooner - I was at a conference last week.

I appreciate the effort and I know this is solving a real problem, it isn't exactly how I would have gone about it - kind of close though.

For instance this doesn't handle locking well right? There could be a race condition there because the bash stuff is pretty basic? I'd also hope the cache conda dependencies thing would sit beside conda somewhere by default to solve @natefoo's linking problems (this PR does that I think) and would cause the cached environment creations during TS installs (to solve @natefoo's RO problems).

What do you think about this modification to the approach? The TS piece doesn't need to be apart of this - but I'm a bit worried about the race condition.

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Oct 31, 2016

I think conda does have a mechanism to prevent race conditions -- I have
seen lock file creation when creating / updating the environment, but I can
check how robust this is when I fire a lot of parallel jobs before the
environment is cached.

I can definitely add the cache creation during TS install, that should be
just calling the build_dependency_shell commands.

On Oct 31, 2016 2:19 PM, "John Chilton" notifications@github.com wrote:

Sorry I didn't respond sooner - I was at a conference last week.

I appreciate the effort and I know this is solving a real problem, it
isn't exactly how I would have gone about it - kind of close though.

For instance this doesn't handle locking well right? There could be a race
condition there because the bash stuff is pretty basic? I'd also hope the
cache conda dependencies thing would sit beside conda somewhere by default
to solve @natefoo https://github.com/natefoo's linking problems (this
PR does that I think) and would cause the cached environment creations
during TS installs (to solve @natefoo https://github.com/natefoo's RO
problems).

What do you think about this modification to the approach? The TS piece
doesn't need to be apart of this - but I'm a bit worried about the race
condition.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#3106 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AGfVpQ71kxXe_7EE33DeBW77kavevDTrks5q5erSgaJpZM4KjhXz
.

@mvdbeek mvdbeek added status/WIP and removed status/review labels Oct 31, 2016

@mvdbeek mvdbeek changed the title Add cached dependency manager [WIP] Add cached dependency manager Oct 31, 2016

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Oct 31, 2016

Hmm, conda is actually locking, both for the creation of the environment as well as when calling conda install on existing environments. that means only 1 job at a time is being prepared for submission :(.
I guess creating the cached environments at tool install time is the way to go, and then skipping the environment building when creating jobs. Which means that manual changes to the __name@version environments will not be automatically propagated to the cache. I guess that would be good enough.

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Oct 31, 2016

Which means that manual changes to the __name@version environments will not be automatically propagated to the cache. I guess that would be good enough.

Do we actually still need the __name@version environments?

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Nov 1, 2016

Do we actually still need the __name@version environments?

Good question, I'm not sure. Maybe not; they live within the conda_prefix/envs dir and

  • may be necessary just so that there won't be any conflicts inside the root environment.
  • we may need them because of the --offline environment creation.
  • if we keep them, people can just disable the cached environments without needing to re-install conda dependencies

Do you know more about this, @bgruening ?
Maybe we can directly create these cached environments. But even if we eliminate the __name@versionenvironments, it will be hard to do any manual maintenance work. Perhaps we can add functionality to planemo or ephemeris to do this?!

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Nov 14, 2016

@jmchilton I've rebased the PR and updated the PR description.

@natefoo

This comment has been minimized.

Copy link
Member

natefoo commented Nov 15, 2016

I believe this is exactly what I need to run these directly from CVMFS. I can give it a test if I can remember one of the tools that had the problems.

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Nov 15, 2016

Conflicting again? grrrr

mvdbeek added some commits Oct 28, 2016

Add cached dependency manager
Similar to #2986, implement
a mechanism that allows tool dependencies to be cached.

If the `use_cached_dependency_manager` option is set to True in
galaxy.ini, we build a hash of the combination of a tools' requirements,
and store the resulting environment in a directory specified by the
`tool_dependency_cache_dir` option in galaxy.ini.
Rework cached dependencies to be more robust
to updates or changes in depedencies, folder structure and resolver
configuration. Instead of hashing name, type and version of a
dependency, hash the json representation of the dependencies returned by
the dependency resolver, which include the path to the environment and
the depedency type. This is only applied to resolvers whose cacheable
attribute is set to True (conda-only, currently).
Build dependency cache at install time
and only activate cached environments if they exist.
Only attempt to build cache once per install
and override __eq__ for ToolRequirement, to simplify
checking if ToolRequirements are already installed/cached.

@mvdbeek mvdbeek force-pushed the mvdbeek:cached_dep_manager branch from c02666c to df93db4 Nov 23, 2016

@mvdbeek mvdbeek added status/review and removed status/WIP labels Nov 23, 2016

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Nov 23, 2016

OK, rebased it again.

@jmchilton

This comment has been minimized.

Copy link
Member

jmchilton commented Nov 23, 2016

Thanks for the PR - this was really needed!

@jmchilton jmchilton merged commit db93fd3 into galaxyproject:dev Nov 23, 2016

3 of 4 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
api test Build finished. 230 tests run, 0 skipped, 0 failed.
Details
framework test Build finished. 126 tests run, 0 skipped, 0 failed.
Details
toolshed test Build finished. 580 tests run, 0 skipped, 0 failed.
Details
@lparsons

This comment has been minimized.

Copy link
Contributor

lparsons commented Nov 29, 2016

So, most of my conda installed tools are failing with the error Conda dependency seemingly installed but failed to build job environment. which means I'm desperate for a fix. Is there a recommended way to install this beyond just moving to a bleeding edge build?

@lparsons

This comment has been minimized.

Copy link
Contributor

lparsons commented Nov 29, 2016

Perhaps we could get this into release_16.10? Conda dependencies simply do not work (reliably) without it.

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Nov 29, 2016

It's working well for me, but for convenience we should also backport #3222. Otherwise you would need to re-install these tools.

@lparsons

This comment has been minimized.

Copy link
Contributor

lparsons commented Nov 29, 2016

Hmmm, what are you doing to avoid the ... failed to build job environment error? Seems to happen whenever I run jobs that use a conda dependency over a collection. At this point, I'm thinking I'll have to deactivate all tools that use conda deps and either hold off on them 😦 or perhaps write module files or something to avoid using conda. Any tips would be greatly appreciated.

BTW, thanks so much for this PR! Can't wait to be able to use conda!

@lparsons

This comment has been minimized.

Copy link
Contributor

lparsons commented Nov 29, 2016

Also, I didn't think release_16.10 was out yet, so not really backporting... 😉

@nsoranzo

This comment has been minimized.

Copy link
Member

nsoranzo commented Nov 29, 2016

@lparsons 16.10 is not out, but it's feature freezed. Whether this is a new feature or a bug fix, or if an exception should be granted, is up for discussion.

@lparsons

This comment has been minimized.

Copy link
Contributor

lparsons commented Nov 29, 2016

@nsoranzo Well, since it wouldn't be enabled by default, I think it's relatively safe. And it's certainly attempting to resolve a showstopper of a bug for me, so in some sense it is a "bug fix". ;-)

@mvdbeek

This comment has been minimized.

Copy link
Member

mvdbeek commented Nov 29, 2016

Hmmm, what are you doing to avoid the ... failed to build job environment error? Seems to happen whenever I run jobs that use a conda dependency over a collection

Sorry, I wasn't clear, I meant that I'm using this in production, so I would say this is (probably) not going to break stuff all over the place. I opened PR #3227 for the backport.

@lparsons

This comment has been minimized.

Copy link
Contributor

lparsons commented Nov 29, 2016

@mvdbeek You rock! Thanks.

martenson added a commit that referenced this pull request Nov 29, 2016

Merge pull request #3227 from mvdbeek/cached_deps_16.10
[16.10] Backport #3106 and #3222: Cached conda environments and API to manage them

@mvdbeek mvdbeek deleted the mvdbeek:cached_dep_manager branch Jun 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment