
Add API to install resolver tool dependencies #3222

Merged
merged 7 commits into galaxyproject:dev on Nov 28, 2016

@mvdbeek
Member

commented Nov 27, 2016

This adds an API endpoint to install resolver tool dependencies and to build a tool dependency cache (if activated in galaxy.ini).

This brings us closer to managing tool dependencies for non-toolshed tools.
In addition, this is currently the only way to build a cached environment, short of re-installing a tool.

An example to install dependencies for the twobit converter:

```
import bioblend.galaxy

url = 'http://localhost:8080'  # no trailing slash, since the endpoint is joined with '/'
api_key = 'admin_api_key'
tool_id = 'CONVERTER_fasta_to_2bit'
endpoint = "api/tools/%s/install_dependencies" % tool_id
gi = bioblend.galaxy.GalaxyInstance(url, api_key)
gi.make_post_request("/".join((url, endpoint)), payload={'id': tool_id})
```

If `use_cached_dependency_manager` is activated in the galaxy.ini,
a cached environment can be built like this:

```
endpoint = "api/tools/%s/build_dependency_cache" % tool_id
gi.make_post_request("/".join((url, endpoint)), payload={'id': tool_id})
```
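For reference, a sketch of the same call without bioblend, using plain requests and assuming the standard `key` query parameter for API authentication:

```
import requests

url = 'http://localhost:8080'
api_key = 'admin_api_key'  # the endpoint presumably requires an admin key
tool_id = 'CONVERTER_fasta_to_2bit'

# POST to the new endpoint; the payload mirrors the bioblend example above.
response = requests.post(
    "%s/api/tools/%s/install_dependencies" % (url, tool_id),
    params={'key': api_key},
    json={'id': tool_id},
)
print(response.status_code, response.text)
```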

mvdbeek added some commits Nov 27, 2016

Add API to install resolver tool dependencies
@bgruening

Member

commented Nov 27, 2016

@mvdbeek this is a great idea! Can we include an automatic `conda clean` in this endpoint, or should this be a separate API endpoint?

@mvdbeek

Member Author

commented Nov 27, 2016

Can we include an automatic `conda clean` in this endpoint, or should this be a separate API endpoint?

I don't know, I guess I would put this under the resolvers endpoint. In there we could have a general cleanup function that would map to `conda clean`, or in the case of Docker to `docker rm`/`docker rmi`. Does that make sense?

mvdbeek added the status/WIP label Nov 27, 2016

@bgruening

Member

commented Nov 27, 2016

Mh ... I think we need to clean at the end of bulk installations, not after every single installation.
To be clear, I mean `conda clean` to remove the tarballs etc., not the installations themselves. This will free a lot of space.
I don't know of anything comparable in the Docker world.
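For illustration, a minimal sketch of the cleanup in question (the subprocess wrapper is hypothetical, not code from this PR; `--tarballs` and `--yes` are standard conda flags):

```
import subprocess

# `conda clean --tarballs --yes` removes downloaded package tarballs only;
# installed environments are left untouched, so this is safe to run after
# bulk installations to reclaim disk space.
subprocess.check_call(['conda', 'clean', '--tarballs', '--yes'])
```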

@bgruening

Member

commented Nov 27, 2016

Overall, this is a nice PR and needed, so +1 from me.

@mvdbeek

Member Author

commented Nov 27, 2016

`conda clean` is not bound to a particular environment or tool, right?
Does a POST to `/api/dependencies_resolvers/clean` or `cleanup` seem reasonable?
I agree, for Docker it is not very apparent what would need to be cleaned, but if we ever support apt, guix or something else, this could be extended to map to the respective cleanup function.

Then again, I'm just hacking away here; input from people who know something about API design is more than welcome.

@bgruening

Member

commented Nov 27, 2016

`conda clean` is not bound to an environment, just like `apt-get clean`. `/api/dependencies_resolvers/clean` seems like a good idea.

mvdbeek removed the status/WIP label Nov 27, 2016

@mvdbeek

Member Author

commented Nov 27, 2016

@bgruening Alright, I've added the `/api/dependencies_resolvers/clean` endpoint and cherry-picked the `conda_clean` function from #2931.
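For example, reusing `gi` and `url` from the PR description (the empty payload is an assumption, since the cleanup is not bound to a particular tool):

```
# POST to the new cleanup endpoint; for the conda resolver this maps to conda clean.
endpoint = "api/dependencies_resolvers/clean"
gi.make_post_request("/".join((url, endpoint)), payload={})
```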

@bgruening

Member

commented Nov 27, 2016

Cool, thanks!

@jmchilton

Member

commented Nov 28, 2016

Awesome sauce - thanks a bunch @mvdbeek!

jmchilton merged commit fd0a52a into galaxyproject:dev on Nov 28, 2016

4 checks passed

api test: Build finished. 230 tests run, 0 skipped, 0 failed.
continuous-integration/travis-ci/pr: The Travis CI build passed
framework test: Build finished. 126 tests run, 0 skipped, 0 failed.
toolshed test: Build finished. 580 tests run, 0 skipped, 0 failed.
@martenson

Member

commented Nov 28, 2016

Does merging this make #2931 obsolete?

martenson added a commit that referenced this pull request Nov 29, 2016

Merge pull request #3227 from mvdbeek/cached_deps_16.10
[16.10] Backport #3106 and #3222: Cached conda environments and API to manage them
@martenson

Member

commented Dec 1, 2016

Do these changes need to be reflected in the Conda FAQ (https://docs.galaxyproject.org/en/master/admin/conda_faq.html)?

We need to keep that resource up to date and top notch.

@mvdbeek @jmchilton

@mvdbeek

Member Author

commented Dec 1, 2016

Yes, I'll try to get to this.

```
@@ -175,6 +176,12 @@ def build_cache(self, requirements, **kwds):
        resolved_dependencies = self.requirements_to_dependencies(requirements, **kwds)
        cacheable_dependencies = [dep for req, dep in resolved_dependencies.items() if dep.cacheable]
        hashed_requirements_dir = self.get_hashed_requirements_path(cacheable_dependencies)
        if kwds.get('force_rebuild', False) and os.path.exists(hashed_requirements_dir):
```

@nsoranzo

nsoranzo Dec 6, 2016

Member

@mvdbeek What happens if an admin installs a tool which has the same set of requirements as a previously installed-and-cached environment (and `force_rebuild` is False)?

@mvdbeek

mvdbeek Dec 7, 2016

Author Member

This is actually a bit inconsistent now (given that initially there was no way to build/rebuild a cache):
If you install a tool with the same set of requirements, you do build the cache again, since this was the only way to fix a broken cache or update a cache with new packages (short of deleting the cache).
But now I think we should expose this in the details section of the tool installation and extend the install repository API endpoint with this parameter. Does that sound like a good idea?

@mvdbeek

mvdbeek Dec 7, 2016

Author Member

In fact we should have a UI to manage these things independently of the install process. The pieces are all there, but I don't feel up to the task of building a "Manage tool dependencies" page. That would be a huge win IMO. I have a clear picture in my head of how this should look, but looking through the codebase I don't really see anything that I could base my work on, like I did for the conda install process.

@nsoranzo

nsoranzo Dec 7, 2016

Member

If you install a tool with the same set of requirements, you do build the cache again, since this was the only way to fix a broken cache

I think that should be done only when using the new `force_rebuild` option, and yes, we need a UI for that!

or update a cache with new packages (short of deleting the cache).

Do you mean updating a package that is already in the cached env, or adding a new package to it? I don't think the second should happen, because that set of requirements would have a different hash. And for the first, rebuilding is probably better.

So, I'd change this code to be:

```
        if os.path.exists(hashed_requirements_dir):
            if kwds.get('force_rebuild', False):
                try:
                    shutil.rmtree(hashed_requirements_dir)
                except Exception:
                    log.warning("Could not delete cached requirements directory '%s'", hashed_requirements_dir)
            else:
                log.debug("Cached environment %s already exists, skipping build", hashed_requirements_dir)
                return
        for dep in cacheable_dependencies:
            dep.build_cache(hashed_requirements_dir)
```

In fact we should have a UI to manage these things independently of the install process. The pieces are all there, but I don't feel up to the task of building a "Manage tool dependencies" page.

I'd love that! Hopefully someone from the core Galaxy Team can help?

@mvdbeek

mvdbeek Dec 7, 2016

Author Member

So, I'd change this code to be:

Yep, that looks perfect. Do you want to open a PR? Otherwise I can include this with a documentation update.

@nsoranzo

nsoranzo Dec 7, 2016

Member

Thanks @mvdbeek, I'll open a PR later.
