AstraZeneca/persist-seq-galaxy-tools-setup

Persist-Seq scRNA-seq analysis tools

This repo collects all the tools and versions that are installed in a standard Persist-Seq Galaxy instance. CI processes check that the tools and versions are up to date, and update the repo accordingly.

The repo has two main kinds of files: the main YAML files, which list the tools and the sections where they are to be installed, and the lock files, which are generated by the CI process and contain the exact versions of the tools. Humans should mostly edit the YAML files; the CI process will update the lock files.

Process for adding new tools

Does the tool have a Galaxy wrapper in the Galaxy Toolshed? If yes, simply add the tool to the YAML file, in the section where it should be installed, and make a pull request with the change to this repo via the GitHub UI. If not, keep reading.
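As a rough illustration of what "adding the tool to the YAML file" can look like, here is a minimal sketch. It assumes the YAML files follow the ephemeris-style tool list layout (a `tools:` list with `name`, `owner` and `tool_panel_section_label` keys); the file name, tool name, owner and section label are all hypothetical, so check an existing entry in this repo before copying it.

```shell
# Hypothetical file name, written to a scratch directory; in practice you would
# edit the YAML file for your target section in this repo.
tool_list="$(mktemp -d)/example_tools.yaml"

# Assumed ephemeris-style layout: one list entry per Toolshed repository.
cat > "$tool_list" <<'EOF'
tools:
- name: example_tool            # hypothetical Toolshed repository name
  owner: example_owner          # hypothetical Toolshed owner
  tool_panel_section_label: "Example section"
EOF

# Show the resulting entry.
cat "$tool_list"
```

No version is given in the entry on purpose: as described above, the exact versions live in the lock files that the CI process generates.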

This README gives only a barebones overview of how to add new tools. For more details, we recommend the excellent tool writing tutorial from the Galaxy Training Network.

Tool dependencies

Galaxy wrappers need to resolve their dependencies on specific tools (e.g. a wrapper that uses elements from Seurat needs to resolve a dependency that makes Seurat available to it).

Does the tool (or its main dependency) exist in the bioconda channel or in conda-forge? If yes, continue to the next section. If not, add it to one of those channels following their instructions. Adding it to Bioconda has the advantage that a biocontainer is generated for it automatically (which is needed for our cloud instances and for HPC instances that use Singularity). The tool writing tutorial from the Galaxy Training Network has sections on this as well.
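One way to check the channels from the command line is `conda search`. The sketch below wraps it in a small helper; the function name `search_conda_channels` is hypothetical, and it assumes `conda` is already installed and on your `PATH`.

```shell
# Hypothetical helper: search only conda-forge and bioconda for a package.
search_conda_channels() {
  pkg="$1"
  # Fail early with a clear message if conda is not available.
  if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found on PATH" >&2
    return 127
  fi
  # --override-channels ignores any channels configured in .condarc,
  # so only the two channels named here are searched.
  conda search --override-channels -c conda-forge -c bioconda "$pkg"
}
```

For example, `search_conda_channels r-seurat` would list the matching builds, or report that no match was found in either channel.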

Development environment for tool wrappers

Once there is a dependency that the Galaxy wrapper can resolve to, you can write the Galaxy wrapper. If you use VSCode, you could set this up on a cluster and work on it remotely from your local machine (especially if you have an M1 Mac), but you can also do everything on your local machine.

Set up a development environment:

  1. Install planemo in a Python virtual environment:
version=0.75.7
virtualenv -p python3 "venv_planemo@$version"
source "venv_planemo@$version/bin/activate"
pip install --upgrade pip wheel
# pin planemo to the version named in the virtual environment
pip install "planemo==$version"

If you are using an M1 Mac, I suggest doing this in the Rosetta environment (unless you are working remotely on a cluster), as Galaxy might not yet have all its dependencies available for arm64.

  2. Clone Galaxy into a folder, using release_22.05:
path_to_galaxy=~/galaxy_installs/galaxy_22.05
git clone -b release_22.05 --depth 1 https://github.com/galaxyproject/galaxy.git "$path_to_galaxy"

This checkout will be used by planemo (through the --galaxy_root parameter, if you run planemo manually).

  3. I suggest using VSCode to write the wrapper, as it integrates nicely with planemo. Install the planemo extension and the Galaxy language server. Tell the extension where the Galaxy folder is and where the planemo virtual environment is. If you have Docker installed, you can also let the extension use it by adding the extra planemo parameter --biocontainers.

  4. Fork the Galaxy Tools IUC repo in the GitHub interface, and clone it locally:

git clone https://github.com/<my-git-username>/tools-iuc.git tools-iuc
  5. Start a git branch on this repo, based on the latest main/master from Tools IUC. If you have had the repo for a while, make sure that your fork is up to date with the main repo. You can do this in the GitHub interface or on the command line (by adding a new remote that points to the main repo, and then pulling main from it):
# add the main repo as a remote (only needed once)
git remote add iuc https://github.com/galaxyproject/tools-iuc.git
# bring in any changes from the main repo
git fetch iuc
# the first time, create a local branch that tracks the main branch of the main repo (only needed once)
git checkout -b iuc/main iuc/main
# on later occasions, simply pull changes from the main repo onto that branch
git pull iuc main
# now create a new branch for your tool, based on the iuc/main branch (only needed once per tool)
git checkout -b my-new-tool
  6. Start coding in your editor or in VSCode. You can get some initial boilerplate from ChatGPT or other AI assistants by asking something like "Write a Galaxy tool wrapper for tool XYZ". The main reference for tool wrapping is the Galaxy Tool XML schema. The VSCode extensions will help you run planemo tests for the tool. You can also find useful resources at https://galaxyproject.org/tools/, at https://planemo.readthedocs.io/en/latest/writing.html, and in the excellent tool writing tutorial from the Galaxy Training Network. Additional sources of help are the Galaxy Gitter and the Galaxy Development mailing list.
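To give an idea of what the starting boilerplate can look like, here is a minimal sketch that writes a small wrapper to a scratch directory. Everything specific in it is a placeholder chosen for illustration: the tool id and name, the `coreutils` requirement and its version, and the test file names. Treat it as a shape to adapt while checking each element against the Galaxy Tool XML schema, not as a wrapper ready for Tools IUC.

```shell
# Work in a scratch directory so that nothing in your tools-iuc clone is touched.
workdir="$(mktemp -d)"
tool_xml="$workdir/example_tool.xml"

# Hypothetical wrapper: id, name, requirement and version are placeholders.
# The quoted 'EOF' keeps the shell from expanding $input, $output and $n_lines,
# which are filled in by Galaxy when the job runs.
cat > "$tool_xml" <<'EOF'
<tool id="example_head" name="Example: first lines" version="0.1.0+galaxy0" profile="22.05">
    <description>keeps the first lines of a text file</description>
    <requirements>
        <requirement type="package" version="9.1">coreutils</requirement>
    </requirements>
    <command detect_errors="exit_code"><![CDATA[
        head -n $n_lines '$input' > '$output'
    ]]></command>
    <inputs>
        <param name="input" type="data" format="txt" label="Input text file"/>
        <param name="n_lines" type="integer" value="10" min="1" label="Number of lines to keep"/>
    </inputs>
    <outputs>
        <data name="output" format="txt"/>
    </outputs>
    <tests>
        <test expect_num_outputs="1">
            <param name="input" value="example_input.txt"/>
            <param name="n_lines" value="2"/>
            <output name="output" file="example_output.txt"/>
        </test>
    </tests>
    <help><![CDATA[
Keeps the first N lines of the input file.
    ]]></help>
</tool>
EOF

echo "Wrote $tool_xml"
```

With the planemo virtual environment active, `planemo lint` on that file checks the wrapper's structure, and `planemo test --galaxy_root <your Galaxy checkout>` runs its tests. The test step also expects the hypothetical example_input.txt and example_output.txt files in a test-data folder next to the XML.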