This repository contains example notebooks on the usage of TACCO as well as a snakemake workflow to prepare and provide all necessary files for their execution.
To view the examples with their results, open the executed notebooks in the notebooks directory of the main branch or in the examples section of the TACCO documentation.
To execute the examples locally, clone the repository and run the workflow with snakemake. The main branch also contains the executed notebooks and is therefore quite large, while the devel branch contains only the code needed to regenerate the notebooks. To clone only the devel branch, use git clone --single-branch --branch devel git@github.com:simonwm/tacco_examples.git.
If you do not have snakemake set up already, you can follow the snakemake instructions or just create a new environment in your existing conda setup using conda env create -f workflow/envs/snakemake_env.yml
NB: While snakemake recommends installing mamba as a drop-in replacement for conda in the conda base environment, the workflow also works with regular conda if you pass the additional command line arguments --use-conda --conda-frontend conda to snakemake. Setting up the individual environments used to take noticeably longer with older conda versions, so we also recommend using mamba or a recent version of conda.
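The frontend choice described above could be scripted roughly as follows. This is only a sketch for a POSIX shell: the variable names frontend and cmd are illustrative, --cores 1 is an arbitrary value, and the command is printed rather than executed.

```shell
# Prefer mamba when it is on PATH, otherwise fall back to regular conda.
if command -v mamba >/dev/null 2>&1; then
    frontend="mamba"
else
    frontend="conda"
fi

# --use-conda and --conda-frontend are the options named in the note above;
# --cores 1 is an arbitrary choice for this illustration.
cmd="snakemake --use-conda --conda-frontend ${frontend} --cores 1"

# Print the command for inspection instead of running it.
echo "${cmd}"
```

Printing first makes it easy to check which frontend was picked before any environments are created.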
If you don't want to run the full set of examples, check the main workflow/snakemake file for options. You can
- call snakemake without any targets to prepare and run all examples,
- call it with the name of a selected example as first argument (e.g. snakemake slideseq_mouse_olfactory_bulb) to prepare the datasets and run the notebook for this example only, or
- call it with the name of a selected example prefixed with prepare_ as first argument (e.g. snakemake prepare_slideseq_mouse_olfactory_bulb) to prepare the datasets for this example only without running the notebook.
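The three invocation forms can be summarized in a short shell sketch. The example name is the one used in the text; the variable names are illustrative, further options such as the number of cores are left out, and the commands are only printed so the sketch has no side effects.

```shell
# Example name taken from the text; replace it with any example the workflow defines.
example="slideseq_mouse_olfactory_bulb"

# Target that prepares the datasets and runs the notebook for this example.
run_target="${example}"

# Target that only prepares the datasets: the example name with the prefix "prepare_".
prepare_target="prepare_${example}"

# Print the three forms instead of executing them.
echo "snakemake"                       # prepare and run all examples
echo "snakemake ${run_target}"         # prepare and run one example
echo "snakemake ${prepare_target}"     # prepare one example only
```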
For some examples (e.g. slideseq_mouse_olfactory_bulb) there is also a version using just a single sample (e.g. slideseq_mouse_olfactory_bulb_single), which can be used in place of the name of the full example. These versions run faster, as they have to download and process much less data. They also come in handy if your machine does not have enough memory to run the full example.
To run the workflow on the machine on which the snakemake command is executed, use the --cores <N> option. This tells snakemake to plan with N CPU cores for all the steps in the workflow. Note that --cores 1 will not prevent TACCO from using all the cores available on the machine, as this value is not propagated to the notebook. Changing this number therefore only changes the degree of parallelization for the preparation of the datasets, i.e. the number of parallel download and data conversion tasks.
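One possible way to choose N is sketched below. It assumes that getconf is available (as on typical Linux and macOS systems), and leaving one core free is an arbitrary choice of this sketch, not a recommendation from the workflow. As noted above, the value only affects the dataset preparation steps.

```shell
# Number of online CPU cores; fall back to 1 if getconf is unavailable.
total=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Leave one core free for other work, but never go below 1.
if [ "${total}" -gt 1 ]; then
    cores=$((total - 1))
else
    cores=1
fi

# Print the resulting command instead of running it.
echo "snakemake --cores ${cores}"
```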
snakemake supports a wide variety of distributed execution modes. To use them for this workflow consult the snakemake documentation.