diff --git a/book/_config.yml b/book/_config.yml
index f81c0bd..c967efc 100644
--- a/book/_config.yml
+++ b/book/_config.yml
@@ -53,4 +53,17 @@ html:
parse:
 myst_substitutions:
-   miniconda_url: "[Miniconda](https://conda.io/miniconda.html)"
\ No newline at end of file
+   miniconda_url: "[Miniconda](https://conda.io/miniconda.html)"
+   release_epoch: "2024.5"
+   tutorial_environment_block: |
+     ````{admonition} Reminder
+     :class: tip
+
+     These examples assume that you have a QIIME 2 deployment that includes the [q2-dwq2](https://github.com/caporaso-lab/q2-dwq2) educational plugin.
+     Follow the instructions in [](tutorial-setup) if you'd like to follow along with this tutorial.
+     If you've already followed those instructions, be sure to activate your conda environment before starting this tutorial:
+
+     ```bash
+     conda activate using-qiime2
+     ```
+     ````
\ No newline at end of file
diff --git a/book/_toc.yml b/book/_toc.yml
index 2868e86..4308d44 100644
--- a/book/_toc.yml
+++ b/book/_toc.yml
@@ -4,18 +4,22 @@ parts:
- caption: Tutorials
  chapters:
  - file: tutorials/intro
+ - file: tutorials/parallel-pipeline
- caption: How-tos
  chapters:
  - file: how-to-guides/merge-metadata
  - file: how-to-guides/validate-metadata
  - file: how-to-guides/artifacts-as-metadata
  - file: how-to-guides/view-visualizations
+ - file: how-to-guides/pipeline-resumption
- caption: Explanations
  chapters:
  - file: explanations/metadata
+ - file: explanations/types-of-parallelization
- caption: References
  chapters:
  - file: references/metadata
+ - file: references/parallel-configuration
- caption: Back matter
  chapters:
  - file: back-matter/glossary
diff --git a/book/back-matter/glossary.md b/book/back-matter/glossary.md
index 8a8dfb9..62981fa 100644
--- a/book/back-matter/glossary.md
+++ b/book/back-matter/glossary.md
@@ -12,6 +12,12 @@ artifact
 When written to file, artifacts typically have the extension {term}`qza`.
 Artifacts can be provided as input to QIIME 2 {term}`actions ` or exported from QIIME 2 for use with other software.
+breaking change
+ A *breaking change* is a change to how a program works (for example, a QIIME 2 plugin or interface) that introduces an incompatibility with earlier versions of the program.
+ Such a change will generally require users to modify how they use some aspect of the system.
+ For example, if a plugin method added a new required input in version 2, that would be a breaking change with respect to version 1: calling the method without that new parameter would fail in version 2, but would have succeeded with version 1.
+ This may also be called a backward-incompatible change or an API change.
+
DRY
 An acronym of *Don't Repeat Yourself*, and a critical principle of software engineering that is equally applicable in research data management.
 For more information on DRY and software engineering in general, see {cite:t}`pragprog20`.
@@ -33,6 +39,11 @@ plugin
 As of this writing, a collection of plugins that are installed together is referred to as a distribution.
 Additional plugins can be installed, and the primary resource enabling discovery of additional plugins is the [QIIME 2 Library](https://library.qiime2.org).
+Python 3 API
+ QIIME 2's Application Programming Interface.
+ This allows advanced users to access all QIIME 2 analytic functionality directly in Python.
+ This can be very convenient for developing tools that use QIIME 2 as a component, or for performing data analysis without writing intermediate data artifacts to disk unless you specifically want to.
+
q2cli
 [q2cli](https://github.com/qiime2/q2cli) is the original (and still primary, as of March 2024) command line interface for QIIME 2.
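To make the *breaking change* glossary entry above concrete, here is a minimal, plugin-agnostic Python sketch (the function names are hypothetical and not part of any QIIME 2 API): version 2 of a method adds a new required input, so calls written against version 1 fail.

```python
# Version 1 of a hypothetical plugin method: two required inputs.
def nw_align_v1(query, reference):
    return (query, reference)

# Version 2 adds a new required input -- a breaking change with respect
# to version 1, because version-1 style calls no longer succeed.
def nw_align_v2(query, reference, gap_penalty):
    return (query, reference, gap_penalty)

nw_align_v1("ACGT", "ACGA")      # succeeded in version 1
try:
    nw_align_v2("ACGT", "ACGA")  # fails in version 2: gap_penalty is required
except TypeError as err:
    print(f"breaking change: {err}")
```

A version bump like this is exactly the situation where users must modify their existing commands or scripts before upgrading.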
diff --git a/book/explanations/metadata.md b/book/explanations/metadata.md
index a20c7b9..acf2b3b 100644
--- a/book/explanations/metadata.md
+++ b/book/explanations/metadata.md
@@ -1,5 +1,5 @@
(metadata-explanation)=
-# Metadata in QIIME 2
+# Sample and feature metadata
Metadata provides the key to gaining biological insight from your data.
In QIIME 2, **sample metadata** may include technical details, such as the DNA barcodes that were used for each sample in a multiplexed sequencing run, or descriptions of the samples, such as which subject, time point, and body site each sample came from in a human microbiome time series.
diff --git a/book/explanations/types-of-parallelization.md b/book/explanations/types-of-parallelization.md
new file mode 100644
index 0000000..e4d6cce
--- /dev/null
+++ b/book/explanations/types-of-parallelization.md
@@ -0,0 +1,14 @@
+(types-of-parallel-support)=
+# Types of parallel computing support
+
+## Parallel Pipeline execution
+
+QIIME 2's formal parallel computing support uses [Parsl](https://parsl.readthedocs.io/en/stable/1-parsl-introduction.html), and enables parallel execution of QIIME 2 {term}`Pipeline` actions.
+All QIIME 2 `Pipelines` expose parallel computing options, notably the `--parallel` flag in {term}`q2cli`, though whether those options actually result in parallel execution is up to the implementation of the `Pipeline`.
+Actions using this formal parallel computing support can make use of high-performance computing hardware that doesn't necessarily have shared memory.
+
+## Informal parallel support
+
+Some {term}`Method` actions (e.g., `qiime dada2 denoise-*`) wrap multi-threaded applications and may define a parameter (like `--p-n`) that gives the user control over the number of threads or jobs used.
+The QIIME 2 parameter type associated with these parameters should always be `NTHREADS` or `NJOBS` (if you observe a parameter where this isn't the case, it was probably an error on the developer's part; reach out on the forum to let us know).
+Actions using this informal parallel computing support are generally restricted to running on systems with shared memory.
diff --git a/book/how-to-guides/artifacts-as-metadata.md b/book/how-to-guides/artifacts-as-metadata.md
index 6ac2488..c6474b0 100644
--- a/book/how-to-guides/artifacts-as-metadata.md
+++ b/book/how-to-guides/artifacts-as-metadata.md
@@ -1,5 +1,5 @@
(view-artifacts-as-metadata)=
-# How to use QIIME 2 Artifacts as Metadata
+# How to use Artifacts as Metadata
In addition to TSV metadata files, QIIME 2 also supports viewing some kinds of artifacts as metadata.
An example of this is artifacts of type `SampleData[AlphaDiversity]`.
diff --git a/book/how-to-guides/pipeline-resumption.md b/book/how-to-guides/pipeline-resumption.md
new file mode 100644
index 0000000..ade272b
--- /dev/null
+++ b/book/how-to-guides/pipeline-resumption.md
@@ -0,0 +1,35 @@
+(pipeline-resumption)=
+# How to resume failed Pipeline runs
+
+If a {term}`Pipeline` fails at some point during its execution and you rerun it, QIIME 2 can attempt to reuse the results that the `Pipeline` calculated before it failed.
+
+## Pipeline resumption through the command line interface (CLI)
+
+By default, when you run a {term}`Pipeline` on the CLI, QIIME 2 will create a pool in its cache (either the default cache, or the cache specified using the `--use-cache` parameter).
+This pool will be named based on the scheme: `recycle___`.
+This pool will store all intermediate {term}`Results ` created by the {term}`Pipeline`.
+
+Should the `Pipeline` run succeed, this pool will be removed.
+However, should the `Pipeline` run fail, you can rerun the `Pipeline` using the same command you ran the first time, and the intermediate {term}`Results ` stored in the pool will be reused to avoid redoing steps in the `Pipeline` that had already completed.
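The reuse behavior described above can be pictured with a toy memoization sketch. This is only an illustration of the idea (it is not QIIME 2's actual caching implementation): each completed step deposits its result in a pool, and a rerun skips any step whose result is already pooled.

```python
pool = {}            # stands in for the cache pool of intermediate Results
steps_computed = []  # tracks which steps actually ran

def run_step(name, compute):
    """Return the pooled result for this step, computing it at most once."""
    if name not in pool:
        steps_computed.append(name)
        pool[name] = compute()
    return pool[name]

def pipeline():
    hits = run_step("search", lambda: ["hit-1", "hit-2"])
    return run_step("summarize", lambda: f"{len(hits)} hits found")

first = pipeline()   # both steps compute and their results are pooled
second = pipeline()  # a "rerun": both results are reused from the pool
```

In the real system, a failed run leaves its pool behind, so the rerun starts with the pool already partially populated rather than empty.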
+
+To specify the pool that you would like QIIME 2 to use, either on a `Pipeline`'s first run or on a resumption, pass the `--recycle-pool` option followed by the name of the pool you wish to use.
+This pool will be created in the cache if it does not already exist.
+The `--no-recycle` flag may be passed if you do not want QIIME 2 to attempt to recycle any past {term}`Results ` or to save its {term}`Results ` from this run for future reuse.
+
+Prior {term}`Results ` will not necessarily be reusable if the inputs you provide to the `Pipeline` on resumption differ from those provided on the initial run.
+In this situation, QIIME 2 will still try to reuse any {term}`Results ` that are not dependent on the inputs that changed, but there is no guarantee any will be usable.
+
+## Pipeline resumption through the Python 3 API
+
+When using the Python API, pools are specified using context managers (i.e., using Python's `with` statement).
+If you don't want to enable resumption, don't use the context manager.
+
+```python
+from qiime2.core.cache import Cache
+
+cache = Cache('cache_path')
+pool = cache.create_pool('pool', reuse=True)
+
+with pool:
+    ...  # run your pipeline here
+```
\ No newline at end of file
diff --git a/book/how-to-guides/view-visualizations.md b/book/how-to-guides/view-visualizations.md
index e0f5b7f..dfe64b7 100644
--- a/book/how-to-guides/view-visualizations.md
+++ b/book/how-to-guides/view-visualizations.md
@@ -1,5 +1,5 @@
(view-visualizations)=
-# How to view QIIME 2 Visualizations
+# How to view Visualizations
## QIIME 2 View
diff --git a/book/references/parallel-configuration.md b/book/references/parallel-configuration.md
new file mode 100644
index 0000000..4e09e4e
--- /dev/null
+++ b/book/references/parallel-configuration.md
@@ -0,0 +1,183 @@
+(parallel-configuration)=
+# Parallel Pipeline configuration
+
+QIIME 2 provides formal support for parallel computing of {term}`Pipelines ` through [Parsl](https://parsl.readthedocs.io/en/stable/1-parsl-introduction.html).
+
+## Parsl configuration
+
+A [Parsl configuration](https://parsl.readthedocs.io/en/stable/userguide/configuring.html) tells Parsl what resources are available and how to use them, and is required to use Parsl.
+The [Parsl documentation](https://parsl.readthedocs.io/en/stable/) provides full detail on [Parsl configuration](https://parsl.readthedocs.io/en/stable/userguide/configuring.html).
+
+In the context of QIIME 2, Parsl configuration information is maintained in a QIIME 2 configuration file.
+QIIME 2 configuration files are stored on disk as [TOML](https://toml.io/en/) files.
+
+### Default Parsl configuration
+
+For basic multi-processor usage, QIIME 2 writes a default configuration file the first time it's needed (e.g., if you instruct QIIME 2 to execute in parallel without a particular configuration).
+
+The default `qiime2_config.toml` file, as of QIIME 2 2024.10, looks like the following:
+
+(default-parsl-configuration-file)=
+```
+[parsl]
+strategy = "None"
+
+[[parsl.executors]]
+class = "ThreadPoolExecutor"
+label = "tpool"
+max_threads = ...
+
+[[parsl.executors]]
+class = "HighThroughputExecutor"
+label = "default"
+max_workers = ...
+
+[parsl.executors.provider]
+class = "LocalProvider"
+```
+
+When this file is written to disk, the `max_threads` and `max_workers` values (represented above by `...`) are computed by QIIME 2 as one less than the CPU count on the computer where it is running (`max(psutil.cpu_count() - 1, 1)`).
+
+This configuration defines two `Executors`:
+
+1. The [`ThreadPoolExecutor`](https://parsl.readthedocs.io/en/stable/stubs/parsl.executors.ThreadPoolExecutor.html?highlight=Threadpoolexecutor), which parallelizes jobs across multiple threads in a process.
+2. The [`HighThroughputExecutor`](https://parsl.readthedocs.io/en/stable/stubs/parsl.executors.HighThroughputExecutor.html?highlight=HighThroughputExecutor), which parallelizes jobs across multiple processes.
+
+In this case, the `HighThroughputExecutor` is designated as the default because its `label` value is `default`.
+Your Parsl configuration **must** define an executor with the label `default`, and this is the executor that QIIME 2 will use to dispatch your jobs if you do not specify an alternative.
+
+````{admonition} The parsl.Config object
+:class: tip
+
+This Parsl configuration is ultimately read into a `parsl.Config` object internally in QIIME 2.
+The `parsl.Config` object that corresponds to the above example would look like the following:
+
+```python
+config = parsl.Config(
+    executors=[
+        ThreadPoolExecutor(
+            label='tpool',
+            max_threads=...
# will be an integer value
+        ),
+        HighThroughputExecutor(
+            label='default',
+            max_workers=...,  # will be an integer value
+            provider=LocalProvider()
+        )
+    ],
+    strategy=None
+)
+```
+````
+
+### Parsl configuration, line-by-line
+
+The first line of [the default configuration file presented above](default-parsl-configuration-file) indicates that this is the parsl section (or [table](https://toml.io/en/v1.0.0#table), to use TOML's terminology) of our configuration file.
+
+```
+[parsl]
+```
+
+The next line:
+
+```
+strategy = "None"
+```
+
+is a top-level Parsl configuration parameter that you can [read more about in the Parsl documentation](https://parsl.readthedocs.io/en/stable/userguide/configuring.html#multi-threaded-applications).
+This may need to be set differently depending on your system.
+
+Next, the first executor is added.
+
+```
+[[parsl.executors]]
+class = "ThreadPoolExecutor"
+label = "tpool"
+max_threads = 7
+```
+
+The double square brackets (`[[ ... ]]`) indicate that [this is an array](https://toml.io/en/v1.0.0#array-of-tables), `executors`, nested under the `parsl` table.
+`class` indicates the specific Parsl class that is being configured ([`parsl.executors.ThreadPoolExecutor`](https://parsl.readthedocs.io/en/stable/stubs/parsl.executors.ThreadPoolExecutor.html#parsl.executors.ThreadPoolExecutor) in this case); `label` provides a label that you can use to refer to this executor elsewhere; and `max_threads` is a configuration value for the `ThreadPoolExecutor` class that corresponds to a parameter of the class's constructor.
+In this example a value of 7 is specified for `max_threads`, but as noted above this will be computed specifically for your machine when this file is created.
+
+Parsl's `ThreadPoolExecutor` runs on a single node, so we provide a second executor which can utilize up to 2000 nodes.
+ +``` +[[parsl.executors]] +class = "HighThroughputExecutor" +label = "default" +max_workers = 7 + +[parsl.executors.provider] +class = "LocalProvider" +``` + +The definition of this executor, [`parsl.executors.HighThroughputExecutor`](https://parsl.readthedocs.io/en/stable/stubs/parsl.executors.HighThroughputExecutor.html#parsl.executors.HighThroughputExecutor), looks similar to the definition of the `ThreadPoolExecutor`, but it additionally defines a `provider`. +The provider class provides access to computational resources. +In this case, we use [`parsl.providers.LocalProvider`](https://parsl.readthedocs.io/en/stable/stubs/parsl.providers.LocalProvider.html), which provides access to local resources (i.e., on the laptop or workstation). +[Other providers are available as well](https://parsl.readthedocs.io/en/stable/reference.html#providers), including for Slurm, Amazon Web Services, Kubernetes, and more. + +### Mapping {term}`Actions ` to executors + +An executor mapping can be added to your parsl configuration that defines which actions should run on which executors. +If an action is unmapped, it will run on the default executor. +This can be specified as follows: + +``` +[parsl.executor_mapping] +action_name = "tpool" +``` + +```{warning} +The mechanism for specifying action names at present does not handle the case of different plugins defining actions with the same name. +This mechanism will likely change soon, and may be a {term}`breaking change`. +You can track progress on this [here](https://github.com/qiime2/qiime2/issues/802). +``` + +(view-parsl-configuration)= +### Viewing the current configuration + +Using {term}`q2cli`, you can see your current `qiime2_config.toml` file by running: + +```shell +qiime info --config-level 2 +``` + +(qiime2-configuration-precedence)= +### QIIME 2 configuration file precedence + +When QIIME 2 needs configuration information, the following precedence order is followed to load a configuration file: + +1. 
The path specified in the environment variable `$QIIME2_CONFIG`.
+2. The file at `<user_config_dir>/qiime2/qiime2_config.toml`
+3. The file at `<site_config_dir>/qiime2/qiime2_config.toml`
+4. The file at `$CONDA_PREFIX/etc/qiime2_config.toml`
+
+If no configuration is found after checking those four locations, QIIME 2 writes a default configuration file to `$CONDA_PREFIX/etc/qiime2_config.toml` and uses that.
+This implies that after your first time running QIIME 2 in parallel without a config in any of the first three locations, the path referenced in step 4 will exist and contain a configuration file.
+
+Alternatively, when using {term}`q2cli`, you can provide a specific configuration file for Parsl using the `--parallel-config` option.
+If provided, this overrides the precedence order above.
+
+````{admonition} user_config_dir and site_config_dir
+:class: note
+On Linux, `user_config_dir` will usually be `$HOME/.config/qiime2/`.
+On macOS, it will usually be `$HOME/Library/Application Support/qiime2/`.
+
+You can find the directory used on your system by running the following command:
+
+```bash
+python -c "import appdirs; print(appdirs.user_config_dir('qiime2'))"
+```
+
+On Linux, `site_config_dir` will usually be something like `/etc/xdg/qiime2/`, but it may vary based on Linux distribution.
+On macOS, it will usually be `/Library/Application Support/qiime2/`.
+
+You can find the directory used on your system by running the following command:
+
+```bash
+python -c "import appdirs; print(appdirs.site_config_dir('qiime2'))"
+```
+````
diff --git a/book/tutorials/intro.md b/book/tutorials/intro.md
index 65ac606..242aba3 100644
--- a/book/tutorials/intro.md
+++ b/book/tutorials/intro.md
@@ -1,19 +1,154 @@
-(tutorials)=
-# Tutorials
-
-Lorem ipsum dolor sit amet, consectetur adipiscing elit.
-Fusce interdum leo ut blandit hendrerit.
-Duis fermentum tellus ut neque tincidunt, quis semper dui luctus.
-Etiam rhoncus hendrerit diam, non molestie elit facilisis a.
-Ut porttitor cursus erat vel ultricies.
-Sed consectetur ultrices ante sit amet porttitor.
-Phasellus eget efficitur ipsum, quis congue ipsum.
-Integer egestas congue nunc, et dictum est consequat at.
-Aenean dapibus hendrerit semper.
-Morbi eu turpis ac nibh ornare sollicitudin.
-Cras ullamcorper dictum scelerisque.
-Sed ac elementum odio, vitae congue lacus.
-Praesent id vestibulum mi.
-Nam et sodales sapien, eget posuere nisl.
-Integer et mi nec leo rutrum finibus.
-Vestibulum mollis enim sagittis turpis tristique, a accumsan sem auctor.
+(tutorial-setup)=
+# Getting started
+
+The tutorials in *Using QIIME 2* provide basic information on how to use the QIIME 2 framework, {term}`q2cli` (i.e., the official QIIME 2 command line interface), and the QIIME 2 Python 3 API.
+The tutorials make use of the *Tiny Distribution* and the QIIME 2 example plugin `q2-dwq2`.
+This deployment allows for in-depth learning of how QIIME 2 itself works, and the information you learn here will be relevant across all QIIME 2 distributions.
+When you're ready to perform your own data analysis, you'll transition to domain-specific plugins or distributions and their documentation.
+
+Before attempting to run the *Using QIIME 2* tutorials, configure your learning environment by following the steps here.
+
+```{warning}
+The installation instructions in this document are not finalized, though they should work.
+Updated installation instructions will soon be added.
+```
+
+## Install Prerequisites
+
+{{ miniconda_url }} provides the ``conda`` environment and package manager, and is currently the only supported way to install QIIME 2.
+Follow the instructions for downloading and installing Miniconda.
+
+After installing Miniconda and opening a new terminal, make sure you're running the latest version of ``conda``:
+
+```bash
+conda update conda
+```
+
+## Install the QIIME 2 "Tiny Distribution"
+
+The QIIME 2 "Tiny Distribution" is a minimal set of QIIME 2 functionality for building and using plugins through the QIIME 2 command line or Python 3 API.
+Here we'll install the most recent release version of QIIME 2, QIIME 2 {{ release_epoch }}.
+
+% Unfortunately we can't use MyST substitutions in literals, so including `{{ release_epoch }}` in the following install commands won't work.
+% We need to figure out how we want to address this in the future to automatically update these commands on new releases.
+% Options are having a `latest` URL for release versions, or running a pre-processing script to update these docs before building them.
+
+`````{tab-set}
+````{tab-item} macOS
+```bash
+conda env create -n using-qiime2 --file https://data.qiime2.org/distro/tiny/qiime2-tiny-2024.5-py39-osx-conda.yml
+```
+````
+
+````{tab-item} Linux
+```bash
+conda env create -n using-qiime2 --file https://data.qiime2.org/distro/tiny/qiime2-tiny-2024.5-py39-linux-conda.yml
+```
+````
+
+````{tab-item} macOS (Apple Silicon)
+```bash
+CONDA_SUBDIR=osx-64 conda env create -n using-qiime2 --file https://data.qiime2.org/distro/tiny/qiime2-tiny-2024.5-py39-osx-conda.yml
+conda activate using-qiime2
+conda config --env --set subdir osx-64
+```
+````
+
+````{tab-item} Windows (via WSL)
+```bash
+conda env create -n using-qiime2 --file https://data.qiime2.org/distro/tiny/qiime2-tiny-2024.5-py39-linux-conda.yml
+```
+````
+
+`````
+
+## Activate the ``conda`` environment
+
+You can now activate the environment you just created as follows.
+ +```bash +conda activate using-qiime2 +``` + +To test your QIIME 2 environment, run: + +```bash +qiime info +``` + +You should see something like the following, though the version numbers you'll see may be different: + +``` +System versions +Python version: 3.9.19 +QIIME 2 release: 2024.5 +QIIME 2 version: 2024.5.1 +q2cli version: 2024.5.0 + +Installed plugins +types: 2024.5.0 + +Application config directory +/Users/jgc/miniconda3/envs/using-qiime2/var/q2cli + +Getting help +To get help with QIIME 2, visit https://qiime2.org +``` + +At this stage you have a working QIIME 2 environment, but it doesn't do a whole lot. +To add some bioinformatics functionality, we'll next add the QIIME 2 example plugin [`q2-dwq2`](https://github.com/caporaso-lab/q2-dwq2). + +## Install q2-dwq2 + +All domain-specific functionality in QIIME 2 comes in the form of plugins. +Sometimes you'll install these directly, and sometimes you'll install QIIME 2 distributions which are bundles of plugins intended to be used together. +In this case, we're going to install one specific plugin. +Run the following command from your `using-qiime2` conda environment (i.e., after having run `conda activate using-qiime2`). + +```shell +pip install https://github.com/caporaso-lab/q2-dwq2/archive/refs/heads/main.zip +``` + +If you run `qiime info` again, you should now see a new plugin, `dwq2`, in the list of *Installed Plugins*. 
+
+```
+System versions
+Python version: 3.9.19
+QIIME 2 release: 2024.5
+QIIME 2 version: 2024.5.1
+q2cli version: 2024.5.0
+
+Installed plugins
+dwq2: 0+unknown
+types: 2024.5.0
+
+Application config directory
+/Users/jgc/miniconda3/envs/using-qiime2/var/q2cli
+
+Getting help
+To get help with QIIME 2, visit https://qiime2.org
+```
+
+## Exploring the available functionality
+
+To see the list of available plugins, along with some additional information, run:
+
+```shell
+qiime --help
+```
+
+To see what functionality, or {term}`Actions `, a plugin defines, call help on that plugin:
+
+```shell
+qiime dwq2 --help
+```
+
+To learn how to use a specific action, call help on it:
+
+```shell
+qiime dwq2 nw-align --help
+```
+
+Take a few minutes now to explore `q2-dwq2`.
+What is this plugin intended to do?
+What is some of the functionality that it provides?
diff --git a/book/tutorials/parallel-pipeline.md b/book/tutorials/parallel-pipeline.md
new file mode 100644
index 0000000..026c08e
--- /dev/null
+++ b/book/tutorials/parallel-pipeline.md
@@ -0,0 +1,123 @@
+(parallel-tutorial)=
+# Using parallel Pipeline execution
+
+QIIME 2 provides formal support for parallel computing of {term}`Pipelines ` through [Parsl](https://parsl.readthedocs.io/en/stable/1-parsl-introduction.html).[^formal-informal-parallel]
+This allows for faster execution of QIIME 2 `Pipelines`, assuming the compute resources are available, by ensuring that pipeline steps that can run simultaneously do run simultaneously.
+
+Parallel Pipeline execution is accessible in different ways depending on which interface you're using.
+Here we illustrate how to run `Pipelines` in parallel using {term}`q2cli` and {term}`QIIME 2's Python 3 API `.
+
+{{ tutorial_environment_block }}
+
+## q2cli
+
+Review the help text for a QIIME 2 `Pipeline`.
+Pay special attention to the usage examples at the bottom of the help text.
+
+```shell
+qiime dwq2 search-and-summarize --help
+```
+
+Have QIIME 2 generate example data that can be used to run the usage example.
+
+```shell
+qiime dwq2 search-and-summarize --example-data ss-usage
+```
+
+This will create a new directory for `search-and-summarize` usage example data.
+Change into that new directory by running:
+
+```shell
+cd ss-usage/Serial
+```
+
+Run the usage example serially first.
+Note that in the following commands the output filenames are adapted from the usage example to **prepend `serial-` to each file name**.
+
+```{note}
+The following command may take several minutes to run.
+On my Apple MacBook Pro (M3) it ran for approximately 6 minutes.
+```
+
+```shell
+qiime dwq2 search-and-summarize \
+  --i-query-seqs query-seqs.qza \
+  --i-reference-seqs reference-seqs.qza \
+  --m-reference-metadata-file reference-metadata.tsv \
+  --p-split-size 1 \
+  --o-hits serial-hits.qza \
+  --o-hits-table serial-hits-table.qzv
+```
+
+To re-run this `Pipeline` in parallel, append the `--parallel` flag.
+This will run this command in parallel using a default parallel configuration (learn more about this in [](parallel-configuration)).
+Note that the output filenames this time are adapted to **prepend `parallel-` to each file name**.
+
+```shell
+qiime dwq2 search-and-summarize \
+  --i-query-seqs query-seqs.qza \
+  --i-reference-seqs reference-seqs.qza \
+  --m-reference-metadata-file reference-metadata.tsv \
+  --p-split-size 1 \
+  --o-hits parallel-hits.qza \
+  --o-hits-table parallel-hits-table.qzv \
+  --parallel
+```
+
+If you're using a system with parallel computing capabilities (e.g., at least six cores), the parallel execution of this command should have run faster than the serial execution.
+
+## Python 3 API
+
+Parallel Pipeline execution through the Python API is managed with a `ParallelConfig` object used as a context manager.
+These objects take a `parsl.Config` object and an optional dictionary mapping action names to executor names as input.
+If no config is provided, your default configuration will be used (see [](qiime2-configuration-precedence)).
+
+```python
+from qiime2.sdk.parallel_config import ParallelConfig
+from qiime2.plugins.dwq2.pipelines import search_and_summarize
+from qiime2 import Artifact, Metadata
+
+query_seqs = Artifact.load('query-seqs.qza')
+reference_seqs = Artifact.load('reference-seqs.qza')
+reference_metadata = Metadata.load('reference-metadata.tsv')
+
+with ParallelConfig():
+    future = search_and_summarize.parallel(query_seqs=query_seqs,
+                                           reference_seqs=reference_seqs,
+                                           reference_metadata=reference_metadata,
+                                           split_size=1)
+    # call future._result() inside of the context manager
+    result = future._result()
+```
+
+To use a specific configuration, you can create it directly, or load one from file.
+For example:
+
+```python
+from qiime2.sdk.parallel_config import ParallelConfig, get_config_from_file
+from qiime2.plugins.dwq2.pipelines import search_and_summarize
+from qiime2 import Artifact, Metadata
+
+query_seqs = Artifact.load('query-seqs.qza')
+reference_seqs = Artifact.load('reference-seqs.qza')
+reference_metadata = Metadata.load('reference-metadata.tsv')
+
+path_to_config_file = '...'  # set this to the path of the file you'd like to load
+
+c, m = get_config_from_file(path_to_config_file)
+
+with ParallelConfig(parallel_config=c, action_executor_mapping=m):
+    future = search_and_summarize.parallel(query_seqs=query_seqs,
+                                           reference_seqs=reference_seqs,
+                                           reference_metadata=reference_metadata,
+                                           split_size=1)
+    # call future._result() inside of the context manager
+    result = future._result()
+```
+
+## Parsl configuration
+
+To learn how to configure Parsl for your own usage, refer to [](parallel-configuration).
+
+[^formal-informal-parallel]: QIIME 2 {term}`Actions ` can provide formal (i.e., Parsl-based) or informal (e.g., multi-threaded execution of a third-party program) parallel computing support.
+ To learn more about the distinction, see [](types-of-parallel-support). \ No newline at end of file