7 changes: 4 additions & 3 deletions docs/source/index.rst
@@ -27,11 +27,12 @@ Installation

Pydra is implemented purely in Python and has a small number of dependencies.
It is easy to install via pip for Python >= 3.11 (preferably within a
-`virtual environment`_):
+`virtual environment`_). To get the latest version you will need to explicitly specify
+a version greater than or equal to 1.0a, otherwise PyPI will install the last 0.* version:

.. code-block:: bash

-$ pip install pydra
+$ pip install "pydra>=1.0a"

Pre-designed tasks are available under the `pydra.tasks.*` namespace. These tasks
are typically implemented within separate packages that are specific to a given
@@ -41,7 +42,7 @@ ANTs_ (*pydra-ants*), or a collection of related tasks/workflows, such as Niwork

.. code-block:: bash

-$ pip install pydra-fsl pydra-ants
+$ pip install pydra-tasks-fsl pydra-tasks-ants

Of course, if you use Pydra to execute commands within non-Python toolkits, you will
need to either have those commands installed on the execution machine, or use containers
68 changes: 38 additions & 30 deletions docs/source/reference/glossary.rst
@@ -4,62 +4,69 @@ Glossary
.. glossary::

Cache-root
-The directory where cache directories for tasks to be executed are created.
-Task cache directories are named within the cache root directory using a hash
-of the task's parameters, so that the same task with the same parameters can be
-reused.
+The root directory in which separate cache directories for each job are created.
+Job cache directories are named within the cache-root directory using a unique
+checksum for the job based on the task's parameters and software environment,
+so that if the same job is run again the outputs from the previous run can be
+reused.
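
To illustrate the naming scheme (a simplified sketch, not pydra's actual
checksum implementation), a job's cache directory can be derived by hashing
its parameters:

.. code-block:: python

    import hashlib
    import json
    from pathlib import Path

    def job_cache_dir(cache_root: Path, task_name: str, params: dict) -> Path:
        # Hash the task name and parameters into a stable checksum; pydra
        # also folds in the software environment, omitted here for brevity.
        payload = json.dumps({"task": task_name, "params": params}, sort_keys=True)
        checksum = hashlib.sha256(payload.encode()).hexdigest()[:16]
        return cache_root / f"{task_name}-{checksum}"

    # The same task with the same parameters maps to the same directory,
    # so outputs from a previous run can be detected and reused.
    print(job_cache_dir(Path("/tmp/cache"), "add", {"a": 1, "b": 2}))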

Combiner
A combiner is used to combine :ref:`State-array` values created by a split operation
defined by a :ref:`Splitter` on the current node, upstream workflow nodes or
stand-alone tasks.
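
As a rough plain-Python illustration (not pydra's API), a splitter fans one
list-valued input out into a state array of jobs, and a combiner gathers the
per-job outputs back together:

.. code-block:: python

    # Hypothetical stand-in for a task: squares its input.
    def task(x: int) -> int:
        return x ** 2

    # Split: one job per element of the input list.
    state_array = [task(x) for x in [1, 2, 3]]

    # Combine: collect the per-job outputs into a single list
    # for downstream nodes.
    print(state_array)  # [1, 4, 9]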

Container-ndim
-The number of dimensions of the container object to be iterated over when using
-a :ref:`Splitter` to split over an iterable value. For example, a list-of-lists
-or a 2D array with `container_ndim=2` would be split over the elements of the
-inner lists into a single 1-D state array. However, if `container_ndim=1`,
-the outer list/2D would be split into a 1-D state array of lists/1D arrays.
+The number of dimensions of the container object to be flattened into a single
+state array when splitting over nested containers/multi-dimensional arrays.
+For example, with `container_ndim=1`, a list-of-lists-of-floats or a 2D numpy
+array would be split into a 1-D state array consisting of lists-of-floats or
+1D numpy arrays, respectively, whereas with `container_ndim=2` they would be
+split into a state array of floats consisting of all the elements of the
+inner lists/arrays.
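
A plain-Python sketch of the distinction (illustrative only):

.. code-block:: python

    nested = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

    # container_ndim=1: only the outer container is split, so each
    # state-array element is an inner list of floats.
    state_ndim1 = list(nested)
    print(state_ndim1)  # [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

    # container_ndim=2: both levels are flattened, so each state-array
    # element is a single float.
    state_ndim2 = [x for inner in nested for x in inner]
    print(state_ndim2)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]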

Environment
An environment refers to a specific software encapsulation, such as a Docker
-or Singularity image, that is used to run a task.
+or Singularity image, in which shell tasks are run. They are specified in the
+Submitter object to be used when executing a task.

Field
-A field is a parameter of a task, or a task outputs object, that can be set to
-a specific value. Fields are specified to be of any types, including objects
-and file-system objects.
+A field is a parameter of a task, or an output in a task outputs class.
+Fields define the expected datatype of the parameter and other metadata
+that control how the field is validated and passed through to the
+execution of the task.
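
Schematically (using plain dataclasses rather than pydra's field machinery):

.. code-block:: python

    from dataclasses import dataclass, field

    @dataclass
    class AddTask:
        # Each attribute plays the role of a field: a name, an expected
        # datatype, and metadata that control validation and help text.
        a: int = field(metadata={"help": "first operand"})
        b: int = field(default=0, metadata={"help": "second operand"})

    task = AddTask(a=1)  # `b` falls back to its default of 0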

Hook
-A hook is a user-defined function that is executed at a specific point in the task
-execution process. Hooks can be used to prepare/finalise the task cache directory
+A hook is a user-defined function that is executed at a specific point either before
+or after a task is run. Hooks can be used to prepare/finalise the task cache directory
or send notifications.
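
A minimal sketch of the idea (generic Python; pydra's actual hook signature
is not shown here):

.. code-block:: python

    from typing import Callable

    pre_run_hooks: list[Callable[[str], None]] = []
    post_run_hooks: list[Callable[[str], None]] = []

    def notify(cache_dir: str) -> None:
        print(f"task finished, outputs in {cache_dir}")

    post_run_hooks.append(notify)

    def run_task(cache_dir: str) -> None:
        for hook in pre_run_hooks:
            hook(cache_dir)  # called just before the task runs
        ...                  # execute the task itself
        for hook in post_run_hooks:
            hook(cache_dir)  # called just after the task completes

    run_task("/tmp/cache/add-1a2b")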

Job
-A job is a discrete unit of work, a :ref:`Task`, with all inputs resolved
-(i.e. not lazy-values or state-arrays) that has been assigned to a worker.
-A task describes "what" is to be done and a submitter object describes
-"how" it is to be done, a job combines both objects to describe a concrete unit
-of processing.
+A job consists of a :ref:`Task` with all inputs resolved
+(i.e. not lazy-values or state-arrays) and a Submitter object. It therefore
+represents a concrete unit of work to be executed, combining "what" is to be
+done (Task) with "how" it is to be done (Submitter).
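
Conceptually (an illustrative sketch, not pydra's actual classes):

.. code-block:: python

    from dataclasses import dataclass

    @dataclass
    class Task:
        """What is to be done: a fully resolved parameterisation."""
        name: str
        inputs: dict

    @dataclass
    class Submitter:
        """How it is to be done: worker type, cache locations, etc."""
        worker: str = "debug"

    @dataclass
    class Job:
        """A concrete unit of work: one task plus one submitter."""
        task: Task
        submitter: Submitter

    job = Job(Task("add", {"a": 1, "b": 2}), Submitter())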

Lazy-fields
A lazy-field is a field that is not immediately resolved to a value. Instead,
-it is a placeholder that will be resolved at runtime, allowing for dynamic
-parameterisation of tasks.
+it is a placeholder that will be resolved at runtime when a workflow is executed,
+allowing for dynamic parameterisation of tasks.
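
A toy sketch of the placeholder idea (not pydra's implementation):

.. code-block:: python

    class LazyField:
        """Placeholder for a value produced by an upstream node."""

        def __init__(self, node: str, field: str):
            self.node, self.field = node, field

        def resolve(self, results: dict) -> object:
            # Looked up only at runtime, once the upstream node has run.
            return results[self.node][self.field]

    out = LazyField("add", "sum")            # wired when the workflow is built
    print(out.resolve({"add": {"sum": 3}}))  # resolved when it executes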

Node
-A single task within the context of a workflow, which is assigned a name and
-references a state. Note this task can be nested workflow task.
+A single task within the context of a workflow. It is assigned a unique name
+within the workflow and references a state object that, if present, determines
+the state-array of jobs to be run (if the state is None then a single job
+will be run for the node).

Read-only-caches
A read-only cache is a cache root directory that was created by a previous
-pydra runs, which is checked for matching task caches to be reused if present
-but not written not modified during the execution of a task.
+pydra run. The read-only caches are checked for matching job checksums, which
+are reused if present. However, new job cache dirs are written to the cache root,
+so the read-only caches are not modified during the execution.
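
The lookup order, as an illustrative sketch (not pydra's code):

.. code-block:: python

    from pathlib import Path

    def find_or_create(checksum: str, cache_root: Path,
                       readonly_caches: list[Path]) -> Path:
        # First look for a finished run in any read-only cache...
        for ro_cache in readonly_caches:
            candidate = ro_cache / checksum
            if candidate.exists():
                return candidate  # reuse, but never write here
        # ...otherwise create a fresh job dir in the writable cache root.
        job_dir = cache_root / checksum
        job_dir.mkdir(parents=True, exist_ok=True)
        return job_dir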

State
The combination of all upstream splits and combines with any splitters and
-combiners for a given node, it is used to track how many jobs, and their
-parameterisations, need to be run for a given workflow node.
+combiners for a given node. It is used to track how many jobs, and their
+parameterisations, need to be run for a given workflow node.

State-array
A state array is a collection of parameterised tasks or values that were generated
@@ -84,8 +84,9 @@

Worker
Encapsulation of a task execution environment. It is responsible for executing
-tasks and managing their lifecycle. Workers can be local (e.g., a thread or
-process) or remote (e.g., high-performance cluster).
+tasks and managing their lifecycle. Workers can be local (e.g., debug and
+concurrent-futures multiprocess) or orchestrated through a remote scheduler
+(e.g., SLURM, SGE).
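
For intuition, a local multiprocess worker behaves much like a process pool
(plain Python, not pydra's Worker class):

.. code-block:: python

    from concurrent.futures import ProcessPoolExecutor

    def run_job(x: int) -> int:
        return x ** 2  # stand-in for executing one pydra job

    if __name__ == "__main__":
        # The executor plays the worker's role: it accepts jobs and
        # manages their execution lifecycle across processes.
        with ProcessPoolExecutor(max_workers=4) as pool:
            results = list(pool.map(run_job, [1, 2, 3, 4]))
        print(results)  # [1, 4, 9, 16]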

Workflow
A Directed-Acyclic-Graph (DAG) of parameterised tasks, to be executed in order.
162 changes: 0 additions & 162 deletions empty-docs/conf.py

This file was deleted.

5 changes: 0 additions & 5 deletions empty-docs/index.rst

This file was deleted.

1 change: 0 additions & 1 deletion empty-docs/requirements.txt

This file was deleted.

12 changes: 7 additions & 5 deletions pydra/compose/base/task.py
@@ -196,11 +196,13 @@ def __call__(
readonly_caches : list[os.PathLike], optional
Alternate cache locations to check for pre-computed results, by default None
audit_flags : AuditFlag, optional
-Auditing configuration, by default AuditFlag.NONE
-messengers : list, optional
-Messengers, by default None
-messenger_args : dict, optional
-Messenger arguments, by default None
+Configure provenance tracking. Available flags: :class:`~pydra.utils.messenger.AuditFlag`.
+Default is no provenance tracking.
+messengers : :class:`Messenger` or :obj:`list` of :class:`Messenger` or None
+Messenger(s) used by Audit. Saved in the `audit` attribute.
+See the available messengers at :class:`~pydra.utils.messenger.Messenger`.
+messenger_args : dict[str, Any], optional
+Argument(s) used by `messengers`. Saved in the `audit` attribute.
**kwargs : dict
Keyword arguments to pass on to the worker initialisation

12 changes: 7 additions & 5 deletions pydra/engine/submitter.py
@@ -64,11 +64,13 @@ class Submitter:
max_concurrent: int | float, optional
Maximum number of concurrent tasks to run, by default float("inf") (unlimited)
audit_flags : AuditFlag, optional
-Auditing configuration, by default AuditFlag.NONE
-messengers : list, optional
-Messengers, by default None
-messenger_args : dict, optional
-Messenger arguments, by default None
+Configure provenance tracking. Available flags: :class:`~pydra.utils.messenger.AuditFlag`.
+Default is no provenance tracking.
+messengers : :class:`Messenger` or :obj:`list` of :class:`Messenger` or None
+Messenger(s) used by Audit. Saved in the `audit` attribute.
+See the available messengers at :class:`~pydra.utils.messenger.Messenger`.
+messenger_args : dict[str, Any], optional
+Argument(s) used by `messengers`. Saved in the `audit` attribute.
clean_stale_locks : bool, optional
Whether to clean stale lock files, i.e. lock files that were created before the
start of the current run. Don't set if using a global cache where there are
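
A hedged usage sketch of the options documented above (parameter names are
taken from this docstring; `AuditFlag` and `PrintMessenger` are assumed to
live in `pydra.utils.messenger`):

.. code-block:: python

    from pydra.engine.submitter import Submitter
    from pydra.utils.messenger import AuditFlag, PrintMessenger

    # Enable full provenance tracking, print audit messages to stdout,
    # and clean up lock files left over from interrupted runs.
    submitter = Submitter(
        audit_flags=AuditFlag.ALL,
        messengers=PrintMessenger(),
        clean_stale_locks=True,
    )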