# Getting started

The basic runnable component of Pydra is a *task*. Tasks are conceptually similar to
functions, in that they take inputs, operate on them and then return results. However,
unlike functions, tasks are parameterised before they are executed in a separate step.
This enables parameterised tasks to be linked together into workflows that are checked for
errors before they are executed, and modular execution workers and environments to be specified
independently of the task being performed.
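
The two-step pattern described above can be illustrated without Pydra at all. The following is a minimal plain-Python sketch, not Pydra's implementation: `AddTask` is a hypothetical stand-in for a task definition, used only to show parameterisation and execution as separate steps.

```python
from dataclasses import dataclass


@dataclass
class AddTask:
    """Hypothetical stand-in for a task definition; not part of Pydra."""

    a: int
    b: int

    def __call__(self) -> int:
        # Execution happens only here, in a step separate from parameterisation
        return self.a + self.b


# Step 1: parameterise. Nothing is computed yet, so the inputs can still be
# inspected, validated or connected to other tasks.
add = AddTask(a=1, b=2)
assert (add.a, add.b) == (1, 2)

# Step 2: execute the parameterised task
result = add()
print(result)  # 3
```

Because the parameterised object exists before anything runs, a framework has a chance to check it, or to decide where and how to run it, before execution starts.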

Tasks can encapsulate Python functions, shell commands or workflows constructed from
task components.

## Running your first task

Pre-defined task definitions are installed under the `pydra.tasks.*` namespace by separate
task packages (e.g. `pydra-fsl`, `pydra-ants`, ...). To use a pre-defined task definition:

* import the class from the `pydra.tasks.*` package it is in
* instantiate it with appropriate parameters
* "call" the resulting object (i.e. `my_task(...)`) to execute it as you would a function

To demonstrate, we will load a JSON file with the `pydra.tasks.common.LoadJson` task.
First, we create an example JSON file to test with:

In [1]:
from pathlib import Path
from tempfile import mkdtemp
import json
import nest_asyncio

# Allow running async code in Jupyter notebooks
nest_asyncio.apply()

JSON_CONTENTS = {'a': True, 'b': 'two', 'c': 3, 'd': [7, 0.55, 6]}

test_dir = Path(mkdtemp())
json_file = test_dir / "test.json"
with open(json_file, "w") as f:
    json.dump(JSON_CONTENTS, f)

Now we can load the JSON contents back from the file using the `LoadJson` task definition
class:

In [2]:
# Import the task definition
from pydra.tasks.common import LoadJson

# Instantiate the task definition, providing the JSON file we want to load
load_json = LoadJson(file=json_file)

# Run the task to load the JSON file
outputs = load_json()

# Access the loaded JSON output contents and check they match original
assert outputs.out == JSON_CONTENTS

If you want to access a richer `Result` object, you can use a `Submitter` object to execute the task:

In [3]:
from pydra.engine.submitter import Submitter

with Submitter() as submitter:
    result = submitter(load_json)

print(result)

The `Result` object contains:

* `output`: the outputs of the task (if there is only one output it is called `out` by default)
* `runtime`: information about the peak memory and CPU usage
* `errored`: the error status of the task
* `task`: the task object that generated the results
* `output_dir`: the output directory the results are stored in

## Iterating over inputs

It is straightforward to apply the same operation over a set of inputs using the `split()`
method. For example, suppose we want to re-grid all the NIfTI images stored in a directory,
such as the sample ones generated by the code below:

In [5]:
from fileformats.medimage import Nifti

nifti_dir = test_dir / "nifti"
nifti_dir.mkdir(exist_ok=True)  # Don't fail if the cell is re-run

for i in range(10):
    Nifti.sample(nifti_dir, seed=i)  # Create a dummy NIfTI file in the dest. directory

Then we can import the `MrGrid` shell-command task from the `pydra-mrtrix3` package
and run it over every NIfTI file in the directory using the `TaskDef.split()` method:

In [5]:
from pydra.tasks.mrtrix3.v3_0 import MrGrid

# Instantiate the task definition and "split" its "input" field over all
# NIfTI files in the test directory
mrgrid = MrGrid(voxel=(0.5, 0.5, 0.5)).split(input=nifti_dir.iterdir())

# Run the task to resample all NIfTI files
outputs = mrgrid()

# Print the locations of the output files
print("\n".join(str(p) for p in outputs.output))

It is also possible to iterate over inputs in pairs/n-tuples. For example, if you want to use
different voxel sizes for different images, pass both the list of images and the list of voxel
sizes to the `split()` method and specify how they are combined with a tuple "splitter".

Note that it is important to use a tuple not a list for the splitter definition in this
case, because a list splitter is interpreted as the split over each combination of inputs
(see [Splitting and combining](../explanation/splitting-combining.html) for more details
on splitters).

In [None]:
mrgrid_varying_vox_sizes = MrGrid().split(
    ("input", "voxel"),
    input=nifti_dir.iterdir(),
    # Define a list of voxel sizes to resample the NIfTI files to,
    # the list must be the same length as the list of NIfTI files
    voxel=[
        (1.0, 1.0, 1.0),
        (1.0, 1.0, 1.0),
        (1.0, 1.0, 1.0),
        (0.5, 0.5, 0.5),
        (0.75, 0.75, 0.75),
        (0.5, 0.5, 0.5),
        (0.5, 0.5, 0.5),
        (1.0, 1.0, 1.0),
        (1.25, 1.25, 1.25),
        (1.25, 1.25, 1.25),
    ],
)

print(mrgrid_varying_vox_sizes().output)
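
The difference between a tuple splitter and a list splitter can be pictured with plain Python. This is only an analogy under the description given above (a tuple pairs inputs by position, a list takes every combination), not Pydra's implementation. The file names are placeholders.

```python
from itertools import product

# Placeholder values standing in for the "input" and "voxel" fields
inputs = ["a.nii", "b.nii", "c.nii"]
voxels = [(1.0, 1.0, 1.0), (0.5, 0.5, 0.5), (0.75, 0.75, 0.75)]

# Tuple splitter ("input", "voxel"): elements are paired by position,
# so both lists must have the same length
paired = list(zip(inputs, voxels))

# List splitter ["input", "voxel"]: every combination of the two lists
combined = list(product(inputs, voxels))

print(len(paired))    # 3
print(len(combined))  # 9
```

Because a tuple splitter pairs by position, the order of the inputs matters. `Path.iterdir()` yields entries in arbitrary order, so if a particular voxel size is meant for a particular file, it may be worth passing `sorted(nifti_dir.iterdir())` instead.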

## Debugging failed tasks

Work in progress...