# Getting started

## Running your first task

The basic runnable component of Pydra is a *task*. Tasks are conceptually similar to
functions, in that they take inputs, process them and then return results. However,
unlike functions, tasks are parameterised before they are executed in a separate step.
This enables parameterised tasks to be linked together into workflows that are checked for
errors before they are executed, and modular execution workers and environments to be specified
independently of the task being performed.
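
The two-step pattern described above (parameterise first, execute later) can be sketched in plain Python. This is only a conceptual illustration: `ToyAdd` is a hypothetical class invented here, not part of Pydra, and it says nothing about how Pydra implements tasks internally.

```python
from dataclasses import dataclass


@dataclass
class ToyAdd:
    """Hypothetical stand-in for a task: holds parameters, does no work yet."""

    a: int
    b: int

    def __call__(self) -> int:
        # Execution happens here, in a separate step from parameterisation
        return self.a + self.b


# Step 1: parameterise the toy task (nothing is computed yet)
toy_task = ToyAdd(a=1, b=2)

# Step 2: execute it by calling the parameterised object
toy_result = toy_task()
print(toy_result)
```

Because the parameters are stored on the object before anything runs, they can be inspected or validated ahead of execution, which is the property the paragraph above relies on.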

Pre-defined task definitions are installed under the `pydra.tasks.*` namespace by separate
task packages (e.g. `pydra-fsl`, `pydra-ants`, ...). Pre-defined task definitions are run by

* importing the class from the `pydra.tasks.*` package it is in
* instantiating the class with the parameters of the task
* "calling" the resulting object to execute it, as you would a function (i.e. with `my_task(...)`)

To demonstrate with a toy example, we will load a JSON file with the `pydra.tasks.common.LoadJson` task. First, we create an example JSON file

In [6]:
from pathlib import Path
from tempfile import mkdtemp
import json

JSON_CONTENTS = {'a': True, 'b': 'two', 'c': 3, 'd': [7, 0.5598136790149003, 6]}

test_dir = Path(mkdtemp())
json_file = test_dir / "test.json"
with open(json_file, "w") as f:
    json.dump(JSON_CONTENTS, f)

Now we can load the JSON contents back from the file using the `LoadJson` task definition
class

In [7]:
# Import the task definition
from pydra.tasks.common import LoadJson

# Instantiate the task definition, providing the JSON file we want to load
load_json = LoadJson(file=json_file)

# Run the task to load the JSON file
result = load_json()

# Access the loaded JSON output contents and check they match original
assert result.output.out == JSON_CONTENTS

## Iterating over inputs

It is straightforward to apply the same operation over a set of inputs using the `split()`
method. For example, suppose we want to re-grid all the NIfTI images stored in a directory,
such as the sample ones generated by the code below

In [None]:
from fileformats.medimage import Nifti

nifti_dir = test_dir / "nifti"
nifti_dir.mkdir()

for i in range(10):
    Nifti.sample(nifti_dir, seed=i)

We can do this by importing the `MrGrid` shell-command task from the `pydra-mrtrix3` package
and then splitting over the list of files in the directory

In [None]:
from pydra.tasks.mrtrix3 import MrGrid

# Instantiate the task definition, "splitting" over all NIfTI files in the test directory
mrgrid = MrGrid(voxel=0.5).split(input=nifti_dir.iterdir())

# Run the task to resample all NIfTI files
result = mrgrid()

# Print the locations of the output files
print("\n".join(str(p) for p in result.output.output))

It is also possible to iterate over inputs in pairs. For example, if you wanted to use
different voxel sizes for different images, both the list of images and the list of voxel sizes
are passed to the `split()` method, and their combination is specified by a tuple "splitter"
(see [Splitting and combining](../explanation/splitting-combining.html) for more details
on splitters)

In [None]:
# Define a list of voxel sizes to resample the NIfTI files to, must be the same length
# as the number of NIfTI files
VOXEL_SIZES = [0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 1.0, 1.0, 1.0, 1.25]

mrgrid_varying_sizes = MrGrid().split(
    ("input", "voxel"),
    input=nifti_dir.iterdir(),
    voxel=VOXEL_SIZES
)

# Run the task to resample all NIfTI files with different voxel sizes
result = mrgrid_varying_sizes()
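
Conceptually, pairing inputs with a tuple splitter works element by element, much like Python's built-in `zip`. The sketch below uses plain Python only. The file names are made up for illustration, and `pair_inputs` is a hypothetical helper, not a Pydra function.

```python
def pair_inputs(images, voxel_sizes):
    """Pair each image with the voxel size at the same position."""
    images = list(images)
    voxel_sizes = list(voxel_sizes)
    # Element-wise pairing only makes sense for equal-length inputs
    if len(images) != len(voxel_sizes):
        raise ValueError(
            f"Got {len(images)} images but {len(voxel_sizes)} voxel sizes"
        )
    return list(zip(images, voxel_sizes))


# Made-up inputs, standing in for the NIfTI files and voxel sizes
example_images = ["img0.nii", "img1.nii", "img2.nii"]
example_sizes = [0.5, 0.75, 1.0]

pairs = pair_inputs(example_images, example_sizes)
for image, size in pairs:
    print(f"{image} -> voxel size {size}")
```

Because the pairing is positional, the order of the images matters. `Path.iterdir()` yields entries in arbitrary order, so sorting the file list first (e.g. with `sorted(...)`) may be worthwhile if specific images need specific voxel sizes.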

## Cache directories

When a task runs, a hash is generated from the combination of all the inputs to the task and the task to be run.
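
To illustrate the idea of a hash that depends on both the task and its inputs, here is a minimal sketch using only the standard library. It is not Pydra's actual hashing scheme: `toy_task_hash` is a hypothetical function, and the task name and file names passed to it are placeholders.

```python
import hashlib
import json


def toy_task_hash(task_name: str, inputs: dict) -> str:
    """Hypothetical hash combining a task's name with its inputs."""
    # Serialise deterministically so that key order does not affect the hash
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Identical task and inputs give the same hash
hash_a = toy_task_hash("LoadJson", {"file": "test.json"})
hash_b = toy_task_hash("LoadJson", {"file": "test.json"})

# Changing any input gives a different hash
hash_c = toy_task_hash("LoadJson", {"file": "other.json"})

print(hash_a == hash_b)
print(hash_a == hash_c)
```

The useful property is that the same task with the same inputs always maps to the same identifier, while any change to the task or its inputs produces a different one.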