# Designing complex, reusable and scalable scientific workflows with Pydra

> Ghislain VAILLANT, Inria

## Prerequisites

- Python 3.8+
- Type annotations
- Data classes

### Type annotations

- Proposed in [PEP 484](https://peps.python.org/pep-0484/)
- Available since Python 3.5 through annotation syntax and the [typing](https://docs.python.org/3/library/typing.html) module
- Enhanced by subsequent Python releases

Standard function definition.

In [None]:
def scale(factor, vector):
    return [factor * x for x in vector]

Definition with type annotations.

In [None]:
from typing import List

# Type alias for convenience.
Vector = List[float]

def scale(factor: float, vector: Vector) -> Vector:
    return [factor * x for x in vector]
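Annotations are stored on the function and can be read back by tools, but Python itself does not enforce them at runtime. The sketch below illustrates both points with the standard `typing.get_type_hints` helper; the names `hints` and `result` are only for illustration.

In [None]:
```python
from typing import List, get_type_hints

Vector = List[float]

def scale(factor: float, vector: Vector) -> Vector:
    return [factor * x for x in vector]

# Annotations can be inspected programmatically.
hints = get_type_hints(scale)
print(hints["factor"], hints["return"])

# They are not checked at call time: integers are accepted silently.
result = scale(2, [1, 2, 3])
print(result)
```

Enforcing the declared types requires a separate static checker or an explicit runtime validation step.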

### Data classes

- Proposed in [PEP 557](https://peps.python.org/pep-0557/)
- Available since Python 3.7 in the [dataclasses](https://docs.python.org/3/library/dataclasses.html) module
- Extended by third-party libraries such as [attrs](https://www.attrs.org/)

Simple record definition.

In [24]:
import attrs

@attrs.define
class GeoPoint:
    lat: float
    lon: float

In [25]:
swansea = GeoPoint(51.62, -3.94)

print(swansea)

GeoPoint(lat=51.62, lon=-3.94)
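
For comparison, the same record can be sketched with only the standard-library `dataclasses` module. This is an illustrative equivalent, not the form used in the rest of this tutorial; the variable names `text` and `as_dict` are hypothetical.

In [None]:
```python
from dataclasses import asdict, dataclass

@dataclass
class GeoPoint:
    lat: float
    lon: float

swansea = GeoPoint(51.62, -3.94)

# The generated __repr__ lists the fields in declaration order.
text = repr(swansea)
print(text)

# asdict converts the record to a plain dictionary.
as_dict = asdict(swansea)
print(as_dict)
```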


Record with custom fields.

In [1]:
from attrs import define, field, validators

def validate_lat(instance, attribute, value):
    if abs(value) > 90:
        raise ValueError(
            f"Latitude must be in range (-90, 90), got {value}.")

def validate_lon(instance, attribute, value):
    if abs(value) > 180:
        raise ValueError(
            f"Longitude must be in range (-180, 180), got {value}.")

@define(kw_only=True)   # Keyword-only init: positional arguments are rejected.
class CustomGeoPoint:
    lat: float = field(
        validator=[validators.instance_of(float), validate_lat])

    lon: float = field(
        validator=[validators.instance_of(float), validate_lon])

    alt: float = field(
        default=0.0, metadata={"recorded_by": "$DEVICE"})

In [2]:
swansea = CustomGeoPoint(lat=51.62, lon=-3.94)  # Okay!

print(swansea)

CustomGeoPoint(lat=51.62, lon=-3.94, alt=0.0)


In [3]:
%xmode Minimal

Exception reporting mode: Minimal


In [5]:
swansea = CustomGeoPoint(151.62, -3.94)             # Oops!

TypeError: CustomGeoPoint.__init__() takes 1 positional argument but 3 were given

In [6]:
swansea = CustomGeoPoint(lat=151.62, lon=-3.94)     # Oops!

ValueError: Latitude must be in range (-90, 90), got 151.62.

## Core components

Tasks, workflows and specifications.

## Complex workflows

Heterogeneous tasks, map-reduce semantics and nested workflows.

## Advanced features

Execution strategies and workflow customization options.

## Support channels