## Getting Started

If you've ever written code before, the problems caused by the code below will not be new to you, but they are worth making explicit.
The code mimics a simple data pipeline, which makes a call to an API service, augments the data, and then writes the results to our database.

The major difference from a real pipeline is that the API call we are making fails half of the time. That is hopefully far more often than your API calls fail in production, but it is useful for demonstration purposes.

In [1]:
import random
from prefect import flow, task

In [2]:

def call_unreliable_api():  # sourcery skip: raise-specific-error
    choices = [{"data": 42}, "failure"]
    res = random.choice(choices)
    if res == "failure":
        raise Exception("Our unreliable service failed")
    else:
        return res


def augment_data(data: dict, msg: str):
    data["message"] = msg
    return data


def write_results_to_database(data: dict):
    print(f"Wrote {data} to database successfully!")
    return "Success!"


def pipeline(msg: str):
    api_result = call_unreliable_api()
    augmented_data = augment_data(data=api_result, msg=msg)
    write_results_to_database(augmented_data)

In [3]:
for _ in range(10):
    pipeline(msg="Super Special Message")

Exception: Our unreliable service failed
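
The loop above almost never runs to completion. As a rough check, assuming each call succeeds independently with probability 0.5 (two equally likely outcomes from `random.choice`), the chance that all ten runs succeed can be worked out directly:

```python
# Probability that all 10 pipeline runs succeed when each call
# independently succeeds half of the time (an assumption based on
# random.choice picking between two equally likely outcomes).
p_success = 0.5
n_runs = 10

p_all_succeed = p_success ** n_runs
print(f"P(all {n_runs} runs succeed) = {p_all_succeed:.6f}")  # 0.000977
```

In other words, roughly one loop in a thousand would finish without raising.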

### Negative Engineering

This is obviously a trivial example, and as engineers, we know to expect these things and deal with them. But dealing with the ways code can fail is NOT what we set out to do. We set out to write a data pipeline.

The process of writing code that deals with failures, instead of writing code that performs the actions that we want to be done, is something that we at Prefect refer to as *Negative Engineering*.

Negative Engineering happens when engineers write defensive code to make sure the positive code actually runs. It must anticipate the almost limitless number of ways that code can fail, and is a massive time sink.
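
To make the idea concrete, here is a rough sketch of the defensive code one might write by hand around the unreliable call. The `call_with_retries` helper and its parameters are hypothetical, introduced only for illustration; they are not part of Prefect.

```python
import random
import time


def call_unreliable_api():
    # Same behavior as the function above: fails about half of the time.
    res = random.choice([{"data": 42}, "failure"])
    if res == "failure":
        raise Exception("Our unreliable service failed")
    return res


def call_with_retries(fn, max_attempts=5, delay_seconds=0.0):
    """Hypothetical helper: call fn, retrying on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            print(f"Attempt {attempt} failed: {exc}")
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            time.sleep(delay_seconds)


random.seed(0)  # fixed seed so the sketch is repeatable
api_result = call_with_retries(call_unreliable_api, max_attempts=20)
print(api_result)
```

Nearly every line of the helper deals with failure rather than with fetching data, and it still ignores logging, backoff, and state tracking. That overhead is the time sink described above.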

Prefect aims to eliminate as much negative engineering as possible for you.


### Using a Prefect Flow

It's easier to show than it is to tell, so let's run this next block and then we'll explain what is happening.

#### Creating a flow

To create a flow, we simply import `flow` from `prefect` and add it as a decorator to our pipeline function. The modifications are shown below; any line that has changed is tagged with the comment `# NEW ****`.

In [4]:
def call_unreliable_api():  # sourcery skip: raise-specific-error
    choices = [{"data": 42}, "failure"]
    res = random.choice(choices)
    if res == "failure":
        raise Exception("Our unreliable service failed")
    else:
        return res


def augment_data(data: dict, msg: str):
    data["message"] = msg
    return data


def write_results_to_database(data: dict):
    print(f"Wrote {data} to database successfully!")
    return "Success!"

@flow   # NEW ****
def pipeline(msg: str):
    api_result = call_unreliable_api()
    augmented_data = augment_data(data=api_result, msg=msg)
    write_results_to_database(augmented_data)

In [5]:
pipeline("Trying a flow!")

12:52:06.320 | INFO    | prefect.engine - Created flow run 'poised-agouti' for flow 'pipeline'
12:52:06.396 | INFO    | Flow run 'poised-agouti' - Finished in state Completed()


Wrote {'data': 42, 'message': 'Trying a flow!'} to database successfully!


In [6]:
!ls ~/.prefect

auth.toml    config.toml  orion.db     orion.db-wal storage
backend.toml flows        orion.db-shm results


#### Making our flows better with tasks

Flows are only the first step in orchestrating our data pipelines. The next step is adding Prefect tasks.

A task can be thought of as a discrete unit of work. In practice, you'll often simply convert the functions that make up your flow into tasks.

Like flows, tasks are created by adding a decorator. We'll demonstrate below.

In [7]:
@task   # NEW ****
def call_unreliable_api():
    choices = [{"data": 42}, "failure"]
    res = random.choice(choices)
    if res == "failure":
        raise Exception("Our unreliable service failed")
    else:
        return res

@task   # NEW ****
def augment_data(data: dict, msg: str):
    data["message"] = msg
    return data

@task   # NEW ****
def write_results_to_database(data: dict):
    print(f"Wrote {data} to database successfully!")
    return "Success!"

@flow 
def pipeline(msg: str):
    api_result = call_unreliable_api()
    augmented_data = augment_data(data=api_result, msg=msg)
    write_results_to_database(augmented_data)



In [8]:
for i in range(5):
    print(f"Run #{i}")
    pipeline("Trying a flow with tasks!")

Run #0


12:52:31.125 | INFO    | prefect.engine - Created flow run 'cerulean-finch' for flow 'pipeline'
12:52:31.197 | INFO    | Flow run 'cerulean-finch' - Created task run 'call_unreliable_api-466f2784-0' for task 'call_unreliable_api'
12:52:31.198 | INFO    | Flow run 'cerulean-finch' - Executing 'call_unreliable_api-466f2784-0' immediately...
12:52:31.224 | ERROR   | Task run 'call_unreliable_api-466f2784-0' - Encountered exception during execution:
Traceback (most recent call last):
  File "/Users/mjboothaus/code/github/mjboothaus/try-prefect2/.venv_dev_try-prefect2/lib/python3.9/site-packages/prefect/engine.py", line 1053, in orchestrate_task_run
    result = await run_sync(task.fn, *args, **kwargs)
  File "/Users/mjboothaus/code/github/mjboothaus/try-prefect2/.venv_dev_try-prefect2/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 56, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/Users/mjboothaus/code/github/mjbo

Exception: Our unreliable service failed

#### Adding retries

The next feature that we will demo is the ability to retry a task. We know that tasks will inevitably fail. Sometimes handling a failure requires complex logic, but other times we simply need to try again after a brief delay. We can do this with the `retries` and `retry_delay_seconds` parameters.

This will be helpful for our unreliable API call.

In [9]:
@task(name="Get data from API", retries=4, retry_delay_seconds=3)   # NEW ****
def call_unreliable_api():
    choices = [{"data": 42}, "failure"]
    res = random.choice(choices)
    if res == "failure":
        raise Exception("Our unreliable service failed")
    else:
        return res

@task
def augment_data(data: dict, msg: str):
    data["message"] = msg
    return data

@task
def write_results_to_database(data: dict):
    print(f"Wrote {data} to database successfully!")
    return "Success!"

@flow 
def pipeline(msg: str):
    api_result = call_unreliable_api()
    augmented_data = augment_data(data=api_result, msg=msg)
    write_results_to_database(augmented_data)



In [10]:
pipeline("Trying a flow with tasks and retries!")

12:52:49.260 | INFO    | prefect.engine - Created flow run 'ruby-coucal' for flow 'pipeline'
12:52:49.331 | INFO    | Flow run 'ruby-coucal' - Created task run 'Get data from API-466f2784-0' for task 'Get data from API'
12:52:49.331 | INFO    | Flow run 'ruby-coucal' - Executing 'Get data from API-466f2784-0' immediately...
12:52:49.345 | ERROR   | Task run 'Get data from API-466f2784-0' - Encountered exception during execution:
Traceback (most recent call last):
  File "/Users/mjboothaus/code/github/mjboothaus/try-prefect2/.venv_dev_try-prefect2/lib/python3.9/site-packages/prefect/engine.py", line 1053, in orchestrate_task_run
    result = await run_sync(task.fn, *args, **kwargs)
  File "/Users/mjboothaus/code/github/mjboothaus/try-prefect2/.venv_dev_try-prefect2/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 56, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/Users/mjboothaus/code/github/mjboothaus/try-prefec

Wrote {'data': 42, 'message': 'Trying a flow with tasks and retries!'} to database successfully!


[Completed(message=None, type=COMPLETED, result={'data': 42}),
 Completed(message=None, type=COMPLETED, result={'data': 42, 'message': 'Trying a flow with tasks and retries!'}),
 Completed(message=None, type=COMPLETED, result='Success!')]
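
As a back-of-the-envelope check on the retry settings used above, the sketch below estimates how likely the task is to fail even with retries, and how long it could spend waiting. It assumes that `retries=4` allows up to four additional attempts after the first (five in total) and that each attempt fails independently half of the time.

```python
# Assumptions: retries=4 allows 1 initial attempt + 4 retries, and each
# attempt fails independently with probability 0.5.
p_fail = 0.5
retries = 4
retry_delay_seconds = 3

total_attempts = retries + 1
p_task_fails = p_fail ** total_attempts
max_wait_seconds = retries * retry_delay_seconds

print(f"P(task fails after all attempts) = {p_task_fails:.5f}")  # 0.03125
print(f"Worst-case time spent in retry delays = {max_wait_seconds}s")  # 12s
```

Under these assumptions the task would still fail about 3% of the time, so retries reduce the failure rate sharply but do not eliminate it.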