# Getting Started with Prefect

### Introduction

Prefect is a popular workflow orchestrator: it lets us coordinate the different services and steps in a workflow, and schedule when they run.

Let's dive in so we can begin to see how it works.

### Installing Prefect

The first thing to do is install Prefect. We can do so by installing the dependencies listed in `requirements.txt`.

`pip install -r requirements.txt`

Then take a look at `index.py`. There you will see the following:

```python
import requests
from prefect import flow, task

@task
def find_receipts(name):
    url = "https://data.texas.gov/resource/naix-2893.json"
    response = requests.get(url, params = {'taxpayer_name': name})
    return response.json()[:1]

@flow
def get_data(name):
    receipts = find_receipts(name)
    return receipts

name = 'HONDURAS MAYA CAFE & BAR LLC'
print(get_data(name))
```

As you can see, the file looks almost like normal Python code. At the bottom of the file, we call the `get_data` function. Because it is decorated with `@flow`, this call runs a **flow**, which in turn calls the `find_receipts` **task**.

A **flow** contains workflow logic, and a flow can have many **tasks**, where a task is a discrete unit of work in the workflow.
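One detail of the `find_receipts` task is worth seeing concretely: the `params` dictionary is URL-encoded and appended to the URL as a query string. The sketch below reproduces that encoding with only the standard library, so it runs without making a network call. The helper name `build_query_url` is made up for illustration, and the result should match what `requests` sends, though that is worth confirming by inspecting `response.url` in your own run.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.texas.gov/resource/naix-2893.json"


def build_query_url(base_url, params):
    """Append URL-encoded query parameters to base_url (hypothetical helper)."""
    return f"{base_url}?{urlencode(params)}"


# Spaces become '+' and '&' becomes '%26', so the name survives as one value.
url = build_query_url(BASE_URL, {"taxpayer_name": "HONDURAS MAYA CAFE & BAR LLC"})
print(url)
```

The trailing `[:1]` in `find_receipts` then keeps only the first matching record, still wrapped in a list.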

Now let's run this flow. We can do so just like we would run any other Python file.

```bash
python3 index.py
```

<img src="./flow-run.png" width="100%">

Unlike a plain Python script, our flows and tasks log a lot of information. Running the flow creates a **flow run**, named `hypnotic-jackal` above (a randomly generated name). That flow run has a **task run** of the `find-receipts` task.

> To review: a flow has many tasks. An individual run of a flow is called a **flow run**, which has many **task runs**.

Logging is an essential part of what Prefect offers. The log reports when the flow run begins and when each task run executes, as well as whether each one succeeded.
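To get a feel for what the decorators add, here is a toy sketch in plain Python of a decorator that logs when a run starts, records its final state, and keeps a history of runs. This is not how Prefect is implemented; the names `logged_run`, `toy_task`, `toy_flow`, and `run_history` are hypothetical and exist only in this example.

```python
import functools
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s | %(levelname)s | %(message)s")
logger = logging.getLogger("toy_orchestrator")

run_history = []  # one dict per run, appended when the run finishes


def logged_run(kind):
    """Return a decorator that logs and records each call as a `kind` run."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            started = datetime.now(timezone.utc)
            logger.info("%s run '%s' started", kind, fn.__name__)
            state = "Failed"  # assume failure until the function returns
            try:
                result = fn(*args, **kwargs)
                state = "Completed"
                return result
            finally:
                run_history.append({"kind": kind, "name": fn.__name__,
                                    "started": started, "state": state})
                logger.info("%s run '%s' finished in state %s",
                            kind, fn.__name__, state)
        return wrapper
    return decorator


toy_task = logged_run("task")
toy_flow = logged_run("flow")


@toy_task
def first_item(items):
    return items[:1]


@toy_flow
def pick_first(items):
    # Called like a normal function, but the call is logged and recorded.
    return first_item(items)


print(pick_first(["a", "b", "c"]))  # prints ['a']
```

The task run is recorded before the flow run because it finishes first, which mirrors the nesting of a task run inside a flow run.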

Even if we didn't see the log of the flow run in the console, we can still view it by starting the Prefect server.

### Viewing the prefect server

Now let's start the Prefect server by running the following on the command line.

```bash
prefect server start
```

Then open the dashboard URL printed in the terminal (by default `http://127.0.0.1:4200`) in a browser. If we click on `Flows` on the left, we can see our `get_data` flow.

<img src="./flows.png">

If we then click on the flow runs panel on the left, we can see each of our past flow runs.

<img src="./flow-runs.png">

If we click on that flow run (here `hypnotic-jackal`) and then click on logs, we can see the same logs for the flow run.

<img src="./flow-log.png" width="60%">

So how does Prefect keep track of all of this information? Prefect ships with a database (a local SQLite database by default) where it records this information, so we can see a history of how our flow runs performed.
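To make the idea of a stored run history concrete, here is a minimal sketch that saves and queries run records with Python's built-in `sqlite3` module. The `flow_run` table, its columns, and the rows are all invented for illustration. Prefect's real schema is different and is managed by Prefect itself, so treat this only as a picture of the concept.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flow_run (name TEXT, flow_name TEXT, state TEXT, start_time TEXT)"
)

# Hypothetical run records; the names, states, and timestamps are made up.
conn.executemany(
    "INSERT INTO flow_run VALUES (?, ?, ?, ?)",
    [
        ("example-run-1", "get-data", "Completed", "2024-01-01T09:00:00"),
        ("example-run-2", "get-data", "Failed", "2024-01-02T09:00:00"),
    ],
)

# Most recent runs first, similar to a flow runs listing.
rows = conn.execute(
    "SELECT name, state FROM flow_run WHERE flow_name = ? ORDER BY start_time DESC",
    ("get-data",),
).fetchall()

for name, state in rows:
    print(name, state)
```

Because each run is stored as a row, the history is still available after the script that produced it has exited.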

### Working with Prefect Cloud 

Soon we'll see how to schedule these workflows. When we do, it's best not to run them from our laptop (we'd have to keep it on and running Prefect the whole time), but instead to move our workflow to the cloud.

Let's do that now.

Begin by going to prefect.io, and then click on the login button on the top right -- or you can just click [here](https://app.prefect.cloud/auth/login).

From there, you can create a new account.

<img src="./create_account.png" width="70%">

Then create a workspace. Workspaces are used for organizing a collection of workflows (for example, we might have a workspace just for marketing tasks).

Click on `Create Workspace`.

<img src="./create_workspace.png" width="70%">

And then enter the corresponding information.

<img src="./tutorials.png" width="100%">

### Syncing with our computer

From there, move to a bash terminal in the directory where our codebase is, and type the following.

```bash
prefect cloud login
```

<img src="./logged-in.png" width="80%">

The command asks how we'd like to authenticate. We'll use an API key, so let's see where to find one.

In Prefect Cloud, open our profile by clicking on our avatar in the bottom left (mine is the weird green icon), and from there click on the API keys panel.

<img src="./api-keys.png">

Then create a new API key.

<img src="./api-key.png">

From there, we'll see our API key. Back in the terminal, press the down arrow to select `Paste an API key`, press return, then paste in the API key and press return again.

<img src="./logged-in.png" width="80%">

Now from our terminal (on our local computer), run the flow again.

```bash
python3 index.py
```

<img src="./run-flow.png" width="100%">

We'll see the flow run locally, but if we then go to Prefect Cloud and click on flows, we'll see our flow listed there, along with information about the flow run we just ran locally.

<img src="./cloud-flows.png" width="50%">

### Summary

In this lesson, we got started by installing Prefect on our computer:

`pip3 install prefect`

Then we created a Prefect flow, which is a workflow, and added our first task to it. Remember that a task is just a discrete unit of work.

```python
import requests
from prefect import flow, task

@task
def find_receipts(name):
    url = "https://data.texas.gov/resource/naix-2893.json"
    response = requests.get(url, params = {'taxpayer_name': name})
    return response.json()[:1]

@flow
def get_data(name):
    receipts = find_receipts(name)
    return receipts

name = 'HONDURAS MAYA CAFE & BAR LLC'
print(get_data(name))
```

Moving from the bottom to the top, we call our Prefect flow with `get_data(name)`. Then, in that flow, we call our task `find_receipts`, just like we would a normal function. The difference is that Prefect logs and provides information about these flow and task runs, which we saw by starting the local Prefect server.

`prefect server start`

We also saw the same run information in Prefect Cloud.

### Resources

[Prefect with Lambda and Snowflake](https://www.dataknowsall.com/prefectintro.html)