# Setting up Airflow

### Introduction

In this lesson, we'll set up Airflow. Let's get started.

### Setting up With Docker

The easiest way for us to get started with Airflow is via Docker. We'll use the `puckel/docker-airflow` image, available [here](https://github.com/puckel/docker-airflow). Let's download the image by running the following:

> `docker pull puckel/docker-airflow`

From there, we can confirm that our image has been downloaded (for example, by listing local images with `docker images`).

<img src="./docker-airflow.png" width="80%">

The Airflow image contains a Flask application (the webserver), among other services. We can start this Flask application with the following:

`docker run -p 8080:8080 puckel/docker-airflow webserver`

From there, we can view Airflow by going to `localhost:8080` in a browser.

> <img src="./docker-web.png" width="60%">

Looking at the site, one of the main concepts appears to be the DAG. We'll get into DAGs later, but essentially a DAG (directed acyclic graph) is a workflow. It allows us to describe a series of steps that run in order, like extract, transform, and load.
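
To make the idea of a workflow concrete, here is a minimal plain-Python sketch that does not use Airflow at all. It uses the standard library's `graphlib` (Python 3.9+) to order three hypothetical steps, `extract`, `transform`, and `load`, so that each one comes after the step it depends on. The step names and the `dependencies` dictionary are assumptions for illustration only; Airflow has its own way of declaring dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key depends on the steps listed in its set.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields every step after all of the steps it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Because the three steps form a single chain, there is only one valid order to run them in.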



### Connecting to the Container

We can add a DAG by placing it in our Airflow container. Let's connect to our running Airflow container and see how we can do so.

<img src="./airflow-env.png" width="100%">

Above, we first list the running containers, and then we open a shell (`sh`) in our running Docker container. We are taken into the `/usr/local/airflow` folder, which contains only a handful of files.

```bash
ls
airflow.cfg airflow.db airflow-webserver.pid logs unittests.cfg
```

The `airflow.db` file is Airflow's database file (SQLite by default), and the `.cfg` files are configuration files.
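
Since `airflow.cfg` is an INI-style file, Python's standard `configparser` can read it. The sketch below parses a tiny stand-in for the file rather than the real thing. The excerpt is an assumption about what the `[core]` section might contain in this image, so check your own `airflow.cfg` for the actual values.

```python
from configparser import ConfigParser

# Assumed excerpt for illustration only; the real airflow.cfg is much longer.
SAMPLE_CFG = """
[core]
dags_folder = /usr/local/airflow/dags
"""

# interpolation=None keeps '%' characters in values from being treated specially.
parser = ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CFG)

dags_folder = parser.get("core", "dags_folder")
print(dags_folder)
```

To read the real file from inside the container, `parser.read('/usr/local/airflow/airflow.cfg')` would take the place of `read_string`.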

### Adding a DAG

Now let's add a DAG to Airflow. The code to create our first DAG is already in the `hello_dag.py` file in the `dags` folder of this reading.

This is what it looks like.

```python
# /dags/hello_dag.py

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator

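# The function our task will call
def hello():
    return 'Hello world!'

# The DAG (workflow), listed in the Airflow UI under its id, hello_world
hello_dag = DAG('hello_world', start_date=datetime(2021, 1, 1))

# A task that belongs to hello_dag and calls hello() when it runs
hello_task = PythonOperator(task_id='hello_task', python_callable=hello, dag=hello_dag)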
```

We'll get into the details of the code later, but for now, we can get this DAG up and running by using a bind mount to place this file into the `/usr/local/airflow/dags` folder of a running Airflow container.

Let's first stop our running Airflow container (for example, with `docker stop` followed by the container ID shown by `docker ps`).

Then we can run another container, this time bind-mounting the local `dags` folder into the container's `/usr/local/airflow/dags` folder. We do so with the following:

```bash 
docker run -p 8080:8080 -v "$(pwd)"/dags:/usr/local/airflow/dags puckel/docker-airflow webserver
```
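
The `-v` flag takes a `host_path:container_path` pair. As a rough sketch of how that command is put together, the Python below assembles the same arguments. The helper name `build_run_command` is hypothetical, and the function only builds the argument list; it does not start a container.

```python
from pathlib import Path

# Container-side DAGs folder used by the image in this lesson.
CONTAINER_DAGS_DIR = "/usr/local/airflow/dags"


def build_run_command(host_dags_dir):
    """Hypothetical helper: build the `docker run` arguments for a bind mount."""
    # Resolve to an absolute path, playing the role of "$(pwd)"/dags in the shell.
    host_path = Path(host_dags_dir).resolve()
    return [
        "docker", "run",
        "-p", "8080:8080",                          # host port : container port
        "-v", f"{host_path}:{CONTAINER_DAGS_DIR}",  # host folder : container folder
        "puckel/docker-airflow", "webserver",
    ]


command = build_run_command("dags")
print(" ".join(command))
```

If you did want to launch the container from Python, the list could be passed to `subprocess.run(command, check=True)`, though running the shell command directly is simpler.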

This time, when we open a shell in our container, we can see our `dags/hello_dag.py` file there.

<img src="./hello_dags.png" width="100%">

Now we should see this DAG pop up if we revisit our Airflow webserver at `localhost:8080`.

There it is.

> <img src="./hello_world.png" width="80%">

We can see that our `hello_world` DAG was loaded. If we click on the `hello_world` link, we can see that this DAG consists of our `hello_task`.

<img src="./dag-task.png" width="60%">

Now let's try to kick off this DAG. We can do so by going back to the main Airflow dashboard, flipping the switch on the left to the `on` state, and then clicking the play button on the right, which reads `trigger dag` when hovered over.

<img src="./trigger-dag.png" width="100%">

If we then click on the Last Run timestamp, we'll be taken to the following screen.  

> <img src="./last-run.png" width="40%">

The green border around `hello_task` tells us that the task ran successfully. We can see further evidence of this by clicking on the task, and then clicking on the View Log button.

> <img src="./view-log.png" width="60%">

After clicking the button, we can indeed see the log of the task being run.

> <img src="./log-task.png" width="100%">

Looking at the log above, we first see `Starting attempt` for the task. From there, the log reports that the task is being run. We then see the following line:

`Done. Returned value was: Hello world!`

Remember that this was the return value of the function associated with our task.

```python
def hello():
    return 'Hello world!'

hello_dag = DAG('hello_world', start_date=datetime(2021, 1, 1))

hello_task = PythonOperator(task_id='hello_task', python_callable=hello, dag=hello_dag)
```

So we were able to create a DAG containing `hello_task`, and `hello_task` then ran our `hello` function.
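
As a rough mental model of that log line, the task calls the function passed as `python_callable` and logs whatever it returns. The plain-Python stand-in below mimics this. The `run_python_task` function is hypothetical and is not Airflow's actual implementation.

```python
def hello():
    return 'Hello world!'


def run_python_task(python_callable):
    """Hypothetical stand-in: call the task's function and log its return value."""
    return_value = python_callable()
    log_line = f"Done. Returned value was: {return_value}"
    print(log_line)
    return log_line


log_line = run_python_task(hello)
```

Note that `hello` is passed without parentheses: the task receives the function itself and decides when to call it.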

We'll go into more details about the various components of getting this to work in the following lessons, but this is a good place to stop for now.

### Summary

In this lesson, we saw how we can get up and running with Airflow by using Docker. We booted up our Airflow container with the command:


`docker run -p 8080:8080 puckel/docker-airflow webserver`

We then created our first DAG by bind-mounting Python code into the container's `/usr/local/airflow/dags` folder with the following:

```bash
docker run -p 8080:8080 -v "$(pwd)"/dags:/usr/local/airflow/dags puckel/docker-airflow webserver
```

From there, we saw that our DAG was loaded into Airflow.

> <img src="./hello_world.png" width="80%">

From here, we can manually trigger the DAG by clicking on the play button. We'll describe why we need to do this in the next lesson.

### Resources

[Debugging Airflow](https://www.astronomer.io/blog/7-common-errors-to-check-when-debugging-airflow-dag)