# 6.2 Apache Airflow: A Powerful Tool for Workflow Orchestration
## 2.1 Introduction to Airflow
## 2.2 Key Concepts: DAGs, Tasks, Operators
## 2.3 Installation and Basic Configuration (Linux)

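Install Airflow with pip, initialize its metadata database, create an admin user for the web UI, and start the web server and the scheduler. Both `airflow webserver` and `airflow scheduler` are long-running processes, so run each in its own terminal: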

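```bash
# Airflow keeps its configuration, logs, and metadata database under AIRFLOW_HOME
export AIRFLOW_HOME=~/airflow
pip install apache-airflow
# Initialize the metadata database
airflow db init
# Create an admin account for the web UI (prompts for a password)
airflow users create --username admin --firstname Admin --lastname User --role Admin --email admin@example.com
# Start the web server, then start the scheduler in a second terminal
airflow webserver --port 8080
airflow scheduler
```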

## 2.4 Creating a Simple Pipeline with Airflow

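The following DAG defines a daily ETL pipeline with three tasks that run in sequence: extract, transform, and load. The task callables are imported from a local `utils` module: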

```python
from datetime import datetime, timedelta
from airflow import DAG
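# In Airflow 2.x, PythonOperator lives in airflow.operators.python
# (airflow.operators.python_operator is the deprecated 1.x path).
from airflow.operators.python import PythonOperator

# Project-specific callables, one per pipeline stage
from utils import extract, transform, load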

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2024, 1, 1),
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    'olist_etl_dag',
    default_args=default_args,
    description='ETL pipeline for Olist dataset',
    schedule_interval=timedelta(days=1),
)

with dag:
    # Tasks created inside the `with` block are attached to `dag` automatically.
    extract_task = PythonOperator(
        task_id='extract_data',
        python_callable=extract,
    )

    transform_task = PythonOperator(
        task_id='transform_data',
        python_callable=transform,
    )

    load_task = PythonOperator(
        task_id='load_data',
        python_callable=load,
    )

    # Run the tasks in order: extract, then transform, then load.
    extract_task >> transform_task >> load_task
```
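
The DAG imports `extract`, `transform`, and `load` from a `utils` module that is not shown here. As a rough illustration of what such a module might contain, the sketch below assumes each callable takes no arguments (matching how the DAG passes them to `PythonOperator`) and hands data to the next stage through files in a staging directory. The paths, column names, sample rows, and the SQLite target are all hypothetical placeholders, not part of the original pipeline.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical staging locations; a real pipeline would point at the Olist files.
STAGING_DIR = Path(tempfile.gettempdir()) / "olist_etl_sketch"
RAW_PATH = STAGING_DIR / "orders_raw.csv"
CLEAN_PATH = STAGING_DIR / "orders_clean.csv"
DB_PATH = STAGING_DIR / "warehouse.db"

# Made-up rows standing in for the source data (not taken from the real dataset).
SAMPLE_ROWS = [
    {"order_id": "a1", "status": " Delivered ", "price": "10.50"},
    {"order_id": "a2", "status": "canceled", "price": ""},
    {"order_id": "a3", "status": "SHIPPED", "price": "7.25"},
]


def extract():
    """Write the raw rows to the staging area."""
    STAGING_DIR.mkdir(parents=True, exist_ok=True)
    with RAW_PATH.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["order_id", "status", "price"])
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)


def transform():
    """Normalize status values and drop rows that have no price."""
    with RAW_PATH.open(newline="") as src, CLEAN_PATH.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if not row["price"]:
                continue
            row["status"] = row["status"].strip().lower()
            writer.writerow(row)


def load():
    """Replace the contents of the orders table with the cleaned rows."""
    with CLEAN_PATH.open(newline="") as f:
        rows = [
            (r["order_id"], r["status"], float(r["price"]))
            for r in csv.DictReader(f)
        ]
    conn = sqlite3.connect(DB_PATH)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS orders "
                "(order_id TEXT PRIMARY KEY, status TEXT, price REAL)"
            )
            conn.execute("DELETE FROM orders")
            conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    finally:
        conn.close()
```

Passing data through files rather than return values keeps each task independent, which suits Airflow's model where tasks may run in separate processes. Clearing the table before inserting makes `load` safe to re-run when a task is retried.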

## 2.5 Advantages and Use Cases