Airflow

Airflow is an open-source platform designed to programmatically author, schedule, and monitor workflows. It allows you to create and execute complex data pipelines and workflows that can involve multiple steps, dependencies, and sources of data.

At its core, Airflow provides a way to define a DAG (Directed Acyclic Graph) of tasks, which can be orchestrated and executed on a schedule or triggered manually. Each task in the DAG represents a specific operation or step in the workflow, and tasks can depend on one another, allowing you to create complex dependencies between tasks.
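To make the idea of tasks and dependencies concrete, here is a minimal sketch that uses only the Python standard library (`graphlib`), not Airflow itself. The task names `extract`, `transform`, and `load`, and the helper `run_pipeline`, are hypothetical and chosen only for illustration; a real Airflow DAG would be declared with Airflow's own classes.

```python
from graphlib import TopologicalSorter


# Hypothetical three-step pipeline; the task names are illustrative only.
def extract():
    return "extract done"


def transform():
    return "transform done"


def load():
    return "load done"


TASKS = {"extract": extract, "transform": transform, "load": load}

# Map each task to the set of tasks it depends on (its upstream tasks).
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_pipeline(tasks, dependencies):
    """Run every task once, in an order that respects the dependencies."""
    results = []
    for name in TopologicalSorter(dependencies).static_order():
        results.append((name, tasks[name]()))
    return results


if __name__ == "__main__":
    for name, outcome in run_pipeline(TASKS, DEPENDENCIES):
        print(f"{name}: {outcome}")
```

The sketch only orders and calls the tasks. Scheduling, retries, and monitoring are the parts that Airflow adds on top of this basic idea.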

Airflow Task Life Cycle

image

A happy workflow execution process

image
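The happy path can also be written down as a simple sequence of states. The sketch below assumes the state names used in the Airflow documentation (`none`, `scheduled`, `queued`, `running`, `success`); the helpers `next_state` and `walk` are hypothetical and are not part of Airflow.

```python
# Happy-path task states, assuming the names used in the Airflow documentation.
HAPPY_PATH = ["none", "scheduled", "queued", "running", "success"]


def next_state(state):
    """Return the state that follows `state` on the happy path, or None at the end."""
    index = HAPPY_PATH.index(state)
    if index + 1 < len(HAPPY_PATH):
        return HAPPY_PATH[index + 1]
    return None


def walk(start="none"):
    """Follow the happy path from `start` until the final state is reached."""
    path = [start]
    while (following := next_state(path[-1])) is not None:
        path.append(following)
    return path


if __name__ == "__main__":
    print(" -> ".join(walk()))
```

Real task instances can also end up in other states (for example when a task fails or is retried), which this sketch deliberately leaves out.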

Install Airflow on Linux

  1. Install Airflow using pip:
     pip install apache-airflow
  2. Initialize the Airflow database: After installing Airflow, you need to initialize its metadata database. (On Airflow 2.7 and later this command is deprecated in favor of airflow db migrate.)
     airflow db init
  3. Start the Airflow webserver and scheduler: Once the database is initialized, run each of the following commands in its own terminal, since both run in the foreground. These commands apply to Airflow 2.x; Airflow 3 replaces airflow webserver with airflow api-server.
     airflow webserver --port 8080
     airflow scheduler
With both processes running, the webserver listens on port 8080 and the scheduler picks up and runs your DAGs. You can now access the Airflow web UI by opening a web browser and navigating to http://localhost:8080.

What is a DAG?

A DAG (Directed Acyclic Graph) is a collection of tasks that you want to execute, organized in a way that reflects their dependencies and relationships. A DAG is a fundamental concept in Airflow, as it represents the workflow that you want to automate or orchestrate.

A DAG consists of nodes and edges: nodes represent the tasks that need to be executed, and edges represent the dependencies between tasks. Edges always point from upstream tasks to downstream tasks, indicating that a downstream task depends on the successful completion of its upstream tasks. Because the graph is acyclic, no task can depend on itself, either directly or through a chain of other tasks.
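The rule that a downstream task depends on the successful completion of its upstream tasks can be sketched in plain Python. This is a simulation under stated assumptions, not Airflow code: the function `run_with_dependencies` and its `failing` parameter are hypothetical, and the status label `upstream_failed` is borrowed from Airflow's task state names.

```python
from graphlib import TopologicalSorter


def run_with_dependencies(dependencies, failing=frozenset()):
    """Simulate a run: a task succeeds only if every upstream task succeeded.

    `dependencies` maps each task name to the set of its upstream tasks.
    `failing` names the tasks to treat as failing (hypothetical test input).
    """
    status = {}
    # static_order() yields upstream tasks before their downstream tasks.
    for task in TopologicalSorter(dependencies).static_order():
        upstream = dependencies.get(task, set())
        if any(status[name] != "success" for name in upstream):
            status[task] = "upstream_failed"  # never executed
        elif task in failing:
            status[task] = "failed"
        else:
            status[task] = "success"
    return status


if __name__ == "__main__":
    deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
    print(run_with_dependencies(deps, failing={"b"}))
```

In this example `d` is not run because one of its upstream tasks, `b`, failed, while `c` is unaffected because it depends only on `a`.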

An example of a DAG:

image

An example of a graph that is not a DAG:

image
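Whether a set of dependencies forms a DAG can be checked programmatically. The sketch below uses the standard library's `graphlib`, which raises `CycleError` when the graph contains a cycle; the helper name `is_dag` and the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(dependencies):
    """Return True if the dependency mapping contains no cycle.

    `dependencies` maps each task name to the set of its upstream tasks.
    """
    try:
        # prepare() raises CycleError if any cycle exists in the graph.
        TopologicalSorter(dependencies).prepare()
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    valid = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
    # "extract" waits for "load", which closes the loop and creates a cycle.
    cyclic = {"extract": {"load"}, "transform": {"extract"}, "load": {"transform"}}
    print(is_dag(valid), is_dag(cyclic))
```

A cycle means no task in the loop could ever start, because each one would be waiting for another task in the same loop to finish first.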