momaee/airflow

Airflow

This is a simple repo to work with Apache Airflow.

Install Airflow

export AIRFLOW_HOME=~/airflow

AIRFLOW_VERSION=2.9.0

PYTHON_VERSION="$(python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')"


CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"

Note that your system may provide python3 instead of python. In that case, replace python with python3 in the PYTHON_VERSION command above (and, if needed, pip with pip3).
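As a minimal sketch of the note above, the interpreter name can be detected instead of edited by hand. PYTHON_BIN is an illustrative variable name, not something Airflow defines, and the version string is built with %-formatting so the check also works on older interpreters.

```shell
# Pick whichever interpreter is on PATH, preferring python3.
# PYTHON_BIN is a hypothetical helper variable used only in this sketch.
if command -v python3 >/dev/null 2>&1; then
  PYTHON_BIN=python3
elif command -v python >/dev/null 2>&1; then
  PYTHON_BIN=python
else
  echo "No Python interpreter found on PATH" >&2
  PYTHON_BIN=""
fi

# Compute the major.minor version used in the constraints URL.
if [ -n "$PYTHON_BIN" ]; then
  PYTHON_VERSION="$("$PYTHON_BIN" -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  echo "Using $PYTHON_BIN (Python $PYTHON_VERSION)"
fi
```

With PYTHON_BIN set, running `"$PYTHON_BIN" -m pip install ...` instead of plain `pip install ...` keeps pip tied to the same interpreter.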

Check Installation

airflow version

airflow info

Check db connection

airflow db check

Add dags folder to ~/airflow/airflow.cfg

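First, check the current airflow.cfg file. If it is empty, add the following lines to it. If it already has a [core] section, change the dags_folder value in that section instead of adding a second one.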

[core]
dags_folder = /path/to/this/repo/dags

Replace /path/to/this/repo with the absolute path to this repo. To get it, run the following command from the root of the repo.

pwd
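The path substitution can also be scripted. The sketch below assumes it is run from the root of this repo and that the DAGs live in ./dags; DAGS_DIR is an illustrative name. It prints the line to paste under [core], and shows the alternative of setting the value through an environment variable, which Airflow reads in the form AIRFLOW__<SECTION>__<KEY> and which takes precedence over airflow.cfg.

```shell
# Run from the root of this repo; assumes the DAGs are in ./dags.
# DAGS_DIR is a hypothetical helper variable used only in this sketch.
DAGS_DIR="$(pwd)/dags"

# Print the exact line to paste under [core] in airflow.cfg.
printf 'dags_folder = %s\n' "$DAGS_DIR"

# Alternative: set the same option for the current shell session only.
export AIRFLOW__CORE__DAGS_FOLDER="$DAGS_DIR"
```

The environment variable only lasts for the current shell session, so the airflow.cfg entry is the better choice for a permanent setup.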

Start Airflow

airflow standalone

You should see the credentials for the default user in the command output. You can change the password for the default user by editing the standalone_admin_password.txt file.

Access Airflow

Open a browser and go to http://localhost:8080/

You should see the Airflow UI with all the example dags and your dags, too!

Hide Example Dags

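To hide the example dags, set the following option in the [core] section of the airflow.cfg file. If the file already has a [core] section, put the option there rather than adding a second section.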

[core]
load_examples = False

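Then you need to reset the database. Note that this deletes all data in the Airflow metadata database, including DAG run history, connections, and variables.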

airflow db reset

Kubernetes DAG

For running the Kubernetes DAG locally, you need to install kubectl and minikube.

Kubectl

You can use this link to install kubectl.

Minikube

You can use this link to install minikube.

Start minikube

After installing kubectl and minikube, you can start the minikube cluster by running the following command.

minikube start

To confirm that the minikube cluster is running, run the following command.

kubectl get nodes

or

kubectl cluster-info

Install Airflow Provider for Kubernetes

Now you need to install the Airflow provider for Kubernetes.

pip install apache-airflow-providers-cncf-kubernetes

For more information, you can check this link.

Run the Kubernetes DAG

Now you can run the Kubernetes DAG by going to the Airflow UI and turning on the kubernetes DAG. Every time you run the DAG, it creates a new pod in the minikube cluster and runs the task in that pod. The pod is deleted after the task is done. To watch the pods being created and removed, you can use the following command.

watch kubectl get pods -A 
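The command above lists pods and their status; it does not print what the task writes. To read the task output itself, the logs can be streamed with kubectl logs while the pod is still running, since the pod is deleted once the task is done. A minimal sketch follows; pod_logs is a hypothetical helper name, and the pod name and namespace are placeholders to be taken from the kubectl get pods -A output.

```shell
# Follow the log stream of one pod until it exits.
# Arguments: pod name, then namespace (defaults to "default").
# pod_logs is a hypothetical helper defined only for this sketch.
pod_logs() {
  kubectl logs --follow --namespace "${2:-default}" "$1"
}

# Example with a placeholder pod name:
# pod_logs my-task-pod default
```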
