Update to Airflow 2.0.0 #618

Open · wants to merge 8 commits into master
4 changes: 2 additions & 2 deletions Dockerfile
@@ -4,15 +4,15 @@
# BUILD: docker build --rm -t puckel/docker-airflow .
# SOURCE: https://github.com/puckel/docker-airflow

-FROM python:3.7-slim-buster
+FROM python:3.8-slim-buster
LABEL maintainer="Puckel_"

# Never prompt the user for choices on installation/configuration of packages
ENV DEBIAN_FRONTEND noninteractive
ENV TERM linux

# Airflow
-ARG AIRFLOW_VERSION=1.10.9
+ARG AIRFLOW_VERSION=2.0.0
ARG AIRFLOW_USER_HOME=/usr/local/airflow
ARG AIRFLOW_DEPS=""
ARG PYTHON_DEPS=""
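Note that Airflow 2.0.0 pins its dependencies much more strictly than 1.10: upstream recommends installing against a published constraints file matching the Python version. The `RUN pip install` line is outside this hunk, so the exact form below is an assumption (the extras in brackets are illustrative), but a minimal sketch of what the updated Dockerfile needs looks like:

    # Upstream-recommended install for Airflow 2.0.0 on Python 3.8; the
    # constraints file pins every transitive dependency to a tested version.
    pip install "apache-airflow[crypto,celery,postgres,redis]==2.0.0" \
        --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.0.0/constraints-3.8.txt"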
28 changes: 16 additions & 12 deletions README.md
@@ -10,35 +10,36 @@ This repository contains **Dockerfile** of [apache-airflow](https://github.com/a

## Informations

-* Based on Python (3.7-slim-buster) official Image [python:3.7-slim-buster](https://hub.docker.com/_/python/) and uses the official [Postgres](https://hub.docker.com/_/postgres/) as backend and [Redis](https://hub.docker.com/_/redis/) as queue
+* Based on Python (3.8-slim-buster) official Image [python:3.8-slim-buster](https://hub.docker.com/_/python/) and uses the official [Postgres](https://hub.docker.com/_/postgres/) as backend and [Redis](https://hub.docker.com/_/redis/) as queue
* Install [Docker](https://www.docker.com/)
* Install [Docker Compose](https://docs.docker.com/compose/install/)
* Following the Airflow release from [Python Package Index](https://pypi.python.org/pypi/apache-airflow)

## Installation

-Pull the image from the Docker repository.
+At the moment there is no public image `puckel/docker-airflow:2.0.0`, so you have to build it yourself. After cloning this repo, run:
+
+    docker build -t puckel/docker-airflow:2.0.0 .

-    docker pull puckel/docker-airflow
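Once the build finishes, a quick sanity check confirms the image is tagged and runs the expected Airflow release; a minimal sketch, assuming the build above succeeded:

    # Should print something like: 2.0.0
    docker run --rm puckel/docker-airflow:2.0.0 airflow version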

## Build

Optionally install [Extra Airflow Packages](https://airflow.incubator.apache.org/installation.html#extra-package) and/or python dependencies at build time:

-    docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" -t puckel/docker-airflow .
-    docker build --rm --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .
+    docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" -t puckel/docker-airflow:2.0.0 .
+    docker build --rm --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow:2.0.0 .

or combined

-    docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .
+    docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow:2.0.0 .
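To verify that build-time extras actually made it into the image, you can query pip in a throwaway container; a small sketch, using the same package names as the build args above:

    # Each command should print name/version metadata if the build arg took effect
    docker run --rm puckel/docker-airflow:2.0.0 pip show datadog
    docker run --rm puckel/docker-airflow:2.0.0 pip show flask-oauthlib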

+Don't forget to update the airflow image references in the docker-compose files to `puckel/docker-airflow:2.0.0`.

## Usage

By default, docker-airflow runs Airflow with **SequentialExecutor** :

-    docker run -d -p 8080:8080 puckel/docker-airflow webserver
+    docker run -d -p 8080:8080 puckel/docker-airflow:2.0.0 webserver
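Once the container is up (the webserver can take roughly half a minute to start), Airflow's health endpoint gives a quick liveness check; a sketch:

    # Returns JSON with metadatabase and scheduler status
    curl http://localhost:8080/health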

If you want to run another executor, use the other docker-compose.yml files provided in this repository.

@@ -54,7 +55,7 @@ NB : If you want to have DAGs example loaded (default=False), you've to set the

`LOAD_EX=n`

-    docker run -d -p 8080:8080 -e LOAD_EX=y puckel/docker-airflow
+    docker run -d -p 8080:8080 -e LOAD_EX=y puckel/docker-airflow:2.0.0
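With `LOAD_EX=y` you can confirm the example DAGs were picked up by listing them inside the running container; a sketch, where `<container>` is a placeholder for the ID or name reported by `docker ps`:

    # Lists all DAGs the scheduler has parsed, including the bundled examples
    docker exec -ti <container> airflow dags list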

If you want to use Ad hoc query, make sure you've configured connections:
Go to Admin -> Connections, edit "postgres_default" and set these values (equivalent to the values in airflow.cfg/docker-compose*.yml):
@@ -65,7 +66,7 @@ Go to Admin -> Connections and Edit "postgres_default" set this values (equivale

For encrypted connection passwords (with the Local or Celery Executor), you must have the same fernet_key. By default docker-airflow generates the fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to share the same key across containers. To generate a fernet_key:

-    docker run puckel/docker-airflow python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
+    docker run puckel/docker-airflow:2.0.0 python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
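The generated key can then be pinned across containers via an environment variable; a sketch, assuming the entrypoint still honours `FERNET_KEY` as in the 1.10 images:

    # Paste the key printed by the command above; every container that reads
    # or writes encrypted connections must receive the same value
    docker run -d -p 8080:8080 -e FERNET_KEY="<generated-key>" puckel/docker-airflow:2.0.0 webserver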

## Configuring Airflow

@@ -98,6 +99,9 @@ In order to incorporate plugins into your docker container
- Airflow: [localhost:8080](http://localhost:8080/)
- Flower: [localhost:5555](http://localhost:5555/)

+To log in to the Airflow webserver, the default credentials are:
+- username: airflow
+- password: airflow
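Airflow 2.0 enables RBAC by default, so a login is always required; this default user is presumably created by the image's entrypoint at startup. If you need another account, the 2.0 CLI can create one; a sketch, where `<webserver-container>` is a placeholder:

    # Create an additional admin user inside the running webserver container
    docker exec -ti <webserver-container> airflow users create \
        --username admin --password admin \
        --firstname Admin --lastname User \
        --role Admin --email admin@example.com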

## Scale the number of workers

@@ -111,16 +115,16 @@ This can be used to scale to a multi node setup using docker swarm.
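With the Celery setup, scaling workers is a one-liner through compose; a sketch, assuming the service is named `worker` as in docker-compose-CeleryExecutor.yml:

    # Run three Celery worker containers instead of one
    docker-compose -f docker-compose-CeleryExecutor.yml up -d --scale worker=3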

If you want to run other airflow sub-commands, such as `dags list` or `tasks clear`, you can do so like this:

-    docker run --rm -ti puckel/docker-airflow airflow list_dags
+    docker run --rm -ti puckel/docker-airflow:2.0.0 airflow dags list

or with your docker-compose set up like this:

docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow dags list
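Airflow 2.0 reorganised the CLI into command groups, which is why the 1.10-era names no longer work; a short mapping of the commands this README uses, sketched from the 2.0 release notes:

    airflow dags list      # 1.10: airflow list_dags
    airflow tasks clear    # 1.10: airflow clear
    airflow users create   # 1.10: airflow create_user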

You can also use this to run a bash shell or any other command in the same environment that airflow would be run in:

-    docker run --rm -ti puckel/docker-airflow bash
-    docker run --rm -ti puckel/docker-airflow ipython
+    docker run --rm -ti puckel/docker-airflow:2.0.0 bash
+    docker run --rm -ti puckel/docker-airflow:2.0.0 ipython

# Simplified SQL database configuration using PostgreSQL
