Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docker-compose-airflow #5757

Merged
merged 2 commits into from May 2, 2023

Conversation

arielshaqed
Copy link
Contributor

@arielshaqed arielshaqed commented Apr 27, 2023

This is almost exactly the Airflow docker-compose, except we add a lakeFS
connector as part of setup.

To use:

docker compose -f docker-compose.yml -f docker-compose-airflow.yml up -d

Why a separate docker-compose file?

  1. Airflow requires multiple services. This makes patching the one
    required docker-compose file easier.
  2. Anyway many (most?) users won't want all these services. Using profiles
    to load Airflow inside the same docker-compose file would only cause
    more differences, and add no value to users.

How was this tested?

I brought up all services in both docker-compose files using docker compose. Then I copied
treeverse/airflow-provider-lakeFS/lakefs_provider/example_dags/lakefs-dag.py
into ./dags/. I created a repo example-repo on
http://localhost:8080/ (the lakeFS instance). I had to restart
airflow-scheduler, but after that it found my DAG and I was able to run the
lakeFS example dag http://localhost:8000/ (the Airflow instance).

Fixes #5677.

This is almost exactly the Airflow docker-compose, except we add a lakeFS
connector as part of setup.

To use:
```sh
docker compose -f docker-compose.yml -f docker-compose-airflow.yml up -d
```

== Why a separate docker-compose file? ==

1. Airflow requires _multiple_ services.  This makes patching the one
   required docker-compose file easier.
1. Anyway many (most?) users won't want all these services.  Using profiles
   to load Airflow inside the same docker-compose file would only cause
   _more_ differences, and add no value to users.

== How was this tested? ==

I brought up all services in _both_ docker-compose files using `docker
compose`.  Then I copied
`treeverse/airflow-provider-lakeFS/lakefs_provider/example_dags/lakefs-dag.py`
into `./dags/`.  I created a repo example-repo on
http://localhost:8080/ (the lakeFS instance).  I had to restart
airflow-scheduler, but after that it found my DAG and I was able to run the
lakeFS example dag http://localhost:8000/ (the Airflow instance).
@arielshaqed arielshaqed added new-feature Issues that introduce new feature or capability roadmap/airflow-operators https://docs.lakefs.io/understand/roadmap.html#airflow-operators include-changelog PR description should be included in next release changelog team/ecosystem Team Ecosystem pr/merge-if-approved Reviewer: please feel free to merge if no major comments labels Apr 27, 2023
Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To review: there is exactly one line changed in docker-compose-airflow.yml, it is marked on this object.

AIRFLOW__SCHEDULER__ENABLE_HEALTH_CHECK: 'true'
# WARNING: Use _PIP_ADDITIONAL_REQUIREMENTS option ONLY for a quick checks
# for other purpose (development, test and especially production usage) build/extend Airflow image.
_PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-} airflow-provider-lakefs>=0.45.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer please note: this is the only change to the Airflow docker-compose file!

@arielshaqed arielshaqed requested a review from rmoff April 27, 2023 13:17
@arielshaqed
Copy link
Contributor Author

  • @rmoff who may find this useful!

@rmoff
Copy link
Contributor

rmoff commented Apr 27, 2023

@arielshaqed thanks for the note. Is there a readme or docs to go with this?

@arielshaqed
Copy link
Contributor Author

arielshaqed commented Apr 28, 2023

@arielshaqed thanks for the note. Is there a readme or docs to go with this?

You're right, there will be. Documentation for the Everything Bagel docker-compose is somewhat lacking already. No excuse for keeping it that way, though.

PTAL.

Copy link
Contributor

@rmoff rmoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @arielshaqed.

EB docs and general state is closely on my radar for attention soon :)

Copy link
Contributor

@Jonathan-Rosenberg Jonathan-Rosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool
Thank you!

@arielshaqed
Copy link
Contributor Author

Thanks! Pulling...

@arielshaqed arielshaqed merged commit 3765038 into master May 2, 2023
36 checks passed
@arielshaqed arielshaqed deleted the feature/5677-add-airflow-to-everything-bagel branch May 2, 2023 08:06
@rmoff rmoff mentioned this pull request Jun 14, 2023
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
include-changelog PR description should be included in next release changelog new-feature Issues that introduce new feature or capability pr/merge-if-approved Reviewer: please feel free to merge if no major comments roadmap/airflow-operators https://docs.lakefs.io/understand/roadmap.html#airflow-operators team/ecosystem Team Ecosystem
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Airflow to Everything Bagel docker-compose
3 participants