Failure during "airflow db init" on fresh airflow 2.0 installation #13149
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
Hello. We do not support installing Airflow via Poetry. The official way of installing Airflow in a reproducible way is described here (using PIP 20.2.4 and the constraint mechanism): http://airflow.apache.org/docs/apache-airflow/stable/installation.html#getting-airflow Also, the problems with PIP 20.3 have apparently been solved in PIP 20.3.3 (we are still verifying it), so installing with PIP 20.3.3 should also work. Can you please verify that the problem persists if you follow the official installation mechanism?
If you still want to stay with poetry, the list of "consistent" and working constraints for Airflow 2.0.0 is available here: https://github.com/apache/airflow/tree/constraints-2.0.0 (separate for each supported Python version), and in case poetry resolves the requirements differently, I encourage you to make it follow the constraints we publish. If you see any further problems, a simple comparison of your installed dependency versions with the ones provided by our constraints should help you resolve the problem and make poetry install the right versions. I am not sure how this can be done, as I do not know poetry that well, but if you find out how to make this work with poetry, it would be great if you could contribute back a description of how to make our constraint mechanism work with poetry-driven installations.
We have a whole set of CI tests in place to make sure that the list of "valid" constraints is up to date and automatically verified, so following the constraints we produce is the best way to make sure your installation is smooth and works. You can read more about why and how this works here, if you are interested in why we chose this path. I am closing this as invalid for now, but if you try to match the constraints and still see the same problems even with the same versions of dependencies installed, feel free to add an extra comment here. |
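The comparison suggested above (installed versions vs. the published constraints) could be scripted roughly as follows. This is only a sketch: the helper names `parse_constraints`, `installed_versions` and `find_mismatches` are made up for illustration, it assumes Python 3.8+ (for `importlib.metadata`), and it assumes the constraints file consists of plain `name==version` lines.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_constraints(text):
    """Parse 'name==version' lines from a constraints file into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[normalize(name.strip())] = version.strip()
    return pins


def installed_versions():
    """Map normalized names of installed distributions to their versions."""
    return {
        normalize(dist.metadata["Name"]): dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # ignore distributions with broken metadata
    }


def find_mismatches(pins, installed):
    """Return {name: (installed, pinned)} for packages that differ from the pins."""
    return {
        name: (installed[name], pinned)
        for name, pinned in pins.items()
        if name in installed and installed[name] != pinned
    }
```

Calling `find_mismatches(parse_constraints(text), installed_versions())` inside the poetry-managed environment, with `text` read from the downloaded constraints file for your Python version, would list the packages that poetry resolved differently from the constraints.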
Thank you for the quick reply.
DB initialization after installing packages with
Thanks. It looks like poetry does not support constraint files out of the box. If I come across anything helpful regarding installation via poetry I'll share it with the community. |
For those seeking a workaround, I was eventually able to initialize the database with a poetry installation by adding python3-openid to pyproject.toml:
[tool.poetry]
name = "airflow-2-docker-example"
version = "0.1.0"
description = ""
authors = ["name <name@example.com>"]
[tool.poetry.dependencies]
python = "^3.7"
apache-airflow = "^2.0.0"
psycopg2-binary = "^2.8.6"
+python3-openid = "^3.2.0"
[tool.poetry.dev-dependencies]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api" |
Cool. Do you think you could put together repeatable instructions on how to install Airflow with Poetry? It would be great to add such instructions to the Airflow docs for those who try it. UPDATE: Ah, I see your response about it. I would love to hear back if you find something there :). It is very much possible that the installation will work even without the constraints initially. The problem we experienced in the past was that at some point transitive dependency changes broke the Airflow installation; that's why we have those "fixed" constraint files that make the installation reproducible. |
Why not use a lock file for this purpose? Poetry should support reproducible builds via the lock file. |
We did indeed consider looking at poetry in the past, so maybe this is a good time to try. Would it be possible for you to make a POC and check whether the poetry.lock approach can be used in a similar way to how we use constraint files? Happy to help with review and guide you @maxcountryman |
Just one comment here @maxcountryman if you are following that route. There are certain use cases that need to be solved if we would support poetry:
And get a fully reproducible installation of Airflow 1.10.14, no matter what transitive dependencies have been released since we released airflow. The
Those are the three major challenges I see if we would like to go the poetry route at some point in time - I wonder @maxcountryman (or anyone else with poetry experience) if you have some experience that could help us to see if those cases can be handled with it? |
Not to ignore the above discussion, but for the benefit of anyone else using poetry on top of airflow in a workflow similar to the following who is experiencing issues with flask-openid:
I had to add:
RUN export PYTHON_MAJOR_MINOR_VERSION=$(python -c 'import sys; print("%s.%s"% (sys.version_info.major, sys.version_info.minor))') \
&& AIRFLOW_MINOR_VERSION=$(echo "$AIRFLOW_VERSION" | cut -d "." -f 1)-$(echo "$AIRFLOW_VERSION" | cut -d "." -f 2) \
&& curl -sSL "https://raw.githubusercontent.com/apache/airflow/constraints-$AIRFLOW_MINOR_VERSION/constraints-$PYTHON_MAJOR_MINOR_VERSION.txt" > ./airflow-constraints.txt \
&& poetry export --without-hashes -f requirements.txt -o ./requirements.txt \
# flask-openid does not correctly specify version constraints https://github.com/python-poetry/poetry/issues/1287
&& echo "remove python-openid from poetry packages as it's pulled in incorrectly by flask-openid" \
&& sed -i '/^python-openid==/d' ./requirements.txt \
&& pip install --user --no-cache-dir --upgrade pip==${PIP_VERSION} \
&& pip install --user --no-cache-dir --no-warn-script-location -r ./requirements.txt --constraint ./airflow-constraints.txt \
&& rm -rf ~/.cache ./requirements.txt ./airflow-constraints.txt |
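For readers who want to check the two string manipulations in the RUN step above outside of a Docker build, here is a rough Python equivalent. It is a sketch only: `constraints_url` and `drop_pinned` are hypothetical helper names, and the URL layout is copied from the curl call in the snippet rather than from any official reference.

```python
import sys


def constraints_url(airflow_version, python_version=None):
    """Build the constraints URL the same way the shell snippet does."""
    if python_version is None:
        # Same as the `python -c` call: "<major>.<minor>" of the running interpreter
        python_version = "%d.%d" % sys.version_info[:2]
    # Same as the two `cut` calls: "2.0.0" -> "2-0"
    major, minor = airflow_version.split(".")[:2]
    branch = "constraints-%s-%s" % (major, minor)
    return "https://raw.githubusercontent.com/apache/airflow/%s/constraints-%s.txt" % (
        branch,
        python_version,
    )


def drop_pinned(requirements_text, package):
    """Remove 'package==...' lines, mirroring the sed call in the snippet."""
    prefix = package + "=="
    kept = [
        line
        for line in requirements_text.splitlines(keepends=True)
        if not line.startswith(prefix)
    ]
    return "".join(kept)
```

Note that `drop_pinned(text, "python-openid")` only removes lines that start with `python-openid==`, so a `python3-openid==...` line in the exported requirements is left in place, which matches the intent of the sed expression.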
Just wondering (I would love to understand it): what benefit does poetry bring in this particular case? I was thinking about switching to poetry once for Airflow (that was a long time ago; my mail from October 2018: https://lists.apache.org/thread.html/23a598f54eda27311544fbdb9503305cd214b27b211699bd37689f46%40%3Cdev.airflow.apache.org%3E) but after trying it out, I noticed that it was missing a lot of the things we wanted (I also checked pip-tools then, but they weren't good enough either). So I really wonder what benefits people get from using poetry vs. other tools and how it fits into the workflow you already have. |
We started using poetry to build an airflow docker image, but have since moved to the official image to try to minimize incompatibilities and haven't gone away from the tool. I think the main reasons are:
|
I see. I perfectly understand what 'horrible issues' mean. It's unfortunately yet another 'dependency-hell' kind of situation. Good luck with that. For now I think I prefer to stick with PIP install, but if you would like to make a PR to our contributing documentation on how to use our constraint files when you are installing airflow with poetry in the chapter following this: https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pinned-constraint-files that would be awesome @Limess . I know there are people using poetry for various reasons and I would love to be able to tell them "if you use poetry you can follow this" rather than "do not use poetry". |
Adds python3-openid requirement. It seems that the python3-openid dependency is not properly resolved by tools like poetry (it is properly resolved by pip). As a result, an old version of python3-openid is installed when poetry is used, and errors occur when initdb is run. While we do not use poetry as an official installation mechanism, this happens frequently enough, and is easy enough to fix, that we can add this dependency to make it easier for poetry users. Related to #13711 #13558 #13149. Update setup.cfg. (cherry picked from commit df73edf)
I had to fork flask-openid:
[tool.poetry]
name = "airflow-new"
version = "0.1.0"
description = ""
authors = ["petobens <foo@bar.com>"]
[tool.poetry.dependencies]
python = "^3.8"
flask-openid = {git = "https://github.com/petobens/flask-openid.git"}
apache-airflow = "^2.0.1"
[tool.poetry.dev-dependencies]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
basically removing the sys.version_info check that made poetry install the Python 2 dependency instead of the correct Python 3 one. Since Airflow 2.0 only supports Python >=3.6, can Airflow either use my fork or a new flask-openid fork with this workaround? (The official flask-openid library hasn't been updated in almost 5 years.) |
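To illustrate the kind of check being described, here is a hypothetical sketch (not the actual flask-openid source) that contrasts a dependency chosen by a `sys.version_info` test at the time `setup.py` runs with the same choice declared through PEP 508 environment markers. The function names and the unpinned requirement strings are assumptions made for the example.

```python
import sys


def requires_runtime_check(version_info=sys.version_info):
    """Choose the OpenID package while setup.py executes.

    Only one name ends up in the result, and which one depends on the
    interpreter that happened to run the code.
    """
    if version_info[0] == 2:
        return ["python-openid"]
    return ["python3-openid"]


def requires_with_markers():
    """Declare both names statically and let the installer pick.

    The environment markers are evaluated against the target interpreter
    at install time, so no code has to run to decide.
    """
    return [
        'python-openid; python_version < "3"',
        'python3-openid; python_version >= "3"',
    ]
```

With the first form, a tool that reads the package's dependencies without executing `setup.py` under the target interpreter may end up with a different answer than pip does, which would be consistent with the behaviour reported in this thread. The marker form avoids the question, but it would need a change in the upstream library.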
@petobens -> did you try to make a PR to the flask-openid library? The change is not huge and, given that Python 2 reached EOL more than a year ago, maybe the authors will merge and release it? I am afraid we cannot release anything on PyPI that refers to a GitHub repository even if we wanted to: PyPI does not work with dependencies from GitHub. |
Apache Airflow version: 2.0
Environment:
Kernel (uname -a): Darwin
What happened:
A fresh installation of airflow 2.0 seems to be failing on airflow db init with what looks like a 3rd-party library exception (see traceback below). I searched for related issues on github/google, but didn't find anything useful.
What you expected to happen:
Given I couldn't find any useful information online, this makes me think the problem is my environment. I will continue to look at it, but in the meantime I'd like to put this out there in case anyone else has had a similar issue.
How to reproduce it:
1. Install poetry package manager
2. Create a docker-compose.yml file inside a new directory
3. Create a pyproject.toml file inside the same directory as (2)
4. Install python packages and bring up the database
5. Initialize airflow database
Anything else we need to know:
Traceback