provide_context=True not working with PythonVirtualenvOperator #8177
This problem indeed seems significant. I wonder if it also appears on master. Have you tried to check which objects in the context are causing the problem? Maybe we can exclude one or two objects to restore the correct behavior of this option in Airflow 1.10.
This is an open source project, so there is no specific person who solves issues. Would you like to take responsibility for it? I will gladly help and answer questions if you want to solve this problem. Contributor guide: https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst The community is waiting for your contribution 🐈
@mik-laj Thank you for the information, I will look into the root cause and create a PR. Could you please assign me to the issue? I have not tested it on the master branch; I'll do so before proceeding with the PR.
@mik-laj Hi Kamil - I created PR #8256 to fix this issue on the v1-10-stable branch, and the CI tests are passing. There seems to be an issue with requirements, but I don't think that's related to this change. Could you let me know the next steps? There were too many changes on the master branch; I will revisit this bug there once 2.0 is out.
Is it right that this issue is not going to be fixed in 1.10.x? As a workaround one could use use_dill=True, since dill is able to serialize modules.
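A minimal sketch of that workaround, assuming Airflow 1.10.x (the callable, task_id, and surrounding `dag` object are placeholders):

```python
from airflow.operators.python_operator import PythonVirtualenvOperator

def print_context_keys(**context):
    # Runs inside the virtualenv; dill deserializes the context there.
    print(sorted(context.keys()))

venv_task = PythonVirtualenvOperator(
    task_id="venv_task",               # placeholder task id
    python_callable=print_context_keys,
    provide_context=True,              # pass the context as keyword args
    use_dill=True,                     # dill can serialize module objects
    requirements=["dill"],             # dill must be importable in the venv
    dag=dag,                           # assumes a DAG defined elsewhere
)
```

Note that `system_site_packages` defaults to `True`, which matters here: deserializing Airflow objects inside the virtualenv requires their classes to be importable there.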
It was not marked with the 1.10.12 milestone, I am afraid. I marked it as such now. There is a good "chance" rc1 will be cancelled because of #10362, and if so we might be able to add it.
Yeah thanks, will include it, planning to cut 1.10.12rc2 later tonight.
PR merged, will be part of 1.10.12rc2.
Hi, was this bug fixed in Airflow version 2?
No, it wasn't fixed ... I have the same bug in 2.0.2.
Hi @jatejeda, please check my GitHub; I used the PythonVirtualenvOperator in a personal project on version 2.
Hi, facing the same issue with Airflow 2.1.2.
Probably you are using some 3rd-party functions/modules that are not picklable. There is not much we can do about that. The PythonVirtualenvOperator works by serializing everything that is needed by your function and deserializing it in the virtualenv. If any module refuses to be serialized, you will get this error.
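A tiny standalone example of the underlying limitation: plain pickle refuses module objects, which is exactly what happens when a module ends up among the serialized arguments:

```python
import pickle
import json  # stands in for any module that ends up in the arguments

try:
    pickle.dumps(json)
except TypeError as err:
    print(err)  # e.g. "cannot pickle 'module' object"
```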
Can you post some details of your methods/functions? Imports/logs? I'd be curious whether we can improve the error message to tell you exactly what's wrong.
Hi @potiuk! Thanks for your reply. My method is the following:
I imported […]. I've also seen that it was not possible to use the context variable.
Above, I tried to handle the pull, but I'm not sure how to handle the push of dataset_2 from the virtualenv. With the above code I am getting the following error:
To give you a bit of context, I am trying to use this PythonVirtualenvOperator because I am facing the following issue when running one of my tasks:
EDIT: I managed to make it work by setting execute_tasks_new_python_interpreter=True in the Airflow config, but I wanted to isolate and use a new Python interpreter only for a specific task, not for all of my tasks.
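One possible way to get `dataset_2` out of the virtualenv is to return it from the callable: the operator's return value is stored as the `return_value` XCom. A sketch assuming Airflow 2.x (the task id and input payload are hypothetical):

```python
from airflow.operators.python import PythonVirtualenvOperator

def transform(dataset_1):
    # Runs inside the virtualenv; only the serialized argument and the
    # return value cross the process boundary, not the Airflow context.
    dataset_2 = {"rows": len(dataset_1)}
    return dataset_2  # a picklable return value is pushed as XCom

transform_task = PythonVirtualenvOperator(
    task_id="transform_in_venv",  # hypothetical task id
    python_callable=transform,
    op_args=[{"a": 1, "b": 2}],   # pass inputs explicitly, not via context
    dag=dag,                      # assumes a DAG defined elsewhere
)

# A downstream task can then pull the result with:
#   context["ti"].xcom_pull(task_ids="transform_in_venv")
```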
Apache Airflow version: 1.10.9
Kubernetes version (if you are using kubernetes) (use kubectl version): 1.14.8
Environment (uname -a): Docker (Ubuntu 18.04 - Python 3.7)

What happened:
When we enable provide_context=True for the PythonVirtualenvOperator, we get the error below.
One way to get around this issue is to create your own CustomPythonVirtualenvOperator and override _write_args, but this should not be necessary. Feel free to use this if you're encountering the same issue:
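The snippet originally attached here was lost; below is a sketch of one such override, assuming Airflow 1.10.x internals. It keeps the stock behavior but silently drops context entries that pickle cannot handle:

```python
import pickle

from airflow.operators.python_operator import PythonVirtualenvOperator

class CustomPythonVirtualenvOperator(PythonVirtualenvOperator):
    def _write_args(self, input_filename):
        # Simplified version of the stock method (pickle only): kwargs
        # that pickle rejects (e.g. module objects from the context)
        # are filtered out before serialization.
        if self.op_args or self.op_kwargs:
            picklable_kwargs = {}
            for key, value in self.op_kwargs.items():
                try:
                    pickle.dumps(value)
                    picklable_kwargs[key] = value
                except (TypeError, AttributeError, pickle.PicklingError):
                    continue  # drop entries that cannot be serialized
            with open(input_filename, "wb") as file:
                pickle.dump(
                    {"args": self.op_args, "kwargs": picklable_kwargs}, file
                )
```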
What you expected to happen:
Ideally we should be able to use the context so we can run these tasks with run-time arguments via the CLI or the REST API.
How to reproduce it:
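The example DAG did not survive here; a hypothetical minimal reproduction consistent with the description below (task1 runs without the context, task2 enables it) could look like this:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonVirtualenvOperator

def no_context():
    print("task1 runs fine without the context")

def with_context(**context):
    print(sorted(context.keys()))

with DAG(
    "venv_context_repro",             # hypothetical dag id
    start_date=datetime(2020, 4, 1),
    schedule_interval=None,
) as dag:
    task1 = PythonVirtualenvOperator(
        task_id="task1",
        python_callable=no_context,
    )
    task2 = PythonVirtualenvOperator(
        task_id="task2",
        python_callable=with_context,
        provide_context=True,         # fails at runtime with a pickling error
    )
```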
Anything else we need to know:
If you run the DAG provided, you should see task1 passing and task2 failing.