PythonVirtualEnvOperator not serialising return type of function if False  #16022

@enisnazif

Apache Airflow version: 2.0.2

Kubernetes version (if you are using kubernetes) (use kubectl version): N/A

Environment: Ubuntu 18.04 / Python 3.7

  • Cloud provider or hardware configuration: N/A
  • OS: Debian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a): Ubuntu 18.04 on WSL2

What happened:

When PythonVirtualenvOperator is given a Python callable that returns False (or any other value x such that bool(x) == False), the return value is never serialised into the script.out file. The cause is the truthiness check on line 53 of the Jinja template. As a result, when read_result is called with script.out, it finds an empty file.

What you expected to happen:

Whatever the function returns, the value should be correctly serialised into script.out. This could be fixed by changing the Jinja template to use "if res is not None" instead of "if res".
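
To illustrate the difference between the two checks, here is a minimal standalone sketch that does not use Airflow at all. The helpers write_result, read_result and round_trip are hypothetical stand-ins written for this example, not the operator's actual code; the sketch assumes the result is pickled to script.out and that an empty file is read back as None, which matches the log output shown below.

```python
import os
import pickle
import tempfile


def write_result(res, path, use_none_check):
    """Hypothetical stand-in for the final step of the generated script."""
    # Current behaviour: truthiness check. Proposed fix: explicit None check.
    should_write = (res is not None) if use_none_check else bool(res)
    with open(path, "wb") as f:
        if should_write:
            pickle.dump(res, f)
        # Otherwise the file is created but left empty.


def read_result(path):
    """Assumed reader behaviour: an empty file is interpreted as None."""
    if os.stat(path).st_size == 0:
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def round_trip(res, use_none_check):
    """Write res to a temporary script.out and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.out")
        write_result(res, path, use_none_check)
        return read_result(path)


if __name__ == "__main__":
    # Compare both checks for a truthy value, several falsy values, and None.
    for value in (True, False, 0, "", [], None):
        print(repr(value), round_trip(value, False), round_trip(value, True))
```

With the truthiness check, every falsy value (False, 0, an empty string, an empty list) comes back as None; with the explicit None check, only None itself does.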

How to reproduce it:

Minimal DAG:

from airflow import DAG
# airflow.operators.python_operator is deprecated in Airflow 2; import from airflow.operators.python
from airflow.operators.python import PythonVirtualenvOperator
from airflow.utils.dates import days_ago

dag = DAG(
    dag_id='test_dag',
    start_date=days_ago(3),
    schedule_interval='0 20 * * *',
    catchup=False,
)

with dag:

    def fn_that_returns_false():
        return False

    def fn_that_returns_true():
        return True

    task_1 = PythonVirtualenvOperator(
        task_id='return_false',
        python_callable=fn_that_returns_false
    )

    task_2 = PythonVirtualenvOperator(
        task_id='return_true',
        python_callable=fn_that_returns_true
    )

Checking the logs for return_false, we see:

...
[2021-05-24 12:09:02,729] {python.py:118} INFO - Done. Returned value was: None
[2021-05-24 12:09:02,741] {taskinstance.py:1192} INFO - Marking task as SUCCESS. dag_id=test_dag, task_id=return_false, execution_date=20210524T120900, start_date=20210524T120900, end_date=20210524T120902
[2021-05-24 12:09:02,765] {taskinstance.py:1246} INFO - 0 downstream tasks scheduled from follow-on schedule check
[2021-05-24 12:09:02,779] {local_task_job.py:146} INFO - Task exited with return code 0

The log should instead read 'Returned value was: False'.

This issue was discovered whilst trying to build a virtualenv-aware version of ShortCircuitOperator, where a return value of False is significant.
