PythonVirtualenvOperator: native template rendering breaks {{ ti }} / {{ task_instance }} pickling #61741

@akhundovte

Description

The PythonVirtualenvOperator docs state that passing/serializing ti / task_instance is not supported. However, the actual behavior differs in a confusing way depending on render_template_as_native_obj:

  • With render_template_as_native_obj=False, templating {{ ti }} / {{ task_instance }} in op_kwargs appears to work (the task runs and prints values).

  • With render_template_as_native_obj=True, the task fails while serializing arguments for the subprocess, raising a PicklingError whose message mentions structlog loggers rather than the task instance.
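
One likely explanation, not verified against the Airflow source, is that non-native rendering turns {{ ti }} into its string form, which pickles trivially, while native rendering hands the live object to the serializer. The sketch below imitates that difference with the standard-library pickle module. FakeLogger and FakeTaskInstance are hypothetical stand-ins, not Airflow or structlog types, and the real operator in this report uses cloudpickle.

```python
import pickle


class FakeLogger:
    """Hypothetical stand-in for a logger that refuses to be pickled."""

    def __getstate__(self):
        raise pickle.PicklingError("this logger cannot be pickled")


class FakeTaskInstance:
    """Hypothetical stand-in for a live task instance that holds a logger."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.log = FakeLogger()

    def __repr__(self):
        return f"<FakeTaskInstance task_id={self.task_id!r}>"


def try_pickle(value):
    """Return 'ok' if value pickles, otherwise the exception class name."""
    try:
        pickle.dumps(value)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


ti = FakeTaskInstance("repro")

# String rendering: the callable would receive only text, which always pickles.
as_string = str(ti)
string_result = try_pickle(as_string)

# Native rendering: the object itself reaches the serializer and drags the logger along.
object_result = try_pickle(ti)

print(string_result, object_result)
```

If this explanation holds, the "working" case never passes a task instance at all: the callable only sees a string.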

Minimal reproduction

from pendulum import datetime

from airflow.sdk import dag
from airflow.providers.standard.operators.python import PythonVirtualenvOperator


def venv_callable(ti, task_instance):
    print("ti =", ti)
    print("task_instance =", task_instance)


@dag(
    start_date=datetime(2026, 1, 1),
    schedule=None,
    catchup=False,
    render_template_as_native_obj=False,  # <-- changing only this flips behavior
)
def test_simple():
    PythonVirtualenvOperator(
        task_id="repro",
        python_callable=venv_callable,
        python_version="3.10",
        serializer="cloudpickle",
        op_kwargs={
            "ti": "{{ ti }}",
            "task_instance": "{{ task_instance }}",
        },
        requirements=["apache-airflow==3.1.6"],
        system_site_packages=False,
    )


test_simple()

Steps to reproduce

  • Run the DAG above with render_template_as_native_obj=False → task succeeds.
  • Change only render_template_as_native_obj=True → task fails during pickle/argument serialization.

Error / traceback (render_template_as_native_obj=True)

[2026-01-29 23:06:37] INFO - Use 'cloudpickle' as serializer.
[2026-01-29 23:06:37] ERROR - Task failed with exception
PicklingError: Only BytesLoggers to sys.stdout and sys.stderr can be pickled.
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/task_runner.py", line 1004 in run
...
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/standard/operators/python.py", line 529 in _write_args
File "/home/airflow/.local/lib/python3.10/site-packages/cloudpickle/cloudpickle.py", line 1537 in dumps
...
File "/home/airflow/.local/lib/python3.10/site-packages/structlog/_output.py", line 278 in __getstate__

Use case/motivation

We use PythonVirtualenvOperator to run code in an isolated environment with separate dependencies, but we often still need basic task-instance context inside the callable (identifiers, try number, sometimes access to basic TI metadata / XCom).

Native template rendering (render_template_as_native_obj=True) is valuable for other templates because it renders lists/dicts/booleans as native Python types, so we don’t want to disable it globally. The current behavior makes it easy to accidentally pass a “live” non-serializable object and then hit a confusing pickle error.

I understand that calling the Airflow REST API directly from the virtualenv is a valid workaround, but it would be very helpful to have an official/supported mechanism for isolated environments:

  • a small supported serializable handle/proxy (e.g., dag_id, run_id, task_id, try_number, map_index, etc.),
  • and a minimal supported helper/client (or a limited set of operations) to safely retrieve basic TaskInstance info from within the virtualenv without passing the full object.
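
To make the first proposal concrete, here is a minimal sketch of what such a handle could look like. TaskInstanceHandle, to_payload, and from_payload are hypothetical names invented for illustration, and nothing here is an existing Airflow API. The handle crosses the process boundary as a plain dict so that the virtualenv does not need to import the class in order to unpickle it.

```python
import pickle
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskInstanceHandle:
    """Hypothetical, serializable subset of task-instance identifiers."""

    dag_id: str
    run_id: str
    task_id: str
    try_number: int
    map_index: int = -1


def to_payload(handle):
    """Serialize the handle as a pickled dict of primitives."""
    return pickle.dumps(asdict(handle))


def from_payload(payload):
    """Rebuild the handle on the receiving side."""
    return TaskInstanceHandle(**pickle.loads(payload))


# Example values only; they mirror the reproduction above.
handle = TaskInstanceHandle(
    dag_id="test_simple",
    run_id="manual__2026-01-29",
    task_id="repro",
    try_number=1,
)
restored = from_payload(to_payload(handle))
print(restored)
```

Until something like this exists, a similar effect can probably be had by templating individual scalar fields such as "{{ ti.task_id }}" or "{{ run_id }}" in op_kwargs instead of {{ ti }} itself.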

Even if a full client is out of scope, UX would improve a lot with:

  • explicit validation + a clearer error message when TaskInstance/ti ends up in op_args/op_kwargs, and/or
  • documentation that this case behaves differently (and fails earlier) with native rendering enabled.
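
The validation idea could be as small as the sketch below, which tries to serialize each argument separately and reports the offending key by name. find_unpicklable_kwargs and validate_op_kwargs are hypothetical helpers, not part of the provider. The sketch uses the standard-library pickle module, and a threading.Lock stands in for a live task instance.

```python
import pickle
import threading


def find_unpicklable_kwargs(op_kwargs):
    """Return {key: error description} for every value that fails to pickle."""
    problems = {}
    for key, value in op_kwargs.items():
        try:
            pickle.dumps(value)
        except Exception as exc:
            problems[key] = f"{type(exc).__name__}: {exc}"
    return problems


def validate_op_kwargs(op_kwargs):
    """Raise one readable error that names each argument that cannot be serialized."""
    problems = find_unpicklable_kwargs(op_kwargs)
    if problems:
        details = "; ".join(
            f"{key!r} ({message})" for key, message in sorted(problems.items())
        )
        raise ValueError(
            "op_kwargs contains values that cannot be serialized "
            "for the virtualenv: " + details
        )


# A lock is a convenient unpicklable stand-in for a live task instance.
sample = {"retries": 3, "ti": threading.Lock()}
problems = find_unpicklable_kwargs(sample)
print(sorted(problems))
```

Checking each value on its own costs one extra serialization pass, but the resulting message points at the argument name instead of at a logger deep inside the object graph.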

Environment

  • Apache Airflow: 3.1.6
  • Provider: apache-airflow-providers-standard (PythonVirtualenvOperator)
  • Runner Python: 3.10
  • Virtualenv Python: 3.10
  • serializer: cloudpickle

Related issues

#61231

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
