Fix ArgNotSet repr to use stable string instead of memory address#65222

Open
necmo wants to merge 2 commits into apache:main from necmo:fix-argnotset-repr

Conversation

@necmo necmo commented Apr 14, 2026

Issue: closes #65220

Summary

Add __repr__ and __str__ methods to ArgNotSet so that it returns a stable "NOTSET" string instead of Python's default <... object at 0x...> representation, which includes the memory address.

Problem

When ArgNotSet instances leak into serialization paths that fall through to str()/repr() (e.g., TriggerDagRunOperator's logical_date parameter appearing in serialized default_args), the output includes the memory address:

"logical_date": "<airflow.serialization.definitions.notset.ArgNotSet object at 0x7fe833fa8f20>"

Because the scheduler runs DagFileProcessor in separate subprocesses, each process allocates its ArgNotSet instance at a different address. The serialized JSON therefore differs between parses, and a new dag_version is created on every parse cycle (~30 seconds).
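A minimal reproduction of the leak, outside Airflow, might look like the sketch below. The `ArgNotSet` class here is a bare stand-in rather than the real Airflow class, and `json.dumps(..., default=str)` is only an assumed approximation of a serialization path that falls through to str().

```python
import json
import re


class ArgNotSet:
    """Stand-in for the sentinel, without any __repr__/__str__ override."""


NOTSET = ArgNotSet()

# json.dumps calls default=str for objects it cannot encode natively,
# so the sentinel is rendered through object.__repr__.
serialized = json.dumps({"logical_date": NOTSET}, default=str)
print(serialized)

# The default representation embeds the object's address in hex,
# which is specific to the process that created the instance.
has_address = re.search(r"object at 0x[0-9a-fA-F]+", serialized) is not None
```

Running this in two separate interpreter processes can print two different addresses, which is the source of the differing JSON described above.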

Fix

class ArgNotSet:
    """Sentinel type for annotations, useful when None is not viable."""

    def __repr__(self) -> str:
        return "NOTSET"

    def __str__(self) -> str:
        return "NOTSET"

This ensures that even if ArgNotSet bypasses the proper is_arg_set() check in serialization, the resulting string is deterministic and does not cause spurious DAG version increments.
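To illustrate why the override makes the output deterministic, here is a small sketch that fingerprints the serialized form of two independent instances. The `fingerprint` helper and the use of SHA-256 are assumptions made for the example; they only stand in for whatever comparison Airflow performs when deciding whether a DAG has changed.

```python
import hashlib
import json


class ArgNotSet:
    """Stand-in sentinel with the overrides proposed in this PR."""

    def __repr__(self) -> str:
        return "NOTSET"

    def __str__(self) -> str:
        return "NOTSET"


def fingerprint(default_args: dict) -> str:
    """Hypothetical helper: hash the JSON form of default_args."""
    payload = json.dumps(default_args, default=str, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


# Two distinct instances live at different addresses,
# yet they serialize to the same string.
first = fingerprint({"logical_date": ArgNotSet()})
second = fingerprint({"logical_date": ArgNotSet()})
```

With the override in place, `first` and `second` are equal even though the two sentinels are different objects.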

@necmo necmo requested review from ashb and bolkedebruin as code owners April 14, 2026 14:55

boring-cyborg bot commented Apr 14, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide.
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

@kaxil kaxil added this to the Airflow 3.2.2 milestone Apr 15, 2026
Replace __repr__ and __str__ overrides with a serialize() method.
This avoids inheritance issues with SetDuringExecution and targets
the actual serialization path in serialize_template_field().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
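
One plausible reading of the inheritance concern: a __repr__ override on ArgNotSet would also be inherited by SetDuringExecution, so both sentinels would render identically. The sketch below shows a serialize()-based alternative under that assumption. The class bodies, the label returned by the subclass, and `serialize_template_field_sketch` are all hypothetical and are not the code in this PR or Airflow's actual serialize_template_field().

```python
class ArgNotSet:
    """Stand-in sentinel with a hypothetical serialize() hook."""

    def serialize(self) -> str:
        return "NOTSET"


class SetDuringExecution(ArgNotSet):
    """Stand-in subclass that keeps its own stable label (illustrative)."""

    def serialize(self) -> str:
        return "DYNAMIC (set during execution)"


def serialize_template_field_sketch(value: object) -> object:
    """Hypothetical helper, not Airflow's serialize_template_field()."""
    if isinstance(value, ArgNotSet):
        # Dispatch to the sentinel's own serialize(); repr() is never consulted,
        # so no memory address can reach the output.
        return value.serialize()
    return value


notset_out = serialize_template_field_sketch(ArgNotSet())
dynamic_out = serialize_template_field_sketch(SetDuringExecution())
plain_out = serialize_template_field_sketch("2026-04-14")
```

Because each class supplies its own serialize(), the subclass keeps a distinct label while __repr__ stays untouched for debugging.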
Development

Successfully merging this pull request may close these issues.

ArgNotSet.__repr__ includes memory address, causing non-deterministic DAG serialization and infinite dag_version increases
