DAG Bundle/User Impersonation Causes Permission Issues in Resource Preparation #60387

@Dev-iL

Description

Apache Airflow version

3.1.5

If "Other Airflow 3 version" selected, which one?

No response

What happened?

When Git-based DAG bundles are used together with user impersonation (run_as_user), tasks can fail with permission errors on bundle resources such as lock files and tracking directories. The bundle is initialized by the main airflow user before impersonation takes effect, so the impersonated user must then access files and directories that it does not own. PRs #60270, #60278, and #60280 attempt to mitigate this with configuration changes and runtime warnings, but they leave the underlying architectural problem unaddressed.
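
The failure mode can be illustrated outside Airflow with the classic POSIX owner/group/other check. The sketch below is not Airflow code: `can_write` and `path_writable_by` are hypothetical helpers, and the check deliberately ignores root, ACLs, and capabilities.

```python
import os
import stat
from collections.abc import Collection


def can_write(mode: int, owner_uid: int, owner_gid: int,
              uid: int, gids: Collection[int]) -> bool:
    """Approximate the POSIX write check for a non-root user.

    Exactly one permission class applies: owner, then group, then other.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


def path_writable_by(path: str, uid: int, gids: Collection[int]) -> bool:
    """Apply can_write() to an existing file or directory.

    Creating or replacing a file (such as a lock file) additionally
    requires write and search permission on the parent directory.
    """
    st = os.stat(path)
    return can_write(st.st_mode, st.st_uid, st.st_gid, uid, gids)
```

For example, a lock file created by the service user (uid 1000, gid 1000) with mode 0o644 is not writable by an impersonated user with uid 1001 outside that group: `can_write(0o644, 1000, 1000, 1001, {1001})` evaluates to `False`.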

What you think should happen instead?

Resource preparation for bundles should be performed after impersonation, or in a context where only the appropriate user has write access. Impersonated users should not need elevated permissions or write access to resources initialized by the airflow service user. The fix should be holistic, ideally centralizing resource management with privilege separation and avoiding relaxed file permissions or risky group sharing.

How to reproduce

  1. Set up an Airflow deployment using Git-based DAG bundles and enable user impersonation with the run_as_user config.
  2. Configure a DAG that fetches and interacts with resources stored in a shared repository (e.g., lock files and tracking directories).
  3. Trigger the DAG with a task that switches user context.
  4. Observe that the task fails with permission errors accessing bundle-related files or directories.
  5. Attempt to mitigate by setting group-writable permissions or updating git's safe.directory; note that this works but relaxes security.
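
The workaround in step 5 can be sketched as follows, assuming the service user and the impersonated user share a group. `add_group_write` and `safe_directory_command` are hypothetical helpers, not part of Airflow; the second only builds the `git config` command and does not run it.

```python
import stat
from pathlib import Path


def add_group_write(path: Path) -> int:
    """Add the group-write bit to path and return the resulting mode bits."""
    mode = stat.S_IMODE(path.stat().st_mode)
    new_mode = mode | stat.S_IWGRP
    path.chmod(new_mode)
    return new_mode


def safe_directory_command(repo: Path) -> list[str]:
    """Build the git command that marks repo as safe for the current user."""
    return ["git", "config", "--global", "--add", "safe.directory", str(repo)]
```

Both steps widen access: every member of the shared group can now modify the bundle files, and git no longer refuses to operate on a repository owned by a different user.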

Operating System

Linux

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

  • Short-term solutions as seen in recent PRs involve group-writable directories, explicit warnings, and setting safe.directory. However, these may not be viable for all production environments, such as containerized or multi-tenant stacks.
  • A long-term refactor seems preferable, possibly using a supervisor or the Execution API to manage resource preparation, so that impersonated task users need only read access.
  • Would appreciate feedback, especially for edge-case deployments.
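
A minimal sketch of the read-only idea from the list above, assuming the privileged side prepares the bundle before the task user takes over. `prepare_bundle`, the `state` and `snapshot` directory names, and `bundle.lock` are all hypothetical and do not reflect Airflow's actual layout or API.

```python
import os
import shutil
from pathlib import Path


def prepare_bundle(src: Path, work_root: Path) -> Path:
    """Run as the service user: keep mutable state private, publish a read-only copy."""
    # Lock and tracking state stay in a directory only the owner can enter.
    state = work_root / "state"
    state.mkdir(parents=True, exist_ok=True)
    state.chmod(0o700)
    lock = state / "bundle.lock"
    lock.touch()
    lock.chmod(0o600)

    # The published snapshot is writable by the owner only; every other
    # user, including an impersonated task user, gets read and search access.
    snapshot = work_root / "snapshot"
    shutil.copytree(src, snapshot)
    snapshot.chmod(0o755)
    for root, dirs, files in os.walk(snapshot):
        for name in dirs:
            Path(root, name).chmod(0o755)
        for name in files:
            Path(root, name).chmod(0o644)
    return snapshot
```

With this split, the impersonated user never touches the lock file or tracking directory, so no group-writable permissions are needed on the bundle itself.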

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
