standalone dag processor gets stuck when over 1k dag files #41806

@awesomescot

Apache Airflow version

2.10.0

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When I have more than about 1000 DAG files, the standalone DAG processor seems to stop functioning properly. CPU usage drops to almost zero, and the number of parsing processes is also around 0. The DagBag never fills up. The logs are unhelpful: I can't figure out what the DAG processor is doing, and it seems to be crashing silently.

What you think should happen instead?

I think the standalone DAG processor should parse the DAG files in the same time as, or less time than, the DAG processor that runs inside the scheduler.

How to reproduce

I'm not sure I can share our DAG files, but I will post my values file and would love to see whether others can reproduce this.
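
Since the real DAG files can't be shared, one rough way for others to approximate the file count is to generate synthetic DAGs. The sketch below is only an assumption about what might be enough to trigger the behavior: the function name `generate_dag_files`, the `synthetic_dag_NNNN` naming, and the trivial one-task DAG are all hypothetical and are not taken from the actual deployment. The generated files assume Airflow 2.x import paths.

```python
"""Generate many trivial DAG files to approximate a >1000-file DAG folder."""
from pathlib import Path

# Template for one minimal DAG file. This DAG is a hypothetical stand-in for
# the real (unshared) DAGs and assumes Airflow 2.x import paths.
DAG_TEMPLATE = '''\
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(
    dag_id="synthetic_dag_{index:04d}",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    EmptyOperator(task_id="noop")
'''


def generate_dag_files(target_dir: str, count: int = 1200) -> list[Path]:
    """Write `count` single-DAG files into `target_dir` and return their paths."""
    out_dir = Path(target_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(count):
        path = out_dir / f"synthetic_dag_{index:04d}.py"
        # Each file defines exactly one DAG with a unique dag_id.
        path.write_text(DAG_TEMPLATE.format(index=index))
        paths.append(path)
    return paths
```

For example, calling `generate_dag_files("dags/synthetic", count=1200)` writes 1200 files into a `dags/synthetic` folder (a placeholder path). Real DAGs are likely heavier to parse than these, so this may not reproduce the stall on its own.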

Operating System

Kubernetes (Helm chart)

Versions of Apache Airflow Providers

The ones bundled with the Helm chart.

Deployment

Official Apache Airflow Helm Chart

Deployment details

We are connecting to an RDS Postgres instance (which also shows very low CPU usage).

Anything else?

I've been experimenting with settings to try to figure out what is happening, but no luck so far. I'm happy to post any logs that would be helpful.

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct