On OpenShift we see frequent restarts of the DagProcessor pod.
The previous logs of the crashed pod show the reason for the restart: an unhandled OSError ([Errno 116] Stale file handle) raised while closing a processor's log file handle.
2026-07-03T08:59:25.767332Z [info ] Process exited [supervisor] exit_code=<Negsignal.SIGTERM: -15> loc=supervisor.py:859 pid=11928 signal_sent=SIGTERM
2026-07-03T08:59:25.843704Z [info ] Waiting up to 5 seconds for processes to exit... [airflow.utils.process_utils] loc=process_utils.py:308
Traceback (most recent call last):
  File "/usr/local/sbin/airflow", line 10, in <module>
    sys.exit(main())
             ~~~~^^
  File "/usr/local/lib/python3.13/site-packages/airflow/__main__.py", line 55, in main
    args.func(args)
    ~~~~~~~~~^^^^^^
  File "/usr/local/lib/python3.13/site-packages/airflow/cli/cli_config.py", line 49, in command
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/airflow/utils/memray_utils.py", line 60, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/airflow/utils/cli.py", line 113, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/airflow/utils/providers_configuration_loader.py", line 54, in wrapped_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/airflow/cli/commands/dag_processor_command.py", line 64, in dag_processor
    run_command_with_daemon_option(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        args=args,
        ^^^^^^^^^^
    ...<2 lines>...
        should_setup_logging=True,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/local/lib/python3.13/site-packages/airflow/cli/commands/daemon_utils.py", line 86, in run_command_with_daemon_option
    callback()
    ~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/airflow/cli/commands/dag_processor_command.py", line 67, in <lambda>
    callback=lambda: run_job(job=job_runner.job, execute_callable=job_runner._execute),
                     ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/airflow/utils/session.py", line 100, in wrapper
    return func(*args, session=session, **kwargs) # type: ignore[arg-type]
  File "/usr/local/lib/python3.13/site-packages/airflow/jobs/job.py", line 355, in run_job
    return execute_job(job, execute_callable=execute_callable)
  File "/usr/local/lib/python3.13/site-packages/airflow/jobs/job.py", line 384, in execute_job
    ret = execute_callable()
  File "/usr/local/lib/python3.13/site-packages/airflow/jobs/dag_processor_job_runner.py", line 61, in _execute
    self.processor.run()
    ~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/airflow/dag_processing/manager.py", line 334, in run
    return self._run_parsing_loop()
           ~~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/airflow/dag_processing/manager.py", line 453, in _run_parsing_loop
    self._collect_results()
    ~~~~~~~~~~~~~~~~~~~~~^^
  File "/usr/local/lib/python3.13/site-packages/airflow/utils/session.py", line 100, in wrapper
    return func(*args, session=session, **kwargs) # type: ignore[arg-type]
  File "/usr/local/lib/python3.13/site-packages/airflow/dag_processing/manager.py", line 979, in _collect_results
    processor.logger_filehandle.close()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
OSError: [Errno 116] Stale file handle
The OSError should be caught when the file handle is stale and logged as a warning instead of being propagated. This would prevent the pod from crashing and being restarted each time.
Under which category would you file this issue?
Airflow Core
Apache Airflow version
3.2.2
What happened and how to reproduce it?
On OpenShift we see frequent restarts of the DagProcessor pod.
The previous logs of the crashed pod show the reason for the restart: an unhandled OSError ([Errno 116] Stale file handle) raised while closing a processor's log file handle.
What you think should happen instead?
The OSError should be caught when the file handle is stale and logged as a warning instead of being propagated. This would prevent the pod from crashing and being restarted each time.
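A minimal sketch of the handling we have in mind, not a patch against the Airflow source. The helper name `close_ignoring_stale_handle` and its boolean return value are hypothetical; the idea is only that a call such as `processor.logger_filehandle.close()` could be routed through something like it. Only `ESTALE` is swallowed here, on the assumption that other OSErrors should still surface.

```python
import errno
import logging

log = logging.getLogger(__name__)


def close_ignoring_stale_handle(filehandle) -> bool:
    """Close ``filehandle``; return False if the close hit a stale handle.

    Hypothetical helper for illustration. Any OSError other than ESTALE
    is re-raised unchanged.
    """
    try:
        filehandle.close()
    except OSError as err:
        if err.errno != errno.ESTALE:
            raise
        # The underlying file (for example on a network mount) is gone;
        # there is nothing left to flush, so warn and carry on.
        log.warning("Ignoring stale file handle while closing log file: %s", err)
        return False
    return True
```

With this in place, a stale handle would produce a single warning per affected file instead of terminating the DAG processor process.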
Operating System
Red Hat Fedora 5.3
Deployment
Official Apache Airflow Helm Chart
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
No response
Official Helm Chart version
1.22.0 (latest released)
Kubernetes Version
v1.29.14+41c4e9b
Helm Chart configuration
No response
Docker Image customizations
No response
Anything else?
The restarts occur multiple times a day.
Are you willing to submit PR?
Code of Conduct