Replies: 10 comments 16 replies
-
Not sure if this is related, but I am also seeing an "Import error" every now and then, apparently appearing out of thin air:

```
Broken DAG: [/opt/airflow/dags/repo/dags/documents/DEV/file_pipeline.py] Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 574, in serialize_operator
    serialize_op['params'] = cls._serialize_params_dict(op.params)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 447, in _serialize_params_dict
    if f'{v.__module__}.{v.__class__.__name__}' == 'airflow.models.param.Param':
AttributeError: 'str' object has no attribute '__module__'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 935, in to_dict
    json_dict = {"__version": cls.SERIALIZER_VERSION, "dag": cls.serialize_dag(var)}
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 847, in serialize_dag
    raise SerializationError(f'Failed to serialize DAG {dag.dag_id!r}: {e}')
airflow.exceptions.SerializationError: Failed to serialize DAG 'DEV_file_pipeline': 'str' object has no attribute '__module__'
```

This disappears when I refresh the page.
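The `AttributeError` in that traceback has a simple Python-level explanation: instances of built-in types such as `str` do not expose a `__module__` attribute (only the class itself does, via the metaclass), while instances of user-defined classes do. A minimal sketch, not Airflow code, showing why a check shaped like the one in `_serialize_params_dict` blows up when a plain string reaches it (the `Param` stand-in and helper name here are hypothetical):

```python
class Param:
    """Stand-in for airflow.models.param.Param (hypothetical, for illustration)."""
    def __init__(self, default):
        self.default = default

def looks_like_param(v):
    # Same shape as the failing check in the traceback above
    return f"{v.__module__}.{v.__class__.__name__}".endswith(".Param")

# Instances of user-defined classes inherit __module__ from their class
print(looks_like_param(Param("x")))   # True

# Instances of built-in types do not expose __module__ at all
try:
    looks_like_param("plain string")
except AttributeError as e:
    print(e)  # 'str' object has no attribute '__module__'
```

This suggests the error appears when a plain string value is probed where a `Param` object is expected during serialization.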
-
related: #20648
-
Some logs should be present - can you also check whether you have enough resources, and look at the Kubernetes logs? It is likely your tasks are being killed due to lack of resources (most likely memory).
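One way to confirm the lack-of-resources theory is to look for an `OOMKilled` termination reason in the pod's container statuses. A small sketch, using hypothetical sample data shaped like `kubectl get pod <name> -o json` output so it is self-contained; against a real cluster you would feed it the actual JSON:

```python
import json

def find_oom_killed(pod_json: str):
    """Return names of containers whose last termination reason was OOMKilled."""
    pod = json.loads(pod_json)
    killed = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        terminated = cs.get("lastState", {}).get("terminated", {})
        if terminated.get("reason") == "OOMKilled":
            killed.append(cs["name"])
    return killed

# Hypothetical sample status (exit code 137 = killed by SIGKILL, typical of OOM)
sample = json.dumps({
    "status": {
        "containerStatuses": [
            {"name": "worker",
             "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
            {"name": "log-sidecar", "lastState": {}},
        ]
    }
})

print(find_oom_killed(sample))  # ['worker']
```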
-
Converting to discussion until more information provided.
-
I am also seeing an increase in the scheduler memory usage over time.
-
Which version of Airflow do you use @zorzigio? From your observation about memory I assume it is before 2.1.4? Is it "active" or "cache" memory? If it's cache, this is harmless and expected: the cache growth came from log writing, the memory will be freed as needed, and this is normal Unix behaviour in earlier versions (in 2.1.4, #18054 added a kernel advisory to not cache log file memory). Before we dive deeper I strongly advise (if my guess is right) upgrading to the latest release of Airflow. There have already been a number of fixes since then, so this is very likely to help. See this thread: #21499 (reply in thread) if you want to see testimony of others who did.
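On Linux, the "active" vs "cache" distinction can be read from `/proc/meminfo` (or from the cgroup's `memory.stat` inside a container). A minimal sketch of the check, using a hard-coded sample so it is self-contained; on a real scheduler pod you would read the file instead, and the sample values are made up:

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style lines into {field: kilobytes}."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, rest = line.partition(":")
        fields[key.strip()] = int(rest.split()[0])  # values are reported in kB
    return fields

# Hypothetical sample; on a real host use open("/proc/meminfo").read()
sample = """\
MemTotal:        8000000 kB
MemFree:          500000 kB
Cached:          4500000 kB
Active(anon):    1200000 kB
"""

info = parse_meminfo(sample)
# A large Cached figure relative to Active(anon) points at harmless
# page-cache growth rather than a genuine scheduler memory leak.
print(info["Cached"], info["Active(anon)"])
```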
-
@potiuk, thanks for looking into this. I am using the latest version of Airflow (which as of today is 2.2.3). Here is the output from
-
I think the issue you are having is about DAG serialization. You can try clearing the serialized DAGs to have Airflow reserialize the DAGs again:

```python
from airflow.models.serialized_dag import SerializedDagModel
from airflow.settings import Session

session = Session()
session.query(SerializedDagModel).delete()
session.commit()
```

When there is an issue running a task: airflow/airflow/cli/commands/task_command.py (line 331 in 5a38d15)
-
This issue seems similar to #21082
-
Any update on this?
-
Apache Airflow version
2.2.3 (latest released)
What happened
I often have tasks failing in Airflow and no logs are produced.
If I clear the tasks, it will then run successfully.
Some tasks are stuck in queue state and even if cleared, will get stuck again in queue state.
Happy to provide more details if needed.
What you expected to happen
Tasks running successfully
How to reproduce
No response
Operating System
Debian GNU/Linux 10 (buster)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.4.0
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-cncf-kubernetes==2.2.0
apache-airflow-providers-docker==2.3.0
apache-airflow-providers-elasticsearch==2.1.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==6.2.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.4.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==2.4.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.3.0
apache-airflow-providers-slack==4.1.0
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.3.0
Deployment
Official Apache Airflow Helm Chart
Deployment details
I am using the Celery Executor with KEDA enabled on Kubernetes. The node pool is set to autoscale.
Anything else
No response
Are you willing to submit PR?
Code of Conduct