-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TaskInstances do not succeed when using enable_logging=True option in DockerSwarmOperator #28452
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
enable_logging=False, |
@maxim317, yes disabling logging is a workaround. Ideally we would like the logs though in airflow. |
Marked it as good first issue, hopefully someone wil take a look at this one. |
can I work on this? |
Feel free, but there is a linked PR for that so better ask there. |
yeah I see the pull request recently updated and active. my bad! |
…ogging=True option in DockerSwarmOperator" It introduces logging of Docker Swarm services which was previously not working.
…ogging=True option in DockerSwarmOperator" It introduces logging of Docker Swarm services which was previously not working.
* Fixes #28452: "TaskInstances do not succeed when using enable_logging=True option in DockerSwarmOperator" It introduces logging of Docker Swarm services which was previously not working. * tty=True/False to be chosen by the user as it was the case before this fix (#28452) * Follow=true for logs will always result in tasks not ending. This is standard and provided upstream by the Docker API. Therefore in DockerSwarmOperator follow is always false. * service_logs called multiple times as we continuously poll the Docker API for logs. As we indicated in the previous commmit, the docker client malfunctions when we try to get the logs with follow=True. Therefore we make multiple calls to the API (every 2 seconds), to fetch the new logs. * service_logs called multiple times. In this test the tasks increase (6 instead of 5) as we check if the service has terminated (+1). As this assertion makes less sense in a situation where we do multiple calls to the Docker API (polling), we might think of removing it or replacing it with something more suitable. * Final commit of this PR marking the test case that validates logging in the Docker Swarm Operator. We log two times a different message and we assert that the two lines are given back in the logs in the expected sequence. * Formatting ruff * Reverting as Github actions don't run this test as a swarm node: docker.errors.APIError: 503 Server Error for http+docker://localhost/v1.43/services/create: Service Unavailable ("This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.") Revert "Final commit of this PR marking the test case that validates logging in the Docker Swarm Operator. We log two times a different message and we assert that the two lines are given back in the logs in the expected sequence." This reverts commit 048ba1e. * Logging "since" timestamp to avoid memory issues Fix for #28452 * Formatting - Fix for #28452
* Fixes #28452: "TaskInstances do not succeed when using enable_logging=True option in DockerSwarmOperator" It introduces logging of Docker Swarm services which was previously not working. * tty=True/False to be chosen by the user as it was the case before this fix (#28452) * Follow=true for logs will always result in tasks not ending. This is standard and provided upstream by the Docker API. Therefore in DockerSwarmOperator follow is always false. * service_logs called multiple times as we continuously poll the Docker API for logs. As we indicated in the previous commmit, the docker client malfunctions when we try to get the logs with follow=True. Therefore we make multiple calls to the API (every 2 seconds), to fetch the new logs. * service_logs called multiple times. In this test the tasks increase (6 instead of 5) as we check if the service has terminated (+1). As this assertion makes less sense in a situation where we do multiple calls to the Docker API (polling), we might think of removing it or replacing it with something more suitable. * Final commit of this PR marking the test case that validates logging in the Docker Swarm Operator. We log two times a different message and we assert that the two lines are given back in the logs in the expected sequence. * Formatting ruff * Reverting as Github actions don't run this test as a swarm node: docker.errors.APIError: 503 Server Error for http+docker://localhost/v1.43/services/create: Service Unavailable ("This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.") Revert "Final commit of this PR marking the test case that validates logging in the Docker Swarm Operator. We log two times a different message and we assert that the two lines are given back in the logs in the expected sequence." This reverts commit 048ba1e. * Logging "since" timestamp to avoid memory issues Fix for #28452 * Formatting - Fix for #28452 * Fixes #28452: "TaskInstances do not succeed when using enable_logging=True option in DockerSwarmOperator" It introduces logging of Docker Swarm services which was previously not working. * tty=True/False to be chosen by the user as it was the case before this fix (#28452) * Follow=true for logs will always result in tasks not ending. This is standard and provided upstream by the Docker API. Therefore in DockerSwarmOperator follow is always false. * service_logs called multiple times as we continuously poll the Docker API for logs. As we indicated in the previous commmit, the docker client malfunctions when we try to get the logs with follow=True. Therefore we make multiple calls to the API (every 2 seconds), to fetch the new logs. * service_logs called multiple times. In this test the tasks increase (6 instead of 5) as we check if the service has terminated (+1). As this assertion makes less sense in a situation where we do multiple calls to the Docker API (polling), we might think of removing it or replacing it with something more suitable. * Final commit of this PR marking the test case that validates logging in the Docker Swarm Operator. We log two times a different message and we assert that the two lines are given back in the logs in the expected sequence. * Formatting ruff * Reverting as Github actions don't run this test as a swarm node: docker.errors.APIError: 503 Server Error for http+docker://localhost/v1.43/services/create: Service Unavailable ("This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.") Revert "Final commit of this PR marking the test case that validates logging in the Docker Swarm Operator. We log two times a different message and we assert that the two lines are given back in the logs in the expected sequence." This reverts commit 048ba1e. * Logging "since" timestamp to avoid memory issues Fix for #28452 * Formatting - Fix for #28452 * Fix bugs (#28452): Correctly assign last_line_logged, last_timestamp
Apache Airflow Provider(s)
docker
Versions of Apache Airflow Providers
apache-airflow-providers-celery==3.1.0
apache-airflow-providers-docker==3.3.0
Apache Airflow version
2.5.0
Operating System
centos 7
Deployment
Other Docker-based deployment
Deployment details
Running an a docker-swarm cluster deployed locally.
What happened
Same issue as #13675
With logging_enabled=True the DAG never completes and stays in running.
When using DockerSwarmOperator together with the default enable_logging=True option, tasks do not succeed and stay in state running. When checking the docker service logs I can clearly see that the container ran and ended successfully. Airflow however does not recognize that the container finished and keeps the tasks in state running.
What you think should happen instead
DAG should complete.
How to reproduce
Docker-compose deployment:
DAG code:
Anything else
No response
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: