When running jobs against a large number of hosts, we found that we would trigger ansible/awx#10366, which causes the Job in AWX to exit with the message "error" and no summary, even though the EE pod still finishes the playbook in the background.
As described in the issue linked above, the error occurs when the kubelet rotates the container logs, which causes the Kubernetes log stream that AWX uses to fail.
I can confirm that increasing the maximum container log size, as described in this comment ansible/awx#10366 (comment), works around the issue until the root cause is fixed.
We used the following command to update/install K3s with the increased maximum:
curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--kubelet-arg container-log-max-size=150Mi" sh -
Maybe it's worth increasing the maximum even further than 150Mi (from the 10Mi default) for bigger environments.
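For anyone who wants to try a different limit, here is a minimal sketch that makes the size a parameter and prints the resulting install command for review instead of running it. The `LOG_MAX_SIZE` variable and the `k3s_exec_args` helper are hypothetical names used only for illustration; the K3s installer variables and the `container-log-max-size` kubelet argument are the ones from the command above.

```shell
#!/bin/sh
# LOG_MAX_SIZE is a hypothetical variable for this sketch.
# 150Mi is the value from the command above; raise it for larger environments.
LOG_MAX_SIZE="${LOG_MAX_SIZE:-150Mi}"

# Hypothetical helper: build the argument string handed to INSTALL_K3S_EXEC.
k3s_exec_args() {
  printf '%s' "--kubelet-arg container-log-max-size=$1"
}

# Print the install command rather than executing it, so it can be checked first.
echo "curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE=\"644\" INSTALL_K3S_EXEC=\"$(k3s_exec_args "$LOG_MAX_SIZE")\" sh -"
```

Running it as `LOG_MAX_SIZE=300Mi sh ./print-k3s-install.sh` (the script name is arbitrary) prints the same command with the larger limit, which can then be pasted into a shell on the K3s host.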
Thanks for letting me know about the issue and its workaround for K3s! This is a helpful tip for anyone with a large environment.
I'm currently planning to add some troubleshooting tips to my guide. This seems like a good piece of information to add to it 😃