Job error on container log rotation #16

Closed
meis4h opened this issue Oct 26, 2021 · 3 comments · Fixed by #30

Comments

meis4h commented Oct 26, 2021

Hi, thanks for this great guide!

When running jobs against a large number of hosts, we found that we would trigger ansible/awx#10366, which causes the job in AWX to exit with the message "error" and no summary, even though the EE pod still finishes the playbook in the background.

As described in the issue linked above, the error occurs when the kubelet rotates the container logs, which causes the Kubernetes log stream that AWX uses to fail.
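
One way to check whether rotation actually happened on a node is to look for rotated log files next to the live ones. The following is only a sketch: the `/var/log/pods` location and the assumption that rotated files carry a suffix after `.log` (for example a timestamp, optionally followed by `.gz`) reflect typical kubelet behavior and may differ on your setup. `count_rotated_logs` and `POD_LOG_DIR` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Count rotated container log files below a pod log directory.
# Assumption: rotated files sit next to the live "<n>.log" file and have an
# extra suffix, e.g. "0.log.20211026-101500" or "0.log.20211026-101500.gz".
count_rotated_logs() {
    # $1: directory to search; a missing directory simply yields 0
    find "$1" -type f -name '*.log.*' 2>/dev/null | wc -l | tr -d ' '
}

# Default location is an assumption; override POD_LOG_DIR if yours differs.
POD_LOG_DIR="${POD_LOG_DIR:-/var/log/pods}"
echo "rotated log files under $POD_LOG_DIR: $(count_rotated_logs "$POD_LOG_DIR")"
```

A non-zero count for the EE pod's directory around the time a job failed would be consistent with the rotation described above, though it does not prove it on its own.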

I can confirm that increasing the maximum container log size, as described in this comment (ansible/awx#10366 (comment)), works around the issue until the root cause is fixed.
We used the following command to update/install K3s with the increased maximum:

curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--kubelet-arg container-log-max-size=150Mi" sh -
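
If you would rather keep the setting in a file than on the install command line, the same options can probably be expressed in the K3s configuration file. This is a sketch under assumptions, not something tested in this thread: it assumes the default config path `/etc/rancher/k3s/config.yaml`, and `render_k3s_config` and `LOG_MAX_SIZE` are hypothetical names used only here. The script prints the YAML and does not write anything to disk.

```shell
#!/bin/sh
# Print a K3s config file carrying the same settings as the install command.
# The 150Mi default mirrors the value used in this thread; adjust as needed.
render_k3s_config() {
    log_max_size="${1:-150Mi}"
    cat <<EOF
write-kubeconfig-mode: "0644"
kubelet-arg:
  - "container-log-max-size=${log_max_size}"
EOF
}

# Preview only; nothing is written to disk here.
render_k3s_config "${LOG_MAX_SIZE:-150Mi}"

# To apply (requires root, and overwrites any existing config file), something like:
#   render_k3s_config 150Mi | sudo tee /etc/rancher/k3s/config.yaml
#   sudo systemctl restart k3s
```

Keeping the value in the config file means it survives later runs of the install script that do not pass `INSTALL_K3S_EXEC` again.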

Maybe it's worth increasing the maximum even further than 150Mi (from the 10Mi default) for bigger environments.
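
To pick a value for a bigger environment, a rough back-of-the-envelope estimate may help. The sketch below is purely illustrative: `estimate_log_mib` is a hypothetical helper, and both the host count and the per-host output size are assumptions you would need to measure yourself, since the container log can be noticeably larger than the output visible in the AWX UI.

```shell
#!/bin/sh
# Rough estimate of the container log size (in Mi) that one job might produce.
# Both inputs are assumptions to be measured in your own environment.
estimate_log_mib() {
    hosts=$1          # number of hosts targeted by the job
    kib_per_host=$2   # approximate log output per host, in KiB
    # Round up to whole Mi: ceil(hosts * kib_per_host / 1024)
    echo $(( (hosts * kib_per_host + 1023) / 1024 ))
}

# Example with made-up numbers: 2000 hosts at roughly 100 KiB of log each.
echo "container-log-max-size=$(estimate_log_mib 2000 100)Mi"
```

Adding some headroom on top of the estimate is likely sensible, because a single rotation during the job is enough to trigger the error.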

@kurokobo
Owner

Thanks for letting me know about the issue and its workaround for K3s! This is a helpful tip for anyone with a large environment.
I'm currently planning to add some troubleshooting tips to my guide, and this seems like a good piece of information to include 😃

Author

meis4h commented Oct 27, 2021

Sounds good to me 👍

Owner

kurokobo commented Feb 5, 2022

Thanks for informing me of that 😃
Added this information to the new troubleshooting guide.
