Long running jobs: job output stops non-deterministically #11087

Closed
andreaskainz opened this issue Sep 16, 2021 · 3 comments

@andreaskainz

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I am not entitled to status updates or other assurances.

Summary

AWX 19.2.2 runs on a single-node k3s machine, provisioned via awx-operator 0.12.0.

We have a long-running playbook (~35 minutes) that provisions more than 30 hosts.

The playbook output (in the web UI) stops after several minutes. The playbook run seems to continue in the background, but no further output is generated. I examined the relevant table in the Postgres database (main_jobevent_XXXXXXXX_XX); its entries also stop at the same point. Because the automation-job-xx pod is destroyed when the playbook finishes, I can't say for sure that the playbook always runs to the end, but judging from my tests I think it does.
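
One way to pin down where the stored events stop is to check the exported rows for the last recorded position and for any holes before it. The following is a minimal Python sketch, assuming each row carries an integer `counter` that starts at 1 and increases by one per event; that layout is an assumption and is not confirmed in this report. `find_event_gaps` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple


def find_event_gaps(counters: Iterable[int]) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (last_counter, gaps) for a collection of event counters.

    Each gap is an inclusive (first_missing, last_missing) range.
    Assumption: counters start at 1 and normally increase by one per event.
    """
    ordered = sorted(set(counters))
    gaps = []
    expected = 1
    for value in ordered:
        if value > expected:
            # Everything between the expected and the observed counter is missing.
            gaps.append((expected, value - 1))
        expected = value + 1
    last = ordered[-1] if ordered else 0
    return last, gaps
```

A clean cut-off with no gaps would point to events no longer arriving at all, while gaps in the middle would suggest that individual events are being dropped.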

When watching the log via 'kubectl logs -f automation-job-xx', the output also stops almost every time, but usually at a later point.
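
Because the job pod disappears when the playbook finishes, it may help to stream the log into a file while the job runs, so it can be compared with the web UI afterwards. A minimal Python sketch that wraps the same `kubectl logs -f` call and adds `--timestamps`; the function names and the optional namespace argument are illustrative assumptions, not part of AWX.

```python
import subprocess
from typing import List, Optional


def build_logs_command(pod: str, namespace: Optional[str] = None) -> List[str]:
    """Build a kubectl command that follows a pod's log with timestamps."""
    cmd = ["kubectl", "logs", "-f", "--timestamps", pod]
    if namespace:
        cmd += ["-n", namespace]
    return cmd


def capture_pod_logs(pod: str, out_path: str, namespace: Optional[str] = None) -> int:
    """Stream the pod log into out_path until kubectl exits; return its exit code."""
    with open(out_path, "wb") as out:
        return subprocess.run(build_logs_command(pod, namespace), stdout=out).returncode
```

For example, `capture_pod_logs("automation-job-xx", "job.log")` would keep following the log until the pod goes away and leave everything kubectl received in `job.log`.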

I have the feeling that these stops happen when a lot of output is produced in a short time span. I don't know how the logs are collected, but it seems that something can't keep up with the output rate.
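
The suspicion about output bursts could be checked by counting log lines per second and looking at the rate just before the output stops. A rough Python sketch, assuming every log line starts with an RFC 3339 timestamp (the format `kubectl logs --timestamps` produces); `lines_per_second` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def lines_per_second(lines: Iterable[str]) -> Counter:
    """Count log lines per one-second bucket.

    Assumption: each line starts with an RFC 3339 timestamp such as
    2021-09-16T10:00:00.123456789Z. The first 19 characters
    (YYYY-MM-DDTHH:MM:SS) are used as the bucket key.
    """
    buckets = Counter()
    for line in lines:
        # Skip lines that do not look like they start with a timestamp.
        if len(line) >= 19 and line[4] == "-" and line[10] == "T":
            buckets[line[:19]] += 1
    return buckets
```

Calling `lines_per_second(open("job.log"))` and inspecting `most_common(5)` would show the busiest seconds, which can then be compared with the point where the web UI output ends.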

Another playbook (~12 minutes runtime) sometimes has truncated output as well, and sometimes finishes correctly.

AWX version

19.2.2

Installation method

kubernetes

Modifications

no

Ansible version

ansible [core 2.11.3.post0]

Operating system

CentOS 7

Web browser

Firefox

Steps to reproduce

A long running job (>10 minutes) with fast log output

Expected results

Job log output should be complete

Actual results

Job log output is truncated

Additional information

No response

@kzinas-adv

Same issue as #10211 #9961

@AlexSCorey
Member

There are changes to the job output stdout coming really soon, so hopefully those changes will resolve your issues.

@keithjgrant FYI

@nicovs

nicovs commented Oct 15, 2021

You might try my solution in: #10366 (comment)
