Long tasks result in logging frozen and provisioner stdout blocked #8426

Closed
howels opened this issue Nov 28, 2019 · 12 comments

@howels howels commented Nov 28, 2019

Overview of the Issue

With both the VirtualBox and QEMU builders, Packer starts running an Ansible playbook but often freezes midway through if the playbook logs a lot of data. Once frozen, Packer does not respond to CTRL+C and must be sent a "kill -9" to abort the build.

Reproduction Steps

Run the Ansible-Local provisioner and produce several thousand lines of log output.
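
The report does not include a playbook or script. As a minimal sketch for generating the volume of output described, the helper below writes several thousand lines to stdout. The function name `flood_output`, the default of 5000 lines, and the line format are all assumptions made for illustration; none of them come from the report.

```python
import sys


def flood_output(line_count=5000, stream=None):
    """Write `line_count` numbered lines to `stream` and return the count.

    Hypothetical helper: the default of 5000 lines is an assumption standing
    in for "several thousand lines" in the reproduction steps.
    """
    out = stream if stream is not None else sys.stdout
    for i in range(line_count):
        # Each line is padded so the total output is a few hundred kilobytes.
        out.write(f"log line {i:06d}: " + "x" * 80 + "\n")
    out.flush()
    return line_count
```

A script that calls `flood_output()` could be run on the guest from a playbook task, so that the provisioner has to relay all of the lines back to Packer.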

Packer version

1.4.3

Simplified Packer Buildfile

Any Ansible-local provisioner which produces a lot of output.

Operating system and Environment details

CentOS 7 VM on CentOS 7 host.

Log Fragments and crash.log files

No log output is produced: the log freezes when this occurs, even with debug logging enabled.

Member

@azr azr commented Dec 2, 2019

Hello @howels, thanks for opening this. I think this could have been fixed by #8356. I recently tagged a nightly release of Packer; could you please tell us whether the binaries you can find here work in your test case?

I think we won't, but otherwise we will need more information to debug this, such as a simplified buildfile.

@azr azr added the waiting-reply label Dec 2, 2019

Author

@howels howels commented Dec 2, 2019

You are a 🌟

Running build using this pre-1.5.0 nightly build right now.

Author

@howels howels commented Dec 2, 2019

The new binary now gives me an error whilst doing the QEMU disk resize:

==> fam: Copying hard drive...
==> fam: Resizing hard drive...
==> test: Error creating hard drive: QemuImg error: qemu-img: warning: Shrinking an image will delete all data beyond the shrunken image's end. Before performing such an operation, make sure there is no important data there.
==> test: qemu-img: Use the --shrink option to perform a shrink operation.

I have not changed QEMU versions between runs; this error occurs only with the new binary. A related bug already exists (#5969), but I am not sure why this regression appears between 1.4.3 and 1.5.0.

Log with debug enabled:

==> test: Copying hard drive...
2019/12/02 10:31:55 packer: 2019/12/02 10:31:55 Executing qemu-img: []string{"convert", "-O", "qcow2", "/home/test/packer_cache/41dd48739ee913661acfb4def2bd730f68ae18f2.iso", "test/test.qcow2"}
2019/12/02 10:31:57 packer: 2019/12/02 10:31:57 stdout:
2019/12/02 10:31:57 packer: 2019/12/02 10:31:57 stderr:
==> test: Resizing hard drive...
2019/12/02 10:31:57 packer: 2019/12/02 10:31:57 Executing qemu-img: []string{"resize", "-f", "qcow2", "test/test.qcow2", "81920"}
2019/12/02 10:31:57 packer: 2019/12/02 10:31:57 stdout:
2019/12/02 10:31:57 packer: 2019/12/02 10:31:57 stderr: qemu-img: warning: Shrinking an image will delete all data beyond the shrunken image's end. Before performing such an operation, make sure there is no important data there.
2019/12/02 10:31:57 packer: qemu-img: Use the --shrink option to perform a shrink operation.
==> test: qemu-img: Use the --shrink option to perform a shrink operation.
==> test: Error creating hard drive: QemuImg error: qemu-img: warning: Shrinking an image will delete all data beyond the shrunken image's end. Before performing such an operation, make sure there is no important data there.
==> test: qemu-img: Use the --shrink option to perform a shrink operation.
==> test: Deleting output directory...

More details are included under the other bug. I cannot test this at present because of the regression in #5969.

Author

@howels howels commented Dec 2, 2019

The above error was caused by the QEMU builder now assuming that disk sizes are in bytes instead of megabytes. Once I made this change I was greeted by another error: QEMU cannot get a write lock on the disk image once it is resized. I am continuing to test.
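
The debug log earlier in the thread shows `qemu-img resize` being called with the bare value `81920`. `qemu-img` reads a size without a suffix as bytes, so a value that was meant as megabytes becomes a request for an 80 KiB image, which is why it is treated as a shrink. A rough sketch of the arithmetic follows, assuming the old setting counted units of 1024 × 1024 bytes. The helper name is hypothetical and is not part of Packer.

```python
# Assumption: one "megabyte" in the old setting means 1024 * 1024 bytes.
MIB = 1024 * 1024


def megabytes_to_bytes(size_mb):
    """Convert a size in megabytes to a size in bytes (hypothetical helper)."""
    return size_mb * MIB


old_setting = 81920  # value passed to `qemu-img resize` in the debug log

# Read as bytes, 81920 is only 80 KiB.
print(old_setting / 1024)               # 80.0

# Read as megabytes, the same number is 80 GiB.
print(megabytes_to_bytes(old_setting))  # 85899345920
```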

Author

@howels howels commented Dec 2, 2019

I am also hitting an error with the QEMU builder: the QEMU argument overrides that worked in 1.4.3 are now ignored, resulting in duplicate arguments that conflict.

Author

@howels howels commented Dec 2, 2019

I have logged the QEMU issues regarding overrides (or the failure to override) in #8436.

Member

@azr azr commented Dec 2, 2019

Hey @howels thanks for confirming ! Because your QEMU issue is a different one I think we can close this one !

@azr azr closed this Dec 2, 2019

Author

@howels howels commented Dec 2, 2019

Sorry. I have tried to test this but the instant I tried 1.5.0 I hit breaking bugs in the QEMU implementation which prevent me from testing. There have been serious QEMU builder regressions between 1.4.3 and the 1.5.0 nightly that you posted - if these can be resolved then I can re-test.

Author

@howels howels commented Dec 2, 2019

> Hey @howels thanks for confirming ! Because your QEMU issue is a different one I think we can close this one !

Please note that I am unable to confirm if this fixes the issue.

@tjmcs tjmcs commented Dec 2, 2019

@azr, it seems to me that this issue should remain open until it can be confirmed that the nightly build you referenced does indeed fix this problem. As I understand it, @howels has not been able to confirm that the changes in that build do resolve his issue because of other bugs he has encountered in that build.

Can you please reopen this issue until @howels can confirm that the build you referenced does indeed fix this problem, @azr?

@SwampDragons SwampDragons reopened this Dec 2, 2019

Author

@howels howels commented Dec 3, 2019

Issue resolved! Thanks everyone.

@howels howels closed this Dec 3, 2019

@hashibot hashibot bot commented Jan 23, 2020

I'm going to lock this issue because it has been closed for 30 days. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@hashibot hashibot bot locked and limited conversation to collaborators Jan 23, 2020
@hashibot hashibot bot removed the waiting-reply label Jan 23, 2020