
azure-arm: unexpected EOF #8353

Closed
FeodorFitsner opened this issue Nov 8, 2019 · 12 comments · Fixed by #8369

Comments

@FeodorFitsner

Overview of the Issue

We are running Packer as a Docker container from the hashicorp/packer image. Packer is provisioning an Azure VM with Windows OS. The container with Packer runs as an Azure Container Instance (ACI), and both the container and the VM are in the same region. We are getting a lot of azure-arm: unexpected EOF errors with the stack trace shown in the gist below. I'd say the success rate of reaching the end of a Packer build is about 1 in 5.

Packer version

1.4.3 and 1.4.4.

Simplified Packer Buildfile

Dockerfile for building container with Packer: https://github.com/appveyor/build-images/blob/master/Dockerfile

Packer build file: https://github.com/appveyor/build-images/blob/master/vs2019.json
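
For quick reference, here is a minimal sketch of the relevant parts of that template: one azure-arm builder plus the PowerShell provisioner step. The credential variables, image details, managed image name, and script paths below are illustrative placeholders rather than values copied from vs2019.json, and most provisioners are omitted.

{
  "builders": [
    {
      "type": "azure-arm",
      "subscription_id": "{{user `azure_subscription_id`}}",
      "client_id": "{{user `azure_client_id`}}",
      "client_secret": "{{user `azure_client_secret`}}",
      "tenant_id": "{{user `azure_tenant_id`}}",
      "location": "{{user `azure_location`}}",
      "vm_size": "{{user `azure_vm_size`}}",
      "os_type": "Windows",
      "image_publisher": "MicrosoftWindowsServer",
      "image_offer": "WindowsServer",
      "image_sku": "2019-Datacenter",
      "managed_image_resource_group_name": "{{user `azure_resource_group`}}",
      "managed_image_name": "vs2019-image",
      "communicator": "winrm",
      "winrm_use_ssl": true,
      "winrm_insecure": true,
      "winrm_timeout": "30m",
      "winrm_username": "packer"
    }
  ],
  "provisioners": [
    {
      "type": "powershell",
      "scripts": [
        "{{template_dir}}/scripts/Windows/install_powershell_get.ps1",
        "{{template_dir}}/scripts/Windows/install_wsl.ps1"
      ]
    }
  ]
}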

Operating system and Environment details

Packer runs as a container from hashicorp/packer:1.4.3 and hashicorp/packer:1.4.4. The container runs on Azure Container Instances.

Log Fragments and crash.log files

https://gist.github.com/FeodorFitsner/1d9d834c6ee4dd9db428ab3e7f8eb1ff

@SwampDragons
Contributor

SwampDragons commented Nov 8, 2019

Thanks for opening. I have a couple of follow-up questions for you:

  • Is this happening exclusively with the Azure builder? I see in your template you have google and amazon builders as well.

  • Is this regularly happening in a specific place in your build (i.e. a specific provisioner, provisioner type, or script), or does it seem random where in the build run it happens?

  • Does it seem to happen when uploading things of a specific size?

  • Does it seem to happen after a certain amount of time has elapsed during the build?

@FeodorFitsner
Author

We are running that build mostly on Azure and Hyper-V now; I haven't tried it yet on AWS and GCE. On Hyper-V, though, it doesn't happen nearly as often.

Interestingly, it usually fails in one of these two places:

More often:

==> azure-arm: Provisioning with powershell script: /build-images/scripts/Windows/install_powershell_get.ps1
Build 'azure-arm' errored: unexpected EOF

install_powershell_get.ps1

Less often:

==> azure-arm: Provisioning with powershell script: /build-images/scripts/Windows/install_wsl.ps1
Build 'azure-arm' errored: unexpected EOF

install_wsl.ps1

Link to the build script: https://github.com/appveyor/build-images/blob/master/vs2019.json#L88

And yes, both scripts come at the beginning of the build. Usually, if it passes that point, there is a 95% chance the build will succeed (the entire build takes between 5 and 6 hours).

I'm not sure about the uploading part of the question. Could you please elaborate?

Thanks in advance for any clue about what might be wrong and how to make it more stable.

@FeodorFitsner
Author

Just ran it again and it failed in the same place:

==> azure-arm: Provisioning with powershell script: /build-images/scripts/Windows/install_powershell_get.ps1
Build 'azure-arm' errored: unexpected EOF

==> Some builds didn't complete successfully and had errors:
--> azure-arm: unexpected EOF

==> Builds finished but no artifacts were created.
panic: runtime error: invalid memory address or nil pointer dereference
2019/11/08 20:28:34 packer: [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x6ab3e6]
2019/11/08 20:28:34 packer: 

@SwampDragons
Contributor

In both cases you shared, the crash is happening while we're using WinRM to upload your PowerShell scripts to your VM.

I was wondering whether this is some kind of buffer-size situation that only shows up with larger files, but at a whopping 289 bytes I guess that's not the issue 😂

This looks like it's happening within the WinRM library we use, and is probably a duplicate of #7350 and possibly #8229. I've never been able to reproduce these intermittent errors from my own setup. I'll try from the Packer Docker container and see if I can reproduce it there... maybe there's something going on with resource constraints inside a container.

@FeodorFitsner
Author

Resource constraints could be the reason. Right now it's a container with 2 CPU cores and 4 GB of RAM; however, I remember there were more crashes with 2 GB of RAM. I'm going to try a different container size.

@SwampDragons
Contributor

That sure feels like it ought to be enough, but worth verifying.

@SwampDragons
Contributor

I don't have much new to add yet, but I think these four issues are all duplicates of each other:

#7685
#8353
#8229
#7350

@SwampDragons
Contributor

@FeodorFitsner I think I've tracked down this bug. The patched build here should solve it: https://circleci.com/gh/hashicorp/packer/21445#artifacts/containers/0; it comes from the PR linked above.

@FeodorFitsner
Author

Fantastic, I'm going to give it a try! Thank you for not giving up on this issue!

While trying the artifact from that build, I noticed that the packer_windows_amd64.zip archive contains a pkg\packer_windows_amd64 file inside, which I assumed should be renamed to packer.exe. Also, the resulting executable is about 40 MB smaller than packer.exe from the official distribution, and it's slower to start. That's expected behavior for these build artifacts, right?

@SwampDragons
Contributor

I'm not sure why it would be slower to start, but the size difference doesn't surprise me.

@FeodorFitsner
Author

I've just tested the patched Packer and it worked like a charm! It was run from a container with 1 core and 2 GB of memory while provisioning a Windows instance on Azure. I ran 3 jobs in a row, 5 hours each, and they all finished successfully! Thanks for fixing that!

@ghost

ghost commented Jan 23, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@hashicorp hashicorp locked and limited conversation to collaborators Jan 23, 2020