On vagrant up
with an additional private network, the Ansible provisioner randomly fails on an SSH timeout
#4860
Comments
You're using a custom inventory file ( Note that for a simple use case, I'd recommend relying on the auto-generated inventory by simply removing If you still encounter problems, could you please gist your |
I'll try to set up a testing VM. Meanwhile, could you elaborate on "but Vagrant dynamically allocates the forwarded port for SSH connections" part? Actually I don't really understand what's happening. As far as I can tell, UPD Well, it must be because ansible connects using host-only network. But I still don't understand how Vagrant "allocates the forwarded port for SSH connections." |
Yes indeed, my comment about port forwarding only referred to this host-only connection, which is Vagrant's default SSH path. In your case, you've defined an extra private (or public) network (192.168.88.10), where using the default port 22 is absolutely correct. With a quick Internet search, I found that there are several SSH issues when running Mac OS X Mavericks (and maybe affecting Yosemite as well). I could not check them, as I only have an OS X 10.9 box, but please have a look at:
If you still encounter problems, could you please provide the following information:
Hint: for faster debugging, you also can execute the
|
@x-yuri I'm closing this until we receive more information from you. |
Sorry for the delay. So, with this setup it still fails occasionally:
I thought the problem had to do with
Still, what exactly do you mean by dynamically allocating ports? The issues at those links have nothing to do with mine. The first has UPD I just confirmed that it must have to do with
UPD I'm running |
@x-yuri thanks for all the info. This is odd... What I see here, is that with your Arch Linux you're using one of the most recent versions of OpenSSH (version 6.7 was released on October 6, 2014). Ubuntu 14.04 is using What VM provider are you using actually? KVM? Virtualbox? I reopen this issue and flag it to |
@x-yuri I renamed the issue. Does |
I'm using |
Please, can you execute |
and maybe also for |
I hope there's no sensitive information there. |
@x-yuri the last example that you gisted does not fully correspond to the original problem report:
That said, I agree with your analysis that at least the "private network (192.168.88.11)" connectivity is not ready when the ansible provisioner is started. Next, could you please:
Note: I updated the issue title again (for helping internet searches). |
vagrant up
on Arch Linux Host: the private network is not available when Ansible provisioner begins
@x-yuri what are your exact versions of:
|
Well, if you expected me to gist
Sorry, I was experimenting with commenting corresponding line out and forgot to uncomment it back.
To make things clear (I'm not sure you noticed the point), it's not just about reexecuting As for the versions, Also, I wanted to make it clear that I didn't make any tweaks to the kernel, and I made only minimal adjustments to the configuration of the software I'm using, just the things I needed. Also, I hope it's not only about |
@x-yuri sorry if I was unclear, but no worries, we'll keep digging into it bravely 🏥. Also, thank you for clarifying the software versions. You're running only bleeding-edge releases 🚀, and that could be a source of "yet unknown" integration troubles, that's all.
Correct, that was my mistake, I should have asked you to run So, my questions are:
|
Just chiming in that I am experiencing this as well. I also have a custom inventory file which contains the following: The problem is intermittent and running vagrant provision manually works fine after the VM finishes booting. |
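Since a manual `vagrant provision` succeeds once the VM has finished booting, one way to check the "network not ready yet" theory is to poll the private-network SSH port before provisioning. The following is only a minimal sketch of that idea, not anything Vagrant or Ansible ships: the helper name `wait_for_port` and its timeout values are hypothetical, and the address is the private network IP mentioned earlier in this thread.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True as soon as a connection is accepted, or False if the
    deadline passes first. This only proves the port accepts TCP
    connections; it does not verify that SSH authentication works.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Assumed address: the private network IP quoted earlier in the thread.
    ready = wait_for_port("192.168.88.10", 22, timeout=60.0)
    print("SSH port reachable" if ready else "SSH port still unreachable")
```

If the port is consistently unreachable for a few seconds after `vagrant up` reports the machine as ready, that would support the timing explanation; if it is reachable immediately and Ansible still times out, the cause is probably elsewhere.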
@tkellen Which OS is it? @gildegoma This is output for failed and successful runs. It didn't occur to me that |
Multiple hosts. I've seen it on OSX and Ubuntu. |
@tkellen Thanks for your feedback, which obviously enlarges the scope of the problem (I'll update the issue title and labels again). @x-yuri Awesome work fetching the logs, thanks a lot! I've started to check and compare But for the next investigation steps I propose two kinds of possible "workarounds":
I also noticed that when
Usually, I don't think this should be a problem (SSH simply recreates the control socket file), but maybe it introduces some additional delay that possibly leads to the observed timeout error. So I think you could also try to:
It would be of great help if you could try these two custom configurations 🆘 |
Issue title changed from "vagrant up on Arch Linux Host: the private network is not available when Ansible provisioner begins" to "vagrant up with an additional private network, the Ansible provisioner randomly fails on an SSH timeout"
@tkellen To make it clear, was the error exactly "Connection timed out"?
My coworkers had similar complaints, and that's exactly the OSes they use. It's just hard to make them confirm it. As for The way it looks I'll soon be an expert at testing Just to have it stored somewhere, this command has proved to be easier to use: |
The thing is, this issue is hard to reproduce. I don't know how many times I should let it recreate the VM to be sure that the issue is gone. Supposedly the computer must be under heavy load for it to manifest itself. However, so far I haven't managed to come up with a reliable way to make it happen, let alone a fast one. And I'm not so sure now that the issue can't happen with the auto-generated inventory. So, do you have any suggestions for how to force it to happen? One of the options would be to do it from inside the VM. Also, could you point me out where in the code |
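On the question of how many clean VM recreations are enough, a rough estimate is possible under a strong simplifying assumption: each `vagrant up` fails independently with the same probability. The thread itself suggests the failure depends on host load, so treat this only as a back-of-the-envelope sketch; the function name `runs_needed` and the example rates are hypothetical.

```python
import math


def runs_needed(failure_rate, confidence=0.95):
    """Consecutive clean runs needed to rule out a given failure rate.

    If the bug still occurred with probability `failure_rate` per run,
    the chance of seeing n clean runs in a row is (1 - failure_rate)**n.
    We return the smallest n that pushes this chance below
    1 - confidence. Assumes independent runs with a constant rate.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Assumed failure rates, for illustration only.
    for rate in (0.5, 0.1):
        print(rate, runs_needed(rate))
```

For example, with an assumed failure rate of 10% per run and 95% confidence, this gives 29 consecutive clean runs, which shows why a few successful recreations say little about an intermittent failure.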
@x-yuri You did more than pretty well, and I thank you again so much for your efforts. I don't think we need any more tests on your side ❤️ (and sorry I asked for too many things; it is a bit hard to be complete in asynchronous communication).
Here are some parts that may interest you:
I'm closing this issue with the following possible enhancement ideas to be evaluated in the future:
@mitchellh what do you think? Are you open to accept pull requests implementing the above ideas? |
Not sure if it's vagrant's fault:
This error happens occasionally (not every time). So, first vagrant says the machine is up and ready, but then ansible says that it can't connect. Can this be fixed in any way? Could you suggest anything to further investigate the issue?