virtualbox: intermittent machine create fails #479
Comments
thanks for reporting. I just confirmed this is working. Can you paste the exact |
@ramschmaerchen there was a bug with the way the cli args were processed. could you try this again from master and see if it still errors? thx |
Ok, I tried to investigate this issue further. First of all, master did not fix it.
Specs
Investigation
Step by step
Conclusion
The only thing I could see was that non-working machines generate a new certificate in the docker init process on every boot.
Side effects
Both working and non-working machines fail on IPv6 |
IPv6 support in machine for VirtualBox doesn't exist; I'm not sure anyone has tried it yet. But we should look into why this is failing. I'd also like to see unified arguments for drivers. There's no reason drivers can't share flags. Things like disk size, RAM and auth credentials could be reused. |
+1 sthulb @ramschmaerchen could you also run the command with debug ( |
Sure. I censored keys and stuff but made sure the keys and names are the same (same name for same key).
Working machine
INFO[0001] Creating SSH key...
Generating public/private rsa key pair.
END SSH
DEBU[0066] generating server cert: /Users/thisuser/.docker/machines/dev/server.pem
DEBU[0068] executing: ssh -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -p 49178 -i /Users/thisuser/.docker/machines/dev/id_rsa docker@localhost echo "-----BEGIN RSA PRIVATE KEY-----
DEBU[0068] executing: ssh -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -p 49178 -i /Users/thisuser/.docker/machines/dev/id_rsa docker@localhost echo "-----BEGIN CERTIFICATE-----
DEBU[0068] executing: VBoxManage showvminfo dev --machinereadable
END SSH
DEBU[0068] executing: ssh -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -p 49178 -i /Users/thisuser/.docker/machines/dev/id_rsa docker@localhost echo "EXTRA_ARGS='--tlsverify |
The non-working machine is not generating certs and keys upon creation.
Non-working machine
INFO[0001] Creating SSH key... Generating public/private rsa key pair. |
Thanks. Could you also post the actual machine command line you are running? |
Sure, nothing fancy: Edited the comments above. |
Ok. Is this from the RC2 or master? Also, can you try to remove (or backup) the existing I guess the obvious question is what is the difference between machines? OS version, arch, VirtualBox version, etc? |
Apart from the original entry, I always used the master branch from 4th Feb 2015. Sorry for not mentioning it: removing .docker does not fix anything. I did not edit the original message, as I was trying with RC2 then. Here is what I tried between creating new machines:
I created 100 machines over the last two days and was not able to figure out any pattern. |
Ok, so just to confirm, it is working sometimes? If that's the case, what is your host OS (version) and the VirtualBox version of the machine that isn't working? |
Yes, sometimes it is working, sometimes it is not, as mentioned in the comments. Specs are in the third comment. I added VirtualBox information. Quick access: VirtualBox 4.3.20 r96996 |
After several tests I can confirm that RC2 is NOT working any better if .docker has been deleted before creating new machines. |
I have, with a decent enough level of frequency to be considered worrisome, gotten that "Bad port '0'" error from VBox machines as well. |
Same here, sometimes it works, most of the time not.
For example, I made a script that creates 5 machines with the virtualbox driver (one is going to be the swarm master, the others agents) for a demo. Creation of the first machine is fine, but it then fails most of the time on the second one (hanging on Waiting for VM to start...). Cleaning $HOME/.docker, $HOME/.config/VirtualBox, remove dhcpservers ( |
@vdemeester are you adding delays in the script? If you create the VMs too quickly, VirtualBox will produce that exact error. For example, in @nathanleclaire's multiple-instance PR, he doesn't launch them concurrently but rather queues them because of this behavior. |
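The queueing approach described above can be sketched in Python. This is only an illustration, not code from docker-machine or from the PR mentioned: the helper names `build_create_command` and `create_sequentially` and the 10-second default delay are assumptions, and the only real command used is `docker-machine create --driver virtualbox <name>`.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def build_create_command(name: str, driver: str = "virtualbox") -> List[str]:
    """Return the argv for creating one machine with the given driver."""
    return ["docker-machine", "create", "--driver", driver, name]


def create_sequentially(
    names: Sequence[str],
    delay_seconds: float = 10.0,  # assumed value; tune for your host
    run: Callable[[List[str]], int] = subprocess.call,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Create machines one at a time, pausing between consecutive creates.

    Returns the names whose create command exited with a non-zero status.
    """
    failed: List[str] = []
    for index, name in enumerate(names):
        if index > 0:
            sleep(delay_seconds)  # give VirtualBox time to settle
        if run(build_create_command(name)) != 0:
            failed.append(name)
    return failed
```

Calling `create_sequentially(["swarm-master", "node1", "node2"])` would run the three creates back to back with a pause in between. The `run` and `sleep` parameters exist only so the loop can be exercised without VirtualBox installed. As later comments show, a delay alone may not make creation reliable.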
@ehazlett Yes, I added a delay of 5 to 10s and it's working a little better: a few of the nodes were created, but in the end I got the same error for one node. What's strange is that docker-machine is waiting for the VM to start, but if I look at the VirtualBox machine (using the UI), I see that boot2docker is started. It seems that, for a reason I don't know, docker-machine does not detect that it's started in some cases. I'll try to take a look at the driver code (my golang knowledge is kinda small, but :P) to see how it's checking the availability of the VM. |
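One way to cross-check what VirtualBox itself reports while docker-machine is still waiting is to read the `VMState` field from `VBoxManage showvminfo <name> --machinereadable` (the command that appears in the debug logs earlier in this thread). A minimal sketch, assuming the output consists of `key="value"` lines; the function names are hypothetical, and this is not a description of how the driver itself checks availability.

```python
import subprocess
from typing import Dict, Optional


def parse_machinereadable(output: str) -> Dict[str, str]:
    """Parse key="value" lines into a dict, stripping surrounding quotes."""
    info: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not key=value pairs
        info[key.strip().strip('"')] = value.strip().strip('"')
    return info


def vm_state(output: str) -> Optional[str]:
    """Return the VMState value from showvminfo output, or None if absent."""
    return parse_machinereadable(output).get("VMState")


def query_vm_state(name: str) -> Optional[str]:
    """Ask VirtualBox for the state of the named VM (requires VBoxManage)."""
    result = subprocess.run(
        ["VBoxManage", "showvminfo", name, "--machinereadable"],
        capture_output=True,
        text=True,
        check=True,
    )
    return vm_state(result.stdout)
```

If `query_vm_state("node3")` reports the VM as running while docker-machine still waits, that would suggest the VM booted and the problem is further along, for example in obtaining an IP or reaching SSH.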
@vdemeester Sorry to hear about the trouble. Could you please post the following information when you get a chance (it will help greatly in debugging):
Thanks! |
@vdemeester that's the same behavior I've seen too. You can see the thumbnail of it running, but it never gets an IP. I think it would be beneficial to add a flag to not make the VM headless -- then we could at least log in via the console to debug. |
@nathanleclaire Yes :) The host is Arch Linux, fully up to date (as it's a rolling release, there is no real way to point to a version). It makes me think I have not tried my script on Ubuntu 14.04, for example; I'll keep you updated when I do that (this weekend probably).
$ VBoxManage --version
4.3.22_OSEr98236
$ pacman -Q | grep virtual # It's Archlinux
virtualbox 4.3.22-2
virtualbox-host-modules 4.3.22-1
The output is big (and I did not redirect it to a file, so for now it's just the end part: creation of node3 and node4), so I posted a gist that I'll update: https://gist.github.com/vdemeester/452b7455ac85d904d0c0 . I also added the script I'm using to run it. Last time I tried I wasn't on the same network (@work vs @home now..) and here it's failing on node4, while @work it was on node3 each time. No idea why :). I'll re-run the script from a clean environment with the output redirected to a file and will update the gist. |
Hi, I just tried docker-machine and I'm encountering the same issue. I'm running this on Linux Mint; here are the details.
--8<-- snip -----------------------------------------------
--8<-- snip -----------------------------------------------
My VirtualBox version is 4.3.22 r98236
Other info:
Here's the output . . .
--8<-- snip -----------------------------------------------
johnzan@johnzan-lxc-wks ~ $ docker-machine create --driver virtualbox dev
^C
johnzan@johnzan-lxc-wks ~ $ docker-machine ls
--8<-- snip -----------------------------------------------
The binary is the 64-bit Linux binary downloaded from the links provided in the documentation. Note that I created a symlink from ~/bin to the actual binary.
--8<-- snip -----------------------------------------------
johnzan@johnzan-lxc-wks ~ $ docker-machine -v
johnzan@johnzan-lxc-wks ~/bin $ ls -l
--8<-- snip -----------------------------------------------
Hope that this info is of use. Regards, |
It might not be related (and it doesn't reference this issue), but it may help with debugging: #775 |
@ljrittle could you try this build to see if it helps? https://public.evanhazlett.com/docker-machine/vbox-intel-nic/ |
Thank you @ehazlett. I can confirm that this version works on my machine. I ran "docker-machine_darwin-amd64 create -d virtualbox dev" multiple Regards,
|
@ehazlett : I can now inform you that a recent docker/master (the one On Tue, May 19, 2015 at 10:56 AM, Loren James Rittle ljrittle@gmail.com wrote:
|
Thank you @ehazlett, I had the same issue on a fresh Yosemite install and your build at https://public.evanhazlett.com/docker-machine/vbox-intel-nic/ fixed it |
Still not working for me. This is the state before I created a machine with docker-machine.
$ VBoxManage list hostonlyifs
Name: vboxnet1
$ VBoxManage list dhcpservers
Last debug message: |
@johnkeenleyside can you try the latest master? it has several fixes including the nic driver in. You can get the latest builds from https://docker-machine-builds.evanhazlett.com/latest/. Thanks! |
I just built master, 0.3.0-dev (0a0bbf7), here is the tail end of
At this point,
|
Hey @jfieber, is the issue you are seeing the daemon not starting up again (typified by the endless "connection refused" on 2376 errors?) I am seeing that issue as well; a |
Indeed, the daemon is not running, but it looks like it was started. The log shows it registering all the URL path handlers, setting up iptables, and then the end of
|
This last case, where the daemon appears to start but then exits, is also reported as #1028. |
I'm running into the same issue:
DEBU[0036] STDERR:
END SSH
DEBU[0111] Got an error it was dial tcp 192.168.99.108:2376: operation timed out
I installed docker-machine using the command below:
UPDATE: It worked after I deleted the host interfaces |
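Deleting the host-only interfaces, as in the update above, can be scripted. A rough sketch, assuming `VBoxManage list hostonlyifs` prints one `Name:` line per interface (as in the output quoted earlier in the thread) and that `VBoxManage hostonlyif remove <name>` removes one. The helper names are made up for illustration. Removing an interface affects every VM attached to it, so review the generated commands before running any of them.

```python
from typing import List


def hostonly_interface_names(listing: str) -> List[str]:
    """Extract interface names from `VBoxManage list hostonlyifs` output."""
    names: List[str] = []
    for line in listing.splitlines():
        # Only lines that begin with "Name:" carry the interface name;
        # "VBoxNetworkName:" lines are deliberately not matched.
        if line.startswith("Name:"):
            names.append(line.split(":", 1)[1].strip())
    return names


def removal_commands(listing: str) -> List[List[str]]:
    """Build one removal argv per interface; nothing is executed here."""
    return [
        ["VBoxManage", "hostonlyif", "remove", name]
        for name in hostonly_interface_names(listing)
    ]
```

The functions only build the command lists. Passing each one to `subprocess.call` is left to the caller.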
full log for full log for |
I was facing the same issue and found that virtualization was not enabled in the BIOS of the machine on which I was trying to create the docker swarm container. I enabled VT-x on the host machine and it works fine now. |
For me i got it working only after downgrading msysgit to i tried this because on one of my notebooks everything worked fine, on the other it didn't. the one where it wasn't working was running msysgit at version with the msysgit version i also encountered #1430 but after downgrading i got the |
I am getting the error below while running two nodes with docker-machine. Kindly help.
$ docker-machine ls
I'm not even able to get the URL to point to... Does anybody have a solution for this? |
@ehazlett I am facing a similar issue. The swarm machine can't get a URL. I have copied this patch https://docker-machine-builds.evanhazlett.com/latest/. Could someone help with it?
[root@jackswarm ~]# docker-machine ls
[root@jackswarm ~]# rm -fr .docker/
STDERR:
STDERR:
STDERR:
STDERR:
executing: /usr/bin/VBoxManage modifyvm swarm-master --natpf1 ssh,tcp,127.0.0.1,41827,,22
STDERR: |
I can confirm that the vbox-intel-nic version solves this issue for me, but the latest (b101c29) does not.
|
FWIW, I arrived here getting the same problem on Docker Quickstart Terminal bootup (after it was working fine a few days back). I deleted the VM in VirtualBox and restarted my MacBook Pro, then re-ran the quickstart terminal. It launched this time, but I could not connect in the Mac terminal. It turns out it was simply that the IP .101 was assigned instead of the .100 I had in the DOCKER_HOST env variable in ~/.bash_profile. After editing that and sourcing it, I'm back in business. |
@phpguru Thanks, that was the right hint... I followed a Mac-Docker-Tutorial months ago and also edited the ~/.bash_profile - now with the right DOCKER_HOST, DOCKER_CERT_PATH and DOCKER_MACHINE_NAME exported everything runs fine again. |
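The stale `DOCKER_HOST` problem described in the two comments above can be caught with a small check. This is a sketch under assumptions: the function names are hypothetical, the addresses in the usage note are example values, and the machine's current IP is expected to be supplied by the caller (for instance from the output of `docker-machine ip <name>`).

```python
from typing import Optional
from urllib.parse import urlparse


def docker_host_ip(docker_host: str) -> Optional[str]:
    """Return the host part of a DOCKER_HOST value such as tcp://1.2.3.4:2376."""
    return urlparse(docker_host).hostname


def docker_host_matches(docker_host: str, machine_ip: str) -> bool:
    """True when DOCKER_HOST points at the IP the machine currently has."""
    return docker_host_ip(docker_host) == machine_ip.strip()
```

For example, `docker_host_matches("tcp://192.168.99.100:2376", "192.168.99.101")` is false, which is the situation reported above. Running `docker-machine env <name>` prints the export lines for the machine's current address, which avoids hard-coding the value in `~/.bash_profile`.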
When running Docker QuickStart Terminal I ran into a similar issue. Luckily I found thread #1591, which solved my problem. Someone suggested, as a workaround, prepending the following argument to every docker-machine command: --native-ssh, e.g
Good luck! |
I had a corrupt ~/.ssh/config. Once I corrected it, I was able to create machines like before. One way to check this is to try to ssh to some other server and see if it works, to make sure docker-machine isn't failing due to ssh. |
This issue is very long and contains a lot of digressions. If someone continues to encounter similar problems, please open a new issue at https://github.com/docker/machine/issues/new with detailed information including:
Thanks!
|
If running docker-machine create with --virtualbox-disk-size on OSX, no new docker machines can be generated anymore. To resolve, .docker needs to be deleted. After deleting .docker, creating a machine works flawlessly again. Please see my first comment for updated information.
Behaviour
Most of the time, docker-machine --virtualbox-disk-size waits endlessly for the VM to start, although it was already started (according to VirtualBox). After stopping the start process, docker ls shows the instance as running, but without a URL: