Forwarded ports not being released with VMWare Fusion 8.5.3 #8130

Closed
amacnamara opened this Issue Dec 16, 2016 · 23 comments

@amacnamara

amacnamara commented Dec 16, 2016

Vagrant version

Vagrant 1.9.1

Host operating system

macOS 10.12.1 (16B2659)

Guest operating system

Red Hat Enterprise Linux Server release 6.7 (Santiago)

Vagrantfile

Vagrant.configure("2") do |config|

  # Hostmanager Settings
  config.hostmanager.enabled            = true    # Enable HostMan Plugin
  config.hostmanager.manage_host        = true    # Update Host's host file
  config.hostmanager.ignore_private_ip  = false
  config.hostmanager.include_offline    = true

  # Core Configurations
  config.vm.box            = "rcs_class_a_oel66_1.5.2"           # Specify the Vagrant Base Box by name
  config.vm.host_name      = "eaton.rcsdev.net"                  # Specify a hostname for the virtual machine
  config.ssh.forward_agent = true                                # Forward ssh keys
  config.ssh.forward_x11   = true                                # Forward X11
  config.vm.network "forwarded_port", guest: 7001,  host: 7001   # Forward WebLogic Admin Console Port
  config.vm.network "forwarded_port", guest: 4444,  host: 4444   # Forward RIDC Port
  config.vm.network "forwarded_port", guest: 16300, host: 16300  # Forward URM Port
  config.vm.network "forwarded_port", guest: 1521,  host: 1521   # Forward JDBC Port

  # Provisioning
  config.vm.provision "shell", path: "setup.sh"

  # VMware specific
  config.vm.provider "vmware_fusion" do |v|
    v.vmx["memsize"] = "8192"
    v.vmx["numvcpus"] = "4"
    v.vmx["ethernet0.virtualDev"] = "vmxnet3"
  end

end

Expected behavior

When the machine is shut down, the forwarded ports should be freed up for use by other VMs. The port forwarding entries in /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf should be removed.

Actual behavior

The port forwarding entries in /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf are not removed. Vagrant catches the collision on the SSH port and grabs another port to use, but eventually there are a lot of stale entries in nat.conf:

2201 = 192.168.225.136:22
2222 = 192.168.225.128:22
2204 = 192.168.225.145:22
2200 = 192.168.225.136:22
2202 = 192.168.225.140:22
2206 = 192.168.225.148:22
# VAGRANT-BEGIN: /Users/amacnamara/src/eat_urm_puppet/.vagrant/machines/default/vmware_fusion/8155631a-1c29-42a1-9a4f-0567011f98bd/oel66.vmx
2207 = 192.168.225.148:22
# VAGRANT-END: /Users/amacnamara/src/eat_urm_puppet/.vagrant/machines/default/vmware_fusion/8155631a-1c29-42a1-9a4f-0567011f98bd/oel66.vmx

This also causes other forwarded ports to collide, preventing vagrant up from running. In the Vagrantfile above, port 7001 is set up in nat.conf and never released, so subsequent vagrant up runs fail.

Steps to reproduce

  1. Configure forwarded port in Vagrantfile
  2. vagrant up
  3. vagrant halt
  4. Reboot host
  5. vagrant up
  6. port collision
@xenithorb

xenithorb commented Dec 19, 2016

I'm also seeing this with libvirt on Fedora 25, Vagrant 1.8.5. I believe it's related.

My workaround is to run killall -u user ssh in between sessions, since the forwarding seems to go via SSH.

@Moncky

Moncky commented Jan 13, 2017

Adding my voice to @amacnamara's: I am also seeing this behaviour on Vagrant 1.8.7 and VMware Fusion 8.5.3.

To work around it, I am manually clearing out the nat.conf file and restarting the networking with vmnet-cli, which is a complete pain.

@chronon

chronon commented Jan 16, 2017

Same issue for me with Vagrant 1.9.1 and VMware Fusion 8.5.3, though rebooting my Mac seems to allow a VM to boot. When a VM doesn't boot, it hangs at either "Waiting for machine to boot..." or "SSH auth method: private key".

The workaround mentioned by @Moncky usually works for me too:

  1. Comment all entries under [incomingtcp] in /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf
  2. sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --stop
  3. sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --start
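
The manual edit in step 1 could be scripted with a small helper like the one below (a hypothetical sketch, not part of Vagrant or the Fusion plugin), which comments out every forward under [incomingtcp] while leaving other sections alone; the vmnet-cli --stop/--start restart still has to be done separately, as in steps 2 and 3:

```ruby
# Hypothetical helper: given the text of a nat.conf, comment out every
# "port = ip:port" entry in the [incomingtcp] section, leaving all
# other sections untouched.
def comment_out_incomingtcp(conf)
  in_section = false
  conf.lines.map do |line|
    case line
    when /^\[(.+)\]/
      in_section = (Regexp.last_match(1) == "incomingtcp")
      line
    when /^\d+\s*=/
      in_section ? "##{line}" : line
    else
      line
    end
  end.join
end
```
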
@mightyoj

mightyoj commented Feb 1, 2017

This issue is also happening on OS X 10.11.6

@robosung

robosung commented Feb 8, 2017

Had the same issue; the chronon/Moncky approach worked for a while but then stopped working... I think it broke after I interrupted a vagrant up with Ctrl-C. I was able to get it working again by:

  1. Renaming ~/.vagrant.d to ~/.vagrant.bak (so a new one is created)
  2. Reinstalling the vagrant vmware provider plugin and license (the license file is in the original ~/.vagrant.bak)
  3. Cleaning the nat.conf file as mentioned above
  4. Running vagrant up, confirming that it works, then deleting ~/.vagrant.bak
@chronon

chronon commented Feb 21, 2017

For reference, this issue appears to be a duplicate of #7948.

@chrisroberts

Member

chrisroberts commented Mar 2, 2017

This should be resolved within the latest vagrant vmware plugin release. Cheers!

@StefanScherer

Contributor

StefanScherer commented Mar 7, 2017

Seems I'm late to the party. Just encountered this problem: all ports up to 2250 are occupied, using Vagrant 1.9.2 + vagrant-vmware-fusion 4.0.17.
I just updated to 4.0.18 and tried to spin up a new Vagrant box, but there are still no free port forwardings available. I think I have to clean up manually and hope that new port forwardings will be cleaned up properly.
Another workaround is to increase the port range with override.vm.usable_port_range = 2200..2999
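
For reference, the override object is only available inside a provider block, so the widened-range workaround would sit in a Vagrantfile roughly like this (a sketch; the box name is illustrative):

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "example/box"  # illustrative box name

  config.vm.provider "vmware_fusion" do |v, override|
    # Widen the pool of host ports Vagrant may auto-correct
    # colliding forwards into, only for this provider.
    override.vm.usable_port_range = 2200..2999
  end
end
```
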

@chrisroberts

Member

chrisroberts commented Mar 8, 2017

@StefanScherer Hi! If there are old untagged port forward entries in the nat.conf file they will need to be cleaned out manually. After that they should not be lost again with the latest version of the plugin. If you have any problems please do let me know. Cheers!
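
That manual cleanup of untagged entries could be scripted along these lines (a hypothetical sketch, not plugin code; run it against a copy of nat.conf first): keep anything between VAGRANT-BEGIN/VAGRANT-END markers and drop bare forwards in [incomingtcp] that sit outside them.

```ruby
# Hypothetical cleanup sketch: drop untagged "port = ip:port" lines
# from the [incomingtcp] section, but keep forwards sitting between
# VAGRANT-BEGIN / VAGRANT-END markers (those are still tracked).
def drop_untagged_forwards(conf)
  in_tcp = false
  in_tag = false
  conf.lines.reject do |line|
    in_tcp = (line.strip == "[incomingtcp]") if line.start_with?("[")
    in_tag = true  if line.include?("VAGRANT-BEGIN")
    drop   = in_tcp && !in_tag && line =~ /^\d+\s*=/
    in_tag = false if line.include?("VAGRANT-END")
    drop
  end.join
end
```
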

@jtopper

Contributor

jtopper commented Apr 18, 2017

@chrisroberts I found this issue whilst attempting to debug a problem I'm having here with version 4.0.18 of the vagrant-vmware-fusion plugin.

I have three boxes configured, and just one running. I vagrant destroy centos6 and the box is destroyed as expected, however the nat.conf goes from containing

[incomingtcp]

# Use these with care - anyone can enter into your VM through these...
# The format and example are as follows:
#<external port number> = <VM's IP address>:<VM's port number>
#8080 = 172.16.3.128:80
# VAGRANT-BEGIN: /Users/jtopper/Vagrant VMs/31a423a1-bd7c-4481-8820-152f6ddeb7dc/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx
62523 = 192.168.20.139:22
# VAGRANT-END: /Users/jtopper/Vagrant VMs/31a423a1-bd7c-4481-8820-152f6ddeb7dc/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx

to

[incomingtcp]

# Use these with care - anyone can enter into your VM through these...
# The format and example are as follows:
#<external port number> = <VM's IP address>:<VM's port number>
#8080 = 172.16.3.128:80
62523 = 192.168.20.139:22

It looks to me like the vagrant tags are being removed from the config file, but the NAT config itself remains.

@chrisroberts

Member

chrisroberts commented Apr 18, 2017

@jtopper Is that port forward being explicitly defined in the Vagrantfile? When you run vagrant up again, does it cause a collision error, or properly bring the box up?

@jtopper

Contributor

jtopper commented Apr 18, 2017

@chrisroberts: We're actually generating the port number programmatically based on the PID and some randomness, so it'll never collide in that way. The same Vagrantfile is used in our CI environment, where we run multiple vagrants simultaneously, and this helps us avoid a concurrency issue. Assuming the port number is always subsequently obtained from Vagrant's state files (which does seem to be the case generally), we don't get into trouble with subsequent runs of the command finding the wrong PID - though it's plausible we're hitting an edge case here.
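
The exact scheme isn't shown in the thread, but a PID-plus-randomness port choice might look roughly like this (an illustrative assumption, not the actual CI Vagrantfile; names and ranges are made up):

```ruby
# Illustrative only: derive a host-side SSH forward port from the PID
# plus some randomness so concurrent CI runs rarely collide. The base
# stays well clear of Vagrant's default 2200-range collision pool.
def ci_forward_port(pid = Process.pid, rng = Random.new)
  base = 20_000
  span = 20_000
  base + ((pid + rng.rand(1_000)) % span)
end
```

Vagrant then records whichever port was chosen in its state under .vagrant/, which is why later commands normally find the right forward even though the port is not fixed.
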

What seems to happen is that after destroying and bringing up the box, the process hangs on

==> server: Starting the VMware VM...
==> server: Waiting for machine to boot. This may take a few minutes...

In the VMware Fusion console, I can see that the box starts, but none of the networking comes up. Removing the NAT entries and restarting with vmnet-cli seems to solve the problem to the point where boxes start properly again. I seem to recall that in this case, a vmnet-cli --stop doesn't always output the whole list of vmnet interfaces that I'd expect to see.

@chrisroberts

Member

chrisroberts commented Apr 18, 2017

@jtopper Can you clear out the [incomingtcp] configuration, run vagrant up && vagrant destroy, and then gist the output of vagrant up --debug? Even if Vagrant loses the markers within the nat.conf file (which can happen if the vmware service rewrites the file) it should detect ports it was responsible for and properly re-mark or clean them. Thanks!

@jtopper

Contributor

jtopper commented Apr 18, 2017

@chrisroberts I hit ^C on this one eventually.

https://gist.github.com/jtopper/2d3243db8fdc847d61aec0e208c81d12

(Sorry, that's a bit cluttered by the fact there are more boxes defined in the Vagrantfile and I forgot to constrain the run to just one of them - it seems not to have got as far as trying to start the other two, though.)

@chrisroberts

Member

chrisroberts commented Apr 18, 2017

@jtopper when the VM is up, can you connect via the UI? From the log it looks like it's failing to acquire an IP address. Does the guest have any error information within the logs and can you manually enable the networking in the guest? Thanks!

@jtopper

Contributor

jtopper commented Apr 18, 2017

@chrisroberts When the VM is up, I can open a console in the UI. eth0 and eth1 are both up, but neither has picked up an IP using DHCP. Restarting networking in the guest doesn't help.

At this point, vmnet-cli --status gives me:

DHCP service on vmnet1 is not running
Hostonly virtual adapter on vmnet1 is disabled
DHCP service on vmnet2 is not running
Hostonly virtual adapter on vmnet2 is disabled
DHCP service on vmnet3 is not running
Hostonly virtual adapter on vmnet3 is disabled
DHCP service on vmnet4 is not running
Hostonly virtual adapter on vmnet4 is disabled
DHCP service on vmnet5 is not running
Hostonly virtual adapter on vmnet5 is disabled
DHCP service on vmnet8 is not running
NAT service on vmnet8 is not running
Hostonly virtual adapter on vmnet8 is disabled
Some/All of the configured services are not running

It looks to me a lot like destroying the other VM is stopping vmnet. On manually starting vmnet and then restarting networking in the guest, the guest gets an IP address.

My suspicion here is that the nat.conf entries are a red herring - the issue seems to be with vmnet.
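
For what it's worth, a health check over that captured vmnet-cli --status output could be sketched like this (hypothetical helper, not Vagrant or plugin code; the command itself lives under the Fusion app bundle and needs sudo):

```ruby
# Sketch: decide from captured `vmnet-cli --status` text whether the
# vmnet8 services are up. Any vmnet8 line reading "not running" or
# "disabled" marks the NAT network as unhealthy.
def vmnet8_healthy?(status_text)
  status_text.lines.grep(/vmnet8/).none? { |l| l =~ /not running|disabled/ }
end
```
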

@chrisroberts

Member

chrisroberts commented Apr 18, 2017

Interesting. I'll get a comparable setup running locally to see if I can replicate the behavior. Are there any other interactions happening with VMware outside of Vagrant?

@jtopper

Contributor

jtopper commented Apr 18, 2017

Nothing else is touching VMware on this host.

Actually, having said the nat.conf entries are a red herring, I'm going to backtrack on that. I can only reliably start a VM if there are no entries in the [incomingtcp] section. Thereafter, that file looks like:

# VAGRANT-BEGIN: /Users/jtopper/Vagrant VMs/e7efe74a-fe88-4416-bf75-f290407b8c66/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx
62639 = 192.168.20.148:22
# VAGRANT-END: /Users/jtopper/Vagrant VMs/e7efe74a-fe88-4416-bf75-f290407b8c66/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx

and vmnet-cli --status shows the networks as running.

After running a vagrant destroy, nat.conf contains just:

62639 = 192.168.20.148:22

Again, vmnet-cli --status shows the networks to be running.

On a subsequent vagrant up, I see:

==> centos6: Cloning VMware VM: 'scalefactory/centos6'. This can take some time...
==> centos6: Checking if box 'scalefactory/centos6' is up to date...
==> centos6: Verifying vmnet devices are healthy...

vmnet-cli --status shows the networks go disabled, then enabled again.

Then we get:

==> centos6: Preparing network adapters...
==> centos6: Starting the VMware VM...
==> centos6: Waiting for machine to boot. This may take a few minutes...

and vmnet-cli --status shows the networks disabled from this point on; the box doesn't come up with any networking enabled.

@ejames17

ejames17 commented May 11, 2017

@chronon thanks that helped

@jtopper

Contributor

jtopper commented May 23, 2017

This problem is still apparent with v4.0.19 of the plugin.

@StefanScherer

Contributor

StefanScherer commented Jul 27, 2017

For unknown reasons I still have the problem with Vagrant 1.9.7, vagrant-vmware-fusion 4.0.22, and VMware Fusion Pro 8.5.8, so I had to google the solution above again. 😬

@Montana

Montana commented Sep 21, 2017

Getting same issue on Vagrant 2.0.0

@fuel-lhartwell

fuel-lhartwell commented Oct 19, 2017

Adding this line worked for me... till I reach 2999 I guess :\

config.vm.usable_port_range = 2200..2999
