
Forwarded ports not being released with VMWare Fusion 8.5.3 #8130

Closed · amacnamara opened this issue Dec 16, 2016 · 24 comments
@amacnamara

Vagrant version

Vagrant 1.9.1

Host operating system

macOS 10.12.1 (16B2659)

Guest operating system

Red Hat Enterprise Linux Server release 6.7 (Santiago)

Vagrantfile

Vagrant.configure("2") do |config|

# Hostmanager Settings
  config.hostmanager.enabled            = true    # Enable HostMan Plugin
  config.hostmanager.manage_host        = true    # Update Host's host file
  config.hostmanager.ignore_private_ip  = false
  config.hostmanager.include_offline    = true

  # Core Configurations
  config.vm.box            = "rcs_class_a_oel66_1.5.2"           # Specify the Vagrant Base Box by name
  config.vm.host_name      = "eaton.rcsdev.net"                  # Specify a hostname for the virtual machine
  config.ssh.forward_agent = true                                # Forward ssh keys
  config.ssh.forward_x11   = true                                # Forward X11
  config.vm.network "forwarded_port", guest: 7001,  host: 7001   # Forward WebLogic Admin Console Port
  config.vm.network "forwarded_port", guest: 4444,  host: 4444   # Forward RIDC Port
  config.vm.network "forwarded_port", guest: 16300, host: 16300  # Forward URM Port
  config.vm.network "forwarded_port", guest: 1521,  host: 1521   # Forward JDBC Port

  # Provisioning
  config.vm.provision "shell", path: "setup.sh"

  # VMWare specific
  config.vm.provider "vmware_fusion" do |v|
    v.vmx["memsize"] = "8192"
    v.vmx["numvcpus"] = "4"
    v.vmx["ethernet0.virtualDev"] = "vmxnet3"
  end

end

Expected behavior

When the machine is shut down, the forwarded ports should be freed up for use by other VMs. The port forwarding entries in /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf should be removed.

Actual behavior

The port forwarding entries in /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf are not removed. Vagrant catches and fixes the port collision for the SSH port by grabbing another port to use. Eventually, there are a lot of extra entries in nat.conf:

2201 = 192.168.225.136:22
2222 = 192.168.225.128:22
2204 = 192.168.225.145:22
2200 = 192.168.225.136:22
2202 = 192.168.225.140:22
2206 = 192.168.225.148:22
# VAGRANT-BEGIN: /Users/amacnamara/src/eat_urm_puppet/.vagrant/machines/default/vmware_fusion/8155631a-1c29-42a1-9a4f-0567011f98bd/oel66.vmx
2207 = 192.168.225.148:22
# VAGRANT-END: /Users/amacnamara/src/eat_urm_puppet/.vagrant/machines/default/vmware_fusion/8155631a-1c29-42a1-9a4f-0567011f98bd/oel66.vmx

This also causes other forwarded ports to collide, preventing vagrant up from running. In the above Vagrantfile example, port 7001 is set up in nat.conf and then never released, so subsequent vagrant ups fail.

Steps to reproduce

  1. Configure forwarded port in Vagrantfile
  2. vagrant up
  3. vagrant halt
  4. Reboot host
  5. vagrant up
  6. port collision
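
To confirm the stale state after the reboot in step 4, the NAT config can be inspected directly (a diagnostic sketch; the pattern just matches the entry format shown above):

grep -E 'VAGRANT-(BEGIN|END)|^[0-9]+ *=' "/Library/Preferences/VMware Fusion/vmnet8/nat.conf"
# any "port = ip:port" lines left outside VAGRANT-BEGIN/END markers
# after a halt are the leaked forwards described in this report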
@xenithorb

I'm also seeing this with Libvirt on Fedora 25 (Vagrant 1.8.5); I believe it's related.

My workaround is to killall -u user ssh in between sessions, since it seems to be forwarding via SSH.
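
As a command, that workaround is roughly (a sketch; substitute your own login for the -u argument):

killall -u "$USER" ssh   # drop leftover ssh port-forwarding processes between sessions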

@Moncky

Moncky commented Jan 13, 2017

Adding my voice to @amacnamara's: I am also seeing this behaviour on Vagrant 1.8.7 and VMware Fusion 8.5.3.

To work around it, I am manually clearing out the nat.conf file and restarting vmnet-cli, which is a complete pain.

@chronon

chronon commented Jan 16, 2017

Same issue for me with Vagrant 1.9.1 and VMware Fusion 8.5.3, though rebooting my Mac seems to allow a VM to boot. When a VM doesn't boot, it hangs at either "Waiting for machine to boot..." or "SSH auth method: private key".

The workaround mentioned by @Moncky usually works for me too:

  1. Comment all entries under [incomingtcp] in /Library/Preferences/VMware\ Fusion/vmnet8/nat.conf
  2. sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --stop
  3. sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --start
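
Scripted, those three steps look roughly like this (a sketch: back up nat.conf first, and the sed section-range pattern is an assumption about the file layout):

sudo cp "/Library/Preferences/VMware Fusion/vmnet8/nat.conf" /tmp/nat.conf.bak
# comment out every "port = ip:port" line in the [incomingtcp] section
sudo sed -i '' '/^\[incomingtcp\]/,/^\[/ s/^[0-9]/#&/' "/Library/Preferences/VMware Fusion/vmnet8/nat.conf"
sudo "/Applications/VMware Fusion.app/Contents/Library/vmnet-cli" --stop
sudo "/Applications/VMware Fusion.app/Contents/Library/vmnet-cli" --start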

@mightyoj

mightyoj commented Feb 1, 2017

This issue is also happening on OS X 10.11.6.

@robosung

robosung commented Feb 8, 2017

Had the same issue, and the @chronon/@Moncky approach worked for a while but then stopped working... I think I had interrupted vagrant up with Control-C. I was able to get it working again by:

  1. Renaming ~/.vagrant.d to ~/.vagrant.bak (so a new one can be created)
  2. Reinstalling the Vagrant VMware provider plugin and license (the license file is in the original ~/.vagrant.bak)
  3. Cleaning the nat.conf file as mentioned above
  4. Running vagrant up, confirming that it works, then deleting ~/.vagrant.bak
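
As commands, that recovery looks something like this (a sketch; the .lic filename is hypothetical, use whatever is actually in ~/.vagrant.bak):

mv ~/.vagrant.d ~/.vagrant.bak
vagrant plugin install vagrant-vmware-fusion
# re-apply the license; the filename below is an assumption
vagrant plugin license vagrant-vmware-fusion ~/.vagrant.bak/license-vagrant-vmware-fusion.lic
vagrant up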

@chronon

chronon commented Feb 21, 2017

For reference, this issue appears to be a duplicate of #7948.

@chrisroberts
Member

This should be resolved within the latest vagrant vmware plugin release. Cheers!

@StefanScherer
Contributor

Seems I'm late to the party. Just encountered this problem: all ports up to 2250 are occupied, using Vagrant 1.9.2 + vagrant-vmware-fusion 4.0.17.
I just updated to 4.0.18 and tried to spin up a new Vagrant box, but there are still no free port forwardings available. I think I have to clean up manually and hope that new port forwardings will be cleaned up properly.
Another workaround is to increase the port range with override.vm.usable_port_range = 2200..2999
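
In a Vagrantfile, that override form sits inside a provider block, roughly like this (a sketch reusing the vmware_fusion block from the original report):

Vagrant.configure("2") do |config|
  config.vm.provider "vmware_fusion" do |v, override|
    # widen the range Vagrant searches when auto-correcting port collisions
    override.vm.usable_port_range = (2200..2999)
  end
end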

@chrisroberts
Member

@StefanScherer Hi! If there are old untagged port forward entries in the nat.conf file they will need to be cleaned out manually. After that they should not be lost again with the latest version of the plugin. If you have any problems please do let me know. Cheers!

@jtopper
Contributor

jtopper commented Apr 18, 2017

@chrisroberts I found this issue whilst attempting to debug a problem I'm having here with version 4.0.18 of the vagrant-vmware-fusion plugin.

I have three boxes configured, and just one running. I vagrant destroy centos6 and the box is destroyed as expected; however, nat.conf goes from containing

[incomingtcp]

# Use these with care - anyone can enter into your VM through these...
# The format and example are as follows:
#<external port number> = <VM's IP address>:<VM's port number>
#8080 = 172.16.3.128:80
# VAGRANT-BEGIN: /Users/jtopper/Vagrant VMs/31a423a1-bd7c-4481-8820-152f6ddeb7dc/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx
62523 = 192.168.20.139:22
# VAGRANT-END: /Users/jtopper/Vagrant VMs/31a423a1-bd7c-4481-8820-152f6ddeb7dc/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx

to

[incomingtcp]

# Use these with care - anyone can enter into your VM through these...
# The format and example are as follows:
#<external port number> = <VM's IP address>:<VM's port number>
#8080 = 172.16.3.128:80
62523 = 192.168.20.139:22

It looks to me like the vagrant tags are being removed from the config file, but the NAT config itself remains.

@chrisroberts
Member

@jtopper Is that port forward being explicitly defined in the Vagrantfile? When you run vagrant up again, does it cause a collision error, or properly bring the box up?

@jtopper
Contributor

jtopper commented Apr 18, 2017

@chrisroberts: We're actually generating the port number programmatically based on the PID and some randomness, so it'll never collide in that way. The same Vagrantfile is used in our CI environment, where we run multiple vagrants simultaneously, and this helps us avoid a concurrency issue. Assuming the port number is always subsequently obtained from Vagrant's state files (which does seem to be the case generally), we don't get into trouble with subsequent runs of the command finding the wrong PID, though it's plausible we're hitting an edge case here.
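
For illustration, a scheme like that might look roughly like this in a Vagrantfile (a hypothetical sketch, not the actual code; names and ranges are invented):

Vagrant.configure("2") do |config|
  # derive the host port from the PID plus randomness so that concurrent
  # vagrant runs in CI don't collide on the same forwarded port
  ssh_port = 10_000 + (Process.pid % 20_000) + rand(1_000)
  config.vm.network "forwarded_port", guest: 22, host: ssh_port, id: "ssh"
end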

What seems to happen is that after destroying and bringing up the box, the process hangs on

==> server: Starting the VMware VM...
==> server: Waiting for machine to boot. This may take a few minutes...

In the VMware Fusion console, I can see that the box starts, but none of the networking comes up. Removing the NAT entries and restarting vmnet-cli seems to solve the problem to the point where boxes start properly again. I seem to recall that in this case, a vmnet-cli --stop doesn't always output the whole list of vmnet interfaces that I'd expect to see.

@chrisroberts
Member

@jtopper Can you clear out the [incomingtcp] configuration, run a vagrant up && vagrant destroy, and then gist the output of vagrant up --debug? Even if Vagrant loses the markers within the nat.conf file (which can happen if the vmware service rewrites the file), it should detect ports it was responsible for and properly re-mark or clean them. Thanks!
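
As a command sequence, that capture would look something like (a sketch):

vagrant up && vagrant destroy -f
vagrant up --debug 2>&1 | tee vagrant-up-debug.log   # gist this file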

@jtopper
Contributor

jtopper commented Apr 18, 2017

@chrisroberts I hit ^C on this one eventually.

https://gist.github.com/jtopper/2d3243db8fdc847d61aec0e208c81d12

(Sorry, that's a bit cluttered by the fact that there are more boxes defined in the Vagrantfile and I forgot to constrain the run to just one of them - it seems not to have got as far as trying to start the other two, though.)

@chrisroberts
Member

@jtopper when the VM is up, can you connect via the UI? From the log it looks like it's failing to acquire an IP address. Does the guest have any error information within the logs and can you manually enable the networking in the guest? Thanks!

@jtopper
Contributor

jtopper commented Apr 18, 2017

@chrisroberts When the VM is up, I can open a console in the UI. eth0 and eth1 are both up, but neither has picked up an IP using DHCP. Restarting networking in the guest doesn't help.

At this point, vmnet-cli --status gives me:

DHCP service on vmnet1 is not running
Hostonly virtual adapter on vmnet1 is disabled
DHCP service on vmnet2 is not running
Hostonly virtual adapter on vmnet2 is disabled
DHCP service on vmnet3 is not running
Hostonly virtual adapter on vmnet3 is disabled
DHCP service on vmnet4 is not running
Hostonly virtual adapter on vmnet4 is disabled
DHCP service on vmnet5 is not running
Hostonly virtual adapter on vmnet5 is disabled
DHCP service on vmnet8 is not running
NAT service on vmnet8 is not running
Hostonly virtual adapter on vmnet8 is disabled
Some/All of the configured services are not running

It looks to me a lot like destroying the other VM is stopping vmnet. On manually starting vmnet and then restarting networking in the guest, the guest gets an IP address.

My suspicion here is that the nat.conf entries are a red herring - the issue seems to be with vmnet.
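
The manual recovery described above looks roughly like this (a sketch; the guest command assumes a RHEL/CentOS 6 init system):

sudo "/Applications/VMware Fusion.app/Contents/Library/vmnet-cli" --start
# then, inside the guest:
sudo service network restart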

@chrisroberts
Member

Interesting. I'll get a comparable setup running locally to see if I can replicate the behavior. Are there any other interactions happening with VMware outside of Vagrant?

@jtopper
Contributor

jtopper commented Apr 18, 2017

Nothing else is touching VMware on this host.

Actually, having said the nat.conf entries are a red herring, I'm going to backtrack on that. I can only reliably start a VM if there are no entries in the [incomingtcp] section. Thereafter that file looks like:

# VAGRANT-BEGIN: /Users/jtopper/Vagrant VMs/e7efe74a-fe88-4416-bf75-f290407b8c66/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx
62639 = 192.168.20.148:22
# VAGRANT-END: /Users/jtopper/Vagrant VMs/e7efe74a-fe88-4416-bf75-f290407b8c66/scalefactory-centos6-sfpuppet-vmware-1.0.145.vmx

and vmnet-cli --status shows the networks as running.

After running a vagrant destroy, nat.conf contains just:

62639 = 192.168.20.148:22

Again, vmnet-cli --status shows the networks to be running.

On a subsequent vagrant up, I see:

==> centos6: Cloning VMware VM: 'scalefactory/centos6'. This can take some time...
==> centos6: Checking if box 'scalefactory/centos6' is up to date...
==> centos6: Verifying vmnet devices are healthy...

vmnet-cli --status shows the networks go disabled, then enabled again.

Then we get:

==> centos6: Preparing network adapters...
==> centos6: Starting the VMware VM...
==> centos6: Waiting for machine to boot. This may take a few minutes...

and vmnet-cli --status shows disabled from this point on; the box doesn't come up with any networking enabled.
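
One way to watch for that transition while vagrant up runs in another terminal (an observation sketch, not a fix):

# poll the vmnet status every couple of seconds
while true; do
  date
  sudo "/Applications/VMware Fusion.app/Contents/Library/vmnet-cli" --status | tail -1
  sleep 2
done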

@ejames17

@chronon Thanks, that helped.

@jtopper
Contributor

jtopper commented May 23, 2017

This problem is still apparent with v4.0.19 of the plugin.

@StefanScherer
Contributor

For unknown reasons I still have the problem with Vagrant 1.9.7, vagrant-vmware-fusion (4.0.22), and VMware Fusion Pro 8.5.8, so I had to google the solution above again. 😬

@Montana

Montana commented Sep 21, 2017

Getting the same issue on Vagrant 2.0.0.

@fuel-lhartwell

Adding this line worked for me... till I reach 2999 I guess :\

config.vm.usable_port_range = 2200..2999
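
For context, that line goes at the top level of the Vagrantfile (a minimal sketch; the box name is just borrowed from the original report):

Vagrant.configure("2") do |config|
  config.vm.box = "rcs_class_a_oel66_1.5.2"
  # widen the range Vagrant searches when auto-correcting port collisions
  config.vm.usable_port_range = (2200..2999)
end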

@ghost

ghost commented Mar 31, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Mar 31, 2020