gluon-0.7.2-x86-vmware does not get dhcp on wan interface #496

Open
nmaas87 opened this Issue Sep 13, 2015 · 22 comments

@nmaas87
nmaas87 commented Sep 13, 2015

As described here (https://forum.freifunk.net/t/bug-in-gluon-0-7-2-x86-vmware/7801), I am running version 0.7.2 of the Freifunk Trier VMware image (https://github.com/freifunktrier/firmware_store/blob/master/firmware/stable/factory/gluon-fftr-0.7.2-x86-vmware.vmdk), which, after the initial setup, does not get a DHCP lease on the WAN (eth1) interface. I left everything in the web configuration at its defaults, activated mesh_via_vpn and registered the VPN key, which succeeded. However, my appliance can't get an IP on the WAN interface and does not connect to the internet. The same thing happens with the image from ggrz. Any ideas?
I can reproduce the error on Workstation 11 as well as on VMware ESXi 4.1, with multiple Gluon 0.7.2 versions. The WAN interface is correctly configured as eth1, as described in the core config of Gluon, but udhcpc on eth1 does not return an IP. The UNOFFICIAL Gluon 0.3.3 image from IT-KL.eu (https://www.it-kl.eu/2015/08/gluon-x86-unter-vmware/ ) works correctly in the same configuration and environment.

@jplitza
Member
jplitza commented Sep 13, 2015

Could you verify the layer 2 connectivity of that interface, i.e. use something like ping6 -I eth1 ff02::1 and see if your host responds? Also, more details about the network configuration of your VMware setup would probably be needed.
Also, what Gluon versions do those community version numbers correspond to?

@nmaas87
nmaas87 commented Sep 13, 2015

Hi there,

ping6 -I eth1 ff02::1 only gives me: ping6: sendto: Operation not permitted.
Same for eth0. However, ping6 on br-client, br-wan and bat0 does give me an answer.
The system also hands out IPv6 addresses on eth0, the client network, but no IPv4 addresses, because it is not online.

Network:
Freifunk VM, eth0 (E1000) for client port to vlan 5
Freifunk VM, eth1 (E1000) for wan port to vlan 1
Internal pfSense, which gives LAN and dhcp on vlan 1
Internal pfSense, which takes WAN on vlan 2
VMWare ESXi 4.1, attached with trunk port to switch
Switch with trunk port to VMWare
Switch with vlan 2, WAN access is attached here
Switch with vlan 1, LAN port, Freifunk is attached here with Uplink
Switch with vlan 5, Client port of Freifunk

I can plug my notebook into vlan 1 and it gets an IP address successfully, so trunking, VLANs and everything else are working. I can plug it into vlan 5 and it gets an autogenerated IPv6 address from the Freifunk client port.
However, the Freifunk VM does not get a WAN IP...

In short, Freifunk is attached like this:
WAN with dhcpdv4 ----- eth1 (Freifunk) eth0 ------ Client Laptop

The Image is based on Gluon 2015.1.2.

@NeoRaider
Member

I've confirmed this issue. It is caused by Gluon explicitly setting the MAC address of br-wan (to avoid address conflicts in the case mesh-on-WAN is used). VMware blocks unknown MAC addresses though...

This is a regression introduced in v2015.1.2.
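
To make the failure mode concrete, here is a toy Python model of the filtering described above. It is only a sketch: the VSwitchPort class, its delivers method and both MAC addresses are made up for illustration and are not VMware or Gluon APIs. It assumes that a port without promiscuous permission only receives unicast frames addressed to the MAC that VMware assigned to the virtual NIC, and it ignores broadcast and multicast traffic.

```python
from dataclasses import dataclass


@dataclass
class VSwitchPort:
    """One virtual switch port as seen from the hypervisor (toy model)."""

    vnic_mac: str              # MAC the hypervisor assigned to the virtual NIC
    promiscuous: bool = False  # hypothetical per-port promiscuous permission

    def delivers(self, dst_mac: str) -> bool:
        """Return True if a unicast frame for dst_mac is handed to the guest."""
        if self.promiscuous:
            return True
        return dst_mac.lower() == self.vnic_mac.lower()


# Hypothetical addresses: eth1 as generated by VMware, br-wan as set by Gluon.
ETH1_MAC = "00:0c:29:aa:bb:cc"
BR_WAN_MAC = "02:11:22:33:44:55"

default_port = VSwitchPort(vnic_mac=ETH1_MAC)
promisc_port = VSwitchPort(vnic_mac=ETH1_MAC, promiscuous=True)

# A unicast reply to a request sent from br-wan is addressed to br-wan's MAC.
print(default_port.delivers(BR_WAN_MAC))  # False
print(promisc_port.delivers(BR_WAN_MAC))  # True
```

Under this assumption, replies addressed to the explicitly set br-wan MAC never reach the guest with the default policy, which would be consistent with the missing DHCP lease.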

@NeoRaider NeoRaider added this to the 2015.2 milestone Sep 19, 2015
@adlerweb
Contributor

For ESXi: Did you try to allow promiscuous mode in VMware's network security settings [1]? Gluon uses a network bridge (br-wan) for WAN, which causes the interface to enter promiscuous mode. This, however, is forbidden in VMware's default configuration. This also affects static configuration and is not related to DHCP.

[1] http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004099

@Little-Ben

nmaas87, this is Freifunk Westpfalz.

Our official repositories live only under:
https://github.com/freifunk-westpfalz/

Official Downloads solely via:
http://westpfalz.freifunk.net/firmware/
or
http://download.westpfalz.freifunk.net/

Please do not announce other locations as official references!

@nmaas87
nmaas87 commented Sep 20, 2015

@adlerweb That was the right hint. I got it working by enabling promiscuous mode on the ESXi software switch. However, it would be better if there could be a software fix in the Freifunk firmware, as promiscuous mode is more of a workaround and not secure, if I recall correctly.

@Little-Ben Sorry, I rewrote the issue and stated unofficial, as well as removing the reference to Freifunk Westpfalz.

@nmaas87
nmaas87 commented Sep 21, 2015

OK, promiscuous mode does not solve the problem completely. After enabling it, br-wan got an IP address, and the VPN and everything else worked (eth1). On eth0 (client net), I also got a client IP from the network and had access to the internet. However, it turned out to be VERY slow (loading www.google.de took about one minute, and connections to other sites like www.web.de were even dropped).
On my console, I found the message
br-wan: received packet on eth1 with own address as source address
repeated at millisecond intervals, spamming the console.
And sometimes something like net_ratelimit: 77 callbacks suppressed.
I don't know whether that is because of promiscuous mode. However, it seems that either a) the VPN network already has a host with the same MAC address, or b) I'm somehow short-circuiting the Freifunk network at my end?
I can, however, guarantee that only eth1 is online on my WAN side and that eth0 is on its own switch port, and that the two never connect at any point... That the WAN interface br-wan sees its own source address on the WAN port (eth1) is rather odd.

Anyhow, something is still broken and I'll let the appliance switched off until further advice.
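
To gauge how often the bridge warning appears, a captured console log can be tallied with a short script. This is a minimal sketch in Python, meant to run on a workstation against a saved log rather than on the node itself. The function name and the SAMPLE text are made up for illustration; only the wording of the warning is taken from the console output quoted above.

```python
import re

# Pattern for the bridge warning; bridge and port names are captured.
OWN_ADDR = re.compile(
    r"(?P<bridge>\S+): received packet on (?P<port>\S+) "
    r"with own address as source address"
)


def count_own_address_warnings(log_text: str) -> dict:
    """Count warnings per (bridge, port) pair in a block of log text."""
    counts = {}
    for match in OWN_ADDR.finditer(log_text):
        key = (match.group("bridge"), match.group("port"))
        counts[key] = counts.get(key, 0) + 1
    return counts


# Invented sample, not a real capture from the affected VM.
SAMPLE = """\
br-wan: received packet on eth1 with own address as source address
br-wan: received packet on eth1 with own address as source address
net_ratelimit: 77 callbacks suppressed
br-wan: received packet on eth1 with own address as source address
"""

print(count_own_address_warnings(SAMPLE))  # {('br-wan', 'eth1'): 3}
```

Because the kernel rate-limits these messages, such a count would only be a lower bound on the number of looped frames.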

@NeoRaider
Member

@nmaas87, could you post the complete outputs of ip a, brctl show and batctl if?

@nmaas87
nmaas87 commented Sep 21, 2015

@NeoRaider Will do. Is there any easy way to access that box from wan/lan interface? Can I enable ssh login via console? (Would make things easier) ( I think I will have time to get the outputs this evening :))

@dracoTrier
Contributor

Sure, in config mode you can enable SSH and set a password or SSH key for login.

@nmaas87
nmaas87 commented Sep 21, 2015

@NeoRaider Okay, now comes the entertaining part: I installed the VM again from scratch, and this time I got errors about duplicate MAC addresses on the client side (eth0). It turns out the VM fried one of the FFTR supernodes through a MAC address collision, so I shut it down for good. Somehow the VM got the same MAC as the supernode. However, the addresses randomly generated by VMware for eth0 and eth1 are not identical to the supernode's addresses and should not have caused this problem...

@dracoTrier
Contributor

My conclusion: the current Gluon x86 VM image is buggy. It needs more testing.

@adlerweb
Contributor

@nmaas87 As already said on the forum: I don't think this would help much. It would be possible to get rid of the bridge on the WAN interface (although this would make switching mesh-on-wan a nightmare), but since we are using L2 routing it is not really possible to avoid promiscuous mode on the LAN interface.

@nmaas87
nmaas87 commented Sep 21, 2015

@adlerweb That is OK. I found out that you can enable promiscuous mode on a per-interface basis instead of for the whole switch, which is fine by me :).

@NeoRaider
Member

Is it possible that your image was not clean when you experienced the MAC collisions? Gluon derives all MAC addresses from the address of the LAN interface (eth0) on first boot. If the image has already been booted once, it should not be migrated to another host, as the MAC addresses won't be regenerated.
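
The effect of moving an already-booted image can be illustrated with a small Python simulation. This is a sketch under loud assumptions: derive_mac is an invented derivation scheme, not Gluon's actual algorithm, and the Node class, the interface names in DERIVED_INTERFACES and all MAC addresses are hypothetical. It only shows the principle that addresses derived once from eth0 and then stored travel with a cloned image.

```python
def derive_mac(base_mac: str, index: int) -> str:
    """Derive a locally administered address from base_mac (invented scheme)."""
    octets = [int(part, 16) for part in base_mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {base_mac!r}")
    # Set the locally administered bit and clear the multicast bit.
    octets[0] = (octets[0] | 0x02) & 0xFE
    # Fold the interface index into the last octet.
    octets[5] = (octets[5] + index) % 256
    return ":".join(f"{octet:02x}" for octet in octets)


DERIVED_INTERFACES = ("br-wan", "mesh")  # hypothetical interface names


class Node:
    """A VM whose derived MACs are generated on first boot and then kept."""

    def __init__(self, eth0_mac: str):
        self.eth0_mac = eth0_mac
        self.stored_macs = None  # empty on a clean image

    def boot(self) -> dict:
        if self.stored_macs is None:  # first boot only
            self.stored_macs = {
                name: derive_mac(self.eth0_mac, index)
                for index, name in enumerate(DERIVED_INTERFACES, start=1)
            }
        return self.stored_macs

    def clone_to(self, new_eth0_mac: str) -> "Node":
        """Copy the disk image, including stored MACs, to a new host."""
        clone = Node(new_eth0_mac)
        if self.stored_macs is not None:
            clone.stored_macs = dict(self.stored_macs)
        return clone


original = Node("00:0c:29:aa:bb:01")
original.boot()
clone = original.clone_to("00:0c:29:11:22:33")  # image moved after first boot
fresh = Node("00:0c:29:11:22:33")               # clean image on the new host

print(clone.boot() == original.boot())  # True: the clone reuses the old MACs
print(fresh.boot() == original.boot())  # False: a clean image derives new ones
```

In this model the clone collides with the original even though VMware gave it a different eth0 address, while a clean image on the same host does not.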

@nmaas87
nmaas87 commented Sep 23, 2015

That is possible for the MAC collisions on WAN. However, after I set up the VM again from scratch, it worked, with the limitation of collisions on the LAN side, which killed the FFTR supernode along the way. And that is surely not supposed to happen in production :/.


@NeoRaider
Member

I currently see three options to set up the WAN MAC address:

  1. Always explicitly set the WAN MAC address (current solution). Needs promiscuous mode permission in VMware even without mesh-on-wan. Simple to set up, won't lead to address conflicts (if the virtual NICs' MAC addresses are generated randomly by VMware).
  2. Only set the WAN MAC address explicitly when mesh-on-wan is enabled. MAC address conflicts on the WAN interface are only relevant when mesh-on-wan is enabled. Will make enabling/disabling mesh-on-wan more complex, as it can't be configured using a single UCI option anymore. Won't be a problem for the expert mode interface, but the current command line commands won't work anymore.
  3. Don't set the WAN MAC address at all (instead take the primary address from the eth1 MAC address). Probably not a good idea. Using eth1 won't work when there's only eth0; using eth1 seems arbitrary; would be a VMware-specific hack.

My current plan is keeping the current solution 1 for Gluon 2015.2, and switching to 2 as soon as #284 is solved.
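
The three options can be summarized as a small decision function. This is a hedged Python sketch of the list above, not Gluon code: the function name explicit_wan_mac, its parameters and the DERIVED address are made up for illustration, and the real configuration is done through UCI rather than Python.

```python
from typing import Optional


def explicit_wan_mac(option: int, derived_mac: str, mesh_on_wan: bool) -> Optional[str]:
    """Return the MAC to set on br-wan, or None to keep the NIC's own address."""
    if option == 1:
        # Option 1: always set the address explicitly (current behaviour).
        return derived_mac
    if option == 2:
        # Option 2: only set it when mesh-on-wan is enabled.
        return derived_mac if mesh_on_wan else None
    if option == 3:
        # Option 3: never set it, br-wan takes the address from eth1.
        return None
    raise ValueError(f"unknown option: {option}")


DERIVED = "02:11:22:33:44:55"  # hypothetical address generated by the firmware

for option in (1, 2, 3):
    for mesh_on_wan in (False, True):
        mac = explicit_wan_mac(option, DERIVED, mesh_on_wan)
        print(option, mesh_on_wan, mac if mac else "keep eth1 address")
```

A None result means br-wan keeps the address VMware already knows. Going by the list above, that is the case in which the extra VMware permission would not be needed for the MAC address alone.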

@tcatm
Member
tcatm commented Oct 26, 2015

I think the plan is reasonable.

@NeoRaider NeoRaider modified the milestone: next, 2015.2 Oct 26, 2015
@NeoRaider NeoRaider modified the milestone: network-rewrite, 2016.2 Feb 24, 2016
@FFS-Roland

On VMware ESXi you need an additional setting when using promiscuous mode if the vSwitch has more than one physical NIC connected: "Net.ReversePathFwdCheckPromisc" must be set to 1. You will find it under Configuration - Software - Advanced Settings.

@nmaas87
nmaas87 commented Oct 3, 2016

@FFS-Roland Perfect Answer. I tried this now with 0.8.4 Gluon (from Freifunk Trier) - worked like a charm 👍

@nmaas87
nmaas87 commented Oct 3, 2016

I documented the whole thing at https://www.nico-maas.de/?p=1320 :). Maybe this helps someone in the future.

@Ranlvor
Contributor
Ranlvor commented Oct 3, 2016

Trier 0.8.4 is gluon 2016.1.6-3-g9300421, it's just 2016.1.6 + ee597c6 + Webinterface-color-patches
