MAC address duplicate for GL-AR300M #397

Closed
tpreston51 opened this issue Jun 14, 2022 · 15 comments

Comments

@tpreston51

I was trying to move my Tunnel Server and Tunnel Client to separate GL-AR300M devices to keep them apart from my active RF routers.
I found that the two devices had the same MAC address on the br-lan and eth1 LAN configs, which causes problems with my switch.
Firmware 3.22.6.0 is loaded, but the problem also occurs on 3.22.1.0.

@aanon4
Contributor

aanon4 commented Jun 14, 2022

Could you post the support tool data for the node?

@tpreston51
Author

Supportdata.zip

@pmilazzo
Contributor

pmilazzo commented Jul 21, 2022

I can confirm this behavior on a pair of AR300M16-Ext nodes. When booting the manufacturer's latest firmware on one of them, I get addresses of the form:

  • eth0: 94:83:c4:xx:xx:xc
  • eth1: 94:83:c4:xx:xx:xd

If I then load 1510-ba55aed and check immediately after first boot, before any configuration has been done, I get:

  • eth0: 94:83:c4:xx:xx:xc
  • eth1: 00:03:7f:11:23:c6

Both nodes have 00:03:7f:11:23:c6 as the eth1 address. Because the 00:03:7f OUI belongs to Qualcomm Atheros, I suspect it's a default value programmed into the chipset, and we're failing to override it at boot time.

The 94:83:c4:xx:xx:xc address appears in the "art" partition (/dev/mtd6), but the "...d" address does not, so I assume that the manufacturer's firmware is just adding 1 to the eth0 address to get the eth1 address.

@pmilazzo
Contributor

Workaround:

(fiddly, but it works)

  1. Use the ip link command to determine the MAC address of eth0.
  2. Add 1 to the hexadecimal value of the last byte to derive the correct MAC address for eth1. For example, if the address ends in ...:a2, use ...:a3
  3. Edit /etc/aredn_include/ethmacfixup to insert a line of the form:
    option macaddr "xxx"
    where one or more spaces precede option, and xxx is the MAC address derived above, surrounded by literal quotation marks.
  4. Edit /etc/config.mesh/_setup and change the value of dtdlink_ip to 10.x.y.z, where x, y, and z are the decimal equivalents of the last three hexadecimal pairs in the MAC address derived above. For example, if the derived address ends in ...:a3 , then z=163.
  5. Go to the Basic Setup tab in the UI and press Save Changes and then Reboot.
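
Steps 1, 2, and 4 above can be sketched in shell. This is only an illustration, not part of the AREDN firmware: the function names `mac_plus_one` and `mac_to_dtdlink_ip` and the sample address are made up, and wrapping from ff to 00 without a carry is an assumption, since the steps do not cover that case.

```shell
#!/bin/sh
# Hypothetical helpers for the manual workaround; not taken from AREDN.

# Step 2: add 1 to the last byte of a MAC address.
# Assumption: ff wraps to 00 without carrying into the fifth byte.
mac_plus_one() {
    prefix=${1%:*}   # first five bytes, e.g. 94:83:c4:00:10
    last=${1##*:}    # last byte as hex, e.g. a2
    printf '%s:%02x\n' "$prefix" $(( (0x$last + 1) & 0xff ))
}

# Step 4: map the last three bytes of a MAC address to 10.x.y.z.
# The body runs in a subshell so the IFS change does not leak out.
mac_to_dtdlink_ip() (
    IFS=:
    set -- $1        # split the address into six positional parameters
    printf '10.%d.%d.%d\n' $(( 0x$4 )) $(( 0x$5 )) $(( 0x$6 ))
)

# Step 1 on a real node would read the address with something like:
#   ip link show eth0 | awk '/link\/ether/ { print $2 }'
# Here a made-up sample address is used instead.
eth0_mac=94:83:c4:00:10:a2
eth1_mac=$(mac_plus_one "$eth0_mac")
echo "$eth1_mac"                 # 94:83:c4:00:10:a3
mac_to_dtdlink_ip "$eth1_mac"    # 10.0.16.163
```

The derived address still has to be written into /etc/aredn_include/ethmacfixup and /etc/config.mesh/_setup by hand, as described in steps 3 and 4.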

@aanon4
Contributor

aanon4 commented Jul 22, 2022

So there's a file /etc/init.d/local which runs on each boot, but specifically on first boot it creates the /etc/aredn_include/ethmacfixup file based on a few rules. Is this file empty on this hardware? Unfortunately it's not included in the supportdata. Maybe we need to add some new rules here for this device.
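
For readers unfamiliar with the "create once" idea, here is a rough sketch. The real logic in /etc/init.d/local is not shown in this thread, so everything here is an assumption: the function name `write_fixup` is hypothetical, and the only detail taken from the thread is the `option macaddr "..."` line format from the workaround.

```shell
#!/bin/sh
# Hypothetical sketch; the real /etc/init.d/local may work differently.

# Write a macaddr override into the given file, but only if the file
# is missing or empty, so later boots keep the first value.
write_fixup() {
    [ -s "$1" ] && return 0      # file already has content: leave it alone
    printf '  option macaddr "%s"\n' "$2" > "$1"
}

# Example (path and address are placeholders):
#   write_fixup /etc/aredn_include/ethmacfixup 94:83:c4:00:10:a3
```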

@aanon4
Contributor

aanon4 commented Jul 22, 2022

I just bought one of these on Amazon so hopefully can get to a solution this weekend.

@pmilazzo
Contributor

That file exists, but it doesn't handle this case because the eth1 MAC address is valid and differs by more than 10 from the WiFi MAC address.

As an experiment, I changed the test on line 24:

if [ $macdelta -le 10 -o $lanmac = "00:03:7F:11:23:C6" ]

...quoting the fixed eth1 MAC address both of my units share. Now eth1 gets a valid MAC address and all is well.
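
To make the test above concrete, here is a self-contained sketch of how a numeric MAC delta and the extra comparison could be evaluated. It is not the code from /etc/init.d/local: the helper names `mac_to_int`, `mac_delta`, and `needs_fixup` are hypothetical, the sample addresses are made up, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Hypothetical helpers; the actual script may compute macdelta differently.

# Convert aa:bb:cc:dd:ee:ff to a decimal integer.
mac_to_int() {
    hex=$(printf '%s' "$1" | tr -d ':')
    echo $(( 0x$hex ))
}

# Absolute numeric difference between two MAC addresses.
mac_delta() {
    a=$(mac_to_int "$1")
    b=$(mac_to_int "$2")
    d=$(( a - b ))
    [ "$d" -lt 0 ] && d=$(( -d ))
    echo "$d"
}

# Succeeds when the LAN MAC is within 10 of the other MAC, or when it is
# the shared default address 00:03:7F:11:23:C6 reported in this thread.
needs_fixup() {
    lanmac=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    macdelta=$(mac_delta "$1" "$2")
    [ "$macdelta" -le 10 ] || [ "$lanmac" = "00:03:7F:11:23:C6" ]
}

needs_fixup 00:03:7f:11:23:c6 94:83:c4:00:10:a2 && echo "fixup needed"
```

Upper-casing the LAN address before comparing avoids a mismatch when the address is reported in lower case.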

Note that if OpenWRT had initialized the eth1 interface in the same way that the manufacturer's firmware does (setting the eth1 MAC to the eth0 MAC + 1), the existing code in /etc/init.d/local would have been triggered because the MAC addresses differ numerically by less than 10. So, I think that, strictly speaking, this issue is really an upstream problem.

As a practical matter, we'll have to deal with it for now.

Can you think of a less fragile test we could use? I thought about testing just the OUI part, but I don't want to mess up other devices that might use Qualcomm Atheros MAC addresses. We could also compare the OUIs of the WAN and LAN—or WiFi and LAN—interfaces, or check the hardware type, or something like that. Ideas?

@aanon4
Contributor

aanon4 commented Jul 22, 2022

One of the reasons I bought a new device from Amazon was to see if the MAC was the same as you see on your two devices. If so, and I assume your devices aren't brand new, we might be okay assuming this value is the "magic" key here. Like you, I'm reluctant to over-generalize.

@ae6xe
Contributor

ae6xe commented Jul 22, 2022

If the upstream root cause is that the OEM has reused the MAC on multiple devices, and the vendor's firmware resolves this, it suggests we need to put in a random MAC generator to address it. Something like this (probably mangled by github):

NEWMAC=$(echo 00:11:22:$(dd if=/dev/urandom bs=1024 count=1 2>/dev/null | md5sum | sed 's/^\(..\)\(..\)\(..\)\(..\)\(..\)\(..\).*$/\1:\2:\3/'))
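
A variant of the same idea, shown only as a sketch: instead of a fixed `00:11:22` prefix, it draws all six bytes at random and marks the result as a locally administered unicast address (second-lowest bit of the first byte set, lowest bit cleared). The function name `random_laa_mac` is hypothetical, and it assumes `od` is available, which may not be true on a stock node image.

```shell
#!/bin/sh
# Hypothetical sketch; not proposed code from this thread.

# Print a random locally administered, unicast MAC address.
random_laa_mac() {
    # Six random bytes as space-separated hex pairs.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # Set the locally-administered bit (0x02), clear the multicast bit (0x01).
    first=$(( (0x$1 | 0x02) & 0xfe ))
    printf '%02x:%s:%s:%s:%s:%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
}

random_laa_mac
```

Because the locally administered bit is set, such an address cannot collide with any vendor-assigned address, although two random addresses can still collide with each other.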

@pmilazzo
Contributor

@aanon4 Sadly, my devices are also new; I ordered them from Amazon on July 11th. Sorry! Did you get them with or without the external antennae? Mine are with.

@ae6xe My guess is that the cause is slightly different from what you are saying. I don't think GL.iNet manufactured the devices with duplicate MAC addresses and is patching the problem in software; I think the devices are working as intended and the eth1 MAC address was always intended to be derived from the eth0 MAC address. If you scan /dev/mtd6 (the "art" partition), you can find one MAC address in the 94:83:c4 OUI, which belongs to GL.iNet. There are no other nearby byte sequences that look like MAC addresses, and the specific duplicate MAC address we see appears nowhere in any MTD partition.

The MAC address in /dev/mtd6 is assigned to both eth0 (WAN) and wlan0, which I guess is OK because those two interfaces could never appear on the same network. When you run GL.iNet's firmware, the eth1 interface also gets the same MAC address, except that the sixth byte is incremented by 1.

When you run our firmware, eth0 gets the same address as before, but eth1 always gets 00:03:7F:11:23:C6. Because the 00:03:7F OUI belongs to Qualcomm Atheros and not GL.iNet, I think that the 00:03:7F:11:23:C6 address is just a default value compiled into the bitstream used to initialize the chipset. It is expected to be overridden by the firmware, which GL.iNet's firmware does but OpenWRT does not. So I think it's an OpenWRT bug.

Nevertheless, even if OpenWRT implemented the same algorithm as the GL.iNet firmware, /etc/init.d/local would still rewrite the eth1 address at first boot to avoid assigning the interfaces sequential IPv4 addresses, which as I understand it causes a problem in OLSR. The approach /etc/init.d/local uses is to increment the fourth (rather than the sixth) byte. Whether this approach is more or less likely to cause a collision than choosing a random MAC address is one of those statistics problems I always hated... :-)
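
The difference between incrementing the sixth byte and incrementing an earlier byte is easiest to see by treating the MAC address as a 48-bit number. The sketch below is illustrative only: `mac_add` is a hypothetical helper, the sample address is made up, and the byte that /etc/init.d/local actually changes should be checked against the script itself.

```shell
#!/bin/sh
# Hypothetical helper; assumes a shell with 64-bit arithmetic.

# Add an integer offset to a MAC address, wrapping at 48 bits.
mac_add() {
    hex=$(printf '%s' "$1" | tr -d ':')
    n=$(( (0x$hex + $2) & 0xffffffffffff ))
    # Print as 12 hex digits, then insert a colon after every pair.
    printf '%012x\n' "$n" | sed 's/../&:/g; s/:$//'
}

mac_add 94:83:c4:00:10:a2 1       # sixth byte + 1:  94:83:c4:00:10:a3
mac_add 94:83:c4:00:10:a2 256     # fifth byte + 1:  94:83:c4:00:11:a2
mac_add 94:83:c4:00:10:a2 65536   # fourth byte + 1: 94:83:c4:01:10:a2
```

Since the last three bytes of the MAC map to the x, y, and z of the 10.x.y.z address, an offset of 1 yields adjacent IPv4 addresses, while 256 or 65536 changes y or x instead.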

@aanon4
Contributor

aanon4 commented Jul 22, 2022

"@aanon4 Sadly, my devices are also new; I ordered them from Amazon on July 11th. Sorry! Did you get them with or without the external antennae? Mine are with."

No worries. I ordered one of each, which might be informative.

@ae6xe
Contributor

ae6xe commented Jul 23, 2022

I have a GL-AR300M from when I originally ported AREDN to this model. It has the following:

eth0 MAC: E4:95:6E:43:A8:79
eth1 MAC: 00:03:7F:11:23:C6

The ART /dev/mtd6 partition has both MAC #s:
0000000 e495 6e43 a879 0003 7f11 23c6 ffff ffff
0001000 0202 e495 6e43 a879 0000 0000 0000 0000

Is your ART partition the same?
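
For anyone wanting to answer this on their own unit, a small sketch that pulls six bytes out of a partition image and prints them as a MAC address. The helper name `read_mac` is hypothetical; the offsets 0 and 6 come from the dump above and may not hold on other units, and `od` is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper; offsets are taken from the dump quoted above.

# Print the six bytes at the given byte offset of a file as a MAC address.
read_mac() {
    dd if="$1" bs=1 skip="$2" count=6 2>/dev/null |
        od -An -tx1 |
        tr -s ' ' ':' |
        sed 's/^://; s/:$//'
}

# On a node this might be run as:
#   read_mac /dev/mtd6 0    # first address in the ART partition
#   read_mac /dev/mtd6 6    # the six bytes that follow it
```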

There is no handling for this model in the ./openwrt/target/linux/ar71xx/base-files/etc/board.d/02_network: ar71xx_setup_macs() -- this would pass through the setting as openwrt runs this script on firstboot to create this detail in /etc/board.json.

2nd option is to make the patch in /etc/init.d/local .

It appears that this 0003 7f11 23c6 MAC is the same on all the devices, and the firmware needs to do something to ensure uniqueness across devices.

I put in the -le 10 check and the code to handle tplink, mikrotik, and resolve an olsr failure I saw at the time. The '10' was just enough to not trigger the failure, but I don't recall the specific failure; it's been too long.

@pmilazzo
Contributor

pmilazzo commented Jul 23, 2022 via email

@ae6xe
Contributor

ae6xe commented Jul 23, 2022

'turn the "add-1" into an "add-65536"'

"randomly generated address or the deterministic
approach above gives a lower probability of collision?"

This 3rd octet + 1 technique is what Ubiquiti uses on their MAC address assignments, and I had repeated this pattern here. I've not done the math on this, but one reason to continue to use this technique is that it makes troubleshooting network issues easier. It is easier to remember the IP addresses for a node when the dtdlink and RF addresses are similar with this increment: 10.1.2.1 becomes 10.1.3.1, and it is easy to spot that both belong to the same node. We've not seen any collisions with this approach to date (knock on wood), so I believe it is good enough, or low-risk enough, to continue using. It narrows down any collision to the vendor's assigned OUI MAC range.

"a Locally Administered address"

Also, after looking into a random technique (and this would also apply to setting the address manually), I realized a downside of this approach. The typical use case for it is evasive behavior: circumventing MAC blocks to join networks, hooked in to randomize at each boot. That is not really in the spirit of what we want to enable or support in our community. We're motivated not to lower the hurdle for modifying MAC addresses. But maybe there are other use cases to consider?

@aanon4
Contributor

aanon4 commented Jul 24, 2022

Brand new AR300M (no external antennas) and the relevant MAC is also 00:03:7f:11:23:c6.

I'm okay with using the MAC as the explicit key to solve this problem. I'm uncomfortable generalizing it as we don't know what that might do. Worst case is it doesn't fix older devices, but they're not fixed now anyway.

I'm also okay with reusing the current MAC generation code as we know that's in use and hasn't caused any problems.
