pfSense 23.05 - EarlyShellCmd Changes - ngeth0 prevents booting #83

Closed
neydah700 opened this issue May 24, 2023 · 41 comments

Comments

@neydah700

Looks like on the latest version of pfSense (23.05) the earlyshellcmd scripts are running after interfaces are assigned. Since the script hasn't run yet, ngeth0 doesn't exist and can't be assigned. Another update, another break.

@aholmes55

aholmes55 commented May 24, 2023

Looks like they have incorporated netgraph-free procedures for AT&T tethered-bypass into 23.05. Hopefully someone will add supplicant functionality. https://docs.netgate.com/pfsense/en/latest/recipes/authbridge.html

@5ch17

5ch17 commented May 25, 2023

Issue confirmed... Is there a CLI command to assign ngeth0 to WAN late in the boot process (after earlyshellcmd scripts have run)? Seems to work manually via GUI, so if it could be automated, that may at least provide a workaround.

@5ch17

5ch17 commented May 25, 2023

...and reassign WAN to physical NIC at reboot/shutdown

@owenthewizard
Contributor

> Issue confirmed... Is there a CLI command to assign ngeth0 to WAN late in the boot process (after earlyshellcmd scripts have run)? Seems to work manually via GUI, so if it could be automated, that may at least provide a workaround.

I'm on OPNsense so not sure how helpful this will be but have you checked /usr/local/etc/rc.syshook.d?

@5ch17

5ch17 commented May 25, 2023

> I'm on OPNsense so not sure how helpful this will be but have you checked /usr/local/etc/rc.syshook.d?

It may not exist on pfSense, or I could not find it. Got it working without ngeth0 with a switch (see issue #82).

@neydah700
Author

> > I'm on OPNsense so not sure how helpful this will be but have you checked /usr/local/etc/rc.syshook.d?
>
> It may not exist on pfSense, or I could not find it. Got it working without ngeth0 with a switch (see issue #82).

I assume you are using the passthrough method and not supplicant?

@5ch17

5ch17 commented May 25, 2023

> I assume you are using the passthrough method and not supplicant?

Supplicant with switch and WAN PCP set to 1 (issue #82 ).

@neydah700
Author

> > I assume you are using the passthrough method and not supplicant?
>
> Supplicant with switch and WAN PCP set to 1 (issue #82).

Ah, so you are only using a script for authentication, not creating an ngeth0 interface. Mind sharing the script you're using for auth only?

@5ch17

5ch17 commented May 25, 2023

> Ah, so you are only using a script for authentication, not creating an ngeth0 interface. Mind sharing the script you're using for auth only?

Correct. Script is same as original versions that included both Bridge and Supplicant modes - posted here: #79 (comment) and https://github.com/MonkWho/pfatt/files/10418690/2301_bypass.txt .

You could use just the part after this line - "elif [ "$EAP_MODE" = "supplicant" ] ; then" plus variables definition - for wpa_supplicant auth functionality only or use as is with EAP_MODE="supplicant".

@5ch17

5ch17 commented May 25, 2023

...misspoke... if you don't want to create a separate supplicant-only script, you need to edit out all ngctl lines so as not to reset and/or create netgraph at all.

@tjasko

tjasko commented May 28, 2023

For now, I just leveraged the system patch functionality to add ngeth to the list of ignored interfaces as mentioned here.

I also realized their new auth bridging functionality is exactly that, only for bridging... wpa_supplicant still needs to be patched to work with VLAN0.

The patch I made is here for those who want to copy & paste, just set the Path Strip Count setting to 0 (confirmed CE & Plus code are the same, so I have no qualms with sharing this here):

--- /etc/inc/util.inc	2023-05-22 10:02:01.000000000 -0500
+++ /etc/inc/util.inc	2023-05-28 17:20:34.248085000 -0500
@@ -2557,7 +2557,7 @@
 		foreach ($config['interfaces'] as $ifcfg) {
 			if (interface_is_vlan($ifcfg['if']) != NULL ||
 			    interface_is_qinq($ifcfg['if']) != NULL ||
-			    preg_match("/^enc|^cua|^tun|^tap|^l2tp|^pptp|^ppp|^ovpn|^ipsec|^gif|^gre|^lagg|^bridge|^ue|vlan|_wlan|_\d{0,4}_\d{0,4}$/i", $ifcfg['if'])) {
+			    preg_match("/^ngeth|^enc|^cua|^tun|^tap|^l2tp|^pptp|^ppp|^ovpn|^ipsec|^gif|^gre|^lagg|^bridge|^ue|vlan|_wlan|_\d{0,4}_\d{0,4}$/i", $ifcfg['if'])) {
 				// Do not check these interfaces.
 				$i++;
 				continue;
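
To see which interface names the patched expression skips, here is a rough shell sketch that tests names against an approximation of the same pattern. `is_ignored_if` is a hypothetical helper and not part of pfSense; because `grep -E` has no `\d`, `[0-9]` stands in for it, so treat this as an illustration rather than an exact reproduction of the PHP check.

```shell
#!/bin/sh
# Approximation of the patched pattern from /etc/inc/util.inc, translated to
# POSIX extended regex syntax ([0-9] instead of \d). Illustration only.
IGNORE_RE='^ngeth|^enc|^cua|^tun|^tap|^l2tp|^pptp|^ppp|^ovpn|^ipsec|^gif|^gre|^lagg|^bridge|^ue|vlan|_wlan|_[0-9]{0,4}_[0-9]{0,4}$'

# is_ignored_if NAME: succeeds if NAME matches the ignore pattern
# (case-insensitively, like the /i flag in the PHP expression).
is_ignored_if() {
    printf '%s\n' "$1" | grep -Eiq -- "$IGNORE_RE"
}

for ifname in ngeth0 igb0 lagg0; do
    if is_ignored_if "$ifname"; then
        echo "$ifname: ignored"
    else
        echo "$ifname: checked"
    fi
done
```

With the `^ngeth` alternative in place, `ngeth0` falls into the ignored group, while a physical NIC name such as `igb0` is still checked.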

@rcmcdonald91

rcmcdonald91 commented Jun 7, 2023

> I also realized their new auth bridging functionality is exactly that, only for bridging... wpa_supplicant still needs to be patched to work with VLAN0.

https://reviews.freebsd.org/D40442

😄

@tjasko

tjasko commented Jun 7, 2023

@rcmcdonald91 reviews.freebsd.org/D40442

Ohh neat, I went looking for that but couldn't find anything (thanks for knocking that fix out!). 💯

@5ch17

5ch17 commented Jun 8, 2023

@rcmcdonald91

https://reviews.freebsd.org/D40442

Is this already incorporated in the pfSense wpa_supplicant package (current version 2.10_6), or will it be soon?

@gpz1100

gpz1100 commented Jun 8, 2023

Working version can be currently found on discord. (thanks to @rcmcdonald91 )

https://discord.com/channels/886329492438671420/1005613537382637661/1116442791313150064
or
https://www.dslreports.com/r0/download/2527653~5f910f995edee075c7674c639ed949f5/wpa_supplicant-b49ca023025e305dfd99c8bb4c4644e1.zip

If you need invite - https://discord.gg/c8HGajUEGk

Until we figure out boot sequences, you'll need the wpa_supplicant command line as both shellcmd and earlyshellcmd for it to work.

@5ch17

5ch17 commented Jun 8, 2023

> Working version can be currently found on discord.
> Until we figure out boot sequences, you'll need the wpa_supplicant command line as both shellcmd and earlyshellcmd for it to work.

Thanks a lot. Assume wpa_cli needs no changes and you are using the same wpa auth script as before? Or any changes needed to the script?

@rcmcdonald91

No changes to wpa_cli, the IPC ABI isn't impacted by this work.

@5ch17

5ch17 commented Jun 8, 2023

thank you this is great

@5ch17

5ch17 commented Jun 9, 2023

@gpz1100
since #82 was closed and patched wpa_supplicant is preferred, could someone outline the steps to run that on pfs (including script)? @neydah700 had asked about the script before, so it would be good to know if old one still applies

> Working version can be currently found on discord. (thanks to @rcmcdonald91 )
>
> https://discord.com/channels/886329492438671420/1005613537382637661/1116442791313150064 or https://www.dslreports.com/r0/download/2527653~5f910f995edee075c7674c639ed949f5/wpa_supplicant-b49ca023025e305dfd99c8bb4c4644e1.zip
>
> If you need invite - https://discord.gg/c8HGajUEGk
>
> Until we figure out boot sequences, you'll need the wpa_supplicant command line as both shellcmd and earlyshellcmd for it to work.

@gpz1100

gpz1100 commented Jun 9, 2023

@5ch17

image

In my testing, setting the command line (or even pfatt script without netgraph references) as early shellcmd works sufficiently to satisfy initialization at the WAN line in the boot process. But, something happens between there and completion of boot resulting in no wan connectivity. Running it twice as in the image above appears to fix this while introducing the least delays in the boot process.

Another option is to just run it as shellcmd, but then you'll have a 60s delay at the wan line in the boot process.

I'm not overly familiar with the pfsense boot process so i'm hoping @MonkWho or @rcmcdonald91 may have a more elegant fix.

The pfatt script does a dhcp renew near its end. I don't think this is needed any more given the native wan interface is used directly, no virtual interface. That's why just the single line command(s) above work. It daemonizes wpa_supplicant to sit in the background and respond to eapol requests as needed.

@5ch17

5ch17 commented Jun 9, 2023

Interesting, sounds like the question is how/when is IP obtained via DHCP. pfatt (at least some versions) only renews WAN DHCP if IP is 0.0.0.0 or void (i.e. does not run DHCP regardless).

In the scenario you tested, WAN DHCP must be invoked by pfsense while wpa_supplicant is daemonized. Not sure, however, what the explanation is for running wpa_supplicant twice at earlyshellcmd and shellcmd to get an IP assigned. Experienced DHCP behavior where DHCPACK is not received for a long time at/after boot (after EAP succeeds) and then it suddenly responds (unclear when/why this occurs). Could this be because of multiple boot/DHCP attempts with different MACs/devices such as RG, switches, pfs?

@gpz1100

gpz1100 commented Jun 9, 2023

If the port is not authorized, you will get a lengthy hangup at the WAN line during the pf boot process. It will eventually time out (60s) and proceed with the bootup process.

I'm not sure what happens after the wan line in bootup, but after boot is complete, there is no wan ip. In testing, issuing a logoff and then a logon in wpa_cli yields no response either.

Why would there be multiple macs? The pf's wan interface has the RG mac spoofed, that's the only mac that matters and does not/should not change for dhcp purposes.

Try testing different variations yourself, including just a single earlyshellcmd command or both as shown. I'd expect your results to mirror my own.

@neydah700
Author

> @5ch17
>
> image
>
> In my testing, setting the command line (or even pfatt script without netgraph references) as early shellcmd works sufficiently to satisfy initialization at the WAN line in the boot process. But, something happens between there and completion of boot resulting in no wan connectivity. Running it twice as in the image above appears to fix this while introducing the least delays in the boot process.
>
> Another option is to just run it as shellcmd, but then you'll have a 60s delay at the wan line in the boot process.
>
> I'm not overly familiar with the pfsense boot process so i'm hoping @MonkWho or @rcmcdonald91 may have a more elegant fix.
>
> The pfatt script does a dhcp renew near its end. I don't think this is needed any more given the native wan interface is used directly, no virtual interface. That's why just the single line command(s) above work. It daemonizes wpa_supplicant to sit in the background and respond to eapol requests as needed.

I wonder if it’s related to this change in how earlyshellcmd is processed in 23.05. It was patched for the netgraph script by excluding the ngeth0 interfaces.

https://redmine.pfsense.org/issues/14410

@5ch17

5ch17 commented Jun 9, 2023

EAP succeeds (port authorized) and is followed by no DHCPACK (no IP). Multiple MACs to test different switches - given topology of ONT --> Switch <-- pfs, not sure whether ONT/ISP is able to detect switch's MAC and have some device spoofing protection?

> If the port is not authorized, you will get a lengthy hangup at the WAN line during the pf boot process. It will eventually time out (60s) and proceed with the bootup process.
>
> I'm not sure what happens after the wan line in bootup, but after boot is complete, there is no wan ip. In testing, issuing a logoff and then a logon in wpa_cli yields no response either.
>
> Why would there be multiple macs? The pf's wan interface has the RG mac spoofed, that's the only mac that matters and does not/should not change for dhcp purposes.
>
> Try testing different variations yourself, including just a single earlyshellcmd command or both as shown. I'd expect your results to mirror my own.

@gpz1100

gpz1100 commented Jun 9, 2023

^^Why isn't ont connected directly to pf? Why the switch in the middle?

@5ch17

5ch17 commented Jun 9, 2023

> ^^Why isn't ont connected directly to pf? Why the switch in the middle?

stripper switch setup (not patched wpa_supplicant). hesitant to keep testing until no IP / DHCPACK after EAP succeeds is figured out.

@5ch17

5ch17 commented Jun 9, 2023

@neydah700 - possible, but would need @rcmcdonald91 or someone with deeper knowledge of pfs to help troubleshoot

> I wonder if it’s related to this change in how earlyshellcmd is processed in 23.05. It was patched for the netgraph script by excluding the ngeth0 interfaces.
>
> https://redmine.pfsense.org/issues/14410

@gpz1100

gpz1100 commented Jun 10, 2023

@5ch17 I think what is happening is that after "configuring wan interface..." there's some wan interface manipulation resulting in the port ultimately becoming unauthorized. Wpa_cli status still indicates authorized, but if this were the case dhcp would work. Because of said port mangling, wpa_supplicant is left in an unresponsive state. Thus the later shellcmd command, which kills and restarts wpa_supplicant, works.

Unless some other solution comes about, I think it's possible to have a single script test wpa_cli status and internet connectivity, perform necessary steps (reinit wpa), etc. But it would still need to be called twice, at earlyshell and shellcmd.

Those command lines are a quick and dirty way for now. @BigJohn97 and I have tested this successfully multiple times now.
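
A minimal sketch of such a single check script, assuming `wpa_cli status` reports a `suppPortStatus=` line as it does elsewhere in this thread. The names `WPA_CLI`, `REAUTH_PAUSE`, `port_authorized` and `reauth_if_needed` are hypothetical. Since the status can read authorized while DHCP still fails, a real version would probably also need a connectivity test; this one only covers the status check and the logoff/logon step.

```shell
#!/bin/sh
# Sketch of a re-auth check that could be called from both earlyshellcmd and
# shellcmd. WPA_CLI and the 10-second pause are assumptions based on the
# commands discussed in this thread; adjust for your system.
WPA_CLI="${WPA_CLI:-/usr/sbin/wpa_cli}"
REAUTH_PAUSE="${REAUTH_PAUSE:-10}"

# port_authorized: succeeds if wpa_cli reports suppPortStatus=Authorized.
port_authorized() {
    "$WPA_CLI" status 2>/dev/null | grep -q '^suppPortStatus=Authorized$'
}

# reauth_if_needed: force a fresh EAPOL handshake when the port is not
# reported as authorized; otherwise do nothing.
reauth_if_needed() {
    if port_authorized; then
        echo "port authorized, nothing to do"
        return 0
    fi
    echo "port not authorized, forcing logoff/logon"
    "$WPA_CLI" logoff
    sleep "$REAUTH_PAUSE"
    "$WPA_CLI" logon
}

# To use it, call reauth_if_needed at the end of the script.
```

Keeping the wpa_cli path in a variable makes it possible to try the logic against a stub before pointing it at the real binary.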

@aholmes55

aholmes55 commented Jun 10, 2023

Are these correct steps for installing?

  1. Add patched wpa_supplicant (/root/wpa_supplicant)
  2. Install wpa_supplicant.conf and certificates with correct paths
  3. Add Priority Code 1 to WAN Interface
  4. Add EarlyShellCmd and ShellCmd as from @5ch17 post

Is Promiscuous Mode needed?
Did I miss anything?

@5ch17

5ch17 commented Jun 10, 2023

promiscuous should not be needed on wan, pcp 1 seems to be needed (at least for the stripper switch method). can't comment on earlyshellcmd and shellcmd as that method (patched wpa_supp) failed in testing

> Are these correct steps for installing?
>
>   1. Add patched wpa_supplicant (/root/wpa_supplicant)
>   2. Install wpa_supplicant.conf and certificates with correct paths
>   3. Add Priority Code 1 to WAN Interface
>   4. Add EarlyShellCmd and ShellCmd as from @5ch17 post
>
> Is Promiscuous Mode needed? Did I miss anything?

@5ch17

5ch17 commented Jun 10, 2023

@gpz1100 something unclear is going on in testing. With your suggested setup (PCP 1 or none on WAN), after boot there is an EAP error in logs (CTRL-EVENT-EAP-FAILURE EAP authentication failed). Running the patched wpa_supplicant command from shell after boot sometimes leads to successful EAP authentication (RCM tag and all). However, no DHCP (despite refreshing via GUI and CLI).

Reverting back to stripper switch after patched wpa fails also fails (randomly either EAP and/or DHCP) - took many tries/reboots until it finally got an IP again.

Wondering if this is ONT/ISP port and region dependent. Do certs need to be from the same region as port's location (or, if not, could inconsistent results occur)? DHCP seems to fail more often than EAP. Could it be dependent on the assigned DHCP server (presumably there are multiple based on location/region)?

> @5ch17 I think what is happening is that after "configuring wan interface..." there's some wan interface manipulation resulting in the port ultimately becoming unauthorized. Wpa_cli status still indicates authorized, but if this were the case dhcp would work. Because of said port mangling, wpa_supplicant is left in an unresponsive state. Thus the later shellcmd command, which kills and restarts wpa_supplicant, works.
>
> Unless some other solution comes about, I think it's possible to have a single script test wpa_cli status and internet connectivity, perform necessary steps (reinit wpa), etc. But it would still need to be called twice, at earlyshell and shellcmd.
>
> Those command lines are a quick and dirty way for now. @BigJohn97 and I have tested this successfully multiple times now.

@gpz1100

gpz1100 commented Jun 10, 2023

^^What's the md5 of the wpa_supplicant you're using? Two versions were released; the latest one is b49ca023025e305dfd99c8bb4c4644e1.

You should be able to force an eapol handshake in wpa_cli using the logon command.

Pop on to the discord chat, #pfsense channel. I'll be on later tonight after ~10pm Central time.
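
For comparing a local binary against that md5, a small sketch follows. `file_md5` and `verify_md5` are hypothetical helper names; the script assumes either `md5sum` or the BSD `md5 -q` command is available, and the `/root/wpa_supplicant` path in the usage line is simply the location mentioned in this thread.

```shell
#!/bin/sh
# file_md5 PATH: print the MD5 hex digest of PATH.
# Uses md5sum when present, otherwise the BSD `md5 -q` command.
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{print $1}'
    else
        md5 -q "$1"
    fi
}

# verify_md5 PATH EXPECTED: succeeds only if the digest of PATH equals EXPECTED.
verify_md5() {
    actual=$(file_md5 "$1") || return 1
    [ "$actual" = "$2" ]
}

# Example (run on the firewall):
#   verify_md5 /root/wpa_supplicant b49ca023025e305dfd99c8bb4c4644e1 \
#       && echo "matches the latest build" || echo "different build"
```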

@5ch17

5ch17 commented Jun 10, 2023

Same md5 for patched file. Seeing inconsistent behavior with stock 23.05 wpa_supplicant (on stripper switch), so it's unlikely wpa_supplicant is the culprit. Will try the chat if we can sync up.

> ^^What's the md5 of the wpa_supplicant you're using? Two versions were released; the latest one is b49ca023025e305dfd99c8bb4c4644e1.
>
> You should be able to force an eapol handshake in wpa_cli using the logon command.

@neydah700
Author

neydah700 commented Jun 13, 2023

For those interested: using the two earlyshellcmd and shellcmd commands never worked reliably for me. It was very inconsistent. One in every ten times it would auth. Manual commands after boot were inconsistent as well.

I don't have proof yet, but I strongly suspect the culprit for me was the fact that I have static IPs. When I ran status in wpa_cli, it showed the output below, with "ip_address" matching one of my statics (it varied randomly among all IPs in the static pool). It only ever auth'ed when that IP matched my dynamic. I have no idea why pfSense would add one of my statics to that interface during the boot process, or why an IP would cause issues with 802.1X, but it did.

status
bssid=01:80:c2:00:00:03
freq=0
ssid=
id=0
mode=station
pairwise_cipher=NONE
group_cipher=NONE
key_mgmt=IEEE 802.1X (no WPA)
wpa_state=ASSOCIATED
ip_address=X.X.X.X
address=XX:XX:XX:XX:XX:XX
Supplicant PAE state=AUTHENTICATING
suppPortStatus=Unauthorized
EAP state=IDLE
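
To pull individual fields out of output like the above, a small parsing sketch follows. `status_field` is a hypothetical helper, and the sample text (including the 192.0.2.10 address) is a placeholder, not real output.

```shell
#!/bin/sh
# status_field KEY: read `wpa_cli status` output on stdin and print the value
# of the first line whose key (the text before "=") equals KEY.
status_field() {
    awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }'
}

# Placeholder sample in the same key=value format as the status output.
sample='wpa_state=ASSOCIATED
ip_address=192.0.2.10
Supplicant PAE state=AUTHENTICATING
suppPortStatus=Unauthorized'

printf '%s\n' "$sample" | status_field suppPortStatus   # prints Unauthorized
printf '%s\n' "$sample" | status_field ip_address       # prints 192.0.2.10
```

On a live system the input would come from `wpa_cli status` instead of the sample, which makes it easy to log which address the interface holds each time authentication fails.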

What DID work for me was modifying my pfatt script to strip all the netgraph references, and execute it like we do with pfatt at earlyshellcmd. This is using the patched wpa_supplicant file which I have placed in /root

Thank you @rcmcdonald91 tremendously! This is all I have wanted from my WAN connection for years. Just pfSense and the ATT ONT. No switch, custom ONT, netgraph, custom wpa_supplicant, patched igb driver, etc.

So far works like a charm and hope it helps you all.

https://github.com/neydah700/pfatt/blob/b3d9ded2754f56ab773d5b66a01e796165977f19/bin/8021x_v2.sh

@5ch17

5ch17 commented Jun 13, 2023

^^ Thanks for sharing. Weird variance in how it works for some people/locations and not others (does not here). Also unclear why the old script would work but not wpa_supplicant commands -- script does virtually the same as the commands as it parses WPA params without wpa_supplicant.conf needed, then daemonizes wpa_supplicant, then checks if IP is assigned and, if not, runs dhclient

@neydah700
Author

> ^^ Thanks for sharing. Weird variance in how it works for some people/locations and not others (does not here). Also unclear why the old script would work but not wpa_supplicant commands -- script does virtually the same as the commands as it parses WPA params without wpa_supplicant.conf needed, then daemonizes wpa_supplicant, then checks if IP is assigned and, if not, runs dhclient

No worries! Current theory is earlyshellcmd runs before pfsense changes interface MAC address and the script addresses that. Haven't tested yet but currently discussing in the discord.

@neydah700
Author

neydah700 commented Jun 13, 2023

Okay should have held off on my post. This earlyshellcmd works as well and is much simpler.

/sbin/ifconfig igb5 ether "XX:XX:XX:XX:XX:XX" && /root/wpa_supplicant -B -Dwired -i igb5 -c /root/pfatt/wpa/wpa_supplicant.conf -P/var/run/wpa_supplicant.pid && sleep 10 && /usr/sbin/wpa_cli logon

Change the MAC to match yours, change the two interface references to match yours, put the patched supplicant in /root with execute permissions, and change the path to wpa_supplicant.conf to match yours.

Booting hangs at the "configuring wan" stage for about 30 seconds while it re-auth's. If this is bothersome to you add a shellcmd (not earlyshellcmd) of

wpa_cli logoff && sleep 10 && wpa_cli logon
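
The same earlyshellcmd sequence can be written as a short script, which some may find easier to adapt than the one-liner. This is only a sketch: the variable names, the `run` helper, and the `DRY_RUN` switch are hypothetical additions, and the defaults are placeholders to replace with your own interface, RG MAC, and paths. By default it only prints the commands.

```shell
#!/bin/sh
# Multi-line form of the earlyshellcmd one-liner above. All values are
# placeholders; set DRY_RUN=0 only on the firewall itself.
WAN_IF="${WAN_IF:-igb0}"
RG_MAC="${RG_MAC:-XX:XX:XX:XX:XX:XX}"
WPA_BIN="${WPA_BIN:-/root/wpa_supplicant}"
WPA_CONF="${WPA_CONF:-/root/wpa_supplicant.conf}"
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# pfatt_auth: spoof the RG MAC, start the patched supplicant in the
# background, wait, then trigger an EAPOL logon. Stops at the first failure.
pfatt_auth() {
    run /sbin/ifconfig "$WAN_IF" ether "$RG_MAC" &&
    run "$WPA_BIN" -B -Dwired -i "$WAN_IF" -c "$WPA_CONF" -P/var/run/wpa_supplicant.pid &&
    run sleep 10 &&
    run /usr/sbin/wpa_cli logon
}

pfatt_auth
```

Setting the MAC inside the script matters because, per the discussion here, pfSense has not yet applied the spoofed MAC when earlyshellcmd entries run.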

@5ch17

5ch17 commented Jun 13, 2023

that can work if MAC is not spoofed at that stage, thx for sharing. you can add -s to wpa_supplicant options if you want log to syslog

@gpz1100

gpz1100 commented Jun 13, 2023

@neydah700

Here's my final earlyshellcmd

/sbin/ifconfig igb0 ether "RG MAC" && /root/wpa_supplicant -B -Dwired -i igb0 -c /root/wpa_supplicant.conf -P/var/run/wpa_supplicant.pid && sleep 10 && /usr/sbin/wpa_cli logon

Note the sleep is 10, not 5 as from an earlier version. This should resolve the delay at "configuring wan interface...." Specifically as you pointed on discord because of the mac spoofing not taking place when earlyshellcmd scripts are run.

The 2nd shellcmd line is not 100% needed but should speed up dhcp at the end of the boot process. In testing, it appears the wan interface is toggled at some point after completing "configuring wan interface..." and completion of boot process. This causes a wpa reauth to take place. By issuing those commands, the 30s interval of upstream asking to reauth is bypassed.

@neydah700
Author

> @neydah700
>
> Here's my final earlyshellcmd
>
> /sbin/ifconfig igb0 ether "RG MAC" && /root/wpa_supplicant -B -Dwired -i igb0 -c /root/wpa_supplicant.conf -P/var/run/wpa_supplicant.pid && sleep 10 && /usr/sbin/wpa_cli logon
>
> Note the sleep is 10, not 5 as from an earlier version. This should resolve the delay at "configuring wan interface...." Specifically as you pointed on discord because of the mac spoofing not taking place when earlyshellcmd scripts are run.
>
> The 2nd shellcmd line is not 100% needed but should speed up dhcp at the end of the boot process. In testing, it appears the wan interface is toggled at some point after completing "configuring wan interface..." and completion of boot process. This causes a wpa reauth to take place. By issuing those commands, the 30s interval of upstream asking to reauth is bypassed.

perfect! Thank you!

@neydah700
Author

Two resolutions.

Initial issue of earlyshellcmd changes was solved with a netgate patch https://redmine.pfsense.org/issues/14410

The second issue, the need for netgraph at all, is also resolved: a patched wpa_supplicant supporting VLAN 0 is now available. https://redmine.pfsense.org/issues/14457 & https://reviews.freebsd.org/rGbb5d6d14d81b0789d2e73da03571603426afef56

earlyshellcmd and shellcmd to run with patched supplicant located in this comment:
#83 (comment)
