OMR Script Triggers for VPS and OMR-WAN Fail Events #2793

Network-Traditions · 2023-03-06T20:17:39Z

Is your feature request related to a problem? Please describe.
When a stable production VPS or OMR WAN connection develops an issue, rebooting the VPS or WAN interface often resolves the problem.

Describe the solution you'd like
Upon a VPS or OMR WAN failure event, provide an option to trigger a bash script on OMR that performs a task such as a REBOOT/RESTART, so that corrective action can be executed programmatically.

Describe alternatives you've considered
We considered digging into OMR to capture failover events in a manner similar to the script we use to obtain the OMR WAN public IP addresses for our dynamic DNS service:

pip=$(uci get openmptcprouter.wan1.publicip 2> /dev/null)
if [ -z "$pip" ]; then echo "0.0.0.0"; else echo "$pip"; fi
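As a hedged sketch, the same lookup can be wrapped in a function so that it works for any WAN label rather than only `wan1`. The function name `wan_public_ip` is made up for illustration; the `openmptcprouter.<wan>.publicip` key is the one used in the script above, and `uci -q` only suppresses error output.

```sh
#!/bin/sh
# Print the public IP that OMR has recorded for a WAN, or 0.0.0.0 if none.
# $1 = WAN section name as it appears in the openmptcprouter config (e.g. wan1).
wan_public_ip() {
    # -q keeps uci quiet when the option does not exist.
    pip=$(uci -q get "openmptcprouter.$1.publicip") || pip=""
    echo "${pip:-0.0.0.0}"
}
```

Calling `wan_public_ip wan2` would then print the address for the second WAN, assuming OMR stores it under the same key.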

Additional context
We have a USA T-Mobile Business connection as one of our OMR WAN services. Every 1 to 2 days this service develops a connection problem in which the OMR USB modem interface must be restarted before it begins receiving packets again. We use the following manually executed bash script to perform the restart remotely:

echo "usb2"> /sys/bus/usb/drivers/usb/unbind
sleep 5
echo "usb2"> /sys/bus/usb/drivers/usb/bind

Clicking OMR's WAN "Restart" button does not resolve this connection problem; only unplugging the USB modem or running this script allows the modem to begin receiving packets again.
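For reuse from other scripts, the unbind/bind sequence above could be wrapped in a small function. This is only a sketch: the name `rebind_usb` and its optional arguments are assumptions added for illustration, while the default driver path and the `usb2` device name come from the script in this issue.

```sh
#!/bin/sh
# Unbind a USB device from its driver, wait, then bind it again.
# $1 = device name (e.g. usb2)
# $2 = pause in seconds between unbind and bind (default 5)
# $3 = driver directory (default /sys/bus/usb/drivers/usb)
rebind_usb() {
    dev="$1"
    pause="${2:-5}"
    drv="${3:-/sys/bus/usb/drivers/usb}"

    # Refuse to run if the driver directory lacks the control files.
    if [ ! -e "$drv/unbind" ] || [ ! -e "$drv/bind" ]; then
        echo "rebind_usb: no bind/unbind files in $drv" >&2
        return 1
    fi

    echo "$dev" > "$drv/unbind" || return 1
    sleep "$pause"
    echo "$dev" > "$drv/bind"
}
```

Running `rebind_usb usb2` as root would then be equivalent to the three manual commands.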

A similar issue exists with the VPS as well. When OMR is experiencing abnormal issues such as the VPN being down, being unable to contact the Admin Script, or VPS disconnects, a VPS reboot often resolves the issue.

We've deployed scripts on OMR to dynamically register the WAN public IP addresses and the currently connected VPS public IP addresses with our DNS provider. Doing so allows OMR to programmatically update our public DNS records when it switches VPS providers. Within about 5 minutes, our email, VoIP calls, and web services all get back online automatically without admin intervention. During such a fail event, we would like to execute a REBOOT/RESTART on the failing VPS in an attempt to clear the problem and restore it to a nominal operating status.
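To make the request concrete, the behaviour being asked for amounts to running a handler once when a target goes from up to down, not on every poll while it stays down. The sketch below shows that edge detection only. It is not an OMR feature: `on_transition`, the state directory, and the handler are all hypothetical, and how the "up"/"down" state is obtained is left to the caller.

```sh
#!/bin/sh
# Run a handler once when a target changes from "up" to "down".
# $1 = target label (e.g. a VPS or WAN name)
# $2 = current state, "up" or "down", as determined by the caller
# $3 = directory in which the last seen state is stored
# remaining arguments = handler command; the label is appended as its last argument
on_transition() {
    target="$1"
    state="$2"
    dir="$3"
    shift 3
    file="$dir/$target.state"

    # Load the previous state (empty on the first call), then store the new one.
    prev=$(cat "$file" 2>/dev/null) || prev=""
    echo "$state" > "$file"

    # Fire only on the up -> down edge.
    if [ "$prev" = "up" ] && [ "$state" = "down" ]; then
        "$@" "$target"
    fi
}
```

The handler could be whatever corrective action fits the target, such as a script that asks the VPS to reboot.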

Ysurac · 2023-03-07T19:23:48Z

I would need to know why the VPS is failing to solve that. So I would need /var/log/daemon.log, ip r, ip a, iptables-save and a status page screenshot on the router side, both when it's failing and when it's working.

Network-Traditions · 2023-03-07T21:07:45Z

I'll try to capture that information the next time it occurs. What I was wondering, though, is whether there is an opportunity to hook into OMR's monitoring events to trigger a bash script. For example, OMR determines the master VPS is offline, switches to an alternate, and has an option to trigger a bash script upon completion that would execute a custom action such as rebooting the master VPS. Something similar for the WAN interfaces would be useful as well. When our T-Mobile service starts giving us problems, the "RX" packet count under Network-Interfaces stops incrementing, or increments by only a few packets per display update. Eventually the OMR status page indicates that the respective WAN service is down with a red X. Hooking into OMR's connection monitoring of the WAN interfaces to allow custom bash script execution would present the same opportunity to programmatically implement corrective action that otherwise requires admin intervention.
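The stalled RX counter described above can also be detected from a script, independently of OMR's own monitoring. The sketch below samples the standard Linux counter `/sys/class/net/<iface>/statistics/rx_packets` twice and reports a stall when it grew by less than a threshold. The function name `rx_stalled`, its arguments, and the default values are assumptions for illustration, and it is assumed (not verified here) that this counter is the one shown on the OMR interface page.

```sh
#!/bin/sh
# Report whether an interface has (almost) stopped receiving packets.
# $1 = interface name
# $2 = sampling interval in seconds (default 30)
# $3 = minimum packets expected during the interval (default 1)
# $4 = base directory of the interface counters (default /sys/class/net)
# Returns 0 if fewer than $3 packets arrived, 1 otherwise,
# and 2 if the counter cannot be read.
rx_stalled() {
    iface="$1"
    interval="${2:-30}"
    min_delta="${3:-1}"
    netdir="${4:-/sys/class/net}"
    counter="$netdir/$iface/statistics/rx_packets"

    [ -r "$counter" ] || return 2
    before=$(cat "$counter") || return 2
    sleep "$interval"
    # The interface may have disappeared while we were sleeping.
    after=$(cat "$counter" 2>/dev/null) || return 2

    [ $((after - before)) -lt "$min_delta" ]
}
```

A cron job could call something like `rx_stalled wwan0 60 10` (the interface name is hypothetical) and, when it returns 0, run the unbind/bind commands from the original post.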

Network-Traditions · 2023-03-08T06:20:42Z

Data for a T-Mobile loss of service event at about 19:45 log time. When this type of service loss occurs, eventually OMR will loop attempting to recover the interface (WAN1). Observing the Network-Interfaces tab as illustrated below next to the red arrow, the RX: packets will not change from the number displayed:

The System-OpenMPTCProuter-Status tab will indicate a problem, as shown for the "cell" WAN1 here:

The T-Mobile loss of service event eventually disrupts all service. At a minimum the disruption is temporary, but sometimes it is permanent and requires VPS and/or OMR reboots to regain internet connectivity through "star" WAN2, the multipath master:

Additionally, the OMR console will begin logging the following events:
OMR-dmesg.txt
To correct the problem, I SSH'd into OMR and executed the following script about 19:58 log time:

echo "usb2"> /sys/bus/usb/drivers/usb/unbind
sleep 5
echo "usb2"> /sys/bus/usb/drivers/usb/bind

After that, ModemManager can usually recover the interface fully and bring WAN1 service back online for OMR:

Here are the requested logs:
daemon.txt
OMR-Systemlog.txt
ipa.txt
ipr.txt
iptables.txt

Ysurac · 2023-03-08T12:12:02Z

This seems to be a bug in the USB driver. I will add a kernel patch that may solve the issue.

Network-Traditions · 2023-03-08T18:21:05Z

OK, let me know when you're ready to test the patch and how to apply it properly, and I will let you know the result. Our system experiences the T-Mobile failure every 1 to 2 days on average.

github-actions · 2023-06-06T19:10:06Z

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days

Network-Traditions added the feature request label Mar 6, 2023

github-actions bot added the Stale label Jun 6, 2023

github-actions bot closed this as completed Jun 12, 2023

Network-Traditions mentioned this issue Sep 14, 2023

Possible vunerability in OMR: bad packet attack from Internet IP crashes glorytun, brings down wan interface, omr-tracker fails to restart interface. #2956

Closed
