Wi-Fi traffic suddenly cutoff until LAN Control ticked. #152

Closed
greendalek opened this issue May 29, 2013 · 19 comments
@greendalek

Wi-Fi traffic was suddenly cut off for all whitelisted apps. Tried Dolphin Browser, the stock Internet browser, and the Fing network tool. None could reach the internet or even the local home network, although the Wi-Fi radio could successfully associate and connect to two different routers. Other devices without AFWall+ were unaffected. 3G was also unaffected, and Android's Safe Mode was not affected either.

Tried turning Wi-Fi off/on. Also tried disabling/enabling firewall (with Wi-Fi temporarily off). Tried turning device off/on.

In Preferences, Enable Roaming Control and Enable VPN Control were ticked. LAN Control was not ticked. In the whitelist, Kernel was ticked. But Root and Tether were both not ticked. Android System was also not ticked.

When LAN Control was then ticked in Preferences, and Android System app ID was also whitelisted, suddenly all Wi-Fi traffic resumed per whitelist criteria. Now, with Android System unticked, Wi-Fi traffic continues to flow.

Using AFWall+ (donate) v1.2.5.1 from the Play Store on a Samsung Galaxy Note 2 with rooted stock ROM N7100XXDMC3 on Android 4.1.2. The only other iptables-using app installed is Network Log v2.18.0, but its logging was already turned off before the sudden cutoff.

@ukanth
Owner

ukanth commented May 29, 2013

I pushed a newer version (1.2.5.2) which addresses this issue. Can you please test it and let me know? It should be live on the Play Store by now. Preferences are independent of the rules, so they should not affect this.

@greendalek
Author

It works now with the new version. Thank you.

Updated to v1.2.5.2 from Play Store. Turned off Wi-Fi then disabled firewall in preparation for reboot and/or Safe Mode. Then turned device off/on. Enabled firewall and turned on Wi-Fi. Dolphin Browser connects to Google normally again.

Unclear why traffic was flowing OK while in Safe Mode since only the user installed APKs are disabled in Safe Mode. Would have thought the configured iptables would still be active. I guess the network filtering is not statically set to survive through reboot. That's probably it, but I'm still looking for some good iptables reference online. So far, I am taking care to disable the firewall before applying downloaded OTA updates. In case root is lost and a factory reset is required. Don't want old UIDs in the network filters to inadvertently affect different apps reallocated the same numeric UID values. With no root, that might mean random apps being blocked/allowed internet access.

BTW, there is not much detail about LAN Control within the app. Not clear if this applies to devices with wired LAN connections, or if it refers to network traffic addressed to the local subnet (e.g. 192.168.0.x) even over Wi-Fi link.

@cernekee
Contributor

cernekee commented Jun 2, 2013

-                               // if the wifi interface is down, reject all outbound packets without logging them
-                               if (!cfg.allowWifi) {
-                                       listCommands.add(ipPath + " -A afwall-wifi -j REJECT || exit");
-                               } else if (any_wifi) {
+                               //--if the wifi interface is down, reject all outbound packets without logging them--
+                               //revert back to old approach
+                               if (any_wifi) {

Checking cfg.allowWifi was intended to block all wifi traffic when the OS tells us that wifi is down and we cannot determine the LAN IP address. This was a "fail safe" policy; the new code implements a "fail open" policy which may allow applications to access the LAN even if AFWall was set up to prohibit it.
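
The difference between the two policies can be sketched as follows. This is only an illustration, not the AFWall+ source: the class and method names are hypothetical, and `PLACEHOLDER_RULE` stands in for whatever the real `any_wifi` branch emits, which the diff above does not show. Only the REJECT command string is taken from the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class, method names and the placeholder rule are hypothetical.
class WifiPolicySketch {
    // Stand-in for the rules the real any_wifi branch adds (not shown in the diff).
    static final String PLACEHOLDER_RULE = " -A afwall-wifi -j RETURN || exit";

    // "Fail safe": if the OS reports wifi as down, reject everything on the wifi chain.
    static List<String> failSafe(String ipPath, boolean allowWifi, boolean anyWifi) {
        List<String> cmds = new ArrayList<>();
        if (!allowWifi) {
            cmds.add(ipPath + " -A afwall-wifi -j REJECT || exit");
        } else if (anyWifi) {
            cmds.add(ipPath + PLACEHOLDER_RULE);
        }
        return cmds;
    }

    // "Fail open": the wifi state is never consulted, so no blanket REJECT is added.
    static List<String> failOpen(String ipPath, boolean anyWifi) {
        List<String> cmds = new ArrayList<>();
        if (anyWifi) {
            cmds.add(ipPath + PLACEHOLDER_RULE);
        }
        return cmds;
    }

    public static void main(String[] args) {
        System.out.println(failSafe("iptables", false, true));
        System.out.println(failOpen("iptables", true));
    }
}
```

When wifi is reported as up, both variants produce the same commands; they differ only when the interface state is unknown or down, which is the case this issue is about.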

Looking through my InterfaceTracker code I do see one issue: allowWifi is set to true by the "new" (Gingerbread+) interface scanner when it finds a valid wifi interface, but the "old" interface scanner fails to do the same. This should get fixed; however, it shouldn't affect JB 4.1.2.

Do we have a logcat showing what happens during a connectivity change / interface scan on this device?

> Unclear why traffic was flowing OK while in Safe Mode since only the user installed APKs are disabled in Safe Mode. Would have thought the configured iptables would still be active. I guess the network filtering is not statically set to survive through reboot.

Correct, iptables only changes the rules in the running kernel. The kernel always starts up with an empty set of iptables rules. They need to be applied through a startup script (or in our case, by AFWall receiving an ACTION_BOOT_COMPLETED intent). This is the source of the "data leak at startup" problem, which can be remedied by altering the startup scripts in your ROM.

> That's probably it, but I'm still looking for some good iptables reference online.

tldp.org has some HOWTO guides on masquerading, firewalling, etc.

> So far, I am taking care to disable the firewall before applying downloaded OTA updates. In case root is lost and a factory reset is required. Don't want old UIDs in the network filters to inadvertently affect different apps reallocated the same numeric UID values. With no root, that might mean random apps being blocked/allowed internet access.

If you lose root, AFWall will not be able to apply its firewall rules anymore; the rules you get will be whatever the stock ROM / netd sets up.

@greendalek
Author

I now understand the 'LAN Control' tick box to refer to the local network connected through the Wi-Fi interface. And not to refer to a wired interface such as twisted-pair Ethernet (possibly bridged through USB dongle). Was confused earlier because the other ones for Wi-Fi, 3G, VPN, all conceptually refer to a network interface, rather than a subset of addresses reachable via an interface.

Would it help to 'fail-open' to only private network addresses such as 192.168.* and 10.* so as to block other network addresses unlikely to be considered local?

A novice at using logcat. Have Logcat Extreme installed as well as a few alternatives. What tag or search string to use? I tried 'wifi' (case insensitive search string). This does get something when I toggle Wi-Fi off and back on. Seems mostly about Wi-Fi Direct file transfer to be stopped. No interface names appeared.

This device has some apps on it which sniff and collect Wi-Fi SSIDs. Also an experimental Tasker applet to display location-based reminders. Perhaps these triggered the sudden connectivity cutoff event I reported.

Thanks for the explanations and tips.

@cernekee
Contributor

cernekee commented Jun 3, 2013

> I now understand the 'LAN Control' tick box to refer to the local network connected through the Wi-Fi interface. And not to refer to a wired interface such as twisted-pair Ethernet (possibly bridged through USB dongle). Was confused earlier because the other ones for Wi-Fi, 3G, VPN, all conceptually refer to a network interface, rather than a subset of addresses reachable via an interface.

Roaming is kind of an outlier, too...

The current description is: "Enable this to control LAN permissions separately from WiFi permissions"

Do you think it would be clearer if it were changed to something like: "Use separate permissions for accessing hosts on the LAN subnet"?

> Would it help to 'fail-open' to only private network addresses such as 192.168.* and 10.* so as to block other network addresses unlikely to be considered local?

IMO it would be best to figure out why the LAN interface info isn't being collected, first. Hopefully it's just due to bugs and corner cases in my new InterfaceTracker code, and those can be addressed without having to "fail open" at all.

Two potential issues with hardcoding IP ranges:

  • I don't know of a way to infer LAN vs. WAN on IPv6 without looking at the IP and prefix length.
  • This could produce surprising results if your "public" IP is in a reserved range.
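
As a rough illustration of the first point, a LAN membership test driven by the interface's own address and prefix length (rather than a hardcoded range) might look like the sketch below. It works the same way for IPv4 and IPv6. The class and method names are hypothetical and this is not the InterfaceTracker code; the addresses in `main` are examples only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only: names are hypothetical, not taken from AFWall+.
class SubnetSketch {
    // Parse a numeric IP literal; literals are parsed locally without a DNS lookup.
    static InetAddress parse(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + literal, e);
        }
    }

    // True if dest shares the first prefixLen bits with local (same address family only).
    static boolean sameSubnet(InetAddress local, InetAddress dest, int prefixLen) {
        byte[] a = local.getAddress();
        byte[] b = dest.getAddress();
        if (a.length != b.length) {
            return false; // IPv4 vs IPv6 mismatch
        }
        if (prefixLen < 0 || prefixLen > a.length * 8) {
            throw new IllegalArgumentException("bad prefix length: " + prefixLen);
        }
        int fullBytes = prefixLen / 8;
        int remBits = prefixLen % 8;
        for (int i = 0; i < fullBytes; i++) {
            if (a[i] != b[i]) {
                return false;
            }
        }
        if (remBits == 0) {
            return true;
        }
        int mask = (0xFF << (8 - remBits)) & 0xFF; // top remBits bits of the next byte
        return (a[fullBytes] & mask) == (b[fullBytes] & mask);
    }

    public static void main(String[] args) {
        InetAddress local = parse("192.168.0.108");
        System.out.println(sameSubnet(local, parse("192.168.0.1"), 24));
        System.out.println(sameSubnet(local, parse("10.0.0.1"), 24));
        // A hardcoded "private range" check would treat this as local regardless of the actual LAN:
        System.out.println(parse("10.1.2.3").isSiteLocalAddress());
    }
}
```

In standard Java the prefix length for an interface can be read with `InterfaceAddress.getNetworkPrefixLength()`; whether that is the best source on every Android version is not something this sketch settles.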

> A novice at using logcat. Have Logcat Extreme installed as well as a few alternatives. What tag or search string to use?

To see just AFWall messages (low volume), you can run the equivalent of: adb logcat -s "AFWall:*"

If you turn off the filter and capture everything, it will show a little more context. This can be helpful if we were expecting to see an AFWall message (e.g. due to an interface going up or down) but it's missing.

> This device has some apps on it which sniff and collect Wi-Fi SSIDs. Also an experimental Tasker applet to display location-based reminders. Perhaps these triggered the sudden connectivity cutoff event I reported.

If you can temporarily disable them, it would help rule them out as possible culprits.

@greendalek
Author

Yes, as you suggested. Or, "Use separate permissions for my local network (on Wi-Fi)". OTOH, perhaps each interface should be enabled by its own tick box (Wi-Fi, 3G, VPN). And Roaming would be indented under 3G, while LAN would be indented under Wi-Fi, as special cases.

When the app update appeared with LAN Control, I briefly noted the Play Store description and the Preferences wording. Somehow, I had thought this was for other low-cost device types which might have twisted pair cabling. TV Sticks, Media Players, etc.

It is a great improvement to access local network resources without leaking access to the wider world. I would further hope for some control of the local loopback interface. So that service apps like AdBlock Plus proxy do not open a pathway for every app not white-listed in AFWall+. Stopped using AdBlock Plus because of this.

I'll try to get the logcat next. Need to review if it is feasible to identify which apps are scanning the Wi-Fi for SSIDs. I have over 1000 apps downloaded from the Play Store. Does cause Bitmap Factory to crash for out of memory in many apps, when scrolling through list of all apps. I have icons turned off in Preferences for AFWall+ (carried over from DroidWall, I think).

@greendalek
Author

Collected unfiltered logcat traces for both Wi-Fi toggling off and toggling back on. Used 'Verbose' level on the Main trace buffer. Notably, there are no log entries with 'AFWall' literal string embedded. If AFWall is not the foreground activity, would it make any list entries?

I have cleaned up whatever personally identifying information I could from these two text files and put them into a ZIP archive (for transit error detection). Now, I am puzzled how to send this to you since there does not appear to be any PM facility on this site.

I am not an app developer; just a user who tapped Help in the app and was pointed here by the displayed URLs. Found the Issues section and signed up to post the problem report. Now, I am lost as to the next steps. Thanks for any suggestions.

@cernekee
Contributor

cernekee commented Jun 5, 2013

> If AFWall is not the foreground activity, would it make any list entries?

The broadcast receivers should still be able to log events, even if the activities are not running at the time.

It works OK on my system, at any rate...

> I am puzzled how to send this to you since there does not appear to be any PM facility on this site.

Could you please send the logcat output from v1.2.5 (not 1.2.5.1 or 1.2.5.2) to cernekee@gmail.com ? That version should log "received connectivity_change intent" and "received boot_complete intent" any time it is notified by the system. Newer versions perform some extra checks which could be suppressing the actions.

Thanks.

@greendalek
Author

OK, sure. I try to keep every version of my paid apps. On the Play Store, you can never go back to earlier versions. I'll sideload version 1.2.5 and capture the traces. More later.

@greendalek
Author

I have sent you a ZIP containing four logcat files. Two are with filtering turned on for the tag 'AFWall'. Two more are with no filtering. Each pair has a trace for the Wi-Fi being toggled on and another for the Wi-Fi being toggled off.

Duration of each trace is approximately 13 seconds. Covers time from Wi-Fi toggle control being tapped until after SuperSU toast notification appears for root access by AFWall+.

These traces were for version 1.2.5 as requested. So far, the sudden connectivity cutoff problem on Wi-Fi has not happened. The device is connected to a different Wi-Fi access point. I will try the original access point next.

Thanks for looking into this issue.

@cernekee
Contributor

cernekee commented Jun 6, 2013

> These traces were for version 1.2.5 as requested. So far, the sudden connectivity cutoff problem on Wi-Fi has not happened. The device is connected to a different Wi-Fi access point. I will try the original access point next.

OK - the traces look normal to me. We'll use these as a baseline.

Now let's figure out which factor(s) are associated with the connectivity cutoff problem: different AP, using AFWall 1.2.5.1 instead of 1.2.5, different Android device, different OS version, enabling or disabling the LAN option, etc. Then we could narrow down the differences and try to find the root cause.

@greendalek
Author

With version 1.2.5 still installed, testing on the original Wi-Fi access points did not trigger any cutoff in connectivity. It's plural because two APs have radio coverage over the original incident. Not certain which one the device was actively connected to then. I recall manually reconnecting to each but still not gaining back traffic flow, until turning on LAN Control. After that, traffic was OK, and LAN Control could even be turned off.

Since the problem had occurred with version 1.2.5.1, I am going to test with that next by sideloading the APK.

As for other OS versions, I am unable to proceed. This device does not have custom recovery. The flash counter is deliberately kept zero to retain Normal/Official status so as to receive OTA updates. My only other ICS device is unrooted. Viewsonic VB100a tablet which does not have any rooting methods when googled. Thus, also unable to proceed on this angle.

@greendalek
Author

There are two more possibilities which I considered as potential causes for the original incident. The first has to do with the two access points (actually home routers). Router-1 is connected to broadband service. Router-2 is downstream of Router-1. It has its four LAN ports repurposed as a network switch. Its broadband port is unused. Its IP address is statically set different from Router-1. And its DHCP service is turned off. From prior experience with another pair of routers, it is sometimes possible hosts connected to a downstream router (switch) do not successfully obtain a DHCP address lease. Perhaps the switch is not fully implemented since the manufacturer assumed DHCP would always be internally provided, or whatever.

However, it's unlikely this first scenario could be the cause of the problem incident. Other users and devices connected to Router-2 have not reported any problems obtaining DHCP addresses from the DHCP service of Router-1. Furthermore, during the original troubleshooting, I repeatedly reconnected manually to both routers successfully, with only the traffic silently blocked.

The second possibility is much more complicated. This device has over 1000 apps installed. Android seems to have some system components hard coded to limit resources to this quantity of apps, regardless of actual hardware capability. This is the second time in three months that, as the number of apps grows incrementally past the 1000 threshold, the system error about android.process.acore having 'Unfortunately' stopped keeps popping up with increasing frequency, like a denial-of-service attack from within Android. Stable apps start crashing intermittently due to spurious 'Package Manager has died' and 'Binder Proxy transaction too large' errors. Deleting a few apps fixes the problem, without restarting the device. Installing a few apps starts the system errors again. 'Unfortunately' as a mere user, there is no Report button to send any stack trace to Google.

It may be that your AFWall broadcast receiver got clobbered by such a system error, if this device was at the edge of the 'acore tipping point'.

@greendalek
Author

I have sent you a new ZIP containing two logcat files. These have no filtering. One is for the Wi-Fi being toggled on. The other is for Wi-Fi being toggled off. Could not capture with filter tag AFWall as nothing turned up.

After these traces were captured, the Wi-Fi traffic was observed to be silently blocked, despite a DHCP address lease being successfully obtained.

Tried connecting to both Router-1 (broadband) and Router-2 (downstream switch), but both are silently blocked. The radio connection to Router-2 is via a Wi-Fi extender.

Using Fing app to ping this device itself gets total packet loss. Using 'Ping & DNS' to ping this device on its own numeric address 192.168.0.108 actually succeeds for all ten test packets.

As I finish typing this, I'll turn on the LAN Control setting and try to unblock the Wi-Fi traffic.

I had to turn on LAN Control and even had to click Apply. All the tick boxes under LAN Control were left unticked. I noticed that SuperSU prompted for root access by AFWall+ 1.2.5.1. This means AFWall+ 1.2.5.1 had not executed as root since being sideloaded back in, which suggests the broadcast receivers had not run prior to this.

@cernekee
Contributor

So, to summarize: with 1.2.5 you do see AFWall applying the rules, but with 1.2.5.1 you do not?

One of the big changes in 1.2.5.1 was to change the broadcast receiver so that it only applied the rules if LAN control or roaming control are used. I believe the intention was to cut down on the number of su requests and any lag they might cause. However, this does mean that the rules need to be applied through some other mechanism. If the wifi interface was down at the time the rules were applied, then that could be a problem.

A change between 1.2.5.1 and 1.2.5.2 was to remove the wifi blockout. The blockout logic drops all outbound wifi packets if the system reports that the wifi interface is not up. The intention here is to make sure the LAN rules aren't inadvertently bypassed when e.g. switching between different LANs with different IP ranges.

My speculation as to why some users were seeing excessive su requests on v1.2.5? If the AFWall process was killed by the system and is revived to receive the CONNECTIVITY_CHANGE broadcast, InterfaceTracker.currentCfg will start out as null again. The new configuration is not null, of course, so this will trigger reapplication of the rules. A way to improve on this strategy might look like:

  • InterfaceTracker.checkForNewCfg() should cache the network configuration in our SharedPreferences whenever it changes. If currentCfg is null, read it out of SharedPreferences.
  • On receipt of BOOT_COMPLETED, make sure the current network configuration is stored into our SharedPreferences (so that we're not comparing against stale data from the last boot).

So even if the system kills the AFWall process, it will remember the network configuration and it won't make excessive calls to su for events that don't actually involve a connectivity change.
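
A minimal sketch of that caching idea follows. It is not the actual InterfaceTracker: a plain `Map` stands in for Android's SharedPreferences so the example runs off-device, the network configuration is reduced to a single string, and the key name and class name are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a Map stands in for SharedPreferences; all names are hypothetical.
class ConfigCacheSketch {
    static final String KEY = "last_network_cfg"; // hypothetical preference key

    private final Map<String, String> store; // survives "process death" in this sketch
    private String currentCfg;               // lost whenever the process is killed

    ConfigCacheSketch(Map<String, String> store) {
        this.store = store;
    }

    // Returns true only if the configuration really changed, i.e. rules need reapplying.
    boolean checkForNewCfg(String newCfg) {
        if (currentCfg == null) {
            currentCfg = store.get(KEY); // recover the cached value after a restart
        }
        if (newCfg.equals(currentCfg)) {
            return false; // nothing changed, so no su call is needed
        }
        currentCfg = newCfg;
        store.put(KEY, newCfg);
        return true;
    }

    // On boot, overwrite the cache so stale data from the previous boot is never compared.
    void onBootCompleted(String cfg) {
        currentCfg = cfg;
        store.put(KEY, cfg);
    }

    public static void main(String[] args) {
        Map<String, String> prefs = new HashMap<>();
        ConfigCacheSketch first = new ConfigCacheSketch(prefs);
        System.out.println(first.checkForNewCfg("wlan0 192.168.0.108/24"));
        // Simulate the system killing and reviving the process: new object, same store.
        ConfigCacheSketch revived = new ConfigCacheSketch(prefs);
        System.out.println(revived.checkForNewCfg("wlan0 192.168.0.108/24"));
    }
}
```

The second call in `main` models a revived process receiving a broadcast for an unchanged network; because the cached value is read back, it does not trigger a reapply.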

@greendalek
Author

Yes. I was planning to further test and see how often it occurs. Just sent the traces out to you first. More later.

@ukanth
Owner

ukanth commented Jun 11, 2013

@cernekee, sorry for jumping in late. Hope you know the reason. First of all, sorry for not including you in the minor releases after 1.2.5 for the changes that I made.

A couple of points regarding 1.2.5:

  • On most devices the iptables rules don't change when connectivity changes, and many lower-end devices are affected by the multiple su calls.
  • On most devices Wi-Fi got blocked. I didn't get time to debug that, which is why I reverted to the old approach.

I'm thinking of using the old DroidWall approach: write the rules to a script file and execute it on boot, so that it will not affect the startup rules. We can discuss it.

Join me @ gtalk -> cumakt

@greendalek
Author

Testing at a cafe hotspot. More bewildering anomalies observed on version 1.2.5.1.

This hotspot usually allows a handful of web pages to be fetched without signing in. This behaved as expected. Beyond this, the hotspot will redirect to a free login/signup page at a 10.* private address. This did not work as expected, with the page fetches timing out. Had to (a) turn on LAN Control in Preferences, (b) tap Apply in the app menu, and (c) go back and redo it with the LAN box ticked for Dolphin Browser. Then only was the login page fetched.

From a usability point of view, it was unexpected to need to tap Apply for the Preferences change to kick in.

It was also unexpected that pages for Google News and randomly selected articles could be fetched, which would have to use 10.* as the local gateway, but that the login page at 10.* could not be fetched, without steps (a), (b) and (c) above. I suppose this is conceptually consistent if you consider the LAN and Wi-Fi columnar presentation in rules, since LAN was off but Wi-Fi was on for Dolphin. Sure would confuse anyone troubleshooting connectivity, if unfamiliar with specifics of AFWall settings. Personally, I prefer a PC software firewall approach where you get to specify interfaces, protocols, incoming/outgoing addresses and ports for each app, with conveniences like network address masks or CIDR, and user defined address/port groups. This present columnar approach may someday require panning left-right as more columns get added for quantum entanglement, wormholes, etc.

Once those 3 steps were taken, a whole bunch of other apps (e.g. K9 Mail, AppSales) were able to update and posted notifications.

Will do more testing with home Router-1 and Router-2 next.

@greendalek
Author

Tested version 1.2.5.1 with home Router-1 (192.168.0.1) and Router-2 (192.168.0.250). Router-1 is broadband gateway. Router-2 is older device repurposed as a network switch. Both have Wi-Fi access points enabled on separate SSIDs.

Used the 'Fing' and 'Ping & DNS' apps to ping both routers. Results are consistently the same: LAN Control must be enabled and each ping app has to be allowed under the LAN column.

Strangely, with LAN Control off, the 'Ping & DNS' app can ping this same device at its own LAN address of 192.168.0.108. The ping times measured were many milliseconds, suggesting that the local loopback interface was not used as a shorter pathway. Does this mean there is a corner case where iptables does not fully block 192.168.0.* when it seems to be blocking everything else on the same subnet? If so, what other unknown corner cases might there be which may not be as benign?

One more anomaly noted. When toggling LAN Control from on-off-on, the individual apps lose their allow/deny status. Have to identify relevant apps and tick their boxes all over again. Would be more convenient if these were remembered, and just have the column header manually toggle enable/disable all for the whole column, if necessary to set/clear for all apps.

I'll be reverting to version 1.2.5.2 and following any updates. Let me know if there is anything else to check.

@ukanth ukanth closed this as completed Mar 18, 2014