This repository has been archived by the owner on Nov 6, 2023. It is now read-only.

When to disable rulesets #7338

Closed
Hainish opened this issue Oct 13, 2016 · 17 comments

@Hainish
Member

Hainish commented Oct 13, 2016

This is a follow-up to the discussion in #7329

@jeremyn @terrorist96

@jeremyn
Contributor

jeremyn commented Oct 13, 2016

@terrorist96 wrote:

#7329 (comment)

Can we also please set the following rules to default=off before the new update?
#4510
#4545
#5086
#5621

They have been broken for several releases now (7 months for the oldest one) and no one has attempted a fix yet, so let's set them to default=off until they get fixed, as per:
#5631

If I get approval, I'll make the PRs.

I wrote #7329 (comment)

@terrorist96 I disagree that we should bulk disable rules just because no one has fixed an issue with them yet, especially these complex CDN rules that span multiple sites in unclear ways (see #4510 (comment)). Unfortunately the people who can decide that a rule is safe to disable are the same ones who are able to fix it, and the problem is we don't have enough of those people looking at issues.

It's also a security concern. Perhaps I'm a malicious actor and want to disable a ruleset for one of these domains, so I report a fake issue with some other domain in the ruleset, with a problem that sounds tricky and hard-to-reproduce. Eventually I pressure the maintainers (us) to disable the entire ruleset until my fake issue is resolved, which will never happen.

@terrorist96 wrote #7329 (comment)

@jeremyn You can take a look at those issues and confirm that they break website functionality. Are you saying that you can't reproduce the issues?
I'm simply agreeing with @Hainish that rules that are found to be causing problems should be disabled until fixed, then may be re-enabled. Do we want maximum security with the side effect of broken websites or decent security with no broken websites? We browse the web to use websites. If that site's functionality is broken, it pushes people away from tools like HTTPS Everywhere because the broken functionality is not worth the security tradeoff. Most people won't go through the trouble of debugging each ruleset to see which one is causing the problem. And even if they do, and they disable that rule, that rule becomes disabled globally, not just for that site, so the end result is the same: that rule becomes disabled. And since rulesets that break websites are allowed to persist (I'm not pointing any fingers here, I understand everyone is busy and this is more so a volunteer kind of thing), it makes me hard pressed to be able to recommend tools such as HTTPS Everywhere to my non-tech-savvy friends, because I'll get blamed for breaking their web experience. They won't see the enhanced security, but they will see videos on websites not loading, for example.
So, help me be able to recommend more people use HTTPS Everywhere. 😅

@jeremyn
Contributor

jeremyn commented Oct 13, 2016

@terrorist96 Security vs usability is a complex issue. See here for an earlier discussion related to mixed content blocking. We need to strike a balance.

It takes technical time to review an issue. Take the AWS issue #4510. As @J0WI pointed out in #4510 (comment), that ruleset affects a lot of sites. I can disable the ruleset to fix videos but now I've broken a bunch of other domains like these, so I won't do that. So now I have to figure out what part of the ruleset can safely be disabled, and at this point I'm basically fixing the issue anyway, when your suggestion was just to do something quick and temporary.
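For readers not familiar with the ruleset format: narrowing a ruleset like this means editing its XML rather than flipping a single switch. A minimal sketch of the kind of targeted change involved, using hypothetical hosts and a made-up broken path rather than the actual AWS ruleset:

    <ruleset name="Example CDN">
      <target host="cdn.example.org" />
      <target host="video.example.org" />

      <!-- Hypothetical fix: exclude only the path that breaks video playback,
           so the rest of the ruleset keeps rewriting to HTTPS -->
      <exclusion pattern="^http://video\.example\.org/player/" />

      <rule from="^http:" to="https:" />
    </ruleset>

Even a narrow change like this still has to be verified against every other host the ruleset covers before it can be merged.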

There is no quick fix for the problem of too many problems, not enough technical people working on them.

@terrorist96
Contributor

terrorist96 commented Oct 13, 2016

Here's a quote from Hainish in the linked discussion:

...it is our job to make sure users are being provided the best endpoint security they can get without disrupting functionality.

And here the functionality is being disrupted. You also argue for erring on the side of usability. However, you wouldn't be breaking a bunch of other domains in the process of disabling the ruleset; you'd only be disabling the forced HTTPS, not breaking anything.

An additional argument for disabling the rules is that it could motivate those who can fix them to give them higher priority, since many sites would have less encryption than they did before.

MCB is a smaller issue, in my opinion. In that case, you can load unsafe scripts by clicking a button in the browser and get the site to work while also keeping the rulesets active. My reported issues require the ruleset to be deactivated in order to regain functionality, and the person has to remember to reactivate the rule after they are done using that site, otherwise it stays deactivated globally, which effectively equates to default=off.

The benefit of making it default=off in the code is that you can turn it back on in a future update, and your users won't have had to disable it themselves (and have forgotten about it). So you'd be protecting more people in the long run. Otherwise, the way it is now, many people may have disabled these rules because of loss of functionality, and if/when the rules are fixed and a new version is pushed out, those people will still have the rules disabled, assuming they're still broken (or, again, having forgotten about them). By preventing the user from encountering broken functionality, you lessen the chance of them disabling rules, and fewer disabled rules = more security.

I think your concern about MCB blocking ads on a site should be outweighed by rulesets breaking main site functionality (like displaying a video), not ancillary things like advertisements. A site owner not being able to display ads because of MCB is less of a concern than a site owner not being able to serve content, which is the main function of the site.

It seems that most of the arguments are in my favor; the only thing holding this back is the temporary global loss of security on other, unrelated sites (notwithstanding the potential permanent loss of security on those same unrelated sites via user-disabled rulesets).

@terrorist96
Contributor

One last thing: if my argument loses, then why are all these default=off rules allowed to stand?
https://github.com/EFForg/https-everywhere/search?utf8=%E2%9C%93&q=default%3Doff

They, too, cause broken functionality. But you're also lessening user security.
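For readers unfamiliar with the mechanism: "default=off" refers to a ruleset's default_off attribute, which ships the ruleset disabled by default while still letting users re-enable it manually. A hypothetical sketch of such a ruleset (made-up name and hosts, not one of the rulesets linked above):

    <!-- Shipped disabled; the attribute value records the reason -->
    <ruleset name="Example.com" default_off="breaks video playback">
      <target host="example.com" />
      <target host="www.example.com" />
      <rule from="^http:" to="https:" />
    </ruleset>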

@jeremyn
Contributor

jeremyn commented Oct 13, 2016

By "broken a bunch of other domains" in my last comment, I indeed meant disabling HTTPS coverage, not breaking the intended behavior of the site.

Nobody is arguing with you that these issues should be fixed. If HTTPS Everywhere is breaking some video playback, then that's a problem. But, I'm not going to blindly disable rulesets based on the fact that an issue has been open for a while.

I'm a volunteer here and can work on what I want. If you want those issues fixed more quickly then you should try fixing them yourself, or find or pay a friend who will fix them for you.

@terrorist96
Contributor

I just fail to see the consistency. The GoogleVideo ruleset used to be on, but one day it started to break stuff. Instead of the rule being fixed, it was simply turned off since it was causing broken functionality, and that rule affects many sites.
I'm not pushing you or anyone to make the fixes. But since no one has, why can't it be turned off, like GoogleVideo was, if it's breaking major functionality? When is it appropriate to disable a rule? If not when it breaks functionality, then when?

@Hainish
Member Author

Hainish commented Oct 13, 2016

The general rule we've followed in the past is that we should be liberal in disabling rulesets that break functionality, even if that lessens the security of some valid sites included in a ruleset. When a ruleset is disabled, a user navigating to a host included in that ruleset has the option of manually re-enabling it. But that's functionality they have to know about. I'm willing to bet a good portion of our users, if not most of them, just install HTTPS Everywhere without realizing you can disable and re-enable rulesets manually. For them, a site is simply broken, and if you disable HTTPS Everywhere, it ceases to be broken. So I think for most sites, it's best to just disable a ruleset if it's causing an issue.

Last year, @jsha ran an automated test which disabled any ruleset which did not pass the fetch test. This disabled a good portion of the total rulesets we have in the addon, but the thinking is that it's better to be functional and give users some security than non-functional and have users uninstall the addon altogether.

I agree with @jeremyn that we should verify that a ruleset is actually causing an issue before disabling it. And there are some edge cases as well. If one of the sites in the Bit.ly ruleset is not resolving in DNS, we shouldn't disable the ruleset, which includes 26k hosts. The distinction is that this probably won't actually disrupt site functionality.

@terrorist96, if a ruleset maintainer has stated that they don't want to address a specific class of issues, that's okay, and you should respect that. For example, someone who doesn't know Chinese may feel that they are ill-equipped to tell whether a ruleset disrupts the functionality of a .cn site. I think that's fine, and so that user should not be flagged when sites like this come up. If @jeremyn doesn't want to merge PRs that disable rulesets, that's fine.

@terrorist96
Contributor

And that's fine with me. Like I said, I'm not trying to force anyone to do anything here. I'm just trying to understand the reasoning for disabling some rules that break functionality while leaving others enabled.
Thanks for posting.

@Hainish
Member Author

Hainish commented Oct 13, 2016

In the case of CDNs, disabling these rules can have a dangerous cascading effect due to MCB, so I agree with @jeremyn there that we should not just disable them. We do not know where disabling them will cause problems for other rulesets, and that's hard to predict.

So for CDNs, I'd say let's err on the side of caution and keep them enabled, but also try to fix them quickly.

@Hainish
Member Author

Hainish commented Oct 13, 2016

It's kind of tricky to predict when a site may be used as a source of 3rd party content. For now we are forced to use our best judgement, which can be pretty shoddy in certain circumstances. I know that Google Analytics is used on many sites, but maybe I wouldn't know that Baidu Analytics was something commonly used. So someone's perspective is of course locale-dependent, and this is problematic when making judgement calls such as whether to disable a ruleset when you're unsure if that action will cause lateral impact.

I'm going to try to track down a tool that may help here. I know OpenWPM is a tool used to gather statistics on the inclusion of 3rd party trackers on first-party sites, but I'm unsure if it's just gathering this data for trackers or all 3rd party resources. There are probably tools out there that can help us, but I'd have to search them out.

@Hainish
Member Author

Hainish commented Oct 13, 2016

NerdyData seems to be a good tool for this. It's able to search source code on lots of sites: https://nerdydata.com/search?query=fonts.googleapis.com

@Hainish
Member Author

Hainish commented Oct 14, 2016

Common Crawl might be a great resource for this. It has 250TB+ of web source data: https://commoncrawl.org/

@jeremyn
Contributor

jeremyn commented Oct 14, 2016

Part of the problem is that these CDN rulesets jumble things together when they shouldn't, with little documentation. In other words, they're not good rulesets. Maybe one of these domains serves security-critical content, like Android .apks that anonymize mobile traffic in hostile areas. Or maybe they all serve cat pictures. Who knows?

EDIT: And of course, those are just test URLs for a wildcard. There may be URLs covered by the wildcard that are serving security-critical content that we have no idea exist.
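To make the wildcard point concrete, here is a sketch with made-up hosts (not the actual AWS ruleset): a single wildcard target covers every subdomain, including ones no contributor has ever looked at.

    <ruleset name="Example CDN">
      <!-- Matches files.example-cdn.net, downloads.example-cdn.net, and any
           other subdomain, audited or not -->
      <target host="*.example-cdn.net" />
      <rule from="^http:" to="https:" />
    </ruleset>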

@jeremyn
Contributor

jeremyn commented Oct 14, 2016

Also, the problem class of broken third-party videos can be super annoying to troubleshoot. (None of these examples is meant to criticize any of the people who reported these issues on a personal level; I'm just giving them as examples illustrating how these issues can go.)

In https://github.com/EFForg/https-everywhere/issues/5602, the site is in a language I don't know, and I get different results between browsers.

In https://github.com/EFForg/https-everywhere/issues/4062, there might be some Flash complication in Firefox.

The issue creator sometimes states the problem as just "video won't load" or "video broken" -- what does that mean? A partial list of possibilities (though I'm not claiming to have personally seen all of these in issues): the player doesn't show; the player shows but there's an error; the player shows with no error, just a spinner that won't go away; the player shows an ad but not the video after the ad. I have to inspect a site I'm not familiar with, maybe in a language I don't know, with tons of ads and other content blinking at me, to figure out what "video won't load" might mean.

In https://github.com/EFForg/https-everywhere/issues/7287 you need an account to see the problem in the original example URL.

In https://github.com/EFForg/https-everywhere/issues/4262 the problem was already fixed.

People reporting video problems are sometimes not experienced contributors, so there can be more interaction than normal, which takes time (https://github.com/EFForg/https-everywhere/issues/4519).

Assuming I can reproduce the problem, these media sites usually have dozens of rules because of ads and trackers, so it might not be obvious which ruleset is causing the problem. The Firefox Web Console can get flooded with messages from the dozens of ad and tracker scripts running, making it difficult to look for the usual mixed content or CORS messages. And some CORS problems can't be fixed due to a very old bug (#49) that I still don't fully understand.

I hope this helps explain why it's easy to pass on video problems when there are hundreds of other open things to do.

@terrorist96
Contributor

Appreciate the explanation. I always try to mention the specific ruleset that causes the problem when making a report. If there's anything else I can do to help, just ask. For example, I found a deep link to help reproduce/isolate an issue here. :)

@ghost

ghost commented Nov 17, 2016

Also be aware that differences in enabled/disabled rulesets between users are a possible fingerprinting target, although an unlikely one.

@jeremyn
Contributor

jeremyn commented Dec 5, 2017

I'm closing this since it's a year-old discussion.

@jeremyn jeremyn closed this as completed Dec 5, 2017