CNAME first-party resources detected as third-party #909

Closed
6 of 8 tasks
fuzzykiller opened this issue Feb 25, 2020 · 26 comments
Labels
Firefox specific to Firefox invalid not a uBlock issue

Comments

@fuzzykiller

Prerequisites

  • I verified that this is not a filter issue
  • This is not a support issue or a question
  • I performed a cursory search of the issue tracker to avoid opening a duplicate issue
    • Your issue may already be reported.
  • I tried to reproduce the issue when...
    • uBlock Origin is the only extension
    • uBlock Origin with default lists/settings
    • using a new, unmodified browser profile
  • I am running the latest version of uBlock Origin
  • I checked the documentation to understand that the issue I report is not a normal behavior

Description

When a site is fully CNAME-hosted (for example *.azurewebsites.net is backed by *.windows.net is backed by *.cloudapp.net), its origin is considered third-party.

A specific URL where the issue occurs

Steps to Reproduce

  1. Under "My Rules", have the following:
    * * 3p-script block
    
  2. Go to CNAME-hosted website (see above)

Expected behavior:

Website consisting of (or requiring) only first-party scripts should work.

Actual behavior:

Requests to first-party scripts are blocked as third-party because the site's hostname is a CNAME.

Your environment

  • uBlock Origin version: v1.25.0
  • Browser Name and version: Mozilla Firefox 73.0.1 (64-bit)
  • Operating System and version: Windows 10 1909
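As a rough illustration of the behavior described in this report, the sketch below follows a CNAME chain and then compares the page hostname with the canonical name. The records are made up to mirror the azurewebsites.net, windows.net, cloudapp.net chain mentioned above, and `naive_base_domain` is a hypothetical stand-in that keeps only the last two labels; a real implementation would consult the Public Suffix List. It is not uBlock Origin's code.

```python
from typing import Dict, List

# Made-up CNAME records mirroring the chain described in this report.
CNAME_RECORDS: Dict[str, str] = {
    "site.azurewebsites.net": "site.windows.net",
    "site.windows.net": "waws-prod-bay-007.cloudapp.net",
}


def resolve_chain(hostname: str, records: Dict[str, str], max_hops: int = 8) -> List[str]:
    """Follow CNAME records starting at hostname and return every name visited."""
    chain = [hostname]
    while chain[-1] in records and len(chain) <= max_hops:
        chain.append(records[chain[-1]])
    return chain


def naive_base_domain(hostname: str) -> str:
    """Assumption: the last two labels approximate the registrable domain."""
    return ".".join(hostname.split(".")[-2:])


def is_third_party(page_hostname: str, request_hostname: str) -> bool:
    """A request is third-party when the two base domains differ."""
    return naive_base_domain(page_hostname) != naive_base_domain(request_hostname)


page = "site.azurewebsites.net"
chain = resolve_chain(page, CNAME_RECORDS)
canonical = chain[-1]

print(chain)
print(is_third_party(page, page))
print(is_third_party(page, canonical))
```

Under these assumptions the request is first-party when judged by the name in the address bar and third-party when judged by the uncloaked canonical name, which is the mismatch reported here.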
@uBlock-user
Contributor

uBlock-user commented Feb 25, 2020

Yes, this is by design. Before v1.25 these CNAMEs were cloaked; now they are being uncloaked.

@uBlock-user uBlock-user added Firefox specific to Firefox invalid not a uBlock issue labels Feb 25, 2020
@fuzzykiller
Author

I know. I however believe this is very much not by-design. It is simply an edge case that was missed. The first-party page itself cannot be third-party.

@uBlock-user
Contributor

uBlock-user commented Feb 25, 2020

It is simply an edge case that was missed.

Not an edge case. In fact there are many such cases; one example is https://www.ghacks.net/, hosted entirely on marfeelcdn.com along with marfeel.map.fastly.net.

The first-party page itself cannot be third-party.

You're visiting *.azurewebsites.net, so a connection made to any other domain, regardless of where it's hosted and including its CNAME, is third-party. If you're using Dynamic Filtering, you will need to adjust your rules accordingly, or disable CNAME uncloaking in the advanced settings.

Read up on #780, the case behind the CNAME uncloaking implementation.

@gwarser

gwarser commented Feb 25, 2020

If you visit page A and its main_frame resolves to B then all requests to B should be seen as first-party?

This can be a feature request.


Currently the main document's CNAME is not even resolved unless you toggle cnameIgnoreRootDocument.

If you visit A and it resolves to CNAME_B without you even knowing about it, then what is the harm in treating all other subresources from CNAME_B as trusted?

@uBlock-user uBlock-user added something to address something to address and removed invalid not a uBlock issue labels Feb 25, 2020
@uBlock-user uBlock-user reopened this Feb 25, 2020
@gorhill
Member

gorhill commented Feb 25, 2020

When a site is fully CNAME-hosted (for example *.azurewebsites.net is backed by *.windows.net is backed by *.cloudapp.net), its origin is considered third-party.

Yes, they are 3rd-party. For instance they are even more 3rd-party than arstechnica.net is 3rd-party to arstechnica.com even though we do know that surely it's the same entity behind these two domains.

The bottom line is that advanced user mode is for advanced users ready to deal with breakage as a result of using broad blocking rules, and the uncloaking of canonical names is no different -- I would rather not start making exceptions to this.

Yes, cnameIgnoreRootDocument tells uBO to not uncloak the root frame document, because this would mess up uBO's internals when it comes to keeping track of what is the root context of what is loaded in a tab. The setting can be toggled by whoever is adventurous and wants to investigate what happens when doing so.

@uBlock-user uBlock-user added invalid not a uBlock issue and removed something to address something to address labels Feb 25, 2020
@fuzzykiller
Author

Just because dynamic filtering is for advanced users doesn’t mean it has to be a pain to use.

I see where you’re coming from. However, I disagree. If the alleged first party simply does not exist (except for being a domain name), I don’t see the point.

In fact, allowing the "real" domains behind the user-facing domains could be dangerous, as they may also host other pages. Resources from these pages would then be allowed.

@gorhill
Member

gorhill commented Feb 26, 2020

Sorry, I am not understanding what you are saying.

@fuzzykiller
Author

I assume you’re referring to the last paragraph.

When I want to allow scripts on the example site linked above, I have to allow third-party scripts from waws-prod-bay-007.cloudapp.net (that’s where the site is currently hosted). waws-prod-bay-007.cloudapp.net however may also host other sites, like shared hosting back in the day.

Let’s say bad-scripts.com is also hosted (via CNAME) on waws-prod-bay-007.cloudapp.net. Scripts from bad-scripts.com would then also be allowed whether I want them or not.

By using only the uncloaked domain for filtering purposes, we lose some accuracy.

@gorhill
Member

gorhill commented Feb 26, 2020

By using only the uncloaked domain for filtering purposes, we lose some accuracy

But you are not using only the uncloaked domains; these are added to the usual set of domains seen by uBO. And in any case, your worry is not specific to uncloaked CNAMEs: the same can be said of any 3rd party. I am not understanding your argument. There are no added worries here, just extra rules to create for 3rd parties which were otherwise hidden before.

@jebrosen

I was also surprised to find that I had to unblock the "true" name of domains that are CNAMEs, because its own scripts are now seen as 3rd-party.

If I am understanding right, the worry @fuzzykiller has is this:

  • block 3rd-party scripts
  • visit website1.com, which is a CNAME to realhost.com. All scripts from website1.com are blocked, because they now appear to be 3rd-party scripts. All scripts from website2.com are also blocked, as is desired.
  • unblock realhost.com, which is where the scripts now appear to be coming from
    • Oops - this unblocks all those resources that are served via website2.com, because they are also "really" from realhost.com
      • A quick test suggests to me that this is not actually what happens, because website2.com is checked before checking the "real" domain

However there is still an issue - it seems I can't allow resources from website1.com hosted at realhost.com without also allowing resources from realhost.com hosted at realhost.com.

@uBlock-user
Contributor

However there is still an issue - it seems I can't allow resources from website1.com hosted at realhost.com without also allowing resources from realhost.com hosted at realhost.com.

this unblocks all those resources that are served via website2.com, because they are also "really" from realhost.com

Both untrue. That's not how it works. Dynamic Filtering rules are made per site, unless you explicitly allow realhost everywhere. Read up on Dynamic Filtering; it seems you have misconceptions about it, and that is the cause of the confusion that leads you to see an issue.

@jebrosen

I explained that the second one was untrue, but I don't see how the first one is untrue.

For example, suppose I have this rule:

* * 3p-script block

This now blocks www.dfam.org from loading scripts from www.dfam.org because www.dfam.org is a CNAME, requiring this additional rule:

www.dfam.org systemsbiology.net * noop

But what if I had wanted to keep blocking scripts from systemsbiology.net?
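The interaction between those two rules can be sketched with a small rule matcher. This is a deliberately simplified model and not uBlock Origin's evaluation code: the `specificity` ordering, the last-two-labels `naive_base_domain` helper, and the canonical name `web.systemsbiology.net` are all assumptions made for illustration (the thread only names `systemsbiology.net`).

```python
from typing import List, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    source: str       # page hostname, or "*"
    destination: str  # request hostname, or "*"
    req_type: str     # "*", "script", "3p-script", ...
    action: str       # "block", "allow" or "noop"


def parse_rule(line: str) -> Rule:
    source, destination, req_type, action = line.split()
    return Rule(source, destination, req_type, action)


def host_matches(pattern: str, hostname: str) -> bool:
    """A pattern matches itself, any of its subdomains, or everything if '*'."""
    return pattern == "*" or hostname == pattern or hostname.endswith("." + pattern)


def naive_base_domain(hostname: str) -> str:
    """Assumption: last two labels stand in for the registrable domain."""
    return ".".join(hostname.split(".")[-2:])


def type_matches(rule_type: str, req_type: str, third_party: bool) -> bool:
    if rule_type == "*":
        return True
    if rule_type.startswith("3p"):
        return third_party and rule_type in ("3p", "3p-" + req_type)
    return rule_type == req_type


def specificity(rule: Rule) -> Tuple[bool, bool, bool]:
    """Assumed ordering: a named destination beats a named source beats a type."""
    return (rule.destination != "*", rule.source != "*", rule.req_type != "*")


def evaluate(rules: List[Rule], page: str, request_host: str, req_type: str) -> Optional[str]:
    """Return the action of the most specific matching rule, or None."""
    third_party = naive_base_domain(page) != naive_base_domain(request_host)
    matching = [
        r for r in rules
        if host_matches(r.source, page)
        and host_matches(r.destination, request_host)
        and type_matches(r.req_type, req_type, third_party)
    ]
    return max(matching, key=specificity).action if matching else None


block_only = [parse_rule("* * 3p-script block")]
with_noop = block_only + [parse_rule("www.dfam.org systemsbiology.net * noop")]

# The same script, judged by its alias and by its (hypothetical) canonical name.
print(evaluate(block_only, "www.dfam.org", "www.dfam.org", "script"))
print(evaluate(block_only, "www.dfam.org", "web.systemsbiology.net", "script"))
print(evaluate(with_noop, "www.dfam.org", "web.systemsbiology.net", "script"))
# A script requested directly from systemsbiology.net matches the same noop rule.
print(evaluate(with_noop, "www.dfam.org", "systemsbiology.net", "script"))
```

In this model the noop rule cannot tell a request that reached systemsbiology.net through the www.dfam.org alias from one addressed to systemsbiology.net directly, which is the granularity question raised above.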

@uBlock-user
Contributor

uBlock-user commented Feb 26, 2020

This now blocks www.dfam.org from loading scripts from www.dfam.org because www.dfam.org is a CNAME,

A CNAME will never appear as a CNAME when directly browsed to, simply not possible. Post a real life example if you have a case.

@gorhill
Member

gorhill commented Feb 26, 2020

this unblocks all those resources that are served via website2.com, because they are also "really" from realhost.com

No, this won't unblock website2.com, for this to happen one needs to first unblock website2.com. Blocked network requests are not resolved -- so even creating an allow rule for realhost.com won't cause blocked requests from website2.com to be suddenly allowed.

For instance, in hard mode:

website1.com => not blocked => realhost.com =>  blocked
website2.com => blocked

Now creating an allow rule for realhost.com:

website1.com => not blocked => realhost.com =>  not blocked
website2.com => blocked
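The two-stage check described above can be modeled in a few lines. This is only a sketch of the stated behavior (the alias is checked first, and a blocked request is never resolved), using a plain set of blocked hostnames instead of real rules; the `verdict` function and the record table are hypothetical.

```python
from typing import Dict, Optional, Set

# Both sites are aliases of the same canonical host, as in the example above.
CNAME_RECORDS: Dict[str, str] = {
    "website1.com": "realhost.com",
    "website2.com": "realhost.com",
}


def verdict(hostname: str, blocked: Set[str], cname_records: Dict[str, str]) -> str:
    """Check the requested name first; only unblocked requests get resolved."""
    if hostname in blocked:
        return "blocked"  # the canonical name is never looked up
    canonical: Optional[str] = cname_records.get(hostname)
    if canonical is not None and canonical in blocked:
        return "blocked via " + canonical
    return "not blocked"


sites = ("website1.com", "website2.com")

# Hard mode: everything except website1.com itself is blocked.
blocked_before = {"website2.com", "realhost.com"}
before = {h: verdict(h, blocked_before, CNAME_RECORDS) for h in sites}

# Creating an allow rule for realhost.com is modeled as removing it from the set.
blocked_after = blocked_before - {"realhost.com"}
after = {h: verdict(h, blocked_after, CNAME_RECORDS) for h in sites}

print(before)
print(after)
```

Allowing realhost.com changes the outcome for website1.com only; website2.com stays blocked because its own name is rejected before any resolution takes place.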

@jebrosen

jebrosen commented Feb 26, 2020

No, this won't unblock website2.com, for this to happen one needs to first unblock website2.com. Blocked network requests are not resolved -- so even creating an allow rule for realhost.com won't cause blocked requests from website2.com to be suddenly allowed.

I already agreed with that analysis, in the bullet point directly under the part you have both quoted. I don't know why we are arguing against a point I discounted at the time I made it.

This now blocks www.dfam.org from loading scripts from www.dfam.org because www.dfam.org is a CNAME,

A CNAME will never appear as a CNAME when directly browsed to, simply not possible. Post a real life example if you have a case.

That is my real case, which led me to this issue in the first place.

@gorhill
Member

gorhill commented Feb 26, 2020

I don't know why we are arguing against a point I discounted at the time I made it.

Sorry, I went to answer immediately without reading the rest first.

@gorhill
Member

gorhill commented Feb 26, 2020

However there is still an issue - it seems I can't allow resources from website1.com hosted at realhost.com without also allowing resources from realhost.com hosted at realhost.com.

I don't see why this is an issue -- when you create an allow rule for realhost.com, you are signaling that you are trusting realhost.com -- surely realhost.com does not become magically more trustable when accessed through an alias than when accessed directly -- it's the same entity in the end.

@uBlock-user
Contributor

That is my real case, which led me to this issue in the first place.

uBO doesn't detect any CNAMEs at dfam.org; otherwise those CNAME domains would appear in blue in the popup panel.

@jebrosen

Sorry, I think I've muddled things up a bit.

  • My only original issue is with www.dfam.org seeing scripts at www.dfam.org as 3rd-party scripts. -- not dfam.org, the reason one is and one isn't a CNAME is uninteresting as far as this discussion --. That is what led me here.
  • I was trying to repeat what I thought @fuzzykiller was getting at with a different example in an attempt to understand the problem better, and through that I have learned that my case already shouldn't be a problem.

I don't see why this is an issue -- when you create an allow rule for realhost.com, you are signaling that you are trusting realhost.com -- surely realhost.com does not become magically more trustable when accessed through an alias than when accessed directly -- it's the same entity in the end.

I think I agree with this, now that you put it that way.

@tartpvule

I don't see why this is an issue -- when you create an allow rule for realhost.com, you are signaling that you are trusting realhost.com -- surely realhost.com does not become magically more trustable when accessed through an alias than when accessed directly -- it's the same entity in the end.

I partially disagree with this premise. It is not about "trustability", it is about granularity of control (different levels of "trust").

Take, for example, a visit to www.instagram.com, which is aliased to foo-instagram.bar.facebook.com. By the same-origin policy, a request to www.instagram.com is a first-party request, whereas a request to foo-instagram.bar.facebook.com would be a third-party request, even if they would both end up in Facebook-controlled territory. If the browser treats them differently, should uBlock not distinguish between them and offer advanced control accordingly?

As for the "trustability", what about the case of subdomains? For example, alice.azurewebsites.net and bob.azurewebsites.net are both ultimately controlled by Microsoft, but surely trusting alice.azurewebsites.net should not mean trusting bob.azurewebsites.net. There is no way to prove that both subdomains are or are not under the control of the same entity.

Should CNAMEs not be treated with the same (or greater) granularity as subdomains? A server might look at the HTTP headers for Host: foo.com or Host: bar.com and serve different responses, after all.
In this example, there is also no way to prove that both domains are or are not under the control of the same entity. That server might be a VPS provider's gateway that routes to different customers.
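To make the Host header point concrete, here is a minimal sketch of name-based dispatch on a single server. The `SITES` table and the `respond` function are hypothetical and stand in for whatever virtual-host routing a real server or gateway performs.

```python
from typing import Dict

# Hypothetical content served from one address for two different names.
SITES: Dict[str, str] = {
    "foo.com": "content for foo.com",
    "bar.com": "content for bar.com",
}


def respond(headers: Dict[str, str], sites: Dict[str, str] = SITES) -> str:
    """Pick a response from the Host header, ignoring any port and letter case."""
    host = headers.get("Host", "").split(":")[0].lower()
    return sites.get(host, "unknown host")


print(respond({"Host": "foo.com"}))
print(respond({"Host": "BAR.com:443"}))
print(respond({}))
```

Both names may resolve to the same canonical host, yet the response depends on the name the client asked for, so the canonical name alone does not identify what is being served.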

For me, I want greater granularity of control. When I visit www.instagram.com, I want the ability to distinguish between "First party foo-instagram.bar.facebook.com", "First party whatever" and "Third-party foo-instagram.bar.facebook.com". But Facebook is known to shuffle things around, so it might be baz-instagram.bar.facebook.com at some point in the future. Writing a rule to "allow foo-instagram.bar.facebook.com scripts on www.instagram.com" is therefore not sustainable, and "allow bar.facebook.com scripts on www.instagram.com" might be too broad.

To quote:

My only original issue is with www.dfam.org seeing scripts at www.dfam.org as 3rd-party scripts. -- not dfam.org, the reason one is and one isn't a CNAME is uninteresting as far as this discussion --. That is what led me here.

Thank you

@gorhill
Member

gorhill commented Feb 27, 2020

If the browser treats them differently, should uBlock not distinguish between them and offer advanced control accordingly?

It does distinguish them, foo-instagram.bar.facebook.com is reported separately in the popup panel and logger.

@fuzzykiller
Author

Let’s not forget that the ongoing discussion is not what this issue was originally about. It only started later because I made an assumption that was not entirely true.

The original issue, as a reminder:

I believe that if whatever.com is CNAME-redirected to something-else.com and you browse to whatever.com, something-else.com should be considered first-party, because whatever.com doesn’t really exist.

@gorhill said that’s not going to happen and that’s that.

The intricacies of dynamic filtering with uncloaked hosts should probably be discussed elsewhere. Like on Reddit or whatever.

@tartpvule

If the browser treats them differently, should uBlock not distinguish between them and offer advanced control accordingly?

It does distinguish them, foo-instagram.bar.facebook.com is reported separately in the popup panel and logger.

I see no way to write a rule to "noop things from first-party www.instagram.com" and "block things from third-party foo-instagram.bar.facebook.com and facebook.com".

Writing a filter (||facebook.com^$third-party) or a rule (www.instagram.com facebook.com * block) to "block third-party facebook.com" also results in blocking "first-party www.instagram.com that is actually foo-instagram.bar.facebook.com".
And ||facebook.com^$third-party,domain=~instagram.com or www.instagram.com foo-instagram.bar.facebook.com * allow also allows, for example, <img src="https://foo-instagram.bar.facebook.com/blah.jpg">.
While www.instagram.com foo-instagram.bar.facebook.com * allow is just too broad.

Is there really a way to write such rule(s)?

@tartpvule

Let’s not forget that the ongoing discussion is not what this issue was originally about. It only started later because I made an assumption that was not entirely true.

The original issue, as a reminder:

I believe that if whatever.com is CNAME-redirected to something-else.com and you browse to whatever.com, something-else.com should be considered first-party, because whatever.com doesn’t really exist.

@gorhill said that’s not going to happen and that’s that.

The intricacies of dynamic filtering with uncloaked hosts should probably be discussed elsewhere. Like on Reddit or whatever.

To summarize:

I am trying to make the case that "first-party whatever.com that is actually something-else.com" should be manageable separately from "third-party something-else.com".

As far as I can see, gorhill made the statement in #909 (comment). I partially disagree with that, and one of my arguments is: since the browser treats them differently, should an advanced uBlock user not be able to make the same distinction in the rules as well?

Should a new issue be made or is there already a discussion elsewhere? I am not aware of one.

@uBlockOrigin uBlockOrigin locked and limited conversation to collaborators Feb 27, 2020
@uBlock-user
Contributor

uBlock-user commented Feb 27, 2020

This isn't the place for discussion; do it on /u/uBlockOrigin, along with any questions you have.

@gorhill
Member

gorhill commented Feb 27, 2020

While www.instagram.com foo-instagram.bar.facebook.com * allow is just too broad.

"Too broad"? Really? You have more control now and your assessment is that you have less?

Sorry my brain can't comprehend why you think there is an issue.

Best is that you scratch your own itch regarding what you want. On my side, I am satisfied that uBO is still on a good foundation after introducing de-aliasing, and I won't make it into something else entirely (which I fail to comprehend); you are best placed to do this.
