This repository has been archived by the owner on Jul 21, 2021. It is now read-only.

Unwanted domain(s) listed! #221

Closed
Alpengreis opened this issue May 21, 2015 · 23 comments

@Alpengreis
Contributor

Hello gorhill/all

Sometimes, when a URL (site) is NOT loaded from a blank tab, unwanted domains are listed in uMatrix (and in uBlock too, by the way).

Steps to reproduce:

  1. Load a site that generates a lot of traffic across many domains, such as a big Google Plus page.
  2. Load another site that has no relation to google.com WITHOUT creating a new tab first. Instead, type or paste the new URL directly into the omnibox.
  3. If you now look at uMatrix, google.com (in this example) may still be listed, even though the new site has nothing to do with this domain.
    This can happen at least while the site from step 1 is still in use and its traffic has not finished.

Refreshing the new site has no effect either (note: in uBlock a refresh DOES have an effect, and google.com is gone afterwards).

Of course, if I load the new URL in a new (blank) tab, this is never a problem.

Here are two examples:

This one was loaded after a Google Plus site and is NOT OK ...
internetbox_notok

This is from a blank tab and OK ...
internetbox_ok

The new URL is a local site, but this is only an example (I also saw the behaviour with two external sites).

Possibly this is not directly a uMatrix problem. Nevertheless, it would be good if uMatrix (and uBlock) could handle it ...

Is this "fixable" or not, or is it by design?

By the way: I have this problem in Google Chrome with the uMatrix/uBlock combo (both by gorhill), in uBlock AND in uMatrix (host lists removed in uBlock, and no dynamic filtering active in uBlock), AND also in Firefox with the NoScript/uBlock (chrisaljoudi) combo, in uBlock.

Anyway, many thanks in advance for any answers!

Kind regards
Alpengreis

PS: Thank you VERY much for SUCH a great tool and work!
PPS: Sorry for my English, it's not "my" language ...

@gorhill
Owner

gorhill commented May 21, 2015

Can you see in the logger the sequence of events? The logger will report in the exact order the network events were received. I would need an exact scenario to reproduce -- actual URLs with which you can reproduce the issue all the time.

I suspect this could be caused by the browser setting "Use a prediction service to help complete searches and URLs...".

@Alpengreis
Contributor Author

Here is an example that always works, at least if I don't wait a (very) long time after loading the G+ site:

A) Load URL
https://plus.google.com/107545467275966756564/posts?hl=de

B) Then URL
http://www.swissvpn.net/

The screenshot from uMatrix ...
umatrix_ex1
NOTE: Even a hard reload has no effect ...

The screenshot from uBlock ...
ublock_ex1a

Here a normal (not hard) reload does have an effect ...
ublock_ex1b

And here the log (truncated) ...

03:08:31 cookie http://www.swissvpn.net/{persistent-cookie:lang_cookie}
03:08:31 cookie http://google.com/{persistent-cookie:HSID}
... more such or related links snipped ...
03:08:31 cookie https://plus.google.com/{persistent-cookie:OTZ}
03:08:30 other https://talkgadget.google.com/u/0/_/diagnostics/?diagno
... more such or related links snipped ...
03:08:30 other https://csi.gstatic.com/csi?v=3&s=hangouts&action=&it=w
03:08:30 script http://www.swissvpn.net/{inline_script}
03:08:30 image http://www.swissvpn.net/images/gr_bg.gif
... more such or related links snipped ...
03:08:30 xhr https://play.google.com/log?format=json&u=0
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
03:08:30 css http://www.swissvpn.net/svpntxt.css
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
... more such or related links snipped ...
03:08:30 other https://csi.gstatic.com/csi?v=3&s=oz&action=profload_st
03:08:30 xhr https://plus.google.com/_/stream/markitemread/?hl=de&oz
03:08:30 xhr https://play.google.com/log?format=json
03:08:30 doc http://www.swissvpn.net/
http://www.swissvpn.net/
03:08:29 xhr https://8.client-channel.google.com/client-channel/chan
03:08:29 xhr https://play.google.com/log?format=json&u=0
03:08:29 xhr https://play.google.com/log?format=json
03:08:28 other https://ssl.gstatic.com/chat/sounds/incoming_video_shor
03:08:28 other https://ssl.gstatic.com/chat/sounds/incoming_video_long
03:08:28 other https://ssl.gstatic.com/chat/sounds/incoming_message_eb
03:08:27 xhr https://talkgadget.google.com/_/scs/talk-static/_/js/k=
03:08:26-- image https://ad.doubleclick.net/activity;src=2542116;type=so
03:08:26 cookie https://plus.google.com/{localStorage}
03:08:26 frame https://plus.google.com/_/blank
03:08:25 script https://ssl.gstatic.com/accounts/o/3655170095-postmessa
03:08:25 script https://oauth.googleusercontent.com/gadgets/js/core:rpc
03:08:25 script https://plus.google.com/u/0/_/notifications/frame?sourc
03:08:25 cookie https://plus.google.com/{localStorage}
03:08:25 css https://fonts.gstatic.com/s/roboto/v15/oMMgfZMQthOryQo9
03:08:25 image https://ssl.gstatic.com/s2/oz/images/notifications/spin
03:08:25 frame https://accounts.google.com/o/oauth2/postmessageRelay?p
03:08:25 script https://apis.google.com/_/scs/apps-static/_/js/k=oz.gap
03:08:25 css https://plus.google.com/_/scs/apps-static/_/ss/k=oz.sbw
03:08:25 xhr https://8.client-channel.google.com/client-channel/chan
03:08:25 script https://apis.google.com/_/scs/apps-static/_/js/k=oz.gap
03:08:25 frame https://plus.google.com/u/0/_/notifications/frame?sourci
03:08:24 script https://plus.google.com/hangouts/_/hscv?pvt=AMP3uWbBIKT
03:08:24 cookie https://plus.google.com/{localStorage}
03:08:24 xhr https://plus.google.com/_/scs/talk-static/_/js/k=wcs.ha
03:08:24 script https://apis.google.com/js/client.js
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 frame https://plus.google.com/hangouts/_/hscv?pvt=AMP3uWbBIKT
03:08:24 image https://lh3.googleusercontent.com/-Sbi02TE9dLg/U97nH3FD
03:08:24 image https://lh3.googleusercontent.com/proxy/RcgZvDoRzsSPg98
03:08:24 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 script https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:24 image https://ssl.gstatic.com/ui/v1/activityindicator/offline
03:08:24 image https://ssl.gstatic.com/chat/babble/sprites/common-301c
03:08:23 xhr https://8.client-channel.google.com/client-channel/chan
03:08:23 image https://ssl.gstatic.com/ui/v1/activityindicator/loading
03:08:23 xhr https://plus.google.com/_/stream/getactivities/?hl=de&o
... more such or related links snipped ...
03:08:23 xhr https://8.client-channel.google.com/client-channel/gsid
03:08:23 script https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:23 xhr https://8.client-channel.google.com/client-channel/gsid
03:08:23 frame https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:23 script https://8.client-channel.google.com/client-channel/clie%
03:08:23 frame https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:23 script https://apis.google.com/js/api.js
03:08:23 script https://8.client-channel.google.com/client-channel/js/1
03:08:23 script https://clients4.google.com/invalidation/lcs/client?xpc
03:08:23 frame https://8.client-channel.google.com/client-channel/clie%
03:08:22 xhr https://plus.google.com/_/profiles/getprofilepagephotos
03:08:22 script https://talkgadget.google.com/u/0/talkgadget/_/chat?cli0
03:08:22 xhr https://talkgadget.google.com/_/scs/talk-static/_/ss/k=
03:08:22 cookie http://talkgadget.google.com/{session-cookie:llbcs}
03:08:22 script https://talkgadget.google.com/_/scs/talk-static/_/js/k=
03:08:22 frame https://talkgadget.google.com/u/0/talkgadget/_/chat?cli0
03:08:22 script https://talkgadget.google.com/_/scs/talk-static/_/js/k=
03:08:22 xhr https://plus.google.com/_/socialgraph/lookup/people/?if
03:08:22 image https://ssl.gstatic.com/s2/oz/images/circles/cpw-7de38e
03:08:21 cookie https://clients5.google.com/{localStorage}
03:08:21 xhr https://plus.google.com/_/profiles/getfollowercount/101
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:21 xhr https://plus.google.com/_/people/notify?soc-app=1&cid=0
03:08:21 script https://talkgadget.google.com/u/0/talkgadget/_/host-js?
03:08:21 frame https://talkgadget.google.com/u/0/talkgadget/_/blank
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:21 image https://lh3.googleusercontent.com/-HDk4PX0tPv8/AAAAAAAA
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
... more such or related links snipped ...
03:08:21 xhr https://play.google.com/log?format=json
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
... more such or related links snipped ...
03:08:20 image https://ssl.gstatic.com/s2/oz/images/sprites/profiles_s
03:08:20 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
... more such or related links snipped ...
03:08:19 script https://apis.google.com/_/scs/apps-static/_/js/k=oz.gap
03:08:19 script https://plus.google.com/107545467275966756564/posts?hl=
03:08:19 cookie https://plus.google.com/{localStorage}
03:08:19 image https://ssl.gstatic.com/s2/oz/images/sprites/collection
... more such or related links snipped ...
03:08:19 css https://fonts.gstatic.com/s/roboto/v15/El-bgsteBznJNL5p
03:08:19 image https://ssl.gstatic.com/s2/oz/images/sprites/stream_sho
... more such or related links snipped ...
03:08:19 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:19 script https://www.gstatic.com/og/_/js/k=og.og.en_US.fC5iiuc_6
03:08:19 css https://fonts.gstatic.com/s/roboto/v15/tZdhd9Zzj0I2MwoD
03:08:19 css https://fonts.gstatic.com/s/roboto/v15/N5Lbe1fynPA1KT8B
03:08:19 image https://lh5.googleusercontent.com/-HDk4PX0tPv8/AAAAAAAA
03:08:19 image https://ssl.gstatic.com/gb/images/v1_376447c3.png
03:08:18 image https://images-pos-opensocial.googleusercontent.com/gad
... more such or related links snipped ...
03:08:18 image https://maps-api-ssl.google.com/maps/api/staticmap?size
... more such or related links snipped ...
03:08:18 image https://s2.googleusercontent.com/s2/favicons?alt=p&doma
03:08:18 image https://lh3.googleusercontent.com/proxy/6UPOhVeV7V5x3zA
... more such or related links snipped ...
03:08:18 image https://ssl.gstatic.com/s2/oz/images/logo/2x/googleplus
03:08:18 script https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:18 css https://plus.google.com/_/scs/apps-static/_/ss/k=oz.hom
03:08:18 cookie http://google.com/{persistent-cookie:HSID}
... more such or related links snipped ...
03:08:18 cookie https://plus.google.com/{persistent-cookie:OTZ}
03:08:17 doc https://plus.google.com/107545467275966756564/posts?hl=
https://plus.google.com/107545467275966756564/posts?hl=de
03:08:09 cookie http://google.ch/verify{persistent-cookie:SNID}
... more such or related links snipped ...
03:08:08 other https://www.google.ch/_/chrome/newtab/manifest?espv=2&i
... more such or related links snipped ...
03:08:08 image https://www.google.ch/images/srpr/logo9w.png
03:08:08 script https://www.google.ch/xjs/_/js/k=xjs.ntp.en_US.BFZSFxB-
... more such or related links snipped ...
03:08:08 cookie http://google.ch/{persistent-cookie:PREF}
03:08:08 cookie http://google.ch/{persistent-cookie:PREF}
03:08:08 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
... more such or related links snipped ...
03:08:08 doc https://www.google.ch/_/chrome/newtab?espv=2&ie=UTF-8
https://www.google.ch/_/chrome/newtab?espv=2&ie=UTF-8

NOTE: I loaded BOTH through bookmarks, NOT via the omnibox ...

About your suspicion regarding "Use a prediction service to help complete searches and URLs...": after a test, it SEEMS this is not the cause, but I will run further tests (emptying the cache first, restarting the browser, and a few other things). I'll report here IF it turns out to be responsible, of course!

Thank you for your help!

Alpengreis

@Alpengreis
Contributor Author

I now have the same uBlock & uMatrix combo in Fx (plus NoScript for some things), and I was able to reproduce this behaviour there too ...

@Alpengreis changed the title from "Sometimes unwanted domain(s) listed ..." to "Unwanted domain(s) listed!" on Aug 4, 2015
@Alpengreis
Contributor Author

Gorhill, could you please check this behaviour? It's such an annoying thing and can even be dangerous! It happens in Chrome and Firefox. And it even happens in a completely new (clean) installation on Win 10 ...

Thank you!

@adrienbeau

I have seen this too, especially using the Google search engine (but maybe this is because I use it a lot).

  • Browser: Firefox 39.0 (Mozilla Firefox for Ubuntu - Canonical - 1.0)
  • uMatrix: v0.9.2.1

Here's how I can reproduce it fairly well:

  • Go to http://www.google.fr/
  • Search for "adrien beau free fr"
  • One of the search results is for http://adrien.beau.free.fr/ (a very simple pure HTML page, no Javascript, no cookies)
  • Wait a bit in case Google has requests running in the background
  • Middle-click on the search result to open it in a new tab
  • In the uMatrix log, you can see two requests for my web site (the page, and one image), 10 google.fr cookies, and one google.fr script (all of them blocked by uMatrix)
  • In the page matrix, google.fr and www.google.fr are listed even though nothing in the page refers to them

Here is a screenshot of the page matrix:
umatrix-adrien-beau-free-fr

And here's a big screenshot of the uMatrix log resulting from this test (most relevant lines at the top):
screen shot 2015-08-04 at 13 30 36

@gorhill
Copy link
Owner

gorhill commented Aug 4, 2015

@Alpengreis

Your original bug seems to be a side effect of how the browser API works: network requests are associated with a tab, not with a web page. Because of this, it is possible that network requests from a previous page are seen by a new page -- and there is no way for an extension to decide from which web page a specific network request originates; it can only tell from which tab. The fact that refreshing the page does not change the matrix content is by design in uMatrix: uMatrix will cache and reuse the data until the page has not been visited for a few minutes. The reason for this is that if a web page makes infrequent requests to some specific 3rd-party hostnames, you still want to keep that information around a bit so that the user is properly informed about this.
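To illustrate the tab-versus-page point, here is a minimal sketch in Python. It is not uMatrix code: the `TabStore` class and its method names are hypothetical, and it simply assumes the only key available for a request is a tab id. The URLs are taken from the example earlier in this thread.

```python
from urllib.parse import urlparse


class TabStore:
    """Hypothetical per-tab bookkeeping: one set of hostnames per tab."""

    def __init__(self):
        self.current_page = {}  # tab_id -> URL of the most recent top-level document
        self.seen_hosts = {}    # tab_id -> hostnames credited to that document

    def on_navigate(self, tab_id, url):
        # A new top-level document starts loading in this tab.
        # Assumption for this sketch: the hostname list starts fresh here.
        self.current_page[tab_id] = url
        self.seen_hosts[tab_id] = {urlparse(url).hostname}

    def on_request(self, tab_id, url):
        # Only the tab id is known, so the request is credited to whatever
        # document the tab is showing right now.
        self.seen_hosts.setdefault(tab_id, set()).add(urlparse(url).hostname)


store = TabStore()
store.on_navigate(1, "https://plus.google.com/107545467275966756564/posts?hl=de")
store.on_navigate(1, "http://www.swissvpn.net/")
# A request issued by the previous page arrives after the new document started.
store.on_request(1, "https://play.google.com/log?format=json")
print(sorted(store.seen_hosts[1]))  # ['play.google.com', 'www.swissvpn.net']
```

With nothing but the tab id to go on, the late request from the old page ends up in the list for the new page.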

@adrienbeau

Keep in mind this: URL redirections.

The way redirections are detected is different in uMatrix. I may look into this to see if I can improve it to get the same results as HTTP Switchboard. In your case, clicking on a link in a Google search result always results in a redirection (because Google wants to know which link you clicked).

@adrienbeau

Thanks for the link, I didn't know about it.

I know about the Google search redirection, and it can actually be seen in the screenshot. The second gray bar from the top is when I middle-clicked to open in a new tab (at 13:25:26, more than one minute after displaying the search results). We can see Google set some cookies at that point, and then redirected to my site. uMatrix is apparently able to decide it is a new site, since it displays a grey bar for it. Maybe some Google requests were still lingering at that point, I'm not up-to-date with what concurrent events can happen with Javascript these days.

I understand it is not easy to decide when the requests are "coming from a new site", or "still issued by the current site"; maybe a good FAQ is the best solution to this issue.

@Alpengreis
Contributor Author

I understand the tech explanation, thank you, Gorhill!

The problem is that if I load a new page in the same tab and save a new rule for this page, I may have included an unwanted domain. AFAIK, I never had this behaviour in NoScript (it's not quite the same thing, I know, but nevertheless ...).

For example, I load a Google Plus page and then my local router page, and I see the google domain there. But this site has no link to Google. It's as you said: it's only there because G+ was in the same tab before. If I don't check this explicitly (by loading in a new tab) and I save this (allow google), I have a "false" record.

In daily work, this means for me: I ALWAYS have to close the current tab first, or open a new tab, before I load a new page, to ensure that only "valid" domains are listed for the new page.

As a workaround I used Tab Mix Plus, which normally opens a new tab automatically. But I would like to reduce the number of plugins now. BUT: the situation is not very user-friendly at this point.

I hope you understand me (well enough), my English is not very good :-)

Anyway, I hope you can change this behaviour somehow; otherwise I must live with it ...

Info: I do NOT mean links that are REALLY included, such as google or whatever (they are in many, many pages, I know that).

Also interesting: why can NoScript handle this properly? Is this also by design?

Kind regards!

@gorhill
Owner

gorhill commented Aug 4, 2015

why can NoScript handle this properly?

I can't answer as I know nothing about NoScript code. One thing is for sure though: NoScript does not report xmlhttprequest, which are the network requests most likely to be affected by the current issue.

@Alpengreis
Contributor Author

Okay, then I'll leave my workaround active, it's not such a big thing.

Thank you, Gorhill!

PS: For other users with Firefox: I use the add-on "Tab Mix Plus", configured to open the relevant things in NEW TABS, so it's a practical workaround for this behaviour ...

@Alpengreis
Contributor Author

This seems to be fixed now in uBlock but not yet in uMatrix! Could you fix it in uMatrix too, please? This would be really important, to avoid needing an extra add-on!

@Alpengreis reopened this on Nov 18, 2015
@gorhill
Owner

gorhill commented Nov 18, 2015

What version of uMatrix? Give me steps to reproduce please, that will save me time.

OK reproduced with latest build dev.

@gorhill
Owner

gorhill commented Nov 18, 2015

From what I can see, there is a beforeunload event listener on the Google+ page, which executes after the new document started loading. As said, uMatrix is being told from which tab a network request occurs, not from which document URL.

To give some perspective, even the Network pane in the dev console will report network requests to google.com when loading the second site -- so the issue is not specific to uMatrix.
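One rough way to check this against a captured log is sketched below in Python. The log lines are selected from the uMatrix log posted earlier in this thread (newest entry first); the helper names `base_domain` and `stray_requests` are hypothetical, and the two-label domain comparison is a naive assumption rather than how any extension does it. The sketch only narrows down candidates: it flags every request to another domain logged after the latest `doc` entry, and cannot tell a leftover request from a genuine third-party one.

```python
from urllib.parse import urlparse

# Selected lines from the log quoted in this thread, newest entry first.
LOG = """\
03:08:30 image http://www.swissvpn.net/images/gr_bg.gif
03:08:30 xhr https://play.google.com/log?format=json&u=0
03:08:30 css http://www.swissvpn.net/svpntxt.css
03:08:30 xhr https://play.google.com/log?format=json
03:08:30 doc http://www.swissvpn.net/
03:08:29 xhr https://play.google.com/log?format=json&u=0
"""


def base_domain(url):
    # Naive assumption: the last two labels form the registrable domain.
    # A real implementation would consult the Public Suffix List.
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def stray_requests(log_text):
    """Return URLs logged after the latest 'doc' entry that go to another domain."""
    entries = [line.split(None, 2) for line in log_text.splitlines() if line.strip()]
    entries.reverse()  # oldest first
    # Raises ValueError if the log contains no 'doc' entry at all.
    doc_index = max(i for i, (_, kind, _) in enumerate(entries) if kind == "doc")
    page = base_domain(entries[doc_index][2])
    return [url for _, kind, url in entries[doc_index + 1:] if base_domain(url) != page]


for url in stray_requests(LOG):
    print(url)
```

On this excerpt the two `play.google.com` requests logged after the `doc` entry for swissvpn.net are reported, while the one logged before it is not.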

@Alpengreis
Contributor Author

Thanks for the answer, gorhill! I also use the latest dev build (with the latest Fx release, 42.0).

I asked again because it is NO LONGER the case with uBlock (it was before).

Would it be possible to have the same behaviour in uMatrix as in uBlock? The problem is, it's really difficult to handle these (unnecessary? or at least undesired) domains in the matrix, even if they are not in uBlock ...

Or in other words: how can I decide whether such domains come from the web page itself or not, without other tools? Put differently: when such domains are listed, I don't want to associate them with the page itself if they don't come from the page source.

The only way to avoid this behaviour with uMatrix is: I ALWAYS have to look in uBlock, OR I have to open EACH link in a new tab. In Fx, this is relatively easy with Tab Mix Plus (TMP), but for Chrome I don't know of an extension for this (or they do not work correctly), so it's necessary to ALWAYS do it "manually".

So uBlock handles it "okay", NoScript handles it okay, uMatrix does not.

Or is there any reason to leave it this way in uMatrix?

Many greetings!!

@gorhill
Owner

gorhill commented Nov 18, 2015

I don't know why it does not happen with uBO, it should, the logger reports google.com after the second document has started loading. I need to investigate why. NoScript does not report google.com because what is pulled are images, not scripts.

@Alpengreis
Contributor Author

Okay, thanks. It WAS also the case with uBO when I made my first post here ...

Anyway, have a nice rest of the week, gorhill!

@Alpengreis
Contributor Author

@gorhill
Yes, you are right. Indeed, uBlock should display the "unwanted domain" too! This is the result after first loading Google Maps (https://www.google.ch/maps/) and then http://www.swissvpn.net/ in the SAME tab.

ubo_1

ubo_2

It is really annoying not to have at least the same result in uMatrix and uBO.

@Alpengreis
Contributor Author

I have news about this ...

It seems it's NOT the beforeunload listener!

I have disabled this in Firefox (latest Release) in the config (dom.disable_beforeunload = true).

XHR seems to be involved. On the Google Maps page I allowed EVERYTHING except XHR. After loading swissvpn.net: NO google entry. After also switching XHR to allow: BOOM, google is present after loading swissvpn.net in the same tab.

PS: Could this have something to do with the AV scanner and/or BehindTheScene?
PPS: Another idea is onreadystatechange ...
https://developer.mozilla.org/en-US/docs/Web/Events/readystatechange
which I found in the Google page source code.

@Alpengreis
Contributor Author

Interesting. I had to install Chrome again. I installed v47.0.2526.80 m (64-bit). There, this problem does NOT exist (tried with the same links as above). It does not even show up in the log, as you can see here ...

uMatrix:
umatrix-log

uBO:
ublock-log

@Alpengreis
Contributor Author

@gorhill

The problem also exists in Chrome ("again")! So it's NOT a browser bug (at least not one in Fx only) ...

I could reproduce it with the following steps ...

  1. Load URL http://www.20min.ch/sport/fussball/story/Sion-belohnt-seinen-Sturmlauf-15059427
  2. Refresh without cache (Ctrl+F5)
  3. Load URL https://www.mywot.com/en/scorecard/mywot.net in the same tab
  4. Use the GO BACK function to return to the site from step 1
  5. Load URL https://www.mywot.com/en/scorecard/mywot.net in the same tab
  6. Repeat steps 4 and 5 (once SHOULD be enough)

After this, 20min appears in the uMatrix matrix for WOT ...

Here too: NOT in uBlock, only in uMatrix!

@gorhill
Owner

gorhill commented Dec 13, 2015

I explained why this happened:

  • The browser tells uBO/uMatrix from which tab a network request originates, not from which web page.
  • uMatrix works differently than uBO -- it remembers the seen hostnames across page reloads (up to a few minutes in the future), for the benefit of the user (if a 3rd-party hostname occurs only once on a specific page load, we want the user to be informed about this, and not have the information flushed away on a mere page reload).
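The effect of the second point on a page reload can be sketched as follows. This is an illustrative Python model only: `PageStore`, its methods, and the 180-second `RETENTION_SECONDS` value are assumptions standing in for "a few minutes", not values or names from uMatrix or uBO. The `keep_across_reloads=False` store merely mimics the reload behaviour reported for uBlock in this thread.

```python
RETENTION_SECONDS = 180  # assumption: stands in for "a few minutes"


class PageStore:
    """Hypothetical list of hostnames seen for one page."""

    def __init__(self, keep_across_reloads):
        self.keep_across_reloads = keep_across_reloads
        self.hostnames = set()
        self.last_visit = None  # time of the previous load of this page

    def on_page_load(self, now):
        expired = (
            self.last_visit is not None
            and now - self.last_visit > RETENTION_SECONDS
        )
        # A per-load store forgets everything on each load; a retaining store
        # forgets only once the page has not been visited for a while.
        if not self.keep_across_reloads or expired:
            self.hostnames.clear()
        self.last_visit = now

    def on_request(self, hostname):
        self.hostnames.add(hostname)


retaining = PageStore(keep_across_reloads=True)
per_load = PageStore(keep_across_reloads=False)

for store in (retaining, per_load):
    store.on_page_load(now=0)
    store.on_request("www.swissvpn.net")
    store.on_request("google.com")      # leftover request from the previous page
    store.on_page_load(now=10)          # the user reloads the page
    store.on_request("www.swissvpn.net")

print(sorted(retaining.hostnames))  # ['google.com', 'www.swissvpn.net']
print(sorted(per_load.hostnames))   # ['www.swissvpn.net']
```

In this model a hostname that was wrongly credited to the page survives a reload in the retaining store, which matches the observation that a refresh clears google.com in uBlock but not in uMatrix.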

@Alpengreis
Contributor Author

Okay, NOW I understand! Sorry for taking so long to grasp this, and for the trouble! Thank you!

@Alpengreis
Contributor Author

Can you please also answer the following:

Today, I found out with the Fx example (google maps -> swissvpn.net) that it's NOT a problem if I TYPE the address into the omnibar.

This means: if I open the new link in the same tab with a mouse click or Enter from the bookmarks, it's a problem; if I type the new link and press Enter (which also opens it in the same tab), it's not a problem.

Can you explain this?
I was wrong: this behaviour also occurs with the omnibar!


3 participants