HTML filters occasionally lost effectiveness #42

Closed
Crystal-RainSlide opened this Issue May 15, 2018 · 27 comments

4 participants
@Crystal-RainSlide

commented May 15, 2018

Prerequisites

  • I verified that this is not a filter issue
  • This is not a support issue or a question
  • I performed a cursory search of the issue tracker to avoid opening a duplicate issue
  • I tried to reproduce the issue when...
    • uBlock Origin is the only extension
    • uBlock Origin with default lists/settings
    • using a new, unmodified browser profile
  • I am running the latest version of uBlock Origin
  • I checked the documentation to understand that the issue I report is not a normal behavior

Description

An ad script in https://www.baidu.com/s?wd=%E7%9A%AE%E9%9E%8B

<script id="ecomScript">
...
</script>

I tried to remove it with www.baidu.com##^script#ecomScript, but found that only script:contains() is supported.

I then spent some time reading the code and found 'setAdsHeight' in the JS outline of Firefox's dev tools. With that, everything works fine, but it cost me a lot of time.

CSS selectors certainly don't support this, but ad blockers could support it to simplify rules and filtering.

A specific URL where the issue occurs

All websites that use <script foo="bar">

Steps to Reproduce

Described above.

Expected behavior:

example.com##script#id

Actual behavior:

Nothing happens and the desired filter doesn't work.

Your environment

  • uBlock Origin version: Newest
  • Browser Name and version: Firefox Developer Edition 61.0
  • Operating System and version: Windows 7 (customized)
@Crystal-RainSlide

Author

commented May 15, 2018

One other thing...

I was misled by the outdated fork source shown on fang5566/uBlock, which is the official Chinese (中文) introduction and wiki, and mistakenly posted uBlock-LLC/uBlock#1767.

Then I got a ban at https://github.com/gorhill/uBlock/issues/new for that misposted issue.

WHY???

@gorhill

Member

commented May 15, 2018

There is no ban on https://github.com/gorhill/uBlock/issues/new, it's just reserved to contributors. Here is the proper issue tracker.

@gorhill

Member

commented May 15, 2018

I answered you on uBlockAdmin (which is a scammy fork, by the way):

When I view-source your URL https://www.baidu.com/s?wd=%E7%9A%AE%E9%9E%8B, I get:

<!DOCTYPE html>
<!--STATUS OK-->

No script in there.

@gorhill

Member

commented May 15, 2018

Duh, sorry, I didn't realize I could scroll. I will investigate.

@gorhill

Member

commented May 15, 2018

Ok, again, when I view-source your URL https://www.baidu.com/s?wd=%E7%9A%AE%E9%9E%8B, I can't find any instance of ecomScript in the source.

@gwarser

Member

commented May 15, 2018

ecomScript is already gone. I tested on script id="head_script" and cannot reproduce.

@gorhill

Member

commented May 15, 2018

@Crystal-RainSlide HTML filtering will only match what is seen in view-source:, not what is dynamically added afterward.

@gorhill

Member

commented May 15, 2018

@gwarser Yes, confirmed: it works fine with www.baidu.com##^script#head_script. The script element is present in view-source:, and I confirmed it was properly removed by the above filter. So the syntax works; it's just a matter of filtering what is present in the source data as shown by view-source:.
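To illustrate why only content visible in view-source: can be matched, here is a minimal sketch of source-level filtering. It is not uBlock Origin's code: `removeScriptById` is a hypothetical helper, the regular expression stands in for a real HTML parser, and `/loader.js` is a made-up script that is assumed to inject further elements at runtime.

```typescript
// Simplified stand-in for an HTML filter such as `##^script#head_script`:
// it removes <script id="..."> elements from the raw response text.
// A real implementation would use an HTML parser, not a regular expression.
function removeScriptById(source: string, id: string): string {
  // Escape regex metacharacters so the id is matched literally.
  const escaped = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<script\\b[^>]*\\bid="${escaped}"[^>]*>[\\s\\S]*?</script>`,
    "gi"
  );
  return source.replace(pattern, "");
}

// What the server sends, i.e. what view-source: would show.
const pageSource =
  '<html><head><script id="head_script">a()</script></head>' +
  '<body><script src="/loader.js"></script></body></html>';

// head_script is part of the source text, so it can be removed.
console.log(removeScriptById(pageSource, "head_script").includes("head_script")); // false

// An element that /loader.js would add later is not in the source text,
// so a source-level filter has nothing to match.
console.log(removeScriptById(pageSource, "ecomScript") === pageSource); // true
```

The filter works on the response text before the page runs, so anything a script adds afterward is out of its reach.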

@gwarser

Member

commented May 15, 2018

When I tested this earlier ecomScript was present, but disappeared after refresh or two. This, and my assumption that the content of view-source should be filtered, misled me into thinking that something was not working. But now I'm pretty sure everything is working as it should.

@Crystal-RainSlide why do you think that only script:contains() is supported? Please provide detailed steps to reproduce.

@uBlock-user

Member

commented May 15, 2018

When I tested this earlier ecomScript was present, but disappeared after refresh or two.

Still there on my end - https://i.imgur.com/xZr5QIe.jpg Can't reproduce though, it blocks just as expected.

@uBlock-user

Member

commented May 15, 2018

script:contains() is only supported on the Firefox-legacy branch; you must have installed that build instead of the WebExtension build.

@gorhill

Member

commented May 15, 2018

##script:contains() is deprecated syntax, it is internally converted to ##^script:has-text() -- true for either Firefox legacy or webext.
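The conversion described above can be sketched as a plain string rewrite. This is only an illustration of the mapping from `##script:contains(...)` to `##^script:has-text(...)`; `upgradeScriptContains` is a hypothetical name, and uBlock Origin's actual parser is not shown here.

```typescript
// Hypothetical helper, not uBlock Origin's actual code.
// Rewrites the deprecated `##script:contains(...)` form into the
// HTML-filter form `##^script:has-text(...)`.
function upgradeScriptContains(filter: string): string {
  const marker = "##script:contains(";
  const at = filter.indexOf(marker);
  if (at === -1) {
    return filter; // not the deprecated form, leave it untouched
  }
  const hostnames = filter.slice(0, at);
  const rest = filter.slice(at + marker.length); // argument plus closing ")"
  return `${hostnames}##^script:has-text(${rest}`;
}

console.log(upgradeScriptContains("www.baidu.com##script:contains(setAdsHeight)"));
// www.baidu.com##^script:has-text(setAdsHeight)
```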

@Crystal-RainSlide

Author

commented May 15, 2018

Sorry for being offline for half a day.

About script#ecomScript disappearing

Whether business ads such as ecomScript appear depends on Baidu's advertising strategy. 皮鞋 means leather shoes, a kind of commodity, so this search term caused ecomScript to appear in my tests. But Baidu's ad market is mainly in China, and so are its advertisers, so ecomScript may not appear in your tests, since overseas users are less likely to click the ads.

About reproduction

I also failed to reproduce it after Firefox updated automatically. But I got some other results:

content_right

↑ I've used www.baidu.com##^#content_right for a long time, but it stopped working while I was trying to reproduce the issue.

So it seems to be a stability problem rather than unsupported syntax... I'll go on with more tests.

@Crystal-RainSlide changed the title from "[Syntax Support] example.com##script#id" to "HTML filters (example.com##^.badstuff) occasionally lost effectiveness (reproducing...)" May 15, 2018

@Crystal-RainSlide

Author

commented May 16, 2018

Done, reproduced on my Firefox Developer Edition 61.0.

Steps

(0. Open the logger to verify whether HTML filters are applied.)

  1. Pick an element that will definitely appear (but not body) and block it with ^. A visible element makes the result obvious. I used www.baidu.com##^#head this time.
  2. Open https://www.baidu.com/s?wd=%E7%9A%AE%E9%9E%8B in a tab. The element should be removed correctly.
  3. Right-click that tab and duplicate it.
  4. Check the duplicated tab. The element appears again.

Then, Chrome... Note that I'm using Cent Browser, not official Chrome.

@gwarser

Member

commented May 16, 2018

I can reproduce about once in 10 attempts. This happens when the page is loaded from cache (I suppose so, because there is no response header):

screenshot_20180516_103847

@gwarser reopened this May 16, 2018

@uBlock-user

Member

commented May 16, 2018

www.baidu.com##^#head doesn't work on my end, I can reproduce every time and #head appears in the view-source of both tabs -- original and duplicate.

@uBlock-user removed the invalid label May 16, 2018

@gwarser

Member

commented May 16, 2018

Weird thing - I can easily reproduce on https://github.com/ with github.com##^#user\[login\], but it's very hard with github.com##^form

//edit:

May be because of this:

Race Cache with Network: When we detect that disk IO may be slow, we send a network request in parallel, and we use the first response that comes back. For users with slow spinning disks and a low latency network, the result would be faster loads. (Firefox 59)

@gwarser

Member

commented May 16, 2018

@uBlock-user changed the title from "HTML filters (example.com##^.badstuff) occasionally lost effectiveness (reproducing...)" to "HTML filters (example.com##^.badstuff) occasionally lost effectiveness" May 16, 2018

@uBlock-user changed the title from "HTML filters (example.com##^.badstuff) occasionally lost effectiveness" to "HTML filters occasionally lost effectiveness" May 16, 2018

@gwarser

Member

commented May 16, 2018

I tried setting the request header Cache-Control: no-cache, must-revalidate, and this does not help.
Probably related to something in this list: https://bugzilla.mozilla.org/buglist.cgi?quicksearch=cache%20webrequest&list_id=14151937

@gorhill

Member

commented May 16, 2018

I can reproduce every time and #head appears in the view-source of both tabs -- original and duplicate.

view-source: is to be used to find out what can be filtered, not what has been filtered. Firefox bypasses uBO when requesting a page via view-source:.

@uBlock-user

Member

commented May 16, 2018

You're right, I just tested ##.cr-content with baidu.com##^.cr-content and it works as expected.

@gorhill

Member

commented May 16, 2018

Then, Chrome... Note that I'm using Cent Browser, not official Chrome.

HTML filtering is not supported on Chromium-based browsers; they are missing the proper API.

@gorhill

Member

commented May 16, 2018

I can reproduce sporadically when using the "duplicate tab" trick. When the issue occurs, the HTML filter is not reported in the logger. I will keep investigating.

@gwarser

Member

commented May 16, 2018

Actually Cache-Control: no-cache, must-revalidate (set by https://addons.mozilla.org/firefox/addon/header-editor/) may help, but only after browser restart.
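For reference, a rule like the one set through Header Editor amounts to overriding one request header. The sketch below assumes the header list has the `{ name, value }` shape used by the WebExtensions webRequest API; `withNoCache` is a hypothetical helper, and this is not how Header Editor itself is implemented.

```typescript
// Shape of a header entry, as used by the webRequest API.
interface HttpHeader {
  name: string;
  value?: string;
}

// Hypothetical helper: returns a copy of the request headers in which
// Cache-Control is forced to "no-cache, must-revalidate".
function withNoCache(headers: HttpHeader[]): HttpHeader[] {
  // Header names are case-insensitive, so drop any existing variant first.
  const kept = headers.filter(h => h.name.toLowerCase() !== "cache-control");
  kept.push({ name: "Cache-Control", value: "no-cache, must-revalidate" });
  return kept;
}

// In an extension this could be wired up roughly as follows (not run here):
//
// browser.webRequest.onBeforeSendHeaders.addListener(
//   details => ({ requestHeaders: withNoCache(details.requestHeaders ?? []) }),
//   { urls: ["<all_urls>"], types: ["main_frame"] },
//   ["blocking", "requestHeaders"]
// );

console.log(withNoCache([{ name: "cache-control", value: "max-age=0" }]));
```

As noted above, this only seemed to help after a browser restart, so it is a diagnostic aid rather than a fix.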

@gorhill

Member

commented May 16, 2018

Yes, I was looking at this: https://bugzilla.mozilla.org/show_bug.cgi?id=1376932.

I confirm that when the issue occurs, uBO's onHeadersReceived listener is not called at all: Firefox bypasses uBO, hence uBO can't do its job.
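A toy model may help show why a skipped listener means an unfiltered page. It assumes that the response-body filter can only be attached from inside the headers listener; `loadPage`, `stripHead`, and the sample markup are made up for illustration and do not reflect the real webRequest plumbing.

```typescript
type BodyFilter = (html: string) => string;

// Toy model of a page load. `headersListener` stands in for an
// onHeadersReceived handler: in this model it is the only place where a
// body filter can be attached to the response.
function loadPage(
  body: string,
  listenerInvoked: boolean,
  headersListener: () => BodyFilter
): string {
  if (!listenerInvoked) {
    return body; // listener skipped: nothing is attached, body passes through
  }
  const filter = headersListener();
  return filter(body);
}

// Stand-in for an HTML filter such as `www.baidu.com##^#head`.
const stripHead: BodyFilter = html =>
  html.replace('<div id="head">logo</div>', "");

const samplePage = '<body><div id="head">logo</div><p>results</p></body>';

// Normal load: the listener runs and the element is removed.
console.log(loadPage(samplePage, true, () => stripHead));
// <body><p>results</p></body>

// Load where the browser never calls the listener: the page is untouched.
console.log(loadPage(samplePage, false, () => stripHead) === samplePage); // true
```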

@uBlock-user

Member

commented May 16, 2018

So this is a browser bug?

@gorhill

Member

commented May 16, 2018

Yes, but I need to provide a workaround given the seriousness of it: one main use of HTML filtering is to remove specific unwanted inline script tags, and the issue here means unwanted inline script code could end up being executed. Besides, this should also solve the issue described in https://bugzilla.mozilla.org/show_bug.cgi?id=1376932 -- NoScript is mentioned there, but uBO also suffers from it (as does uMatrix: gorhill/uMatrix#893).
