Blocking zero-width & space characters such as U+200B, U+200C, U+200D #826

Closed
gavin20 opened this Issue · 10 comments

3 participants

@gavin20

Is it possible to block zero-width and space characters?

Reference url: http://kaspars.net/blog/web-development/invisible-click-tracking

@gorhill

Frankly, I fail to see the real-life, concrete issue here. Somebody needs to enlighten me about what I am missing.

@lewisje

It's a combinatorial nightmare, but it can be done; for example, to hide links with U+200B:

a[href*="​"]

a[href*="​"]

To hide images, scripts, and stylesheets with U+200C in their URLs (multiply the filters if you want to specifically target img and link and picture and source and video etc. elements):

[src*="‌"]

To block content from URLs containing U+200D:

‍
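Part of the combinatorial problem is that each of these characters can show up in a URL either literally or percent-encoded (U+200B as %E2%80%8B, for instance). As a rough illustration only, and not anything uBlock itself does, here is a minimal Python sketch that decodes a URL first and then checks for the three characters named in this issue. The function name find_zero_width and the example.com URLs are made up for the example.

```python
from urllib.parse import unquote

# Zero-width characters named in this issue.
ZERO_WIDTH = ("\u200b", "\u200c", "\u200d")


def find_zero_width(url: str) -> list[str]:
    """Return the code points (such as 'U+200B') of zero-width characters in url."""
    # unquote() decodes percent-escapes as UTF-8, so %E2%80%8B and the
    # literal character are both caught by the same membership test.
    decoded = unquote(url)
    return [f"U+{ord(ch):04X}" for ch in ZERO_WIDTH if ch in decoded]


# Hypothetical URLs: one percent-encoded, one with the literal character.
print(find_zero_width("http://example.com/-tracking%E2%80%8B"))
print(find_zero_width("http://example.com/a\u200db"))
```

Decoding before matching collapses the encoded and literal variants into one check, which a plain substring filter cannot do.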

@gorhill

@lewisje Yes, they can be hidden, but I want to understand why someone should worry about this.

@lewisje

My guess is paranoia surrounding this needlessly complicated form of tracking, which AFAIK cannot even be mass-implemented as easily as the Evercookie.

@gorhill

The author of the blog post just points out that you can use non-ASCII space characters in a URL, but gives no hint as to how this could be used in a way detrimental to users. I fail to see how this is even an issue.

@gavin20

Agreed - I simply wanted to know whether it's possible to block these as a precaution. So far I've been unsuccessful.

@gorhill

@gavin20 Try -tracking%E2%80%8B|$inline-script. To block inline script tags, a filter must explicitly declare inline-script (like popup), or there would be tons of false positives breaking inline scripts everywhere.

@gavin20

Confirmed that worked, but I was trying to block "%E2%80%8B", or "%" for example, which doesn't seem to work.

@gorhill

uBlock extracts tokens from a URL, and those tokens are used to look up filters. There is no other way to make a blocker minimally efficient, and by the look of it, the filters in EasyList, EasyPrivacy etc. are crafted with this in mind. A token is a sequence of any of [0-9a-z%]. So you can't use tracking to block trackingABC.
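To make the token rule concrete, here is a minimal Python sketch of that kind of tokenization. It is a simplified model, not uBlock's actual code: the function name tokens is made up, and lowercasing the URL before matching is an assumption (it is what makes trackingABC a single token here).

```python
import re

# A token is a maximal run of characters from [0-9a-z%], as described above.
TOKEN_RE = re.compile(r"[0-9a-z%]+")


def tokens(url: str) -> list[str]:
    """Split a URL into tokens. Lowercasing first is an assumption of this sketch."""
    return TOKEN_RE.findall(url.lower())


# Hypothetical URL containing a percent-encoded U+200B.
url_tokens = tokens("http://example.com/-tracking%E2%80%8B")
print(url_tokens)

# Under this model a lookup only succeeds when a whole token matches.
print("tracking%e2%80%8b" in url_tokens)
print("%e2%80%8b" in url_tokens)
print("tracking" in tokens("http://example.com/trackingABC"))
```

Under this model, %E2%80%8B is absorbed into the longer token that starts with tracking, so a filter consisting of only "%E2%80%8B" or "%" has no token of its own to be found by, which is consistent with what was observed above.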

@gorhill

OK, so I can't figure out why this is an issue: I completely fail to see what the point of the blog post was.

@gorhill gorhill closed this