Skip to content
This repository has been archived by the owner on Sep 9, 2022. It is now read-only.

Ensure all URLs reaching uBlock's core are normalized to ASCII #498

Closed
gorhill opened this issue Jan 14, 2015 · 11 comments
Closed

Ensure all URLs reaching uBlock's core are normalized to ASCII #498

gorhill opened this issue Jan 14, 2015 · 11 comments

Comments

@gorhill
Copy link
Contributor

gorhill commented Jan 14, 2015

Today I went to my regular perusing of Adblock Plus issues -- to pre-emptively find out whether uBlock suffers same reported issues -- and I found one such issue which uBlock also suffers on Firefox, which is the handling of IDNs.

All URLs which reach the code code of uBlock needs to be normalized to ASCII. On Chromium the normalization is not necessary as the extension API always normalize URL to punycode. However that is not the case for Firefox.

gorhill added a commit that referenced this issue Jan 14, 2015
@gorhill
Copy link
Contributor Author

gorhill commented Jan 15, 2015

gorhill added a commit that referenced this issue Jan 15, 2015
@gorhill
Copy link
Contributor Author

gorhill commented Jan 15, 2015

@Deathamns In shouldLoadListener(), where does e.data comes from?

I ask because because it contains URL properties which do not have the punycoded version unlike the ones I used elsewhere (asciiSpec).

@Deathamns
Copy link
Contributor

e.data is the message data (defined by the browser, you can read about the basics here). It is sent from frameModule.js:154, but that URL is used for comparing the channel's URL in vapi-background.js.

If you need the asciiSpec for filtering only, then probably you don't want to use it everywhere, for example when comparing the URLs of opened tabs, or when determining the source tab for a popup.

Also, probably you misused self.URI (or I just don't understand what you meant there).
I'll send a change later that I think will work.

@Deathamns
Copy link
Contributor

Additional note: element picker should punycode the URLs in Firefox too. Probably an additional API in vapi-client.js (Chrome, Safari would use an ordinary anchor element, Firefox would use a modified nsIURI to mimic the anchor element).

@gorhill
Copy link
Contributor Author

gorhill commented Jan 15, 2015

If you need the asciiSpec for filtering only

I want everything passed to uBlock's platform independant code to be normalized to punycode. The code to extract the hostname from a URL, or to extract the effective domain from a hostname is not Unicode safe.

Re. self.URI, that's the only way I could make it work without having to re-allocate a new URI object every call.

I figured it was better to punycode on uBlock's core side, not in content scripts (so no need in vapi-client.js). That's what I tried at first and this was getting way too complicated. Let's consider the messaging.js counterparts of the injected scripts as the entry point where URL may need to be punycoded, hence why in the end I put vAPI.punycode* in background only.

I will go fix element picker, I thought about it but it escaped my mind at the end.

@Deathamns
Copy link
Contributor

URL is a constructor, and setting the href attribute won't do anything.
If you want to use the URL, then this is how: new URL('about:blank'). But, even like this, if you set the hostname (cachedURL.hostname = punycodeHostname(cachedURL.hostname)) to a punycoded version, later returning the href will still give you the unicode version.

@gorhill
Copy link
Contributor Author

gorhill commented Jan 15, 2015

will still give you the unicode version

I see, I should have stepped in the code. For whatever reasons, this fixed having a blank row in the popup menu's dynamic filtering. Too much in a hurry to have a release.

Se here is the thing:

  • Can we revert changes in vapi-client.js to remove getUrlNormalizer? (not needed)

Now the element picker won't work on Firefox for IDN, because the filter parsers do not support yet Unicode-encoded domain option. Since these parsers are key to load fast, I want to take my time to do the right thing here, and if it means for now not supporting that one filter in the Russian list and the element picker for Firefox for sites which use IDN, so be it. It will get fixed eventually, but not in a rushing way in a key part of uBlock.

@Deathamns
Copy link
Contributor

I just wanted to, but... conflict. In frameModule.js:195 the _urlNormalizer_ function too should be removed, I leave it to you.

@gorhill
Copy link
Contributor Author

gorhill commented Jan 15, 2015

Sorry about this, it's the coffee kicking in.

@chrisaljoudi
Copy link
Contributor

@gorhill what's a good way to test whether the platform-specific code is feeding Unicode-encoded hostnames to core? I suspect this might be what's triggering older Safari's crashes.

@gorhill
Copy link
Contributor Author

gorhill commented Feb 9, 2015

The "core" is javascript, if you feed it Unicode it might just not function properly, but certainly not crash.

Actually the core does use Unicode to encode hash values into compact strings in hash map. It's why I asked last time whether the very specific case of using '\u0000\u0000' as a hash key was a problem in Safari.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants