Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Norwegian list has a new URL now #41

Closed
DandelionSprout opened this issue May 15, 2018 · 17 comments
Closed

The Norwegian list has a new URL now #41

DandelionSprout opened this issue May 15, 2018 · 17 comments
Labels
enhancement New feature or request fixed issue has been addressed

Comments

@DandelionSprout
Copy link

DandelionSprout commented May 15, 2018

Hello again. I decided to change the URL of Dandelion Sprouts norske filtre from https://raw.githubusercontent.com/DandelionSprout/adfilt/master/ExperimentalNorwegianList.txt to https://raw.githubusercontent.com/DandelionSprout/adfilt/master/NorwegianList.txt, since the list has definitely progressed beyond merely being experimental.

Would it possible to update its URL in uBO's default lists accordingly, or?

@uBlock-user uBlock-user added the enhancement New feature or request label May 15, 2018
@uBlock-user
Copy link
Contributor

Wait till gorhill decides.

@DandelionSprout
Copy link
Author

Ah, okay. The old link will still work in the meantime, but will be less frequently updated.

@gorhill
Copy link
Member

gorhill commented May 15, 2018

assets.json is updated after 13 days, so keep your older list for at least 14 days to be sure no subscribers to your list are going to lose their subscription.

@gorhill gorhill added the fixed issue has been addressed label May 15, 2018
@DandelionSprout
Copy link
Author

Ah, okay. I'll make sure to keep the old list around for at least another two weeks, then.

@gorhill
Copy link
Member

gorhill commented May 15, 2018

Correction, I was confused: they would not loose their subscription, it's just that the attempt at downloading an up to date version would fail, with no other consequence than an out of date list.

@jspenguin2017
Copy link

jspenguin2017 commented May 15, 2018

@gorhill uBO doesn't recognize hour as unit of time, and will default to 5 days instead, do you want a PR to fix it (make hour round up to nearest day)?

DandelionSprout added a commit to DandelionSprout/adfilt that referenced this issue May 16, 2018
Apparently I only became aware just now that uBO doesn't recognise hours as a unit of time, as per uBlockOrigin/uBlock-issues#41 (comment).
@DandelionSprout
Copy link
Author

DandelionSprout commented May 16, 2018

While I wouldn't want to upset anyone, I hope that this is a good time for me to admit that there have been some occasions where I've had problems finding sufficient documentation about various syntax features, sometimes even having had to look up GitHub comment threads or have had to randomly guess (The !#if and ! Redirect tags comes to mind up until just recently).

So when Adblock Plus' explanation of the expiration feature explicitly mentions that hours are supported, it took me by surprise to learn from Jspenguin's comment that it wasn't supported in uBO.

While I would be supportive of uBO supporting hours natively, I can make due with making hours round up instead (so that listing it as "12 hours" would mean 12h on ABP and 1 day on uBO/Nano), so I support Jspenguin's proposition.

@jspenguin2017
Copy link

jspenguin2017 commented May 16, 2018

@DandelionSprout Yes, uBO's doc/wiki is known to be pretty horrible, a lot of details are buried in release notes which are not indexed. The code itself does have some useful comments embedded, and I use grep a lot (more than I should) to find what I'm looking for.

I am trying to write some docs for both uBO and my fork. I have some deep implementation details available here about the expire header parsing logic.

Note that uBO will take the first valid expire header, so if you have:

! Expires: 12 hours
! Expires: 1 day

uBO will take the 1 day. I'm not sure how ABP will act when there are duplicate headers though. Note that Nano does recognize hour as unit of time but will round it up to nearest day, so it will take the first line and treat it like 1 day.

Also, redirect header does not work at all in uBO and all headers must be in the first 1024 characters of the file. I think I have documented all the quirks about publishing filter lists, I'm considering copying it to uBO's wiki but I haven't got around to do that yet. Also I don't want to put Nano's doc in uBO's wiki, so there's quite a bit of cleaning up to do. As an effort to document uBO, you can ask me any implementation details and I'll document them.

As a side note, uBO is very tolerant when it comes to filters syntax and have no problem throwing out part of your filter if it doesn't like it. I patched the filter parser to expose all these quirks, this is one of the major feature of my fork and I find it a lot easier to write correct filters on the first shot. It does slow down filter parsing for all filter lists though, but just a tiny bit.
image

@DandelionSprout
Copy link
Author

The linked-to documentation does look pretty good to me, but it'd be even better if it also covered uses in Adblock Plus and Adguard, since although I estimate through a user survey that more than 90% of the users of my Norwegian list are using uBlock Origin, I wouldn't want to break compatibility with other blocker tools.

As for the expire header, I decided to try with a
! Expires: 12 hours (Adblock Plus)
! Expires: 1 day (uBlock Origin)
setup now to alleviate potential confusion, in the hopes that paranthesises are allowed in the expire tag.

And as for keeping the old list around, I suddenly realised an hour ago that "Wait a minute. Why don't I just make use of the !#include tag to link the old list to the new list?", and it seems to me to have worked out splendidly.

@jspenguin2017
Copy link

jspenguin2017 commented May 16, 2018

@DandelionSprout I'm not familiar with internal working of ABP and Adguard, I can link to their documentation but I can't make a clear side-by-side comparison unless I investigate the code. There could be more than one way to interpret the documentation but there is only one way to interpret the code. If you are familiar with them, you are welcomed to contribute.

Also documentation could contains simplifications, for example, I said you need to have subscribe links present in the document when load event is fired, but actually, it is 997 milliseconds after load event is fired. This also means those links won't work until 997 milliseconds after load.

And yes, !#include pre-processing directive will work, but it is not ideal for long term as it does create two requests.

@DandelionSprout
Copy link
Author

DandelionSprout commented May 16, 2018

As an effort to document uBO, you can ask me any implementation details and I'll document them.

Now that you mention it, there are a few things about list syntaxes that I've always been curious about:

• Does the expire header have to use UTC, or does it support other timezones as well, e.g. CEST (Central European Summer Time)?
• If a header also contains other words, does it become invalidated or not? E.g. using ! Sist endret / Last Modified: as the Last Modified header, since "Sist endret" is the term's Norwegian translation.
• Is there any way in which it'd be possible to add filters for a browser's default startpage, e.g. for Edge's not particularly good MSN-based startpage?

And yes, !#include pre-processing directive will work, it is not ideal for long term as it does create two requests.

I don't imagine that I'd need to keep the old file around for more than a month, and my list isn't among the most popular lists in the world either, meaning that I doubt it'd hit some kind of request limit. So I hope it'll do fine for now.

@jspenguin2017
Copy link

jspenguin2017 commented May 16, 2018

@DandelionSprout

  • Expire header is a time interval, has nothing to do with UTC, I think it is clear enough in my doc.
  • As stated in my doc, only title and expires are parsed.
  • No, protected origins are protected. I have noted this down, I'll write a detailed doc later. As a side note, installing apps can add protected origins, for example, the YouTube app will turn youtube.com into a protected origin.

@gorhill As for the expires header in hour, I think the low cap being 1 day is healthy, it prevents abuse; but for things like 30 hours, I don't think having a float in the data object would break stuff, I haven't tested it though. I have a 60 day upper cap in Nano because if the expires delay overflows to Infinity, it could break manual update (actually 49 years should be enough to break it... until 2019).

@DandelionSprout
Copy link
Author

Ah, okay. My first question was meant to be about the Last Modified header instead of the Expires header, but now that I know that it's not an actually important header anyway, I now know that the wording of Last Modified doesn't really matter anyway, which is kind-of why I asked all these questions so that I could learn about them properly. Thanks for teaching me about it.

DandelionSprout added a commit to DandelionSprout/adfilt that referenced this issue May 16, 2018
@jspenguin2017
Copy link

jspenguin2017 commented May 16, 2018

in the hopes that paranthesises are allowed in the expire tag.

Yes, they are allowed, actually, anything is allowed after the time unit, so ! Expires: 3 dayyyyyys is valid. I updated the doc accordingly. Actually the space after ! is optional but after # is not; there are still more happening under the hood, but I chose to not document them. The doc shows one right way of doing things, but it isn't necessarily the only way.

since "Sist endret" is the term's Norwegian translation.

Headers must follow the format shown in my doc, so no, adding a translation is not going to work. However, since Last modified is not parsed, it is just treated like comment. things like ! (whatever) / Expires: 1 day would not work.

Also make sure to test your filter list in ABP, I can't say how it will react to weird header syntax.


I have opened a meta thread for all documentation related discussions: NanoAdblocker/NanoCore#167

I will probably open a new repo specifically for docs, as I find GitHub's wiki difficult to work with (especially for localization).

@gorhill
Copy link
Member

gorhill commented May 16, 2018

do you want a PR to fix it (make hour round up to nearest day)?

Sure. Thanks.

@jspenguin2017
Copy link

jspenguin2017 commented May 17, 2018

@DandelionSprout I have investigated ABP's code, it will take the last header, even if it is invalid, and validate it later. So if you have:

! Expires: 3 days
! Expires: three days

ABP will take the second header, and realize later that it is not good, and fallback to default expire time.

So you want:

! Expires: 1 day (uBO)
! Expires: 12h (ABP)

uBO will support ! Expires: 1d on the next release, but not yet. ABP accepts 1d, 12h, etc.


As a side note, Nano on Edge behaves differently than uBlock Edge, some filters won't be compatible. uBlock Edge treats fetch as different than xhr but Nano doesn't.

Nano doesn't have a special pre-parsing directive for now, so if you run into difficulties, let me know.

@gorhill
Copy link
Member

gorhill commented May 17, 2018

@nikrolls fyi

uBlock Edge treats fetch as different than xhr but Nano doesn't.

DandelionSprout added a commit to DandelionSprout/adfilt that referenced this issue May 17, 2018
adisib pushed a commit to adisib/uBlock that referenced this issue May 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request fixed issue has been addressed
Projects
None yet
Development

No branches or pull requests

4 participants