Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub Linguist support for adblock rules #14699

Merged
merged 1 commit into from
Sep 9, 2022
Merged

GitHub Linguist support for adblock rules #14699

merged 1 commit into from
Sep 9, 2022

Conversation

scripthunter7
Copy link
Contributor

@scripthunter7 scripthunter7 commented Sep 6, 2022

GitHub has been supporting adblock syntax highlighting since September 5th. This PR will help you turn it on.

Further informations:

Please note that the following rules will be displayed as invalid:

! Skipped arguments
example.com##+js(scriptlet, , arg2, arg3)
example.com##+js(scriptlet, arg2, , arg3)

This is a known issue and will be fixed in the next Linguist release (in about ~2 months). Related highlight PR: AdguardTeam/VscodeAdblockSyntax#50

@scripthunter7

This comment was marked as resolved.

@scripthunter7 scripthunter7 reopened this Sep 6, 2022
@MasterKia
Copy link
Member

Let's keep this open.
@gorhill Could you check this?

@@ -0,0 +1 @@
*.txt linguist-language=AdBlock linguist-detectable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scripthunter7 I noticed that [uBlock Origin] can be added to the beginning of each list to get syntax highlighting there.

So can/should the linguist-language here be "uBlock Origin" as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these use Finnish list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MasterKia If the [uBlock Origin] agent is present at the beginning of every file, it is not necessary.

But if you want statistics like this, then you need linguist-detectable
image

*.txt linguist-detectable

Copy link
Member

@MasterKia MasterKia Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant something like *.txt linguist-language="uBlock Origin" linguist-detectable would work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant something like *.txt linguist-language="uBlock Origin"

This is simply not possible as Linguist doesn't know such a language or alias.
See https://github.com/github/linguist/blob/bf853f1c663903e3ee35935189760191f1c45e1c/lib/linguist/languages.yml#L230-L242

Highlighter supports ABP, uBO and ADG syntax. In summary, the grammar is called "adblock language"

Copy link
Contributor

@krystian3w krystian3w Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can disable detect shell language in stats by same file.

Later you can same for Persian repo - e.g. disable detect JavaScript as main repo language.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but AdGuard can also block a lot of things, not just ads. Actually I don't want to do justice in this, I just wanted to help with the highlighter. 🙂 Perhaps you can try to submit several aliases to Linguist, eg. adblock plus, ublock / ublock origin, adguard, but the name of the language remains "Adblock Filter List" in that case as well.

Copy link
Contributor

@krystian3w krystian3w Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this must wait for alias and disable generate stats as long don't count as other section "Wide spectrum filter list" (almost 0% chance to these splitting or rename if this can't detect EasyList simple filtering).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this must wait

Or just add [uBlock Origin] to each file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, the statistics always use the original names of the languages, i.e. in this case, even if you enter an alias in the .gitattributes file, the statistics will still show "Adblock Filter List" (by the way, adblock is also an alias - an abbreviation)

@krystian3w
Copy link
Contributor

krystian3w commented Sep 6, 2022

PS. this down procentage shell syntax from 100% to 5,56% after cover all txt files.

IMG_20220906_104652

If we cover only few txt files then down no more less than 7,14% (files with locked show in other way than raw e.g. mirror EL and EP).

@krystian3w
Copy link
Contributor

krystian3w commented Sep 6, 2022

To wide see gorhill words.

#14695 (comment)

Possible see on: https://github.com/ameshkov/VscodeAdblockSyntax/blob/master/test_rules.txt

This filter below is valid in uBO, it will convert to ASCII at compile time:

пример.рф##banner

I think it's fair to let filter list maintainers use readable domain names.

@MasterKia
Copy link
Member

MasterKia commented Sep 6, 2022

Also happens for Persian letters (but Persian domain names are very rare):

пример.рф##banner

درود.ir##.ads

@krystian3w
Copy link
Contributor

Worse if these really doesn't work in ABP and AdGuard and once or both closes a decline.

@MasterKia
Copy link
Member

Why is it a deal-breaker?

@scripthunter7
Copy link
Contributor Author

Why is it a deal-breaker?

The skipped scriptlet argument affects many more rules than the unicode domain. But both are known issues and they will hopefully be resolved in the next version.

@gorhill gorhill merged commit 7097500 into uBlockOrigin:master Sep 9, 2022
@scripthunter7
Copy link
Contributor Author

scripthunter7 commented Nov 15, 2022

@krystian3w @MasterKia @gorhill @mapx-
The new Linguist release came out today, which already uses the improved highlighter:

example.com##+js(scriptlet, , arg2, arg3)
пример.рф##banner

Please note that it may take some time for the new version to go live everywhere.

If you find any more highlighter issues, please report them here: https://github.com/ameshkov/VscodeAdblockSyntax/issues

GitHub Linguist is published approximately every quarter, and the latest versions of the grammars are downloaded before each release.

@krystian3w
Copy link
Contributor

krystian3w commented Nov 24, 2022

I think gray is bad into blockquote:

||example.org^$all,domain=com,denyallow=icu
/foo
! comment
||example.org^$all,domain=com,denyallow=icu
/foo
! comment

"fix" can be:

blockquote .highlight-text-adblock pre { color: var(--color-fg-default) }

But IDK about this think "Microsoft".

@scripthunter7
Copy link
Contributor Author

@krystian3w The issue you presented is also true for JS:

// comment
let something = 2
// comment
let something = 2

I think this should be fixed in the GitHub theme.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants