Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow limited parsing of ABP style adlists #5179

Merged
merged 9 commits into from
Mar 14, 2023
Merged

Allow limited parsing of ABP style adlists #5179

merged 9 commits into from
Mar 14, 2023

Conversation

yubiuser
Copy link
Member

@yubiuser yubiuser commented Feb 16, 2023

What does this PR aim to accomplish?:

This PR implements limited support for AdBlock Plus (ABP)-style domain lists in Pi-hole. Accompanies pi-hole/FTL#1532
Allowed are only entries in ||subdomain.domain.tlp^ style.


By submitting this pull request, I confirm the following:

  1. I have read and understood the contributors guide, as well as this entire template. I understand which branch to base my commits and Pull Requests against.
  2. I have commented my proposed changes within the code and I have tested my changes.
  3. I am willing to help maintain this change if there are issues with it later.
  4. It is compatible with the EUPL 1.2 license
  5. I have squashed any insignificant commits. (git rebase)
  6. I have checked that another pull request for this purpose does not exist.
  7. I have considered, and confirmed that this submission will be valuable to others.
  8. I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  9. I give this submission freely, and claim no ownership to its content.

  • I have read the above and my PR is ready for review. Check this box to confirm

@jfb-pihole
Copy link
Sponsor Member

I recommend that we clarify that this is "limited parsing of ABP style blocklists" for lines that are clearly identified as wildcard blocks only.

@yubiuser yubiuser changed the title Allow adding ABP style blocklists Allow limited parsing of ABP style adlists Feb 17, 2023
@yubiuser yubiuser marked this pull request as ready for review February 17, 2023 21:10
@yubiuser
Copy link
Member Author

Gravity should be fine now. I haven't thought about pihole -q yet. Please test anyway.

@dschaper
Copy link
Member

dschaper commented Feb 17, 2023

root@09184bcf19b7:/# pihole -g
  [i] Neutrino emissions detected...
  [✓] Pulling blocklist source list into range

  [✓] Preparing new gravity database
  [i] Using libz compression

  [i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
  [✓] Status: Retrieval successful
  [i] Imported 182111 domains, ignoring 2 non-domain entries
      Sample of non-domain entries:
        - 0.0.0.0
        - seg.hadron.ad.g
  [i] List stayed unchanged

  [i] Target: https://big.oisd.nl
  [✓] Status: Retrieval successful
  [i] Imported 289063 domains, ignoring 6 non-domain entries
      Sample of non-domain entries:
        - 2023-02-17t16:29:22+0000
        - 202302171629
        - big
        - break.
        - frequency)

  [✓] Creating new gravity databases
  [✓] Storing downloaded domains in new gravity database
  [✓] Building tree
  [✓] Swapping databases
  [✓] The old database remains available.
  [i] Number of gravity domains: 493776 (476943 unique domains)
  [i] Number of exact blacklisted domains: 1
  [i] Number of regex blacklist filters: 0
  [i] Number of exact whitelisted domains: 1
  [i] Number of regex whitelist filters: 0
  [✓] Flushing DNS cache
  [✓] Cleaning up stray matter

@dschaper
Copy link
Member

dschaper commented Feb 17, 2023

root@09184bcf19b7:/etc/pihole# grep 0001.best list.4.big.oisd.nl.domains
||0001.best^

root@09184bcf19b7:/# dig 0001.best @127.0.0.1 +short
0.0.0.0
root@09184bcf19b7:/# dig i.am.the.0001.best @127.0.0.1 +short
0.0.0.0
root@09184bcf19b7:/# pihole -q 0001.best

image

@PromoFaux
Copy link
Member

After last commit:

  [i] Target: https://big.oisd.nl
  [✓] Status: No changes detected
  [i] Imported 294466 domains

@PromoFaux
Copy link
Member

Also now fixed pihole -q.

Before:

root@pihole:/# pihole -q eu1.clevertap-prod.com
 Match found in https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts:
   eu1.clevertap-prod.com
 Match found in :
   
root@pihole:/# 

After:

root@pihole:/# pihole -q eu1.clevertap-prod.com
 Match found in https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts:
   eu1.clevertap-prod.com
 Match found in https://big.oisd.nl:
   ||eu1.clevertap-prod.com^
root@pihole:/# 

@dschaper
Copy link
Member

pihole -q eu1.clevertap-prod.com

Dare I ask pihole -q sub.eu1.clevertap-prod.com?

@dschaper
Copy link
Member

I'd like to see pihole -q do the following for pihole -q sub.0001.best but that may take a bit to figure out a good way to recurse in shell script. I don't want to block this PR to wait for that, I'm okay with releasing this and working on the query output.

Match found in hosts list:
  sub.0001.best
Match found in abp list:
  ||0001.best^

yubiuser and others added 3 commits February 22, 2023 21:21
Co-authored-by: Adam Warner <me@adamwarner.co.uk>
Signed-off-by: Christian König <ckoenig@posteo.de>
…nd header

Also splits the piped "one-liner" in ParseFileIntoDomains into individually commented commands (makes for easier reading and debugging)

Signed-off-by: Adam Warner <me@adamwarner.co.uk>
… we now expect some valid results to contain '|'

Signed-off-by: Adam Warner <me@adamwarner.co.uk>
gravity.sh Outdated Show resolved Hide resolved
gravity.sh Outdated Show resolved Hide resolved
gravity.sh Outdated Show resolved Hide resolved
gravity.sh Outdated Show resolved Hide resolved
gravity.sh Outdated Show resolved Hide resolved
gravity.sh Outdated Show resolved Hide resolved
Signed-off-by: Christian König <ckoenig@posteo.de>
gravity.sh Outdated Show resolved Hide resolved
Signed-off-by: Christian König <ckoenig@posteo.de>
@yubiuser yubiuser added PR: Approved Open Pull Request, Approved by required number of reviewers and removed PR: Approval Required Open Pull Request, needs approval labels Mar 3, 2023
Signed-off-by: Christian König <ckoenig@posteo.de>
@yubiuser yubiuser requested a review from PromoFaux March 4, 2023 10:01
@dschaper dschaper added PR: Approval Required Open Pull Request, needs approval and removed PR: Approved Open Pull Request, Approved by required number of reviewers labels Mar 13, 2023
@dschaper
Copy link
Member

Can you revert that last change please. The trailing anchor needs to be addressed separately.

This reverts commit 0b5da9f.

Signed-off-by: Christian König <ckoenig@posteo.de>
@dschaper dschaper added PR: Approved Open Pull Request, Approved by required number of reviewers and removed PR: Approval Required Open Pull Request, needs approval labels Mar 14, 2023
@dschaper dschaper enabled auto-merge March 14, 2023 19:37
@dschaper dschaper disabled auto-merge March 14, 2023 20:22
@dschaper dschaper merged commit 75a32d2 into development Mar 14, 2023
@dschaper dschaper deleted the abp branch March 14, 2023 20:23
@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/pi-hole-ftl-v5-22-web-v5-19-and-core-v5-16-1-released/61999/1

@jpgpi250
Copy link
Contributor

The change in gravity .sh to expand the valid_domain_pattern, this to allow domain entries with a trailing period appeard to be lost in the v5.16.1

current:

valid_domain_pattern="([a-z0-9]([a-z0-9_-]{0,61}[a-z0-9]){0,1}\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9]"

during the trials I performed:

valid_domain_pattern="([a-z0-9]([a-z0-9_-]{0,61}[a-z0-9]){0,1}\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9]\.?"

@dschaper
Copy link
Member

dschaper commented Mar 23, 2023

Two different issues. This PR and code is to parse specific ABP style lists.

The validity of trailing anchors is a separate and unrelated issue.

Edit: I noted this previously in this PR #5179 (comment)

@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/urgent-feature-request-to-whitelist/63603/15

@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/add-feature-to-import-publicly-maintained-regex-lists/27161/19

@pralor-bot
Copy link

This pull request has been mentioned on Pi-hole Userspace. There might be relevant details there:

https://discourse.pi-hole.net/t/supporting-dnsmasq-wildcard-blocklists/8761/11

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: Approved Open Pull Request, Approved by required number of reviewers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants