Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbound DNSBL downloader script duplicate zones and doesn't sanitize list entries #4898

Closed
2 tasks done
defiantmonkey opened this issue Apr 5, 2021 · 7 comments
Closed
2 tasks done
Assignees
Labels
cleanup Low impact changes
Milestone

Comments

@defiantmonkey
Copy link

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Describe the bug

My unbound installation wouldn't start the other day and I found out that some urls in dnsbl.conf had an extra quote (") inside the the already quoted url, which unbound didn't like and wouldn't start so long as the dnsbl.conf was malformed. The syslog for backend through the GUI only showed that it tried to start and exited with a bad status code (1), so in order to find out what was causing issues I had to manually invoke unbound_start.sh from a terminal.

The entries comes from one of the lists I'm using but I did not check which one. It might also be worth noting that there were other special characters some of the urls (one I saw had ," as part of the domain name).

As a side effect of troubleshooting this, I also found out that when the downloader runs there are duplicate domains, as the entries being added aren't stripped of extra spaces and could potentially be the same domain but different casing (ie: uppercase vs lowercase counts as uniqueness).

To Reproduce

Steps to reproduce the behavior:

  1. Have entries added to dnsbl.conf through download_blacklists.py that contain extra quotes in domain names and/or same domains different casing/whitespace
  2. Start unbound service
  3. In the case of extra quotes in domain strings in dnsbl.conf, unbound will not start and return a vague error in syslog (only saying that unbound exited with a non-zero status)
  4. In the case that unbound starts successfully and dnsbl.conf contains duplicate domains, unbound log will spit out many warnings about duplicate zones

Expected behavior

download_blacklists.py strips all extra whitespace, considers different cased domain names as non-unique, and strips away extra quotes in domain names

Describe alternatives you considered

Currently I have modified download_blacklists.py line 128 locally to replace "blacklist_items.add(entry)" with "blacklist_items.add(entry.lower().strip().replace(""", ""))" in order to not manually have to check dnsbl.conf and fix the entries.

Environment

OPNsense 21.1.4-amd64 (LibreSSL)

@AdSchellevis AdSchellevis self-assigned this Apr 5, 2021
@AdSchellevis AdSchellevis added the cleanup Low impact changes label Apr 5, 2021
@AdSchellevis
Copy link
Member

@defiantmonkey can you try 31a0c40 ?

install from a console:

opnsense-patch 31a0c40

@kulikov-a
Copy link
Member

Hi
extra quotes are not the only garbage found in dnsbls. and imho simple removal of extra quotes does not guarantee that the correct fqdn record will turn out.
I use a custom download_blacklists.py script and collect regex-filtered records into a separate file. here is the part from last one:
www.usaa.com.inet.entlogon.logon.redirectjsp.ef4bce064403e276e26b792bda81c384ce09593b819e632a1.3923ef2c8955cookieid112824nosde.b
216.55.87.6
www.login.microsoftonline.com.common.oauth2.authorize.clientid4345a7b99a634910a42635363201d503.response.mode.form.postresponse.t
210.55.182.5
thetradersden.org/forums/tracker
204.167.148.2
www.https.portale-titolari-nexi-italia-online-verifica-nexi.it.infomazione.nexi-identita.conto.nexi.bp2ip.dayasinergibisnis.co.i
192.67.198.3
torrent.nwps.w
www.barclaycard.co.uk-personal-codeb6259568de39651d7a-login.id-107sbtd9cbhsbtd5d80a13c0db1f546757jnq9j5754675780483825.exaways.c
blog.btdeployments.cu.m
www.login.microsoftonline.com.common.oauth2.authorize.clientid4345a7b9.9a63.4910.a426-35363201d503.response.spreadsheetlimited.c
217.168.236.2

if this is a good place for discussion I would suggest changing the regex in the blacklists.ini to
^(?!(?=^.{4,253}$)(^((?!-)[a-zA-Z0-9\-\_]{0,62}[a-zA-Z0-9\-\_]\.)+[a-zA-Z0-9-]{2,63}\.?$))
to just discard invalid entries
which does not negate that imho it would be more correct to switch to the unbound-control for loading lists

@AdSchellevis
Copy link
Member

@kulikov-a I think we just crossed each other :) can you take a look at 31a0c40 as well? cleansing is definitely a good idea.

@kulikov-a
Copy link
Member

@AdSchellevis ) works great! I want to test the regex itself a little more, but it works great for the first checks. thanks!

@kulikov-a
Copy link
Member

kulikov-a commented Apr 5, 2021

@AdSchellevis I remembered why I included the leading underscore in the regex used.
As far as I understand, although the standard does not allow this in the hostname, in fact browsers try to resolve these names. that is, it turns out that the use of leading underscore can be used to bypass a dnsbls with strict compliance checking.
example from one of the lists: _ldap._tcp.pdc._msdcs.adserver.com
so maybe then it is worth allowing addresses with leading underscore (I checked - this does not kill the unbound) with
domain_pattern = re.compile( r'(([\da-zA-Z_])([_\w-]{,62})\.){,127}(([\da-zA-Z])[_\w-]{,61})' r'?([\da-zA-Z]\.((xn\-\-[a-zA-Z\d]+)|([a-zA-Z\d]{2,})))' )
?
and ^(?![a-zA-Z_\d]).* in

[exclude]
# exclude localhost entries
default_pattern_1=.*localhost$
# exclude non domain entries
default_pattern_2=^(?![a-zA-Z\d]).*

although probably the latter now makes no sense at all)

@fichtner fichtner added this to the 21.7 milestone Apr 6, 2021
@AdSchellevis
Copy link
Member

@kulikov-a yes, let's do that. thanks!

@kulikov-a
Copy link
Member

@AdSchellevis thanks!!

fichtner pushed a commit that referenced this issue May 26, 2021
(cherry picked from commit 31a0c40)
(cherry picked from commit 565688c)
oshogbo pushed a commit to DynFi/opnsense-core that referenced this issue Mar 3, 2022
oshogbo pushed a commit to DynFi/opnsense-core that referenced this issue Mar 3, 2022
oshogbo pushed a commit to DynFi/opnsense-core that referenced this issue Mar 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Low impact changes
Development

No branches or pull requests

4 participants