whitespace hosts lines halting script #8

Closed
caneylan opened this issue Dec 30, 2014 · 2 comments

@caneylan

Hello!

The current version of the hosts file for someonewhocares.org contains a line consisting of a single space, which breaks the file parser:

$ git clone https://github.com/StevenBlack/hosts.git
Cloning into 'hosts'...
remote: Counting objects: 513, done.
remote: Total 513 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (513/513), 21.22 MiB | 1.46 MiB/s, done.
Resolving deltas: 100% (209/209), done.
Checking connectivity... done.
$ cd hosts/
$ egrep -n '^ $' data/*/hosts
data/someonewhocares.org/hosts:323:
$ python ./updateHostsFile.py
Do you want to update all data sources? [Y/n] y
Updating source mvps.org from http://winhelp2002.mvps.org/hosts.txt
Updating source yoyo.org from http://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext
Updating source StevenBlack from https://raw.github.com/StevenBlack/hosts/master/data/StevenBlack/hosts
Updating source someonewhocares.org from http://someonewhocares.org/hosts/hosts
Updating source malwaredomainlist.com from http://www.malwaredomainlist.com/hostslist/hosts.txt
Do you want to exclude any domains?
For example, hulu.com video streaming must be able to access its tracking and ad servers in order to play video. [Y/n] y
Do you want to exclude the domain hulu.com ? [Y/n] y
Do you want to exclude any other domains? [Y/n] n
==>::1 localhost<==
==>::1 localhost<==
==># mistyped<==
==># log<==
==># URLs<==
==># May<==
==># video<==
==># and<==
==># up<==
==># problems<==
A line in the hostfile is going to cause problems because it is nonstandard
The line reads
 please check your data files. Maybe you have a comment without a #?
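
The egrep check above only finds lines holding exactly one space. As a rough sketch, the same scan can be done in Python and widened to any run of spaces or tabs. The function name `find_whitespace_only_lines` is hypothetical and is not part of updateHostsFile.py:

```python
import re

# Assumption: "whitespace-only" means one or more spaces/tabs and nothing else.
# Truly empty lines are not reported, since the script already handles those.
WHITESPACE_ONLY = re.compile(r"^[ \t]+$")


def find_whitespace_only_lines(text):
    """Return the 1-based numbers of lines that contain only spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if WHITESPACE_ONLY.match(line)
    ]
```

Passing it the contents of a data file (for example `open("data/someonewhocares.org/hosts").read()`) should list the offending line numbers, much like `egrep -n` does.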

The removeDups() function only passes through lines that start with '#' or consist of a bare newline, so a line consisting of spaces and a newline reaches the stripRule() function and halts the script. The test can be updated to match any line consisting only of whitespace:

$ git diff updateHostsFile.py
diff --git a/updateHostsFile.py b/updateHostsFile.py
index 709a525..14dac7a 100644
--- a/updateHostsFile.py
+++ b/updateHostsFile.py
@@ -165,7 +165,7 @@ def removeDups(mergeFile):

        hostnames = set()
        for line in mergeFile.readlines():
-               if line[0].startswith("#") or line[0] == '\n':
+               if line[0].startswith("#") or re.match(r'^\s*$', line[0]):
                        finalFile.write(line) #maintain the comments for readability
                        continue
                strippedRule = stripRule(line) #strip comments
$ python ./updateHostsFile.py
Do you want to update all data sources? [Y/n] y
Updating source mvps.org from http://winhelp2002.mvps.org/hosts.txt
Updating source yoyo.org from http://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&mimetype=plaintext
Updating source StevenBlack from https://raw.github.com/StevenBlack/hosts/master/data/StevenBlack/hosts
Updating source someonewhocares.org from http://someonewhocares.org/hosts/hosts
Updating source malwaredomainlist.com from http://www.malwaredomainlist.com/hostslist/hosts.txt
Do you want to exclude any domains?
For example, hulu.com video streaming must be able to access its tracking and ad servers in order to play video. [Y/n] y
Do you want to exclude the domain hulu.com ? [Y/n] y
Do you want to exclude any other domains? [Y/n] n
==>::1 localhost<==
==>::1 localhost<==
Success! Your shiny new hosts file has been prepared.
It contains 25343 unique entries.
Do you want to replace your existing hosts file with the newly generated file? [Y/n] n
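
One caveat about the diff above: `line[0]` is only the first character of the line, so the regex test also matches any line that merely begins with whitespace (an indented host entry, say), and such a line would be copied through without de-duplication. A minimal sketch of a check on the whole line, assuming nothing beyond the standard library, could look like this. `is_passthrough` and `split_lines` are hypothetical names, not functions from updateHostsFile.py, and treating indented comments as comments is an assumption that goes slightly beyond the original test:

```python
def is_passthrough(line):
    """Return True for lines to copy verbatim: comments and blank lines.

    The whole line is stripped first, so a line made of spaces/tabs plus a
    newline counts as blank, while an indented host entry does not.
    """
    stripped = line.strip()
    return stripped == "" or stripped.startswith("#")


def split_lines(lines):
    """Separate pass-through lines from candidate host rules (illustrative)."""
    passthrough, rules = [], []
    for line in lines:
        if is_passthrough(line):
            passthrough.append(line)
        else:
            rules.append(line)
    return passthrough, rules
```

Using `str.strip()` also avoids needing `import re` in the script just for this test.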

--Chris

@StevenBlack
Owner

Right you are, Chris. Thanks.

StevenBlack added a commit that referenced this issue Jan 2, 2015
@StevenBlack
Owner

Thank you @caneylan — closing this now.

StevenBlack pushed a commit that referenced this issue Dec 10, 2016
mitchellkrogza pushed a commit to mitchellkrogza/hosts that referenced this issue Aug 14, 2017
…ter about whitespace preceding comments. Thanks @caneylan.

Former-commit-id: f5d622a
Former-commit-id: dfbaee8bb4500ec29b31ae68c4230962523bec59
mitchellkrogza pushed a commit to mitchellkrogza/hosts that referenced this issue Aug 14, 2017
merge

Former-commit-id: e36136a
Former-commit-id: 53b1861cb472ebfc12c93c958efc8a85594cc1ce