
Hosts File Integrity

A bash script to update and pseudo-verify a site-blocking hosts file.

StevenBlack on GitHub maintains a fantastic unified hosts file, combining data from many different reputable sources. The hosts file contains a list of malicious, undesirable or inappropriate sites that should be blocked.

This script is not designed to be a robust, plug-and-play solution. It is designed specifically for my needs; however, it may also be useful to somebody else.

This script is designed specifically for use with StevenBlack's unified hosts file. However, it should be easy to modify it to work with another.

I am aware that StevenBlack already provides a script for updating the hosts file. I created this alternative for my own use, focusing more on security, automation and output that is more useful to me.

What this script does:

  • Checks for an updated version of the hosts file
  • Downloads and pseudo-verifies the updated hosts file
  • Modifies the hosts file to fit my system
  • Provides a useful output in order to keep the user in the loop
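
The update check can be pictured as a comparison of SHA256 digests, as the example output further down suggests. This is only a minimal sketch: the function names are made up for illustration, it assumes GNU `sha256sum` is installed, and it takes two local files so that it can be tried without touching the network.

```bash
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not taken from the real script.

sha256_of() {
    # Print just the hex digest of the given file.
    local out
    out=$(sha256sum < "$1") || return 1
    printf '%s\n' "${out%% *}"
}

update_available() {
    local live_file=$1 local_file=$2
    local live_hash local_hash
    live_hash=$(sha256_of "$live_file") || return 2
    local_hash=$(sha256_of "$local_file") || return 2
    echo "- Live file SHA256: $live_hash"
    echo "- Local file SHA256: $local_hash"
    # Succeed (exit 0) only when the digests differ, i.e. an update exists.
    [ "$live_hash" != "$local_hash" ]
}
```

For example, `update_available live-hosts.txt old-hosts.txt && echo "- An update is available!"` would print the message only when the two files differ (both file names here are hypothetical).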

What this script doesn't do:

  • Provide cryptographic proof of authenticity or integrity
  • Handle errors - if something fails in the script, you'll have to investigate it yourself
  • Handle significant changes to the format/layout of the hosts file - the script will need adjusting too
  • Prevent every possible attack on the hosts file

Pseudo-verification:

The "pseudo-verification" performed by this script refers to checking file statistics (byte count and newline count) and filtering out content that matches a specific regex pattern. The theory is that every piece of whitelisted/legitimate content matches the pattern and is stripped out, so anything left behind is content that shouldn't be there. This is of course not a perfect solution, but it is a good way to perform basic checks on the integrity of the file before putting it in place on your system.
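
To make the idea concrete, here is a rough sketch of grep-stripping plus the statistics checks. The regex is a simplified assumption (blank lines, comments and `0.0.0.0 hostname` entries only) and the function name is hypothetical; the real script's pattern will differ. The default thresholds are the ones shown in the example output below.

```bash
#!/usr/bin/env bash
# Sketch only: the pattern and the function name are illustrative assumptions,
# not taken from update-verify-file.sh.

pseudo_verify() {
    local file=$1 min_bytes=${2:-1200000} min_newlines=${3:-50000}
    # Simplified whitelist: blank lines, comment lines, and
    # "0.0.0.0 <hostname>" entries with an optional trailing comment.
    local allowed='^([[:space:]]*|#.*|0\.0\.0\.0[[:space:]]+[A-Za-z0-9._-]+[[:space:]]*(#.*)?)$'
    local leftover bytes newlines

    # Strip every whitelisted line; whatever remains is unexpected content.
    leftover=$(grep -Ev -- "$allowed" "$file")
    if [ -n "$leftover" ]; then
        echo "- Grep-stripping allowed/safe content from file: FAILED"
        return 1
    fi

    # Statistics checks: a truncated or mostly empty download fails here.
    bytes=$(( $(wc -c < "$file") ))
    newlines=$(( $(wc -l < "$file") ))
    if [ "$bytes" -le "$min_bytes" ] || [ "$newlines" -le "$min_newlines" ]; then
        echo "- Checking file byte/newline count: FAILED ($bytes bytes, $newlines newlines)"
        return 1
    fi
    echo "File successfully verified!"
}
```

Calling `pseudo_verify new-hosts.txt` (a hypothetical file name) returns non-zero as soon as one line falls outside the whitelist, which is the point of the approach: the check lists what is allowed rather than trying to list everything that is bad.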

Script usage:

There are two scripts in this repository: "update-verify-file.sh" and "verify-implement-file.sh":

  • The first script (update-verify-file.sh) should be run as a non-privileged user. It checks for an update, downloads it if one is available, and pseudo-verifies the downloaded file.

  • The second script (verify-implement-file.sh) should be run as a user with write access to /etc/hosts. This script will verify the latest version of the hosts file created by the first script and write it to /etc/hosts.
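
The privileged step can be kept very small. The sketch below is not the code from verify-implement-file.sh: it assumes the first script recorded a SHA256 digest of the file it verified, and the function name, the backup directory argument and the `hosts.bak` name are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the recorded digest, the backup directory and the function name
# are assumptions for illustration, not part of verify-implement-file.sh.

implement_hosts() {
    local new_file=$1 expected_sha256=$2 backup_dir=$3 target=${4:-/etc/hosts}
    local actual

    # Re-check the candidate file against the digest recorded earlier.
    actual=$(sha256sum < "$new_file") || return 1
    actual=${actual%% *}
    if [ "$actual" != "$expected_sha256" ]; then
        echo "- Hash mismatch, refusing to write $target" >&2
        return 1
    fi

    # Back up the current hosts file, then overwrite it in place.
    cp -- "$target" "$backup_dir/hosts.bak" || return 1
    cat -- "$new_file" > "$target"
}
```

Overwriting in place with `cat` only needs write access to the target file itself, which matches the requirement stated above; a rename-based approach would also need write access to /etc.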

I suggest setting up a cron job to run the scripts automatically every 24 hours. The output of the scripts is designed to be sent to the user, for example by email or instant message.
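
One way to wire this up is a small wrapper that cron calls, which captures everything the script prints and pipes it to a delivery command. This is a sketch under assumptions: the wrapper function, the paths, the schedule and the `mail` invocation are all placeholders for whatever you actually use.

```bash
#!/usr/bin/env bash
# Sketch of a cron wrapper: run a command, capture everything it prints,
# and pass the captured report to a delivery command on stdin.
# Hypothetical crontab entry (runs daily at 03:30):
#   30 3 * * * /home/user/bin/hosts-report.sh

run_and_deliver() {
    local deliver_cmd=$1
    shift
    local report status
    # Capture stdout and stderr so failures also reach the user.
    report=$("$@" 2>&1)
    status=$?
    # deliver_cmd runs through sh so it can carry its own arguments,
    # e.g. 'mail -s "Hosts file update" user@example.com'.
    printf '%s\n' "$report" | sh -c "$deliver_cmd"
    # Hand the wrapped command's exit status back to the caller.
    return "$status"
}
```

A wrapper script would then end with something like `run_and_deliver 'mail -s "Hosts file update" user@example.com' ./update-verify-file.sh`. Many cron implementations can also mail a job's output on their own when `MAILTO` is set in the crontab, which may be all you need.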

Example output (no update available):

Starting script at: Sun 09 Jul 2017 - 6:40:28 pm BST

Checking connectivity to raw(dot)githubusercontent(dot)com:
- Ping success: "3 packets transmitted, 3 received, 0% packet loss"
- SSL handshake success: "Verify return code: 0 (ok)"
- Certificate details match: "depth=0 C = US, ST = California, L = San Francisco, O = "GitHub, Inc.", CN = www(dot)github(dot)com"

Checking for update:
- No update is available.
- Live file SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5
- Local file SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5

Continuing to use old hosts file:
- Not moving old file to new file (new file not downloaded).
- Date: July 04 2017
- Extensions added to this file: REDACTED
- SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5
- SHA1: bb5e4c2e0547fd9a816d086e7e85eeca1bb924d2
- MD5: be3bf3eae1195c838ed29eaf4c7f6d7d

Script exiting (no update available) at: Sun 09 Jul 2017 - 6:40:31 pm BST
- Exiting.

Example output (successful update):

Starting script at: Sun 09 Jul 2017 - 6:46:21 pm BST

Checking connectivity to raw(dot)githubusercontent(dot)com:
- Ping success: "3 packets transmitted, 3 received, 0% packet loss"
- SSL handshake success: "Verify return code: 0 (ok)"
- Certificate details match: "depth=0 C = US, ST = California, L = San Francisco, O = "GitHub, Inc.", CN = www(dot)github(dot)com"

Checking for update:
- An update is available!
- Live file SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5
- Local file SHA256: 2c78a7e6647e9ee208262affd70199ff649c91a42a624b2b91d3c5930d9bb203

Updating file:
- Moving new file to old file: DONE
- Downloading new file: DONE
- Checking hash against live version: MATCHES
- Live file SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5
- Local file SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5

Checking file integrity:
- Verifying character whitelist: SUCCESS
- All whitelisted special characters: "~<=>| _-,;:?/.'"()[]@*\&#%+	"
- Grep-stripping allowed/safe content from file: SUCCESS
- All content successfully stripped: ""
- Checking file byte count: SUCCESS
- Byte count greater than 1200000: "1389378"
- Checking file newline count: SUCCESS
- Newline count greater than 50000: "51446"

***************************
File successfully verified!
- Verification count: "7"
***************************

Using new hosts file:
- Date: July 04 2017
- Extensions added to this file: REDACTED
- SHA256: 4414c6c44218e08e5623f76958f0b82bb8b934e7865d89fba6a6acd921fb75a5
- SHA1: bb5e4c2e0547fd9a816d086e7e85eeca1bb924d2
- MD5: be3bf3eae1195c838ed29eaf4c7f6d7d

Old hosts file:
- Date: July 03 2017
- Extensions added to this file: REDACTED
- SHA256: 2c78a7e6647e9ee208262affd70199ff649c91a42a624b2b91d3c5930d9bb203
- SHA1: 1368b97e726b9d5a8729411e74329f8005df251a
- MD5: 2eb74b54d1f88e83b527fc1a6428ef28

Diff between both files (max 40 lines shown):
<--- old - new --->
4c4
< # Date: July 03 2017
---
> # Date: July 04 2017
69d68
> 0(dot)0(dot)0(dot)0 example-blocked-domain(dot)tld

Finalising updated hosts file:
- Backing up current hosts file: DONE
- Removing unwanted entries and writing to current-hosts file: DONE
- Prepending default/custom hosts file entries: DONE

Script finishing (updated successfully) at: Sun 09 Jul 2017 - 6:46:29 pm BST
- See output hosts file at "current-hosts.txt".

Notice that all dots (.) in the URLs shown in the output are replaced with "(dot)". This prevents potentially malicious domains from appearing as clickable links in whatever application you use to view the output.
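
The replacement itself is a one-line filter. The following is a minimal sketch using `sed`; the function name `defang` is my own label here and the real script may do this differently.

```bash
#!/usr/bin/env bash
# Sketch only: "defang" is an illustrative name, not taken from the real script.

defang() {
    # Replace every literal dot on stdin with "(dot)".
    sed 's/\./(dot)/g'
}
```

Any text that might contain a domain can be piped through it before being printed, for example `diff old-hosts.txt new-hosts.txt | defang` (file names hypothetical).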

Note: Some parts of this readme have been censored in order to keep it safe for work.