A tool for domain whitelists/blacklists analysis and optimization.

wrzlbrmft/domains

domains

domains is a command-line tool written in Java for analyzing and optimizing domain lists, e.g. for use as whitelists or blacklists.

Features

  • Auto-correct malformed domain names
  • Sorting
  • Remove duplicate list entries
  • Remove redundant list entries
  • Check against an exception list and
    • Remove obsolete domain list entries
    • Remove unused exception list entries
  • Save optimized lists as new files
  • Check if a domain name would be whitelisted/blacklisted

Download

A pretty up-to-date version of domains can be downloaded here.

To build the latest version yourself, see Build Instructions below.

Usage

With the Java Runtime Environment 7 or later installed, you can run domains from the command line (or in a terminal) with:

java -jar domains.jar

Append -h or --help to get a list of all available options:

java -jar domains.jar -h

All available options are:

 -b,--check-blacklist <domainName>   check if the domain name would be
                                     blacklisted, when treating the loaded
                                     domain(/exception) list(s) as
                                     blacklist configuration
 -d,--domains <file>                 load domain list from text file
 -e,--exceptions <file>              load exception list from text file
 -h,--help                           print this help message and exit
 -o,--remove-obsolete-domains        remove obsolete domain list entries
                                     (e.g. if "com" is on the exception
                                     list, the domain list entry "foo.com"
                                     is obsolete and removed)
 -r,--remove-redundant               remove redundant list entries (e.g.
                                     "com" includes "foo.com", so
                                     "foo.com" is redundant and removed)
 -s,--save-domains <file>            save optimized domain list as new
                                     text file
 -u,--remove-unused-exceptions       remove unused exception list entries
                                     (e.g. if "com" is not on the domain
                                     list, the exception list entry
                                     "foo.com" is unused and removed)
 -v,--verbose                        be more verbose
    --version                        print version info and exit
 -w,--check-whitelist <domainName>   check if the domain name would be
                                     whitelisted, when treating the loaded
                                     domain(/exception) list(s) as
                                     whitelist configuration
 -x,--save-exceptions <file>         save optimized exception list as new
                                     text file

Quick Start

Load the domain list from domains.txt, auto-correct and sort it, then remove both duplicate and redundant entries. Finally, save the optimized domain list as domains-optimized.txt:

java -jar domains.jar -d domains.txt -r -s domains-optimized.txt

Read further for more available optimizations.

Checking Domain Names

You can check if a given domain name would be whitelisted/blacklisted, when treating the loaded domain(/exception) list(s) as whitelist/blacklist configuration.

Load the domain list from domains.txt and the exception list from exceptions.txt. Check if www.foo.com would be whitelisted, when treating the loaded lists as whitelist configuration:

java -jar domains.jar -d domains.txt -e exceptions.txt -w www.foo.com

Load the domain list from domains.txt and the exception list from exceptions.txt. Check if www.bar.com would be blacklisted, when treating the loaded lists as blacklist configuration:

java -jar domains.jar -d domains.txt -e exceptions.txt -b www.bar.com

You can run a whitelist check and a blacklist check in the same invocation:

java -jar domains.jar -d domains.txt -e exceptions.txt -w www.foo.com -b www.bar.com

Domain Lists

A domain list is a simple text file, containing one domain name per line:

Example

foo.com
bar.com
www.xyz.net
org

Each domain includes all of its sub-domains. E.g. foo.com includes www.foo.com, bar.foo.com etc.

NOTE: To put a top-level domain like .com on a list, simply use com, without the leading dot. If you include the leading dot, auto-correction removes it.
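
The inclusion rule described above can be sketched in a few lines of Java. This is only an illustration, not the code domains itself uses: the class and method names (InclusionSketch, includes) are hypothetical, and the sketch assumes that both names have already been auto-corrected (lower-case, no leading dot).

```java
final class InclusionSketch {

    // Returns true if listEntry covers domainName, i.e. domainName is the
    // same name or a sub-domain of listEntry. Comparing against "." + listEntry
    // keeps "foo.com" from matching "barfoo.com".
    static boolean includes(String listEntry, String domainName) {
        return domainName.equals(listEntry) || domainName.endsWith("." + listEntry);
    }

    public static void main(String[] args) {
        System.out.println(includes("foo.com", "www.foo.com")); // true
        System.out.println(includes("com", "foo.com"));         // true
        System.out.println(includes("foo.com", "barfoo.com"));  // false
    }
}
```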

Exception Lists

domains also supports exception lists to express rules like "'com' except 'youtube.com'". An exception list is a second file loaded together with a domain list. Like a domain list, it is a simple text file containing one (exception) domain name per line.
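
As a rough illustration of a rule like "'com' except 'youtube.com'", the following Java sketch treats a domain name as matched when some domain list entry includes it and no exception list entry does. This reading is an assumption; how domains evaluates its -w and -b checks internally is not described here. All class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class ExceptionRuleSketch {

    // Same name or a sub-domain of the list entry.
    static boolean includes(String listEntry, String domainName) {
        return domainName.equals(listEntry) || domainName.endsWith("." + listEntry);
    }

    // True if at least one entry of the list includes the domain name.
    static boolean anyIncludes(List<String> list, String domainName) {
        for (String entry : list) {
            if (includes(entry, domainName)) {
                return true;
            }
        }
        return false;
    }

    // Assumed semantics: on the domain list, but not covered by an exception.
    static boolean matches(List<String> domains, List<String> exceptions, String domainName) {
        return anyIncludes(domains, domainName) && !anyIncludes(exceptions, domainName);
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("com");
        List<String> exceptions = Arrays.asList("youtube.com");
        System.out.println(matches(domains, exceptions, "foo.com"));         // true
        System.out.println(matches(domains, exceptions, "www.youtube.com")); // false
    }
}
```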

Optimizations

Domain and Exception Lists

The following optimizations can be applied to both domain and exception lists.

Auto-Correct Malformed Domain Names

(Auto-correction is always applied to any list loaded.)

Malformed domain names are auto-corrected with a set of rules applied in the following order:

  1. remove the last :// and everything before it
  2. remove the first : and everything after it
  3. remove the first / and everything after it
  4. ensure that the domain name does not start with a dot (.)
  5. change to lower-case letters

NOTE: Even with rule #4, [.]foo.com does not include [.]barfoo.com, whether you put foo.com on a list with or without the leading dot. Only real sub-domains such as bar.foo.com are included.

Example

All of the following entries are auto-corrected to www.foo.com:

http://www.foo.com/
WWW.FOO.COM/bar
.www.foo.com:8080
https://www.foo.com/bar/index.html
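
The five rules can be sketched in Java as follows. This is a minimal illustration of the documented order, not the tool's actual source; the names AutoCorrectSketch and autoCorrect are hypothetical, and stripping more than one leading dot is an assumption about rule 4.

```java
import java.util.Locale;

final class AutoCorrectSketch {

    static String autoCorrect(String entry) {
        String s = entry;

        // Rule 1: remove the last "://" and everything before it.
        int scheme = s.lastIndexOf("://");
        if (scheme >= 0) {
            s = s.substring(scheme + 3);
        }

        // Rule 2: remove the first ':' and everything after it.
        int colon = s.indexOf(':');
        if (colon >= 0) {
            s = s.substring(0, colon);
        }

        // Rule 3: remove the first '/' and everything after it.
        int slash = s.indexOf('/');
        if (slash >= 0) {
            s = s.substring(0, slash);
        }

        // Rule 4: no leading dot (assumption: repeated leading dots are all dropped).
        while (s.startsWith(".")) {
            s = s.substring(1);
        }

        // Rule 5: lower-case letters.
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String[] entries = {
            "http://www.foo.com/",
            "WWW.FOO.COM/bar",
            ".www.foo.com:8080",
            "https://www.foo.com/bar/index.html"
        };
        for (String entry : entries) {
            System.out.println(entry + " -> " + autoCorrect(entry));
        }
    }
}
```

Run on the four example entries, each line of output ends in www.foo.com.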

Sorting

(Sorting is always applied to any list loaded.)

Domain names are sorted as reverse-strings (foo.com as moc.oof) to keep different sub-domains next to each other.

Example

www.foo.com
bar.net
ftp.foo.com
www.bar.net
foo.com

becomes

foo.com
ftp.foo.com
www.foo.com
bar.net
www.bar.net
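
A reverse-string sort of this kind could look like the Java sketch below. It only illustrates the idea; the names ReverseSortSketch, reverse and sortReversed are hypothetical and not taken from the tool's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class ReverseSortSketch {

    // "foo.com" becomes "moc.oof".
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Returns a sorted copy; the input list is left untouched.
    static List<String> sortReversed(List<String> domains) {
        List<String> sorted = new ArrayList<String>(domains);
        Collections.sort(sorted, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                // Compare the reversed names so that names with the same
                // ending (same parent domain) end up next to each other.
                return reverse(a).compareTo(reverse(b));
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "www.foo.com", "bar.net", "ftp.foo.com", "www.bar.net", "foo.com");
        for (String domain : sortReversed(input)) {
            System.out.println(domain);
        }
    }
}
```

With the example list as input, the sketch prints the entries in the order shown above.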

Remove Duplicate List Entries

(De-duplication is always applied to any list loaded.)

Each domain name is only allowed to appear once on a list.

NOTE: Different list entries can auto-correct to the same domain name. Such entries are then de-duplicated as well.

Remove Redundant List Entries

Domain names always include all their sub-domains. Therefore, in the following list, all entries except com are redundant:

foo.com
com
bar.com

Use the -r or --remove-redundant command-line option to remove the redundant list entries.
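
Redundancy removal can be pictured with the following Java sketch, which drops every entry that is a sub-domain of another entry on the same list. It is a simple quadratic illustration with hypothetical names (RemoveRedundantSketch, removeRedundant) and assumes the list is already auto-corrected and de-duplicated; the tool may implement this differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveRedundantSketch {

    static List<String> removeRedundant(List<String> entries) {
        List<String> kept = new ArrayList<String>();
        for (String candidate : entries) {
            boolean redundant = false;
            for (String other : entries) {
                // candidate is redundant if it is a sub-domain of another entry.
                if (!other.equals(candidate) && candidate.endsWith("." + other)) {
                    redundant = true;
                    break;
                }
            }
            if (!redundant) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("foo.com", "com", "bar.com");
        System.out.println(removeRedundant(list)); // [com]
    }
}
```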

Exception Lists

The following optimizations can only be applied when loading a domain list and an exception list.

Use the -e or --exceptions command-line option to load an exception list.

Example

Load the domain list from domains.txt and the exception list from exceptions.txt:

java -jar domains.jar -d domains.txt -e exceptions.txt

Remove Obsolete Domain List Entries

Any domain list entry that is included in an exception list entry is obsolete, because the exception overrides it.

Example

Domain list:

www.foo.com

Exception list:

foo.com

Since the exception foo.com includes www.foo.com on the domain list, www.foo.com can be removed from the domain list.

Use the -o or --remove-obsolete-domains command-line option to remove the obsolete domain list entries.
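
As a sketch of this optimization in Java: keep only those domain list entries that no exception list entry includes. The names RemoveObsoleteSketch and removeObsolete are hypothetical, and treating a domain list entry that is identical to an exception as obsolete is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveObsoleteSketch {

    static List<String> removeObsolete(List<String> domains, List<String> exceptions) {
        List<String> kept = new ArrayList<String>();
        for (String domain : domains) {
            boolean obsolete = false;
            for (String exception : exceptions) {
                // Obsolete if the exception is the same name (assumption)
                // or a parent domain of the domain list entry.
                if (domain.equals(exception) || domain.endsWith("." + exception)) {
                    obsolete = true;
                    break;
                }
            }
            if (!obsolete) {
                kept.add(domain);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("www.foo.com", "bar.net");
        List<String> exceptions = Arrays.asList("foo.com");
        System.out.println(removeObsolete(domains, exceptions)); // [bar.net]
    }
}
```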

Remove Unused Exception List Entries

Any exception list entry that is not a sub-domain of a domain list entry is unused.

Example

Domain list:

foo.com

Exception list:

bar.com

Since bar.com is not a sub-domain of any domain list entry, it can be removed from the exception list.

Use the -u or --remove-unused-exceptions command-line option to remove the unused exception list entries.
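
The counterpart for the exception list can be sketched the same way: keep only exceptions that are sub-domains of some domain list entry. The names RemoveUnusedSketch and removeUnused are hypothetical. Following the wording above literally, the sketch counts only strict sub-domains as used; how the tool treats an exception that equals a domain list entry is not covered here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveUnusedSketch {

    static List<String> removeUnused(List<String> domains, List<String> exceptions) {
        List<String> kept = new ArrayList<String>();
        for (String exception : exceptions) {
            boolean used = false;
            for (String domain : domains) {
                // Used only if the exception is a strict sub-domain of a
                // domain list entry (assumption, see the lead-in).
                if (exception.endsWith("." + domain)) {
                    used = true;
                    break;
                }
            }
            if (used) {
                kept.add(exception);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("foo.com");
        List<String> exceptions = Arrays.asList("bar.com", "www.foo.com");
        System.out.println(removeUnused(domains, exceptions)); // [www.foo.com]
    }
}
```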

Save Optimized Lists

All optimizations are applied to copies of the domain and/or exception list files loaded into memory. The original files are never changed but you can save the optimized lists from memory as new files.

Use the -s or --save-domains command-line option to save the optimized domain list, and the -x or --save-exceptions command-line option to save the optimized exception list as a new file.

Example

Load the domain list from domains.txt and save the optimized domain list as domains-optimized.txt:

java -jar domains.jar -d domains.txt -s domains-optimized.txt

NOTE: If the file to be saved already exists, it is overwritten.

Build Instructions

If you prefer not to use the pre-built download, you can easily build the latest version yourself.

Requirements

Download the latest source code as a ZIP file or use Git:

git clone https://github.com/wrzlbrmft/domains.git

Change into the unzipped or the checkout directory and run Maven:

mvn package

The uber-jar containing both the compiled source code and all of its dependencies is saved in the target/ directory.

Simply run it:

java -jar target/domains.jar

License

This software is distributed under the terms of the GNU General Public License v3.
