A Bloom Filter to Test Compromised Passwords
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
load Make public Aug 2, 2018
ssl
Dockerfile
LICENSE
build.sh
check.go Added md5sum for filter Aug 2, 2018
clean.sh Make public Aug 2, 2018
readme.md

readme.md

Blooming Password - bp

A program that implements the NIST 800-63-3b Banned Password Check using a bloom filter built from the Have I been pwned SHA1 password hash list. The Have I Been Pwned 3.0 list contains more than 517 million password hashes and is 22GB uncompressed (as of Aug 2018). A bloom filter of this list is only 887MB and will fit entirely into memory on a virtual machine or Docker container with just 2GB of RAM.

Why a Bloom Filter?

It's one of the simplest, smallest and fastest data structures for this task. Bloom filters have constant time performance (where K is the constant) for insertion and lookup. They can easily handle billions of banned password hashes with very modest resources. When a test for membership returns 404 then it's safe to use that password.

Partial SHA1 Hashes

SHA1 hashes are 20 bytes of raw binary data and thus typically hex encoded for a total of 40 characters. Blooming Password uses just the first 16 hex encoded characters of the hashes to build the bloom filter and to test the filter for membership. The program rejects complete hashes if they are sent. False positive rates in the bloom filter are not impacted by the shortening of the SHA1 password hashes. The cardinality of the set is unchanged. The FP rate is .001 (1 in 1,000). You may verify the cardinality is unchanged after truncating the hashes.

  $ wc -l pwned-passwords-ordered-by-count.txt 
  517238891 pwned-passwords-ordered-by-count.txt

  $ sort -T /tmp/ -u 16.txt | wc -l
  517238891

How to Construct the Partial SHA1 Hash List

  $ 7z e pwned-passwords-ordered-by-count.7z

  $ cut -c 1-16 pwned-passwords-ordered-by-count.txt > 16.txt

  $ head 16.txt 
  7C4A8D09CA3762AF
  F7C3BC1D808E0473
  B1B3773A05C0ED01
  ...

How to Create the Bloom Filter

  $ load /path/to/16.txt /path/to/output.filter

Test the Bloom Filter for Membership

Send the first 16 characters of the hex encoded SHA1 hash to the Blooming Password program. Some examples using curl:

Return Codes

  • 200 - The hash is probably in the bloom filter.
  • 400 - The client sent a bad request.
  • 404 - The hash is definitely not in the bloom filter.

Notes

  • Blooming Password is written in Go.
  • It uses willf's excellent bloom filter implementation.
  • The Examples above are hosted on a Linode VPS with 2 GB of memory.
  • OPUS is an example of earlier work using a much smaller filter. (Eugene Spafford, 1992).