updated readme

commit 362b3b9fca1de9ba326da2c8d91544263da06341 1 parent df85c58
rn-superg authored

Showing 3 changed files with 76 additions and 7 deletions.

  1. +2 -1 .gitignore
  2. +72 -4 README.textile
  3. +2 -2 project.clj
3  .gitignore
... ... @@ -1,4 +1,5 @@
1 1 pom.xml
2 2 *jar
3 3 lib
4   -classes
  4 +classes
  5 +README.html
76 README.textile
@@ -2,16 +2,84 @@ h1. clj-bloom
2 2
3 3 "Bloom Filter":http://en.wikipedia.org/wiki/Bloom_filter implementation in Clojure. Based loosely on "Jeff Foster's Implementation":http://github.com/fffej/clojure-snippets/blob/master/bloom.clj
4 4
  5 +This implementation uses a Java @java.util.BitSet@ as the bit array implementation and provides several helpers for different hashing functions.
  6 +
5 7 h1. Usage
6 8
7   - (use 'clj-bloom)
  9 +<pre>
  10 +(ns words
  11 + (:require
  12 + [ clojure.contrib.duck-streams :as ds]
  13 + [com.github.kyleburton.clj-bloom :as bf]))
  14 +
  15 +(def *words-file* "/usr/share/dict/words")
  16 +
  17 +(defn make-hash-fn-crc32 [#^String x]
  18 + (let [crc (java.util.zip.CRC32.)]
  19 + (fn [#^String s bytes]
  20 + (.reset crc)
  21 + (.update crc (.getBytes (.toLowerCase (str s x))))
  22 + (mod (.getValue crc)
  23 + bytes))))
  24 +
  25 +(defn make-hash-fn-adler32 [#^String x]
  26 + (let [crc (java.util.zip.Adler32.)]
  27 + (fn [#^String s bytes]
  28 + (.reset crc)
  29 + (.update crc (.getBytes (.toLowerCase (str s x))))
  30 + (mod (.getValue crc)
  31 + bytes))))
  32 +
  33 +
  34 +(defn run [hash-fns]
  35 + (let [filter (bf/make-bloom-filter (* 10 1024 1024) hash-fns
  36 + )]
  37 + (dorun
  38 + (doseq [line (ds/read-lines *words-file*)]
  39 + (bf/add! filter (.toLowerCase line))))
  40 + (dorun
  41 + (doseq [w (.split "The quick brown ornithopter hyper-jumped over the lazy trollusk" "\\s+")]
  42 + (if (bf/include? filter (.toLowerCase w))
  43 + (prn (format "HIT: '%s' in the filter" w))
  44 + (prn (format "MISS: '%s' not in the filter" w)))))))
  45 +
  46 +;; CRC32:12s, hashCode:11s, Adler32:12s, md5:13s, sha1:14s
  47 +;; (time (run))
  48 +
  49 +(prn "fn:hashCode")
  50 +(time (run bf/*default-hash-fns*))
  51 +(prn "fn:adler32")
  52 +(time (run (map make-hash-fn-adler32 ["1" "2" "3" "4" "5"])))
  53 +(prn "fn:crc32")
  54 +(time (run (map make-hash-fn-crc32 ["1" "2" "3" "4" "5"])))
  55 +(prn "fn:md5")
  56 +(time (run (map bf/make-hash-fn-md5 ["1" "2" "3" "4" "5"])))
  57 +(prn "fn:sha1")
  58 +(time (run (map bf/make-hash-fn-sha1 ["1" "2" "3" "4" "5"])))
  59 +
  60 +</pre>
8 61
9 62 h1. Installation
10 63
11   -Instructions for Leiningen...
  64 +If you're using Leiningen, add the following to your @project.clj@ file's @:dependencies@:
  65 +
  66 +<pre>
  67 + [com.github.kyleburton/clj-bloom "1.0.0"]
  68 +</pre>
  69 +
  70 +For Maven:
12 71
13   -Instructions for Maven...
  72 +<pre>
  73 + <dependencies>
  74 + <dependency>
  75 + <groupId>com.github.kyleburton</groupId>
  76 + <artifactId>clj-bloom</artifactId>
  77 + <version>1.0.0</version>
  78 + </dependency>
  79 + ...
  80 + </dependencies>
  81 +</pre>
14 82
15 83 h1. License
16 84
17   -Same as Clojure, need to link to it from here...
  85 +"Same as Clojure":http://clojure.org/license
4 project.clj
... ... @@ -1,6 +1,6 @@
1   -(defproject com.github.kyleburton/clj-bloom "1.0.0-SNAPSHOT"
  1 +(defproject com.github.kyleburton/clj-bloom "1.0.0"
2 2 :description "FIXME: write"
3 3 :dependencies
4 4 [[org.clojure/clojure "1.1.0"]
5 5 [org.clojure/clojure-contrib "1.1.0"]
6   - [swank-clojure "1.2.1"]])
  6 + [swank-clojure "1.2.1"]])

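The README text added in this commit says the filter is backed by a @java.util.BitSet@ and fed by a list of hash functions of the form @(fn [s n-bits] ...)@. A minimal sketch of that idea follows. It is not clj-bloom's source: the names @make-salted-crc32@, @sketch-filter@, @sketch-add!@ and @sketch-include?@ are hypothetical, and the salting scheme simply mirrors the @make-hash-fn-crc32@ helper shown in the README diff.

```clojure
;; Hypothetical sketch, not the library's implementation.

(defn make-salted-crc32
  "Returns a hash fn that maps a string to a bit index in [0, n-bits)."
  [salt]
  (fn [s n-bits]
    ;; A fresh CRC32 per call avoids sharing mutable state between calls.
    (let [crc (java.util.zip.CRC32.)]
      (.update crc (.getBytes (str s salt)))
      (mod (.getValue crc) n-bits))))

(defn sketch-filter
  "Bundles a BitSet of n-bits with the hash fns that index into it."
  [n-bits hash-fns]
  {:bits (java.util.BitSet. n-bits) :n-bits n-bits :hash-fns hash-fns})

(defn sketch-add!
  "Sets one bit per hash fn for the string s."
  [{:keys [bits n-bits hash-fns]} s]
  (doseq [h hash-fns]
    (.set bits (int (h s n-bits)))))

(defn sketch-include?
  "True if every bit selected by the hash fns is set (may be a false positive)."
  [{:keys [bits n-bits hash-fns]} s]
  (every? (fn [h] (.get bits (int (h s n-bits)))) hash-fns))

;; Usage: an added string is always reported as present.
(def demo (sketch-filter 1024 (map make-salted-crc32 ["1" "2" "3"])))
(sketch-add! demo "quick")
(sketch-include? demo "quick") ;; => true
```

A string that was never added usually tests false, but it can test true when its bits collide with bits set by other entries. That is why the README example labels a match a "HIT" rather than a confirmed member.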