Skip to content

Commit

Permalink
initial commit
Browse files Browse the repository at this point in the history
  • Loading branch information
bradclawsie committed Oct 16, 2012
1 parent faf04c4 commit 8a13a6a
Show file tree
Hide file tree
Showing 5 changed files with 169 additions and 2 deletions.
22 changes: 22 additions & 0 deletions Data-Hash-Consistent.cabal
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
Name: Data-Hash-Consistent
Version: 0.1.0
Description: Provide a simple consistent hashing mechanism
Synopsis: Provide a simple consistent hashing mechanism
Category: Data
Stability: experimental
Homepage: https://github.com/bradclawsie/haskell-Data.Hash.Consistent
License: BSD3
License-file: LICENSE
Author: brad clawsie
Maintainer: haskell@fastmail.fm
cabal-version: >=1.2
build-type: Simple
Flag splitBase
Description: Choose the new smaller, split-up base package.
Library
if flag(splitBase)
Build-Depends: base >= 3,containers,old-locale
else
Build-Depends: base < 3
Build-Depends: bytestring,vector,vector-algorithms,digest
Exposed-modules: Data.Hash.Consistent
108 changes: 108 additions & 0 deletions Data/Hash/Consistent.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,108 @@
{- | Data.Hash.Consistent
A consistent hash is a technique to manage the fair distribution of cacheable
entities among hosts. Each host identifier has its crc32 hash calculated
and stored in a Vector along with its canonical host name. The host identifier
may be differentiated from its canonical host name by a multiplying factor,
in our case a simple integer appeneded to the hostname to provide it with
a number of entries in the consistent hash, all evenly distributed.
This technique is explained in these links:
http://en.wikipedia.org/wiki/Consistent_hashing
http://www.tomkleinpeter.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/
Here is a small program illustrating its use:
@
module Main where
import Data.Hash.Consistent as CH
main = do
let hosts = ["hi.example.net","bar.example.net","foo.example.net"] :: [CH.Host]
let n = 2 :: Int
let ch = new
print $ show $ ch
ch <- return $ add hosts n ch
print $ show $ ch
let fh = [head hosts] :: [Host]
let hh_fh = hash_hosts fh n
print hh_fh
ch <- return $ del fh n ch
print $ show $ ch
let i = 770931073
let tgt = target_host i ch
print tgt
return ()
@
License info:
The license is a simple BSD3-style license available here:
https://www.b7j0c.org/stuff/license.txt
-}

module Data.Hash.Consistent (text_crc32,search,add,del,target_host,hash_hosts,
Host,Hash,HashHost,ConsistentHash) where

import qualified Control.Monad.ST as ST
import qualified Data.Word as W
import qualified Data.Ord as O
import qualified Data.ByteString.UTF8 as BU
import qualified Data.Digest.CRC32 as CRC
import qualified Data.Vector as V
import qualified Data.Vector.Algorithms.Intro as I
import qualified Data.Vector.Algorithms.Search as S

type Host = String
type Hash = W.Word32
type HashHost = (Hash,Host)
type ConsistentHash = V.Vector HashHost

-- a convenience method to generate a crc for a string
text_crc32 :: String -> Hash
text_crc32 = CRC.crc32 . BU.fromString

-- sort the consistent hash by the hash values
sort :: ConsistentHash -> ConsistentHash
sort v = ST.runST $ do
m <- V.unsafeThaw v
I.sortBy (O.comparing fst) m
v' <- V.unsafeFreeze m
return v'

-- find the index of the first host hash greater than i
search :: Hash -> ConsistentHash -> Int
search i v =
ST.runST $ do
m <- V.unsafeThaw v
idx <- S.binarySearchBy (O.comparing fst) m (i,"")
return idx

-- locate which host should cache the item with key hash i
target_host :: Hash -> ConsistentHash -> HashHost
target_host i ch = let idx = search i ch
l = V.length ch in
if idx >= l then ch V.! 0 else ch V.! idx

new = V.empty :: ConsistentHash

-- for a set of servers and a multiplying factor, return a list of
-- crc hash values
hash_hosts :: [Host] -> Int -> ConsistentHash
hash_hosts hosts mult =
V.fromList [(text_crc32 $ x++y,x)|x<-hosts,y<-map show [1..mult]]

-- add a list of hosts to the consistent hash
add :: [Host] -> Int -> ConsistentHash -> ConsistentHash
add hosts mult ch = sort $ ch V.++ (hash_hosts hosts mult)

-- delete a list of hosts from the consistent hash
del :: [Host] -> Int -> ConsistentHash -> ConsistentHash
del hosts mult ch = let hh = hash_hosts hosts mult in
sort $ V.filter (\a -> V.notElem a hh) (ch)


23 changes: 23 additions & 0 deletions LICENSE
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
Copyright (c) 2007-2013, Brad Clawsie All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of b7j0c.org nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY BRAD CLAWSIE ``AS IS'' AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL BRAD CLAWSIE BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
15 changes: 13 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,17 @@
haskell-Data.Hash.Consistent
============================

Fairly assign cacheable entities to cache hosts
## Data.Hash.Consistent

see: http://en.wikipedia.org/wiki/Consistent_hashing
A consistent hash is a technique to manage the fair distribution of cacheable
entities among hosts. Each host identifier has its crc32 hash calculated
and stored in a Vector along with its canonical host name. The host identifier
may be differentiated from its canonical host name by a multiplying factor,
in our case a simple integer appeneded to the hostname to provide it with
a number of entries in the consistent hash, all evenly distributed.

This technique is explained in these links:

http://en.wikipedia.org/wiki/Consistent_hashing

http://www.tomkleinpeter.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/
3 changes: 3 additions & 0 deletions Setup.lhs
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
#!/usr/bin/env runhaskell
> import Distribution.Simple
> main = defaultMain

0 comments on commit 8a13a6a

Please sign in to comment.