Zipfian distribution in Ruby


Zipfian distribution implementation.


Installation

Add this line to your application's Gemfile:

gem 'zipfian'

And then execute:

$ bundle

Or install it yourself as:

$ gem install zipfian


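Usage

require 'zipfian'

# 1000: Number of elements
#  1.0: Exponent
z = Zipfian.new 1000, 1.0

puts z.n    # 1000
puts z.s    # 1.0

# Print the probability mass and cumulative probability of every rank
(1..1000).each do |i|
  puts [z.pmf(i), z.cdf(i)].join ' - '
end

puts z.sample    # Integer between 1 and 1000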
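For reference, the quantities returned by pmf and cdf can be computed directly from the standard definition of Zipf's law, where the probability of rank k among n elements with exponent s is (1 / k^s) divided by the sum of 1 / i^s over i = 1..n. The helpers below are a minimal sketch under the assumption that the gem follows this standard definition; the names harmonic, zipf_pmf, and zipf_cdf are illustrative and are not part of the gem.

```ruby
# Illustrative helpers (not part of the zipfian gem).

# Generalized harmonic number: H(n, s) = sum of 1 / i**s for i in 1..n
def harmonic(n, s)
  (1..n).sum { |i| 1.0 / i**s }
end

# Probability that rank k is drawn out of n elements with exponent s
def zipf_pmf(k, n, s)
  (1.0 / k**s) / harmonic(n, s)
end

# Probability that the drawn rank is less than or equal to k
def zipf_cdf(k, n, s)
  harmonic(k, s) / harmonic(n, s)
end

puts zipf_pmf(1, 1000, 1.0)   # probability of the most frequent rank
puts zipf_cdf(10, 1000, 1.0)  # probability of drawing one of the top 10 ranks
```

Each call recomputes the harmonic sum, so this form is only suitable for understanding the definition, not for repeated sampling.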

Initialization overhead (CPU, Memory)

On initialization, each Zipfian instance precalculates the value of the cumulative distribution function at every integer in the range and stores the values in memory. As the range grows, initialization therefore takes longer and each instance uses more memory.
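To make the cost concrete, here is a minimal sketch of how a precalculated cumulative table can be built and then used for sampling. It is not the gem's source code: the class name SketchZipfian and the use of binary search (inverse transform sampling) are assumptions chosen for illustration.

```ruby
# Hypothetical sketch, not the gem's actual implementation.
class SketchZipfian
  attr_reader :n, :s

  def initialize(n, s)
    @n = n
    @s = s
    # Running sums of 1 / i**s: O(n) time and O(n) memory.
    sum = 0.0
    partial = (1..n).map { |i| sum += 1.0 / i**s }
    # Normalize so that the last entry is exactly 1.0.
    @cdf = partial.map { |v| v / sum }
  end

  # Cumulative probability of ranks 1..k
  def cdf(k)
    @cdf[k - 1]
  end

  # Probability of rank k, recovered from adjacent table entries
  def pmf(k)
    k == 1 ? @cdf[0] : @cdf[k - 1] - @cdf[k - 2]
  end

  # Inverse transform sampling: draw u in [0, 1) and binary-search for
  # the first rank whose cumulative probability is at least u.
  def sample
    u = rand
    (@cdf.bsearch_index { |v| v >= u } || @n - 1) + 1
  end
end

z = SketchZipfian.new(1000, 1.0)
puts z.sample
```

In this sketch the table holds one Float per rank, so memory grows linearly with the range, while each sample costs only a logarithmic number of comparisons.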


To avoid repeating this overhead when multiple Zipfian instances are used, you can enable thread-safe, class-level caching of the precalculated data by setting the third parameter of the constructor to true.

# Cache precalculated data
z1 = Zipfian.new 1000000, 0.5, true

# Returns immediately. No more memory consumption
z2 = Zipfian.new 1000000, 0.5, true
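One common way to implement this kind of thread-safe, class-level cache is a Hash keyed by the parameters and guarded by a Mutex. The sketch below shows the idea only; the class name CachedCdf and its fetch method are hypothetical, and the gem's internals may differ.

```ruby
# Hypothetical sketch of a class-level cache; not the gem's actual code.
class CachedCdf
  @cache = {}
  @mutex = Mutex.new

  class << self
    # Returns the cumulative table for (n, s), computing it at most once.
    def fetch(n, s)
      @mutex.synchronize do
        @cache[[n, s]] ||= build(n, s)
      end
    end

    private

    # Builds the normalized running sums of 1 / i**s for i in 1..n.
    def build(n, s)
      sum = 0.0
      partial = (1..n).map { |i| sum += 1.0 / i**s }
      partial.map { |v| v / sum }.freeze
    end
  end
end

a = CachedCdf.fetch(100_000, 0.5)  # computes and stores the table
b = CachedCdf.fetch(100_000, 0.5)  # reuses the stored table
puts a.equal?(b)
```

Freezing the shared table guards against accidental mutation by one caller affecting all others. Note that entries in such a cache are never released, so the memory stays in use for the lifetime of the process.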

A workaround for the memory limitation

If the range is exceptionally large, it may simply not be possible to hold all the calculated values in memory. In such cases, you may need to approximate the distribution with a smaller Zipfian distribution.

z = Zipfian.new 1000000, 0.5

# Approximates a sample from the range 1..1,000,000,000
puts z.sample * 1000 - rand(1000)
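The arithmetic in this workaround maps each rank of the smaller distribution onto a bucket of 1000 consecutive integers and then picks uniformly within that bucket. The sketch below isolates that step so the resulting ranges are easy to check; the helper name expand and the FACTOR constant are hypothetical, and the rank argument stands in for the value returned by z.sample.

```ruby
FACTOR = 1000

# rank: a value in 1..n drawn from the smaller distribution
def expand(rank, factor = FACTOR)
  # Rank r covers the bucket ((r - 1) * factor + 1)..(r * factor).
  # rand(factor) returns an integer in 0...factor, so the result is
  # uniformly distributed within that bucket.
  rank * factor - rand(factor)
end

puts expand(1)          # somewhere in 1..1000
puts expand(1_000_000)  # somewhere in 999_999_001..1_000_000_000
```

Because values inside a bucket are equally likely, the result is a step-wise approximation: the probability decays from bucket to bucket, but not within a bucket.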


Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request