Skip to content
Guy Kobrinsky edited this page Feb 27, 2019 · 31 revisions

Welcome to the weighted-lottery wiki!

Weighted lotteries are used when we need to randomly select k of n items with given probabilities.

  • The sample size k is typically much smaller than n
  • The sample can ask for unique items, or allow repetitions

This project offers several weighted lottery implementations written in Kotlin or Java

Usage

Maven

<dependency>
  <groupId>io.github.guyko</groupId>
  <artifactId>WeightedLottery</artifactId>
  <version>1.0</version>
</dependency>

IntLottery represents the facade for obtaining the next item

interface IntLottery {
    fun draw(): Int
    fun empty(): Boolean
    fun remaining(): Int
}

All implementations of IntLottery expect to get a DoubleArray of probabilities, where an index in the array will be returned with each call to draw according to its given probability

val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery : IntLottery = ...  // some implementation, initialized with the weights
(0 until k).forEach {
  val index = lottery.draw()
  // do something
}

Allow repetitions

With SimpleWeightedLottery, one can draw as many times as needed - with the complexity of log(n)

val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery = SimpleWeightedLottery(weights)
val k = ... // no limitation on the value of k
(0 until k).forEach {
  val index = lottery.draw()
  // do something
}

AliasLottery is based on alias method sampling algorithm, with preprocessing complexity of O(n), and draw complexity of O(1) (cost of 2 random calls)

TwistedAliasLottery delegates draw calls to the SimpleWeightedLottery while building an AliasLottery in the background using a different thread. When init is done - it switches the delegated implementation

No repetitions

SimpleWeightedLotteryNoRepetitions wraps a SimpleWeightedLottery and tries to minimize number of re-draws by keeping track of the chance to hit the same element twice. In case a threshold is crossed - the internal SimpleWeightedLottery is re-allocated. Complexity is hard to analyze as it depends on the type of distribution

val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery = SimpleWeightedLotteryNoRepetitions(weights)
val k = ... // k < weights.size
(0 until k).forEach {
  val index = lottery.draw()
  // do something
}

SumTreeLottery holds a sum tree data structure, with O(n) complexity for preprocessing and log(n) for draw (in oppose to the 'Simple' implementation - no need for any reallocation/preprocessing)

ReservoirLottery is based on A-Chao's weighted version of reservoir sampling algorithm, with preprocessing complexity of O(n+klogk), draw complexity of O(1). However, it depends on prior knowledge of k

val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery = ReservoirLottery(weights, k = 2)
val first = lottery.draw()
val second = lottery.draw()
lottery.draw() // NoSuchElementException

Benchmark

All implementations are benchmarked using jmh, with gc profiler enabled

Use cases

  • 'with repetitions' vs 'without repetitions'
  • When measuring, we may care about preprocessing phase, draw phase in other cases and sometimes both together
  • one sample of k items vs many samples of k items (some implementations may be better when reusing the same weights)
  • uniform vs exponential distribution (0.9^1, 0.9^2, 0.9^3, .....) of probabilities

Follow benchmark wiki page to further deep dive

The results of the latest benchmark can be found here

How to contribute ?

New implementations are welcome. we prefer you to use Kotlin, but Java is also possible

Your pull request must contain:

  • Short description of your logic
  • Implement IntLottery in WeightedLottery/src/main/java/com/wl
  • Implement either WeightedLotteryNoRepetitionsTestBase or WeightedLotteryWithRepetitionsTestBase in src/test/java/com/wl
  • Add benchmark methods as described here
Clone this wiki locally