# w495/rndchoice_efl_2012 forked from etrepum/rndchoice_efl_2012

### Subversion checkout URL

You can clone with HTTPS or Subversion.

Fetching contributors…

Cannot retrieve contributors at this time

486 lines (442 sloc) 17.459 kb
 Random Choice

Random Choice

(случайный выбор)

Bob Ippolito (@etrepum)

Erlang Factory Lite 2012, Moscow

Why this talk?

• I was rewriting an IRC bot
• He "learns" how to chat
• Brain is based on Markov chains (Це́пь Ма́ркова)
• Random choice is essential to this and many other interesting applications

Andrey Andreyevich Markov

Андре́й Андре́евич Ма́рков

Markov chain

Really?

• (Most) ad servers also use random choice
• Mochi Media built this kind of ad server in Erlang
• Was fun to come up with an efficent way to do it

Erlang's random module

• Wichmann-Hill AS183 algorithm from 1982
• … designed for 16 bit computers with limited arithmetic
• It has poor results and you probably shouldn't use it
• Consider the crypto module, or a third party library
• Despite this, I will use random for the examples

Fair coin

• uniform() returns a float 0.0 ≤ X < 1.0
• ○ (heads) when X < 0.5, ● (tails) otherwise

Fair coin

X =

● (tails), X ≥ 0.5

Die roll

• uniform(N) returns an integer 1 ≤ X ≤ N
• uniform(N) -> 1 + trunc(N * uniform()).

Die roll

X =

Random selection without replacement

choose(L) ->
Nth = random:uniform(length(L)),
{H, [Choice | T]} = lists:split(Nth, L),
{Choice, H ++ T}.

Random selection without replacement

Selections =

L =

Random selection without replacement

• Can be used to shuffle a list (very slowly)
• Or simulate a deck of cards

Weighted selection without replacement

• {sum(Weight), [{Key, Weight}, …]}
• weight_split(N, L) -> weight_split(N, L, []).
weight_split(N, L, [H={K, W} | T], Acc) ->
case N - W of
N1 when N1 > 0 ->
weight_split(N1, T, [H | Acc]);
_ ->
{K, lists:reverse(Acc, T)}
end.

Random selection with replacement

choose(L) ->
lists:nth(random:uniform(length(L)), L).
• Does not change list
• Similar to rolling a die

Random selection with replacement

Selections =

L =

Weighted selection, naive

• With replacement, weight can be implemented simply
• Just add the item to the list multiple times
• [a, a, a, b, c, d]
• 50% a, ~16.6% b, …

Weighted selection with replacement

Selections =

L =

Optimizing for memory

• Naive solution uses a lot of memory
• Can do better by counting each unique choice
• {6, [{a, 3}, {b, 1}, {c, 1}, {d, 1}]}
• Tradeoff - it's harder to seek to Nth choice

Alias method

• Create N coins, one for each unique choice
• Choose coin (by die roll), then flip weighted coin
• Uses O(N) memory, has O(1) selection!

Alias method

Die:

Coin:

a 1undefined
ba
ca
da

• For our IRC bot, the choices update for every word of text
• Alias method is O(N) to update
• High O(N) garbage, no sharing at all

Using a tree

• Can build a tree for efficient updates
• Low O(log N) garbage, good sharing
• Slow O(N) seeks, since sort is by key

Simple max heap

• Seems to be a good compromise between speed and memory
• Highest weights first
• Highest weights are most likely to be updated
• Worst case O(N) for every operation
• But very good common case, near head of the list

Simple max heap

Key Weight

Questions?

Slides:
etrepum.github.com/rndchoice_efl_2012

Code:
github.com/etrepum/rndchoice_efl_2012