Sometimes, we will need to sample from a non-numerical and/or non-ordinal space. Take, for instance, a pet food company that is running a promotion as follows:
1. Each bag of dog food contains one of three letters: P, U, or Y with some probability for each
2. If a customer gets enough letters to spell PUPPY three times, they win a free bag of dog food

| x | P(X = x)      |
| --|:-------------:|
| P | 0.6           |
| U | 0.25          |
| Y | 0.15          |

To simulate the above, we need to be able to draw P, U, or Y in proportion to their probabilities. We can do this with roulette wheel selection. The core idea is to picture a roulette wheel on which each choice holds a segment of the circumference proportional to its probability. We "spin" the wheel by drawing a uniform random number between 0 and 1 (`numpy.random.uniform` draws from the half-open interval [0, 1)). The value we sample is the one whose segment contains the "point" represented by that uniform random number.
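As a concrete illustration with the probabilities in the table above, the segments of the wheel end at the cumulative sums of roughly 0.6, 0.85, and 1.0, so a spin of 0.7 lands in U's segment. The block below is a minimal sketch that uses only the standard library; the helper name `locate` is hypothetical and not part of the notebook's own code.

```python
from bisect import bisect_left
from itertools import accumulate

# Probabilities from the promotion table
table = {'P': 0.6, 'U': 0.25, 'Y': 0.15}

elems = list(table)
# Right edge of each segment on the wheel (cumulative probabilities)
edges = list(accumulate(table.values()))

def locate(u):
    """Return the outcome whose segment contains the point u (hypothetical helper)."""
    # bisect_left finds the first edge that is >= u
    idx = bisect_left(edges, u)
    # Clamp the index in case rounding leaves the last edge slightly below 1
    return elems[min(idx, len(elems) - 1)]

for u in (0.3, 0.7, 0.9):
    print(u, '->', locate(u))
```

Here the "spin" values are fixed by hand so that the mapping from point to segment is easy to follow; a real sampler draws `u` at random.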

In [1]:
%pylab inline 
import numpy
import numpy as np  # explicit alias; the cells below use both names

Populating the interactive namespace from numpy and matplotlib


In [2]:
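def roulette_wheel_sample(table):
    # Split the table into outcomes and their probabilities
    elems = []
    probs = []
    for elem, prob in table.items():
        elems.append(elem)
        probs.append(prob)
    # Cumulative sums mark the right edge of each segment on the wheel
    probs = np.cumsum(probs)
    # "Spin" the wheel
    u = numpy.random.uniform(0, 1)
    # Return the first outcome whose segment contains u
    for val, p in zip(elems, probs):
        if u <= p:
            return val
    # Guard against rounding: the last cumulative sum can fall just below 1
    return elems[-1]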

In [3]:
table = {
    'P': 0.6,
    'U': 0.25,
    'Y': 0.15
}
sample = roulette_wheel_sample(table)
print('Sampled ', sample)

Sampled  Y
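A single draw says little about whether the sampler respects the table. One way to check is to draw many samples and compare the observed frequencies with the stated probabilities. The sketch below re-implements the same roulette wheel logic with the standard library's `random` module so that it runs on its own; the function name `spin`, the seed, and the sample count of 100,000 are all arbitrary choices made for illustration.

```python
import random
from collections import Counter
from itertools import accumulate

table = {'P': 0.6, 'U': 0.25, 'Y': 0.15}

def spin(table, rng):
    """Draw one outcome by roulette wheel selection (hypothetical helper)."""
    elems = list(table)
    edges = list(accumulate(table.values()))  # cumulative probabilities
    u = rng.random()                          # uniform draw in [0, 1)
    for val, edge in zip(elems, edges):
        if u <= edge:
            return val
    return elems[-1]  # guard against rounding in the last edge

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
counts = Counter(spin(table, rng) for _ in range(n))
freqs = {letter: counts[letter] / n for letter in table}

for letter in table:
    print(letter, 'expected', table[letter], 'observed', round(freqs[letter], 3))
```

With this many draws, the observed frequencies should sit close to 0.6, 0.25, and 0.15.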


The above version of the function is a bit inefficient in that it recomputes the "wheel" for every sample.
A more efficient solution is to build the wheel once and reuse it for every later sample. We can do this neatly in two ways:
1. Write a class that is constructed using a table
2. Write a higher order function that returns a closure with access to the table

In [4]:
# Using OOP 
class RouletteWheel(object):
    
    def __init__(self, table):
        self.elems = []
        self.probs = []
        for elem, prob in table.items():
            self.elems.append(elem)
            self.probs.append(prob)
        self.probs = np.cumsum(self.probs)
        
    def sample(self):
        u = numpy.random.uniform(0, 1)
        for val, p in zip(self.elems, self.probs):
            if u <= p:
                return val
        # Guard against rounding: the last cumulative sum can fall just below 1
        return self.elems[-1]

wheel = RouletteWheel(table)
wheel.sample()

'P'

In [5]:

# Using higher order functions
def generate_wheel(table):
    elems = []
    probs = []
    for elem, prob in table.items():
        elems.append(elem)
        probs.append(prob)
    probs = np.cumsum(probs)
    
    # Closure can refer to above variables in generate_wheel 
    def sample():
        u = numpy.random.uniform(0, 1)
        for val, p in zip(elems, probs):
            if u <= p:
                return val
        # Guard against rounding: the last cumulative sum can fall just below 1
        return elems[-1]
    # return created function
    return sample

wheel = generate_wheel(table)
wheel()

'P'
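With a reusable wheel in hand, the promotion itself can be simulated. The sketch below assumes that spelling PUPPY three times means collecting at least nine P's, three U's, and three Y's, and that each bag is an independent draw; both are interpretations of the promotion rules rather than something stated above. The names `make_wheel` and `bags_until_win` are hypothetical, and the standard library's `random` module stands in for `numpy.random` so that the block runs on its own.

```python
import random
from collections import Counter
from itertools import accumulate

table = {'P': 0.6, 'U': 0.25, 'Y': 0.15}

def make_wheel(table, rng):
    """Build the wheel once and return a closure that samples from it."""
    elems = list(table)
    edges = list(accumulate(table.values()))  # cumulative probabilities

    def sample():
        u = rng.random()
        for val, edge in zip(elems, edges):
            if u <= edge:
                return val
        return elems[-1]  # guard against rounding in the last edge

    return sample

def bags_until_win(sample, word='PUPPY', times=3):
    """Count the bags opened until the letters can spell `word` `times` times."""
    needed = Counter(word * times)  # assumed requirement: 9 P's, 3 U's, 3 Y's
    have = Counter()
    bags = 0
    while any(have[letter] < count for letter, count in needed.items()):
        have[sample()] += 1
        bags += 1
    return bags

rng = random.Random(0)  # fixed seed so the run is repeatable
wheel = make_wheel(table, rng)
trials = [bags_until_win(wheel) for _ in range(10_000)]
print('Average bags needed:', sum(trials) / len(trials))
```

Under these assumptions a customer can never win with fewer than 15 bags. Y is the rarest letter, so it tends to be the bottleneck: collecting three Y's alone takes 3 / 0.15 = 20 bags on average.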