# Sampling

This notebook shows examples of simple random sampling and stratified random sampling, as described in section 4 of chapter 1.

In [1]:
import numpy as np

# set up a seeded generator for reproducibility
random_generator = np.random.default_rng(2020)

## Simple Random Sampling

The following code implements simple random sampling with NumPy for the population shown in figure 3.

In [2]:
population = np.arange(1, 10 + 1)
population

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [3]:
sample = random_generator.choice(
    population,           # individuals to draw from
    size=3,               # number of individuals to draw
    replace=False         # without replacement: each individual is drawn at most once
)
sample

array([1, 8, 5])

The values in your output may differ, for example if the cells are run in a different order or with a different NumPy version.
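As a quick check of the two properties used above, the following minimal sketch creates two generators from the same seed and confirms that they produce the same sample and that sampling without replacement yields no duplicates. The names `rng_a`, `rng_b`, `same_sample`, and `no_duplicates` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

population = np.arange(1, 10 + 1)

# Two generators created from the same seed (names are illustrative)
rng_a = np.random.default_rng(2020)
rng_b = np.random.default_rng(2020)

sample_a = rng_a.choice(population, size=3, replace=False)
sample_b = rng_b.choice(population, size=3, replace=False)

# Same seed and same sequence of calls: the samples match
same_sample = np.array_equal(sample_a, sample_b)

# replace=False: no individual appears more than once
no_duplicates = len(np.unique(sample_a)) == len(sample_a)

print(same_sample, no_duplicates)
```

Both checks should print `True`. Reproducibility here depends on making the same calls in the same order on a freshly seeded generator.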

## Stratified Random Sampling

The following shows how to implement stratified random sampling as shown in figure 4.

In [4]:
population = [
    1, "A", 3, 4,
    5, 2, "D", 8,
    "C", 7, 6, "B"
]

strata = {
    'number' : [],
    'string' : [],
}

# assign each individual to a stratum based on its type
for item in population:
    if isinstance(item, int):
        strata['number'].append(item)
    else:
        strata['string'].append(item)

print(strata)

{'number': [1, 3, 4, 5, 2, 8, 7, 6], 'string': ['A', 'D', 'C', 'B']}
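The grouping step can also be written with a key function, which makes it easy to check that the strata form a partition of the population, with every individual in exactly one stratum. This is a sketch only; the helper `stratify` and the function `kind` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

population = [
    1, "A", 3, 4,
    5, 2, "D", 8,
    "C", 7, 6, "B"
]

def stratify(items, key):
    """Group items into strata by the label that key(item) returns."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)

def kind(item):
    # label each individual by its type (illustrative rule)
    return 'number' if isinstance(item, int) else 'string'

strata = stratify(population, key=kind)

# Every individual lands in exactly one stratum,
# so the stratum sizes add up to the population size
total = sum(len(members) for members in strata.values())
print(total == len(population))
```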


In [5]:
# fraction of population to sample
sample_fraction = 0.5

sampled_strata = {}

for group in strata:
    # number of individuals to draw from this stratum (rounded down)
    sample_size = int(
        sample_fraction * len(strata[group])
    )
    sampled_strata[group] = random_generator.choice(
        strata[group],
        size=sample_size,
        replace=False
    )
print(sampled_strata)

{'number': array([2, 8, 5, 1]), 'string': array(['D', 'C'], dtype='<U1')}
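The per-stratum loop can be wrapped in a small function to make the proportional allocation explicit. The following is a sketch under the same assumptions as the cell above (a fixed sampling fraction, sizes rounded down with `int`); the function name `stratified_sample` is hypothetical and not part of NumPy.

```python
import numpy as np

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction, without replacement, from every stratum."""
    sampled = {}
    for name, members in strata.items():
        # int() rounds down, so a very small stratum may get zero draws
        size = int(fraction * len(members))
        sampled[name] = rng.choice(members, size=size, replace=False)
    return sampled

strata = {
    'number': [1, 3, 4, 5, 2, 8, 7, 6],
    'string': ['A', 'D', 'C', 'B'],
}

rng = np.random.default_rng(2020)
sampled = stratified_sample(strata, 0.5, rng)

# Each stratum contributes half of its members
sizes = {name: len(values) for name, values in sampled.items()}
print(sizes)  # {'number': 4, 'string': 2}
```

Because every stratum is sampled at the same fraction, the 2:1 ratio of numbers to strings in the population is preserved in the sample.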
