# Randomness and Reproducibility

In Python, random numbers come from a **pseudo-random number generator** (PRNG), which produces a sequence of (pseudo-)random numbers from a seed chosen by the user. A given seed always produces the same sequence; hence, we can reproduce randomness.

In [1]:
import random

In [8]:
random.seed(1234)
# First random number in sequence
random.random()

0.9664535356921388

In [9]:
# Second random number in sequence
random.random()

0.4407325991753527

In [7]:
# When we re-define the same seed, the same sequence starts again!
random.seed(1234)
random.random()

0.9664535356921388
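
The cells above seed the module-level generator, which is shared by all code that calls `random` directly. As a minimal sketch of an alternative, the standard library also lets you create separate `random.Random` objects, each with its own seed and state. The names `rng_a` and `rng_b` are illustrative and not part of the original notebook.

```python
import random

# Two independent generator objects seeded with the same value
rng_a = random.Random(1234)
rng_b = random.Random(1234)

# Each generator keeps its own state, so both yield the same sequence
seq_a = [rng_a.random() for _ in range(3)]
seq_b = [rng_b.random() for _ in range(3)]
print(seq_a == seq_b)

# Drawing from the module-level generator does not disturb rng_a or rng_b
random.random()
print(rng_a.random() == rng_b.random())
```

Keeping a dedicated generator per experiment can help when other code in the same session also draws random numbers.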

In [14]:
# Uniform random number
random.uniform(25,50)

27.223220749766558

In [15]:
# List of uniformly distributed random numbers
unifNumbers = [random.uniform(0,1) for _ in range(10)]
unifNumbers

[0.5790026861873665,
 0.26958550381944824,
 0.5564325605562156,
 0.6446342341782827,
 0.48103637136651844,
 0.35523914744298335,
 0.249152121361209,
 0.9335154980423467,
 0.45338801947649354,
 0.5301612069115903]

In [16]:
# Normal random number
mu = 0
sigma = 1
random.normalvariate(mu, sigma)

-1.676475241982295

### Random Sampling from a Population

In [34]:
import random
import numpy as np

In [35]:
# We create normally distributed measurements for a population
mu = 0
sigma = 1
population = [random.normalvariate(mu, sigma) for _ in range(10000)]

In [36]:
# We get two samples of 500 units/measurements each
sampleA = random.sample(population, 500)
sampleB = random.sample(population, 500)

In [37]:
# Sample means and standard deviations should be close to
# the population mean and standard deviation
print(np.mean(sampleA))
print(np.std(sampleA))
print(np.mean(sampleB))
print(np.std(sampleB))

-0.01887093025319796
1.025757255228497
-0.030655082826115447
1.0389949856742682


In [38]:
# Sampling distribution: draw 100 samples of 1000 units (sample size) each
# Note that the mean of the standard deviations is computed, not the std of the means
# The std of the means refers to the spread of the sampling distribution
means = [np.mean(random.sample(population, 1000)) for _ in range(100)]
stds = [np.std(random.sample(population, 1000)) for _ in range(100)]
# The mean of the sampling distribution, followed by the mean of the sample stds
print(np.mean(means))
print(np.mean(stds))

-0.0025898174345457044
0.998534878086858
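
The comments in the cell above distinguish the mean of the standard deviations from the std of the means. A minimal sketch of the latter follows, assuming the same population setup. It compares the empirical spread of the sample means with the usual textbook approximation, including a finite population correction because `random.sample` draws without replacement. The seed value and the names `se_empirical` and `se_theory` are assumptions made for illustration.

```python
import random
import numpy as np

random.seed(1234)  # arbitrary seed, chosen only to make this sketch repeatable

mu, sigma = 0, 1
population = [random.normalvariate(mu, sigma) for _ in range(10000)]

n = 1000         # sample size
n_samples = 100  # number of samples drawn

means = [np.mean(random.sample(population, n)) for _ in range(n_samples)]

# Spread of the sampling distribution (empirical standard error of the mean)
se_empirical = np.std(means)

# Textbook approximation with the finite population correction
N = len(population)
se_theory = np.std(population) / np.sqrt(n) * np.sqrt((N - n) / (N - 1))

print(se_empirical, se_theory)
```

With only 100 samples the empirical value is itself noisy, so the two numbers should be close but not identical.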


In [45]:
# It is also possible to use numpy
import numpy as np
np.random.seed(123)
mu = 100
sigma = 1
sample = np.random.normal(mu, sigma, 3)
print(sample)

[ 98.9143694  100.99734545 100.2829785 ]


In [46]:
# Sampling
population = np.arange(1,101)
# random.sample draws without replacement
sample = random.sample(list(population),10)
print(sample)
# np.random.choice draws with replacement by default, so values can repeat
sample = np.random.choice(population,10)
print(sample)

[43, 44, 7, 21, 18, 72, 90, 32, 1, 56]
[84 58 87 98 97 48 74 33 47 97]
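
Note that the second output above contains 97 twice: `np.random.choice` samples with replacement unless told otherwise, whereas `random.sample` never repeats a unit. A minimal sketch of drawing without replacement in NumPy follows, using the `np.random.default_rng` generator interface. The seed value and the variable names are assumptions made for illustration.

```python
import numpy as np

population = np.arange(1, 101)

# Generator object with its own seed (the seed value is arbitrary)
rng = np.random.default_rng(123)

# replace=False draws each unit at most once, like random.sample
sample_without = rng.choice(population, size=10, replace=False)

# replace=True (the default) may return the same unit more than once
sample_with = rng.choice(population, size=10, replace=True)

print(sample_without)
print(sample_with)
```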
