# Bootstrap Resampling of the Mean
---
### Does bootstrap resampling of a statistic better approximate the population value the more you resample?

In [1]:
from scipy.stats import norm
import numpy as np

In [2]:
RAND_STATE = 1 # seed for generating random values from normal distribution
NUM_RESAMPLES = 10 # base number of bootstrap resamples (each experiment uses NUM_RESAMPLES**exp)
POP_SIZE = 10000 # size of population
SAMP_SIZE = int(0.05 * POP_SIZE) # size of sample to be sampled from population
MU = 10 # mean of normal distribution
SIGMA = 2 # standard deviation of normal distribution

In [3]:
# population
population = norm.rvs(loc=MU, scale=SIGMA, size=POP_SIZE, random_state=RAND_STATE)
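# sample from population
# (np.random.choice draws with replacement by default and is not seeded by RAND_STATE,
# so this sample, and the numbers printed below, change between runs)
samp = np.random.choice(population, size=SAMP_SIZE)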
# average of population
pop_mean = np.mean(population)

In [4]:
for exp in range(1, 7):
    # draw len(samp) values from the sample with replacement, NUM_RESAMPLES**exp times,
    # and record the mean of each resample
    bootstrap = [np.mean(np.random.choice(samp, size=len(samp), replace=True)) for _ in range(NUM_RESAMPLES**exp)]
    # average of bootstrap resamples
    bootstrap_mean = np.mean(bootstrap)
    
    print('\nNumber of Resamples: {}'.format(NUM_RESAMPLES**exp))
    print('Population Mean: {}\nBootstrap Mean: {}'.format(pop_mean, bootstrap_mean))
    print('\nDifference in means: {}'.format(round(abs(pop_mean - bootstrap_mean), 8)))
    print('-' * 40 if exp < 6 else '')


Number of Resamples: 10
Population Mean: 10.019545313398211
Bootstrap Mean: 10.262028070396484

Difference in means: 0.24248276
----------------------------------------

Number of Resamples: 100
Population Mean: 10.019545313398211
Bootstrap Mean: 10.273353168073884

Difference in means: 0.25380785
----------------------------------------

Number of Resamples: 1000
Population Mean: 10.019545313398211
Bootstrap Mean: 10.274686391322081

Difference in means: 0.25514108
----------------------------------------

Number of Resamples: 10000
Population Mean: 10.019545313398211
Bootstrap Mean: 10.27103531504074

Difference in means: 0.25149
----------------------------------------

Number of Resamples: 100000
Population Mean: 10.019545313398211
Bootstrap Mean: 10.270836347006604

Difference in means: 0.25129103
----------------------------------------

Number of Resamples: 1000000
Population Mean: 10.019545313398211
Bootstrap Mean: 10.27011292480651

Difference in means: 0.25056761



# CONCLUSION:
### No. Judging by "Difference in means", resampling more often does not bring the bootstrap mean closer to the population mean: the difference stays at about 0.25 from 10 resamples up to 1,000,000. Additional resamples only stabilize the bootstrap mean (here near 10.27), because the average of the resample means converges to the mean of the sample being resampled, not to the population mean. How close the bootstrap statistic gets to the population statistic is therefore determined by the sample itself, not by the number of resamples.
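A minimal sketch of why the difference levels off, assuming NumPy's `default_rng` generator instead of the notebook's `np.random.choice` calls. The helper `bootstrap_mean`, the seed, and the resample counts are illustrative choices, not part of the original notebook, so the printed numbers will differ from the output above. It compares the bootstrap mean with both the sample mean and the population mean:

```python
import numpy as np

# Assumed setup mirroring the notebook's constants (MU=10, SIGMA=2, 5% sample)
rng = np.random.default_rng(1)
population = rng.normal(loc=10, scale=2, size=10_000)
samp = rng.choice(population, size=500, replace=False)

pop_mean = population.mean()
samp_mean = samp.mean()


def bootstrap_mean(sample, num_resamples, rng):
    """Average of the means of `num_resamples` bootstrap resamples (hypothetical helper)."""
    n = len(sample)
    # Each row of idx selects one resample of size n, drawn with replacement
    idx = rng.integers(0, n, size=(num_resamples, n))
    return sample[idx].mean(axis=1).mean()


results = {}
for num_resamples in (10, 100, 1_000, 10_000):
    boot = bootstrap_mean(samp, num_resamples, rng)
    results[num_resamples] = boot
    print(
        f"{num_resamples:>6} resamples: "
        f"|bootstrap - sample| = {abs(boot - samp_mean):.5f}, "
        f"|bootstrap - population| = {abs(boot - pop_mean):.5f}"
    )
```

In a run of this sketch, the distance to the sample mean should shrink as the number of resamples grows, while the distance to the population mean should settle at roughly `abs(samp_mean - pop_mean)`. That gap is fixed once the sample is drawn, and a larger sample, not more resamples, is what narrows it.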