<img width="300px" src="../images/learning-tree-logo.svg" alt="Learning Tree logo" />

# Scottish independence referendum (2014) polling

In the 2014 referendum, Scotland voted to remain part of the United Kingdom (UK): 2,001,926 people voted "No" (remain) and 1,617,989 voted "Yes" (leave), [resulting in a 55.3% vote share for "No"](https://en.wikipedia.org/wiki/2014_Scottish_independence_referendum).

In [None]:
import matplotlib.pyplot as plt
import numpy as np

## Create the (voter) population

`0` represents "Remain/No" and `1` represents "Leave/Yes", so the mean of the population is the proportion who voted "Leave/Yes".

In [None]:
population = np.concatenate((np.zeros(2_001_926), np.ones(1_617_989)))

The percentage of voters who voted to remain in the UK was

In [None]:
np.round((1 - np.mean(population)) * 100, 1)

## Conduct opinion polls

[Opinion polls prior to the referendum](https://en.wikipedia.org/wiki/Opinion_polling_on_Scottish_independence#2014_referendum) typically used a sample size of around 1,000.

In [None]:
SAMPLE_SIZE = 1_000

Simulate 10,000 opinion polls, each drawing `SAMPLE_SIZE` voters at random from the population.

In [None]:
N_SAMPLES = 10_000

In [None]:
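# Each entry will hold one poll's estimated proportion of "Leave/Yes" voters.
sample_means = np.zeros(N_SAMPLES)

for i in range(N_SAMPLES):
    # Poll SAMPLE_SIZE voters at random (with replacement).
    sample = np.random.choice(population, size=SAMPLE_SIZE, replace=True)
    sample_means[i] = np.mean(sample)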
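
Because the population contains only 0s and 1s, sampling with replacement is equivalent to a binomial draw, so the same polls can be sketched without building each sample. This is a minimal alternative, not the method used in this notebook; `p_yes`, `rng`, `yes_counts` and `fast_sample_means` are names introduced here for illustration, and the seed is arbitrary.

```python
import numpy as np

SAMPLE_SIZE = 1_000
N_SAMPLES = 10_000

# Proportion of "Leave/Yes" voters in the population (from the referendum result).
p_yes = 1_617_989 / (2_001_926 + 1_617_989)

# Seeded generator so the sketch is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(42)

# Each poll's number of "Yes" responses follows Binomial(SAMPLE_SIZE, p_yes).
yes_counts = rng.binomial(n=SAMPLE_SIZE, p=p_yes, size=N_SAMPLES)

# Convert counts to proportions, matching what `sample_means` stores.
fast_sample_means = yes_counts / SAMPLE_SIZE
```

This avoids the Python loop, so it should run considerably faster, while producing estimates with the same distribution.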

The range of poll estimates for the proportion of "Remain/No" voters is

In [None]:
1 - np.max(sample_means), 1 - np.min(sample_means)

In [None]:
plt.hist(sample_means);

Note that the only source of error in this simulation is **sampling error**. Real opinion polls have other sources of error, such as non-response and unrepresentative samples.

## Accuracy of a poll

The mean and standard deviation (the standard error) of the sampling distribution are

In [None]:
x = 1 - np.mean(sample_means)
x

In [None]:
s = np.std(sample_means)
s
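
As a cross-check, the standard error of a sample proportion has a well-known closed form, `sqrt(p * (1 - p) / n)`, which the simulated standard deviation should approximate. The sketch below assumes simple random sampling; `p_no` and `theoretical_se` are illustrative names, not part of the notebook.

```python
import numpy as np

SAMPLE_SIZE = 1_000

# True proportion of "Remain/No" voters, from the referendum result.
p_no = 2_001_926 / (2_001_926 + 1_617_989)

# Standard error of a sample proportion under simple random sampling.
theoretical_se = np.sqrt(p_no * (1 - p_no) / SAMPLE_SIZE)
print(np.round(theoretical_se, 4))
```

For a sample of 1,000 this comes to roughly 0.016, or about 1.6 percentage points.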

The 95% margin of error is 1.96 standard errors, assuming the sampling distribution is approximately normal:

In [None]:
err = 1.96 * s
np.round(err, 3)

About 95% of the simulated polls therefore estimate the "Remain/No" share within the interval

In [None]:
np.round(x - err, 3), np.round(x + err, 3)
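
One way to sanity-check the margin of error is to count how many simulated polls land within it of the true result. The sketch below is self-contained and assumes a binomial shortcut for the polls (valid because the population is 0/1 and sampled with replacement); `p_no`, `rng`, `estimates`, `margin` and `coverage` are illustrative names, and the seed is arbitrary.

```python
import numpy as np

SAMPLE_SIZE = 1_000
N_SAMPLES = 10_000

# True proportion of "Remain/No" voters, from the referendum result.
p_no = 2_001_926 / (2_001_926 + 1_617_989)

# Simulate each poll's "Remain/No" estimate (seed chosen arbitrarily).
rng = np.random.default_rng(0)
estimates = rng.binomial(n=SAMPLE_SIZE, p=p_no, size=N_SAMPLES) / SAMPLE_SIZE

# 95% margin of error from the simulated sampling distribution.
margin = 1.96 * np.std(estimates)

# Fraction of polls whose estimate falls within the margin of the true value.
coverage = np.mean(np.abs(estimates - p_no) <= margin)
print(np.round(coverage, 3))
```

The printed fraction should come out close to 0.95, which is what a 95% margin of error promises when sampling error is the only source of error.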