## Sample size calculation for stage sampling

In the sections below, we illustrate simple examples of sample size calculations in the context of household surveys using stage sampling designs. Let's assume that we want to calculate the sample size for a vaccination survey in Senegal, stratified by administrative region. We will use the 2017 Senegal DHS to get an idea of the coverage rates for some of the main vaccine doses: the hepatitis B birth dose (HepB0), the first and third doses of the diphtheria, tetanus and pertussis vaccine (DTP1 and DTP3), the first dose of measles-containing vaccine (MCV1), and basic vaccination. Basic vaccination refers to children aged 12 to 23 months who received the BCG vaccine, three doses of DTP-containing vaccine, three doses of polio vaccine, and the first dose of measles-containing vaccine.

The table below shows the 2017 Senegal DHS vaccination coverage (in percent) of these vaccine doses for children aged 12 to 23 months.

| Region        | HepB0   | DTP1    | DTP3    | MCV1    | Basic vaccination  |
| :------------ | :-----: | :-----: | :-----: | :-----: | :----------------: |
| Dakar         | 53.6    | 99.1    | 98.5    | 97.0    | 84.9               |
| Ziguinchor    | 47.1    | 98.6    | 94.1    | 93.6    | 80.9               |
| Diourbel      | 62.8    | 94.6    | 88.2    | 86.1    | 68.2               |
| Saint-Louis   | 40.1    | 99.1    | 97.2    | 94.7    | 80.6               |
| Tambacounda   | 45.0    | 83.3    | 72.7    | 65.3    | 47.0               |
| Kaolack       | 63.9    | 99.6    | 92.2    | 89.3    | 79.7               |
| Thies         | 62.3    | 100.0   | 98.8    | 91.6    | 83.4               |
| Louga         | 49.8    | 96.2    | 87.8    | 81.5    | 67.8               |
| Fatick        | 62.7    | 98.5    | 93.8    | 90.3    | 76.6               |
| Kolda         | 32.8    | 94.4    | 87.3    | 85.6    | 63.7               |
| Matam         | 43.1    | 94.3    | 88.1    | 79.4    | 68.7               |
| Kaffrine      | 56.9    | 98.0    | 93.6    | 88.7    | 76.6               |
| Kedougou      | 44.4    | 70.7    | 60.2    | 46.5    | 33.6               |
| Sedhiou       | 46.6    | 96.8    | 90.4    | 89.9    | 74.2               |

Data collection happened from April to December 2018. Therefore, the data shown in the table represent children born from October 2016 to December 2017. For the purpose of this tutorial, we will assume that these vaccine coverage rates still hold.


In [5]:
%load_ext lab_black

import numpy as np
import pandas as pd

import samplics
from samplics.sampling import SampleSize

In [8]:
# Target coverage rates for basic vaccination, expressed as proportions
# (the percentages from the table above divided by 100)
expected_coverage = {
    "Dakar": 0.849,
    "Ziguinchor": 0.809,
    "Diourbel": 0.682,
    "Saint-Louis": 0.806,
    "Tambacounda": 0.470,
    "Kaolack": 0.797,
    "Thies": 0.834,
    "Louga": 0.678,
    "Fatick": 0.766,
    "Kolda": 0.637,
    "Matam": 0.687,
    "Kaffrine": 0.766,
    "Kedougou": 0.336,
    "Sedhiou": 0.742,
}

In [13]:
# Stratified Wald sample size for a proportion, one stratum per region
sen_vaccine_wald = SampleSize(
    parameter="proportion", method="wald", stratification=True
)

# precision is the desired half-width of the confidence interval (0.07 = 7 percentage points)
sen_vaccine_wald.calculate(target=expected_coverage, precision=0.07)

print(sen_vaccine_wald.samp_size)

{'Dakar': 101.0, 'Ziguinchor': 122.0, 'Diourbel': 171.0, 'Saint-Louis': 123.0, 'Tambacounda': 196.0, 'Kaolack': 127.0, 'Thies': 109.0, 'Louga': 172.0, 'Fatick': 141.0, 'Kolda': 182.0, 'Matam': 169.0, 'Kaffrine': 141.0, 'Kedougou': 175.0, 'Sedhiou': 151.0}
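As a cross-check, the Wald sample size for a proportion can be reproduced by hand from n = z² p(1 − p) / e², where p is the expected coverage, e is the half-width of the confidence interval, and z is the standard normal quantile. The sketch below is only an illustration under stated assumptions: a 95% confidence level, a design effect of 1, and no adjustment for nonresponse. The helper `wald_sample_size` is hypothetical and is not part of samplics. Note that p must be on the 0 to 1 scale: a percentage such as 84.9 makes p(1 − p) negative and produces a meaningless negative sample size.

```python
import math
from statistics import NormalDist


def wald_sample_size(p: float, half_width: float, alpha: float = 0.05) -> int:
    """Wald sample size for a proportion (hypothetical helper, design effect = 1)."""
    if not 0 < p < 1:
        raise ValueError("p must be a proportion strictly between 0 and 1")
    # Two-sided standard normal quantile, about 1.96 for alpha = 0.05
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Round up so the requested precision is met
    return math.ceil(z**2 * p * (1 - p) / half_width**2)


# Basic vaccination coverage (percent) for three regions from the table
coverage_pct = {"Dakar": 84.9, "Tambacounda": 47.0, "Kedougou": 33.6}

# Convert percentages to proportions before applying the formula
sizes = {
    region: wald_sample_size(pct / 100, half_width=0.07)
    for region, pct in coverage_pct.items()
}
print(sizes)
```

Regions with coverage close to 50% need the largest samples, because p(1 − p) peaks at p = 0.5.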
