# Stratified Sampling
In stratified sampling, the population is partitioned into non-overlapping groups called strata, and a sample is selected by some design within each stratum.

For example, customers can be stratified by their total purchases or by the revenue they generate.

Stratification may produce a smaller error of estimation than would be produced by a simple random sample of the same size. This result is particularly true if measurements within strata are very homogeneous. 

In other words, stratified sampling divides a heterogeneous population into
subpopulations, each of which is internally homogeneous.




### Stratum & Strata
In geology or biology, a stratum is a layer, such as a layer of rock or of skin. A stratum in statistics is not much different: the target population is divided into non-overlapping layers, subgroups, or categories. For example, strata could define the revenue tiers of a customer base:

1. Lowest 25 percent of customers based on total revenue.
2. Middle 50 percent of customers based on total revenue.
3. Highest 25 percent of customers based on total revenue.

Each of the subgroups (i.e. a single subgroup or a category) is called a stratum, 
and two or more subgroups are called strata.
> [Note: ‘Stratum’ is singular and ‘strata’ is plural]

For example, customers grouped by revenue form the strata, and the high-percentile revenue customers are one stratum.
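
As a minimal sketch of the 25 / 50 / 25 revenue split described above, the snippet below assigns customers to strata using quartile cut points. The function name `assign_revenue_strata`, the customer IDs, and the revenue figures are all made up for illustration; with tied revenues at a cut point the shares will not be exactly 25 / 50 / 25.

```python
import statistics


def assign_revenue_strata(revenue_by_customer):
    """Split customers into 'low', 'middle', and 'high' revenue strata."""
    revenues = list(revenue_by_customer.values())
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(revenues, n=4, method="inclusive")
    strata = {"low": [], "middle": [], "high": []}
    for customer, revenue in revenue_by_customer.items():
        if revenue <= q1:
            strata["low"].append(customer)      # lowest quarter
        elif revenue <= q3:
            strata["middle"].append(customer)   # middle half
        else:
            strata["high"].append(customer)     # highest quarter
    return strata


# Hypothetical data: eight customers with revenue 100, 200, ..., 800.
revenue_by_customer = {f"c{i}": 100 * i for i in range(1, 9)}
strata = assign_revenue_strata(revenue_by_customer)
for name, members in strata.items():
    print(name, members)
```

Every customer lands in exactly one stratum, which is the non-overlapping property the definition requires.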



We use the following symbols and notations:
- N : Population size
- k : Number of strata
- $ N_i $ : Number of sampling units in the $ i^{th} $ stratum
$$ N = \sum_{i=1}^{k} N_i $$

- $ n_i $ : Number of sampling units to be drawn from the $ i^{th} $ stratum
- n : Total sample size
$$ n = \sum_{i=1}^{k} n_i $$


### Neyman Allocation
In Neyman allocation, the number of units in the sample from a stratum is made proportional to the product of the stratum size and the stratum standard deviation.

$ n_h = n  \dfrac{ N_h  \sigma_h } { \sum_{i=1}^{k}  N_i  \sigma_i  } $

where 
- $ n_h $ is the sample size for stratum h, 
- n is total sample size, 
- $ N_h $ is the population size for stratum h, 
- $ \sigma_h $ is the standard deviation of stratum h.

> For a fixed total sample size, Neyman allocation minimizes the variance of the stratified estimator of the population mean, assuming the cost of sampling a unit is the same in every stratum.
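
The allocation formula above can be sketched in a few lines. The function name `neyman_allocation` and the stratum sizes and standard deviations below are hypothetical; the function returns fractional sample sizes, so in practice they need rounding (and a check that no $ n_h $ exceeds $ N_h $).

```python
def neyman_allocation(n, sizes, sigmas):
    """Return n_h = n * N_h * sigma_h / sum_i(N_i * sigma_i) for each stratum."""
    weights = [N_h * sigma_h for N_h, sigma_h in zip(sizes, sigmas)]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one stratum needs a positive N_h * sigma_h")
    return [n * w / total for w in weights]


# Hypothetical strata: the smallest stratum is the most variable.
sizes = [500, 300, 200]      # N_h
sigmas = [10.0, 20.0, 40.0]  # sigma_h
allocation = neyman_allocation(100, sizes, sigmas)
print([round(n_h) for n_h in allocation])
```

In this made-up example the smallest stratum receives the largest share of the sample because its standard deviation is four times that of the largest stratum.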

###  Slovin’s formula

$ n = \dfrac{N}{1 + N e^2} $

where
- $ n $ is the sample size,
- $ N $ is the population size,
- $ e $ is the error tolerance (margin of error), expressed as a proportion.

Slovin's formula is used when nothing at all is known about the behavior of the population.

Note that this is the least accurate formula and, as such, the least ideal. Use it only if circumstances prevent you from determining an appropriate standard deviation and/or confidence level (and therefore your z-score).
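
A minimal sketch of Slovin's formula in its usual form, $ n = N / (1 + N e^2) $, with $ N $ the population size and $ e $ the error tolerance as a proportion. The function name `slovin_sample_size` is an assumption, and rounding up to a whole unit is a choice made here, not part of the formula.

```python
import math


def slovin_sample_size(population_size, error_tolerance):
    """Sample size from Slovin's formula, rounded up to a whole unit."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < error_tolerance < 1:
        raise ValueError("error_tolerance must be a proportion between 0 and 1")
    exact = population_size / (1 + population_size * error_tolerance ** 2)
    return math.ceil(exact)


# Hypothetical inputs: 1,000 and 10,000 units with a 5% error tolerance.
small = slovin_sample_size(1_000, 0.05)
large = slovin_sample_size(10_000, 0.05)
print(small, large)
```

The required sample size grows much more slowly than the population: a tenfold larger population needs only about a third more units at the same tolerance.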

### Comments
To obtain the full benefits of the stratification technique, the relative sizes of strata must be known.

Each stratum should be internally homogeneous. If information about heterogeneity is not available then consider all strata equally variable. A short stratified pilot survey can sometimes provide useful information about internal dispersion within strata.

#### A small sample can be taken from a stratum if the variability among its units is small.

Compared with a simple random sample, stratification almost always results in a smaller sampling variance of the estimators of the mean or total when:

- The strata differ markedly from one another (their means are far apart).
- The variance within each stratum is small.
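
The two conditions above can be checked with a small simulation. This is a sketch under stated assumptions: the population is synthetic (three normally distributed strata with far-apart means and small spread), the sample is allocated proportionally, and all names and numbers are invented for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population: strata differ a lot from each other, little within.
strata = [
    [rng.gauss(10, 2) for _ in range(200)],
    [rng.gauss(50, 2) for _ in range(500)],
    [rng.gauss(100, 2) for _ in range(300)],
]
population = [y for stratum in strata for y in stratum]
N = len(population)
n = 50
# Proportional allocation: each stratum gets n * N_h / N units.
n_h = [n * len(stratum) // N for stratum in strata]


def srs_mean():
    """Estimate the population mean from a simple random sample of size n."""
    return statistics.fmean(rng.sample(population, n))


def stratified_mean():
    """Estimate the mean as the N_h / N weighted average of stratum sample means."""
    return sum(
        len(stratum) / N * statistics.fmean(rng.sample(stratum, k))
        for stratum, k in zip(strata, n_h)
    )


reps = 1000
srs_var = statistics.pvariance([srs_mean() for _ in range(reps)])
strat_var = statistics.pvariance([stratified_mean() for _ in range(reps)])
print(f"SRS variance: {srs_var:.3f}, stratified variance: {strat_var:.3f}")
```

Because almost all of the variability in this population lies between strata, the stratified estimator removes it and its variance is far smaller than that of the simple random sample of the same size.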

#### A larger sample from a stratum should be taken if:

- The stratum is larger
- The stratum is more heterogeneous
- The cost of sampling the stratum is low.
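
These three rules are combined in the standard cost-weighted allocation, in which $ n_h $ is proportional to $ N_h \sigma_h / \sqrt{c_h} $, where $ c_h $ is the cost of sampling one unit in stratum h. The notes above do not state this formula, so treat the sketch below as an illustration; the name `cost_weighted_allocation` and all figures are hypothetical.

```python
import math


def cost_weighted_allocation(n, sizes, sigmas, unit_costs):
    """Allocate n units with n_h proportional to N_h * sigma_h / sqrt(c_h)."""
    weights = [
        N_h * sigma_h / math.sqrt(c_h)
        for N_h, sigma_h, c_h in zip(sizes, sigmas, unit_costs)
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one stratum needs a positive weight")
    return [n * w / total for w in weights]


# Two hypothetical strata that differ only in cost: the second is 4x as expensive.
by_cost = cost_weighted_allocation(90, [400, 400], [10.0, 10.0], [1.0, 4.0])
print(by_cost)

# With equal unit costs the rule reduces to Neyman allocation.
equal_cost = cost_weighted_allocation(
    100, [500, 300, 200], [10.0, 20.0, 40.0], [1.0, 1.0, 1.0]
)
print([round(n_h) for n_h in equal_cost])
```

The stratum that is four times as expensive to sample receives half as many units, because the cost enters through its square root.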

### Proportionate Stratification
Proportionate stratification is a type of stratified sampling. With proportionate stratification, the sample size of each stratum is proportionate to the population size of the stratum. This means that each stratum has the same sampling fraction.
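
A minimal sketch of proportionate allocation follows. Exact proportional shares are rarely whole numbers, so this version hands leftover units to the strata with the largest remainders; that rounding rule, the function name `proportional_allocation`, and the stratum sizes are assumptions made for the example.

```python
def proportional_allocation(n, sizes):
    """Allocate n units so that n_h / N_h is as close to n / N as integers allow."""
    N = sum(sizes)
    # Integer part of each exact share n * N_h / N.
    allocation = [n * N_h // N for N_h in sizes]
    # Leftover units go to the strata with the largest remainders.
    remainders = [n * N_h % N for N_h in sizes]
    leftover = n - sum(allocation)
    by_remainder = sorted(range(len(sizes)), key=lambda i: remainders[i], reverse=True)
    for i in by_remainder[:leftover]:
        allocation[i] += 1
    return allocation


# Hypothetical strata matching a 25 / 50 / 25 split of 1,000 customers.
even = proportional_allocation(100, [250, 500, 250])
print(even)  # every stratum has sampling fraction 100 / 1000

# Sizes that do not divide evenly: one leftover unit goes to the largest stratum.
uneven = proportional_allocation(10, [333, 333, 334])
print(uneven)
```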

## Central Limit Theorem
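The central limit theorem states that the sampling distribution of the mean of independent, identically distributed random variables with finite variance will be normal or nearly normal if the sample size is large enough. The sampling distribution has mean $ \mu $ and standard deviation $ \dfrac{\sigma}{\sqrt{n}} $, where $ \mu $ and $ \sigma $ are the population mean and standard deviation and $ n $ is the sample size.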


##### Mean

$ \mu_{\bar{x}} = \mu $
 
##### Standard Deviation ("Standard Error")
$ SE(\bar{x}) = \dfrac{\sigma}{\sqrt{n}} $

The amazing and counter-intuitive thing about the central limit theorem is that, whatever the shape of the original distribution (provided its variance is finite), the sampling distribution of the mean approaches a normal distribution as the sample size grows.
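
The theorem can be illustrated with a short simulation. This sketch draws from an exponential distribution with mean 1 and standard deviation 1, which is strongly skewed; the choice of distribution, the sample size of 40, and the 5,000 repetitions are all assumptions made for the example.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0  # mean and standard deviation of an Exponential(1) variable
n = 40                # size of each sample
reps = 5000           # number of samples drawn

# Each entry is the mean of one sample of n skewed observations.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
standard_error = sigma / math.sqrt(n)

# Under a normal distribution about 95% of values lie within 1.96 standard errors.
share_within = sum(
    abs(m - mu) <= 1.96 * standard_error for m in sample_means
) / reps

print(f"mean of sample means: {mean_of_means:.3f} (theory: {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {standard_error:.3f})")
print(f"share within 1.96 SE: {share_within:.3f} (normal: 0.950)")
```

Although individual observations are far from normal, the simulated sample means centre on $ \mu $, spread out by roughly $ \sigma / \sqrt{n} $, and fall within 1.96 standard errors about as often as a normal distribution predicts.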