# What is a Sampling Distribution?
### The sampling distribution of the mean refers to the probability distribution of means for all possible random samples of a given size from some population.
In effect, this distribution describes the variability among sample means that could
occur just by chance and thereby serves as a frame of reference for generalizing from a
single sample mean to a population mean. The sampling distribution of the mean allows us to determine whether, given the
variability among all possible sample means, the one observed sample mean can be
viewed as a common outcome or as a rare outcome.

# Creating a Sampling Distribution from Scratch!
Let’s establish precisely what constitutes a sampling distribution by creating one from
scratch under highly simplified conditions. Imagine some ridiculously small population of four observations with values of 2, 3, 4, and 5, as shown in Figure 9.1.<br>
Next,
itemize all possible random samples, each of size two, that could be taken from this
population. There are four possibilities on the first draw from the population and also
four possibilities on the second draw from the population, as indicated in Table 9.1.<br>
The two sets of possibilities combine to yield a total of 16 possible samples. At this
point, remember, we’re clarifying the notion of a sampling distribution of the mean. In
practice, only a single random sample, not 16 possible samples, would be taken from
the population; the sample size would be very small relative to a much larger population size, and, of course, not all observations in the population would be known.<br>
For each of the 16 possible samples, Table 9.1 also lists a sample mean (found
by adding the two observations and dividing by 2) and its probability of occurrence
(expressed as 1 ⁄ 16 , since each of the 16 possible samples is equally likely). When cast
into a relative frequency or probability distribution, as in Table 9.2, the 16 sample
means constitute the sampling distribution of the mean, previously defined as the probability distribution of means for all possible random samples of a given size from
some population. Not all values of the sample mean occur with equal probabilities in
Table 9.2 since some values occur more than once among the 16 possible samples.
For instance, a sample mean value of 3.5 appears among 4 of 16 possibilities and has
a probability of 4 ⁄ 16 .<br>
![image.png](attachment:e72e22a1-518e-4dfd-acf7-3f262a29182c.png)

![image.png](attachment:7d408515-a1f3-42b2-a0e1-568f3d901346.png)
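
The construction described above can be reproduced in a few lines of Python. This is a minimal sketch that assumes, as in Table 9.1, that the two draws are made with replacement and that order matters, which gives 4 × 4 = 16 equally likely samples. The variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 3, 4, 5]
sample_size = 2

# All ordered samples drawn with replacement: 4 x 4 = 16 possibilities.
samples = list(product(population, repeat=sample_size))

# Mean of each sample (sum of the two observations divided by 2).
sample_means = [sum(s) / sample_size for s in samples]

# Each sample is equally likely, so probability = count / 16.
counts = Counter(sample_means)
sampling_distribution = {
    mean: Fraction(count, len(samples))
    for mean, count in sorted(counts.items())
}

for mean, count in sorted(counts.items()):
    print(f"sample mean = {mean:.1f}   probability = {count}/{len(samples)}")
```

The printed rows should correspond to Table 9.2; for instance, a sample mean of 3.5 appears in 4 of the 16 samples.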

## Probability of a Particular Sample Mean
The distribution in Table 9.2 can be consulted to determine the probability of obtaining
a particular sample mean or set of sample means. For example, the probability of
a randomly selected sample mean of 5.0 equals 1 ⁄ 16 or .0625. According to the addition rule for mutually exclusive outcomes, the probability of a randomly selected sample mean of either 5.0 or 2.0 equals 1 ⁄ 16 + 1 ⁄ 16 = 2 ⁄ 16 = .1250. This type
of probability statement, based on a sampling distribution, assumes an essential role in
inferential statistics.<br>
Figure 9.2 summarizes the previous discussion. It depicts the emergence of the
sampling distribution of the mean from the set of all possible (16) samples of size two, based on the miniature population of four observations.

![image.png](attachment:d0e67a5a-00bb-4739-8a93-c946e3b5a05e.png)
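
The two probability statements above can be checked with a short sketch. It rebuilds the 16 samples from the population of 2, 3, 4, and 5 (assuming draws with replacement) and applies the addition rule by summing the probabilities of distinct sample mean values. The helper name `prob_of_means` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 3, 4, 5]
samples = list(product(population, repeat=2))

# Count how many of the 16 samples produce each sample mean.
counts = Counter(sum(s) / 2 for s in samples)

def prob_of_means(*means):
    """Probability that a randomly selected sample mean equals any of `means`.

    Distinct mean values are mutually exclusive outcomes, so their
    probabilities are simply added (addition rule).
    """
    return sum(Fraction(counts[m], len(samples)) for m in set(means))

p_five = prob_of_means(5.0)
p_five_or_two = prob_of_means(5.0, 2.0)

print(float(p_five))         # 0.0625
print(float(p_five_or_two))  # 0.125
```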

## Mean of All Sample Means ($ \mu_{\bar{X}}$)
### The mean of the sampling distribution of the mean always equals the mean of the population.
### Expressed as Symbols $ \mu_{\bar{X}} = \mu$
where $μ_{\bar{X}}$ represents the mean of the sampling distribution and μ represents the mean
of the population.
## Interchangeable Means
Since the mean of all sample means always equals the mean of the population (μ), these two terms are interchangeable in inferential statistics. Any claims about the
population mean can be transferred directly to the mean of the sampling distribution,
and vice versa.
## Explanation
Although important, it’s not particularly startling that the mean of all sample means
equals the population mean. As can be seen in Figure 9.2, samples are not exact replicas of the population, and most sample means are either larger or smaller than the
population mean (equal to 3.5 in Figure 9.2). By taking the mean of all sample means,
however, you effectively neutralize chance differences between sample means and
retain a value equal to the population mean.
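
As a quick check of $ \mu_{\bar{X}} = \mu$, the sketch below averages the means of all possible samples from the miniature population and compares the result with the population mean. It assumes sampling with replacement, and the sample sizes of 1 and 3 are included only for illustration; the function name is hypothetical.

```python
from itertools import product

population = [2, 3, 4, 5]
population_mean = sum(population) / len(population)

def mean_of_sample_means(population, sample_size):
    """Average the means of all possible samples drawn with replacement."""
    means = [
        sum(s) / sample_size
        for s in product(population, repeat=sample_size)
    ]
    return sum(means) / len(means)

for n in (1, 2, 3):
    print(f"n = {n}: mean of sample means = {mean_of_sample_means(population, n)}, "
          f"population mean = {population_mean}")
```

Up to floating-point rounding, each line should report the same value for both quantities, whatever the sample size.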

# Standard Error of the Mean ($ \sigma_{\bar{X}}$): Formula 9.2
### The standard error of the mean equals the standard deviation of the population divided by the square root of the sample size.
$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} $$
where $ \sigma_{\bar{X}}$ represents the standard error of the mean; σ represents the standard deviation of the population; and n represents the sample size.<br>
## Special Type of Standard Deviation
The standard error of the mean serves as a special type of standard deviation that
measures variability in the sampling distribution. It supplies us with a standard, much
like a yardstick, that describes the amount by which sample means deviate from the
mean of the sampling distribution or from the population mean. The error in standard
error refers not to computational errors, but to errors in generalizations attributable to the
fact that, just by chance, most random samples aren’t exact replicas of the population.
### You might find it helpful to think of the standard error of the mean as a rough measure of the average amount by which sample means deviate from the mean of the sampling distribution or from the population mean.
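
Formula 9.2 can be checked against the miniature population of 2, 3, 4, and 5. The sketch below computes the standard error two ways: from the formula, and directly as the standard deviation of all 16 sample means. It assumes sampling with replacement, which is the setting in which the two values agree exactly; the variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import pstdev

population = [2, 3, 4, 5]
n = 2

# Route 1: Formula 9.2, population standard deviation over sqrt(n).
sigma = pstdev(population)
standard_error = sigma / sqrt(n)

# Route 2: standard deviation of the 16 sample means themselves.
sample_means = [sum(s) / n for s in product(population, repeat=n)]
sd_of_sample_means = pstdev(sample_means)

print(f"sigma                      = {sigma:.4f}")
print(f"sigma / sqrt(n)            = {standard_error:.4f}")
print(f"std. dev. of sample means  = {sd_of_sample_means:.4f}")
```

Both routes give roughly 0.79, which is smaller than the population standard deviation of roughly 1.12.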

## Effect of Sample Size
A most important implication of Formula 9.2 is that whenever the sample size
equals two or more, the variability of the sampling distribution is less than that in the
population. A modest demonstration of this effect appears in Figure 9.2, where the
means of all possible samples cluster closer to the population mean (equal to 3.5) than
do the four original observations in the population. A more dramatic demonstration
occurs with larger sample sizes. Earlier in this chapter, for instance, 110 was given as
the value of σ, the population standard deviation for SAT scores. Much smaller is the
variability in the sampling distribution of mean SAT scores, each based on samples of
100 freshmen. Applying Formula 9.2 to the present example,
$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{110}{\sqrt{100}} = 11 $$
shows a tenfold reduction in variability, from 110 to 11, when our focus shifts from
the population to the sampling distribution.<br>
### According to Formula 9.2, any increase in sample size translates into a smaller standard error and, therefore, into a new sampling distribution with less variability. With a larger sample size, sample means cluster more closely about the mean of the sampling distribution and about the mean of the population and, therefore, allow more precise generalizations from samples to populations.
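
To see how quickly the standard error shrinks, the sketch below evaluates Formula 9.2 for the SAT example (σ = 110) at several sample sizes. Only n = 100 comes from the text; the other sample sizes are chosen purely for illustration, and the function name is hypothetical.

```python
from math import sqrt

def standard_error(sigma, n):
    """Formula 9.2: population standard deviation divided by sqrt(n)."""
    return sigma / sqrt(n)

sigma = 110  # population standard deviation for SAT scores, from the text

for n in (1, 4, 25, 100, 400):
    print(f"n = {n:>3}   standard error = {standard_error(sigma, n):.1f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.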

## Explanation
It’s not surprising that variability should be smaller in sampling distributions than
in populations. The population standard deviation reflects variability among individual observations, and it is directly affected by any relatively large or small observations within the population. On the other hand, the standard error of the mean reflects
variability among sample means, each of which represents a collection of individual
observations. The appearance of relatively large or small observations within a particular sample tends to affect the sample mean only slightly, because of the stabilizing
presence in the same sample of other, more moderate observations or even extreme
observations in the opposite direction. This stabilizing effect becomes even more pronounced with larger sample sizes.
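
The same stabilizing effect can be expressed algebraically. The following is a brief derivation sketch, assuming the $n$ observations $X_1, \dots, X_n$ in a sample are drawn independently from the population (as in random sampling with replacement), each with variance $\sigma^2$:

$$ \sigma_{\bar{X}}^2 = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} $$

Taking the square root of both sides gives $ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} $, in agreement with Formula 9.2.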

# Shape of The Sampling Distribution
The central limit theorem is a product of statistical theory. Expressed in its simplest form:
### Central Limit Theorem: Regardless of the shape of the population, the shape of the sampling distribution of the mean approximates a normal curve if the sample size is sufficiently large.
According to this theorem, it doesn’t matter whether the shape of the parent population is normal, positively skewed, negatively skewed, or some nameless, bizarre shape,
as long as the sample size is sufficiently large. What constitutes “sufficiently large”
depends on the shape of the parent population. If the shape of the parent population is
normal, then any sample size (even a sample size of one) will be sufficiently large. Otherwise, depending on the degree of non-normality in the parent population, a sample
size between 25 and 100 is sufficiently large.
### Example
For the population with a non-normal shape in the top panel of Figure 9.2, the shape
of the sampling distribution in the bottom panel reveals a preliminary drift toward
normality—that is, a shape having a peak in the middle with tapered flanks on either
side—even for very small samples of size 2. For the two non-normal populations in
the top panel of Figure 9.3, the shapes of the sampling distributions in the middle
panel show essentially the same preliminary drift toward normality when the sample
size equals only 2, while the shapes of the sampling distributions in the bottom panel
closely approximate normality when the sample size equals 25.<br>
![image.png](attachment:48bf963b-89a8-4106-9bc0-546b0e9a897c.png)
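
The drift toward normality can also be explored by simulation. The sketch below is not a reconstruction of Figure 9.3: it assumes a positively skewed exponential population (chosen only for illustration), draws many random samples of size 1, 2, and 25, and uses skewness (0 for a symmetric curve) as a rough indicator of how far the distribution of sample means is from a normal shape. The function names, the seed, and the number of simulated samples are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Standardized third moment: 0 for a perfectly symmetric distribution."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def simulate_sample_means(sample_size, n_samples, rng):
    """Means of n_samples random samples from an exponential population."""
    return [
        fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable

# A sample size of 1 simply reproduces the shape of the population itself.
results = {
    n: skewness(simulate_sample_means(n, 20_000, rng))
    for n in (1, 2, 25)
}

for n, skew in results.items():
    print(f"sample size = {n:>2}   skewness of sample means = {skew:.2f}")
```

The skewness should shrink toward 0 as the sample size grows, consistent with the pattern described for the middle and bottom panels of Figure 9.3.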


## Why the Central Limit Theorem Works
In a normal curve, you will recall, intermediate values are the most prevalent, and
extreme values, either larger or smaller, occupy the tapered flanks. Why, when the
sample size is large, does the sampling distribution approximate a normal curve, even
though the parent population might be non-normal?
## Many Sample Means with Intermediate Values
When the sample size is large, it is most likely that any single sample will contain
the full spectrum of small, intermediate, and large scores from the parent population,
whatever its shape. The calculation of a mean for this type of sample tends to neutralize
or dilute the effects of any extreme scores, and the sample mean emerges with some
intermediate value. Accordingly, intermediate values prevail in the sampling distribution, and they cluster around a peak frequency representing the most common or modal
value of the sample mean, as suggested at the bottom of Figure 9.3.
## Few Sample Means with Extreme Values
To account for the rarer sample mean values in the tails of the sampling distribution, focus on those relatively infrequent samples that, just by chance, contain less than the full spectrum of scores from the parent population. Sometimes, because of the
relatively large number of extreme scores in a particular direction, the calculation of a
mean only slightly dilutes their effect, and the sample mean emerges with some more
extreme value. The likelihood of obtaining extreme sample mean values declines with
the extremity of the value, producing the smoothly tapered, slender tails that characterize a normal curve.

# Summary
The notion of a sampling distribution is the most important concept in inferential
statistics. The sampling distribution of the mean is defined as the probability distribution of means for all possible random samples of a given size from some population.<br>
Statistical theory pinpoints three important characteristics of the sampling distribution
of the mean:
1. The mean of the sampling distribution equals the mean of the population.
2. The standard deviation of the sampling distribution, that is, the standard error of the mean, equals the standard deviation of the population divided by the square root of the sample size. An important implication of this formula is that a larger sample size translates into a sampling distribution with a smaller variability, allowing more precise generalizations from samples to populations. The standard error of the mean serves as a rough measure of the average amount by which sample means deviate from the mean of the sampling distribution or from the population mean.
3. According to the central limit theorem, regardless of the shape of the population, the shape of the sampling distribution approximates a normal curve if the sample size is sufficiently large. Depending on the degree of non-normality in the parent population, a sample size of between 25 and 100 is sufficiently large.

<br>Any single sample mean can be viewed as originating from a sampling distribution
whose (1) mean equals the population mean (whatever its value); whose (2) standard
error equals the population standard deviation divided by the square root of the sample
size; and whose (3) shape approximates a normal curve (if the sample size satisfies the
requirements of the central limit theorem).