
#### SOFTWARE DEMONSTRATION OF ELEMENTARY SAMPLING THEORY
## 8.17 Context
**Midwestern University has $ 1/3 $ of its students taking 9 credit hours, $ 1/3 $ taking 12 credit hours, and $ 1/3 $ taking 15 credit hours.**

**If X represents the credit hours a student is taking, the distribution of X is $ p(x) = 1/3 $ for x = 9, 12, and 15. Find the mean and variance of X ($ μ $ and $ σ^2 $). What type of distribution does X have?**

| x   | p(x)  | x * p(x) | x^2 * p(x) |
|-----|-------|----------|------------|
| 9   | 1/3   | 3        | 27         |
| 12  | 1/3   | 4        | 48         |
| 15  | 1/3   | 5        | 75         |
| **total ∑** | **∑(p(x)) = 1**  | **μ = 12**     | **150**    |

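The variance is $$ σ^2 = \sum{[x^2 \cdot p(x)]} - μ^2 = 150 - 144 = 6 $$

#### Answer
#### $ μ = 12 $ and $ σ^2 = 6 $

Because the three values are equally likely, X has a discrete uniform distribution.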

In [1]:
population_mean = 12.000
population_variance = 6.000

# Display the result
print(f"Population Mean Distribution: {population_mean : .3f}")
print(f"Population Variance Distribution: {population_variance: .3f}")

Population Mean Distribution:  12.000
Population Variance Distribution:  6.000
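
The cell above enters the mean and variance by hand from the table. As a cross-check, here is a minimal sketch that derives both from the distribution itself. The names `pmf`, `mu`, and `sigma_sq` are illustrative and not part of the original notebook; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Distribution of X from Problem 8.17: p(x) = 1/3 for x = 9, 12, 15
pmf = {9: Fraction(1, 3), 12: Fraction(1, 3), 15: Fraction(1, 3)}

# Mean: sum of x * p(x)
mu = sum(x * p for x, p in pmf.items())

# Variance: sum of x^2 * p(x), minus the squared mean
sigma_sq = sum(x**2 * p for x, p in pmf.items()) - mu**2

print(f"mu = {float(mu):.3f}, sigma^2 = {float(sigma_sq):.3f}")
```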


## Answer 8.18

List all samples of size n = 2 that are possible (with replacement) from the population in Problem 8.17. Use the chart wizard of EXCEL to plot the sampling distribution of the mean, and show that $ μ_{\overline{X}} $ and $ σ^2_{\overline{X}} $ satisfy:

$ μ_{\overline{X}} = μ $ where $ μ $ is the population mean answer from 8.17

$ σ^2_{\overline{X}} = \frac{σ^2}{2} $ where $ σ^2 $ is the population variance answer from 8.17

| A   | B   | Mean (x̄) | p(x̄)    |
|-----|-----|-----------|----------|
| 9   | 9   | 9         | 0.1111   |
| 9   | 12  | 10.5      | 0.1111   |
| 9   | 15  | 12        | 0.1111   |
| 12  | 9   | 10.5      | 0.1111   |
| 12  | 12  | 12        | 0.1111   |
| 12  | 15  | 13.5      | 0.1111   |
| 15  | 9   | 12        | 0.1111   |
| 15  | 12  | 13.5      | 0.1111   |
| 15  | 15  | 15        | 0.1111   |

Each of the nine ordered samples has probability $ 1/3 \cdot 1/3 = 1/9 \approx 0.1111 $.



In [2]:
import itertools
import numpy as np
import pandas as pd

# Define the population and probabilities
population = [9, 12, 15]
probabilities = [1/3, 1/3, 1/3]

# Generate all possible samples of size 2 with replacement
samples = list(itertools.product(population, repeat=2))

# Create a DataFrame to hold the samples and their properties
df_samples = pd.DataFrame(samples, columns=['A', 'B'])

# Calculate the mean (x̄) of each sample
df_samples['Mean (x̄)'] = df_samples.mean(axis=1)

# Since all probabilities are equal (1/9), we assign that value
df_samples['p(x̄)'] = 1/3 * 1/3

# Calculate the expected mean of the sampling distribution of the mean
expected_mean = np.sum(df_samples['Mean (x̄)'] * df_samples['p(x̄)'])

# Calculate the variance of the sampling distribution of the mean
expected_variance = np.sum((df_samples['Mean (x̄)']**2) * df_samples['p(x̄)']) - expected_mean**2

# Display the result
print(f"Expected Mean of the Sampling Distribution: {expected_mean : .3f}")
print(f"Expected Variance of the Sampling Distribution: {expected_variance : .3f}")

Expected Mean of the Sampling Distribution:  12.000
Expected Variance of the Sampling Distribution:  3.000
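
The nine ordered samples can be collapsed into the sampling distribution of the mean, which is what the requested plot would display. The sketch below groups the samples by their mean using only the standard library. The names `counts` and `sampling_dist` are illustrative, and this is one possible way to tabulate the distribution rather than the notebook's own method.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [9, 12, 15]

# Count how many of the 9 equally likely ordered samples give each mean
counts = Counter(Fraction(a + b, 2) for a, b in product(population, repeat=2))

# Each ordered sample has probability 1/9, so p(x̄) = count / 9
sampling_dist = {mean: Fraction(n, 9) for mean, n in sorted(counts.items())}

for mean, prob in sampling_dist.items():
    print(f"x̄ = {float(mean):4.1f}   p(x̄) = {prob}")
```

The middle value x̄ = 12 arises from three different samples, so it carries the largest probability, and the distribution is symmetric about it.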


#### Is expected_mean = population_mean?
$ μ_{\overline{X}} = μ $

### Yes!

In [3]:
print(f"""
population_mean = {population_mean : .0f}
expected_mean = {expected_mean: .0f}
""")


population_mean =  12
expected_mean =  12



#### Is expected_variance = population_variance / 2 ?
$ σ^2_{\overline{X}} = \frac{σ^2}{2} $

### Yes!

In [4]:
print(f"""
population_variance divide by 2 = {population_variance / 2 : .0f}
expected_variance = {expected_variance: .0f}
""")


population_variance divide by 2 =  3
expected_variance =  3



#### SAMPLING DISTRIBUTION OF MEANS  
## Answer 8.21


A population consists of the four numbers 3, 7, 11, and 15. Consider all possible samples of size 2 that can be drawn with replacement from this population.

| x   | p(x)  | x * p(x) | x^2 * p(x) |
|-----|-------|----------|------------|
| 3   | 1/4   | 0.75     | 2.25       |
| 7   | 1/4   | 1.75     | 12.25      |
| 11  | 1/4   | 2.75     | 30.25      |
| 15  | 1/4   | 3.75     | 56.25      |
| **Total** | **∑(p(x)) = 1**  | **μ = 9**       | **101**      |


Find the following:

+ (a) the population mean
+ (b) the population standard deviation
+ (c) the mean of the sampling distribution of means
+ (d) the standard deviation of the sampling distribution of means. 

In [5]:
import numpy as np
import itertools

# A population consists of the four numbers 3, 7, 11, and 15.
population = [3, 7, 11, 15]

# Consider all possible samples of size 2 that can be drawn with replacement from this population.
samples = list(itertools.product(population, repeat=2))

# Calculate sample means
sample_means = [np.mean(sample) for sample in samples]

# (a) Population mean and variance then (b) Standard Deviation
population_mean = np.mean(population)
population_variance = np.var(population, ddof=0)
population_standard_deviation = population_variance**(1/2)

# (c) Mean of sample means
mean_sample_means = np.mean(sample_means)

# Variance of sample means then its (d) Standard Deviation
variance_sample_means = np.var(sample_means, ddof=0)
standard_deviation_sample_means = variance_sample_means**(1/2)

# Display the results for parts (a) through (d)
print(f"""
    (a) population_mean = {population_mean : .2f}
    (b) population_standard_deviation = {population_standard_deviation : .2f}
    (c) mean_sample_means = {mean_sample_means : .2f}
    (d) standard_deviation_sample_means = {standard_deviation_sample_means : .2f}
""")


    (a) population_mean =  9.00
    (b) population_standard_deviation =  4.47
    (c) mean_sample_means =  9.00
    (d) standard_deviation_sample_means =  3.16
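
Parts (b) and (d) are linked by the standard result for sampling with replacement, $ σ_{\overline{X}} = σ / \sqrt{n} $. The following sketch checks that relation for this population with the standard library's `statistics.pstdev`; the variable names `sigma` and `sigma_xbar` are illustrative.

```python
import math
import statistics
from itertools import product

population = [3, 7, 11, 15]
n = 2  # sample size

# Means of all 16 ordered samples drawn with replacement
sample_means = [sum(s) / n for s in product(population, repeat=n)]

sigma = statistics.pstdev(population)         # population standard deviation
sigma_xbar = statistics.pstdev(sample_means)  # std. dev. of the sample means

print(f"sigma / sqrt(n) = {sigma / math.sqrt(n):.2f}")
print(f"sigma_xbar      = {sigma_xbar:.2f}")
```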



#### SAMPLING DISTRIBUTION OF PROPORTIONS
## Answer 8.34
Find the probability that of the next 200 children born:
   + (a) less than 40% will be boys
   + (b) between 43% and 57% will be girls
   + (c) more than 54% will be boys

Assume equal probabilities for the births of boys and girls.

In [6]:
from scipy.stats import norm

# Parameters
n = 200
p = 0.5
sigma_p = np.sqrt(p * (1 - p) / n)

# Part (a): P(X < 0.4)
z_a = (0.4 - p) / sigma_p
prob_a = norm.cdf(z_a)

# Part (b): P(0.43 < X < 0.57)
z_b1 = (0.43 - p) / sigma_p
z_b2 = (0.57 - p) / sigma_p
prob_b = norm.cdf(z_b2) - norm.cdf(z_b1)

# Part (c): P(X > 0.54)
z_c = (0.54 - p) / sigma_p
prob_c = 1 - norm.cdf(z_c)


print(f"""
# Less than 40% will be boys
## P(X < 0.4) = {prob_a : .4f}

# Between 43% and 57% will be girls
## P(0.43 < X < 0.57) = {prob_b : .4f}

# More than 54% will be boys
## P(X > 0.54) = {prob_c : .4f}
""")


# Less than 40% will be boys
## P(X < 0.4) =  0.0023

# Between 43% and 57% will be girls
## P(0.43 < X < 0.57) =  0.9523

# More than 54% will be boys
## P(X > 0.54) =  0.1289
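
The cell above applies the normal approximation directly to the proportions, without a continuity correction. Because the number of boys (or girls) among 200 births is a whole number, the same probabilities can also be computed exactly from the binomial distribution. The sketch below does this with `math.comb`. The helper `binom_prob` is hypothetical, and reading each inequality as strict (for example, "less than 40%" as fewer than 80 children) is an assumption about the problem's wording.

```python
from math import comb

n = 200  # number of births, each a boy or a girl with probability 1/2


def binom_prob(lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2**n


# (a) fewer than 40% boys: X < 80
exact_a = binom_prob(0, 79)
# (b) strictly between 43% and 57% girls: 86 < X < 114
exact_b = binom_prob(87, 113)
# (c) more than 54% boys: X > 108
exact_c = binom_prob(109, n)

print(f"(a) exact = {exact_a:.4f}")
print(f"(b) exact = {exact_b:.4f}")
print(f"(c) exact = {exact_c:.4f}")
```

The exact values should be close to, but not identical with, the normal-approximation results, since the approximation treats a discrete count as continuous.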



#### SOFTWARE DEMONSTRATION OF ELEMENTARY SAMPLING THEORY
## Answer 8.49
The credit hour distribution at Metropolitan Technological College is as follows:

|x |6 |9 |12 |15 |18 |
|---|---|---|---|---|---|
|p(x) |0.1 |0.2 |0.4 |0.2 |0.1 |

Find $ μ $ and $ σ^2 $. List the 25 possible samples of size 2 (drawn with replacement), their means, and their probabilities.

In [7]:
import itertools
import numpy as np
import pandas as pd

# Given values
x_values = np.array([6, 9, 12, 15, 18])
p_values = np.array([0.1, 0.2, 0.4, 0.2, 0.1])

# 1. Calculate population mean (μ)
population_mean = np.sum(x_values * p_values)

# 2. Calculate population variance (σ²)
population_variance = np.sum(x_values**2 * p_values) - population_mean**2

# 3. List all possible samples of size 2 with replacement
samples = list(itertools.product(x_values, repeat=2))

# 4. Calculate sample means and probabilities
sample_means = [(sample[0] + sample[1]) / 2 for sample in samples]
sample_probabilities = [p_values[x_values == sample[0]][0] * p_values[x_values == sample[1]][0] for sample in samples]

# Create DataFrame for results
df_results = pd.DataFrame({
    'Sample A': [sample[0] for sample in samples],
    'Sample B': [sample[1] for sample in samples],
    'Mean': sample_means,
    'Probability': sample_probabilities
})

# Format the mean to 1 decimal place and the probability to 2 decimal places
df_results['Mean'] = df_results['Mean'].map('{:.1f}'.format)
df_results['Probability'] = df_results['Probability'].map('{:.2f}'.format)

# Display DataFrame
df_results


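|     | Sample A | Sample B | Mean | Probability |
|-----|----------|----------|------|-------------|
| 0   | 6        | 6        | 6.0  | 0.01        |
| 1   | 6        | 9        | 7.5  | 0.02        |
| 2   | 6        | 12       | 9.0  | 0.04        |
| 3   | 6        | 15       | 10.5 | 0.02        |
| 4   | 6        | 18       | 12.0 | 0.01        |
| 5   | 9        | 6        | 7.5  | 0.02        |
| 6   | 9        | 9        | 9.0  | 0.04        |
| 7   | 9        | 12       | 10.5 | 0.08        |
| 8   | 9        | 15       | 12.0 | 0.04        |
| 9   | 9        | 18       | 13.5 | 0.02        |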


In [8]:
# Print the population mean (no decimals) and variance (1 decimal place)
print(f"Population Mean (μ): {population_mean:.0f}")
print(f"Population Variance (σ²): {population_variance:.1f}")

Population Mean (μ): 12
Population Variance (σ²): 10.8
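
As in Problem 8.18, the 25 samples can be used to check that the sampling distribution of the mean has mean $ μ $ and variance $ σ^2 / 2 $. The sketch below does this with exact fractions. It is an illustrative extension rather than part of the original answer, and the names `pairs`, `mean_xbar`, and `var_xbar` are assumptions.

```python
from fractions import Fraction
from itertools import product

# Credit-hour distribution from Problem 8.49, kept as exact fractions
pmf = {6: Fraction(1, 10), 9: Fraction(2, 10), 12: Fraction(4, 10),
       15: Fraction(2, 10), 18: Fraction(1, 10)}

# Each ordered sample (a, b) has mean (a + b) / 2 and probability p(a) * p(b)
pairs = [(Fraction(a + b, 2), pmf[a] * pmf[b]) for a, b in product(pmf, repeat=2)]

total_prob = sum(p for _, p in pairs)
mean_xbar = sum(m * p for m, p in pairs)
var_xbar = sum(m**2 * p for m, p in pairs) - mean_xbar**2

print(f"total probability = {float(total_prob):.2f}")
print(f"mean of x̄ = {float(mean_xbar):.1f}")
print(f"variance of x̄ = {float(var_xbar):.1f}")
```

Unlike Problem 8.18, the samples here are not equally likely, so each sample mean must be weighted by the product of the two individual probabilities.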
