I. A study was undertaken to compare the mean time per week that male and female college students spend on their cell phones. Fifty male and 50 female students were selected from Midwestern University, and the number of hours per week each spent talking on a cell phone was determined. The results, in hours, are shown in Table 10.6. We wish to test $H_0 : \mu_1 = \mu_2$ versus $H_1 : \mu_1 \neq \mu_2$ based on these samples.

\begin{array}{c|ccccc}
\textbf{Males} & 12 & 4 & 11 & 13 & 11 \\
               & 7 & 9 & 10 & 10 & 7 \\
               & 7 & 12 & 6 & 9 & 15 \\
               & 10 & 11 & 12 & 7 & 8 \\
               & 8 & 9 & 11 & 10 & 9 \\
               & 10 & 9 & 9 & 7 & 9 \\
               & 11 & 7 & 10 & 10 & 11 \\
               & 9 & 12 & 12 & 8 & 13 \\
               & 9 & 10 & 8 & 11 & 10 \\
               & 13 & 13 & 9 & 10 & 13 \\
\hline
\textbf{Females} & 11 & 9 & 7 & 10 & 9 \\
                 & 10 & 10 & 7 & 9 & 10 \\
                 & 11 & 8 & 9 & 6 & 11 \\
                 & 10 & 7 & 9 & 12 & 14 \\
                 & 11 & 12 & 12 & 8 & 12 \\
                 & 12 & 9 & 10 & 11 & 7 \\
                 & 12 & 7 & 9 & 8 & 11 \\
                 & 10 & 8 & 13 & 8 & 10 \\
                 & 9 & 9 & 9 & 11 & 9 \\
                 & 9 & 8 & 9 & 12 & 11 \\
\end{array}
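
For reference, the test statistic that the code below relies on can be written out explicitly. This is the standard pooled two-sample form, and it assumes equal population variances; that assumption is not stated in the problem itself but is the one the solution makes by passing `equal_var=True` and reporting a pooled standard deviation. Here $\bar{x}_i$, $s_i$, and $n_i$ denote the sample mean, sample standard deviation, and sample size of group $i$.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
\text{df} = n_1 + n_2 - 2
```

With $n_1 = n_2 = 50$ this gives 98 degrees of freedom, and $H_0$ is rejected at level $\alpha$ when $|t|$ exceeds the $1 - \alpha/2$ quantile of the $t$ distribution with 98 degrees of freedom.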



In [None]:
import pandas as pd
import numpy as np
from scipy import stats


## Data
males = np.array([
    12, 4, 11, 13, 11,
    7, 9, 10, 10, 7,
    7, 12, 6, 9, 15,
    10, 11, 12, 7, 8,
    8, 9, 11, 10, 9,
    10, 9, 9, 7, 9,
    11, 7, 10, 10, 11,
    9, 12, 12, 8, 13,
    9, 10, 8, 11, 10,
    13, 13, 9, 10, 13
])

females = np.array([
    11, 9, 7, 10, 9,
    10, 10, 7, 9, 10,
    11, 8, 9, 6, 11,
    10, 7, 9, 12, 14,
    11, 12, 12, 8, 12,
    12, 9, 10, 11, 7,
    12, 7, 9, 8, 11,
    10, 8, 13, 8, 10,
    9, 9, 9, 11, 9,
    9, 8, 9, 12, 11
])

## Descriptive Statistics
n_males = np.size(males)
n_females = np.size(females)

mean_males = round(np.mean(males), 2)
mean_females = round(np.mean(females), 2)

std_males = round(np.std(males, ddof=1), 2)
std_females = round(np.std(females, ddof=1), 2)

sterror_males = round(std_males / np.sqrt(n_males), 2)
sterror_females = round(std_females / np.sqrt(n_females),2)

estimated_diff = round(mean_males - mean_females, 2)

print("Group       N   Mean   StDev  SE Mean")
print(f"Males      {n_males}   {mean_males}   {std_males}    {sterror_males}")
print(f"Females    {n_females}   {mean_females}   {std_females}    {sterror_females}\n")
print("Difference = mu (males) - mu (females)")
print(f"Estimate for difference: {estimated_diff}")


## Hypothesis Testing
# Run the pooled two-sample t-test first so that t_stat and p_value
# are defined before the summary lines below print them.
alpha = 0.05
t_stat, p_value = stats.ttest_ind(males, females, equal_var=True)
t_stat = round(t_stat, 2)
p_value = round(p_value, 3)

## Confidence Interval
df = n_males + n_females - 2
# With equal group sizes this expression equals the pooled standard error
se_diff = np.sqrt((std_males**2 / n_males) + (std_females**2 / n_females))
t_crit = stats.t.ppf(1 - alpha / 2, df)

ci_low = round(estimated_diff - t_crit * se_diff, 6)
ci_high = round(estimated_diff + t_crit * se_diff, 6)

pooled_std = round(np.sqrt(((n_males - 1) * std_males**2
                            + (n_females - 1) * std_females**2) / df), 2)

print(f"95% CI for difference: ({ci_low}, {ci_high})")
print(f"T-Test of difference = 0 (vs not =): T-Value = {t_stat}  P-Value = {p_value}")
print(f"DF = {df}")
print(f"Both use Pooled StDev = {pooled_std}\n")

# Report the test result and the decision at level alpha

print(f"t-statistic: {t_stat}")
print(f"p-value: {p_value}")

if p_value < alpha:
    print("Reject the null hypothesis. There is a significant difference between the means of males and females.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference between the means of males and females.")




Group       N   Mean   StDev  SE Mean
Males      50   9.82   2.15    0.3
Females    50   9.7   1.78    0.25

Difference = mu (males) - mu (females)
Estimate for difference: 0.12
95% CI for difference: (-0.663344, 0.903344)
T-Test of difference = 0 (vs not =): T-Value = 0.3  P-Value = 0.762
DF = 98
Both use Pooled StDev = 1.97

t-statistic: 0.3
p-value: 0.762
Fail to reject the null hypothesis. There is no significant difference between the means of males and females.
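
As a cross-check on what `stats.ttest_ind(..., equal_var=True)` computes, the pooled statistic can be rebuilt from its definition. The sketch below is illustrative rather than part of the original solution: the helper name `pooled_t_test` is hypothetical, and for brevity it is run on only the first two rows of each group in Table 10.6, so its numbers differ from the full-sample output above.

```python
import numpy as np
from scipy import stats


def pooled_t_test(x, y):
    """Two-sided pooled-variance t-test built from the textbook formulas."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / df
    se = np.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    t = (x.mean() - y.mean()) / se
    # Two-sided p-value from the survival function of the t distribution
    p = 2.0 * stats.t.sf(abs(t), df)
    return t, p, df


# Subset of Table 10.6 (first two rows per group), for illustration only
males_subset = [12, 4, 11, 13, 11, 7, 9, 10, 10, 7]
females_subset = [11, 9, 7, 10, 9, 10, 10, 7, 9, 10]

t_manual, p_manual, dof = pooled_t_test(males_subset, females_subset)
t_scipy, p_scipy = stats.ttest_ind(males_subset, females_subset, equal_var=True)

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.4f}, df = {dof}")
print(f"scipy : t = {t_scipy:.4f}, p = {p_scipy:.4f}")
print("agree:", np.isclose(t_manual, t_scipy) and np.isclose(p_manual, p_scipy))
```

Working from the unrounded sample variances, as this helper does, avoids the small rounding drift that arises when rounded summary statistics are fed into later formulas.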


II.

In [None]:
import numpy as np

# --- Data Set from Problem II.1, 2, 3 ---
X = np.array([2, 3, 7, 8, 10])
n = len(X)
print(f"Data Set (X): {X}")
print(f"Number of observations (n): {n}\n")

# 1. Moments About the Origin (Raw Moments)
print("--- 1. Moments About the Origin ---")

# First Moment About the Origin (r=1)
mu_prime_1 = np.mean(X)
print(f"(a) First Moment (mu'_1): {mu_prime_1}")

# Second Moment About the Origin (r=2)
mu_prime_2 = np.sum(X**2) / n
print(f"(b) Second Moment (mu'_2): {mu_prime_2}")

# Third Moment About the Origin (r=3)
mu_prime_3 = np.sum(X**3) / n
print(f"(c) Third Moment (mu'_3): {mu_prime_3}")

# Fourth Moment About the Origin (r=4)
mu_prime_4 = np.sum(X**4) / n
print(f"(d) Fourth Moment (mu'_4): {mu_prime_4}\n")


# 2. Moments About the Mean (Central Moments) - Problem II.2

print("--- 2. Moments About the Mean ---")

# Calculate deviations from the mean
deviations = X - mu_prime_1

# First Moment About the Mean (r=1)
mu_1 = np.sum(deviations**1) / n
print(f"(a) First Moment (mu_1): {mu_1}")

# Second Moment About the Mean (r=2)
mu_2 = np.sum(deviations**2) / n
print(f"(b) Second Moment (mu_2): {mu_2}")

# Third Moment About the Mean (r=3)
mu_3 = np.sum(deviations**3) / n
print(f"(c) Third Moment (mu_3): {mu_3}")

# Fourth Moment About the Mean (r=4)
mu_4 = np.sum(deviations**4) / n
print(f"(d) Fourth Moment (mu_4): {mu_4}\n")


# 3. Verify the relationship for the fourth moment

print("--- 3. Verify---")

# Left Hand Side (LHS) of the equation
LHS = mu_4
print(f"mu_4 (from direct calculation - LHS): {LHS}")

# Right Hand Side (RHS) of the equation
RHS = (mu_prime_4 -
       4 * mu_prime_1 * mu_prime_3 +
       6 * (mu_prime_1**2) * mu_prime_2 -
       3 * (mu_prime_1**4))
print(f"Formula calculation (RHS): {RHS}")

if np.isclose(LHS, RHS):
    print("\nVerification Successful: LHS is equal to RHS (within tolerance).")
else:
    print("\nVerification Failed: LHS is NOT equal to RHS.")

Data Set (X): [ 2  3  7  8 10]
Number of observations (n): 5

--- 1. Moments About the Origin ---
(a) First Moment (mu'_1): 6.0
(b) Second Moment (mu'_2): 45.2
(c) Third Moment (mu'_3): 378.0
(d) Fourth Moment (mu'_4): 3318.8

--- 2. Moments About the Mean ---
(a) First Moment (mu_1): 0.0
(b) Second Moment (mu_2): 9.2
(c) Third Moment (mu_3): -3.6
(d) Fourth Moment (mu_4): 122.0

--- 3. Verify---
mu_4 (from direct calculation - LHS): 122.0
Formula calculation (RHS): 122.00000000000091

Verification Successful: LHS is equal to RHS (within tolerance).
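
The fourth-moment relation verified above is one case of a general binomial relation between central and raw moments, $\mu_r = \sum_{k=0}^{r} \binom{r}{k} (-\mu'_1)^{r-k} \mu'_k$ with $\mu'_0 = 1$. The following sketch applies that relation for $r = 1, \dots, 4$ on the same data set; the helper names `raw_moment` and `central_from_raw` are hypothetical and are not part of the original solution.

```python
import numpy as np
from math import comb


def raw_moment(x, r):
    """r-th moment about the origin; r = 0 gives 1."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** r)


def central_from_raw(x, r):
    """r-th central moment assembled from raw moments via the binomial relation."""
    m1 = raw_moment(x, 1)
    return sum(
        comb(r, k) * (-m1) ** (r - k) * raw_moment(x, k)
        for k in range(r + 1)
    )


X = np.array([2, 3, 7, 8, 10])
for r in range(1, 5):
    direct = np.mean((X - X.mean()) ** r)  # definition of the central moment
    via_raw = central_from_raw(X, r)       # same quantity from raw moments
    print(f"r = {r}: direct = {direct:.4f}, from raw moments = {via_raw:.4f}")
```

For $r = 4$ the sum collapses to $\mu'_4 - 4\mu'_1\mu'_3 + 6\mu'^2_1\mu'_2 - 3\mu'^4_1$, which is the right-hand side used in the verification cell.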


III. Prove that $m'_4 = m_4 + 4hm_3 + 6h^2m_2 + h^4$, where $h = m'_1$.
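
A numerical check is not a proof, but it is a quick way to confirm that the identity has the right signs and coefficients before working through the algebra. The sketch below evaluates both sides on the data set from Problem II; reusing that data set here is an assumption made for illustration, since Problem III states the identity in general.

```python
import numpy as np

# Data set borrowed from Problem II, used only as a sanity check
X = np.array([2, 3, 7, 8, 10], dtype=float)

h = X.mean()  # h = m'_1, the first moment about the origin
# Central moments m_2, m_3, m_4
m2, m3, m4 = (np.mean((X - h) ** r) for r in (2, 3, 4))

lhs = np.mean(X ** 4)                                   # m'_4
rhs = m4 + 4 * h * m3 + 6 * h**2 * m2 + h**4            # right-hand side

print(f"m'_4 (direct)       = {lhs}")
print(f"m_4 + 4h m_3 + ...  = {rhs}")
print("identity holds numerically:", np.isclose(lhs, rhs))
```

Both sides should come out near 3318.8, the fourth raw moment reported for this data set in Problem II.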




In [1]:
from sympy import symbols, expand

# Define symbolic variables
X, mu = symbols('X mu')

# The key identity is X = (X - mu) + mu.
print("--- Moment Proof: m_4' in terms of Central Moments ---")
print("\n[STEP 1: Algebraic Setup]")
print("We use the identity X = (X - mu) + mu. The mean 'mu' is given as 'h' (m_1').")
print("Raising both sides to the 4th power gives: X^4 = ((X - mu) + mu)^4")
print("We treat this as a Binomial Expansion of (A + B)^4, where A = (X - mu) and B = mu.")

# Use sympy to perform the Binomial Expansion: (A + B)^4
# A = (X - mu)
# B = mu
a, b = symbols('a b')
expansion_formula = expand((a + b)**4)
print("\n[STEP 2: Binomial Expansion (A + B)^4]")
print("Formula: (A + B)^4 = " + str(expansion_formula))

# Substitute A = (X - mu) and B = mu back into the expanded formula
substituted_expansion = expansion_formula.subs({a: (X - mu), b: mu})

print("\n[STEP 3: Substitution]")
print("Substitute A = (X - mu) and B = mu into the formula:")
print(f"X^4 = {substituted_expansion}")
print("   = (X - mu)^4 + 4*mu*(X - mu)^3 + 6*mu^2*(X - mu)^2 + 4*mu^3*(X - mu) + mu^4")

print("\n[STEP 4: Take Expectation (E) of all terms]")
print("The Expectation is taken on both sides. Since mu is a constant, it moves outside the E[] operator.")
print("E[X^4] = E[(X - mu)^4] + 4*mu*E[(X - mu)^3] + 6*mu^2*E[(X - mu)^2] + 4*mu^3*E[(X - mu)] + mu^4")

print("\n[STEP 5: Apply Moment Definitions]")
print("We use the following definitions:")
print("  - E[X^4]        = m_4' (4th moment about the origin)")
print("  - E[(X - mu)^4] = m_4  (4th central moment)")
print("  - E[(X - mu)^3] = m_3  (3rd central moment)")
print("  - E[(X - mu)^2] = m_2  (2nd central moment, or variance)")
print("  - E[(X - mu)]   = m_1  (1st central moment, which is ALWAYS 0)")

print("\n[STEP 6: Substitute Moment Symbols and Simplify]")
print("Substitute the symbols into the Expectation equation:")
print("m_4' = m_4 + 4*mu*m_3 + 6*mu^2*m_2 + 4*mu^3*(0) + mu^4")

print("\nAfter the 4*mu^3*(0) term vanishes:")
print("m_4' = m_4 + 4*mu*m_3 + 6*mu^2*m_2 + mu^4")

print("\n[STEP 7: Final Notation Conversion]")
print("Given in the problem: h = m_1' = mu.")
print("Replacing 'mu' with 'h' gives the required identity:")
print("Final identity: m_4' = m_4 + 4h*m_3 + 6h^2*m_2 + h^4")

--- Moment Proof: m_4' in terms of Central Moments ---

[STEP 1: Algebraic Setup]
We use the identity X = (X - mu) + mu. The mean 'mu' is given as 'h' (m_1').
Raising both sides to the 4th power gives: X^4 = ((X - mu) + mu)^4
We treat this as a Binomial Expansion of (A + B)^4, where A = (X - mu) and B = mu.

[STEP 2: Binomial Expansion (A + B)^4]
Formula: (A + B)^4 = a**4 + 4*a**3*b + 6*a**2*b**2 + 4*a*b**3 + b**4

[STEP 3: Substitution]
Substitute A = (X - mu) and B = mu into the formula:
X^4 = mu**4 + 4*mu**3*(X - mu) + 6*mu**2*(X - mu)**2 + 4*mu*(X - mu)**3 + (X - mu)**4
   = (X - mu)^4 + 4*mu*(X - mu)^3 + 6*mu^2*(X - mu)^2 + 4*mu^3*(X - mu) + mu^4

[STEP 4: Take Expectation (E) of all terms]
The Expectation is taken on both sides. Since mu is a constant, it moves outside the E[] operator.
E[X^4] = E[(X - mu)^4] + 4*mu*E[(X - mu)^3] + 6*mu^2*E[(X - mu)^2] + 4*mu^3*E[(X - mu)] + mu^4

[STEP 5: Apply Moment Definitions]
We use the following definitions:
  - E[X^4]        = m_4' (4th m