#### GENERAL INSTRUCTIONS: Place your code in each cell to perform the required calculations.  Ctrl-Enter will execute all lines of code in that cell. Once you are satisfied with your answer (you may work through it as much as you'd like), click on the "Submit Assignment" button in the upper right corner and your assignment will be graded.

#### Part A
The following sample has been obtained for the mean breaking strength of a particular material.

x <- c(123,117,119,128,125,121,124,119,116,126)

Calculate the P-value of the Anderson-Darling (AD) statistic (hint: **nortest** library) for the data.  Store your answer in the variable `PartA` and make sure that your result is a single scalar value (e.g., not a vector or array).

In [1]:
# Load required package
library(nortest)

x <- c(123,117,119,128,125,121,124,119,116,126)

# Perform Anderson-Darling test
ad_result <- ad.test(x)

# Extract the p-value
PartA <- ad_result$p.value

# Print the result
print(PartA)

[1] 0.8304076
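
Each part asks for a single scalar rather than a vector or array. A minimal sketch of how that could be checked before submitting is shown here; `is_single_scalar` is a hypothetical helper name introduced for illustration and is not part of the assignment or of any package.

```r
# Hypothetical helper (not from the assignment): TRUE only for a
# length-one numeric value that carries no dim attribute
is_single_scalar <- function(v) {
  is.numeric(v) && length(v) == 1 && is.null(dim(v))
}

x <- c(123, 117, 119, 128, 125, 121, 124, 119, 116, 126)

scalar_ok  <- is_single_scalar(mean(x))    # one number
vector_bad <- is_single_scalar(range(x))   # two numbers, so not a scalar
matrix_bad <- is_single_scalar(matrix(1))  # a 1 x 1 matrix still has dims

print(c(scalar_ok, vector_bad, matrix_bad))
```

The same check could be applied to `PartA` through `PartG` once each has been assigned.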


#### Part B
The following sample has been obtained for the mean breaking strength of a particular material.  Assuming that the variance is unknown, what is the lower bound for a 90% confidence interval on the population mean?  Store your answer in the variable `PartB` and make sure that your result is a single scalar value (e.g., not a vector or array).

x <- c(123,117,119,128,125,121,124,119,116,126)

HINT: See the "Confidence Interval on the Mean, Variance Unknown" section of the Week 4 Cheat Sheet for information on storing the confidence interval bounds as single values.

In [2]:
# Sample data
x <- c(123, 117, 119, 128, 125, 121, 124, 119, 116, 126)

# Sample size
n <- length(x)

# Sample mean
sample_mean <- mean(x)

# Sample standard deviation
sample_sd <- sd(x)

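# Critical value for a one-sided 90% lower confidence bound (all of alpha in one tail);
# a two-sided 90% interval would use qt(1 - alpha / 2, df = n - 1) instead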
alpha <- 0.10
t_critical <- qt(1 - alpha, df = n - 1)

# Lower bound of the confidence interval
lower_bound <- sample_mean - (t_critical * (sample_sd / sqrt(n)))

# Store the result in the variable PartB
PartB <- lower_bound

# Print the result
print(PartB)


[1] 120.0409
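
If the question is read as asking for the lower end of a two-sided 90% interval, `t.test()` from base R returns both bounds in `conf.int`. The sketch below assumes that reading and is not the graded solution; the names `two_sided_lower`, `two_sided_upper`, and `manual_lower` are illustrative.

```r
x <- c(123, 117, 119, 128, 125, 121, 124, 119, 116, 126)

# Two-sided 90% t interval from base R (stats::t.test)
ci <- t.test(x, conf.level = 0.90)$conf.int

# Indexing drops the conf.level attribute and leaves plain numbers
two_sided_lower <- ci[1]
two_sided_upper <- ci[2]

# Same lower bound computed by hand, with alpha split across both tails
n <- length(x)
manual_lower <- mean(x) - qt(1 - 0.10 / 2, df = n - 1) * sd(x) / sqrt(n)

print(c(two_sided_lower, manual_lower))
```

Because the two-sided version places only 5% in the lower tail, its lower bound sits further below the sample mean than a one-sided 90% bound does.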


#### Part C
The following sample has been obtained for the mean breaking strength of a particular material.  Population variance is unknown.

x <- c(123,117,119,128,125,121,124,119,116,126)

What is the P-value for the following lower-tailed hypothesis test?  Store your answer in the variable `PartC` and make sure that your result is a single scalar value (e.g., not a vector or array).

$H_{0}: \mu = 125$\
$H_{1}: \mu \lt 125$

HINT: To store the P-value, see the section "Hypothesis Tests on the Mean, Variance Unknown" of the Week 4 Cheat Sheet.

In [3]:
x <- c(123,117,119,128,125,121,124,119,116,126)

# Hypothesized mean
mu_0 <- 125

# Sample statistics
n <- length(x)        # Sample size
x_bar <- mean(x)      # Sample mean
s <- sd(x)            # Sample standard deviation

# Compute t-statistic
t_stat <- (x_bar - mu_0) / (s / sqrt(n))

# Compute p-value (lower-tailed test)
PartC <- pt(t_stat, df = n - 1)

# Print the result
print(PartC)

[1] 0.01649536
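
As a cross-check on the manual calculation, the same lower-tailed test can be run with base R's `t.test()`. This is a sketch only; `p_from_t_test` and `p_manual` are illustrative names.

```r
x <- c(123, 117, 119, 128, 125, 121, 124, 119, 116, 126)

# Lower-tailed one-sample t-test of H0: mu = 125 against H1: mu < 125
tt <- t.test(x, mu = 125, alternative = "less")
p_from_t_test <- tt$p.value

# Manual version, mirroring the t-statistic and pt() steps
t_stat <- (mean(x) - 125) / (sd(x) / sqrt(length(x)))
p_manual <- pt(t_stat, df = length(x) - 1)

print(c(p_from_t_test, p_manual))
```

Both routes should agree, since `t.test()` evaluates the same t distribution with n - 1 degrees of freedom.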


#### Part D
A packaging machine for peanut butter is supposed to place 100 grams of peanut butter into each jar.  Each hour, samples are removed and the peanut butter content determined (after subtracting the weight of the jar).

This packaging process has been operating for years, and it is well known that the population variance is 0.5 (units of grams squared).

We don't know it at the time, but a part has worn out on the machine, and now the mean packaging weight has slipped to 99.5 grams, which is not desired.  If we take samples of size n = 8 each hour, what is the probability that we will _fail_ to detect this shift in mean?  Use alpha = 0.05.

Store your answer in the variable `PartD` and make sure that your result is a single scalar value (e.g., not a vector or array).

HINTS: This problem is asking for beta, which is 1-power.  For information on storing the power as a single variable, see the "Type II Error and Power of the Test" section of the Week 4 Cheat Sheet.

In [4]:
# Given values
sigma2 <- 0.5   # Population variance
sigma <- sqrt(sigma2)  # Standard deviation
n <- 8         # Sample size
mu_0 <- 100    # Null hypothesis mean
mu_1 <- 99.5   # True mean after shift
alpha <- 0.05  # Significance level

# Find critical sample mean (lower-tailed test)
z_alpha <- qnorm(alpha)  # z-score for alpha = 0.05 (lower-tailed)
x_crit <- mu_0 + z_alpha * (sigma / sqrt(n))

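# Compute beta (Type II error): the probability that the sample mean lands
# above x_crit, so H0 is not rejected even though the true mean is mu_1
z_beta <- (x_crit - mu_1) / (sigma / sqrt(n))
PartD <- 1 - pnorm(z_beta)  # P(Z > z_beta); pnorm(z_beta) alone is the power

# Print result
print(PartD)


[1] 0.36124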
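
The relationship between power and beta for this setup can be sketched as a small helper. It assumes a lower-tailed z-test with known sigma, as in the cell above; `lower_tail_z_power` is a hypothetical name, not a function from any package or from the Cheat Sheet.

```r
# Hypothetical helper: power and beta of a lower-tailed z-test with known sigma
lower_tail_z_power <- function(mu_0, mu_1, sigma, n, alpha) {
  se <- sigma / sqrt(n)
  x_crit <- mu_0 + qnorm(alpha) * se      # reject H0 when x-bar < x_crit
  power <- pnorm((x_crit - mu_1) / se)    # P(reject H0 | true mean is mu_1)
  list(power = power, beta = 1 - power)   # beta = P(fail to reject | mu_1)
}

res <- lower_tail_z_power(mu_0 = 100, mu_1 = 99.5, sigma = sqrt(0.5),
                          n = 8, alpha = 0.05)
print(unlist(res))
```

Returning both quantities makes it harder to confuse the probability of detecting the shift (power) with the probability of missing it (beta).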


#### Part E
A packaging machine for peanut butter is supposed to place 100 grams of peanut butter into each jar.  This packaging process has been operating for years, and it is well known that the population variance is 0.5 (units of grams squared).

Each hour, samples are removed and the peanut butter content determined (after subtracting the weight of the jar).  It is important for us to be able to detect a 0.5-gram decrease in the process mean (from 100 grams to 99.5 grams), and we wish to successfully identify this shift 80% of the time (in other words, we want the power of the test to be at least 80%).

What sample size must we use when taking samples?  Use alpha = 0.05.  Store your answer in the variable `PartE` and make sure that your result is a single scalar value (e.g., not a vector or array).

HINT: Be sure to round your answer **up** to the nearest integer.  See the section on "Choice of Sample Size" of the Week 4 Cheat Sheet for information on how to save the sample size as a single variable.

In [5]:
# Given values
sigma2 <- 0.5   # Population variance
sigma <- sqrt(sigma2)  # Standard deviation
alpha <- 0.05   # Significance level
power <- 0.80   # Desired power
delta <- 0.5    # Shift in mean

# Z-scores (both taken as positive upper-tail quantiles)
z_alpha <- qnorm(1 - alpha)   # Upper alpha quantile for the one-sided test
z_beta <- qnorm(power)    # Z-score corresponding to power = 0.80 (beta = 0.20)

# Compute required sample size
n_required <- ((z_alpha + z_beta) * sigma / delta) ^ 2

# Round up to the nearest integer
PartE <- ceiling(n_required)

# Print result
print(PartE)


[1] 13
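
A closed-form sample size can be cross-checked by searching directly for the smallest n whose power reaches the target. The sketch below assumes a lower-tailed z-test with known sigma; `power_at_n`, `candidate_n`, and `n_min` are illustrative names, and the search range of 1 to 100 is an arbitrary assumption.

```r
# Power of a lower-tailed z-test (known sigma) at sample size n
power_at_n <- function(n, delta, sigma, alpha) {
  pnorm(qnorm(alpha) + delta * sqrt(n) / sigma)
}

candidate_n <- 1:100
achieved <- power_at_n(candidate_n, delta = 0.5, sigma = sqrt(0.5), alpha = 0.05)

# Smallest n whose power reaches the 80% target
n_min <- min(candidate_n[achieved >= 0.80])

print(n_min)
# Power just below and at the chosen sample size
print(power_at_n(c(n_min - 1, n_min), delta = 0.5, sigma = sqrt(0.5), alpha = 0.05))
```

Printing the power at `n_min - 1` and `n_min` shows why rounding up, rather than to the nearest integer, is needed to meet the target.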


#### Part F
For superior product quality, it is important that the variance of a certain polypropylene film does not exceed 0.5 mil^2.

A quality control engineer feels that the variability of the polypropylene film has increased in recent months (the film seems thinner in some regions and thicker in others, making it difficult to work with).  Therefore, the engineer sets up the following hypothesis test on the variance:

$H_{0}: \sigma^2 = 0.5$\
$H_{1}: \sigma^2 \gt 0.5$

Over the next several days, the engineer obtains the following 12 samples for the thickness of the film (in mil):

5.8, 4.4, 6.9, 5.6, 5.9, 4.1, 5.1, 4.3, 6.7, 6.1, 4.4, and 5.3

Calculate a P-value for the hypothesis test above and store your answer in the variable `PartF`.  Make sure that your result is a single scalar value (e.g., not a vector or array).

HINT: To store the P-value as a single variable, see the "Hypothesis Tests on the Variance" section of the Week 4 Cheat Sheet.

In [6]:
x <- c(5.8,4.4,6.9,5.6,5.9,4.1,5.1,4.3,6.7,6.1,4.4,5.3)

# Hypothesized variance
sigma0_sq <- 0.5

# Sample statistics
n <- length(x)          # Sample size
s_sq <- var(x)          # Sample variance

# Compute chi-square test statistic
chi_sq_stat <- (n - 1) * s_sq / sigma0_sq

# Compute p-value for right-tailed test
PartF <- 1 - pchisq(chi_sq_stat, df = n - 1)

# Print result
print(PartF)

[1] 0.04884046
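
The statistic computed above follows the usual chi-square form for a test on a single variance, which assumes the thickness measurements come from a normal population. As a worked sketch with the 12 observations (sample variance rounded to four decimals):

$$\chi_0^2 = \frac{(n-1)\,s^2}{\sigma_0^2} = \frac{11 \times 0.8979}{0.5} \approx 19.75, \qquad P = P\left(\chi^2_{11} > \chi_0^2\right)$$

The P-value is the upper-tail area because $H_1$ states that the variance exceeds 0.5.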


#### Part G
A wildlife biologist is conducting an experiment on the sex of individuals in a population of sea turtles.  In a balanced population, 50% of the individuals are female (and 50% are male).  The biologist suspects that the proportion of females is significantly above 50% in this particular population.  Thus, she sets up the following hypothesis test:

$H_0: p=0.5$\
$H_1: p > 0.5$

Out of 35 sea turtles that she observes the sex of, 21 of them are female.

Use the exact binomial test to calculate a P-value for the hypothesis test above.  Store your answer in the variable `PartG` and make sure that your result is a single scalar value (e.g., not a vector or array).

HINT: For information on how to store the P-value as a single value, see the "Hypothesis Tests on a Binomial Proportion" section of the Week 4 Cheat Sheet.

In [7]:
# Given data
n <- 35   # Total number of turtles observed
x <- 21   # Number of females observed
p_null <- 0.5  # Null hypothesis proportion

# Perform exact binomial test (right-tailed)
result <- binom.test(x, n, p = p_null, alternative = "greater")

# Extract p-value
PartG <- result$p.value

# Print result
print(PartG)


[1] 0.1552523
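
The exact binomial P-value for a right-tailed test is the upper tail probability P(X >= 21) under Binomial(35, 0.5). A minimal sketch that computes it directly and compares it with `binom.test()` follows; `p_sum`, `p_tail`, and `p_ref` are illustrative names.

```r
n <- 35
x <- 21

# Upper tail P(X >= 21) under Binomial(35, 0.5), two equivalent ways
p_sum  <- sum(dbinom(x:n, size = n, prob = 0.5))
p_tail <- pbinom(x - 1, size = n, prob = 0.5, lower.tail = FALSE)

# Reference value from binom.test
p_ref <- binom.test(x, n, p = 0.5, alternative = "greater")$p.value

print(c(p_sum, p_tail, p_ref))
```

Note the `x - 1` in `pbinom()`: with `lower.tail = FALSE` it returns P(X > q), so the observed count itself is included only when q is one less than x.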
