# Midterm Exam

## Chapter 14: Nonparametric Tests

### 14.1 Sign Test (Single-Sample Case)

In [3]:
import numpy as np
import math
from scipy import stats

In [28]:
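def stats_median(vec: np.ndarray, m0: float, level: float, kind: str = '!=') -> tuple[float, str, str]:
    """Exact sign test of H0: median = m0 against H1: median `kind` m0."""
    # Keep only the nonzero differences; observations equal to m0 are dropped.
    diff = vec - m0
    sign = diff[diff != 0] > 0
    n = int(np.size(sign))
    r = int(sign.sum())  # R+, the number of positive differences

    # Under H0, R+ ~ Binomial(n, 0.5), so each outcome x has probability C(n, x) * 0.5 ** n.
    upper = sum(math.comb(n, x) * 0.5 ** n for x in range(r, n + 1))  # P(R+ >= r)
    lower = sum(math.comb(n, x) * 0.5 ** n for x in range(r + 1))     # P(R+ <= r), includes x = r

    if kind == '<':
        p_value = lower  # few positive signs support median < m0
    elif kind == '>':
        p_value = upper  # many positive signs support median > m0
    else:
        p_value = min(1.0, 2 * min(lower, upper))  # two-sided, capped at 1

    return p_value, f'm {kind} {m0}', 'Reject' if p_value < level else 'Not reject'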


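def stats_median_with_norm(vec: np.ndarray, m0: float, level: float, kind: str = '!=') -> tuple[float, str, str]:
    """Normal approximation to the sign test (no continuity correction)."""
    # Keep only the nonzero differences; observations equal to m0 are dropped.
    diff = vec - m0
    sign = diff[diff != 0] > 0
    n = np.size(sign)
    r = sign.sum()  # R+, the number of positive differences

    # Under H0, R+ has mean 0.5 * n and standard deviation 0.5 * sqrt(n).
    z = (r - 0.5 * n) / (0.5 * math.sqrt(n))

    if kind == '<':
        p_value = stats.norm.cdf(z)  # lower tail
    elif kind == '>':
        p_value = 1 - stats.norm.cdf(z)  # upper tail
    else:
        p_value = 2 * (1 - stats.norm.cdf(abs(z)))  # both tails

    return p_value, f'm {kind} {m0}', 'Reject' if p_value < level else 'Not reject'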


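def test(df: np.ndarray, muy0: float, level: float, kind: str = '!=') -> None:
    """Print the exact sign test and, when the sample is large enough, the normal approximation."""
    temp = df - muy0
    n = np.size(temp[temp != 0])  # number of nonzero differences
    print(stats_median(df, muy0, level, kind))
    # The normal approximation is reported only when at least 10 nonzero differences remain.
    if n >= 10:
        print(f'With normal distribution: {stats_median_with_norm(df, muy0, level, kind)}')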
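
For reference, the quantities behind the sign test can be written out as follows. This is a sketch in common textbook notation, and the symbols are assumptions introduced here rather than names from the notebook: $R^+$ is the number of positive differences, $r$ its observed value, and $n$ the number of nonzero differences. Under $H_0$, $R^+$ follows a Binomial$(n, 0.5)$ distribution.

```latex
\begin{align*}
p_{<}   &= P(R^+ \le r) = \sum_{x=0}^{r} \binom{n}{x}\, 0.5^{n} \\
p_{>}   &= P(R^+ \ge r) = \sum_{x=r}^{n} \binom{n}{x}\, 0.5^{n} \\
p_{\ne} &= \min\bigl(1,\; 2\min(p_{<},\, p_{>})\bigr) \\
z       &= \frac{r - 0.5\,n}{0.5\sqrt{n}}
\end{align*}
```

For instance, with $n = 10$ and $r = 8$, the two-sided exact p-value is $2 \times 56/1024 = 0.109375$. The $z$ statistic here carries no continuity correction.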

#### Exercise 14.1
Ten samples were taken from a plating bath used in an electronics
manufacturing process, and the pH of the bath was determined. The sample pH
values are 7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, and 7.42. Manufacturing
engineering believes that pH has a median value of 7.0.
a. Do the sample data indicate that this statement is correct? Use the sign test with
α = 0.05 to investigate this hypothesis. Find the p-value for this test.
b. Use the normal approximation for the sign test to test H0: µ = 7.0 versus
H1: µ ≠ 7.0. What is the p-value for this test?

In [29]:
data = np.array([7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42])
muy = 7.0
alpha = 0.05
test(data, muy, alpha)

(0.109375, 'm != 7.0', 'Not reject')
With normal distribution: (0.05777957112359733, 'm != 7.0', 'Not reject')
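
As a rough cross-check of the exact p-value above, SciPy's `stats.binomtest` (assumed available, SciPy 1.7 or later) can be applied to the count of positive differences. The helper name `sign_test_scipy` is hypothetical and not part of the original notebook; this is only a sketch.

```python
import numpy as np
from scipy import stats


def sign_test_scipy(vec, m0, kind='!='):
    """Hypothetical helper: exact sign test via scipy.stats.binomtest."""
    diff = np.asarray(vec) - m0
    diff = diff[diff != 0]        # drop observations equal to m0
    n = int(diff.size)            # number of nonzero differences
    r = int((diff > 0).sum())     # number of positive differences
    # Map the notebook's `kind` strings onto SciPy's `alternative` names.
    alternative = {'<': 'less', '>': 'greater', '!=': 'two-sided'}[kind]
    return stats.binomtest(r, n, p=0.5, alternative=alternative).pvalue


ph = np.array([7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42])
print(sign_test_scipy(ph, 7.0))
```

With 8 positive differences out of 10, the two-sided result should agree with the exact value 0.109375 up to floating-point rounding.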


#### Exercise 14.2
The titanium content in an aircraft-grade alloy is an important determinant of strength. A sample of 20 test coupons reveals the following titanium content
(in percent): 8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41, 8.42, 8.30, 8.71,
8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46. The median titanium content should be 8.5%.
a. Use the sign test with α = 0.05 to investigate this hypothesis. Find the p-value
for this test.
b. Use the normal approximation for the sign test to test H0: µ = 8.5 versus
H1: µ ≠ 8.5 with α = 0.05. What is the p-value for this test?

In [30]:
data = np.array([8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41, 8.42, 8.30, 8.71,
                 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46])
muy = 8.5
alpha = 0.05
test(data, muy, alpha)

(0.359283447265625, 'm != 8.5', 'Not reject')
With normal distribution: (0.25134910881022265, 'm != 8.5', 'Not reject')


#### Exercise 14.3
The impurity level (in ppm) is routinely measured in an intermediate
chemical product. The following data were observed in a recent test: 2.4, 2.5, 1.7, 1.6,
1.9, 2.6, 1.3, 1.9, 2.0, 2.5, 2.6, 2.3, 2.0, 1.8, 1.3, 1.7, 2.0, 1.9, 2.3, 1.9, 2.4, 1.6. Can you
claim that the median impurity level is less than 2.5 ppm?
a. State and test the appropriate hypothesis using the sign test with α = 0.05. What
is the p-value for this test?
b. Use the normal approximation for the sign test to test H0: µ = 2.5 versus
H1: µ < 2.5 with α = 0.05. What is the p-value for this test?

In [31]:
data = np.array([2.4, 2.5, 1.7, 1.6, 1.9, 2.6, 1.3, 1.9, 2.0, 2.5, 2.6,
                 2.3, 2.0, 1.8, 1.3, 1.7, 2.0, 1.9, 2.3, 1.9, 2.4, 1.6])
muy = 2.5
alpha = 0.05
test(data, muy, alpha, '<')

(0.00020122528076171875, 'm < 2.5', 'Reject')
With normal distribution: (0.0001733096755673334, 'm < 2.5', 'Reject')


### 14.2 Wilcoxon Signed-Rank Test for Paired Samples

#### Exercise 14.4
An inspector measured the diameter of a ball bearing using a new
type of caliper. The results were as follows (in mm): 0.265, 0.263, 0.266, 0.267, 0.267,
0.265, 0.267, 0.267, 0.265, 0.268, 0.268, and 0.263.
a. Use the Wilcoxon signed-rank test to evaluate the claim that the mean ball diameter is 0.265 mm. Use α = 0.05.
b. Use the normal approximation for the test. With α = 0.05, what conclusions can
you draw?
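
A minimal sketch of how the signed-rank statistics for this exercise might be computed is given below. The function name `signed_rank_normal` and its `scale` parameter are hypothetical, introduced only for illustration. The sketch assumes that differences are rounded to whole thousandths of a millimetre so that tied absolute values are detected reliably, that zero differences are dropped, and that the normal approximation uses neither a tie correction nor a continuity correction.

```python
import math

import numpy as np
from scipy import stats


def signed_rank_normal(vec, m0, scale=1000):
    """Hypothetical helper: Wilcoxon signed-rank sums and a two-sided normal approximation."""
    # Work in integer units (here 0.001 mm) so equal |differences| tie exactly.
    diff = np.round((np.asarray(vec) - m0) * scale).astype(int)
    diff = diff[diff != 0]                   # drop zero differences
    n = int(diff.size)
    ranks = stats.rankdata(np.abs(diff))     # average ranks for ties
    w_plus = float(ranks[diff > 0].sum())    # sum of ranks of positive differences
    w_minus = float(ranks[diff < 0].sum())   # sum of ranks of negative differences
    # Under H0, W+ has mean n(n+1)/4 and variance n(n+1)(2n+1)/24.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = float(2 * (1 - stats.norm.cdf(abs(z))))
    return n, w_plus, w_minus, z, p_value


diameters = np.array([0.265, 0.263, 0.266, 0.267, 0.267, 0.265,
                      0.267, 0.267, 0.265, 0.268, 0.268, 0.263])
n, w_plus, w_minus, z, p_value = signed_rank_normal(diameters, 0.265)
print(n, w_plus, w_minus, round(z, 3), round(p_value, 3))
```

For these data the sketch gives n = 9 nonzero differences with W+ = 36 and W- = 9. For part a, min(W+, W-) would be compared with the tabulated critical value for n = 9 at α = 0.05. For part b, z is about 1.60 and the two-sided p-value is about 0.11.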