# Lab Task 1

**A teacher claims that the average score of students in his class is 75.**  

A sample of 10 students is taken, and their scores are as follows:

`70, 80, 85, 90, 75, 60, 80, 85, 90, 95`

We want to test whether the mean score of this sample differs significantly from the claimed population mean of 75, that is, H₀: μ = 75 against Hₐ: μ ≠ 75 (a two-tailed one-sample t-test).  

---

### Steps to Solve:

1. Calculate the sample mean and standard deviation.
2. Perform a t-test to compare the sample mean with the population mean.
3. Determine the critical t-value for a two-tailed test with α = 0.05.
4. Calculate the p-value.
5. Make a decision based on the t-statistic, critical value, and p-value.
6. Provide a conclusion regarding the teacher's claim.

---



In [2]:
import numpy as np
from scipy import stats

# Given data
scores = [70, 80, 85, 90, 75, 60, 80, 85, 90, 95]
population_mean = 75

# Calculate sample mean and standard deviation
sample_mean = np.mean(scores)
sample_std = np.std(scores, ddof=1)  # ddof=1 (delta degrees of freedom) divides by n - 1, giving the sample standard deviation
n = len(scores)

# Calculate t-statistic
t_statistic = (sample_mean - population_mean) / (sample_std / np.sqrt(n))

#  Degrees of freedom
df = n - 1

# Critical t-value (for two-tailed test with alpha=0.05)
# stats.t.ppf is the percent-point function (inverse CDF) of the t-distribution.
alpha = 0.05
t_critical = stats.t.ppf(1 - alpha / 2, df)


# Output results
print(f"Sample Mean: {sample_mean}")
print(f"Sample Standard Deviation: {sample_std}")
print(f"T-statistic: {t_statistic}")
print(f"Critical T-value: {t_critical}")

# Decision
if abs(t_statistic) > t_critical:
    print("Reject the null hypothesis: The sample mean is significantly different from the population mean.")
else:
    print("Fail to reject the null hypothesis: The sample mean is not significantly different from the population mean.")

Sample Mean: 81.0
Sample Standard Deviation: 10.488088481701515
T-statistic: 1.8090680674665818
Critical T-value: 2.2621571628540993
Fail to reject the null hypothesis: The sample mean is not significantly different from the population mean.
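
Step 4 of this task calls for a p-value, which the cell above does not print. As a minimal sketch, `scipy.stats.ttest_1samp` returns both the t-statistic and the two-sided p-value for the same data. The variable names `t_stat` and `p_value` are illustrative and not part of the original cell.

```python
from scipy import stats

# Same data and claimed mean as in the cell above
scores = [70, 80, 85, 90, 75, 60, 80, 85, 90, 95]
population_mean = 75
alpha = 0.05

# One-sample, two-sided t-test against the claimed population mean
t_stat, p_value = stats.ttest_1samp(scores, popmean=population_mean)

print(f"T-statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4f}")

# Decision rule based on the p-value instead of the critical value
if p_value < alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```

Since the t-statistic (about 1.81) is smaller than the critical value (about 2.26), the p-value is larger than 0.05, so the two decision rules agree.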


# Lab Task 2

**A researcher wants to compare the average heights of male and female students in a college.**  

Two random samples are selected:

- **Sample 1 (Male Students)**:  
  `170, 165, 180, 175, 160, 172, 168, 177, 165, 180`

- **Sample 2 (Female Students)**:  
  `160, 155, 150, 158, 165, 157, 162, 155, 160, 158`

We aim to test whether there is a significant difference in the average heights of male and female students at the \( \alpha = 0.05 \) significance level.

---

### Steps to Solve:

1. Calculate the sample means and standard deviations for both groups.
2. Perform a two-sample \( t \)-test (independent samples).
3. Calculate the \( t \)-statistic and \( p \)-value.
4. Compare the \( p \)-value with the significance level (\( \alpha \)).
5. Determine if there is a significant difference in the average heights.

---

### Hypothesis:

- **Null Hypothesis (\(H_0\))**: There is no significant difference in the average heights of male and female students.  
  \[
  H_0: \mu_{\text{male}} = \mu_{\text{female}}
  \]

- **Alternative Hypothesis (\(H_a\))**: There is a significant difference in the average heights of male and female students.  
  \[
  H_a: \mu_{\text{male}} \neq \mu_{\text{female}}
  \]

---

### Data:

- **Sample 1 (Male Students)**: \(170, 165, 180, 175, 160, 172, 168, 177, 165, 180\)  
- **Sample 2 (Female Students)**: \(160, 155, 150, 158, 165, 157, 162, 155, 160, 158\)  
- **Significance Level (\(\alpha\))**: \(0.05\)  

---

### Results:

1. **Sample Statistics**:  
   - Male Students: Mean = \( \_\_\_ \), Std. Dev = \( \_\_\_ \)  
   - Female Students: Mean = \( \_\_\_ \), Std. Dev = \( \_\_\_ \)

2. **T-test Results**:  
   - \( t \)-statistic = \( \_\_\_ \)  
   - \( p \)-value = \( \_\_\_ \)

3. **Decision**:  
   - Reject \( H_0 \) if \( p \)-value \( < \alpha \).  
   - Otherwise, fail to reject \( H_0 \).

---

### Conclusion:

Include your calculated results and final conclusion regarding whether there is a significant difference in the average heights of male and female students.


In [1]:
import numpy as np
from scipy import stats

# Data: Heights of Male and Female Students
male_heights = [170, 165, 180, 175, 160, 172, 168, 177, 165, 180]
female_heights = [160, 155, 150, 158, 165, 157, 162, 155, 160, 158]

# Calculate sample means and standard deviations
mean_male = np.mean(male_heights)
std_male = np.std(male_heights, ddof=1)
mean_female = np.mean(female_heights)
std_female = np.std(female_heights, ddof=1)

# Number of samples
n_male = len(male_heights)
n_female = len(female_heights)

# Perform two-sample t-test (independent samples, unequal variance assumed)
t_statistic, _ = stats.ttest_ind(male_heights, female_heights, equal_var=False)

# Degrees of freedom calculation for unequal variance
df = ((std_male**2 / n_male + std_female**2 / n_female)**2) / \
     ((std_male**2 / n_male)**2 / (n_male - 1) + (std_female**2 / n_female)**2 / (n_female - 1))

# Critical t-value (for two-tailed test with alpha=0.05)
alpha = 0.05
t_critical = stats.t.ppf(1 - alpha / 2, df)

# Results
print(f"No of Male Students: {n_male}")
print(f"No of Female Students: {n_female}")
print(f"Mean Height (Male Students): {mean_male:.2f}")
print(f"Standard Deviation (Male Students): {std_male:.2f}")
print(f"Mean Height (Female Students): {mean_female:.2f}")
print(f"Standard Deviation (Female Students): {std_female:.2f}")
print(f"T-Statistic: {t_statistic:.2f}")
print(f"Critical T-Value: {t_critical:.2f}")
print(f"Degree of Freedom : {df}")

# Decision Rule
if abs(t_statistic) > t_critical:
    print("Reject the null hypothesis: The average heights are significantly different.")
else:
    print("Fail to reject the null hypothesis: The average heights are not significantly different.")

No of Male Students: 10
No of Female Students: 10
Mean Height (Male Students): 171.20
Standard Deviation (Male Students): 6.81
Mean Height (Female Students): 158.00
Standard Deviation (Female Students): 4.16
T-Statistic: 5.23
Critical T-Value: 2.13
Degree of Freedom : 14.90069853047738
Reject the null hypothesis: The average heights are significantly different.
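
The steps for this task ask for a p-value, but the cell above discards the second value returned by `stats.ttest_ind`. The sketch below keeps it and applies the p-value decision rule. It assumes the same unequal-variance (Welch) test as the original cell, and the name `p_value` is illustrative.

```python
from scipy import stats

# Same samples as in the cell above
male_heights = [170, 165, 180, 175, 160, 172, 168, 177, 165, 180]
female_heights = [160, 155, 150, 158, 165, 157, 162, 155, 160, 158]
alpha = 0.05

# Welch's two-sample t-test (two-sided); keep the p-value this time
t_statistic, p_value = stats.ttest_ind(male_heights, female_heights, equal_var=False)

print(f"T-Statistic: {t_statistic:.2f}")
print(f"P-Value: {p_value:.6f}")

# Decision rule: reject H0 when the p-value is below alpha
if p_value < alpha:
    print("Reject the null hypothesis: The average heights are significantly different.")
else:
    print("Fail to reject the null hypothesis: The average heights are not significantly different.")
```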


# Lab Question 3

A company wants to assess whether a new training program improves employee productivity. The productivity of 12 employees is measured before and after attending the training program. The scores (measured in units of productivity) are as follows:

- **Before Training**: 40, 50, 60, 45, 55, 48, 62, 49, 41, 53, 47, 59
- **After Training**: 45, 55, 65, 50, 58, 52, 67, 54, 43, 58, 50, 62

We aim to test whether the training program has a significant effect on productivity at a significance level of 0.05.

---

### Steps to Solve:

1. Calculate the difference between the productivity scores before and after the training for each employee.
2. Compute the mean and standard deviation of the differences.
3. Formulate the hypotheses:
   - **Null Hypothesis (\(H_0\))**: The training program has no significant effect on productivity. \(H_0: \mu_d = 0\)
   - **Alternative Hypothesis (\(H_a\))**: The training program has a significant effect on productivity. \(H_a: \mu_d \neq 0\)
4. Calculate the t-statistic using the formula:
   
   \[
   t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}
   \]
   Where:
   - \(\bar{d}\) is the mean of the differences.
   - \(s_d\) is the standard deviation of the differences.
   - \(n\) is the number of paired observations.

5. Determine the degrees of freedom (\(df = n - 1\)).
6. Find the critical t-value for a two-tailed test with \(\alpha = 0.05\).
7. Compare the t-statistic with the critical t-value or calculate the p-value.
8. Make a decision:
   - Reject \(H_0\) if \(|t| > t_{critical}\) or if \(p < \alpha\).
   - Otherwise, fail to reject \(H_0\).

9. Provide a conclusion regarding the effectiveness of the training program.

---

### Data:

- **Before Training**: 40, 50, 60, 45, 55, 48, 62, 49, 41, 53, 47, 59
- **After Training**: 45, 55, 65, 50, 58, 52, 67, 54, 43, 58, 50, 62
- **Significance Level**: \(\alpha = 0.05\)

---

### Results:

1. **Differences**:
   \[d = \text{After Training} - \text{Before Training}\]

2. **Statistics**:
   - Mean Difference (\(\bar{d}\)): \(\_\_\_\)
   - Standard Deviation of Differences (\(s_d\)): \(\_\_\_\)

3. **T-test Results**:
   - t-statistic = \(\_\_\_\)
   - Degrees of Freedom = \(\_\_\_\)
   - Critical t-value = \(\_\_\_\)
   - p-value = \(\_\_\_\)

4. **Decision**:
   - Reject \(H_0\) if \(p < \alpha\).
   - Otherwise, fail to reject \(H_0\).

---

### Conclusion:

Based on the analysis, state whether the training program has a significant effect on employee productivity and provide reasoning based on the test results.


In [12]:
import numpy as np
from scipy import stats

# Data: Productivity before and after training
before_training = [40, 50, 60, 45, 55, 48, 62, 49, 41, 53, 47, 59]
after_training = [45, 55, 65, 50, 58, 52, 67, 54, 43, 58, 50, 62]

# Calculate the differences (After - Before)
differences = np.subtract(after_training, before_training)

# Calculate sample mean and standard deviation of differences
mean_diff = np.mean(differences)
std_diff = np.std(differences, ddof=1)

# Number of paired samples
n = len(differences)

# Calculate t-statistic
t_statistic = mean_diff / (std_diff / np.sqrt(n))

# Degrees of freedom
df = n - 1

# Critical t-value (for two-tailed test with alpha=0.05)
alpha = 0.05
t_critical = stats.t.ppf(1 - alpha / 2, df)

# Results
print(f"Mean of Differences: {mean_diff:.2f}")
print(f"Standard Deviation of Differences: {std_diff:.2f}")
print(f"T-Statistic: {t_statistic:.2f}")
print(f"Critical T-Value: {t_critical:.2f}")

# Decision Rule
if abs(t_statistic) > t_critical:
    print("Reject the null hypothesis: The training program has a significant effect on productivity.")
else:
    print("Fail to reject the null hypothesis: The training program does not have a significant effect on productivity.")


Mean of Differences: 4.17
Standard Deviation of Differences: 1.11
T-Statistic: 12.95
Critical T-Value: 2.20
Reject the null hypothesis: The training program has a significant effect on productivity.
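
The manual calculation above can be cross-checked with `scipy.stats.ttest_rel`, which runs the paired t-test directly and also returns the p-value requested in the results template. This is a minimal sketch on the same data. Passing `after_training` first matches the convention d = After - Before, so the t-statistic is positive.

```python
from scipy import stats

# Same paired measurements as in the cell above
before_training = [40, 50, 60, 45, 55, 48, 62, 49, 41, 53, 47, 59]
after_training = [45, 55, 65, 50, 58, 52, 67, 54, 43, 58, 50, 62]
alpha = 0.05

# Paired (dependent-samples) two-sided t-test on After - Before
t_statistic, p_value = stats.ttest_rel(after_training, before_training)

print(f"T-Statistic: {t_statistic:.2f}")
print(f"P-Value: {p_value:.2e}")

# Decision rule: reject H0 when the p-value is below alpha
if p_value < alpha:
    print("Reject the null hypothesis: The training program has a significant effect on productivity.")
else:
    print("Fail to reject the null hypothesis: The training program does not have a significant effect on productivity.")
```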


### Question 4: Testing the Manufacturer's Claim About Light Bulb Lifetime

#### Problem:
A manufacturer claims that the average lifetime of a light bulb is 1,000 hours. A random sample of 50 light bulbs shows a sample mean lifetime of 980 hours. The population standard deviation is known to be 80 hours. Test the manufacturer's claim at the 0.05 significance level.

#### Given Data:
- Population mean (μ): 1,000 hours
- Sample mean (X̄): 980 hours
- Population standard deviation (σ): 80 hours
- Sample size (n): 50
- Significance level (α): 0.05

#### Hypotheses:
- Null hypothesis (H₀): μ = 1,000 (The manufacturer's claim is accurate.)
- Alternative hypothesis (H₁): μ ≠ 1,000 (The manufacturer's claim is not accurate.)

#### Test Statistic:
The test statistic for a Z-test is calculated as:

\[
Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}
\]

Substitute the values:

\[
Z = \frac{980 - 1000}{\frac{80}{\sqrt{50}}}
\]

#### Decision Rule:
- Critical value for a two-tailed test at α = 0.05: ±1.96
- If |Z| > 1.96, reject the null hypothesis.

#### Calculation:
\[
Z = \frac{-20}{\frac{80}{\sqrt{50}}} = \frac{-20}{11.31} \approx -1.77
\]

#### Conclusion:
- Since |Z| = 1.77 < 1.96, we fail to reject the null hypothesis.
- **At the 0.05 significance level, there is not enough evidence to conclude that the manufacturer's claim is inaccurate.**
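
The hand calculation for this question can be reproduced in a few lines. The sketch below is a one-sample Z-test computed from the given summary values. The variable names (`mu_0`, `x_bar`, `sigma`) are illustrative, and the two-sided p-value is an addition that the worked solution does not report.

```python
import numpy as np
from scipy import stats

# Given data for the light bulb problem
mu_0 = 1000    # claimed population mean (hours)
x_bar = 980    # sample mean (hours)
sigma = 80     # known population standard deviation (hours)
n = 50         # sample size
alpha = 0.05

# Z-statistic for a one-sample test with known sigma
z_statistic = (x_bar - mu_0) / (sigma / np.sqrt(n))

# Two-tailed critical value and p-value from the standard normal distribution
z_critical = stats.norm.ppf(1 - alpha / 2)
p_value = 2 * stats.norm.sf(abs(z_statistic))

print(f"Z-Statistic: {z_statistic:.2f}")
print(f"Critical Z-Value: {z_critical:.2f}")
print(f"P-Value: {p_value:.4f}")

if abs(z_statistic) > z_critical:
    print("Reject the null hypothesis: The mean lifetime differs from 1,000 hours.")
else:
    print("Fail to reject the null hypothesis: No significant difference from 1,000 hours.")
```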


In [16]:
import numpy as np
from scipy import stats

# Given data for the two groups in Question 5 (Method A and Method B)
mean_1 = 75
std_1 = 10
n_1 = 30

mean_2 = 70
std_2 = 12
n_2 = 35
alpha = 0.05

# Z-Test Calculation
z_statistic = (mean_1 - mean_2) / np.sqrt((std_1**2 / n_1) + (std_2**2 / n_2))

# Critical Z-value (for two-tailed test with alpha=0.05)
z_critical = stats.norm.ppf(1 - alpha / 2)

# Results for Z-Test
print(f"Z-Statistic: {z_statistic:.2f}")
print(f"Critical Z-Value: {z_critical:.2f}")

# Decision Rule for Z-Test
if abs(z_statistic) > z_critical:
    print("Reject the null hypothesis: There is a significant difference between the two groups (Method A and Method B).")
else:
    print("Fail to reject the null hypothesis: There is no significant difference between the two groups (Method A and Method B).")


Z-Statistic: 1.83
Critical Z-Value: 1.96
Fail to reject the null hypothesis: There is no significant difference between the two groups (Method A and Method B).


### Question 5: Comparing Average Scores of Two Groups

#### Problem:
A researcher wants to compare the average scores of two groups of students who were taught using different teaching methods.

- Group 1 (Method A): X̄₁ = 75, n₁ = 30, σ₁ = 10
- Group 2 (Method B): X̄₂ = 70, n₂ = 35, σ₂ = 12

The researcher wants to test whether there is a significant difference between the two groups at the α = 0.05 significance level.

#### Given Data:
- Mean of Group 1 (X̄₁): 75
- Sample size of Group 1 (n₁): 30
- Standard deviation of Group 1 (σ₁): 10
- Mean of Group 2 (X̄₂): 70
- Sample size of Group 2 (n₂): 35
- Standard deviation of Group 2 (σ₂): 12
- Significance level (α): 0.05

#### Hypotheses:
- Null hypothesis (H₀): μ₁ = μ₂ (There is no significant difference between the two groups.)
- Alternative hypothesis (H₁): μ₁ ≠ μ₂ (There is a significant difference between the two groups.)

#### Test Statistic:
The test statistic for a two-sample Z-test is calculated as:

\[
Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}
\]

Substitute the values:

\[
Z = \frac{75 - 70}{\sqrt{\frac{10^2}{30} + \frac{12^2}{35}}}
\]

#### Decision Rule:
- Critical value for a two-tailed test at α = 0.05: ±1.96
- If |Z| > 1.96, reject the null hypothesis.

#### Calculation:
\[
Z = \frac{5}{\sqrt{\frac{100}{30} + \frac{144}{35}}} = \frac{5}{\sqrt{3.33 + 4.11}} = \frac{5}{\sqrt{7.44}} = \frac{5}{2.73} \approx 1.83
\]

#### Conclusion:
- Since |Z| = 1.83 < 1.96, we fail to reject the null hypothesis.
- **At the 0.05 significance level, there is not enough evidence to conclude that there is a significant difference between the two groups.**


In [17]:
# Welch's t-test (unequal variances) from summary statistics.
# Reuses np, stats, mean_1, std_1, n_1, mean_2, std_2, n_2 and alpha from the Z-test cell.
# Only the given standard deviations are available, so they are used here
# as approximations of the sample standard deviations.

sample_std_1 = std_1  # For Method A
sample_std_2 = std_2  # For Method B

# Calculating the t-statistic for unequal variance
t_statistic_2 = (mean_1 - mean_2) / np.sqrt((sample_std_1**2 / n_1) + (sample_std_2**2 / n_2))

# Degrees of freedom
df = ((sample_std_1**2 / n_1 + sample_std_2**2 / n_2)**2) / \
     ((sample_std_1**2 / n_1)**2 / (n_1 - 1) + (sample_std_2**2 / n_2)**2 / (n_2 - 1))

# Critical t-value (for two-tailed test with alpha=0.05)
t_critical_2 = stats.t.ppf(1 - alpha / 2, df)

# Results for T-Test
print(f"T-Statistic: {t_statistic_2:.2f}")
print(f"Degrees of Freedom: {df:.2f}")
print(f"Critical T-Value: {t_critical_2:.2f}")

# Decision Rule for T-Test
if abs(t_statistic_2) > t_critical_2:
    print("Reject the null hypothesis: There is a significant difference between the two groups (Method A and Method B).")
else:
    print("Fail to reject the null hypothesis: There is no significant difference between the two groups (Method A and Method B).")


T-Statistic: 1.83
Degrees of Freedom: 62.96
Critical T-Value: 2.00
Fail to reject the null hypothesis: There is no significant difference between the two groups (Method A and Method B).
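
To put the Z-test and the t-test for this question side by side, the sketch below computes a two-sided p-value for each. It uses `scipy.stats.ttest_ind_from_stats`, which runs the two-sample t-test from summary statistics. As in the cell above, the given standard deviations are treated as sample standard deviations, which is an assumption rather than something stated in the problem.

```python
import numpy as np
from scipy import stats

# Summary statistics for Method A and Method B
mean_1, std_1, n_1 = 75, 10, 30
mean_2, std_2, n_2 = 70, 12, 35
alpha = 0.05

# Two-sample Z-test: statistic and two-sided p-value from the normal distribution
z_statistic = (mean_1 - mean_2) / np.sqrt(std_1**2 / n_1 + std_2**2 / n_2)
p_value_z = 2 * stats.norm.sf(abs(z_statistic))

# Welch's t-test from summary statistics (unequal variances)
t_statistic, p_value_t = stats.ttest_ind_from_stats(
    mean_1, std_1, n_1, mean_2, std_2, n_2, equal_var=False
)

print(f"Z-Statistic: {z_statistic:.2f}, P-Value: {p_value_z:.4f}")
print(f"T-Statistic: {t_statistic:.2f}, P-Value: {p_value_t:.4f}")

# Both tests use the same decision rule on the p-value
for name, p in [("Z-test", p_value_z), ("T-test", p_value_t)]:
    decision = "Reject" if p < alpha else "Fail to reject"
    print(f"{name}: {decision} the null hypothesis.")
```

The two statistics are identical because both divide the mean difference by the same standard error. Only the reference distribution differs, so the t-test gives a slightly larger p-value.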
