Question - 1

```In the population, the average IQ is 100 with a standard deviation of 15. A team of scientists wants to test a new medication to see if it has either a positive or negative effect on intelligence, or no effect at all. A sample of 30 participants who have taken the medication has a mean of 140. Did the medication affect intelligence?```

In [1]:
from scipy import stats

# Population parameters
population_mean = 100
population_std_dev = 15

# Sample parameters
sample_size = 30
sample_mean = 140

# Calculate the standard error of the mean (SEM)
sem = population_std_dev / (sample_size ** 0.5)

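# Calculate the test statistic. Because the SEM uses the known population
# standard deviation, this is strictly a z-statistic; it is compared against
# a t critical value below, which is slightly more conservative.
t_statistic = (sample_mean - population_mean) / sem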

# Degrees of freedom for a one-sample t-test
degrees_of_freedom = sample_size - 1

# Set the significance level (alpha)
alpha = 0.05

# Calculate the critical t-value for a two-tailed test
critical_t_value = stats.t.ppf(1 - alpha / 2, degrees_of_freedom)

# Print the results
print(f"T-Statistic: {t_statistic}")
print(f"Critical T-Value: {critical_t_value}")

# Compare the t-statistic to the critical t-value
if abs(t_statistic) > critical_t_value:
    print("The medication has a significant effect on intelligence.")
else:
    print("There is no significant effect of the medication on intelligence.")


T-Statistic: 14.60593486680443
Critical T-Value: 2.045229642132703
The medication has a significant effect on intelligence.
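
Since the population standard deviation is given, the same question can also be framed as a z-test. The following is a minimal sketch of that variant, not part of the original solution; the variable names `z_statistic`, `critical_z_value`, and `p_value` are introduced here for illustration only.

```python
from math import sqrt

from scipy import stats

# Values taken from the question statement
population_mean = 100
population_std_dev = 15
sample_size = 30
sample_mean = 140
alpha = 0.05

# Standard error of the mean, using the known population standard deviation
sem = population_std_dev / sqrt(sample_size)

# z-statistic (numerically identical to the statistic computed above)
z_statistic = (sample_mean - population_mean) / sem

# Two-tailed critical value and p-value from the standard normal distribution
critical_z_value = stats.norm.ppf(1 - alpha / 2)
p_value = 2 * stats.norm.sf(abs(z_statistic))

print(f"Z-Statistic: {z_statistic}")
print(f"Critical Z-Value: {critical_z_value}")
print(f"P-Value: {p_value}")
```

With a statistic this far into the tail, the z-based and t-based decisions agree.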


Question - 2

```A professor wants to know if her introductory statistics class has a good grasp of basic math. Six students are chosen at random from the class and given a math proficiency test. The professor wants the class to be able to score above 70 on the test. The six students get the following scores: 62, 92, 75, 68, 83, 95. Can the professor have 90% confidence that the mean score for the class on the test would be above 70?```

In [2]:
from scipy import stats
import numpy as np

# Given data
scores = np.array([62, 92, 75, 68, 83, 95])

# Sample parameters
sample_size = len(scores)
sample_mean = np.mean(scores)
population_mean_threshold = 70

# Calculate the standard error of the mean (SEM)
sem = np.std(scores, ddof=1) / (sample_size ** 0.5)

# Calculate the t-statistic
t_statistic = (sample_mean - population_mean_threshold) / sem

# Degrees of freedom for a one-sample t-test
degrees_of_freedom = sample_size - 1

# Set the confidence level
confidence_level = 0.90

# Calculate the critical t-value for a one-tailed test
critical_t_value = stats.t.ppf(1 - (1 - confidence_level), degrees_of_freedom)

# Print the results
print(f"T-Statistic: {t_statistic}")
print(f"Critical T-Value: {critical_t_value}")

# Compare the t-statistic to the critical t-value
if t_statistic > critical_t_value:
    print(f"The professor can have {confidence_level * 100}% confidence that the mean score is above 70.")
else:
    print("The professor cannot have 90% confidence that the mean score is above 70.")


T-Statistic: 1.705313636019149
Critical T-Value: 1.4758840487820273
The professor can have 90.0% confidence that the mean score is above 70.
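
As a cross-check, the same one-tailed test can be run with `scipy.stats.ttest_1samp`, and the decision can be restated as a one-sided lower confidence bound. This is a sketch that assumes SciPy 1.6 or later (for the `alternative` argument); the names `threshold`, `t_crit`, and `lower_bound` are illustrative and do not appear in the original cell.

```python
import numpy as np
from scipy import stats

scores = np.array([62, 92, 75, 68, 83, 95])
threshold = 70
confidence_level = 0.90

# One-sample, upper-tailed t-test: H0 mean <= 70, H1 mean > 70
result = stats.ttest_1samp(scores, popmean=threshold, alternative="greater")
t_stat, p_value = result.statistic, result.pvalue

# One-sided lower confidence bound for the class mean
sem = stats.sem(scores)  # sample standard error (ddof=1 by default)
t_crit = stats.t.ppf(confidence_level, df=len(scores) - 1)
lower_bound = scores.mean() - t_crit * sem

print(f"T-Statistic: {t_stat}")
print(f"One-sided P-Value: {p_value}")
print(f"{confidence_level:.0%} lower confidence bound: {lower_bound:.2f}")
```

A lower bound above 70 corresponds to the same conclusion as the comparison of the t-statistic with the critical value.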


Question - 3

```A clinic provides a program to help their clients lose weight and asks a consumer agency to investigate the effectiveness of the program. The agency takes a sample of 15 people, weighing each person in the sample before the program begins and 3 months later. The results are tabulated below. Determine if the program is effective.```

In [3]:
from scipy import stats
import pandas as pd

# Given data
data = {
    'Person': list(range(1, 16)),
    'Before': [210, 205, 193, 259, 239, 239, 164, 197, 222, 211, 187, 175, 186, 243, 246],
    'After': [197, 191, 174, 236, 226, 226, 157, 196, 201, 196, 181, 164, 181, 229, 231],
    'Difference': [13, 10, 19, 23, 13, 13, 7, 1, 21, 15, 5, 11, 5, 14, 15]
}

df = pd.DataFrame(data)

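# Perform a paired t-test (two-sided by default). The test works on the
# 'Before' and 'After' columns; the tabulated 'Difference' column is not used.
t_statistic, p_value = stats.ttest_rel(df['Before'], df['After'])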

# Set the significance level (alpha)
alpha = 0.05

# Print the results
print(f"T-Statistic: {t_statistic}")
print(f"P-Value: {p_value}")

# Compare the p-value to the significance level
if p_value < alpha:
    print("The weight loss program is effective.")
else:
    print("There is no significant evidence that the weight loss program is effective.")


T-Statistic: 8.165504377650214
P-Value: 1.0781423462615868e-06
The weight loss program is effective.
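
A paired t-test is a one-sample t-test on the per-person differences. The sketch below recomputes the statistic by hand from `before - after` and derives a one-sided p-value for the hypothesis that weight decreased. The names `differences`, `t_manual`, and `p_one_sided` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

before = np.array([210, 205, 193, 259, 239, 239, 164, 197, 222, 211, 187, 175, 186, 243, 246])
after = np.array([197, 191, 174, 236, 226, 226, 157, 196, 201, 196, 181, 164, 181, 229, 231])

# Per-person weight loss (positive values mean weight went down)
differences = before - after
n = len(differences)

# t = mean difference / standard error of the differences
mean_diff = differences.mean()
sem_diff = differences.std(ddof=1) / np.sqrt(n)
t_manual = mean_diff / sem_diff

# One-sided p-value for H1: mean difference > 0
p_one_sided = stats.t.sf(t_manual, df=n - 1)

# Reference values from SciPy's paired test (two-sided)
t_scipy, p_two_sided = stats.ttest_rel(before, after)

print(f"Manual T-Statistic: {t_manual}")
print(f"SciPy T-Statistic: {t_scipy}")
print(f"One-sided P-Value: {p_one_sided}")
```

Because the statistic is positive, the one-sided p-value is half of the two-sided value reported by `ttest_rel`.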


Question - 4

```Consider you are performing ML for predicting housing prices. You have trained three models, and the following data summarizes the house price predicted by each model over 5 different trial runs.```

Model     House price predicted (in Lakh Rs.)
Code      Trial1   Trial2   Trial3   Trial4   Trial5
ModelA    3.5      3.4      3.8      3.5      3.4
ModelB    3.9      3.8      3.7      3.9      3.6
ModelC    3.5      3.3      3.6      3.5      3.8

In [2]:
import numpy as np
from scipy import stats

# Given data
house_prices = {
    'ModelA': [3.5, 3.4, 3.8, 3.5, 3.4],
    'ModelB': [3.9, 3.8, 3.7, 3.9, 3.6],
    'ModelC': [3.5, 3.3, 3.6, 3.5, 3.8],
}

# Convert the data to numpy arrays for easier calculations
model_a_prices = np.array(house_prices['ModelA'])
model_b_prices = np.array(house_prices['ModelB'])
model_c_prices = np.array(house_prices['ModelC'])

# Calculate summary statistics
mean_prices = {
    'ModelA': np.mean(model_a_prices),
    'ModelB': np.mean(model_b_prices),
    'ModelC': np.mean(model_c_prices),
}

median_prices = {
    'ModelA': np.median(model_a_prices),
    'ModelB': np.median(model_b_prices),
    'ModelC': np.median(model_c_prices),
}

std_dev_prices = {
    'ModelA': np.std(model_a_prices, ddof=1),
    'ModelB': np.std(model_b_prices, ddof=1),
    'ModelC': np.std(model_c_prices, ddof=1),
}

# Print the summary statistics
print("Mean House Prices:")
for model, mean_price in mean_prices.items():
    print(f"{model}: {mean_price:.2f} Lakh Rs.")

print("\nMedian House Prices:")
for model, median_price in median_prices.items():
    print(f"{model}: {median_price:.2f} Lakh Rs.")

print("\nStandard Deviation of House Prices:")
for model, std_dev_price in std_dev_prices.items():
    print(f"{model}: {std_dev_price:.2f} Lakh Rs.")

# Perform one-way ANOVA
f_statistic, p_value = stats.f_oneway(model_a_prices, model_b_prices, model_c_prices)

# Print the results
print(f"F-Statistic: {f_statistic:.2f}")
print(f"P-Value: {p_value:.4f}")

# Check for statistical significance
alpha = 0.05
if p_value < alpha:
    print("The means of the three models are significantly different.")
else:
    print("There is no significant difference in the means of the three models.")


Mean House Prices:
ModelA: 3.52 Lakh Rs.
ModelB: 3.78 Lakh Rs.
ModelC: 3.54 Lakh Rs.

Median House Prices:
ModelA: 3.50 Lakh Rs.
ModelB: 3.80 Lakh Rs.
ModelC: 3.50 Lakh Rs.

Standard Deviation of House Prices:
ModelA: 0.16 Lakh Rs.
ModelB: 0.13 Lakh Rs.
ModelC: 0.18 Lakh Rs.
F-Statistic: 4.08
P-Value: 0.0445
The means of the three models are significantly different.
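
To show what `stats.f_oneway` computes, the F-statistic can be rebuilt from the between-group and within-group sums of squares. This is a minimal sketch for illustration; the names `ss_between`, `ss_within`, `f_manual`, and `p_manual` are not part of the original cell.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([3.5, 3.4, 3.8, 3.5, 3.4]),  # ModelA
    np.array([3.9, 3.8, 3.7, 3.9, 3.6]),  # ModelB
    np.array([3.5, 3.3, 3.6, 3.5, 3.8]),  # ModelC
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of models
n_total = all_values.size  # total number of predictions

# Between-group variation: how far each model's mean is from the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group variation: spread of each model's trials around its own mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = k - 1
df_within = n_total - k

# F is the ratio of the two mean squares
f_manual = (ss_between / df_between) / (ss_within / df_within)
p_manual = stats.f.sf(f_manual, df_between, df_within)

# Reference values from SciPy
f_scipy, p_scipy = stats.f_oneway(*groups)

print(f"Manual F-Statistic: {f_manual:.2f}")
print(f"Manual P-Value: {p_manual:.4f}")
```

The ANOVA result only says that at least one model mean differs from the others; it does not identify which one.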
