Q1. Explain the assumptions required to use ANOVA and provide examples of violations that could impact
the validity of the results.

1. Normality: the residuals (equivalently, the observations within each group) should be approximately normally distributed.
2. Independence of cases: the sample cases should be independent of each other.
3. Homogeneity of variance: Homogeneity means that the variance among the groups should be approximately equal.
4. Random Sampling: The data are collected through random sampling from the population of interest.

Examples of violations and their impacts on validity:

1. Non-Normality: If the residuals are not normally distributed, the p-values and confidence intervals generated by ANOVA may be inaccurate. 

2. Heteroscedasticity: When the assumption of equal variances (homoscedasticity) is violated, the F-test in ANOVA becomes less reliable, particularly when the group sizes are unequal.
3. Independence: Violation of independence assumptions, such as in repeated measures designs or clustered data, can lead to biased estimates of variability and inflated Type I error rates.

4. Random Sampling: If the sampling process is not random, the generalizability of the results may be compromised.
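
The normality and equal-variance assumptions above can be checked before running the ANOVA. The following is a minimal sketch, assuming SciPy is available and using hypothetical simulated groups (the data, group means, and variable names are illustrative, not taken from any real study). It applies the Shapiro-Wilk test to the residuals and Levene's test to the groups.

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups simulated with equal spread (illustrative only).
rng = np.random.default_rng(0)
groups = [rng.normal(loc=mu, scale=2.0, size=30) for mu in (10.0, 12.0, 15.0)]

# In a one-way ANOVA the residuals are the deviations from each group's own mean.
residuals = np.concatenate([g - g.mean() for g in groups])

# Shapiro-Wilk: the null hypothesis is that the residuals are normally distributed.
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Levene: the null hypothesis is that all groups have equal variances.
levene_stat, levene_p = stats.levene(*groups)

print(f"Shapiro-Wilk p-value: {shapiro_p:.3f}")
print(f"Levene p-value: {levene_p:.3f}")
```

A small p-value from either test suggests the corresponding assumption may not hold. In that case a rank-based alternative such as the Kruskal-Wallis test (`scipy.stats.kruskal`) is one option to consider.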

Q2. What are the three types of ANOVA, and in what situations would each be used?


The three main types of ANOVA are:

One-Way ANOVA: This type of ANOVA is used when you have one categorical independent variable (typically with three or more levels, since a t-test already covers the two-level case) and one continuous dependent variable. It tests whether the means of the dependent variable are equal across all levels of the independent variable. One-way ANOVA is appropriate when you want to compare the means of multiple groups simultaneously. For example, you might use one-way ANOVA to determine if there are differences in test scores among students who studied under different teaching methods (e.g., traditional lecture, online modules, group projects).

Two-Way ANOVA: Two-way ANOVA is used when you have two categorical independent variables (factors) and one continuous dependent variable. It examines the main effects of each independent variable as well as the interaction between them. Two-way ANOVA is suitable for situations where you want to explore how two factors individually and together influence the dependent variable. For instance, in a study on the effects of both diet and exercise on weight loss, you might have one independent variable representing diet type (e.g., low-carb, low-fat) and another representing exercise intensity (e.g., high, moderate, low).

Repeated Measures ANOVA: This type of ANOVA is used when you have one categorical independent variable (with three or more levels) and one continuous dependent variable, but the same participants are measured under all conditions or at multiple time points. Repeated measures ANOVA is appropriate when you want to assess changes within subjects over time or across different conditions. For example, in a study of pain relief following a treatment, each participant's pain level might be measured before treatment, immediately after treatment, and at regular intervals afterward.

Q3. What is the partitioning of variance in ANOVA, and why is it important to understand this concept?

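Partitioning of variance means splitting the total variability in the dependent variable into components attributable to different sources. In a one-way ANOVA the split is SST = SSB + SSW, with the components defined as follows:

Total Variance (Total Sum of Squares, SST): This represents the total variability in the dependent variable across all observations.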

Between-Group Variance (Between-Group Sum of Squares, SSB): This represents the variability in the dependent variable that can be attributed to the differences between the group means.

Within-Group Variance (Within-Group Sum of Squares, SSW or SSE): This represents the variability in the dependent variable that cannot be explained by the differences between the group means. It reflects the variability within each group or condition.

The partitioning of variance is important for several reasons:

Assessment of Group Differences: By partitioning the total variance into between-group and within-group components, ANOVA allows us to determine whether the differences observed between groups are statistically significant. If the between-group variance is significantly greater than the within-group variance, it suggests that there are significant differences between the group means.

Effect Size Estimation: Understanding the proportion of variance explained by the independent variable(s) (i.e., between-group variance) relative to the total variance provides insight into the magnitude of the effect. Effect size measures such as eta-squared (η²) or partial eta-squared (η²_p) are calculated based on the partitioning of variance.

Hypothesis Testing: The partitioning of variance forms the basis for hypothesis testing in ANOVA. The F-statistic is calculated as the ratio of the between-group mean square to the within-group mean square, F = MSB / MSW = [SSB / (k - 1)] / [SSW / (N - k)], where k is the number of groups and N is the total number of observations. It is used to determine whether the observed differences between group means are statistically significant.

Model Evaluation: Partitioning of variance helps in evaluating the fit of the ANOVA model to the data. It allows researchers to assess how well the model accounts for the observed variability in the dependent variable.
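
To make the partition concrete, here is a minimal sketch that computes the mean squares, the F-statistic, and eta-squared by hand and compares the F-statistic with `scipy.stats.f_oneway`. The three small groups are hypothetical values chosen only so that the arithmetic is easy to follow.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical data: three groups of three observations each (illustrative only).
groups = [
    np.array([4.0, 5.0, 6.0]),
    np.array([7.0, 8.0, 9.0]),
    np.array([10.0, 11.0, 12.0]),
]
all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of groups
n_total = all_values.size  # total number of observations

# Partition the total sum of squares into between-group and within-group parts.
ssb = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
sst = ((all_values - grand_mean) ** 2).sum()

# Mean squares divide each sum of squares by its degrees of freedom.
msb = ssb / (k - 1)
msw = ssw / (n_total - k)
f_manual = msb / msw

# Eta-squared: proportion of total variability explained by group membership.
eta_squared = ssb / sst

f_scipy, p_scipy = f_oneway(*groups)
print(f"SSB={ssb:.1f}, SSW={ssw:.1f}, SST={sst:.1f}")
print(f"F (manual)={f_manual:.2f}, F (scipy)={f_scipy:.2f}, eta^2={eta_squared:.2f}")
```

For these values SSB = 54 and SSW = 6, so SST = 60, F = 27 / 1 = 27, and eta-squared = 0.9.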

Q4. How would you calculate the total sum of squares (SST), explained sum of squares (SSE), and residual
sum of squares (SSR) in a one-way ANOVA using Python?

Calculate the Mean: Calculate the overall mean of the dependent variable.

Calculate the Total Sum of Squares (SST): Calculate the sum of squared deviations of each observation from the overall mean.

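Calculate the Explained Sum of Squares (SSE): Calculate the sum of squared deviations of each group mean from the overall mean, weighted by the number of observations in each group. This is the between-group sum of squares (SSB in Q3); note that SSE here means "explained", not "error".

Calculate the Residual Sum of Squares (SSR): Calculate the sum of squared deviations of each observation from its group mean. This is the within-group sum of squares (SSW in Q3), and SST = SSE + SSR.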

In [1]:
import numpy as np

# Sample data: raw observations for each group (illustrative values)
groups = [
    np.array([8, 10, 12]),    # group 1, mean 10
    np.array([13, 15, 17]),   # group 2, mean 15
    np.array([10, 12, 14]),   # group 3, mean 12
]
all_values = np.concatenate(groups)
overall_mean = all_values.mean()  # Grand mean over all observations

# Calculate SST: squared deviations of every observation from the grand mean
SST = np.sum((all_values - overall_mean) ** 2)

# Calculate SSE (explained): squared deviations of group means from the grand mean, weighted by group size
SSE = sum(len(g) * (g.mean() - overall_mean) ** 2 for g in groups)

# Calculate SSR (residual): squared deviations of each observation from its own group mean
SSR = sum(np.sum((g - g.mean()) ** 2) for g in groups)

print(f"Total Sum of Squares (SST): {SST:.2f}")
print(f"Explained Sum of Squares (SSE): {SSE:.2f}")
print(f"Residual Sum of Squares (SSR): {SSR:.2f}")


Total Sum of Squares (SST): 62.00
Explained Sum of Squares (SSE): 38.00
Residual Sum of Squares (SSR): 24.00


Q6. Suppose you conducted a one-way ANOVA and obtained an F-statistic of 5.23 and a p-value of 0.02.
What can you conclude about the differences between the groups, and how would you interpret these
results?

In this case, you obtained an F-statistic of 5.23 and a p-value of 0.02. Here's how to interpret these results:

Significance of the F-Statistic: The F-statistic of 5.23 means that the variability between the group means (the between-group mean square) is about 5.23 times the variability within the groups (the within-group mean square). To determine whether a ratio this large is statistically significant, we need to consider the p-value.

Interpretation of the p-value: The p-value of 0.02 means that if the null hypothesis (that the means of all groups are equal) is true, there is only a 2% probability of observing an F-statistic of 5.23 or larger. Typically, if the p-value is less than a predetermined significance level (e.g., 0.05), we reject the null hypothesis in favor of the alternative hypothesis.

Conclusion: With a p-value of 0.02, we would reject the null hypothesis and conclude that there are statistically significant differences between the groups. In other words, at least one group mean is different from the others.

Practical Significance: While the results are statistically significant, it's also important to consider the practical significance of the differences between the groups. Even though the differences are statistically significant, they may not be practically significant if they are small or negligible in magnitude.

Q7. In a repeated measures ANOVA, how would you handle missing data, and what are the potential
consequences of using different methods to handle missing data?

1. Complete Case Analysis (Listwise Deletion):

Method: Exclude any cases with missing data on any variable included in the analysis.

Consequences: Reduces sample size and statistical power, potentially leading to biased estimates if missingness is related to the outcome or other variables.

2. Mean/Median Imputation:

Method: Replace missing values with the mean or median of the observed values for that variable.

Consequences: Alters the distribution of the data, underestimates standard errors, and reduces variability, potentially leading to biased estimates and inaccurate hypothesis tests.

3. Last Observation Carried Forward (LOCF):

Method: Use the last observed value for each participant to replace missing values.

Consequences: Assumes that the last observed value accurately represents the missing data, which may not always be true and can lead to biased estimates, particularly if there is systematic change over time.

4. Multiple Imputation:

Method: Generate multiple plausible values for each missing data point based on the observed data and uncertainty about the missing values. Analyze each imputed dataset separately and then combine the results.

Consequences: Preserves variability and more accurately reflects uncertainty, but can be computationally intensive and may require assumptions about the missing data mechanism.

5. Model-Based Imputation:

Method: Use regression or other statistical models to predict missing values based on observed data.

Consequences: Preserves variability and can provide more accurate estimates if the imputation model is correctly specified, but relies on assumptions about the relationship between the variables.

6. Weighted Estimation:

Method: Weight each complete case by the inverse of its estimated probability of being observed, so that complete cases also stand in for similar cases with missing data.

Consequences: Can reduce bias compared to complete case analysis, but may still lead to biased estimates if missingness is related to the outcome or other variables.
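
The first three methods above can be sketched with pandas. This is a minimal illustration on hypothetical wide-format data (one row per participant, one column per time point; the names `scores`, `t1`, `t2`, `t3` are made up for the example). It shows how listwise deletion shrinks the sample and how mean imputation shrinks the variance.

```python
import numpy as np
import pandas as pd

# Hypothetical repeated measures data with two missing values (illustrative only).
scores = pd.DataFrame(
    {
        "t1": [5.0, 6.0, 7.0, 8.0],
        "t2": [6.0, np.nan, 8.0, 9.0],
        "t3": [7.0, 8.0, np.nan, 10.0],
    },
    index=["p1", "p2", "p3", "p4"],
)

# Listwise deletion: keep only participants observed at every time point.
complete_cases = scores.dropna()

# Mean imputation: replace each missing value with that column's observed mean.
mean_imputed = scores.fillna(scores.mean())

# LOCF: carry each participant's last observed value forward across time points.
locf = scores.ffill(axis=1)

print("Participants kept by listwise deletion:", len(complete_cases))
print("Variance of t2, observed values only:", round(scores["t2"].var(), 3))
print("Variance of t2 after mean imputation:", round(mean_imputed["t2"].var(), 3))
print("LOCF value for p3 at t3:", locf.loc["p3", "t3"])
```

Here listwise deletion keeps only two of the four participants, and mean imputation lowers the variance of `t2` because the imputed value sits exactly at the mean.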

Q8. What are some common post-hoc tests used after ANOVA, and when would you use each one? Provide
an example of a situation where a post-hoc test might be necessary.

Post-hoc tests are used after conducting an ANOVA to determine which specific group means differ significantly from each other when the overall ANOVA test indicates that there is a significant difference between groups. Some common post-hoc tests include:

1. Tukey's Honestly Significant Difference (HSD):

When to use: Tukey's HSD is typically used when you have three or more groups and you want to conduct all possible pairwise comparisons while controlling the overall Type I error rate. For all pairwise comparisons it is generally less conservative (more powerful) than the Bonferroni correction.

Example: In a study comparing the effectiveness of three different teaching methods (traditional lecture, online modules, and group projects) on student performance, Tukey's HSD can be used to determine which pairs of teaching methods differ significantly in terms of average test scores.

2. Bonferroni Correction:

When to use: The Bonferroni correction is a simple method to control the familywise error rate when conducting multiple comparisons. It adjusts the significance level (α) for each individual comparison to maintain an overall desired alpha level.

Example: In a clinical trial comparing the efficacy of four different treatments for a medical condition, the Bonferroni correction can be used to compare each treatment to every other treatment while controlling for the increased risk of Type I error due to multiple comparisons.

3. Sidak Correction:

When to use: Similar to the Bonferroni correction, the Sidak correction adjusts the significance level for multiple comparisons to maintain an overall desired alpha level. It is slightly less conservative than Bonferroni, and it controls the familywise error rate exactly when the comparisons are independent.

Example: In a market research study comparing the sales performance of multiple product variations across different regions, the Sidak correction can be applied to determine which specific product variations significantly differ in sales.

4. Dunnett's Test:

When to use: Dunnett's test is used when you have one control group and several treatment groups, and you want to compare each treatment group to the control group while controlling the overall Type I error rate.

Example: In a study evaluating the effectiveness of different doses of a new medication compared to a placebo, Dunnett's test can be used to determine if any of the medication doses lead to significantly different outcomes compared to the placebo.

5. Holm's Sequential Bonferroni Procedure:

When to use: Holm's procedure is a step-down method that adjusts the significance level for multiple comparisons while maintaining control over the familywise error rate. It starts by testing the most significant comparison and progressively adjusts the significance level for subsequent comparisons.

Example: In a study investigating the impact of various marketing strategies on sales revenue across different demographics, Holm's procedure can be used to identify specific demographic groups where the marketing strategies have a significant effect on sales.
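
The Bonferroni, Sidak, and Holm adjustments described above differ only in how they rescale the raw p-values. The following is a minimal pure-Python sketch of the three adjustments; the function names and the raw p-values are hypothetical and chosen for illustration, and a library routine would normally be used in practice.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def sidak(p_values):
    """Sidak adjustment: 1 - (1 - p)^m for m comparisons."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


def holm(p_values):
    """Holm step-down: scale the smallest p by m, the next by m - 1, and so on,
    then enforce that adjusted values never decrease along the sorted order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw_p = [0.010, 0.020, 0.040]
print("Bonferroni:", [round(p, 4) for p in bonferroni(raw_p)])
print("Sidak:     ", [round(p, 4) for p in sidak(raw_p)])
print("Holm:      ", [round(p, 4) for p in holm(raw_p)])
```

For these inputs the Holm-adjusted values are never larger than the Bonferroni-adjusted ones, which is why Holm is usually preferred when a Bonferroni-style correction is wanted.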

Q9. A researcher wants to compare the mean weight loss of three diets: A, B, and C. They collect data from
50 participants who were randomly assigned to one of the diets. Conduct a one-way ANOVA using Python
to determine if there are any significant differences between the mean weight loss of the three diets.
Report the F-statistic and p-value, and interpret the results.

In [3]:
import numpy as np
from scipy.stats import f_oneway

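# Sample data (illustrative: 25 observations per diet, 75 in total, rather than the 50 participants in the question)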
diet_A = np.array([2.1, 1.9, 2.5, 2.3, 2.2, 1.8, 2.0, 2.4, 2.6, 2.3,
                   2.1, 2.4, 2.0, 2.2, 2.5, 2.3, 2.1, 2.3, 2.6, 2.4,
                   2.1, 2.0, 2.3, 2.5, 2.2])
diet_B = np.array([2.3, 2.7, 2.5, 2.4, 2.6, 2.8, 2.9, 2.5, 2.7, 2.6,
                   2.3, 2.4, 2.6, 2.7, 2.8, 2.9, 2.5, 2.4, 2.6, 2.7,
                   2.3, 2.4, 2.6, 2.8, 2.7])
diet_C = np.array([2.8, 3.0, 3.2, 2.9, 3.1, 3.0, 2.8, 3.2, 3.1, 3.3,
                   3.0, 3.2, 3.1, 3.0, 2.9, 3.3, 3.0, 3.1, 3.2, 3.4,
                   2.9, 3.0, 3.2, 3.1, 3.3])

# Perform one-way ANOVA
f_statistic, p_value = f_oneway(diet_A, diet_B, diet_C)

# Report the results
print("F-statistic:", f_statistic)
print("p-value:", p_value)

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("The p-value is less than", alpha, "so we reject the null hypothesis.")
    print("There is sufficient evidence to conclude that there are significant differences between the mean weight loss of the three diets.")
else:
    print("The p-value is greater than or equal to", alpha, "so we fail to reject the null hypothesis.")
    print("There is not enough evidence to conclude that there are significant differences between the mean weight loss of the three diets.")


F-statistic: 126.29272898961278
p-value: 2.8584049603842127e-24
The p-value is less than 0.05 so we reject the null hypothesis.
There is sufficient evidence to conclude that there are significant differences between the mean weight loss of the three diets.


Q10. A company wants to know if there are any significant differences in the average time it takes to
complete a task using three different software programs: Program A, Program B, and Program C. They
randomly assign 30 employees to one of the programs and record the time it takes each employee to
complete the task. Conduct a two-way ANOVA using Python to determine if there are any main effects or
interaction effects between the software programs and employee experience level (novice vs.
experienced). Report the F-statistics and p-values, and interpret the results.

In [17]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from statsmodels.formula.api import ols

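# Define constants (10 employees per program per experience level gives 60 simulated rows, not the 30 in the question)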
num_employees_per_program = 10
num_levels_per_factor = 2

# Generate software programs
software_programs = np.repeat(['A', 'B', 'C'], num_employees_per_program * num_levels_per_factor)

# Generate experience levels to match the length of software_programs
experience_levels = np.tile(['Novice', 'Experienced'], len(software_programs) // 2)

# Generate random task times
np.random.seed(0)  # for reproducibility
task_times = np.random.randint(10, 20, size=len(software_programs))

# Create DataFrame
df = pd.DataFrame({'Software_Program': software_programs, 'Experience_Level': experience_levels, 'Task_Time': task_times})

# Print DataFrame
print("\nDataFrame:")
print(df)

# Fit the two-way ANOVA model
model = ols('Task_Time ~ C(Software_Program) + C(Experience_Level) + C(Software_Program):C(Experience_Level)', data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)

# Extract main effects and interaction effects from the ANOVA table
# (rows are: program, experience, interaction, residual; the last row is the residual, so use index 2)
main_effects = anova_table[['sum_sq', 'df', 'F', 'PR(>F)']].iloc[:2]
interaction_effect = anova_table[['sum_sq', 'df', 'F', 'PR(>F)']].iloc[2]

print("\nMain Effects:")
print(main_effects)
print("\nInteraction Effect:")
print(interaction_effect)



DataFrame:
   Software_Program Experience_Level  Task_Time
0                 A           Novice         15
1                 A      Experienced         10
2                 A           Novice         13
3                 A      Experienced         13
4                 A           Novice         17
5                 A      Experienced         19
6                 A           Novice         13
7                 A      Experienced         15
8                 A           Novice         12
9                 A      Experienced         14
10                A           Novice         17
11                A      Experienced         16
12                A           Novice         18
13                A      Experienced         18
14                A           Novice         11
15                A      Experienced         16
16                A           Novice         17
17                A      Experienced         17
18                A           Novice         18
19                A      Exp

Q11. An educational researcher is interested in whether a new teaching method improves student test
scores. They randomly assign 100 students to either the control group (traditional teaching method) or the
experimental group (new teaching method) and administer a test at the end of the semester. Conduct a
two-sample t-test using Python to determine if there are any significant differences in test scores
between the two groups. If the results are significant, follow up with a post-hoc test to determine which
group(s) differ significantly from each other.

In [18]:
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Generate random test scores for the control and experimental groups
np.random.seed(0)  # for reproducibility
control_group_scores = np.random.normal(loc=70, scale=10, size=100)  # mean=70, std=10
experimental_group_scores = np.random.normal(loc=75, scale=10, size=100)  # mean=75, std=10

# Perform two-sample t-test
t_stat, p_value = ttest_ind(control_group_scores, experimental_group_scores)
print("Two-sample t-test results:")
print("t-statistic:", t_stat)
print("p-value:", p_value)

# Check if the difference is significant (using alpha = 0.05)
if p_value < 0.05:
    print("The difference in test scores between the two groups is statistically significant.")
    
    # Perform post-hoc test (Tukey's HSD); with only two groups this repeats the t-test comparison
    all_scores = np.concatenate([control_group_scores, experimental_group_scores])
    group_labels = ['Control'] * 100 + ['Experimental'] * 100
    tukey_results = pairwise_tukeyhsd(all_scores, group_labels, alpha=0.05)
    print("\nPost-hoc (Tukey's HSD) test results:")
    print(tukey_results)
else:
    print("There is no significant difference in test scores between the two groups.")


Two-sample t-test results:
t-statistic: -3.597192759749614
p-value: 0.0004062796020362504
The difference in test scores between the two groups is statistically significant.

Post-hoc (Tukey's HSD) test results:
   Multiple Comparison of Means - Tukey HSD, FWER=0.05   
 group1    group2    meandiff p-adj  lower  upper  reject
---------------------------------------------------------
Control Experimental    5.222 0.0004 2.3593 8.0848   True
---------------------------------------------------------


Q12. A researcher wants to know if there are any significant differences in the average daily sales of three
retail stores: Store A, Store B, and Store C. They randomly select 30 days and record the sales for each store
on those days. Conduct a repeated measures ANOVA using Python to determine if there are any
significant differences in sales between the three stores. If the results are significant, follow up with a
post-hoc test to determine which store(s) differ significantly from each other.

In [19]:
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Generate sample data for daily sales of three stores
np.random.seed(0)  # for reproducibility
store_a_sales = np.random.normal(loc=100, scale=20, size=30)
store_b_sales = np.random.normal(loc=110, scale=20, size=30)
store_c_sales = np.random.normal(loc=120, scale=20, size=30)

# Create DataFrame
df = pd.DataFrame({
    'Day': np.arange(1, 31), 
    'Store_A_Sales': store_a_sales, 
    'Store_B_Sales': store_b_sales, 
    'Store_C_Sales': store_c_sales
})

# Melt the DataFrame for repeated measures ANOVA
melted_df = pd.melt(df, id_vars=['Day'], value_vars=['Store_A_Sales', 'Store_B_Sales', 'Store_C_Sales'],
                     var_name='Store', value_name='Sales')

# Perform repeated measures ANOVA
rm_anova = AnovaRM(melted_df, 'Sales', 'Day', within=['Store']).fit()
print("Repeated Measures ANOVA results:")
print(rm_anova)

# Perform one-way ANOVA for comparison (this treats the stores as independent groups and ignores the pairing by day)
f_stat, p_value = f_oneway(store_a_sales, store_b_sales, store_c_sales)
print("\nOne-way ANOVA results:")
print("F-statistic:", f_stat)
print("p-value:", p_value)

# Check if the overall difference is significant (using alpha = 0.05)
if p_value < 0.05:
    print("\nThe overall difference in sales between the three stores is statistically significant.")

    # Perform post-hoc test (Tukey's HSD); note that this also treats the stores as independent groups
    all_sales = np.concatenate([store_a_sales, store_b_sales, store_c_sales])
    group_labels = ['Store A'] * 30 + ['Store B'] * 30 + ['Store C'] * 30
    tukey_results = pairwise_tukeyhsd(all_sales, group_labels, alpha=0.05)
    print("\nPost-hoc (Tukey's HSD) test results:")
    print(tukey_results)
else:
    print("\nThere is no significant difference in sales between the three stores.")


Repeated Measures ANOVA results:
               Anova
      F Value Num DF  Den DF Pr > F
-----------------------------------
Store  3.1805 2.0000 58.0000 0.0489


One-way ANOVA results:
F-statistic: 3.3414606706069834
p-value: 0.039981492411499175

The overall difference in sales between the three stores is statistically significant.

Post-hoc (Tukey's HSD) test results:
  Multiple Comparison of Means - Tukey HSD, FWER=0.05  
 group1  group2 meandiff p-adj   lower    upper  reject
-------------------------------------------------------
Store A Store B  -4.6476 0.6397 -16.9155  7.6203  False
Store A Store C   8.4685 0.2321  -3.7994 20.7363  False
Store B Store C   13.116 0.0333   0.8481 25.3839   True
-------------------------------------------------------
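
Because the same 30 days are observed for every store, a post-hoc comparison that respects the pairing by day may be more appropriate than Tukey's HSD. The following is a minimal sketch of one such approach, paired t-tests with a Bonferroni adjustment. The simulated data, the shared `day_effect`, and all variable names are assumptions made for illustration; this is not the analysis that produced the output above.

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_rel

# Hypothetical data: a shared day-to-day effect plus store-specific noise (illustrative only).
rng = np.random.default_rng(0)
n_days = 30
day_effect = rng.normal(loc=0.0, scale=15.0, size=n_days)
sales = {
    "Store A": 100.0 + day_effect + rng.normal(0.0, 10.0, n_days),
    "Store B": 110.0 + day_effect + rng.normal(0.0, 10.0, n_days),
    "Store C": 120.0 + day_effect + rng.normal(0.0, 10.0, n_days),
}

# Paired t-test for every pair of stores, matching observations by day.
pairs = list(combinations(sales, 2))
raw_p = [ttest_rel(sales[a], sales[b]).pvalue for a, b in pairs]

# Bonferroni adjustment: multiply by the number of comparisons, capped at 1.
m = len(raw_p)
adjusted_p = [min(1.0, p * m) for p in raw_p]

for (a, b), p_raw, p_adj in zip(pairs, raw_p, adjusted_p):
    print(f"{a} vs {b}: raw p = {p_raw:.4f}, Bonferroni-adjusted p = {p_adj:.4f}")
```

Pairing removes the shared day-to-day variation from each comparison, which is the same reason the repeated measures ANOVA is used for the overall test.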
