In [None]:
"""
Q1. Explain the assumptions required to use ANOVA and provide examples of violations that could impact
the validity of the results.

ANOVA (Analysis of Variance) is used to compare means across three or more groups. For the results to be valid, certain statistical assumptions must be satisfied.

1. Independence of Observations
Meaning

Each data point must be collected independently.

No observation should influence another.

Violations

Repeated measurements on the same subject treated as independent.

Clustered data (e.g., students in the same classroom) without accounting for grouping.

Time-series or paired data analyzed with an ordinary between-subjects ANOVA instead of a repeated-measures ANOVA (or another method that models the dependence).

Impact

Leads to underestimated variance → inflated Type I error (false positives).

2. Normality of Residuals
Meaning

The residuals (errors), not the raw data, should follow a normal distribution within each group.

Violations

Strong skewness or outliers in any group.

Small sample sizes with non-normal data.

Using ANOVA on ordinal or highly non-Gaussian data.

Impact

The F-statistic may no longer follow the F-distribution, so p-values become unreliable.

Type I and Type II error rates can drift from their nominal levels, especially when group sizes are small or unequal.

Note: ANOVA is quite robust to mild normality violations when samples are large and group sizes are similar.

3. Homogeneity of Variances (Homoscedasticity)
Meaning

All groups should come from populations with equal variances.

Violations

One group has much larger variance than others.

Groups have very different sample sizes with unequal variances.

Example: comparing test scores where one class shows huge variability and others show very low variability.

Impact

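Can severely distort the F-ratio.

Inflates the Type I error rate when the smaller groups have the larger variances; when the larger groups have the larger variances, the test becomes overly conservative instead.

Levene’s test and Bartlett’s test are commonly used to check this assumption.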

4. Random Sampling
Meaning

Samples should be drawn from the population in a random manner.

Violations

Convenience samples (e.g., only volunteers).

Biased sampling (e.g., selecting only morning-shift workers for a study on productivity).

Impact

Reduces generalizability of the result.

May lead to biased estimates.
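
The two assumptions that can be checked from the data (normality of residuals and homogeneity of variances) could be screened roughly as follows. This is a minimal sketch, assuming SciPy is available; the three groups are hypothetical values made up for illustration. Independence and random sampling cannot be tested this way, because they depend on how the study was designed.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three independent groups (illustrative only)
groups = {
    "A": np.array([23.1, 25.4, 26.8, 29.0, 31.2, 27.5]),
    "B": np.array([30.2, 32.9, 33.5, 36.1, 38.4, 34.0]),
    "C": np.array([35.0, 37.7, 39.1, 40.6, 43.3, 38.2]),
}

# Residuals: each observation minus its own group mean
residuals = np.concatenate([g - g.mean() for g in groups.values()])

# Normality of residuals (Shapiro-Wilk); a small p-value suggests non-normality
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Homogeneity of variances (Levene, median-centered); a small p-value suggests unequal variances
levene_stat, levene_p = stats.levene(*groups.values(), center="median")

print(f"Shapiro-Wilk on residuals: W = {shapiro_stat:.3f}, p = {shapiro_p:.3f}")
print(f"Levene across groups:      W = {levene_stat:.3f}, p = {levene_p:.3f}")
```

With samples this small, both tests have little power, so a non-significant result is weak evidence that the assumption holds; residual plots are a useful complement.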

"""

In [None]:
"""
Q2. What are the three types of ANOVA, and in what situations would each be used?

| ANOVA Type                  | Independent Variables | Participants       | When Used                                  |
| --------------------------- | --------------------- | ------------------ | ------------------------------------------ |
| **One-Way ANOVA**           | 1                     | Independent groups | Compare ≥3 groups on one factor            |
| **Two-Way ANOVA**           | 2                     | Independent groups | Study main effects + interaction           |
| **Repeated Measures ANOVA** | 1+                    | Same subjects      | Multiple measurements on same participants |

ANOVA techniques are chosen based on how many factors (independent variables) you have and whether the same subjects are measured more than once.

1. One-Way ANOVA
Definition

Compares the means of three or more independent groups based on one independent variable (one factor).

When to Use

When you have one factor with multiple levels and different participants in each group.

Examples

Comparing the average test scores of students taught using three different teaching methods.

Comparing customer satisfaction across four store locations.

Evaluating the effect of three different fertilizers on plant growth.

2. Two-Way ANOVA (Factorial ANOVA)
Definition

Tests the effect of two independent variables on a dependent variable.
Also evaluates interaction effects between the two factors.

When to Use

When you have two factors, both involving independent groups.

When you want to understand not only main effects but also how factors combine.

Examples

Studying how diet type (3 levels) and exercise level (2 levels) affect weight loss.

Analyzing productivity by machine type and operator skill level.

Studying exam performance by teaching method and gender.

3. Repeated Measures ANOVA
Definition

Used when the same subjects are measured multiple times or under different conditions.

When to Use

When observations are not independent because they come from the same participants.

Suitable for longitudinal or within-subject designs.

Examples

Testing memory performance of the same students at three time intervals.

Measuring patient blood pressure before, during, and after treatment.

Evaluating reaction time under multiple lighting conditions for the same participants.

"""

In [None]:
"""
Q3. What is the partitioning of variance in ANOVA, and why is it important to understand this concept?

ANOVA (Analysis of Variance) works by decomposing the total variability in a dataset into components attributable to different sources. This decomposition is called partitioning of variance, and it is the principle that allows us to test whether group means differ significantly.

The Basic Partition

In a one-way ANOVA, the total variability is partitioned into two components:

Total Variability = Between-Group Variability + Within-Group Variability

Strictly, it is the sums of squares that add up, not the variances (mean squares):

SST (Total Sum of Squares) = SSB (Between-group SS) + SSW (Within-group SS)

Where:

SST captures all variability of the observations around the grand mean
SSB captures variability due to differences between group means
SSW captures variability within groups (individual differences, measurement error)

Why This Concept Is Important

1. Hypothesis Testing Foundation
The F-statistic in ANOVA is the ratio of the between-group mean square to the within-group mean square (MSB/MSW). If the treatment has an effect, between-group variability should be large relative to within-group variability. Understanding the partition helps you grasp what you're actually testing.

2. Effect Size Interpretation
Partitioning allows you to calculate effect sizes like η² (eta-squared = SSB/SST), the proportion of total variability explained by the grouping variable. This tells you not just whether an effect exists, but how large it is.

3. Research Design Implications
Understanding variance components helps you design better studies. For instance, reducing within-group variance (through careful measurement, homogeneous samples, or blocking) increases statistical power without requiring larger samples.

4. Extension to Complex Designs
In factorial ANOVA, variance is partitioned into main effects, interaction effects, and error. In repeated measures ANOVA, you can further partition within-subject variance from between-subject variance. Understanding the basic partition makes these extensions comprehensible.

5. Assumptions and Diagnostics
The partition itself is an algebraic identity that always holds, but the F-test pools SSW into a single error estimate, which assumes that within-group variances are roughly equal (homogeneity of variance). Understanding this helps you recognize when assumptions are violated and why transformations or alternative tests might be needed.

In essence, variance partitioning transforms a vague question like "are these groups different?" into a precise quantitative framework that separates signal from noise.
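
The design point about within-group variance can be illustrated numerically. The sketch below uses hypothetical data (three groups with fixed means of 10, 12, and 14, and the same deviations scaled to be noisy or precise); the helper name `partition` is introduced here for illustration only. SSB is identical in both datasets, so any change in F comes from SSW alone.

```python
import numpy as np

def partition(groups):
    # Return (SSB, SSW, F, eta_squared) for a list of 1-D arrays
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)
    k, n = len(groups), len(all_values)
    f_stat = (ssb / (k - 1)) / (ssw / (n - k))
    return ssb, ssw, f_stat, ssb / (ssb + ssw)

# Hypothetical deviations around each group mean (they sum to zero)
deviations = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
group_means = [10.0, 12.0, 14.0]

noisy = [m + 2.0 * deviations for m in group_means]    # large within-group spread
precise = [m + 0.5 * deviations for m in group_means]  # small within-group spread

for label, data in [("noisy", noisy), ("precise", precise)]:
    ssb, ssw, f_stat, eta_sq = partition(data)
    print(f"{label:>8}: SSB = {ssb:.1f}, SSW = {ssw:.1f}, F = {f_stat:.2f}, eta^2 = {eta_sq:.3f}")
```

Here F rises from 2.00 to 32.00 and η² from 0.250 to 0.842, even though the group means and sample sizes are unchanged.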

"""

In [None]:
"""
Q4. How would you calculate the total sum of squares (SST), explained sum of squares (SSE), and residual
sum of squares (SSR) in a one-way ANOVA using Python?

"""
import numpy as np
import pandas as pd
from scipy import stats

# Example data: Three groups with different values
group1 = [23, 25, 27, 29, 31]
group2 = [30, 32, 34, 36, 38]
group3 = [35, 37, 39, 41, 43]

# Combine all data
all_data = group1 + group2 + group3
groups = ['A']*len(group1) + ['B']*len(group2) + ['C']*len(group3)

# Create DataFrame for easier manipulation
df = pd.DataFrame({'value': all_data, 'group': groups})

print("Data Overview:")
print(df.groupby('group')['value'].describe())
print("\n" + "="*60 + "\n")

# Method 1: Manual Calculation
# -----------------------------

# Calculate grand mean (overall mean)
grand_mean = np.mean(all_data)

# Calculate group means
group_means = df.groupby('group')['value'].mean()
group_sizes = df.groupby('group').size()

print("Step-by-step Calculations:")
print(f"Grand Mean: {grand_mean:.4f}")
print(f"Group Means: {group_means.to_dict()}")
print("\n")

# 1. Total Sum of Squares (SST)
# SST = Σ(y_i - grand_mean)²
SST = np.sum((df['value'] - grand_mean)**2)

# Note: following the question's naming, SSE is the explained (between-group) SS
# and SSR is the residual (within-group) SS. Many texts use SSE for the error SS instead.

# 2. Explained Sum of Squares (SSE) - Between-group variation
# SSE = Σ n_j * (group_mean_j - grand_mean)²
SSE = sum(group_sizes[group] * (group_means[group] - grand_mean)**2
          for group in group_means.index)

# 3. Residual Sum of Squares (SSR) - Within-group variation
# SSR = Σ(y_i - group_mean)²
SSR = sum(((df[df['group'] == group]['value'] - group_means[group])**2).sum()
          for group in group_means.index)

print("ANOVA Sum of Squares:")
print(f"Total Sum of Squares (SST):     {SST:.4f}")
print(f"Explained Sum of Squares (SSE): {SSE:.4f}")
print(f"Residual Sum of Squares (SSR):  {SSR:.4f}")
print(f"\nVerification: SST = SSE + SSR")
print(f"{SST:.4f} = {SSE:.4f} + {SSR:.4f} = {SSE + SSR:.4f}")
print(f"Match: {np.isclose(SST, SSE + SSR)}")

print("\n" + "="*60 + "\n")

# Method 2: Using scipy.stats for verification
# ---------------------------------------------
f_stat, p_value = stats.f_oneway(group1, group2, group3)

# Calculate degrees of freedom
k = len(group_means)  # number of groups
n = len(all_data)      # total observations
df_between = k - 1
df_within = n - k

# Calculate Mean Squares
MSE = SSE / df_between  # Mean Square Between
MSR = SSR / df_within   # Mean Square Within

# F-statistic
F = MSE / MSR

print("ANOVA Table:")
print("-" * 60)
print(f"{'Source':<15} {'SS':>12} {'df':>6} {'MS':>12} {'F':>12}")
print("-" * 60)
print(f"{'Between Groups':<15} {SSE:>12.4f} {df_between:>6} {MSE:>12.4f} {F:>12.4f}")
print(f"{'Within Groups':<15} {SSR:>12.4f} {df_within:>6} {MSR:>12.4f}")
print(f"{'Total':<15} {SST:>12.4f} {n-1:>6}")
print("-" * 60)
print(f"\nF-statistic: {F:.4f}")
print(f"P-value: {p_value:.6f}")
print(f"\nScipy F-statistic (verification): {f_stat:.4f}")

# Alternative: Using vectorized operations
print("\n" + "="*60 + "\n")
print("Alternative Vectorized Calculation:")

# Convert to numpy arrays for each group
groups_data = [np.array(group1), np.array(group2), np.array(group3)]

# SST
all_values = np.concatenate(groups_data)
SST_vec = np.sum((all_values - np.mean(all_values))**2)

# SSE
SSE_vec = sum(len(g) * (np.mean(g) - np.mean(all_values))**2 
              for g in groups_data)

# SSR
SSR_vec = sum(np.sum((g - np.mean(g))**2) for g in groups_data)

print(f"SST (vectorized): {SST_vec:.4f}")
print(f"SSE (vectorized): {SSE_vec:.4f}")
print(f"SSR (vectorized): {SSR_vec:.4f}")

In [None]:
"""
Q5. In a two-way ANOVA, how would you calculate the main effects and interaction effects using Python?

The effects can be calculated manually in Python from the cell means and the marginal means.
Conceptual Overview
In a two-way ANOVA with factors A and B:

Main effect of A: Difference in means across levels of factor A (averaged over B)
Main effect of B: Difference in means across levels of factor B (averaged over A)
Interaction effect: When the effect of one factor depends on the level of the other factor

Python Implementation

The code following this docstring computes the effects manually:

Main Effects: the deviation of each factor's marginal means from the grand mean, averaged over all levels of the other factor
Interaction Effects: the part of each cell mean that is not explained by the grand mean plus the two main effects

Interpretation Guide

Non-parallel lines in an interaction plot of the cell means suggest an interaction; whether it is statistically significant is decided by the interaction F-test
Parallel lines = no interaction (effects are additive)
In the example, Fertilizer B benefits more from increased water, showing an interaction effect

The code can be adapted to your own datasets by changing the data structure and variable names.
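
To judge whether the main effects and the interaction are statistically significant, the effects can be turned into sums of squares and F-tests. The following is a minimal sketch, assuming the balanced fertilizer-by-water design with 5 observations per cell used in the example data, and assuming SciPy is available for the F-distribution. The formulas shown apply to balanced designs only; unbalanced data need a regression-based approach.

```python
import numpy as np
from scipy import stats

# Same growth values as the example data, arranged as [fertilizer][water][replicate]
growth = np.array([
    [[20, 22, 21, 19, 23], [28, 30, 29, 27, 31], [25, 27, 26, 24, 28]],  # fertilizer A: Low, Medium, High
    [[15, 17, 16, 14, 18], [32, 34, 33, 31, 35], [38, 40, 39, 37, 41]],  # fertilizer B: Low, Medium, High
], dtype=float)
a, b, n = growth.shape  # levels of fertilizer, levels of water, replicates per cell

grand_mean = growth.mean()
mean_a = growth.mean(axis=(1, 2))  # marginal means of fertilizer
mean_b = growth.mean(axis=(0, 2))  # marginal means of water
cell_means = growth.mean(axis=2)

# Sums of squares for a balanced design
ss_a = b * n * ((mean_a - grand_mean) ** 2).sum()
ss_b = a * n * ((mean_b - grand_mean) ** 2).sum()
ss_cells = n * ((cell_means - grand_mean) ** 2).sum()
ss_ab = ss_cells - ss_a - ss_b
ss_within = ((growth - cell_means[:, :, None]) ** 2).sum()

df_a, df_b, df_ab, df_within = a - 1, b - 1, (a - 1) * (b - 1), a * b * (n - 1)
ms_within = ss_within / df_within

results = {}
for name, ss, df_effect in [("fertilizer", ss_a, df_a),
                            ("water", ss_b, df_b),
                            ("interaction", ss_ab, df_ab)]:
    f_stat = (ss / df_effect) / ms_within
    p_val = stats.f.sf(f_stat, df_effect, df_within)  # upper-tail probability
    results[name] = (f_stat, p_val)
    print(f"{name:<12} SS = {ss:8.2f}  df = {df_effect}  F = {f_stat:7.2f}  p = {p_val:.3g}")
```

For this dataset the interaction sum of squares is 405 with an F of 81 on (2, 24) degrees of freedom, which supports the reading that Fertilizer B responds more strongly to water.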

"""
import numpy as np
import pandas as pd

# Example data
data = {
    'fertilizer': ['A']*15 + ['B']*15,
    'water': ['Low']*5 + ['Medium']*5 + ['High']*5 + ['Low']*5 + ['Medium']*5 + ['High']*5,
    'growth': [20,22,21,19,23, 28,30,29,27,31, 25,27,26,24,28,
               15,17,16,14,18, 32,34,33,31,35, 38,40,39,37,41]
}
df = pd.DataFrame(data)

# Calculate grand mean
grand_mean = df['growth'].mean()

# Calculate cell means
cell_means = df.groupby(['fertilizer', 'water'])['growth'].mean()

# Calculate marginal means for Factor A (fertilizer)
marginal_A = df.groupby('fertilizer')['growth'].mean()

# Calculate marginal means for Factor B (water)
marginal_B = df.groupby('water')['growth'].mean()

# Main effect of A: deviation of factor A means from grand mean
main_effect_A = marginal_A - grand_mean
print("Main Effect of Fertilizer:")
print(main_effect_A)

# Main effect of B: deviation of factor B means from grand mean
main_effect_B = marginal_B - grand_mean
print("\nMain Effect of Water:")
print(main_effect_B)

# Interaction effects
# For each cell: observed - (grand_mean + main_effect_A + main_effect_B)
interaction_effects = {}
for fert in df['fertilizer'].unique():
    for water in df['water'].unique():
        observed = cell_means.loc[(fert, water)]
        expected = (grand_mean + 
                   main_effect_A.loc[fert] + 
                   main_effect_B.loc[water])
        interaction = observed - expected
        interaction_effects[(fert, water)] = interaction

print("\nInteraction Effects:")
for key, value in interaction_effects.items():
    print(f"{key}: {value:.2f}")

In [None]:
"""
Q6. Suppose you conducted a one-way ANOVA and obtained an F-statistic of 5.23 and a p-value of 0.02.
What can you conclude about the differences between the groups, and how would you interpret these
results?

Based on these ANOVA results, here's what you can conclude:

## Statistical Conclusion

With **F = 5.23** and **p = 0.02**, you have statistically significant evidence to reject the null hypothesis. This means:

- **There are significant differences between at least two of the group means**
- The p-value of 0.02 is less than the conventional alpha level of 0.05, indicating these differences are unlikely to have occurred by chance alone
- Specifically, there's only a 2% probability of obtaining an F-statistic this large (or larger) if all group means were actually equal

## What This Tells You (and Doesn't Tell You)

**What you know:**
- There is evidence that not all group population means are equal
- The between-group variation is larger than would be expected from the within-group variation alone

**What you don't know yet:**
- Which specific groups differ from each other
- How many pairs of groups are different
- The magnitude or practical significance of these differences

## Next Steps

Since ANOVA only tells you that *some* difference exists, you would typically:

1. **Conduct post-hoc tests** (e.g., Tukey's HSD, Bonferroni) to identify which specific group pairs differ significantly

2. **Calculate effect size** (e.g., eta-squared or omega-squared) to assess the practical significance of the differences

3. **Examine descriptive statistics** (means, standard deviations) to understand the direction and magnitude of differences

4. **Check assumptions** to ensure the ANOVA results are valid (normality, homogeneity of variance, independence)

The significant F-statistic opens the door to further investigation but doesn't complete the analysis on its own.
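
The link between the F-statistic and the p-value can be reproduced with the F-distribution. The question does not state the degrees of freedom, so the sketch below assumes 3 groups and 18 observations, giving df = (2, 15); this is an assumption chosen for illustration, and with it the p-value comes out close to the reported 0.02.

```python
from scipy import stats

f_stat = 5.23
df_between, df_within = 2, 15  # assumed: 3 groups, 18 observations (not given in the question)

# p-value: probability of an F at least this large if all group means are equal
p_value = stats.f.sf(f_stat, df_between, df_within)

# Critical value that F must exceed to reject the null at alpha = 0.05
alpha = 0.05
f_critical = stats.f.ppf(1 - alpha, df_between, df_within)

print(f"p-value = {p_value:.4f}, critical F = {f_critical:.2f}")
print("Reject H0" if f_stat > f_critical else "Fail to reject H0")
```

The same F-statistic yields a different p-value under different degrees of freedom, which is why both are normally reported together.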

"""


In [None]:
"""
Q7. In a repeated measures ANOVA, how would you handle missing data, and what are the potential
consequences of using different methods to handle missing data?

# Handling Missing Data in Repeated Measures ANOVA

## Methods for Handling Missing Data

### **1. Listwise Deletion (Complete Case Analysis)**
- **What it does**: Removes any participant with missing data at any time point
- **When appropriate**: When data are missing completely at random (MCAR) and sample size is large
- **Consequences**:
  - Reduces statistical power substantially
  - Can introduce bias if data are not MCAR
  - Simple but often wasteful of information
  - May result in unbalanced designs

### **2. Pairwise Deletion**
- **What it does**: Uses all available data for each pairwise comparison
- **Consequences**:
  - Different sample sizes for different comparisons
  - Can produce inconsistent or non-positive definite covariance matrices
  - Generally not recommended for repeated measures ANOVA

### **3. Mean Imputation**
- **What it does**: Replaces missing values with the mean of observed values (group or overall mean)
- **Consequences**:
  - Artificially reduces variance
  - Underestimates standard errors
  - Distorts correlations between time points
  - Can lead to inflated Type I error rates
  - Generally discouraged in modern practice

### **4. Last Observation Carried Forward (LOCF)**
- **What it does**: Uses the last observed value to fill in subsequent missing values
- **Common in**: Clinical trials
- **Consequences**:
  - Assumes no change after dropout (often unrealistic)
  - Can introduce substantial bias
  - Underestimates variability
  - May not be conservative depending on the pattern of change

### **5. Linear Interpolation**
- **What it does**: Estimates missing values based on surrounding observed values
- **Consequences**:
  - Assumes linear trends between time points
  - Reduces variance estimates
  - Better than simple imputation but still problematic
  - Only works for missing data between observed points, not at endpoints

## Modern Recommended Approaches

### **6. Maximum Likelihood Estimation (MLE)**
- **What it does**: Uses all available data to estimate parameters without actually imputing values
- **Implementation**: Mixed models (linear mixed-effects models)
- **Advantages**:
  - Valid under missing at random (MAR) assumption
  - Uses all available information
  - Provides appropriate standard errors
  - No need to impute missing values
- **Consequences**: Requires MAR assumption; results can be biased if data are missing not at random (MNAR)

### **7. Multiple Imputation (MI)**
- **What it does**: Creates multiple plausible datasets with different imputed values, analyzes each, then pools results
- **Advantages**:
  - Accounts for uncertainty in missing values
  - Valid under MAR
  - Provides appropriate standard errors
  - Can incorporate auxiliary variables
- **Consequences**:
  - Computationally intensive
  - Requires careful specification of imputation model
  - Still assumes MAR

## Key Considerations

### **Missing Data Mechanisms**
The consequences of different methods depend heavily on why data are missing:

- **MCAR (Missing Completely At Random)**: Missingness unrelated to any variables. Most methods work, though some are inefficient
- **MAR (Missing At Random)**: Missingness related to observed variables but not unobserved values. MLE and MI are appropriate
- **MNAR (Missing Not At Random)**: Missingness related to unobserved values. All standard methods potentially biased; requires sensitivity analyses or specialized models

### **Practical Recommendations**

1. **First choice**: Use mixed-effects models with maximum likelihood estimation (handles missing data naturally under MAR)

2. **Alternative**: Multiple imputation followed by analysis

3. **Avoid**: Mean imputation, LOCF, or pairwise deletion unless you have strong justification

4. **Always**: 
   - Report the pattern and extent of missing data
   - Investigate the missing data mechanism
   - Conduct sensitivity analyses
   - Compare results across different methods when feasible

### **Impact on Statistical Conclusions**

Different methods can lead to:
- Different estimates of treatment effects
- Different standard errors and confidence intervals
- Different p-values and conclusions about significance
- Different patterns in the time × treatment interaction
- Violations of statistical assumptions (sphericity, independence)

The choice of method for handling missing data can sometimes have a larger impact on your conclusions than the choice between different statistical models.
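
Two of the consequences described above (loss of subjects under listwise deletion, and shrunken variance under mean imputation) can be seen in a small numeric sketch. The scores below are hypothetical, with rows as subjects and columns as time points; only NumPy is assumed.

```python
import numpy as np

# Hypothetical repeated-measures scores; np.nan marks a missed visit
scores = np.array([
    [10.0, 12.0, 15.0],
    [11.0, np.nan, 16.0],
    [9.0, 11.0, np.nan],
    [12.0, 14.0, 18.0],
    [10.0, 13.0, 17.0],
])

# Listwise deletion: keep only subjects observed at every time point
complete = scores[~np.isnan(scores).any(axis=1)]

# Mean imputation: fill each gap with the mean of that time point
col_means = np.nanmean(scores, axis=0)
imputed = np.where(np.isnan(scores), col_means, scores)

# Sample variance per time point, before and after imputation
var_observed = np.nanvar(scores, axis=0, ddof=1)
var_imputed = imputed.var(axis=0, ddof=1)

print(f"Subjects kept by listwise deletion: {complete.shape[0]} of {scores.shape[0]}")
print("Variance from observed values:", np.round(var_observed, 3))
print("Variance after mean imputation:", np.round(var_imputed, 3))
```

Imputed values sit exactly at the mean, so they add nothing to the sum of squares while increasing the count, and the variance at every time point with a gap shrinks.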

"""

In [None]:
"""
Q8. What are some common post-hoc tests used after ANOVA, and when would you use each one? Provide
an example of a situation where a post-hoc test might be necessary.

# Post-Hoc Tests After ANOVA

When ANOVA reveals a significant difference among groups (F-test is significant), it only tells us that *at least one* group differs from the others—not *which specific groups* differ. Post-hoc tests solve this problem by making pairwise comparisons while controlling for Type I error inflation from multiple comparisons.

## Common Post-Hoc Tests

**1. Tukey's HSD (Honestly Significant Difference)**
- **Use when**: You want to compare all possible pairs and group variances are similar; designed for equal sample sizes (the Tukey-Kramer variant handles unequal sizes)
- **Strengths**: Good power while controlling family-wise error rate; widely used and well-understood
- **Controls**: Family-wise error rate at α level
- **Best for**: Balanced designs with moderate number of groups

**2. Bonferroni Correction**
- **Use when**: Making a small number of planned comparisons; want very conservative protection
- **Strengths**: Simple to calculate (divide α by number of comparisons); very strict control
- **Weakness**: Can be overly conservative with many comparisons, reducing power
- **Best for**: When you have specific hypotheses about which groups differ

**3. Scheffé Test**
- **Use when**: Unequal sample sizes; want to make complex comparisons (not just pairwise)
- **Strengths**: Most conservative; allows any type of comparison (including contrasts)
- **Weakness**: Lower power than other methods for simple pairwise comparisons
- **Best for**: Exploratory analyses or when you want maximum flexibility

**4. Dunnett's Test**
- **Use when**: Comparing multiple treatment groups to a single control group only
- **Strengths**: More powerful than other methods when you only need control comparisons
- **Best for**: Experimental designs with one control and multiple treatment conditions

**5. Games-Howell Test**
- **Use when**: Unequal variances across groups (violates homogeneity assumption)
- **Strengths**: Doesn't assume equal variances; handles unequal sample sizes well
- **Best for**: When Levene's test indicates heterogeneity of variance

## Practical Example

**Situation**: A pharmaceutical company tests the effectiveness of four different drug dosages (placebo, 10mg, 20mg, 30mg) on reducing blood pressure. They recruit 100 patients (25 per group) and measure blood pressure reduction after 8 weeks.

**Analysis Steps**:

1. **One-way ANOVA**: F(3, 96) = 8.45, p < 0.001
   - Conclusion: At least one dosage differs significantly

2. **Why post-hoc is necessary**: The significant ANOVA doesn't tell us:
   - Is 10mg better than placebo?
   - Is 30mg better than 20mg?
   - At what dosage does the drug become effective?

3. **Choosing the appropriate test**:
   - If comparing all groups to placebo only → **Dunnett's test**
   - If comparing all possible pairs with equal n → **Tukey's HSD**
   - If variances are unequal, with or without unequal sample sizes (e.g., after dropouts) → **Games-Howell**

4. **Example results using Tukey's HSD**:
   - Placebo vs 10mg: p = 0.24 (not significant)
   - Placebo vs 20mg: p = 0.003 (significant)
   - Placebo vs 30mg: p < 0.001 (significant)
   - 10mg vs 20mg: p = 0.048 (significant)
   - 10mg vs 30mg: p = 0.002 (significant)
   - 20mg vs 30mg: p = 0.42 (not significant)

**Interpretation**: In these illustrative results, the drug shows a significant effect from 20mg upward, with no statistically detectable additional benefit at 30mg. The 10mg dose is not significantly different from placebo.

## Key Considerations

- **Multiple comparison problem**: Making 10 pairwise comparisons with α = 0.05 each gives a ~40% chance of at least one false positive if the tests are independent (1 − 0.95^10 ≈ 0.40)
- **Power vs. protection trade-off**: More conservative tests (Bonferroni, Scheffé) reduce Type I error but increase Type II error
- **Assumptions matter**: Choose tests appropriate for your data structure (equal/unequal variances, balanced/unbalanced design)
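
The multiple comparison arithmetic can be checked directly. This is a minimal sketch that assumes the comparisons are independent (a simplification, since pairwise tests on shared groups are correlated); the function name `familywise_error_rate` is introduced here for illustration.

```python
from math import comb

def familywise_error_rate(n_tests, alpha=0.05):
    # Chance of at least one false positive across n independent tests
    return 1 - (1 - alpha) ** n_tests

n_groups = 4                  # placebo, 10mg, 20mg, 30mg
n_pairs = comb(n_groups, 2)   # number of pairwise comparisons
alpha = 0.05

fwer_uncorrected = familywise_error_rate(n_pairs, alpha)
bonferroni_alpha = alpha / n_pairs
fwer_bonferroni = familywise_error_rate(n_pairs, bonferroni_alpha)

print(f"Pairwise comparisons: {n_pairs}")
print(f"Family-wise error rate, uncorrected: {fwer_uncorrected:.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha:.4f}")
print(f"Family-wise error rate, Bonferroni: {fwer_bonferroni:.3f}")
print(f"Family-wise error rate for 10 tests: {familywise_error_rate(10):.3f}")
```

For the four-dosage example there are 6 pairwise comparisons, so the uncorrected family-wise rate is about 26%, and the Bonferroni per-test threshold brings it back under 5%.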
"""

In [None]:
"""
Q9. A researcher wants to compare the mean weight loss of three diets: A, B, and C. They collect data from
50 participants who were randomly assigned to one of the diets. Conduct a one-way ANOVA using Python
to determine if there are any significant differences between the mean weight loss of the three diets.
Report the F-statistic and p-value, and interpret the results.

"""
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)

# Simulate weight loss data for 50 participants across 3 diets
# Randomly assign participants to diets (approximately equal groups)
n_total = 50
diet_assignments = np.random.choice(['A', 'B', 'C'], size=n_total, p=[0.33, 0.34, 0.33])

# Generate weight loss data with different means for each diet
weight_loss = []
for diet in diet_assignments:
    if diet == 'A':
        # Diet A: mean ~5 kg, std 2 kg
        loss = np.random.normal(5, 2)
    elif diet == 'B':
        # Diet B: mean ~7 kg, std 2 kg
        loss = np.random.normal(7, 2)
    else:  # Diet C
        # Diet C: mean ~6 kg, std 2 kg
        loss = np.random.normal(6, 2)
    weight_loss.append(max(0, loss))  # Ensure non-negative values

# Create DataFrame
df = pd.DataFrame({
    'Diet': diet_assignments,
    'Weight_Loss': weight_loss
})

# Separate data by diet
diet_a = df[df['Diet'] == 'A']['Weight_Loss']
diet_b = df[df['Diet'] == 'B']['Weight_Loss']
diet_c = df[df['Diet'] == 'C']['Weight_Loss']

print("=" * 70)
print("ONE-WAY ANOVA: COMPARING WEIGHT LOSS ACROSS THREE DIETS")
print("=" * 70)

# Descriptive Statistics
print("\n1. DESCRIPTIVE STATISTICS")
print("-" * 70)
summary_stats = df.groupby('Diet')['Weight_Loss'].agg([
    ('Count', 'count'),
    ('Mean', 'mean'),
    ('Std Dev', 'std'),
    ('Min', 'min'),
    ('Max', 'max')
])
print(summary_stats.round(3))

# Conduct One-Way ANOVA
print("\n2. ONE-WAY ANOVA RESULTS")
print("-" * 70)
f_statistic, p_value = stats.f_oneway(diet_a, diet_b, diet_c)

print(f"F-statistic: {f_statistic:.4f}")
print(f"P-value: {p_value:.4f}")

# Interpretation
print("\n3. INTERPRETATION")
print("-" * 70)
alpha = 0.05
print(f"Significance level (α): {alpha}")

if p_value < alpha:
    print(f"\n✓ Result: SIGNIFICANT (p = {p_value:.4f} < {alpha})")
    print("\nConclusion:")
    print("  There ARE statistically significant differences between the mean")
    print("  weight loss of at least two of the three diets.")
    print("\n  We reject the null hypothesis that all diet means are equal.")
else:
    print(f"\n✗ Result: NOT SIGNIFICANT (p = {p_value:.4f} ≥ {alpha})")
    print("\nConclusion:")
    print("  There are NO statistically significant differences between the mean")
    print("  weight loss of the three diets.")
    print("\n  We fail to reject the null hypothesis that all diet means are equal.")

# Calculate effect size (eta-squared)
grand_mean = df['Weight_Loss'].mean()
ss_between = sum([len(df[df['Diet'] == diet]) * 
                   (df[df['Diet'] == diet]['Weight_Loss'].mean() - grand_mean)**2 
                   for diet in ['A', 'B', 'C']])
ss_total = sum((df['Weight_Loss'] - grand_mean)**2)
eta_squared = ss_between / ss_total

print("\n4. EFFECT SIZE")
print("-" * 70)
print(f"Eta-squared (η²): {eta_squared:.4f}")
if eta_squared < 0.01:
    effect = "negligible"
elif eta_squared < 0.06:
    effect = "small"
elif eta_squared < 0.14:
    effect = "medium"
else:
    effect = "large"
print(f"Effect size: {effect}")

# Post-hoc analysis (if significant)
if p_value < alpha:
    print("\n5. POST-HOC PAIRWISE COMPARISONS (Tukey HSD)")
    print("-" * 70)
    from scipy.stats import tukey_hsd
    
    res = tukey_hsd(diet_a, diet_b, diet_c)
    print("\nPairwise Comparisons:")
    comparisons = [('Diet A vs Diet B', 0, 1),
                   ('Diet A vs Diet C', 0, 2),
                   ('Diet B vs Diet C', 1, 2)]
    
    for label, i, j in comparisons:
        p_val = res.pvalue[i, j]
        sig = "***" if p_val < 0.001 else "**" if p_val < 0.01 else "*" if p_val < 0.05 else "ns"
        print(f"  {label}: p = {p_val:.4f} {sig}")

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Box plot
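axes[0].boxplot([diet_a, diet_b, diet_c])
# Set tick labels separately; the 'labels' keyword of boxplot is deprecated in newer Matplotlib
axes[0].set_xticklabels(['Diet A', 'Diet B', 'Diet C'])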
axes[0].set_ylabel('Weight Loss (kg)', fontsize=12)
axes[0].set_xlabel('Diet Type', fontsize=12)
axes[0].set_title('Weight Loss Distribution by Diet', fontsize=14, fontweight='bold')
axes[0].grid(axis='y', alpha=0.3)

# Bar plot with error bars
means = [diet_a.mean(), diet_b.mean(), diet_c.mean()]
stds = [diet_a.std(), diet_b.std(), diet_c.std()]
x_pos = [0, 1, 2]
axes[1].bar(x_pos, means, yerr=stds, capsize=5, alpha=0.7, 
            color=['skyblue', 'lightcoral', 'lightgreen'])
axes[1].set_xticks(x_pos)
axes[1].set_xticklabels(['Diet A', 'Diet B', 'Diet C'])
axes[1].set_ylabel('Mean Weight Loss (kg)', fontsize=12)
axes[1].set_xlabel('Diet Type', fontsize=12)
axes[1].set_title('Mean Weight Loss by Diet (±1 SD)', fontsize=14, fontweight='bold')
axes[1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

print("\n" + "=" * 70)
print("ASSUMPTIONS CHECK")
print("=" * 70)
print("\nFor valid ANOVA results, check these assumptions:")
print("1. Independence: Participants were randomly assigned ✓")
print("2. Normality: Each group should be approximately normal")
print("3. Homogeneity of variance: Groups should have similar variances")

# Levene's test for homogeneity of variance
levene_stat, levene_p = stats.levene(diet_a, diet_b, diet_c)
print(f"\nLevene's Test for Homogeneity of Variance:")
print(f"  Statistic: {levene_stat:.4f}, p-value: {levene_p:.4f}")
if levene_p > 0.05:
    print("  ✓ No evidence of unequal variances (p > 0.05)")
else:
    print("  ✗ Variances may not be homogeneous (p ≤ 0.05)")
    print("  Consider using Welch's ANOVA as an alternative")

print("\n" + "=" * 70)

In [None]:
"""
Q10. A company wants to know if there are any significant differences in the average time it takes to
complete a task using three different software programs: Program A, Program B, and Program C. They
randomly assign 30 employees to one of the programs and record the time it takes each employee to
complete the task. Conduct a two-way ANOVA using Python to determine if there are any main effects or
interaction effects between the software programs and employee experience level (novice vs.
experienced). Report the F-statistics and p-values, and interpret the results.
"""
import React, { useState } from 'react';
import { BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, Legend, ResponsiveContainer, LineChart, Line } from 'recharts';

const TwoWayANOVA = () => {
  const [results, setResults] = useState(null);
  const [data, setData] = useState(null);

  // Generate realistic sample data
  const generateData = () => {
    const programs = ['Program A', 'Program B', 'Program C'];
    const experience = ['Novice', 'Experienced'];
    const sampleData = [];
    
    // Base times and effects for realistic data
    const baseTimes = {
      'Program A': { 'Novice': 45, 'Experienced': 30 },
      'Program B': { 'Novice': 40, 'Experienced': 28 },
      'Program C': { 'Novice': 50, 'Experienced': 35 }
    };
    
    let id = 1;
    programs.forEach(program => {
      experience.forEach(exp => {
        // 5 employees per group (30 total / 6 groups)
        for (let i = 0; i < 5; i++) {
          const baseTime = baseTimes[program][exp];
          const noise = (Math.random() - 0.5) * 8;
          sampleData.push({
            id: id++,
            program: program,
            experience: exp,
            time: Math.max(15, baseTime + noise)
          });
        }
      });
    });
    
    return sampleData;
  };

  // Calculate mean for each group
  const calculateGroupMeans = (data) => {
    const groups = {};
    data.forEach(row => {
      const key = `${row.program}_${row.experience}`;
      if (!groups[key]) {
        groups[key] = { sum: 0, count: 0, program: row.program, experience: row.experience };
      }
      groups[key].sum += row.time;
      groups[key].count += 1;
    });
    
    return Object.values(groups).map(g => ({
      program: g.program,
      experience: g.experience,
      mean: g.sum / g.count
    }));
  };

  // Two-way ANOVA calculation
  const calculateANOVA = (data) => {
    const n = data.length;
    const grandMean = data.reduce((sum, d) => sum + d.time, 0) / n;
    
    // Group data
    const programGroups = {};
    const expGroups = {};
    const cellGroups = {};
    
    data.forEach(d => {
      // Program groups
      if (!programGroups[d.program]) programGroups[d.program] = [];
      programGroups[d.program].push(d.time);
      
      // Experience groups
      if (!expGroups[d.experience]) expGroups[d.experience] = [];
      expGroups[d.experience].push(d.time);
      
      // Cell groups (interaction)
      const cellKey = `${d.program}_${d.experience}`;
      if (!cellGroups[cellKey]) cellGroups[cellKey] = [];
      cellGroups[cellKey].push(d.time);
    });
    
    // Calculate means
    const programMeans = {};
    Object.keys(programGroups).forEach(p => {
      programMeans[p] = programGroups[p].reduce((a, b) => a + b, 0) / programGroups[p].length;
    });
    
    const expMeans = {};
    Object.keys(expGroups).forEach(e => {
      expMeans[e] = expGroups[e].reduce((a, b) => a + b, 0) / expGroups[e].length;
    });
    
    const cellMeans = {};
    Object.keys(cellGroups).forEach(c => {
      cellMeans[c] = cellGroups[c].reduce((a, b) => a + b, 0) / cellGroups[c].length;
    });
    
    // Calculate Sum of Squares
    // Total SS
    const SST = data.reduce((sum, d) => sum + Math.pow(d.time - grandMean, 2), 0);
    
    // Program effect (Factor A)
    const SSA = Object.keys(programGroups).reduce((sum, p) => {
      return sum + programGroups[p].length * Math.pow(programMeans[p] - grandMean, 2);
    }, 0);
    
    // Experience effect (Factor B)
    const SSB = Object.keys(expGroups).reduce((sum, e) => {
      return sum + expGroups[e].length * Math.pow(expMeans[e] - grandMean, 2);
    }, 0);
    
    // Interaction effect (A × B)
    let SSAB = 0;
    data.forEach(d => {
      const cellKey = `${d.program}_${d.experience}`;
      SSAB += Math.pow(cellMeans[cellKey] - programMeans[d.program] - expMeans[d.experience] + grandMean, 2);
    });
    
    // Error (Within groups)
    const SSE = data.reduce((sum, d) => {
      const cellKey = `${d.program}_${d.experience}`;
      return sum + Math.pow(d.time - cellMeans[cellKey], 2);
    }, 0);
    
    // Degrees of freedom
    const dfA = Object.keys(programGroups).length - 1; // 2
    const dfB = Object.keys(expGroups).length - 1; // 1
    const dfAB = dfA * dfB; // 2
    const dfE = n - (dfA + 1) * (dfB + 1); // 24
    const dfT = n - 1; // 29
    
    // Mean Squares
    const MSA = SSA / dfA;
    const MSB = SSB / dfB;
    const MSAB = SSAB / dfAB;
    const MSE = SSE / dfE;
    
    // F-statistics
    const FA = MSA / MSE;
    const FB = MSB / MSE;
    const FAB = MSAB / MSE;
    
    // Approximate p-values using F-distribution approximation
    const pValueA = calculatePValue(FA, dfA, dfE);
    const pValueB = calculatePValue(FB, dfB, dfE);
    const pValueAB = calculatePValue(FAB, dfAB, dfE);
    
    return {
      grandMean: grandMean.toFixed(2),
      programMeans,
      expMeans,
      cellMeans,
      anovaTable: [
        { source: 'Program (A)', SS: SSA.toFixed(2), df: dfA, MS: MSA.toFixed(2), F: FA.toFixed(3), p: pValueA.toFixed(4) },
        { source: 'Experience (B)', SS: SSB.toFixed(2), df: dfB, MS: MSB.toFixed(2), F: FB.toFixed(3), p: pValueB.toFixed(4) },
        { source: 'Interaction (A×B)', SS: SSAB.toFixed(2), df: dfAB, MS: MSAB.toFixed(2), F: FAB.toFixed(3), p: pValueAB.toFixed(4) },
        { source: 'Error', SS: SSE.toFixed(2), df: dfE, MS: MSE.toFixed(2), F: '-', p: '-' },
        { source: 'Total', SS: SST.toFixed(2), df: dfT, MS: '-', F: '-', p: '-' }
      ],
      significance: {
        program: pValueA < 0.05,
        experience: pValueB < 0.05,
        interaction: pValueAB < 0.05
      }
    };
  };
  
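  // Upper-tail p-value P(F(df1, df2) > F), computed by numerical integration.
  // Identity: p = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * F); substituting
  // t = sin^2(theta) turns the beta integrand into sin^(df2-1) * cos^(df1-1).
  const calculatePValue = (F, df1, df2) => {
    if (!(F > 0)) return 1;
    const g = theta => Math.pow(Math.sin(theta), df2 - 1) * Math.pow(Math.cos(theta), df1 - 1);
    // Composite Simpson's rule on [0, upper]
    const simpson = (upper, steps = 2000) => {
      const h = upper / steps;
      let sum = g(0) + g(upper);
      for (let i = 1; i < steps; i++) sum += g(i * h) * (i % 2 ? 4 : 2);
      return (sum * h) / 3;
    };
    const x = df2 / (df2 + df1 * F);
    return simpson(Math.asin(Math.sqrt(x))) / simpson(Math.PI / 2);
  };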

  const runAnalysis = () => {
    const sampleData = generateData();
    setData(sampleData);
    const anovaResults = calculateANOVA(sampleData);
    setResults(anovaResults);
  };

  // Prepare chart data
  const getChartData = () => {
    if (!data) return [];
    const means = calculateGroupMeans(data);
    return means.map(m => ({
      name: m.program,
      [m.experience]: parseFloat(m.mean.toFixed(2))
    })).reduce((acc, curr) => {
      const existing = acc.find(item => item.name === curr.name);
      if (existing) {
        Object.assign(existing, curr);
      } else {
        acc.push(curr);
      }
      return acc;
    }, []);
  };

  return (
    <div className="min-h-screen bg-gradient-to-br from-blue-50 to-indigo-100 p-8">
      <div className="max-w-6xl mx-auto">
        <div className="bg-white rounded-xl shadow-lg p-8 mb-6">
          <h1 className="text-3xl font-bold text-gray-800 mb-2">Two-Way ANOVA Analysis</h1>
          <p className="text-gray-600 mb-6">Software Programs × Employee Experience Level</p>
          
          <button
            onClick={runAnalysis}
            className="bg-indigo-600 hover:bg-indigo-700 text-white font-semibold py-3 px-6 rounded-lg shadow-md transition duration-200"
          >
            Generate Data & Run Analysis
          </button>
        </div>

        {results && (
          <>
            <div className="bg-white rounded-xl shadow-lg p-8 mb-6">
              <h2 className="text-2xl font-bold text-gray-800 mb-4">ANOVA Table</h2>
              <div className="overflow-x-auto">
                <table className="w-full text-left border-collapse">
                  <thead>
                    <tr className="bg-indigo-100">
                      <th className="border border-gray-300 px-4 py-2 font-semibold">Source</th>
                      <th className="border border-gray-300 px-4 py-2 font-semibold">SS</th>
                      <th className="border border-gray-300 px-4 py-2 font-semibold">df</th>
                      <th className="border border-gray-300 px-4 py-2 font-semibold">MS</th>
                      <th className="border border-gray-300 px-4 py-2 font-semibold">F</th>
                      <th className="border border-gray-300 px-4 py-2 font-semibold">p-value</th>
                    </tr>
                  </thead>
                  <tbody>
                    {results.anovaTable.map((row, idx) => (
                      <tr key={idx} className={idx % 2 === 0 ? 'bg-gray-50' : 'bg-white'}>
                        <td className="border border-gray-300 px-4 py-2 font-medium">{row.source}</td>
                        <td className="border border-gray-300 px-4 py-2">{row.SS}</td>
                        <td className="border border-gray-300 px-4 py-2">{row.df}</td>
                        <td className="border border-gray-300 px-4 py-2">{row.MS}</td>
                        <td className="border border-gray-300 px-4 py-2">{row.F}</td>
                        <td className="border border-gray-300 px-4 py-2">
                          {row.p !== '-' && parseFloat(row.p) < 0.05 ? (
                            <span className="text-green-600 font-semibold">{row.p} *</span>
                          ) : (
                            row.p
                          )}
                        </td>
                      </tr>
                    ))}
                  </tbody>
                </table>
                <p className="text-sm text-gray-600 mt-2">* Significant at α = 0.05</p>
              </div>
            </div>

            <div className="bg-white rounded-xl shadow-lg p-8 mb-6">
              <h2 className="text-2xl font-bold text-gray-800 mb-4">Mean Completion Times</h2>
              <ResponsiveContainer width="100%" height={300}>
                <BarChart data={getChartData()}>
                  <CartesianGrid strokeDasharray="3 3" />
                  <XAxis dataKey="name" />
                  <YAxis label={{ value: 'Time (minutes)', angle: -90, position: 'insideLeft' }} />
                  <Tooltip />
                  <Legend />
                  <Bar dataKey="Novice" fill="#8b5cf6" />
                  <Bar dataKey="Experienced" fill="#3b82f6" />
                </BarChart>
              </ResponsiveContainer>
            </div>

            <div className="bg-white rounded-xl shadow-lg p-8">
              <h2 className="text-2xl font-bold text-gray-800 mb-4">Interpretation</h2>
              
              <div className="space-y-4">
                <div className="p-4 bg-blue-50 rounded-lg border-l-4 border-blue-500">
                  <h3 className="font-bold text-lg mb-2">Main Effect: Software Program</h3>
                  <p className="text-gray-700">
                    <span className="font-semibold">F-statistic:</span> {results.anovaTable[0].F}, 
                    <span className="font-semibold"> p-value:</span> {results.anovaTable[0].p}
                  </p>
                  <p className="mt-2 text-gray-700">
                    {results.significance.program ? (
                      <span className="text-green-700 font-semibold">✓ Significant:</span>
                    ) : (
                      <span className="text-red-700 font-semibold">✗ Not Significant:</span>
                    )}
                    {results.significance.program 
                      ? ' There is a statistically significant difference in completion times between the three software programs (p < 0.05).'
                      : ' There is no statistically significant difference in completion times between the three software programs (p ≥ 0.05).'}
                  </p>
                </div>

                <div className="p-4 bg-green-50 rounded-lg border-l-4 border-green-500">
                  <h3 className="font-bold text-lg mb-2">Main Effect: Experience Level</h3>
                  <p className="text-gray-700">
                    <span className="font-semibold">F-statistic:</span> {results.anovaTable[1].F}, 
                    <span className="font-semibold"> p-value:</span> {results.anovaTable[1].p}
                  </p>
                  <p className="mt-2 text-gray-700">
                    {results.significance.experience ? (
                      <span className="text-green-700 font-semibold">✓ Significant:</span>
                    ) : (
                      <span className="text-red-700 font-semibold">✗ Not Significant:</span>
                    )}
                    {results.significance.experience 
                      ? ` There is a statistically significant difference in completion times between novice and experienced employees (p < 0.05). ${results.expMeans['Experienced'] < results.expMeans['Novice'] ? 'Experienced' : 'Novice'} employees complete the task faster on average.`
                      : ' There is no statistically significant difference in completion times between novice and experienced employees (p ≥ 0.05).'}
                  </p>
                </div>

                <div className="p-4 bg-purple-50 rounded-lg border-l-4 border-purple-500">
                  <h3 className="font-bold text-lg mb-2">Interaction Effect: Program × Experience</h3>
                  <p className="text-gray-700">
                    <span className="font-semibold">F-statistic:</span> {results.anovaTable[2].F}, 
                    <span className="font-semibold"> p-value:</span> {results.anovaTable[2].p}
                  </p>
                  <p className="mt-2 text-gray-700">
                    {results.significance.interaction ? (
                      <span className="text-green-700 font-semibold">✓ Significant:</span>
                    ) : (
                      <span className="text-red-700 font-semibold">✗ Not Significant:</span>
                    )}
                    {results.significance.interaction 
                      ? ' There is a statistically significant interaction effect (p < 0.05). The effect of software program on completion time depends on the employee\'s experience level.'
                      : ' There is no statistically significant interaction effect (p ≥ 0.05). The data provide no evidence that the effect of software program on completion time differs between experience levels.'}
                  </p>
                </div>

                <div className="p-4 bg-gray-100 rounded-lg">
                  <h3 className="font-bold text-lg mb-2">Overall Conclusion</h3>
                  <p className="text-gray-700">
                    Grand Mean: <span className="font-semibold">{results.grandMean} minutes</span>
                  </p>
                  <p className="mt-2 text-gray-700">
                    Based on this two-way ANOVA analysis with α = 0.05, we can conclude that:
                  </p>
                  <ul className="list-disc list-inside mt-2 space-y-1 text-gray-700">
                    <li>The choice of software program {results.significance.program ? 'does' : 'does not'} significantly affect task completion time</li>
                    <li>Employee experience level {results.significance.experience ? 'does' : 'does not'} significantly affect task completion time</li>
                    <li>There {results.significance.interaction ? 'is' : 'is not'} a significant interaction between software program and experience level</li>
                  </ul>
                </div>
              </div>
            </div>
          </>
        )}
      </div>
    </div>
  );
};

export default TwoWayANOVA;
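
Q10 asks for the analysis in Python, so here is a minimal standard-library sketch of the same balanced two-way ANOVA. It assumes equal cell sizes and simulates data from the same base times as the component above. The names `f_sf` and `two_way_anova` are illustrative helpers, not a library API; with SciPy available, `scipy.stats.f.sf` could replace `f_sf`.

```python
import math
import random
from statistics import mean


def f_sf(f_value, df1, df2, steps=2000):
    """Upper-tail probability P(F > f_value) for an F(df1, df2) distribution.

    Uses sf = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f_value); the
    substitution t = sin^2(theta) leaves a smooth integrand for Simpson's rule.
    """
    if f_value <= 0:
        return 1.0

    def g(theta):
        return math.sin(theta) ** (df2 - 1) * math.cos(theta) ** (df1 - 1)

    def simpson(upper):
        h = upper / steps
        total = g(0.0) + g(upper)
        for i in range(1, steps):
            total += g(i * h) * (4 if i % 2 else 2)
        return total * h / 3

    x = df2 / (df2 + df1 * f_value)
    return simpson(math.asin(math.sqrt(x))) / simpson(math.pi / 2)


def two_way_anova(rows):
    """Balanced two-way ANOVA; rows are (program, experience, time) tuples."""
    levels_a = sorted({a for a, _, _ in rows})
    levels_b = sorted({b for _, b, _ in rows})
    grand = mean(v for _, _, v in rows)
    mean_a = {a: mean(v for x, _, v in rows if x == a) for a in levels_a}
    mean_b = {b: mean(v for _, y, v in rows if y == b) for b in levels_b}
    cell = {(a, b): mean(v for x, y, v in rows if (x, y) == (a, b))
            for a in levels_a for b in levels_b}

    # Summing over rows weights each squared deviation by its group size.
    ss = {
        "program": sum((mean_a[a] - grand) ** 2 for a, _, _ in rows),
        "experience": sum((mean_b[b] - grand) ** 2 for _, b, _ in rows),
        "interaction": sum((cell[a, b] - mean_a[a] - mean_b[b] + grand) ** 2
                           for a, b, _ in rows),
    }
    ss_error = sum((v - cell[a, b]) ** 2 for a, b, v in rows)
    df = {"program": len(levels_a) - 1, "experience": len(levels_b) - 1}
    df["interaction"] = df["program"] * df["experience"]
    df_error = len(rows) - len(levels_a) * len(levels_b)
    ms_error = ss_error / df_error

    table = {}
    for source in ss:
        f_stat = (ss[source] / df[source]) / ms_error
        table[source] = (f_stat, f_sf(f_stat, df[source], df_error))
    return table


# Simulated data: 5 employees per cell, base times as in the component above.
random.seed(0)
base_times = {
    ("A", "Novice"): 45, ("A", "Experienced"): 30,
    ("B", "Novice"): 40, ("B", "Experienced"): 28,
    ("C", "Novice"): 50, ("C", "Experienced"): 35,
}
rows = [(p, e, base + random.uniform(-4, 4))
        for (p, e), base in base_times.items() for _ in range(5)]

table = two_way_anova(rows)
for source, (f_stat, p_value) in table.items():
    print(f"{source:<12} F = {f_stat:9.3f}  p = {p_value:.4g}")
```

Because the data are simulated, the printed F-statistics and p-values illustrate the procedure only; they are not findings about real software.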

In [None]:
"""
Q11. An educational researcher is interested in whether a new teaching method improves student test
scores. They randomly assign 100 students to either the control group (traditional teaching method) or the
experimental group (new teaching method) and administer a test at the end of the semester. Conduct a
two-sample t-test using Python to determine if there are any significant differences in test scores
between the two groups. If the results are significant, follow up with a post-hoc test to determine which
group(s) differ significantly from each other.

"""
import React, { useState, useEffect } from 'react';
import { BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, ResponsiveContainer, ScatterChart, Scatter } from 'recharts';
import { AlertCircle, CheckCircle, TrendingUp } from 'lucide-react';

const TeachingMethodAnalysis = () => {
  const [results, setResults] = useState(null);
  const [loading, setLoading] = useState(false);

  // Generate simulated test scores
  const generateData = (mean, std, n) => {
    const data = [];
    for (let i = 0; i < n; i++) {
      // Box-Muller transform for normal distribution
      const u1 = 1 - Math.random(); // in (0, 1], so Math.log(u1) stays finite
      const u2 = Math.random();
      const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
      data.push(mean + z * std);
    }
    return data;
  };

  const runAnalysis = () => {
    setLoading(true);
    
    // Simulate test scores
    const controlScores = generateData(72, 12, 50);
    const experimentalScores = generateData(78, 11, 50);
    
    // Calculate statistics
    const calcMean = arr => arr.reduce((a, b) => a + b, 0) / arr.length;
    const calcStd = (arr, mean) => {
      const variance = arr.reduce((sum, val) => sum + Math.pow(val - mean, 2), 0) / (arr.length - 1);
      return Math.sqrt(variance);
    };
    
    const controlMean = calcMean(controlScores);
    const expMean = calcMean(experimentalScores);
    const controlStd = calcStd(controlScores, controlMean);
    const expStd = calcStd(experimentalScores, expMean);
    
    const n1 = controlScores.length;
    const n2 = experimentalScores.length;
    
    // Pooled standard deviation for equal variance t-test
    const pooledStd = Math.sqrt(((n1 - 1) * controlStd ** 2 + (n2 - 1) * expStd ** 2) / (n1 + n2 - 2));
    
    // T-statistic
    const tStat = (expMean - controlMean) / (pooledStd * Math.sqrt(1/n1 + 1/n2));
    
    // Degrees of freedom
    const df = n1 + n2 - 2;
    
    // Approximate p-value (two-tailed)
    const pValue = 2 * (1 - approximateTCDF(Math.abs(tStat), df));
    
    // Effect size (Cohen's d)
    const cohensD = (expMean - controlMean) / pooledStd;
    
    // Confidence interval for difference in means
    const tCritical = 1.984; // approximately for df=98, alpha=0.05
    const marginError = tCritical * pooledStd * Math.sqrt(1/n1 + 1/n2);
    const ciLower = (expMean - controlMean) - marginError;
    const ciUpper = (expMean - controlMean) + marginError;
    
    setResults({
      controlScores,
      experimentalScores,
      controlMean,
      expMean,
      controlStd,
      expStd,
      tStat,
      pValue,
      df,
      cohensD,
      ciLower,
      ciUpper,
      significant: pValue < 0.05
    });
    
    setLoading(false);
  };

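  // Student's t CDF by numerical integration. For t > 0,
  // CDF(t) = 1 - 0.5 * I_x(df/2, 1/2) with x = df / (df + t^2); substituting
  // u = sin^2(theta) turns the beta integrand into sin^(df-1)(theta).
  const approximateTCDF = (t, df) => {
    if (t < 0) return 1 - approximateTCDF(-t, df);
    if (t === 0) return 0.5;
    const g = theta => Math.pow(Math.sin(theta), df - 1);
    // Composite Simpson's rule on [0, upper]
    const simpson = (upper, steps = 2000) => {
      const h = upper / steps;
      let sum = g(0) + g(upper);
      for (let i = 1; i < steps; i++) sum += g(i * h) * (i % 2 ? 4 : 2);
      return (sum * h) / 3;
    };
    const x = df / (df + t * t);
    return 1 - 0.5 * simpson(Math.asin(Math.sqrt(x))) / simpson(Math.PI / 2);
  };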

  useEffect(() => {
    runAnalysis();
  }, []);

  if (loading || !results) {
    return (
      <div className="flex items-center justify-center h-64">
        <div className="text-lg text-gray-600">Running analysis...</div>
      </div>
    );
  }

  const summaryData = [
    {
      group: 'Control (Traditional)',
      mean: results.controlMean,
      std: results.controlStd,
      n: 50
    },
    {
      group: 'Experimental (New Method)',
      mean: results.expMean,
      std: results.expStd,
      n: 50
    }
  ];

  const scatterData = [
    ...results.controlScores.map((score, i) => ({ 
      group: 'Control', 
      index: i, 
      score: score,
      x: 0 + (Math.random() - 0.5) * 0.3
    })),
    ...results.experimentalScores.map((score, i) => ({ 
      group: 'Experimental', 
      index: i, 
      score: score,
      x: 1 + (Math.random() - 0.5) * 0.3
    }))
  ];

  const getEffectSizeInterpretation = (d) => {
    const absDd = Math.abs(d);
    if (absDd < 0.2) return 'negligible';
    if (absDd < 0.5) return 'small';
    if (absDd < 0.8) return 'medium';
    return 'large';
  };

  return (
    <div className="w-full max-w-6xl mx-auto p-6 bg-gray-50">
      <div className="bg-white rounded-lg shadow-lg p-6 mb-6">
        <h1 className="text-3xl font-bold text-gray-800 mb-2">
          Two-Sample T-Test Analysis
        </h1>
        <p className="text-gray-600 mb-4">
          Comparing Traditional vs. New Teaching Method
        </p>
        
        {/* Hypothesis */}
        <div className="bg-blue-50 border-l-4 border-blue-500 p-4 mb-6">
          <h3 className="font-semibold text-blue-900 mb-2">Hypotheses:</h3>
          <p className="text-sm text-blue-800">
            <strong>H₀:</strong> μ₁ = μ₂ (No difference in mean test scores)<br/>
            <strong>H₁:</strong> μ₁ ≠ μ₂ (Significant difference in mean test scores)<br/>
            <strong>α</strong> = 0.05 (significance level)
          </p>
        </div>

        {/* Results Summary */}
        <div className={`border-l-4 p-4 mb-6 ${
          results.significant 
            ? 'bg-green-50 border-green-500' 
            : 'bg-yellow-50 border-yellow-500'
        }`}>
          <div className="flex items-start">
            {results.significant ? (
              <CheckCircle className="w-6 h-6 text-green-600 mr-3 mt-1" />
            ) : (
              <AlertCircle className="w-6 h-6 text-yellow-600 mr-3 mt-1" />
            )}
            <div>
              <h3 className="font-semibold text-lg mb-2">
                {results.significant 
                  ? 'Statistically Significant Result' 
                  : 'No Significant Difference'}
              </h3>
              <p className="text-sm">
                {results.significant
                  ? `The new teaching method shows a statistically significant ${results.expMean > results.controlMean ? 'improvement' : 'decline'} in test scores (p = ${results.pValue.toFixed(4)}).`
                  : `No statistically significant difference was found between the two teaching methods (p = ${results.pValue.toFixed(4)}).`
                }
              </p>
            </div>
          </div>
        </div>

        {/* Statistical Results */}
        <div className="grid grid-cols-1 md:grid-cols-2 gap-4 mb-6">
          <div className="bg-gray-50 p-4 rounded-lg">
            <h3 className="font-semibold mb-3 text-gray-700">Test Statistics</h3>
            <div className="space-y-2 text-sm">
              <div className="flex justify-between">
                <span>t-statistic:</span>
                <span className="font-mono font-semibold">{results.tStat.toFixed(4)}</span>
              </div>
              <div className="flex justify-between">
                <span>Degrees of freedom:</span>
                <span className="font-mono font-semibold">{results.df}</span>
              </div>
              <div className="flex justify-between">
                <span>p-value:</span>
                <span className={`font-mono font-semibold ${
                  results.pValue < 0.05 ? 'text-green-600' : 'text-gray-700'
                }`}>
                  {results.pValue.toFixed(4)}
                </span>
              </div>
              <div className="flex justify-between">
                <span>Significance (α = 0.05):</span>
                <span className="font-semibold">
                  {results.significant ? 'Yes' : 'No'}
                </span>
              </div>
            </div>
          </div>

          <div className="bg-gray-50 p-4 rounded-lg">
            <h3 className="font-semibold mb-3 text-gray-700">Effect Size</h3>
            <div className="space-y-2 text-sm">
              <div className="flex justify-between">
                <span>Cohen's d:</span>
                <span className="font-mono font-semibold">{results.cohensD.toFixed(4)}</span>
              </div>
              <div className="flex justify-between">
                <span>Interpretation:</span>
                <span className="font-semibold capitalize">
                  {getEffectSizeInterpretation(results.cohensD)}
                </span>
              </div>
              <div className="mt-3 pt-3 border-t border-gray-200">
                <p className="text-xs text-gray-600">
                  95% CI for difference:<br/>
                  [{results.ciLower.toFixed(2)}, {results.ciUpper.toFixed(2)}]
                </p>
              </div>
            </div>
          </div>
        </div>

        {/* Descriptive Statistics */}
        <div className="mb-6">
          <h3 className="font-semibold text-lg mb-3 text-gray-800">Descriptive Statistics</h3>
          <div className="overflow-x-auto">
            <table className="min-w-full bg-white border border-gray-300">
              <thead className="bg-gray-100">
                <tr>
                  <th className="px-4 py-2 border-b text-left">Group</th>
                  <th className="px-4 py-2 border-b text-right">n</th>
                  <th className="px-4 py-2 border-b text-right">Mean</th>
                  <th className="px-4 py-2 border-b text-right">Std Dev</th>
                  <th className="px-4 py-2 border-b text-right">Std Error</th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <td className="px-4 py-2 border-b">Control (Traditional)</td>
                  <td className="px-4 py-2 border-b text-right font-mono">50</td>
                  <td className="px-4 py-2 border-b text-right font-mono">{results.controlMean.toFixed(2)}</td>
                  <td className="px-4 py-2 border-b text-right font-mono">{results.controlStd.toFixed(2)}</td>
                  <td className="px-4 py-2 border-b text-right font-mono">{(results.controlStd / Math.sqrt(50)).toFixed(2)}</td>
                </tr>
                <tr>
                  <td className="px-4 py-2 border-b">Experimental (New)</td>
                  <td className="px-4 py-2 border-b text-right font-mono">50</td>
                  <td className="px-4 py-2 border-b text-right font-mono">{results.expMean.toFixed(2)}</td>
                  <td className="px-4 py-2 border-b text-right font-mono">{results.expStd.toFixed(2)}</td>
                  <td className="px-4 py-2 border-b text-right font-mono">{(results.expStd / Math.sqrt(50)).toFixed(2)}</td>
                </tr>
                <tr className="bg-gray-50 font-semibold">
                  <td className="px-4 py-2" colSpan="2">Mean Difference</td>
                  <td className="px-4 py-2 text-right font-mono" colSpan="3">
                    {(results.expMean - results.controlMean).toFixed(2)} points
                  </td>
                </tr>
              </tbody>
            </table>
          </div>
        </div>

        {/* Visualizations */}
        <div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
          <div>
            <h3 className="font-semibold text-lg mb-3 text-gray-800">Mean Test Scores</h3>
            <ResponsiveContainer width="100%" height={300}>
              <BarChart data={summaryData}>
                <CartesianGrid strokeDasharray="3 3" />
                <XAxis dataKey="group" angle={-15} textAnchor="end" height={80} fontSize={12} />
                <YAxis label={{ value: 'Mean Score', angle: -90, position: 'insideLeft' }} />
                <Tooltip />
                <Bar dataKey="mean" fill="#3b82f6" />
              </BarChart>
            </ResponsiveContainer>
          </div>

          <div>
            <h3 className="font-semibold text-lg mb-3 text-gray-800">Score Distribution</h3>
            <ResponsiveContainer width="100%" height={300}>
              <ScatterChart>
                <CartesianGrid strokeDasharray="3 3" />
                <XAxis type="number" dataKey="x" domain={[-0.5, 1.5]} ticks={[0, 1]} 
                       tickFormatter={(value) => value === 0 ? 'Control' : 'Experimental'} />
                <YAxis label={{ value: 'Test Score', angle: -90, position: 'insideLeft' }} />
                <Tooltip 
                  formatter={(value) => value.toFixed(2)}
                  labelFormatter={(value) => value < 0.5 ? 'Control' : 'Experimental'}
                />
                <Scatter 
                  data={scatterData.filter(d => d.group === 'Control')} 
                  fill="#ef4444" 
                  opacity={0.6}
                />
                <Scatter 
                  data={scatterData.filter(d => d.group === 'Experimental')} 
                  fill="#10b981" 
                  opacity={0.6}
                />
              </ScatterChart>
            </ResponsiveContainer>
          </div>
        </div>

        {/* Post-hoc Analysis Note */}
        {results.significant && (
          <div className="bg-indigo-50 border-l-4 border-indigo-500 p-4 mb-6">
            <div className="flex items-start">
              <TrendingUp className="w-5 h-5 text-indigo-600 mr-3 mt-0.5" />
              <div>
                <h3 className="font-semibold text-indigo-900 mb-2">Post-hoc Analysis Note</h3>
                <p className="text-sm text-indigo-800">
                  Since we only have two groups in this analysis, a post-hoc test is not necessary. 
                  The significant t-test result directly indicates that the <strong>Experimental (New Method)</strong> group 
                  performed significantly {results.expMean > results.controlMean ? 'better' : 'worse'} than the <strong>Control (Traditional)</strong> group.
                </p>
                <p className="text-sm text-indigo-800 mt-2">
                  Post-hoc tests (like Tukey's HSD, Bonferroni, or Scheffé) are typically used when comparing 
                  three or more groups after a significant ANOVA result to determine which specific pairs differ.
                </p>
              </div>
            </div>
          </div>
        )}

        {/* Interpretation */}
        <div className="bg-gray-50 p-4 rounded-lg">
          <h3 className="font-semibold text-lg mb-3 text-gray-800">Interpretation</h3>
          <div className="space-y-2 text-sm text-gray-700">
            <p>
              <strong>Conclusion:</strong> {results.significant 
                ? `We reject the null hypothesis. There is sufficient evidence to conclude that the new teaching method results in significantly different test scores compared to the traditional method (t(${results.df}) = ${results.tStat.toFixed(2)}, p = ${results.pValue.toFixed(4)}).`
                : `We fail to reject the null hypothesis. There is insufficient evidence to conclude that the teaching methods produce different test scores (t(${results.df}) = ${results.tStat.toFixed(2)}, p = ${results.pValue.toFixed(4)}).`
              }
            </p>
            <p>
              <strong>Practical Significance:</strong> The effect size (Cohen's d = {results.cohensD.toFixed(3)}) 
              is considered {getEffectSizeInterpretation(results.cohensD)}, suggesting {
                Math.abs(results.cohensD) >= 0.5 
                  ? 'a meaningful practical difference' 
                  : 'limited practical significance'
              } between the teaching methods.
            </p>
            <p>
              <strong>Mean Difference:</strong> Students in the experimental group scored an average of {' '}
              {Math.abs(results.expMean - results.controlMean).toFixed(2)} points {
                results.expMean > results.controlMean ? 'higher' : 'lower'
              } than the control group.
            </p>
          </div>
        </div>

        {/* Assumptions */}
        <div className="mt-6 bg-yellow-50 border border-yellow-200 p-4 rounded-lg">
          <h3 className="font-semibold text-yellow-900 mb-2">Assumptions of Two-Sample T-Test</h3>
          <ul className="list-disc list-inside text-sm text-yellow-800 space-y-1">
            <li>Independence: Observations are independent within and between groups</li>
            <li>Normality: Scores in each group are approximately normally distributed</li>
            <li>Homogeneity of variance: Both groups have similar variances</li>
            <li>Random assignment: Students were randomly assigned to groups</li>
          </ul>
          <p className="text-xs text-yellow-700 mt-2">
            Note: In practice, these assumptions should be tested before conducting the analysis.
          </p>
        </div>

        <button
          onClick={runAnalysis}
          className="mt-6 w-full bg-blue-600 text-white py-2 px-4 rounded-lg hover:bg-blue-700 transition-colors"
        >
          Regenerate Analysis with New Data
        </button>
      </div>
    </div>
  );
};

export default TeachingMethodAnalysis;
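
Q11 asks for the test in Python, so here is a minimal standard-library sketch of the pooled two-sample t-test. The scores are simulated with the same means and standard deviations as the component above, and `t_two_sided_p` and `pooled_t_test` are illustrative helper names. With SciPy available, `scipy.stats.ttest_ind` would be the usual choice.

```python
import math
import random
from statistics import mean, stdev


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t with df degrees of freedom.

    Uses p = I_x(df/2, 1/2) with x = df / (df + t^2); the substitution
    u = sin^2(theta) reduces the beta integral to sin^(df-1)(theta).
    """
    def g(theta):
        return math.sin(theta) ** (df - 1)

    def simpson(upper):
        h = upper / steps
        total = g(0.0) + g(upper)
        for i in range(1, steps):
            total += g(i * h) * (4 if i % 2 else 2)
        return total * h / 3

    x = df / (df + t_stat ** 2)
    return simpson(math.asin(math.sqrt(x))) / simpson(math.pi / 2)


def pooled_t_test(control, treatment):
    """Equal-variance two-sample t-test; returns (t, df, p, Cohen's d)."""
    n1, n2 = len(control), len(treatment)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * stdev(control) ** 2
                           + (n2 - 1) * stdev(treatment) ** 2) / df)
    diff = mean(treatment) - mean(control)
    t_stat = diff / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t_stat, df, t_two_sided_p(t_stat, df), diff / pooled_sd


# Simulated scores: 50 students per group, parameters as in the component above.
random.seed(42)
control_scores = [random.gauss(72, 12) for _ in range(50)]
experimental_scores = [random.gauss(78, 11) for _ in range(50)]

t_stat, df, p_value, cohens_d = pooled_t_test(control_scores, experimental_scores)
print(f"t({df}) = {t_stat:.3f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.3f}")
```

With only two groups, a significant result already identifies which group differs, so no post-hoc test is needed; the sign of `t_stat` gives the direction.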

In [None]:
"""
Q12. A researcher wants to know if there are any significant differences in the average daily sales of three
retail stores: Store A, Store B, and Store C. They randomly select 30 days and record the sales for each store
on those days. Conduct a repeated measures ANOVA using Python to determine if there are any
significant differences in sales between the three stores. If the results are significant, follow up with a
post-hoc test to determine which store(s) differ significantly from each other.
"""
import React, { useState } from 'react';
import { BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, Legend, ResponsiveContainer, LineChart, Line } from 'recharts';

const RepeatedANOVA = () => {
  const [results, setResults] = useState(null);
  const [loading, setLoading] = useState(false);

  // Generate sample data for 30 days across 3 stores
  const generateData = () => {
    const days = 30;
    const data = [];
    
    for (let i = 0; i < days; i++) {
      data.push({
        day: i + 1,
        storeA: Math.round(5000 + Math.random() * 2000 + i * 10),
        storeB: Math.round(6000 + Math.random() * 2500 + i * 15),
        storeC: Math.round(5500 + Math.random() * 2200 + i * 12)
      });
    }
    return data;
  };

  // Calculate means and standard deviations
  const calculateStats = (data) => {
    const stores = ['storeA', 'storeB', 'storeC'];
    const stats = {};
    
    stores.forEach(store => {
      const values = data.map(d => d[store]);
      const mean = values.reduce((a, b) => a + b, 0) / values.length;
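      // Sample variance (n - 1 denominator): the 30 recorded days are a sample, not the full population
      const variance = values.reduce((a, b) => a + Math.pow(b - mean, 2), 0) / (values.length - 1);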
      const sd = Math.sqrt(variance);
      
      stats[store] = { mean, sd, values };
    });
    
    return stats;
  };

  // Repeated Measures ANOVA calculation
  const repeatedANOVA = (data) => {
    const n = data.length; // number of subjects (days)
    const k = 3; // number of conditions (stores)
    
    const stats = calculateStats(data);
    
    // Calculate grand mean
    const allValues = [
      ...stats.storeA.values,
      ...stats.storeB.values,
      ...stats.storeC.values
    ];
    const grandMean = allValues.reduce((a, b) => a + b, 0) / allValues.length;
    
    // Calculate sum of squares
    // SS_between (treatment effect)
    let ssCondition = 0;
    ['storeA', 'storeB', 'storeC'].forEach(store => {
      ssCondition += n * Math.pow(stats[store].mean - grandMean, 2);
    });
    
    // SS_subjects (individual differences)
    let ssSubjects = 0;
    for (let i = 0; i < n; i++) {
      const subjectMean = (data[i].storeA + data[i].storeB + data[i].storeC) / k;
      ssSubjects += k * Math.pow(subjectMean - grandMean, 2);
    }
    
    // SS_total
    let ssTotal = 0;
    data.forEach(row => {
      ['storeA', 'storeB', 'storeC'].forEach(store => {
        ssTotal += Math.pow(row[store] - grandMean, 2);
      });
    });
    
    // SS_error (residual)
    const ssError = ssTotal - ssCondition - ssSubjects;
    
    // Degrees of freedom
    const dfCondition = k - 1;
    const dfSubjects = n - 1;
    const dfError = (n - 1) * (k - 1);
    
    // Mean squares
    const msCondition = ssCondition / dfCondition;
    const msError = ssError / dfError;
    
    // F-statistic
    const fStat = msCondition / msError;
    
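    // Tabulated critical F-value for α = 0.05 with df1 = 2, df2 = 58
    // (hard-coded, so it is only valid for k = 3 stores and n = 30 days)
    const criticalF = 3.156;
    
    // No F-distribution CDF is available here, so only a bound on p is reported
    const significant = fStat > criticalF;
    const pValue = significant ? "< 0.05" : "≥ 0.05";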
    
    return {
      stats,
      grandMean,
      ssCondition,
      ssSubjects,
      ssError,
      ssTotal,
      dfCondition,
      dfSubjects,
      dfError,
      msCondition,
      msError,
      fStat,
      criticalF,
      pValue,
      significant
    };
  };

  // Post-hoc pairwise comparisons (Bonferroni correction)
  const postHocTest = (stats, n) => {
    const stores = [
      { name: 'Store A', key: 'storeA' },
      { name: 'Store B', key: 'storeB' },
      { name: 'Store C', key: 'storeC' }
    ];
    
    const comparisons = [];
    const alpha = 0.05 / 3; // Bonferroni correction for 3 comparisons
    
    for (let i = 0; i < stores.length; i++) {
      for (let j = i + 1; j < stores.length; j++) {
        const store1 = stores[i];
        const store2 = stores[j];
        
        const mean1 = stats[store1.key].mean;
        const mean2 = stats[store2.key].mean;
        const diff = Math.abs(mean1 - mean2);
        
        // Paired t-test: work with the day-by-day differences between the two stores
        const values1 = stats[store1.key].values;
        const values2 = stats[store2.key].values;
        
        const differences = values1.map((v, idx) => v - values2[idx]);
        const meanDiff = differences.reduce((a, b) => a + b, 0) / n;
        const sdDiff = Math.sqrt(
          differences.reduce((a, b) => a + Math.pow(b - meanDiff, 2), 0) / (n - 1)
        );
        
        const seDiff = sdDiff / Math.sqrt(n);
        const tStat = meanDiff / seDiff;
        // Two-tailed critical t for df = 29 at α = 0.05 / 3 ≈ 0.0167 (Bonferroni corrected).
        // The uncorrected α = 0.05 value would be 2.045.
        const criticalT = 2.541;
        
        const significant = Math.abs(tStat) > criticalT;
        
        comparisons.push({
          comparison: `${store1.name} vs ${store2.name}`,
          mean1: mean1.toFixed(2),
          mean2: mean2.toFixed(2),
          difference: diff.toFixed(2),
          tStat: tStat.toFixed(3),
          significant,
          interpretation: significant ? 'Significantly different' : 'Not significantly different'
        });
      }
    }
    
    return comparisons;
  };

  const runAnalysis = () => {
    setLoading(true);
    
    setTimeout(() => {
      const data = generateData();
      const anovaResults = repeatedANOVA(data);
      const postHoc = anovaResults.significant ? postHocTest(anovaResults.stats, data.length) : null;
      
      setResults({
        data,
        anova: anovaResults,
        postHoc
      });
      
      setLoading(false);
    }, 500);
  };

  const chartData = results ? results.data.map(d => ({
    day: d.day,
    'Store A': d.storeA,
    'Store B': d.storeB,
    'Store C': d.storeC
  })) : [];

  const meanData = results ? [
    { store: 'Store A', mean: results.anova.stats.storeA.mean },
    { store: 'Store B', mean: results.anova.stats.storeB.mean },
    { store: 'Store C', mean: results.anova.stats.storeC.mean }
  ] : [];

  return (
    <div className="p-6 max-w-7xl mx-auto bg-gray-50 min-h-screen">
      <div className="bg-white rounded-lg shadow-lg p-6 mb-6">
        <h1 className="text-3xl font-bold text-gray-800 mb-2">
          Repeated Measures ANOVA Analysis
        </h1>
        <p className="text-gray-600 mb-4">
          Analyzing daily sales data for three retail stores (30 days)
        </p>
        
        <button
          onClick={runAnalysis}
          disabled={loading}
          className="bg-blue-600 hover:bg-blue-700 text-white font-semibold py-2 px-6 rounded-lg transition-colors disabled:bg-gray-400"
        >
          {loading ? 'Analyzing...' : 'Run Analysis'}
        </button>
      </div>

      {results && (
        <>
          {/* Descriptive Statistics */}
          <div className="bg-white rounded-lg shadow-lg p-6 mb-6">
            <h2 className="text-2xl font-bold text-gray-800 mb-4">Descriptive Statistics</h2>
            <div className="overflow-x-auto">
              <table className="w-full border-collapse">
                <thead>
                  <tr className="bg-gray-100">
                    <th className="border p-3 text-left">Store</th>
                    <th className="border p-3 text-right">Mean Sales ($)</th>
                    <th className="border p-3 text-right">Std Deviation ($)</th>
                  </tr>
                </thead>
                <tbody>
                  <tr>
                    <td className="border p-3 font-semibold">Store A</td>
                    <td className="border p-3 text-right">{results.anova.stats.storeA.mean.toFixed(2)}</td>
                    <td className="border p-3 text-right">{results.anova.stats.storeA.sd.toFixed(2)}</td>
                  </tr>
                  <tr className="bg-gray-50">
                    <td className="border p-3 font-semibold">Store B</td>
                    <td className="border p-3 text-right">{results.anova.stats.storeB.mean.toFixed(2)}</td>
                    <td className="border p-3 text-right">{results.anova.stats.storeB.sd.toFixed(2)}</td>
                  </tr>
                  <tr>
                    <td className="border p-3 font-semibold">Store C</td>
                    <td className="border p-3 text-right">{results.anova.stats.storeC.mean.toFixed(2)}</td>
                    <td className="border p-3 text-right">{results.anova.stats.storeC.sd.toFixed(2)}</td>
                  </tr>
                </tbody>
              </table>
            </div>
          </div>

          {/* Visualizations */}
          <div className="grid grid-cols-1 lg:grid-cols-2 gap-6 mb-6">
            <div className="bg-white rounded-lg shadow-lg p-6">
              <h3 className="text-xl font-bold text-gray-800 mb-4">Mean Sales by Store</h3>
              <ResponsiveContainer width="100%" height={300}>
                <BarChart data={meanData}>
                  <CartesianGrid strokeDasharray="3 3" />
                  <XAxis dataKey="store" />
                  <YAxis />
                  <Tooltip formatter={(value) => `$${value.toFixed(2)}`} />
                  <Bar dataKey="mean" fill="#3B82F6" />
                </BarChart>
              </ResponsiveContainer>
            </div>

            <div className="bg-white rounded-lg shadow-lg p-6">
              <h3 className="text-xl font-bold text-gray-800 mb-4">Daily Sales Trends</h3>
              <ResponsiveContainer width="100%" height={300}>
                <LineChart data={chartData}>
                  <CartesianGrid strokeDasharray="3 3" />
                  <XAxis dataKey="day" />
                  <YAxis />
                  <Tooltip formatter={(value) => `$${value}`} />
                  <Legend />
                  <Line type="monotone" dataKey="Store A" stroke="#EF4444" strokeWidth={2} />
                  <Line type="monotone" dataKey="Store B" stroke="#3B82F6" strokeWidth={2} />
                  <Line type="monotone" dataKey="Store C" stroke="#10B981" strokeWidth={2} />
                </LineChart>
              </ResponsiveContainer>
            </div>
          </div>

          {/* ANOVA Results */}
          <div className="bg-white rounded-lg shadow-lg p-6 mb-6">
            <h2 className="text-2xl font-bold text-gray-800 mb-4">Repeated Measures ANOVA Results</h2>
            
            <div className="overflow-x-auto mb-4">
              <table className="w-full border-collapse">
                <thead>
                  <tr className="bg-gray-100">
                    <th className="border p-3 text-left">Source</th>
                    <th className="border p-3 text-right">SS</th>
                    <th className="border p-3 text-right">df</th>
                    <th className="border p-3 text-right">MS</th>
                    <th className="border p-3 text-right">F</th>
                    <th className="border p-3 text-right">p-value</th>
                  </tr>
                </thead>
                <tbody>
                  <tr>
                    <td className="border p-3 font-semibold">Between Stores</td>
                    <td className="border p-3 text-right">{results.anova.ssCondition.toFixed(2)}</td>
                    <td className="border p-3 text-right">{results.anova.dfCondition}</td>
                    <td className="border p-3 text-right">{results.anova.msCondition.toFixed(2)}</td>
                    <td className="border p-3 text-right font-bold">{results.anova.fStat.toFixed(3)}</td>
                    <td className="border p-3 text-right font-bold">{results.anova.pValue}</td>
                  </tr>
                  <tr className="bg-gray-50">
                    <td className="border p-3 font-semibold">Subjects (Days)</td>
                    <td className="border p-3 text-right">{results.anova.ssSubjects.toFixed(2)}</td>
                    <td className="border p-3 text-right">{results.anova.dfSubjects}</td>
                    <td className="border p-3 text-right">-</td>
                    <td className="border p-3 text-right">-</td>
                    <td className="border p-3 text-right">-</td>
                  </tr>
                  <tr>
                    <td className="border p-3 font-semibold">Error</td>
                    <td className="border p-3 text-right">{results.anova.ssError.toFixed(2)}</td>
                    <td className="border p-3 text-right">{results.anova.dfError}</td>
                    <td className="border p-3 text-right">{results.anova.msError.toFixed(2)}</td>
                    <td className="border p-3 text-right">-</td>
                    <td className="border p-3 text-right">-</td>
                  </tr>
                </tbody>
              </table>
            </div>

            <div className={`p-4 rounded-lg ${results.anova.significant ? 'bg-green-50 border-2 border-green-500' : 'bg-red-50 border-2 border-red-500'}`}>
              <h3 className="text-lg font-bold mb-2">
                {results.anova.significant ? '✓ Significant Result' : '✗ Not Significant'}
              </h3>
              <p className="text-gray-700">
                F({results.anova.dfCondition}, {results.anova.dfError}) = {results.anova.fStat.toFixed(3)}, 
                p {results.anova.pValue}
              </p>
              <p className="text-gray-700 mt-2">
                {results.anova.significant 
                  ? 'There ARE significant differences in average daily sales between the three stores.'
                  : 'There are NO significant differences in average daily sales between the three stores.'}
              </p>
            </div>
          </div>

          {/* Post-hoc Tests */}
          {results.postHoc && (
            <div className="bg-white rounded-lg shadow-lg p-6">
              <h2 className="text-2xl font-bold text-gray-800 mb-4">Post-hoc Analysis (Bonferroni Corrected)</h2>
              <p className="text-gray-600 mb-4">
                Pairwise comparisons to determine which stores differ significantly
              </p>
              
              <div className="overflow-x-auto">
                <table className="w-full border-collapse">
                  <thead>
                    <tr className="bg-gray-100">
                      <th className="border p-3 text-left">Comparison</th>
                      <th className="border p-3 text-right">Mean 1</th>
                      <th className="border p-3 text-right">Mean 2</th>
                      <th className="border p-3 text-right">Difference</th>
                      <th className="border p-3 text-right">t-statistic</th>
                      <th className="border p-3 text-left">Result</th>
                    </tr>
                  </thead>
                  <tbody>
                    {results.postHoc.map((comp, idx) => (
                      <tr key={idx} className={idx % 2 === 0 ? 'bg-white' : 'bg-gray-50'}>
                        <td className="border p-3 font-semibold">{comp.comparison}</td>
                        <td className="border p-3 text-right">${comp.mean1}</td>
                        <td className="border p-3 text-right">${comp.mean2}</td>
                        <td className="border p-3 text-right">${comp.difference}</td>
                        <td className="border p-3 text-right">{comp.tStat}</td>
                        <td className={`border p-3 font-semibold ${comp.significant ? 'text-green-600' : 'text-gray-600'}`}>
                          {comp.interpretation}
                        </td>
                      </tr>
                    ))}
                  </tbody>
                </table>
              </div>

              <div className="mt-4 p-4 bg-blue-50 rounded-lg">
                <h3 className="text-lg font-bold mb-2">Interpretation:</h3>
                <ul className="list-disc list-inside space-y-1 text-gray-700">
                  {results.postHoc.map((comp, idx) => (
                    <li key={idx}>
                      {comp.comparison}: {comp.significant 
                        ? `Significantly different (p < 0.0167)` 
                        : `Not significantly different (p ≥ 0.0167)`}
                    </li>
                  ))}
                </ul>
              </div>
            </div>
          )}
        </>
      )}
    </div>
  );
};

export default RepeatedANOVA;
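
Q12 asks for the analysis in Python, so the sums-of-squares decomposition and the paired post-hoc statistics computed by the component above can be mirrored in a short standard-library sketch. The function names (`repeated_measures_anova`, `paired_t`) and the tiny three-day dataset are illustrative assumptions, not part of the original. P-values are left out because they need an F or t distribution CDF, which a library such as SciPy could supply.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def repeated_measures_anova(rows):
    """One-way repeated measures ANOVA.

    rows: one list per subject (day), holding one value per condition (store).
    Returns the sums of squares, degrees of freedom and the F statistic.
    """
    n = len(rows)        # subjects (days)
    k = len(rows[0])     # conditions (stores)
    grand_mean = mean([v for row in rows for v in row])
    condition_means = [mean([row[j] for row in rows]) for j in range(k)]
    subject_means = [mean(row) for row in rows]

    # Partition the total variation: conditions + subjects + residual error
    ss_condition = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_total = sum((v - grand_mean) ** 2 for row in rows for v in row)
    ss_error = ss_total - ss_condition - ss_subjects

    df_condition = k - 1
    df_error = (n - 1) * (k - 1)
    f_stat = (ss_condition / df_condition) / (ss_error / df_error)
    return {
        "ss_condition": ss_condition,
        "ss_subjects": ss_subjects,
        "ss_error": ss_error,
        "df_condition": df_condition,
        "df_error": df_error,
        "f_stat": f_stat,
    }


def paired_t(x, y):
    """t statistic of a paired t-test on two equally long samples (df = n - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical toy data: 3 days (rows) x 3 stores (columns A, B, C)
sales = [[10, 12, 14], [11, 14, 14], [12, 13, 17]]
anova = repeated_measures_anova(sales)
print(anova["f_stat"], anova["df_condition"], anova["df_error"])

# Post-hoc: paired t statistic for every pair of stores. Compare |t| against
# the Bonferroni-corrected critical value (alpha / number of comparisons).
columns = {name: [row[j] for row in sales] for j, name in enumerate("ABC")}
post_hoc = {
    f"{a} vs {b}": paired_t(columns[a], columns[b])
    for a, b in combinations(columns, 2)
}
print(post_hoc)
```

With the 30 x 3 sales table from the question, `df_error` becomes 58 and each paired comparison has 29 degrees of freedom, so the F and t statistics would be judged against the critical values for those degrees of freedom.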