### Question1

In [None]:
# ANOVA (Analysis of Variance) is a statistical technique used to compare means between three or more groups in a study. To obtain reliable and valid results from ANOVA, certain assumptions need to be met. These assumptions are essential for the validity of the test and interpretation of the results. The main assumptions of ANOVA are:

#    Independence: The observations must be independent of one another, both within each group and across groups. The value recorded for one subject should not be related to or influenced by the value recorded for any other subject.

#    Normality: The data in each group should be approximately normally distributed. This means that the distribution of data points within each group should resemble a bell-shaped curve.

#    Homogeneity of Variance: The variance of the data in each group should be roughly equal. In other words, the spread of data points around the mean should be consistent across all groups.

# If these assumptions are not met, the validity of the ANOVA results may be compromised. Here are some examples of violations and their impact on ANOVA results:

#    Violation of Independence:
#        Example: A researcher collects data from students in multiple classes within the same school. However, some students in different classes are friends, and their scores may be influenced by each other.
#        Impact: Violation of independence can lead to pseudoreplication, where the data points are not truly independent, causing an overestimation of the significance of the results.

#    Violation of Normality:
#        Example: In a study comparing test scores of students in different schools, the test scores in one group are heavily skewed and do not follow a normal distribution.
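#        Impact: ANOVA is fairly robust to moderate departures from normality when group sizes are large and roughly equal. With small samples, heavy skew, or extreme outliers, however, the p-values of the F-test can be inaccurate and its power reduced, so a data transformation or a non-parametric alternative such as the Kruskal-Wallis test may be more appropriate.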

#    Violation of Homogeneity of Variance:
#        Example: A researcher compares the effectiveness of three different drugs on a medical condition, and the variability in the response of patients to one drug is much larger than the other two.
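#        Impact: The F-test pools the group variances into a single error term. When the variances differ markedly, and especially when the group sizes are also unequal, the Type I error rate can be inflated or deflated, so the F-statistic may not accurately reflect the differences between group means. Welch's ANOVA is a common alternative in this case.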

# When assumptions are violated, alternative statistical methods or transformations of the data might be needed to obtain meaningful results. It's essential for researchers to carefully assess the data for adherence to the ANOVA assumptions and take appropriate actions to ensure the validity of their conclusions.
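# The normality and equal-variance assumptions described above can be screened with formal tests before running the ANOVA. The following is a minimal sketch, assuming three small groups of hypothetical data; it uses the Shapiro-Wilk test for normality within each group and Levene's test for equality of variances. Independence cannot be tested this way and has to be judged from the study design.

```python
import numpy as np
from scipy import stats

# Hypothetical sample data for three groups (replace with your actual data)
groups = {
    "group1": np.array([10, 12, 15, 8, 11]),
    "group2": np.array([14, 16, 18, 13, 17]),
    "group3": np.array([20, 22, 25, 19, 21]),
}

# Shapiro-Wilk test per group: a small p-value suggests a departure from normality
normality_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}

# Levene's test across groups: a small p-value suggests unequal variances
levene_stat, levene_p = stats.levene(*groups.values())

for name, p in normality_p.items():
    print(f"Shapiro-Wilk p-value for {name}: {p:.3f}")
print(f"Levene p-value: {levene_p:.3f}")
```

# With samples this small, both tests have little power, so a non-significant result is weak evidence that the assumptions hold; plots such as Q-Q plots and boxplots are a useful complement.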

### Question2

In [None]:
# The three main types of ANOVA (Analysis of Variance) are:

#    One-Way ANOVA:
#        Situation: One-Way ANOVA is used when there is one independent variable (factor) with three or more levels (groups). It is used to compare the means of three or more independent groups to determine if there are any significant differences among them.
#        Example: A researcher wants to compare the average test scores of students from three different schools to see if there is any significant difference in performance.

#    Two-Way ANOVA:
#        Situation: Two-Way ANOVA is used when there are two independent variables (factors) and their interaction effect on the dependent variable is of interest. Each independent variable has two or more levels, and the study aims to investigate both main effects and the interaction between the two factors.
#        Example: A researcher wants to examine the effect of two different teaching methods (Factor 1) and gender (Factor 2) on students' test scores.

#    Repeated Measures ANOVA:
#        Situation: Repeated Measures ANOVA is used when the same participants are measured under different conditions or at multiple time points. It is designed to analyze within-subjects data, where participants serve as their own controls, so that differences between participants are separated from the effect of the conditions.
#        Example: A researcher wants to investigate the effect of different doses of a drug on a group of patients, and each patient receives all doses in a specific order.

#In summary:

#    One-Way ANOVA is used when there is one independent variable with three or more levels and is used to compare means across independent groups.
#    Two-Way ANOVA is used when there are two independent variables, and their main effects and interaction effect are of interest.
#    Repeated Measures ANOVA is used when the same participants are measured under different conditions or time points to analyze within-subjects data.

#The choice of which type of ANOVA to use depends on the research design and the specific research question being investigated. Each type of ANOVA allows researchers to test different hypotheses and gain insights into the relationships between variables in different experimental settings.

### Question3

In [None]:
#Partitioning of variance in ANOVA refers to the process of breaking down the total variance in the data into different components that can be attributed to specific sources of variation. In ANOVA, the total variance in the dependent variable (outcome) is divided into two main components: the variance due to the effect of the independent variable(s) and the variance within groups or error variance.

#The importance of understanding the concept of partitioning of variance in ANOVA lies in its ability to help researchers:

#    Identify Sources of Variation: By partitioning the total variance, ANOVA allows researchers to identify and quantify the sources of variation in the data. It helps determine how much of the variability in the dependent variable can be attributed to the independent variable(s) and how much is due to random error or other factors.

#    Assess the Significance of Effects: ANOVA enables researchers to test the significance of the effects of the independent variable(s). By comparing the variability between groups to the variability within groups, ANOVA determines whether the observed differences in means are statistically significant.

#    Understand the Impact of Factors: Partitioning of variance allows researchers to understand the relative importance of different factors in explaining the variation in the dependent variable. For example, in a Two-Way ANOVA, it helps determine the individual contributions of each independent variable and their interaction to the overall variability.

#    Make Inferences and Interpret Results: ANOVA provides a framework for making inferences about population parameters based on sample data. Understanding the partitioning of variance aids in the interpretation of ANOVA results and assists researchers in drawing meaningful conclusions from their studies.

#    Assess Model Fit: The partition also yields measures of model fit. The ratio of the between-group sum of squares to the total sum of squares (eta-squared, equivalent to R-squared) gives the proportion of variance explained by the model. The F-statistic is a different ratio: the between-group mean square divided by the within-group mean square, which is the quantity used in the significance test.

#Overall, the partitioning of variance in ANOVA is a fundamental concept that underpins the statistical analysis of group comparisons. It provides valuable insights into the relationships between variables, aids in hypothesis testing, and enhances the validity and interpretability of research findings.
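# For a one-way design, the partition described above can be written out explicitly. The notation is an assumption introduced here for illustration: k groups, n_i observations in group i, N observations in total, group means \bar{x}_i, and grand mean \bar{x}.

```latex
\begin{aligned}
SS_{\text{total}}   &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}\right)^2 \\
SS_{\text{between}} &= \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{x}\right)^2 \\
SS_{\text{within}}  &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2 \\
SS_{\text{total}}   &= SS_{\text{between}} + SS_{\text{within}} \\
F &= \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
\end{aligned}
```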

### Question4

In [2]:
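# In a one-way ANOVA, the Total Sum of Squares (SST), Explained Sum of Squares (SSE), and Residual Sum of Squares (SSR) are used to analyze the variance in the data and test for significant differences between group means. Note that the abbreviations vary between sources: many textbooks use SSE for the error (residual) sum of squares and SSR for the regression (explained) sum of squares, the reverse of the convention used here. Here's how you can calculate these sums of squares in Python using the numpy library: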

import numpy as np

# Sample data for each group (replace with your actual data)
group1 = np.array([10, 12, 15, 8, 11])
group2 = np.array([14, 16, 18, 13, 17])
group3 = np.array([20, 22, 25, 19, 21])

# Combine all the data into one array
all_data = np.concatenate([group1, group2, group3])

# Overall mean
overall_mean = np.mean(all_data)

# Number of data points in each group
n1 = len(group1)
n2 = len(group2)
n3 = len(group3)

# Calculate the group means
group1_mean = np.mean(group1)
group2_mean = np.mean(group2)
group3_mean = np.mean(group3)

# Calculate the Total Sum of Squares (SST)
sst = np.sum((all_data - overall_mean) ** 2)

# Calculate the Explained Sum of Squares (SSE)
sse = n1 * (group1_mean - overall_mean) ** 2 + n2 * (group2_mean - overall_mean) ** 2 + n3 * (group3_mean - overall_mean) ** 2

# Calculate the Residual Sum of Squares (SSR)
ssr = np.sum((group1 - group1_mean) ** 2) + np.sum((group2 - group2_mean) ** 2) + np.sum((group3 - group3_mean) ** 2)

print("Total Sum of Squares (SST):", sst)
print("Explained Sum of Squares (SSE):", sse)
print("Residual Sum of Squares (SSR):", ssr)

# In this code, we first define the sample data for each group (group1, group2, and group3). We then combine all the data into one array (all_data) to calculate the overall mean (overall_mean).

# Next, we calculate the group means (group1_mean, group2_mean, and group3_mean). Using these means, we calculate the SST, which represents the total variation in the data.

# The SSE represents the variation explained by the group means and is calculated as the sum of the squared deviations of each group mean from the overall mean, weighted by the number of data points in each group.

# The SSR represents the variation not explained by the group means and is calculated as the sum of the squared deviations of individual data points from their respective group means.

# By understanding and calculating these sums of squares, we can perform an ANOVA to test for significant differences between group means and assess the overall variance in the data.

Total Sum of Squares (SST): 326.9333333333333
Explained Sum of Squares (SSE): 261.7333333333333
Residual Sum of Squares (SSR): 65.2
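# The sums of squares computed above lead directly to the F-statistic. As a sketch, assuming the same three sample groups, the block below divides each sum of squares by its degrees of freedom, forms the F-ratio, and cross-checks the result against scipy.stats.f_oneway. The names ss_between and ss_within are introduced here and correspond to SSE and SSR in the convention used above.

```python
import numpy as np
from scipy import stats

# Same sample data as in the cell above
group1 = np.array([10, 12, 15, 8, 11])
group2 = np.array([14, 16, 18, 13, 17])
group3 = np.array([20, 22, 25, 19, 21])
groups = [group1, group2, group3]

all_data = np.concatenate(groups)
overall_mean = all_data.mean()

# Between-group (explained) and within-group (residual) sums of squares
ss_between = sum(len(g) * (g.mean() - overall_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Degrees of freedom: k - 1 between groups, N - k within groups
df_between = len(groups) - 1
df_within = len(all_data) - len(groups)

# F is the ratio of the two mean squares; the p-value is the upper tail of the F distribution
f_manual = (ss_between / df_between) / (ss_within / df_within)
p_manual = stats.f.sf(f_manual, df_between, df_within)

# Cross-check against SciPy's one-way ANOVA
f_scipy, p_scipy = stats.f_oneway(*groups)

print("F (manual):", f_manual, " F (scipy):", f_scipy)
print("p (manual):", p_manual, " p (scipy):", p_scipy)
```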


### Question5

In [3]:
#In a two-way ANOVA, we can calculate the main effects and interaction effects using Python by performing the analysis on the data and examining the results. To do this, we can use the statsmodels library, which provides functionalities for conducting ANOVA in Python. Here's how you can calculate the main effects and interaction effects:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Sample data (replace with your actual data)
data = {
    'Group1': [10, 12, 15, 8, 11, 14, 16, 18, 13, 17],
    'Group2': [20, 22, 25, 19, 21, 24, 26, 28, 23, 27],
    'Response': [30, 35, 40, 32, 38, 34, 36, 42, 33, 37]
}

df = pd.DataFrame(data)

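# Fit the two-way ANOVA model
# Note: Group1 and Group2 are numeric columns, so the formula treats them as continuous predictors;
# wrap them in C(...) to model them as categorical factors
model = ols('Response ~ Group1 + Group2 + Group1:Group2', data=df).fit()

# Get the ANOVA table
anova_table = sm.stats.anova_lm(model, typ=2)

# Compute the mean square (sum_sq / df) for each term in the ANOVA table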
main_effect_Group1 = anova_table.loc['Group1', 'sum_sq'] / anova_table.loc['Group1', 'df']
main_effect_Group2 = anova_table.loc['Group2', 'sum_sq'] / anova_table.loc['Group2', 'df']
interaction_effect = anova_table.loc['Group1:Group2', 'sum_sq'] / anova_table.loc['Group1:Group2', 'df']

print("Main Effect of Group1:", main_effect_Group1)
print("Main Effect of Group2:", main_effect_Group2)
print("Interaction Effect:", interaction_effect)

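# In this code, we first define the sample data as a pandas DataFrame (df). We then use the ols function from statsmodels.formula.api to fit the model. The formula 'Response ~ Group1 + Group2 + Group1:Group2' specifies the main effects of Group1 and Group2 as well as their interaction. Because both columns are numeric, they enter the model as continuous predictors; in a conventional two-way ANOVA the factors are categorical and would be written as C(Group1) and C(Group2). In this sample, Group2 also equals Group1 + 10 in all but one row, so the two predictors are nearly collinear and their individual sums of squares should be read with caution.

# Next, we use sm.stats.anova_lm to obtain the ANOVA table for the model. The typ=2 argument requests Type II sums of squares, which test each main effect after adjusting for the other main effect but not for the interaction. In a balanced design, Type I, II, and III sums of squares coincide; in an unbalanced design where the interaction is of interest, Type III is often preferred.

# Finally, we divide each term's sum of squares by its degrees of freedom. The resulting values are mean squares, which measure the variation attributed to each term; they are not effect estimates in the units of the response. The F-statistic and p-value for each term are available in the 'F' and 'PR(>F)' columns of the ANOVA table.

# By examining those F-statistics and p-values, we can assess the impact of each variable on the response variable and whether there is a significant interaction effect between the two independent variables.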

Main Effect of Group1: 0.09898989898991313
Main Effect of Group2: 0.9708513708514256
Interaction Effect: 0.975468975468866
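# To show where the main-effect and interaction sums of squares come from when both factors are categorical, the following sketch computes them by hand for a balanced design. Everything about the data is an assumption made for illustration: a hypothetical 2 x 3 layout with 4 replicates per cell, simulated with made-up cell effects. The closed-form expressions below hold only for balanced designs.

```python
import numpy as np
from scipy import stats

# Hypothetical balanced design: factor A has 2 levels, factor B has 3 levels, 4 replicates per cell
rng = np.random.default_rng(0)
a_levels, b_levels, n_rep = 2, 3, 4
cell_effects = np.array([[0.0, 2.0, 4.0],
                         [1.0, 3.0, 9.0]])  # made-up offsets for each (A, B) cell
y = 20.0 + cell_effects[:, :, None] + rng.normal(scale=1.5, size=(a_levels, b_levels, n_rep))

grand_mean = y.mean()
mean_a = y.mean(axis=(1, 2))   # one mean per level of A
mean_b = y.mean(axis=(0, 2))   # one mean per level of B
mean_cell = y.mean(axis=2)     # one mean per (A, B) cell

# Sums of squares for the two main effects, the interaction, and the error
ss_a = b_levels * n_rep * ((mean_a - grand_mean) ** 2).sum()
ss_b = a_levels * n_rep * ((mean_b - grand_mean) ** 2).sum()
ss_ab = n_rep * ((mean_cell - mean_a[:, None] - mean_b[None, :] + grand_mean) ** 2).sum()
ss_error = ((y - mean_cell[:, :, None]) ** 2).sum()
ss_total = ((y - grand_mean) ** 2).sum()

df_a, df_b = a_levels - 1, b_levels - 1
df_ab = df_a * df_b
df_error = a_levels * b_levels * (n_rep - 1)
ms_error = ss_error / df_error

# F-statistic and p-value for each term
for name, ss, df in [("A", ss_a, df_a), ("B", ss_b, df_b), ("A:B", ss_ab, df_ab)]:
    f_value = (ss / df) / ms_error
    p_value = stats.f.sf(f_value, df, df_error)
    print(f"{name}: SS={ss:.3f}, df={df}, F={f_value:.3f}, p={p_value:.4f}")
```

# In a balanced design the four components add up to the total sum of squares, which is a convenient sanity check on the calculation.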


### Question6

In [None]:
# In a one-way ANOVA, the F-statistic is used to test whether there are significant differences between the means of three or more independent groups. The p-value associated with the F-statistic is the probability of obtaining an F-statistic at least as large as the one observed, assuming the null hypothesis that all group means are equal is true.

# In your case, you obtained an F-statistic of 5.23 and a p-value of 0.02. With a significance level (alpha) commonly set at 0.05, the p-value (0.02) is less than alpha. Therefore, we reject the null hypothesis and conclude that there are significant differences between the group means.

# Interpretation:
# Based on the results of the one-way ANOVA, there is statistically significant evidence at the 0.05 level that the group means are not all equal. However, the ANOVA does not tell us which specific groups have different means; it only tells us that at least one group mean differs from at least one other.

# If you want to determine which specific groups are different from each other, you would need to conduct post hoc tests (e.g., Tukey's test, Bonferroni correction) to perform pairwise comparisons between the groups.

# Keep in mind that the interpretation of the ANOVA results depends on the context and the research question. If the F-statistic is significant and the p-value is small, it indicates that there is evidence to support the presence of differences between groups. It is essential to consider the effect size and practical significance of the differences when interpreting the results.
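# The link between an F-statistic, its p-value, and an effect size can be sketched as follows. The degrees of freedom are not given in the question, so the values below (2 and 12, i.e. three groups and 15 observations) are hypothetical and chosen only for illustration; with them, F = 5.23 gives a p-value of about 0.023. Eta-squared is recovered from F and the degrees of freedom.

```python
from scipy import stats

f_statistic = 5.23
df_between = 2    # hypothetical: three groups
df_within = 12    # hypothetical: 15 observations in total
alpha = 0.05

# Upper-tail probability of the F distribution
p_value = stats.f.sf(f_statistic, df_between, df_within)

# Eta-squared: proportion of total variance explained by group membership
eta_squared = (f_statistic * df_between) / (f_statistic * df_between + df_within)

print(f"p-value: {p_value:.4f}")
print(f"eta-squared: {eta_squared:.3f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```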

### Question7

In [None]:
#Handling missing data in a repeated measures ANOVA is crucial to ensure the accuracy and validity of the results. The appropriate method for handling missing data depends on the nature and pattern of missingness. Here are some common approaches and their potential consequences:

#    Complete Case Analysis (Listwise Deletion):
#        Method: This approach involves removing any participants with missing data from the analysis. Only complete cases with data in all time points or conditions are included in the analysis.
#        Consequences: While it is a straightforward method, it can lead to a loss of statistical power and potential bias if the missing data are not missing completely at random (MCAR). It may also reduce the representativeness of the sample if missing data are related to specific characteristics.

#    Mean Imputation:
#        Method: Missing values are replaced with the mean of the available data for that variable.
#        Consequences: Mean imputation can artificially reduce the variance of the data and may lead to biased estimates of group means and standard errors. It does not capture the uncertainty associated with imputed values and can distort the true relationships between variables.

#    Last Observation Carried Forward (LOCF):
#        Method: Missing values are replaced with the value from the last observed time point for that participant.
#        Consequences: LOCF assumes that the missing values are constant over time, which may not be accurate. This approach can lead to biased estimates if the missingness is related to changes in the variable over time.

#    Multiple Imputation:
#        Method: Multiple imputation involves creating multiple plausible imputed datasets based on the observed data's uncertainty. Each imputed dataset is then analyzed separately, and the results are combined using appropriate statistical methods.
#        Consequences: Multiple imputation provides more accurate estimates of parameters and standard errors than single imputation because it accounts for the uncertainty associated with the missing values. It is generally valid when data are missing at random (MAR), which makes it preferable to deletion or single imputation when the data are not MCAR. However, it can be computationally intensive, and its results depend on a well-specified imputation model.

#    Maximum Likelihood Estimation (MLE):
#        Method: MLE estimates the parameters of the repeated measures ANOVA model while accounting for missing data. It uses all available data to maximize the likelihood of the observed data given the model.
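#        Consequences: Likelihood-based estimation can provide unbiased estimates when data are MCAR or MAR, provided the model is correctly specified. It does not by itself correct for data that are missing not at random (MNAR), which requires explicitly modeling the missingness mechanism. In practice it is usually carried out with linear mixed-effects models rather than with classical repeated measures ANOVA.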

#The choice of method for handling missing data in a repeated measures ANOVA should be carefully considered, and researchers should be transparent in reporting their approach and any assumptions made about the missing data mechanism. It is also advisable to perform sensitivity analyses to assess the impact of different missing data methods on the results and conclusions of the study.
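# The three simplest approaches above can be illustrated on a small array. This is a sketch on hypothetical scores (rows are participants, columns are time points, NaN marks a missing value); it shows listwise deletion, mean imputation, and LOCF, and compares the variance at one time point before and after mean imputation. Multiple imputation and likelihood-based methods need dedicated tooling and are not shown.

```python
import numpy as np

# Hypothetical scores: rows are participants, columns are three time points
scores = np.array([
    [10.0, 12.0, 14.0],
    [11.0, np.nan, 15.0],
    [9.0, 11.0, np.nan],
    [12.0, 13.0, 16.0],
    [10.0, 12.0, 13.0],
])

# Listwise deletion: keep only participants with no missing values
complete_cases = scores[~np.isnan(scores).any(axis=1)]

# Mean imputation: replace each missing value with the mean of its time point
time_means = np.nanmean(scores, axis=0)
mean_imputed = np.where(np.isnan(scores), time_means, scores)

# LOCF: carry each participant's previous observation forward
locf = scores.copy()
for t in range(1, locf.shape[1]):
    missing = np.isnan(locf[:, t])
    locf[missing, t] = locf[missing, t - 1]

# Mean imputation shrinks the sample variance at the second time point
observed_var = np.nanvar(scores[:, 1], ddof=1)
imputed_var = mean_imputed[:, 1].var(ddof=1)

print("Participants kept after listwise deletion:", complete_cases.shape[0])
print("Variance at time 2, observed values only:", observed_var)
print("Variance at time 2, after mean imputation:", imputed_var)
```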

### Question8

In [None]:
# After conducting an ANOVA and finding a significant overall effect, post-hoc tests are used to make pairwise comparisons between groups to identify which specific groups differ significantly from each other. There are several common post-hoc tests, and the choice of which one to use depends on the design and assumptions of the study. Some common post-hoc tests include:

#    Tukey's Honestly Significant Difference (HSD) Test:
#        Use: Tukey's HSD compares all pairs of group means while controlling the family-wise error rate. It assumes equal variances across groups; the Tukey-Kramer modification extends it to unequal sample sizes.
#        Example: In a study comparing the effectiveness of three different treatments on patient recovery time, the ANOVA shows a significant overall effect. To determine which treatments significantly differ from each other, you can use Tukey's HSD to perform all pairwise comparisons.

#    Bonferroni Correction:
#        Use: The Bonferroni correction is a conservative method that can be used when conducting multiple pairwise comparisons. It adjusts the significance level for each comparison to control the overall family-wise error rate.
#        Example: In a study comparing the effects of four different diets on weight loss, the ANOVA reveals a significant difference among the diets. To avoid false positives when making multiple comparisons, you can use the Bonferroni correction to adjust the alpha level for each comparison.

#    Scheffe's Test:
#        Use: Scheffe's test is a conservative post-hoc test that controls the family-wise error rate for all possible contrasts, including complex ones such as one group versus the average of two others. It handles unequal sample sizes but, like ANOVA itself, assumes equal variances; when variances are unequal, the Games-Howell test is a common choice.
#        Example: In a study examining the effects of three different teaching methods on student test scores, the ANOVA indicates a significant overall difference. If the researcher wants to compare one method against the average of the other two in addition to the pairwise comparisons, Scheffe's test can be used.

#    Dunnett's Test:
#        Use: Dunnett's test is specifically designed for comparing multiple treatment groups to a single control group. It controls the Type I error rate when making these comparisons.
#        Example: In a clinical trial with a control group and multiple experimental groups testing the efficacy of different drugs, the ANOVA shows a significant difference. To identify which experimental groups differ significantly from the control, you can use Dunnett's test.

#Post-hoc tests are essential in ANOVA when multiple group comparisons are involved. They allow researchers to identify specific group differences and provide more detailed insights into the effects of the independent variable(s) on the dependent variable. Properly chosen post-hoc tests help avoid false conclusions and enhance the accuracy of the study's findings.
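# As an illustration of the Bonferroni correction described above, the following sketch runs every pairwise t-test on three hypothetical groups and multiplies each raw p-value by the number of comparisons, which is equivalent to dividing alpha by that number. The data and group names are made up for the example.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Hypothetical data for three groups (replace with your actual data)
groups = {
    "A": np.array([10, 12, 15, 8, 11]),
    "B": np.array([14, 16, 18, 13, 17]),
    "C": np.array([20, 22, 25, 19, 21]),
}

alpha = 0.05
pairs = list(combinations(groups, 2))

results = {}
for name1, name2 in pairs:
    # Raw p-value from an independent two-sample t-test (equal variances assumed)
    t_stat, p_raw = stats.ttest_ind(groups[name1], groups[name2])
    # Bonferroni adjustment: scale by the number of comparisons, capped at 1
    p_adj = min(p_raw * len(pairs), 1.0)
    results[(name1, name2)] = (p_raw, p_adj, bool(p_adj < alpha))

for (name1, name2), (p_raw, p_adj, reject) in results.items():
    print(f"{name1} vs {name2}: raw p={p_raw:.4f}, adjusted p={p_adj:.4f}, reject={reject}")
```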

### Question9

In [None]:
# To conduct a one-way ANOVA in Python to compare the mean weight loss of three diets (A, B, and C), you can use the scipy.stats library. First, you need to have the weight loss data for each diet group. Assuming you have the data in three separate lists weight_loss_A, weight_loss_B, and weight_loss_C, you can perform the ANOVA as follows:

import scipy.stats as stats

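# Replace these lists with the actual weight loss data for each diet group.
# The trailing ... in each list is a placeholder for the remaining observations;
# the code will raise an error until it is replaced with real values.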
weight_loss_A = [5, 7, 8, 6, 4, 9, 5, 6, 7, 8, ...]  # 50 observations for Diet A
weight_loss_B = [4, 6, 5, 3, 8, 6, 7, 5, 6, 4, ...]  # 50 observations for Diet B
weight_loss_C = [3, 4, 6, 5, 5, 7, 4, 6, 5, 7, ...]  # 50 observations for Diet C

# Perform one-way ANOVA
F_statistic, p_value = stats.f_oneway(weight_loss_A, weight_loss_B, weight_loss_C)

# Print the results
print("F-statistic:", F_statistic)
print("p-value:", p_value)

# Interpretation of results:

#    The F-statistic represents the test statistic of the ANOVA. It measures the ratio of the between-group variance to the within-group variance. In simple terms, it compares how much the means of the three diets vary from each other with how much individual observations vary within each diet.
#    The p-value represents the probability of observing the data or more extreme data if the null hypothesis (all diets have the same mean weight loss) is true. If the p-value is small (usually less than 0.05), it suggests that there is a significant difference between the mean weight loss of at least one pair of diets.

# Interpretation of p-value:

#    If the p-value is less than the chosen significance level (commonly set at 0.05), you would reject the null hypothesis and conclude that there is a significant difference in mean weight loss between at least one pair of diets.
#    If the p-value is greater than the chosen significance level, you would fail to reject the null hypothesis: the data do not provide sufficient evidence of a difference in mean weight loss between the diets. This is not the same as showing that the means are equal.

# Keep in mind that the interpretation will be based on the actual p-value obtained from your data. Make sure to replace the placeholder data with the actual weight loss observations for each diet group before running the analysis.
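# For a version that runs end to end, the sketch below substitutes simulated data for the real observations. The group means and standard deviation are hypothetical values chosen only so that the example executes; they are not results from any study, and the printed F-statistic and p-value carry no meaning beyond demonstrating the workflow.

```python
import numpy as np
from scipy import stats

# Simulated stand-in data: 50 observations per diet with made-up means and spread
rng = np.random.default_rng(0)
weight_loss_A = rng.normal(loc=6.5, scale=1.5, size=50)
weight_loss_B = rng.normal(loc=5.5, scale=1.5, size=50)
weight_loss_C = rng.normal(loc=5.0, scale=1.5, size=50)

# One-way ANOVA across the three diets
f_statistic, p_value = stats.f_oneway(weight_loss_A, weight_loss_B, weight_loss_C)

alpha = 0.05
print("F-statistic:", f_statistic)
print("p-value:", p_value)
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```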

### Question10

In [None]:
# To conduct a two-way ANOVA in Python to analyze the effects of software programs and employee experience level on task completion time, you can use the statsmodels library. First, you need to have the task completion time data, the software program each employee used, and their experience level (novice or experienced). Assuming you have the data in a DataFrame named data_frame, with columns Task_Time, Software_Program, and Experience_Level, you can perform the two-way ANOVA as follows:

import statsmodels.api as sm
from statsmodels.formula.api import ols

# Assuming you have the data in a DataFrame named data_frame

# Create a formula for the ANOVA
formula = 'Task_Time ~ C(Software_Program) + C(Experience_Level) + C(Software_Program):C(Experience_Level)'

# Fit the ANOVA model
model = ols(formula, data=data_frame).fit()

# Perform two-way ANOVA
anova_table = sm.stats.anova_lm(model, typ=2)

# Print the results
print(anova_table)

# The anova_table will contain the F-statistics and p-values for the main effects (Software Program and Experience Level) and the interaction effect between them.

# Interpretation of results:

#    If the p-value for the main effect of Software Program is significant (usually less than 0.05), it indicates that there is a significant difference in the average task completion time between at least one pair of software programs.
#    If the p-value for the main effect of Experience Level is significant (usually less than 0.05), it suggests that there is a significant difference in the average task completion time between novice and experienced employees across all software programs.
#    If the p-value for the interaction effect between Software Program and Experience Level is significant (usually less than 0.05), it implies that the effect of software programs on task completion time depends on the employee's experience level, and vice versa.

# It's important to note that a significant main effect doesn't necessarily mean that all individual groups (software programs or experience levels) are significantly different from each other. Post-hoc tests or pairwise comparisons can be conducted to determine which specific groups differ significantly if you find a significant main effect.

# Before running the analysis, ensure that you have replaced the placeholders with the actual data and column names from your DataFrame. Also, consider checking for assumptions like normality and homogeneity of variance before interpreting the results of the ANOVA.
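# The analysis above assumes that a DataFrame named data_frame already exists. The sketch below builds a simulated stand-in with the same three columns so that the model can be tried out, and tabulates the cell counts and cell means, which help when reading an interaction effect. The software labels, sample size per cell, and task times are all hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
programs = ["A", "B", "C"]             # hypothetical software program labels
levels = ["novice", "experienced"]
n_per_cell = 10                        # hypothetical number of employees per cell

rows = []
for program in programs:
    for level in levels:
        # Made-up average completion time: novices are assumed to be slower
        base_time = 30.0 if level == "novice" else 24.0
        times = rng.normal(loc=base_time, scale=3.0, size=n_per_cell)
        rows.extend(
            {"Task_Time": t, "Software_Program": program, "Experience_Level": level}
            for t in times
        )

data_frame = pd.DataFrame(rows)

# Cell counts show whether the design is balanced
cell_counts = pd.crosstab(data_frame["Software_Program"], data_frame["Experience_Level"])

# Cell means: roughly parallel rows suggest little interaction
cell_means = data_frame.pivot_table(
    values="Task_Time", index="Software_Program", columns="Experience_Level", aggfunc="mean"
)

print(cell_counts)
print(cell_means)
```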

### Question11

In [2]:
# To conduct a two-sample t-test in Python to compare the test scores between the control group (traditional teaching method) and the experimental group (new teaching method), you can use the scipy.stats library. By default, stats.ttest_ind assumes equal variances in the two groups. With only two groups, a significant t-test already identifies which groups differ, so a post-hoc test adds no new information; it is included here only to show the workflow that would apply with three or more groups. For post-hoc testing, we'll use Tukey's Honestly Significant Difference (HSD) test, which can be performed using the statsmodels library. Here's how you can do it:


import numpy as np
import scipy.stats as stats
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Generate some sample data (replace this with your actual data)
np.random.seed(42)  # For reproducibility
control_scores = np.random.normal(loc=70, scale=10, size=100)
experimental_scores = np.random.normal(loc=75, scale=12, size=100)

# Perform two-sample t-test
t_statistic, p_value = stats.ttest_ind(control_scores, experimental_scores)

# Print the t-test results
print("Two-sample t-test results:")
print("t-statistic:", t_statistic)
print("p-value:", p_value)

# Post-hoc test (Tukey's HSD) if the t-test results are significant
if p_value < 0.05:
    # Combine the data and create a corresponding group label
    data = np.concatenate([control_scores, experimental_scores])
    group_labels = np.array(['Control'] * len(control_scores) + ['Experimental'] * len(experimental_scores))
    
    # Create a DataFrame for post-hoc test
    df = pd.DataFrame({'Data': data, 'Group': group_labels})
    
    # Perform Tukey's HSD test
    posthoc = pairwise_tukeyhsd(df['Data'], df['Group'])
    
    # Print the post-hoc test results
    print("\nPost-hoc (Tukey's HSD) test results:")
    print(posthoc)

# Interpretation of results:

#    The t-statistic represents the test statistic of the two-sample t-test. It measures the difference between the means of the control and experimental groups relative to the spread of the data.
#    The p-value represents the probability of observing the data or more extreme data if the null hypothesis (no difference in test scores between the two groups) is true. If the p-value is small (usually less than 0.05), it suggests that there is a significant difference in test scores between the control and experimental groups.

# Interpretation of post-hoc (Tukey's HSD) test results:

#    The post-hoc test is run only when the t-test is significant (p-value < 0.05). With two groups it performs a single comparison and leads to the same conclusion as the t-test.
#    The output of Tukey's HSD test lists every pairwise comparison with its mean difference, adjusted p-value, confidence interval, and a reject flag. If the confidence interval includes zero, it indicates that there is no significant difference between those groups. If the confidence interval does not include zero, it means the groups are significantly different from each other. A p-adj of 0.0 in the table is a rounded value, not an exact zero.

# Before running the analysis, make sure to replace the control_scores and experimental_scores arrays with your actual test score data for the control and experimental groups, respectively.

Two-sample t-test results:
t-statistic: -4.316398519082441
p-value: 2.5039591073846333e-05

Post-hoc (Tukey's HSD) test results:
  Multiple Comparison of Means - Tukey HSD, FWER=0.05   
 group1    group2    meandiff p-adj lower  upper  reject
--------------------------------------------------------
Control Experimental   6.3061   0.0 3.4251 9.1872   True
--------------------------------------------------------
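# Because the two samples above are simulated with different standard deviations (10 and 12), it is reasonable to also look at Welch's t-test, which does not assume equal variances, and at an effect size. The sketch below regenerates the same simulated data and reports both; Cohen's d with a pooled standard deviation is one common choice of effect size, not the only one.

```python
import numpy as np
from scipy import stats

# Same simulated data as in the cell above
np.random.seed(42)
control_scores = np.random.normal(loc=70, scale=10, size=100)
experimental_scores = np.random.normal(loc=75, scale=12, size=100)

# Pooled-variance t-test (the default) and Welch's t-test
t_pooled, p_pooled = stats.ttest_ind(control_scores, experimental_scores)
t_welch, p_welch = stats.ttest_ind(control_scores, experimental_scores, equal_var=False)

# Cohen's d using the pooled standard deviation
n1, n2 = len(control_scores), len(experimental_scores)
var1 = control_scores.var(ddof=1)
var2 = experimental_scores.var(ddof=1)
pooled_sd = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
cohens_d = (experimental_scores.mean() - control_scores.mean()) / pooled_sd

print("Pooled t-test: t =", t_pooled, " p =", p_pooled)
print("Welch t-test:  t =", t_welch, " p =", p_welch)
print("Cohen's d:", cohens_d)
```

# With equal sample sizes the two t-statistics coincide and only the degrees of freedom differ, so the two p-values are close; the distinction matters more when the group sizes are unequal.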


### Question12

In [4]:
# A repeated measures ANOVA is used when the same subjects are measured multiple times under different conditions. In this scenario, the three retail stores (Store A, Store B, and Store C) are measured on the same 30 days, so the days play the role of subjects and the stores play the role of conditions. The analysis below uses a one-way ANOVA as a simpler approximation: it compares the average daily sales of the three stores but treats the three samples as independent, ignoring the fact that the observations are matched by day.

# Let's conduct a one-way ANOVA in Python to compare the average daily sales of the three retail stores. Additionally, if the results are significant, we'll follow up with a post-hoc test (e.g., Tukey's Honestly Significant Difference) to determine which stores differ significantly from each other:

import numpy as np
import pandas as pd
import scipy.stats as stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Generate some sample data (replace this with your actual data)
np.random.seed(42)  # For reproducibility
store_A_sales = np.random.randint(100, 500, 30)
store_B_sales = np.random.randint(150, 550, 30)
store_C_sales = np.random.randint(200, 600, 30)

# Combine the data and create a corresponding store label
data = np.concatenate([store_A_sales, store_B_sales, store_C_sales])
store_labels = np.array(['Store A'] * 30 + ['Store B'] * 30 + ['Store C'] * 30)

# Create a DataFrame for the analysis
df = pd.DataFrame({'Sales': data, 'Store': store_labels})

# Perform one-way ANOVA
F_statistic, p_value = stats.f_oneway(store_A_sales, store_B_sales, store_C_sales)

# Print the ANOVA results
print("One-way ANOVA results:")
print("F-statistic:", F_statistic)
print("p-value:", p_value)

# Post-hoc test (Tukey's HSD) if the ANOVA results are significant
if p_value < 0.05:
    # Perform Tukey's HSD test
    posthoc = pairwise_tukeyhsd(df['Sales'], df['Store'])
    
    # Print the post-hoc test results
    print("\nPost-hoc (Tukey's HSD) test results:")
    print(posthoc)

# Interpretation of results:

#    The F-statistic represents the test statistic of the one-way ANOVA. It measures the ratio of the between-group variance to the within-group variance. In simple terms, it compares how much the mean sales of the three stores vary from each other with how much daily sales vary within each store.
#    The p-value represents the probability of observing the data or more extreme data if the null hypothesis (all stores have the same average daily sales) is true. If the p-value is small (usually less than 0.05), it suggests that there is a significant difference in average daily sales between at least one pair of stores.

# Interpretation of post-hoc (Tukey's HSD) test results:

#    The post-hoc test is run only when the one-way ANOVA is significant (p-value < 0.05).
#    The output of Tukey's HSD test lists every pairwise comparison between the stores with its mean difference, adjusted p-value, confidence interval, and a reject flag. If the confidence interval includes zero, it indicates that there is no significant difference between those stores. If the confidence interval does not include zero, it means the stores are significantly different from each other. In the output below, only the Store A versus Store C comparison is flagged as significant.

# Before running the analysis, make sure to replace the store_A_sales, store_B_sales, and store_C_sales arrays with your actual sales data for each store.

One-way ANOVA results:
F-statistic: 5.970179522416751
p-value: 0.003718270834068322

Post-hoc (Tukey's HSD) test results:
  Multiple Comparison of Means - Tukey HSD, FWER=0.05   
 group1  group2 meandiff p-adj   lower    upper   reject
--------------------------------------------------------
Store A Store B  50.0333 0.2148 -20.5651 120.6318  False
Store A Store C    102.3 0.0024  31.7016 172.8984   True
Store B Store C  52.2667 0.1873 -18.3318 122.8651  False
--------------------------------------------------------
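# Since the three stores are observed on the same 30 days, a repeated measures analysis that treats each day as a subject is a natural alternative to the one-way ANOVA above. The following is a sketch of that calculation by hand, assuming the same simulated sales data; it removes day-to-day variation from the error term before forming the F-ratio. It does not test or correct for sphericity, which a full analysis would need to consider.

```python
import numpy as np
from scipy import stats

# Same simulated data as in the cell above
np.random.seed(42)
store_A_sales = np.random.randint(100, 500, 30)
store_B_sales = np.random.randint(150, 550, 30)
store_C_sales = np.random.randint(200, 600, 30)

# Rows are days (the "subjects"), columns are stores (the "conditions")
sales = np.column_stack([store_A_sales, store_B_sales, store_C_sales]).astype(float)
n_days, n_stores = sales.shape

grand_mean = sales.mean()
day_means = sales.mean(axis=1)
store_means = sales.mean(axis=0)

# Partition the total variation into stores, days, and residual error
ss_total = ((sales - grand_mean) ** 2).sum()
ss_stores = n_days * ((store_means - grand_mean) ** 2).sum()
ss_days = n_stores * ((day_means - grand_mean) ** 2).sum()
residuals = sales - day_means[:, None] - store_means[None, :] + grand_mean
ss_error = (residuals ** 2).sum()

df_stores = n_stores - 1
df_error = (n_stores - 1) * (n_days - 1)

f_statistic = (ss_stores / df_stores) / (ss_error / df_error)
p_value = stats.f.sf(f_statistic, df_stores, df_error)

print("Repeated measures F-statistic:", f_statistic)
print("p-value:", p_value)
```

# Because the simulated sales are generated independently for each store, there is no real day effect here; with actual data, days with generally high or low sales across all stores would make this analysis more sensitive than the one-way ANOVA.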
