In [1]:
import pandas as pd

# Load the dataset
data = pd.read_csv('C:/Users/Harvey/Downloads/RatExploration_csvFile.csv')


# Display the dataset and list its column names to understand its structure

display(data)
data.columns

Unnamed: 0,ID,Stimuli,Time
0,1,Shape,2.0
1,2,Shape,0.75
2,3,Shape,1.25
3,4,Shape,1.0
4,5,Shape,1.5
5,6,Shape,1.25
6,7,Shape,1.75
7,8,Shape,0.5
8,9,Pattern,2.5
9,10,Pattern,3.25


Index(['ID', 'Stimuli', 'Time'], dtype='object')

## Provide a full analysis using one of the ANOVA types to compare the number of seconds the rats spent exploring the experiment chamber with the images. Is there a significant difference in time spent between the three treatment conditions: shapes, patterns, and pictures?

The dataset contains the following columns:

ID: Identifier for each rat.

Stimuli: The type of stimulus presented to each rat (Shape, Pattern, or Picture).

Time: The time in seconds that each rat spent exploring the experiment chamber.

To compare the time spent by rats in different treatment conditions (shapes, patterns, pictures), we can perform an Analysis of Variance (ANOVA). ANOVA is used to determine if there are any statistically significant differences between the means of three or more independent (unrelated) groups. In this case, the groups are the different types of stimuli.
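To make the test statistic concrete, the sketch below computes the one-way ANOVA F-statistic by hand as the ratio of the between-group mean square to the within-group mean square. The function name `one_way_f` and the toy samples are hypothetical illustrations, not part of the rat dataset.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical toy samples (NOT the rat data)
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_f(toy)
print(f_stat, df_b, df_w)
```

A large F means the group means are spread out relative to the scatter inside each group, which is what `stats.f_oneway` converts into a p-value.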

In [2]:
import scipy.stats as stats

# Preparing the data for ANOVA
groups = data.groupby('Stimuli')['Time']

# Perform One-Way ANOVA
anova_result = stats.f_oneway(*[group for name, group in groups])

anova_result


F_onewayResult(statistic=62.08885876263263, pvalue=6.53173977189194e-12)

F-statistic: 62.09

P-value: 6.53e-12

The p-value (6.53e-12) is far below the conventional alpha level of 0.05, so we reject the null hypothesis that the mean exploration time is the same under all three treatment conditions (shapes, patterns, and pictures). The type of stimulus therefore has a statistically significant effect on the time rats spend exploring the experiment chamber. Note that ANOVA shows only that at least one group mean differs from the others; it does not identify which groups differ.

## Check the assumptions underlying the chosen ANOVA type.


To ensure the validity of the One-Way ANOVA results, we need to check the following assumptions:

Independence of Observations: The observations should be independent of each other. This is more of a study design issue and is usually assumed to be met in a well-designed experiment.

Normality: The residuals (each observation minus its group mean) should be approximately normally distributed within each group.

Homogeneity of Variances (Homoscedasticity): The variances among the groups should be approximately equal.
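As a minimal sketch of what "residuals" means for the normality assumption above, the code below subtracts each group's mean from its observations. The helper `group_residuals` and the toy values are hypothetical and are not taken from the rat dataset.

```python
from statistics import mean

# Hypothetical toy data keyed by stimulus (NOT the rat data)
toy_times = {
    "Shape": [1.0, 2.0, 3.0],
    "Pattern": [2.0, 3.0, 4.0],
    "Picture": [5.0, 6.0, 7.0],
}

def group_residuals(samples):
    """Map each group name to its residuals (observation minus group mean)."""
    residuals = {}
    for name, values in samples.items():
        group_mean = mean(values)
        residuals[name] = [x - group_mean for x in values]
    return residuals

residuals = group_residuals(toy_times)
# Pooled residuals across all groups, e.g. for a single overall normality check
pooled = [r for values in residuals.values() for r in values]
print(residuals)
```

Because subtracting a constant does not change the shape of a distribution, running a normality test on each group's raw times is equivalent to running it on that group's residuals.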



In [3]:
import scipy.stats as stats

# Shapiro-Wilk Test for each stimuli group
shapiro_results = {stimulus: stats.shapiro(data[data['Stimuli'] == stimulus]['Time']) for stimulus in data['Stimuli'].unique()}
shapiro_results

{'Shape': ShapiroResult(statistic=0.943737804889679, pvalue=0.5479545593261719),
 'Pattern': ShapiroResult(statistic=0.9500510692596436, pvalue=0.6377314925193787),
 'Picture': ShapiroResult(statistic=0.9151579737663269, pvalue=0.24829529225826263)}

Normality Check
We used the Shapiro-Wilk test for normality. The results are:

Shapes: Shapiro statistic = 0.944, p-value = 0.548

Patterns: Shapiro statistic = 0.950, p-value = 0.638

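Pictures: Shapiro statistic = 0.915, p-value = 0.248

All three p-values are greater than 0.05, so we fail to reject the null hypothesis of normality for any group. The normality assumption appears to be met.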


In [4]:
# Levene's Test for homogeneity of variances across the groups
levene_result = stats.levene(*[data[data['Stimuli'] == stimulus]['Time'] for stimulus in data['Stimuli'].unique()])
levene_result

LeveneResult(statistic=0.43132798942236117, pvalue=0.6532556922411884)

Levene's test was used to assess the homogeneity of variances. The result is:

Levene statistic = 0.431, p-value = 0.653

The p-value is greater than 0.05, so we fail to reject the null hypothesis of equal variances. There is no evidence that the variances differ across the groups, so the homoscedasticity assumption appears to be met.
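For intuition about what Levene's test measures, the sketch below computes the statistic by hand: it is a one-way ANOVA F-statistic applied to the absolute deviations of each observation from its group center. The median is used as the center here, which matches the default `center='median'` of `scipy.stats.levene`. The function name `levene_median` and the toy samples are hypothetical, not from the rat dataset.

```python
from statistics import mean, median

def levene_median(groups):
    """Levene statistic using absolute deviations from each group's median."""
    # Replace each observation by its absolute distance from the group median
    z = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(z)                        # number of groups
    n = sum(len(zi) for zi in z)      # total number of observations
    grand = mean([v for zi in z for v in zi])

    # One-way ANOVA sums of squares on the deviations
    ss_between = sum(len(zi) * (mean(zi) - grand) ** 2 for zi in z)
    ss_within = sum((v - mean(zi)) ** 2 for zi in z for v in zi)
    return ((n - k) / (k - 1)) * ss_between / ss_within

# Hypothetical toy samples (NOT the rat data)
toy = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 6.0, 7.0]]
w_stat = levene_median(toy)
print(w_stat)
```

A small statistic means the typical spread around the center is similar in every group, which is why a large p-value is read as no evidence against equal variances.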

## Conclusion

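Neither the Shapiro-Wilk tests nor Levene's test gave evidence against the normality and homogeneity-of-variance assumptions required for One-Way ANOVA. Provided the observations are independent, which depends on the study design, the ANOVA result (F = 62.09, p = 6.53e-12) can be considered valid.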