# Independent Samples t-test: Eye Movement Study

## Research Question
Does fixation lead to better memory recall than horizontal eye movements?

Dataset: Matzke et al. (2015)
DV: CriticalRecall
IV: Condition (Horizontal vs Fixation)

## Step 1: Import Libraries

- `pandas` handles tabular data (similar to a spreadsheet).
- `scipy.stats` provides statistical tests such as the independent samples t-test.
- `numpy` provides mathematical operations (e.g., square roots, variance calculations).

These libraries allow us to perform statistical analysis in Python.

In [None]:
import pandas as pd
import numpy as np
from scipy import stats

## Step 2: Load the Dataset

The Excel file is read into a pandas DataFrame.

This allows us to manipulate and analyse the data programmatically.

In [3]:
import pandas as pd
data = pd.read_excel("C:/Users/mihna/OneDrive/Desktop/eyemove.xlsx")
data.head()

| Unnamed: 0 | ParticipantNumber | Condition | CriticalRecall |
|---|---|---|---|
| 0 | 1 | Horizontal | 4 |
| 1 | 3 | Fixation | 14 |
| 2 | 4 | Horizontal | 12 |
| 3 | 6 | Fixation | 4 |
| 4 | 7 | Horizontal | 11 |


## Step 3: Separate Groups

Extract the recall scores for each experimental condition.

- First, filter rows where Condition == "Horizontal".
- Then select the "CriticalRecall" column.
- Repeat the same for the Fixation group.

This gives two separate sets of scores to compare.

In [5]:
horizontal = data[data["Condition"] == "Horizontal"]["CriticalRecall"]
fixation = data[data["Condition"] == "Fixation"]["CriticalRecall"]
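The same split can also be written with `groupby`, which avoids repeating the filter for each condition. The following is a minimal sketch, assuming a toy DataFrame built only from the five rows shown by `data.head()` (not the full dataset); the names `toy`, `groups`, `horizontal_toy`, and `fixation_toy` are illustrative.

```python
import pandas as pd

# Toy data: only the five rows displayed by data.head(), not the full dataset.
toy = pd.DataFrame({
    "ParticipantNumber": [1, 3, 4, 6, 7],
    "Condition": ["Horizontal", "Fixation", "Horizontal", "Fixation", "Horizontal"],
    "CriticalRecall": [4, 14, 12, 4, 11],
})

# Group once by condition, then pull each condition's scores by name.
groups = toy.groupby("Condition")["CriticalRecall"]
horizontal_toy = groups.get_group("Horizontal")
fixation_toy = groups.get_group("Fixation")

print(horizontal_toy.tolist())  # [4, 12, 11]
print(fixation_toy.tolist())    # [14, 4]
```

Both approaches return a pandas Series of recall scores per condition, so either can be passed to `ttest_ind()`.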

## Step 4: Independent Samples t-test

Test whether the mean recall differs between the Horizontal and Fixation groups.

`ttest_ind()` performs an independent samples t-test.

- `horizontal, fixation` -> the two groups being compared
- `equal_var=True` -> assumes equal variances (Student’s t-test)

The test evaluates whether the difference in means is larger than what would be expected due to random sampling variability.

In [6]:
stats.ttest_ind(horizontal, fixation, equal_var=True)

TtestResult(statistic=np.float64(-2.845274620058386), pvalue=np.float64(0.006553815987160374), df=np.float64(47.0))

### Interpretation of Results

t(47) = -2.85, p = .007

- The negative sign indicates that the Horizontal group had lower recall than the Fixation group.
- The p-value is below .05, indicating a statistically significant difference.
- Therefore, recall performance differs between the two conditions.

Conclusion: The Fixation group recalled significantly more words than the Horizontal group.
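
As a cross-check, the same test can be run from summary statistics alone with `scipy.stats.ttest_ind_from_stats`. This is a sketch under stated assumptions: the means and SDs are copied from the outputs in Step 6, and the group sizes (25 and 24) are not printed anywhere in this notebook but are inferred from df = 47 and the pooled SD.

```python
from scipy import stats

# Summary statistics copied from the notebook outputs in Step 6.
mean_h, sd_h = 10.88, 4.323578764557589
mean_f, sd_f = 15.291666666666666, 6.3757636655416094
# Assumption: group sizes inferred from df = 47 and the pooled SD.
n_h, n_f = 25, 24

# Student's t-test (equal variances assumed) from summary statistics.
result = stats.ttest_ind_from_stats(
    mean1=mean_h, std1=sd_h, nobs1=n_h,
    mean2=mean_f, std2=sd_f, nobs2=n_f,
    equal_var=True,
)
print(round(result.statistic, 2), round(result.pvalue, 3))  # -2.85 0.007
```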

## Step 5: Repeat with Welch's t-test

- `equal_var=False` -> does not assume equal variances, so `ttest_ind()` performs Welch's t-test.

In [7]:
stats.ttest_ind(horizontal, fixation, equal_var=False)

TtestResult(statistic=np.float64(-2.8234133654901394), pvalue=np.float64(0.007351503583719712), df=np.float64(40.26876885842331))
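
The fractional degrees of freedom come from the Welch-Satterthwaite approximation. The sketch below recomputes the statistic and df by hand from the summary statistics in Step 6; the group sizes (25 and 24) are an assumption inferred from df = 47 in the Student's test, and the variable names are illustrative.

```python
# Summary statistics from Step 6.
mean_h, sd_h = 10.88, 4.323578764557589
mean_f, sd_f = 15.291666666666666, 6.3757636655416094
# Assumption: group sizes inferred from df = 47, not printed in the notebook.
n_h, n_f = 25, 24

# Squared standard error of each group mean.
v_h = sd_h**2 / n_h
v_f = sd_f**2 / n_f

# Welch's t uses the unpooled standard error.
welch_t = (mean_h - mean_f) / (v_h + v_f) ** 0.5
# Welch-Satterthwaite approximation for the degrees of freedom.
welch_df = (v_h + v_f) ** 2 / (v_h**2 / (n_h - 1) + v_f**2 / (n_f - 1))

print(round(welch_t, 2), round(welch_df, 2))  # -2.82 40.27
```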

## Step 6: Calculate mean and standard deviation for each group

`ddof=1` computes the sample SD (dividing by n−1), which is appropriate for statistical inference. pandas' `.std()` already uses `ddof=1` by default; it is written out here to make the choice explicit.

In [8]:
horizontal.mean(), fixation.mean()

(np.float64(10.88), np.float64(15.291666666666666))

In [9]:
horizontal.std(ddof=1), fixation.std(ddof=1)

(4.323578764557589, 6.3757636655416094)
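
The `ddof` setting matters most when switching between libraries, because pandas and NumPy use different defaults. A small sketch, assuming only the three Horizontal scores visible in `data.head()` (not the full group):

```python
import numpy as np
import pandas as pd

# Horizontal scores visible in data.head(); not the full group.
scores = pd.Series([4, 12, 11])

sample_sd = scores.std()                    # pandas default: ddof=1 (divide by n-1)
population_sd = np.std(scores.to_numpy())   # NumPy default: ddof=0 (divide by n)
numpy_sample_sd = np.std(scores.to_numpy(), ddof=1)  # matches the pandas value

print(round(sample_sd, 3), round(population_sd, 3))  # 4.359 3.559
```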

## Step 7: Calculate Pooled Standard Deviation

The pooled standard deviation combines the variability of both groups: it is the square root of a weighted average of the two group variances.

This value represents the shared within-group variability and is required to compute Cohen’s d.

Each part of the formula has a role:

- `sd1**2` and `sd2**2` convert standard deviations into variances.
- `(n1 - 1)` and `(n2 - 1)` weight each group by its degrees of freedom.
- The numerator adds total within-group variability.
- `(n1 + n2 - 2)` divides by total degrees of freedom.
- `np.sqrt()` converts pooled variance back into standard deviation.

In [10]:
import numpy as np

n1 = len(horizontal)
n2 = len(fixation)

sd1 = horizontal.std(ddof=1)
sd2 = fixation.std(ddof=1)

pooled_sd = np.sqrt(((n1-1)*sd1**2 + (n2-1)*sd2**2) / (n1+n2-2))

pooled_sd

np.float64(5.42570386321881)
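
The calculation can also be wrapped in a small reusable function. This is a sketch, with `pooled_sd_from_stats` as a hypothetical helper name; the SDs are copied from Step 6, and the group sizes (25 and 24) are an assumption inferred from df = 47 rather than values printed in the notebook.

```python
import math

def pooled_sd_from_stats(sd1, n1, sd2, n2):
    """Pooled SD: square root of the df-weighted average of the two variances."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

# SDs from Step 6; group sizes are an inferred assumption.
value = pooled_sd_from_stats(4.323578764557589, 25, 6.3757636655416094, 24)
print(round(value, 4))  # 5.4257
```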

## Step 8: Calculate Cohen's d

Cohen’s d standardizes the mean difference by dividing it by the pooled standard deviation.

Formula:
d = (Mean₁ − Mean₂) / SD_pooled

This expresses the group difference in standard deviation units.

In [11]:
d = (horizontal.mean() - fixation.mean()) / pooled_sd
d

np.float64(-0.8131049496773374)
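
For the equal-variance test, Cohen's d and the t statistic are linked by d = t × sqrt(1/n1 + 1/n2), which gives a quick consistency check. A sketch, assuming the group sizes 25 and 24 (inferred from df = 47, not printed in the notebook):

```python
import math

# t statistic from the Student's t-test output in Step 4.
t_stat = -2.845274620058386
# Assumption: group sizes inferred from df = 47.
n1, n2 = 25, 24

# Convert t to Cohen's d for two independent groups with pooled variance.
d_from_t = t_stat * math.sqrt(1 / n1 + 1 / n2)
print(round(d_from_t, 2))  # -0.81
```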

### Effect Size Interpretation

d = -0.81

The negative sign indicates the Horizontal group scored lower.

The magnitude (|d| = 0.81) is conventionally considered a large effect (Cohen's benchmark for "large" is d ≥ 0.8).

This suggests the difference is not only statistically significant but also practically meaningful.