Provide an R or Python (with markdown) file for the independent-samples t-test on the Invisibility Cloak data set.

In [4]:
import pandas as pd
from IPython.display import display

df = pd.read_csv('Invisibility_Cloak.csv')
display(df.style.hide(axis="index"))

Participant,Cloak,Mischief
1,0,3
2,0,1
3,0,5
4,0,4
5,0,6
6,0,4
7,0,6
8,0,2
9,0,0
10,0,5


We want to run an independent-samples t-test on this dataset. Before proceeding, we first need to verify that the required assumptions are met.

*Checking Assumptions*

**Assumption 1:** There is one dependent variable that is measured at the continuous level.

<ins>Mischief</ins> fulfills this requirement. Thus, Assumption 1 is met.

**Assumption 2:** There is one independent variable that consists of two categorical, independent groups.

<ins>Cloak</ins> is a dichotomous independent variable that takes only the values '0' (without a cloak) and '1' (with a cloak). Hence, Assumption 2 is met.

**Assumption 3:** The data should have independence of observations, meaning that there is no relationship between the observations in each group of the independent variable or between the groups themselves.

No <ins>participant</ins> both has and does not have a cloak, so each participant belongs to exactly one group. Therefore, Assumption 3 is met.
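The checks for Assumptions 2 and 3 can also be done programmatically rather than by eye. The following is a minimal sketch, assuming the column names shown in the data preview (`Participant`, `Cloak`). The helper `check_grouping` is a hypothetical name introduced for illustration, and the `toy` frame is made-up data, not the real data set.

```python
import pandas as pd


def check_grouping(df, id_col="Participant", group_col="Cloak"):
    """Return the distinct group levels and whether every ID appears only once."""
    # Distinct values of the grouping variable (expected: exactly two)
    levels = sorted(df[group_col].dropna().unique().tolist())
    # If every participant ID is unique, nobody can sit in both groups
    ids_unique = bool(df[id_col].is_unique)
    return levels, ids_unique


# Made-up toy data for illustration only (not the Invisibility Cloak data)
toy = pd.DataFrame({
    "Participant": [1, 2, 3, 4],
    "Cloak": [0, 0, 1, 1],
    "Mischief": [3, 1, 4, 6],
})

levels, ids_unique = check_grouping(toy)
print("Group levels:", levels)
print("Each participant appears once:", ids_unique)
```

On the real data, the same call would be `check_grouping(df)`. Two levels and unique IDs are consistent with Assumptions 2 and 3, although independence ultimately depends on the study design and not only on the data layout.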

**Assumption 4:** There should be no significant outliers in the two groups of the independent variable in terms of the dependent variable.

In [16]:
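import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="whitegrid")

fig, ax = plt.subplots(figsize=(8, 6))

try:
    # ptitprince is a third-party package (install with: pip install ptitprince)
    import ptitprince as pt
    pt.RainCloud(x="Cloak", y="Mischief", data=df, palette="Set2", bw=0.2,
                 width_viol=0.6, ax=ax, orient="h", move=0.2)
    ax.set_title('Raincloud Plots')
except ImportError:
    # Fall back to plain boxplots when ptitprince is not installed
    sns.boxplot(x="Mischief", y="Cloak", data=df, orient="h", ax=ax)
    ax.set_title('Boxplots of Mischief by Cloak')

plt.show()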

Visual inspection of the boxplots shows no notable outliers in Mischief for either Cloak group, so Assumption 4 is met.
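A boxplot conventionally draws as outliers the points that lie more than 1.5 times the interquartile range beyond the quartiles, so the visual check can be backed up numerically. The sketch below is one way to do that, under stated assumptions: `iqr_outliers` is a hypothetical helper, the group 0 values are the ten rows shown in the data preview, and the group 1 values are made up (including a deliberate outlier) purely to show the function flagging something.

```python
import pandas as pd


def iqr_outliers(df, group_col="Cloak", value_col="Mischief", k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR] within their group."""
    flagged = []
    for _, grp in df.groupby(group_col):
        q1 = grp[value_col].quantile(0.25)
        q3 = grp[value_col].quantile(0.75)
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        flagged.append(grp[(grp[value_col] < lower) | (grp[value_col] > upper)])
    return pd.concat(flagged)


# Group 0: the ten rows shown in the preview; group 1: made-up values for illustration
toy = pd.DataFrame({
    "Cloak": [0] * 10 + [1] * 6,
    "Mischief": [3, 1, 5, 4, 6, 4, 6, 2, 0, 5] + [4, 5, 5, 6, 5, 20],
})

outliers = iqr_outliers(toy)
print(outliers)
```

On the real data, `iqr_outliers(df)` returning an empty frame would agree with the visual inspection.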

**Assumption 5:** The dependent variable should be approximately normally distributed within each group of the independent variable.

In [14]:
from scipy import stats

# Split the dependent variable by group (0 = without a cloak, 1 = with a cloak)
without_cloak = df[df['Cloak'] == 0]['Mischief']
with_cloak = df[df['Cloak'] == 1]['Mischief']

# Shapiro-Wilk test of normality, computed separately for each group
stat_without_cloak, p_without_cloak = stats.shapiro(without_cloak)
stat_with_cloak, p_with_cloak = stats.shapiro(with_cloak)

shapiro_df = pd.DataFrame({
    'Mischief': ['Without a cloak', 'With a cloak'],
    'W': [stat_without_cloak, stat_with_cloak],
    'p': [p_without_cloak, p_with_cloak]
}).round(3)

display(shapiro_df.style.hide(axis="index"))

Mischief,W,p
Without a cloak,0.913,0.231
With a cloak,0.973,0.936


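Both Shapiro-Wilk tests are non-significant (p = 0.231 without a cloak, p = 0.936 with a cloak), so there is no evidence that Mischief departs from normality in either group. Thus, Assumption 5 is met.

**Assumption 6:** The variances of the dependent variable should be approximately equal across the two groups (homogeneity of variance).

In [15]: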
# Levene's test for equality of variances (SciPy centers on the median by default)
stat_levene, p_levene = stats.levene(without_cloak, with_cloak)

levene_df = pd.DataFrame({
    'F': [stat_levene],
    'df': [1],  # numerator degrees of freedom: number of groups - 1
    'p': [p_levene]
}).round(3)

display(levene_df.style.hide(axis="index"))

F,df,p
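0.27,1,0.609

Levene's test is not significant (F = 0.27, p = 0.609), so the two groups can be treated as having equal variances.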
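To make the F statistic less of a black box: with median centering, Levene's test is equivalent to a one-way ANOVA on the absolute deviations of each observation from its group median. The sketch below illustrates this, assuming SciPy's default `center='median'`. `levene_by_hand` is a hypothetical helper, group `a` uses the ten without-cloak rows shown in the data preview, and group `b` is made up, so the numbers it prints are not the results for the real data set.

```python
import numpy as np
from scipy import stats


def levene_by_hand(a, b):
    """Median-centered Levene test as a one-way ANOVA on absolute deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Absolute deviation of each observation from its own group median
    dev_a = np.abs(a - np.median(a))
    dev_b = np.abs(b - np.median(b))
    # One-way ANOVA on the deviations yields Levene's F and its p-value
    f_stat, p_value = stats.f_oneway(dev_a, dev_b)
    return f_stat, p_value


a = [3, 1, 5, 4, 6, 4, 6, 2, 0, 5]   # rows shown in the preview (Cloak = 0)
b = [4, 5, 5, 6, 5, 7, 3, 8]         # made-up second group for illustration

f_hand, p_hand = levene_by_hand(a, b)
f_scipy, p_scipy = stats.levene(a, b)  # default center='median'

print(f"by hand: F = {f_hand:.4f}, p = {p_hand:.4f}")
print(f"scipy:   F = {f_scipy:.4f}, p = {p_scipy:.4f}")
```

The two lines should agree, which shows that the `df` of 1 in the table above is the numerator degrees of freedom (number of groups minus one).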
