## A working example

### 1. Data exploration

The `AU` data frame contains data from a research project investigating the effects of a novel behavioral therapy on facial-emotion processing in the context of autistic disorder. Given the extensive preliminary reports provided by family members and physicians, no pre-intervention measure was considered. Emotion processing is therefore compared between participants diagnosed with autistic disorder and healthy controls **after intervention only**. `AU` also includes the participants' biological sex (b-sex), age (age) and fluid intelligence score (f_inte); in the raw file loaded below, the columns carry German names such as `Geschlecht` (sex) and `Alter` (age).

Feel free to inspect the data frame in the next code chunk.

In [2]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd


# Read the tab-separated file using a relative path
AU = pd.read_csv("../PSM/Datasets/AU.txt", sep="\t", escapechar='"', skipinitialspace=True)

# Display the first few rows of the dataframe
print(AU.head())

# Display the shape of the dataframe
print(f"Shape of the dataframe: {AU.shape}")

# Count the NaN values in each column
nan_counts = AU.isna().sum()

# Display the count of NaN values in each column
print("Count of NaN values in each column:")
print(nan_counts)

   Gruppe  Geschlecht  Subject_id  Ort  Dia  Alter  NDeu  NEngl  NMathe  \
0       1           0       10001  HWI  Asp     12   NaN    NaN     NaN   
1       1           0       10002  HWI   AA      8   1.0    NaN     1.0   
2       1           0       10003  HWI  Asp     13   4.0    4.0     3.0   
3       1           1       11004  HWI  ASS     12   2.0    2.0     3.0   
4       1           1       11005  HWI  ASS     14   1.0    2.0     1.0   

   Gminiq3  Soz Schule Klasse  
0       28    1    NAN    NAN  
1       19    1    NAN    NAN  
2       27    1    NAN    NAN  
3       24    0    NAN    NAN  
4       41    0    NAN    NAN  
Shape of the dataframe: (28, 13)
Count of NaN values in each column:
Gruppe        0
Geschlecht    0
Subject_id    0
Ort           0
Dia           0
Alter         0
NDeu          1
NEngl         4
NMathe        1
Gminiq3       0
Soz           0
Schule        0
Klasse        0
dtype: int64
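
Note that `Schule` and `Klasse` display the literal text `NAN` in the rows above, yet the count reports 0 missing values for both columns: pandas does not treat the upper-case string `NAN` as a missing-value marker by default. The following is a minimal sketch of one way to handle this, assuming the markers should indeed count as missing. It uses a small hypothetical excerpt (`sample`) instead of the real file, and passes `na_values=["NAN"]` to `pd.read_csv`.

```python
import io
import pandas as pd

# Hypothetical three-row excerpt mimicking the tab-separated layout of the data files
sample = (
    "Subject_id\tSchule\tKlasse\n"
    "10001\tNAN\tNAN\n"
    "10002\tNAN\tNAN\n"
    "31\tCJD\t9b\n"
)

# Default parsing: the literal string "NAN" is kept as ordinary text
raw = pd.read_csv(io.StringIO(sample), sep="\t")
raw_missing = raw.isna().sum()

# Declaring "NAN" as an additional missing-value marker
clean = pd.read_csv(io.StringIO(sample), sep="\t", na_values=["NAN"])
clean_missing = clean.isna().sum()

print("Default parsing:")
print(raw_missing)
print('With na_values=["NAN"]:')
print(clean_missing)
```

The same `na_values` argument could be added to the `pd.read_csv` call for `AU.txt` if these entries should be counted as missing.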


The `KG` data frame holds the control group (no autism diagnosis). Its columns are identical to those of the `AU` data frame.

In [6]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd


# Read the tab-separated file using a relative path
KG = pd.read_csv("../PSM/Datasets/KG.txt", sep="\t", escapechar='"', skipinitialspace=True)

# Display the first few rows of the dataframe
print(KG.head())

# Display the shape of the dataframe
print(f"Shape of the dataframe: {KG.shape}")

# Count the NaN values in each column
nan_counts = KG.isna().sum()

# Display the count of NaN values in each column
print("Count of NaN values in each column:")
print(nan_counts)

   Gruppe  Geschlecht  Subject_id  Ort Dia  Alter  NDeu  NEngl  NMathe  \
0       0           0          31  HRO  KG   15.0   2.0    3.0     3.0   
1       0           0          35  HRO  KG   15.0   2.0    2.0     2.0   
2       0           0          37  HRO  KG   15.0   2.0    2.0     3.0   
3       0           0          38  HRO  KG   15.0   2.0    1.0     1.0   
4       0           0          39  HRO  KG   15.0   2.0    2.0     1.0   

   Gminiq3  Soz Schule Klasse  
0       36    2    CJD     9b  
1       31    2    CJD     9b  
2       33    2    CJD     9b  
3       28    2    CJD     9b  
4       45    2    CJD     9b  
Shape of the dataframe: (139, 13)
Count of NaN values in each column:
Gruppe        0
Geschlecht    0
Subject_id    0
Ort           0
Dia           0
Alter         1
NDeu          1
NEngl         1
NMathe        1
Gminiq3       0
Soz           0
Schule        0
Klasse        0
dtype: int64
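
Because both data frames share the same columns, they can be checked for consistency and stacked into a single frame for a group comparison. The sketch below illustrates the idea on two miniature stand-ins (`au_demo`, `kg_demo`) with illustrative values loosely based on the rows printed above. These names and the choice to combine the frames with `pd.concat` are assumptions, not necessarily the workflow used later in this example.

```python
import pandas as pd

# Hypothetical miniature stand-ins for AU and KG (illustrative values only)
cols = ["Gruppe", "Geschlecht", "Alter", "Gminiq3"]
au_demo = pd.DataFrame([[1, 0, 12, 28], [1, 1, 14, 41]], columns=cols)
kg_demo = pd.DataFrame([[0, 0, 15, 36], [0, 0, 15, 31], [0, 0, 15, 33]], columns=cols)

# Verify that both frames have the same columns in the same order
same_columns = list(au_demo.columns) == list(kg_demo.columns)

# Stack the rows; Gruppe keeps track of group membership (1 = autism, 0 = control)
combined = pd.concat([au_demo, kg_demo], ignore_index=True)
group_sizes = combined["Gruppe"].value_counts().sort_index()

print(f"Same columns: {same_columns}")
print(f"Combined shape: {combined.shape}")
print(group_sizes)
```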


### 2. Research question and hypothesis

We aim to assess differences in facial-emotion recognition between a control group (no autism) and an intervention group in which individuals with autistic disorder received a novel therapy targeting social competences.

Because individuals in the treatment group received intensive facial-expression recognition training prior to the assessment, we expect no difference between the two groups in facial-emotion recognition performance.
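
To make the hypothesis concrete, a group comparison of this kind can be sketched as a difference in means scaled by its standard error (Welch's t statistic). The example below is only an illustration under stated assumptions: the outcome column `score` and all values in `demo` are hypothetical, since the emotion-recognition measure does not appear among the columns printed above.

```python
import math
import pandas as pd

# Hypothetical data: "score" is an assumed name for the emotion-recognition outcome
demo = pd.DataFrame({
    "Gruppe": [1, 1, 1, 1, 0, 0, 0, 0],
    "score": [20, 22, 19, 23, 22, 23, 21, 22],
})

# Per-group mean, sample variance (ddof=1) and size
stats = demo.groupby("Gruppe")["score"].agg(["mean", "var", "count"])

# Welch's t statistic: mean difference divided by its standard error
diff = stats.loc[1, "mean"] - stats.loc[0, "mean"]
se = math.sqrt(
    stats.loc[1, "var"] / stats.loc[1, "count"]
    + stats.loc[0, "var"] / stats.loc[0, "count"]
)
welch_t = diff / se

print(stats)
print(f"Mean difference (autism - control): {diff:.2f}")
print(f"Welch t statistic: {welch_t:.2f}")
```

A small test statistic on its own does not establish that the groups are equivalent; it only indicates that no difference was detected in the sample at hand.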