# Causal Analysis of Multimodal Emotion Data

## Abstract

This notebook analyzes the RECOLA dataset, which combines multimodal recordings (audio, video, physiological signals such as ECG and EDA, and other measures) with arousal and valence annotations. The primary objective is to uncover causal relationships between these multimodal measures and emotional states. The data are first preprocessed to ensure integrity, then reduced with PCA or ICA to retain the most informative components. Causal discovery runs the PC algorithm under k-fold cross-validation to make the identified causal links more robust. The methodology is flexible: configurations and thresholds can be varied to tune the analysis. The findings aim to contribute to the understanding of the dynamics between multimodal sensory data and human emotional states, with potential applications ranging from affective computing to psychological research.

## Experiment Parameters

- **Participant Number**: Identifier for the participant being analyzed.
- **Folds Number**: Number of folds for cross-validation in causal discovery.
- **Components Threshold (COMPONENTS_THRESHOLD)**: Upper bound on the number of PCA/ICA components retained; the actual number is chosen from the explained variance and capped at this threshold.
- **Edge Cutoff (EDGE_CUTOFF)**: Minimum number of times an edge must appear across the per-fold causal graphs to be included in the final causal graph.
- **Analysis Features (ANALYSIS_FEATURES)**: Categories of data included in the causal analysis, with Arousal and Valence always selected.
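
As a minimal sketch, the parameters above could be grouped into a single configuration object. The class name `ExperimentConfig`, the field names, the default values, and the lowercase category strings are all assumptions made for illustration; they are not taken from the notebook's code. The only rule carried over from the list above is that Arousal and Valence are always part of the analysis.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical category labels; the notebook may spell them differently.
ALWAYS_SELECTED = ("arousal", "valence")


@dataclass
class ExperimentConfig:
    participant: int                 # Participant Number
    n_folds: int = 5                 # Folds Number (default is an assumption)
    components_threshold: int = 5    # COMPONENTS_THRESHOLD (default is an assumption)
    edge_cutoff: int = 3             # EDGE_CUTOFF (default is an assumption)
    analysis_features: List[str] = field(default_factory=lambda: ["audio", "video"])

    def __post_init__(self) -> None:
        if self.n_folds < 2:
            raise ValueError("k-fold cross-validation needs at least 2 folds")
        if self.components_threshold < 1 or self.edge_cutoff < 1:
            raise ValueError("thresholds must be positive")
        # Copy so the caller's list is not modified, then make sure the
        # emotion annotations are always part of the analysis.
        self.analysis_features = list(self.analysis_features)
        for name in ALWAYS_SELECTED:
            if name not in self.analysis_features:
                self.analysis_features.append(name)


config = ExperimentConfig(participant=1, analysis_features=["audio", "ecg"])
print(config.analysis_features)
```

Keeping the parameters in one validated object makes it easier to rerun the experiment with different thresholds without touching the analysis code.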

## Experiment Steps

1. **Data Preprocessing**
    - Data Cleaning, Sub-sampling, Standardization, Validation Checks.
2. **Data Categorization**
    - Organize data into categories (audio, video, ECG, EDA, other).
3. **Dimensionality Reduction**
    - Perform PCA or ICA to select the most informative components.
4. **Feature Selection and Experiment Configuration**
    - Select data categories for analysis based on ANALYSIS_FEATURES.
5. **Causal Discovery Experiment**
    - Implement k-fold cross-validation and run the PC algorithm.
6. **Result Aggregation and Analysis**
    - Compile the per-fold causal graphs, build a histogram of how often each node pair is connected, and apply EDGE_CUTOFF.
7. **Causal Graph Visualization**
    - Display the final causal graph highlighting significant relationships.
8. **Conclusion and Future Work**
    - Summarize findings and propose directions for future research.
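
The result aggregation in step 6 can be sketched as follows. This is an illustration under stated assumptions, not the notebook's implementation: the function name `aggregate_edges` is hypothetical, node pairs are treated as unordered, each pair is counted at most once per fold, and the per-fold edge lists are toy values written by hand rather than output of the PC algorithm.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def aggregate_edges(
    fold_edges: Iterable[Iterable[Edge]], edge_cutoff: int
) -> Tuple[Counter, Dict[Edge, int]]:
    """Count how often each node pair appears across folds and keep
    the pairs that reach the cutoff."""
    counts: Counter = Counter()
    for edges in fold_edges:
        # Assumption: direction is ignored, so (a, b) and (b, a) are the
        # same pair; the set ensures one count per pair per fold.
        counts.update({tuple(sorted(edge)) for edge in edges})
    kept = {pair: n for pair, n in counts.items() if n >= edge_cutoff}
    return counts, kept


# Toy per-fold edge lists (made up for illustration, not real results).
toy_fold_edges = [
    [("audio_pc1", "arousal"), ("ecg_pc1", "valence")],
    [("arousal", "audio_pc1")],
    [("audio_pc1", "arousal"), ("eda_pc1", "arousal")],
]

histogram, final_edges = aggregate_edges(toy_fold_edges, edge_cutoff=2)
print(histogram)
print(final_edges)
```

The `histogram` counter corresponds to the histogram of node pairs, and `final_edges` holds the pairs that would be drawn in the final causal graph. If edge direction matters for the analysis, the `sorted` call can be dropped so that each orientation is counted separately.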