# DecVAE Tutorial: IEMOCAP Dataset

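This notebook walks through the complete DecVAE workflow on the IEMOCAP dataset: dataset preparation, input visualization, fine-tuning, latent evaluation, latent visualization, and latent traversals.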

In [None]:
# Import necessary libraries
import os
import json
from pathlib import Path

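# Set the working directory to the DecVAE root.
# This assumes the notebook is run from the project root or from its 'examples'
# subdirectory; adjust DECVAE_ROOT if your local layout differs.
cwd = Path.cwd()
# Compare the directory name itself, so an unrelated 'examples' elsewhere in the path is ignored
DECVAE_ROOT = cwd.parent if cwd.name == 'examples' else cwd
os.chdir(DECVAE_ROOT)
print(f"Working directory: {Path.cwd()}")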

## 1. Prepare IEMOCAP Dataset

The IEMOCAP (Interactive Emotional Dyadic Motion Capture) dataset contains emotional speech recordings.

Download from: [https://sail.usc.edu/iemocap/](https://sail.usc.edu/iemocap/)

After downloading, place the dataset in "../IEMOCAP", i.e. next to the DecVAE project directory rather than inside it.
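
A quick check like the following can confirm that the dataset sits where the later steps expect it. This is a minimal sketch: find_iemocap is a hypothetical helper, not part of DecVAE, and it relies only on the "../IEMOCAP" location stated above, not on any particular internal layout of the dataset.

```python
from pathlib import Path


def find_iemocap(project_root):
    """Return the expected IEMOCAP path next to project_root and whether it exists.

    find_iemocap is a hypothetical helper, not part of DecVAE.
    """
    dataset_dir = Path(project_root).resolve().parent / "IEMOCAP"
    return dataset_dir, dataset_dir.is_dir()


# Assumes the current working directory is the DecVAE root.
dataset_dir, found = find_iemocap(Path.cwd())
if found:
    print(f"Found IEMOCAP at {dataset_dir}")
else:
    print(f"IEMOCAP not found; expected it at {dataset_dir}")
```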

## 2. Input Visualization

We generate visualizations of the raw audio signal (X) and of the components obtained by decomposing it. We visualize the individual components (OC1, OC2, ..., OCn) as well as aggregated representations, e.g. the concatenation of X with all components, [X, OC1, OC2, ..., OCn]. The representations are colored either by the frequency correspondence of the inputs or by a generative factor (phoneme, speaker, emotion).

For the IEMOCAP dataset, we will visualize the inputs to all models.

Frame-level:

In [None]:
# Visualize frame-level inputs
!accelerate launch scripts/visualize/low_dim_vis_input.py \
    --config_file config_files/input_visualizations/config_visualizing_input_frames_iemocap.json

Sequence-level:

In [None]:
# Visualize sequence-level inputs
!accelerate launch scripts/visualize/low_dim_vis_input.py \
    --config_file config_files/input_visualizations/config_visualizing_input_sequences_iemocap.json

## 3. Fine-tuning DecVAE

For IEMOCAP, we fine-tune a pre-trained model rather than training from scratch.

Single-GPU: pass the --gpu_ids argument to select the GPU by id (0, 1, 2, ...), i.e. accelerate launch --gpu_ids <id> scripts... . Alternatively, omit the argument to use the default GPU on your system, as in the cell below.

In [None]:
# Fine-tune DecVAE on single GPU
!accelerate launch scripts/fine_tuning/ssl_fine_tune_pretrained_models.py \
    --config_file config_files/DecVAEs/iemocap/fine_tuning/config_finetune_iemocap_NoC4.json

Multi-GPU (specify GPU IDs):

In [None]:
# Fine-tune DecVAE on multiple GPUs (e.g., GPU 0 and 1)
# Uncomment and modify as needed:
# !accelerate launch --gpu_ids 0,1 scripts/fine_tuning/ssl_fine_tune_pretrained_models.py \
#     --config_file config_files/DecVAEs/iemocap/fine_tuning/config_finetune_iemocap_NoC4.json
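
The single-GPU and multi-GPU launches differ only in the --gpu_ids argument, so the command can also be assembled programmatically. The sketch below is illustrative: build_launch_command is a hypothetical helper, not part of DecVAE, and it uses only the accelerate launch, --gpu_ids, and --config_file arguments shown above.

```python
import shlex


def build_launch_command(script, config_file, gpu_ids=None):
    """Build an 'accelerate launch' command as a list of arguments.

    build_launch_command is a hypothetical helper, not part of DecVAE.
    gpu_ids is an optional list of GPU ids, e.g. [0] or [0, 1];
    when omitted, --gpu_ids is left out of the command.
    """
    cmd = ["accelerate", "launch"]
    if gpu_ids:
        cmd += ["--gpu_ids", ",".join(str(i) for i in gpu_ids)]
    cmd += [script, "--config_file", config_file]
    return cmd


cmd = build_launch_command(
    "scripts/fine_tuning/ssl_fine_tune_pretrained_models.py",
    "config_files/DecVAEs/iemocap/fine_tuning/config_finetune_iemocap_NoC4.json",
    gpu_ids=[0, 1],
)
# Print the command for inspection only; nothing is launched here.
print(shlex.join(cmd))
```

To run the command from Python instead of a shell escape, the list can be passed to subprocess.run(cmd, check=True).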

View configuration:

In [None]:
import json

with open("config_files/DecVAEs/iemocap/fine_tuning/config_finetune_iemocap_NoC4.json", 'r') as f:
    config = json.load(f)

print(json.dumps(config, indent=2))
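
Nested configuration files can be hard to scan when printed in full. As an optional aid, the sketch below flattens a nested dictionary into dotted keys. flatten_config is a hypothetical helper, and the sample keys are made up for illustration; they are not taken from the DecVAE config.

```python
def flatten_config(config, prefix=""):
    """Flatten nested dicts into {'a.b.c': value} pairs (hypothetical helper)."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the dotted prefix along
            flat.update(flatten_config(value, name))
        else:
            flat[name] = value
    return flat


# Made-up example values; replace `sample` with the `config` loaded above.
sample = {"optimizer": {"lr": 0.001, "name": "adam"}, "epochs": 10}
for name, value in sorted(flatten_config(sample).items()):
    print(f"{name} = {value}")
```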

## 4. Latent Evaluation

In [None]:
# Evaluate latent representations
!accelerate launch scripts/post-training/latents_post_analysis.py \
    --config_file config_files/DecVAEs/iemocap/latent_evaluations/config_latent_anal_iemocap.json

## 5. Latent Visualization

Frame-level:

In [None]:
# Visualize frame-level latent representations
!accelerate launch scripts/visualize/low_dim_vis_latents.py \
    --config_file config_files/DecVAEs/iemocap/latent_visualizations/config_latent_frames_visualization_iemocap.json

Sequence-level:

In [None]:
# Visualize sequence-level latent representations
!accelerate launch scripts/visualize/low_dim_vis_latents.py \
    --config_file config_files/DecVAEs/iemocap/latent_visualizations/config_latent_sequences_visualization_iemocap.json

## 6. Latent Traversals

Perform traversal analysis:

In [None]:
# Perform latent traversal analysis
!accelerate launch scripts/latent_response_analysis/latent_traversal_analysis.py \
    --config_file config_files/DecVAEs/iemocap/latent_traversals/config_latent_traversals_iemocap.json
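
Each step above reads a JSON config file, so a missing or misplaced file only surfaces once its script starts. A small pre-flight check, sketched below, lists any config paths that are absent. missing_configs is a hypothetical helper, not part of DecVAE; the paths are the ones used in this tutorial, and the check assumes the working directory is the DecVAE root.

```python
from pathlib import Path

# Config files referenced in this tutorial, relative to the DecVAE root.
CONFIG_FILES = [
    "config_files/input_visualizations/config_visualizing_input_frames_iemocap.json",
    "config_files/input_visualizations/config_visualizing_input_sequences_iemocap.json",
    "config_files/DecVAEs/iemocap/fine_tuning/config_finetune_iemocap_NoC4.json",
    "config_files/DecVAEs/iemocap/latent_evaluations/config_latent_anal_iemocap.json",
    "config_files/DecVAEs/iemocap/latent_visualizations/config_latent_frames_visualization_iemocap.json",
    "config_files/DecVAEs/iemocap/latent_visualizations/config_latent_sequences_visualization_iemocap.json",
    "config_files/DecVAEs/iemocap/latent_traversals/config_latent_traversals_iemocap.json",
]


def missing_configs(paths, root="."):
    """Return the paths that are not existing files under root (hypothetical helper)."""
    return [p for p in paths if not (Path(root) / p).is_file()]


missing = missing_configs(CONFIG_FILES)
if missing:
    print("Missing config files:")
    for path in missing:
        print(f"  {path}")
else:
    print("All config files found.")
```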