
Emotional Counterfactual Data Generator

A toolkit for generating and validating ambiguous emotional utterances with corresponding audio representations.

Overview

This project creates a dataset of emotionally ambiguous sentences, where the same text can be interpreted as different emotions depending on the speaker's tone; for example, "I can't believe you did that" can sound angry or delighted. The pipeline:

  1. Generates ambiguous utterances
  2. Filters candidates for quality
  3. Creates emotional responses to these utterances
  4. Generates audio files with different emotional tones
  5. Validates the generated audio using multiple emotion recognition models
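For convenience, the individual step commands documented below can be chained into a single run. A minimal sketch (this wrapper is not part of the repository; each step can equally be run by hand):

# run_pipeline.py -- hypothetical wrapper, not a repository script.
# Runs the pipeline steps in order, stopping on the first failure.
import subprocess

STEPS = [
    ["python", "generate_ambig.py"],
    ["python", "refilter_candidates.py"],
    ["python", "generate_responses.py"],
    ["bash", "filter_to_unique.sh"],
    ["python", "generate_audio.py"],
    ["python", "validate_responses.py"],
    ["python", "compute_stats.py"],
]

for cmd in STEPS:
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure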

Requirements

  • Python 3.8+
  • CUDA-capable GPU
  • Hugging Face API key
  • OpenAI API key (for GPT4o validation)
  • Gemini API key (for Gemini validation)
  • F5-TTS for audio generation

Installation

git clone <repository-url>
cd emotional-counterfactual-data
pip install -r requirements.txt

Environment Setup

Set required API keys:

export HF_ACCESS_KEY="your_huggingface_token"
export OPENAI_API_KEY="your_openai_key"
export GEMINI_API_KEY="your_gemini_key"
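A quick way to confirm the keys are visible to Python before running the pipeline (a small sketch, not a repository script):

import os

# The three key names match the exports above.
required = ["HF_ACCESS_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY"]
missing = [name for name in required if not os.environ.get(name)]
if missing:
    raise SystemExit("Missing API keys: " + ", ".join(missing))
print("All API keys are set.")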

Pipeline Steps

1. Generate Ambiguous Utterances

python generate_ambig.py

Creates utterances.jsonl with emotionally ambiguous sentences.
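The exact record schema is defined by generate_ambig.py; a minimal sketch for inspecting the first few records:

import itertools
import json

with open("utterances.jsonl") as f:
    for line in itertools.islice(f, 3):
        print(json.dumps(json.loads(line), indent=2))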

2. Filter Quality Candidates

python refilter_candidates.py

Produces filtered_utterances.jsonl with high-quality examples.

3. Generate Emotional Responses

python generate_responses.py

Creates responses.jsonl with appropriate responses for each emotion.

4. Extract Unique Sentences

bash filter_to_unique.sh

Generates unique_sentences.txt and unique_responses.jsonl.
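filter_to_unique.sh performs the actual extraction; for illustration only, the deduplication might look like the following sketch (the "sentence" field name is an assumption about the responses.jsonl schema):

import json

# Illustrative only: the shell script above is authoritative.
seen = set()
with open("responses.jsonl") as f, open("unique_sentences.txt", "w") as out:
    for line in f:
        sentence = json.loads(line).get("sentence")  # assumed field name
        if sentence and sentence not in seen:
            seen.add(sentence)
            out.write(sentence + "\n")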

5. Generate Audio Files

Requires audio reference files in a references directory:

  • references/man/[emotion].wav
  • references/woman/[emotion].wav

python generate_audio.py

Creates audio files in the generated_audio directory.
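Before generating audio, it can help to verify that every expected reference clip is present. A small sketch, assuming a hypothetical emotion inventory (substitute the labels your reference files actually use):

from pathlib import Path

EMOTIONS = ["angry", "happy", "sad", "neutral"]  # hypothetical; adjust to your set
for speaker in ("man", "woman"):
    for emotion in EMOTIONS:
        ref = Path("references") / speaker / f"{emotion}.wav"
        if not ref.exists():
            print("missing reference:", ref)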

6. Validate Audio Emotions

python validate_responses.py

Validates generated audio using emotion recognition models (DiVA, Qwen2, Gemini, GPT4o).

7. Compute Statistics

python compute_stats.py

Analyzes model performance and generates the final filtered dataset.
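As a rough illustration of the kind of analysis involved (not the script's actual logic), per-model accuracy against the intended emotion could be computed like this, assuming hypothetical field and file names:

import json
from collections import Counter

correct, total = Counter(), Counter()
with open("validated_responses.jsonl") as f:  # hypothetical filename
    for line in f:
        rec = json.loads(line)
        for model, pred in rec["predictions"].items():  # assumed schema
            total[model] += 1
            correct[model] += int(pred == rec["intended_emotion"])

for model in sorted(total):
    print(f"{model}: {correct[model] / total[model]:.1%}")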

Data Structure

The final dataset contains:

  • Original sentences with their intended emotions
  • Audio files with different emotional tones
  • Model predictions for each audio sample
  • Statistical analysis of model performance
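As an illustration only, a single record might combine these fields roughly as follows (field names are hypothetical; the pipeline scripts define the actual schema):

example_record = {
    "sentence": "...",               # original text
    "intended_emotion": "...",       # the emotion the audio was rendered with
    "audio": "generated_audio/...",  # path to one emotional rendering
    "predictions": {                 # one label per recognition model
        "DiVA": "...", "Qwen2": "...", "Gemini": "...", "GPT4o": "...",
    },
}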
