## Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

Speech audio-only files (16-bit, 48 kHz .wav) from the RAVDESS.

The full dataset of speech and song, audio and video (24.8 GB), is available from Zenodo (https://zenodo.org/record/1188976#.YfAv1upBxPY).

Construction and perceptual validation of the RAVDESS is described in our Open Access paper in PLoS ONE.

Check out our Kaggle Song emotion dataset (https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio).

### Files

This portion of the RAVDESS contains 1440 files: **60 trials per actor x 24 actors = 1440**. 

The RAVDESS contains 24 professional actors (12 female, 12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech emotions include **calm, happy, sad, angry, fearful, surprise, and disgust** expressions. Each of these expressions is produced at **two levels of emotional intensity (normal, strong)**, with an additional neutral expression that is recorded at normal intensity only.
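The figure of 60 trials per actor can be checked by enumerating the design: eight emotion codes, two intensities, two statements and two repetitions (the repetition count comes from the file naming convention), minus the strong-intensity neutral case, which does not exist. The following is a minimal sketch of that arithmetic; the constant names are illustrative and not part of the dataset.

```python
from itertools import product

# Codes follow the file naming convention: 01 = neutral ... 08 = surprised.
EMOTIONS = range(1, 9)
INTENSITIES = (1, 2)   # 01 = normal, 02 = strong
STATEMENTS = (1, 2)    # "Kids ..." and "Dogs ..."
REPETITIONS = (1, 2)
NEUTRAL = 1
ACTORS = 24

# Enumerate every (emotion, intensity, statement, repetition) combination,
# dropping strong-intensity neutral, which is not recorded.
trials = [
    combo
    for combo in product(EMOTIONS, INTENSITIES, STATEMENTS, REPETITIONS)
    if not (combo[0] == NEUTRAL and combo[1] == 2)
]

trials_per_actor = len(trials)
total_files = trials_per_actor * ACTORS

print(trials_per_actor, total_files)  # 60 1440
```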

### File naming convention

Each of the 1440 files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 03-01-06-01-02-01-12.wav). These identifiers define the stimulus characteristics:

- Modality (01 = full-AV, 02 = video-only, 03 = audio-only).
- Vocal channel (01 = speech, 02 = song).
- Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).
- Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.
- Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").
- Repetition (01 = 1st repetition, 02 = 2nd repetition).
- Actor (01 to 24. Odd numbered actors are male, even numbered actors are female).

Filename example: `03-01-06-01-02-01-12.wav`

- Audio-only (03)
- Speech (01)
- Fearful (06)
- Normal intensity (01)
- Statement "dogs" (02)
- 1st Repetition (01)
- 12th Actor (12)

The actor is female, as the actor ID number (12) is even.
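The decoding walked through above can be sketched in a few lines of Python. The helper name `decode_filename` and the dictionary names are hypothetical, introduced only for illustration; the label tables are copied from the naming convention.

```python
import os

# Label tables copied from the file naming convention.
MODALITY = {1: "full-AV", 2: "video-only", 3: "audio-only"}
VOCAL_CHANNEL = {1: "speech", 2: "song"}
EMOTION = {1: "neutral", 2: "calm", 3: "happy", 4: "sad",
           5: "angry", 6: "fearful", 7: "disgust", 8: "surprised"}
INTENSITY = {1: "normal", 2: "strong"}
STATEMENT = {1: "Kids are talking by the door",
             2: "Dogs are sitting by the door"}


def decode_filename(filename):
    """Decode a RAVDESS file name into human-readable labels (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    parts = [int(p) for p in stem.split("-")]
    if len(parts) != 7:
        raise ValueError(f"expected 7 identifier parts in {filename!r}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": repetition,
        "actor": actor,
        # Even actor IDs are female, odd actor IDs are male.
        "gender": "female" if actor % 2 == 0 else "male",
    }


info = decode_filename("03-01-06-01-02-01-12.wav")
print(info["emotion"], info["intensity"], info["gender"])  # fearful normal female
```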

### How to cite the RAVDESS

#### Academic citation

If you use the RAVDESS in an academic publication, please use the following citation: Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391.

#### All other attributions

If you use the RAVDESS in a form other than an academic publication, such as in a blog post, school project, or non-commercial product, please use the following attribution: "The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)" by Livingstone & Russo is licensed under CC BY-NC-SA 4.0.

In [1]:
import os

In [31]:
root_path = 'DataSource'

#### Output data set:

0. full file path
1. modality
2. vocal channel
3. emotion
4. emotional intensity
5. statement
6. repetition
7. gender (1 = female, 0 = male), derived from the actor ID

In [61]:
def create_source_dataset(root_path):
    """Return one row per RAVDESS file: [file path, modality, vocal channel,
    emotion, emotional intensity, statement, repetition, gender]."""
    ds = []

    for root_dir in os.listdir(root_path):
        dir_path = os.path.join(root_path, root_dir)
        # Skip stray files at the top level; audio is expected in subdirectories.
        if not os.path.isdir(dir_path):
            continue

        for file_src in os.listdir(dir_path):
            # Skip anything that is not a .wav file (e.g. hidden system files).
            if not file_src.lower().endswith('.wav'):
                continue

            # 0: full file path
            file_path = os.path.join(dir_path, file_src)

            # 1..6 plus actor ID: the seven two-digit fields of the name,
            # e.g. 03-01-06-01-02-01-12.wav
            (modality, vocal_channel, emotion, emotional_intensity,
             statement, repetition, actor_id) = (
                int(part) for part in os.path.splitext(file_src)[0].split('-'))

            # 7: gender (1 = female for even actor IDs, 0 = male for odd ones)
            gender = 1 if actor_id % 2 == 0 else 0

            ds.append([file_path, modality, vocal_channel, emotion,
                       emotional_intensity, statement, repetition, gender])

    assert len(ds) == 1440, f'expected 1440 files in the source, found {len(ds)}'

    return ds
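
Rows in the layout returned by `create_source_dataset` can be summarized with the standard library. The sketch below counts rows per emotion, with the emotion code at index 3; the `count_by_emotion` helper, the `EMOTION_NAMES` table and the sample rows (including their paths) are made up for illustration and are not real dataset output.

```python
from collections import Counter

# Emotion labels copied from the file naming convention.
EMOTION_NAMES = {1: "neutral", 2: "calm", 3: "happy", 4: "sad",
                 5: "angry", 6: "fearful", 7: "disgust", 8: "surprised"}


def count_by_emotion(rows):
    """Count rows per emotion label; the emotion code sits at index 3 of each row."""
    return Counter(EMOTION_NAMES[row[3]] for row in rows)


# Hypothetical rows: [path, modality, vocal channel, emotion,
#                     intensity, statement, repetition, gender]
sample_rows = [
    ["a/03-01-01-01-01-01-01.wav", 3, 1, 1, 1, 1, 1, 0],
    ["a/03-01-06-01-02-01-12.wav", 3, 1, 6, 1, 2, 1, 1],
    ["a/03-01-06-02-02-02-12.wav", 3, 1, 6, 2, 2, 2, 1],
]

counts = count_by_emotion(sample_rows)
print(counts.most_common())
```

On the full 1440-file set, the design described earlier implies 96 neutral rows (4 trials x 24 actors) and 192 rows for each of the other seven emotions (8 trials x 24 actors), which gives a quick sanity check on the parsed labels.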