# Analysing and loading the main datasets for Facial Action Unit detection

## List of Datasets:

  * DISFA
  * CK+
  * UNBC Shoulder Pain Dataset
  * AM-FED (Affectiva)
  * FERA 2015 and 2017 Challenge Datasets

Importing libraries:

In [3]:
import glob
import os
import sys
import pandas as pd

### **DISFA**

http://mohammadmahoor.com/wp-content/uploads/2017/06/DiSFA_Paper_andAppendix_Final_OneColumn1-1.pdf

* 16.5 GB
* Intensity labels for 12 FAUs
* 27 subjects (15 male, 12 female)
* About 130,000 frames in total; each video has 4,845 frames at 20 fps
* Page 5 of the paper shows the distribution of FAU occurrence; each FAU occurs in at least 5,000 frames

The dataset contains videos from the left and right cameras in AVI format, along with FAU labels.

**Video name formatting:**

*Videos_LeftCamera/{L,l}eftVideoSN001_{c,C}omp.avi* **or** *Videos_RightCamera/{R,r}ightVideoSN001_{c,C}omp.avi*

**FAU label formatting:**

*ActionUnit_Labels/SN001/SN001_au1.txt*

Inside this file, each row has the form *frame_no,intensity*, where the intensity is one of {0,1,2,3,4,5}.
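A label file in this layout can be read with the standard `csv` module. The following is a minimal sketch, assuming every row is exactly `frame_no,intensity` with no header line; `parse_disfa_labels` is a hypothetical helper name, and the sample rows are made up rather than taken from the dataset.

```python
import csv
import io


def parse_disfa_labels(text):
    """Parse 'frame_no,intensity' rows into a {frame_no: intensity} dict."""
    labels = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        frame_no, intensity = int(row[0]), int(row[1])
        # Intensities are expected to lie in 0..5.
        if not 0 <= intensity <= 5:
            raise ValueError(f"intensity {intensity} out of range at frame {frame_no}")
        labels[frame_no] = intensity
    return labels


# Made-up sample rows standing in for a file such as SN001_au1.txt.
sample = "1,0\n2,0\n3,2\n"
labels = parse_disfa_labels(sample)
print(labels[3])  # prints 2
```

For a real file, the text would come from reading a path such as `DISFA_AU_path + 'SN001/SN001_au1.txt'`, assuming the folder layout described above.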

In [4]:
DISFA_path='/media/amogh/Stuff/CMU/datasets/DISFA_data/'

In [5]:
% ls {DISFA_path}

ls: cannot access /media/amogh/Stuff/CMU/datasets/DISFA_data/: No such file or directory


In [20]:
DISFA_AU_path=DISFA_path+'ActionUnit_Labels/'
print(DISFA_AU_path)
Videos_right_path=DISFA_path+'Videos_RightCamera/'
print(Videos_right_path)

/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/


**Loading FAUs**

**Extracting frames**

In [21]:
os.getcwd()

'/home/amogh/cmu/notebooks'

In [22]:
f'{Videos_right_path}*.avi'

'/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/*.avi'

In [23]:
glob.glob(f'{Videos_right_path}*.avi')[0].split('/')[-1].split('_')[0]

'RightVideoSN013'
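The chained `split` calls above depend on the exact position of `/` and `_` in the path. As a hedged alternative, a regular expression can pull out the camera side and the subject ID, and it also covers the mixed-case variants noted in the name formatting. `VIDEO_NAME` and `parse_video_name` are hypothetical names introduced for this sketch.

```python
import os
import re

# Matches e.g. 'RightVideoSN013_comp.avi' or 'leftVideoSN001_Comp.avi'.
VIDEO_NAME = re.compile(r'^(left|right)video(SN\d+)_comp\.avi$', re.IGNORECASE)


def parse_video_name(path):
    """Return (camera, subject) for a DISFA video path, or None if it does not match."""
    match = VIDEO_NAME.match(os.path.basename(path))
    if match is None:
        return None
    camera, subject = match.groups()
    return camera.lower(), subject.upper()


print(parse_video_name('Videos_RightCamera/RightVideoSN013_comp.avi'))
print(parse_video_name('Videos_RightCamera/RightVideoSN001'))
```

Paths that are not videos, such as the per-subject folders, return `None` and can be filtered out.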

Create one folder per video. The cell is kept as raw (non-executing) code so that re-running the notebook does not overwrite existing folders.
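The folder-creation code itself is not shown here. One possible way to make this step safe to re-run is sketched below: it creates a folder next to the video, named after the part of the file name before the first `_`, and skips folders that already exist. `make_frame_dir` is a hypothetical helper, and the demo runs in a temporary directory rather than on the dataset.

```python
import os
import tempfile


def make_frame_dir(video_path):
    """Create a sibling folder named after the video; return (folder, created)."""
    stem = os.path.basename(video_path).split('_')[0]
    frame_dir = os.path.join(os.path.dirname(video_path), stem)
    if os.path.isdir(frame_dir):
        # Leave an existing folder and its contents untouched.
        return frame_dir, False
    os.makedirs(frame_dir)
    return frame_dir, True


# Demo in a throwaway directory; the video file does not need to exist.
with tempfile.TemporaryDirectory() as root:
    video = os.path.join(root, 'RightVideoSN001_comp.avi')
    first = make_frame_dir(video)
    second = make_frame_dir(video)

print(first[1], second[1])  # prints True False
```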

View files in Videos_right_path:

In [24]:
glob.glob(f'{Videos_right_path}*')

['/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN013_comp.avi',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN001',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN001_comp.avi',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN002',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN002_comp.avi',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN003',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN003_comp.avi',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN004',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN004_comp.avi',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN005',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN005_comp.avi',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data

In [25]:
for path in glob.glob(f'{Videos_right_path}*.avi'):
    print(path)


/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN013_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN001_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN002_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN003_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN004_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN005_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN006_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN007_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN008_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN009_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN010_comp.avi
/media/amogh/Stuff/CMU/datasets/DISFA_data/

### CK+

### UNBC Shoulder Pain Dataset

### AM-FED (Affectiva)

## **FERA 2015 and 2017 Challenges**

### **BP4D**

In [12]:
BP4D_base_path='/home/amogh/cmu/dataset/BP4D/'


**Loading FAUs:**
<br>
The AUCoding folder has one CSV file per sequence, e.g. F001_T1.csv.
<br>
Each CSV file has one row per frame, and columns 1-27 represent the FAUs.
<br>
Occurrence codes: 0 for absent, 1 for present, or 9 for missing data (unknown).
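Because 9 means "unknown" rather than a third class, it is usually worth separating it from the 0/1 labels before training or counting occurrences. A minimal sketch, assuming the three codes listed above are the only ones that appear; `decode_occurrence` is a hypothetical helper and the example row is made up.

```python
def decode_occurrence(code):
    """Map an occurrence code to True/False, or None when the label is missing."""
    mapping = {0: False, 1: True, 9: None}
    if code not in mapping:
        raise ValueError(f"unexpected occurrence code: {code}")
    return mapping[code]


# Made-up codes for a few AUs of a single frame.
row = [0, 0, 9, 0, 1]
decoded = [decode_occurrence(code) for code in row]
print(decoded)  # prints [False, False, None, False, True]
```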



In [13]:
BP4D_AU_path=BP4D_base_path+'AUCoding/AUCoding/'

In [14]:
example_subject='F001'
example_sequence='T1'
example_file=f'{example_subject}_{example_sequence}.csv'
example_file

'F001_T1.csv'

In [15]:
df_example=pd.read_csv(BP4D_AU_path+example_file)
df_example.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,90,91,92,93,94,95,96,97,98,99
0,2440,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
1,2441,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
2,2442,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
3,2443,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
4,2444,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9


In [16]:
frame_numbers=df_example['0']

The value of the *n*th AU is in the *n*th column of that frame's row.

In [17]:
frame_no=2441
fau_no=3
df2=(df_example.loc[df_example['0']==frame_no]).iloc[:,fau_no]
df2
# df_example.loc[(frame_no),str(fau_no)]

1    9
Name: 3, dtype: int64

In [18]:
def label_getter(subj_req,seq_req,frame_req):
    # Build the path of the AU coding file for this subject and sequence.
    au_file_reqd=BP4D_AU_path+f'{subj_req}_{seq_req}.csv'
    print(au_file_reqd)
    df_reqd=pd.read_csv(au_file_reqd)
    # Look up the requested frame in the file that was just loaded.
    list_of_faus=list(df_reqd.loc[df_reqd['0']==frame_req].iloc[0]) # list: [frame_no, fau1, fau2, fau3, ...]
    # Choose the FAUs for which you want the labels.
    list_mask=[1,2,4,6,7,10,12,14,15,17,23]
    list_final=[list_of_faus[i] for i in list_mask]
    return list_final

In [19]:
label_getter('F001','T1',2440)

/home/amogh/cmu/dataset/BP4D/AUCoding/AUCoding/F001_T1.csv


[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]

**Loading images:**
<br>
The BP4D training set stores images as *subject/sequence/frame_no.jpg*.
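Given that layout, the image path for a labelled frame can be assembled from the same subject, sequence and frame number used for the AU lookup. This is a sketch only: `frame_image_path` is a hypothetical helper, and whether the file names are zero-padded is an assumption to check against the actual folders, which is why `pad` is left as a parameter.

```python
import os


def frame_image_path(base_dir, subject, sequence, frame_no, pad=0):
    """Build base_dir/subject/sequence/frame_no.jpg, optionally zero-padding the frame number."""
    # pad=0 leaves the number unchanged; the real padding width is not confirmed here.
    filename = str(frame_no).zfill(pad) + '.jpg'
    return os.path.join(base_dir, subject, sequence, filename)


print(frame_image_path('BP4D-training', 'F001', 'T1', 2440))
print(frame_image_path('BP4D-training', 'F001', 'T1', 7, pad=4))
```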

In [126]:
BP4D_training_folder=BP4D_base_path+'BP4D-training/'

In [136]:
!ls $BP4D_training_folder

F001


### **SEMAINE**

* 150 participants' recordings; total 959 conversations, ~5 minutes each
* FACS annotations in 181 frames
