## List of Datasets:

  * DISFA
  * CK+
  * UNBC Shoulder Pain Dataset
  * AM-FED (Affectiva)
  * FERA 2015 and 2017 Challenge Datasets

#### This notebook demonstrates how the FAU coding and the images can be loaded for facial analysis

**conda env-** faus_dl *with python 3 kernel*

Importing libraries:

In [3]:
import glob
import os
import sys
import pandas as pd
import numpy as np

In [4]:
from tqdm import tqdm #for visualising progress bar

### **DISFA**

http://mohammadmahoor.com/wp-content/uploads/2017/06/DiSFA_Paper_andAppendix_Final_OneColumn1-1.pdf

* 16.5 GB
* 12 FAUs intensity
* 27 subjects (15 male, 12 female)
* 130,000 frames; each video has 4845 frames @ 20 fps
* Page 5 of the paper shows the distribution of FAU occurrences; each FAU occurs in at least 5000 frames

#### Contains videos from the left and the right cameras in avi format, and also FAU labels.
**Videos Name formatting:**

*Videos_LeftCamera/{L,l}eftVideoSN001_{c,C}omp.avi* **or** *Videos_RightCamera/{R,r}ightVideoSN001_{c,C}omp.avi*
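Because the capitalisation of the camera prefix and of the *comp* suffix varies between files, a case-insensitive pattern is one way to recover the subject ID from a video name. The sketch below is only an illustration: the helper name `parse_video_name` and the assumption of a three-digit subject number are not part of the dataset documentation.

```python
import re

# Hypothetical pattern: matches e.g. LeftVideoSN001_comp.avi and rightVideoSN001_Comp.avi.
VIDEO_NAME = re.compile(r"^(left|right)VideoSN(\d{3})_comp\.avi$", re.IGNORECASE)

def parse_video_name(filename):
    """Return (camera, subject) such as ('right', 'SN001'), or None if the name does not match."""
    match = VIDEO_NAME.match(filename)
    if match is None:
        return None
    camera, number = match.groups()
    return camera.lower(), f"SN{number}"

print(parse_video_name("RightVideoSN013_comp.avi"))
print(parse_video_name("leftVideoSN001_Comp.avi"))
```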

**FAU label formatting:**

*ActionUnit_Labels/SN001/SN001_au1.txt*

Inside this file, each row has the form *frame_no,intensity*, with intensity in {0,1,2,3,4,5}.
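A minimal sketch of reading such a label file with pandas, using a made-up in-memory sample instead of a real DISFA file. The column names `frameNo` and `value` match the ones used later in this notebook; the sample values are invented.

```python
import io
import pandas as pd

# Made-up sample in the frame_no,intensity layout described above.
sample_text = "1,0\n2,0\n3,2\n4,5\n"

# The files have no header row, so the column names are supplied explicitly.
labels = pd.read_csv(io.StringIO(sample_text), names=["frameNo", "value"])

# Intensities outside 0..5 would point to a parsing problem.
assert labels["value"].between(0, 5).all()
print(labels)
```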

In [5]:
DISFA_path='/media/amogh/Stuff/CMU/datasets/DISFA_data/'

In [6]:
% ls {DISFA_path}

[0m[01;34mActionUnit_Labels[0m/     [01;34mmodels[0m/                [01;34mVideos_LeftCamera[0m/
[01;31mActionUnit_Labels.zip[0m  ReadMe DISFA.pdf       [01;31mVideos_LeftCamera.zip[0m
[01;34mfeatures[0m/              [01;34mtemp[0m/                  [01;34mVideos_RightCamera[0m/
[01;31mLandmark_Points.rar[0m    [01;31mVideo_RightCamera.zip[0m


In [7]:
DISFA_AU_path=DISFA_path+'ActionUnit_Labels/'
print(DISFA_AU_path)
Videos_right_path=DISFA_path+'Videos_RightCamera/'
print(Videos_right_path)

/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/
/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/


**DISFA FAUs**

Let's save a new dataframe for each relevant FAU.

List the FAUs for which a csv file should be created (keeping in mind that DISFA does not annotate every relevant FAU):

In [8]:
relevant_fau=[1,2,4,5,12,25,26]

In [9]:
sample_path=DISFA_AU_path+'SN001/SN001_au4.txt'
sample_path

'/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN001/SN001_au4.txt'

In [10]:
all_dict={}
for fau in relevant_fau:
    all_dict[f"dict_{fau}"]={}
all_dict

{'dict_1': {},
 'dict_2': {},
 'dict_4': {},
 'dict_5': {},
 'dict_12': {},
 'dict_25': {},
 'dict_26': {}}

In [11]:
subject_files_list=glob.glob(DISFA_AU_path+"/*")
subject_files_list

['/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN009',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN001',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN002',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN003',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN004',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN005',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN006',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN007',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN008',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN010',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN011',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN012',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN013',
 '/media/amogh/Stuff/CMU/datasets/DISFA_data/ActionUnit_Labels/SN016',
 '/med

In [10]:
for fau in tqdm(relevant_fau):
    for subject_path in subject_files_list:
        subject=os.path.basename(subject_path)
        reqd_file=subject_path+f"/{subject}_au{fau}.txt"
        reqd_df=pd.read_csv(reqd_file,names=["frameNo","value"])
        all_dict[f"dict_{fau}"][f"{subject}"]=(reqd_df['frameNo'][reqd_df["value"]>1]).values
        all_dict[f"dict_{fau}"][f"{subject}_neg"]=(reqd_df['frameNo'][reqd_df["value"]<=1]).values

100%|██████████| 7/7 [00:00<00:00,  9.22it/s]


Saving all the data-holding dictionaries as csv files:

In [11]:
count_desc=pd.DataFrame()

In [12]:
os.makedirs('DISFA_FAUs',exist_ok=True) #to_csv fails if the destination folder is missing
for key in all_dict.keys():
    fau_no=key.split('_')[1]
    reqd_df=pd.DataFrame(dict([(k,pd.Series(v)) for k,v in (all_dict[key]).items()]))
    count_df=reqd_df.describe().iloc[0:1,0:]
    count_df.index=[f"count_FAU{fau_no}"]
    count_desc=pd.concat([count_desc,count_df])
    reqd_df.to_csv('DISFA_FAUs/'+f'FAU{fau_no}.csv',index=False)
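The `pd.Series` wrapping in the cell above matters because the positive and negative frame arrays have different lengths, and building a DataFrame directly from unequal-length arrays raises an error. A small sketch with toy arrays (the values are invented) shows the NaN padding and why the `count` row of `describe()` recovers the original lengths:

```python
import numpy as np
import pandas as pd

# Toy stand-in for one entry of all_dict: arrays of different lengths.
frames = {"SN001": np.array([3, 4]), "SN001_neg": np.array([1, 2, 5, 6])}

# Wrapping each array in a Series lets pandas align on the index and pad with NaN.
padded = pd.DataFrame({k: pd.Series(v) for k, v in frames.items()})
print(padded)

# count() ignores NaN, so it gives back the number of frames in each array.
print(padded.count())
```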

Save it as count_summary.csv

In [13]:
count_desc.to_csv('DISFA_FAUs/count_summary.csv')

In [14]:
count_desc

Unnamed: 0,SN001,SN001_neg,SN002,SN002_neg,SN003,SN003_neg,SN004,SN004_neg,SN005,SN005_neg,...,SN028,SN028_neg,SN029,SN029_neg,SN030,SN030_neg,SN031,SN031_neg,SN032,SN032_neg
count_FAU1,0.0,4845.0,152.0,4693.0,496.0,4349.0,1086.0,3759.0,26.0,4819.0,...,0.0,4845.0,1008.0,3837.0,38.0,4807.0,23.0,4822.0,870.0,3975.0
count_FAU2,0.0,4845.0,166.0,4679.0,18.0,4827.0,1167.0,3678.0,4.0,4841.0,...,23.0,4822.0,1949.0,2896.0,173.0,4672.0,30.0,4815.0,182.0,4663.0
count_FAU4,29.0,4816.0,10.0,4835.0,1752.0,3093.0,1546.0,3299.0,245.0,4600.0,...,145.0,4700.0,2246.0,2599.0,405.0,4440.0,183.0,4662.0,1073.0,3772.0
count_FAU5,0.0,4845.0,9.0,4836.0,21.0,4824.0,53.0,4792.0,0.0,4845.0,...,4.0,4841.0,20.0,4825.0,56.0,4789.0,0.0,4845.0,4.0,4841.0
count_FAU12,408.0,4437.0,273.0,4572.0,629.0,4216.0,765.0,4080.0,140.0,4705.0,...,359.0,4486.0,627.0,4218.0,783.0,4062.0,1155.0,3690.0,608.0,4237.0
count_FAU25,249.0,4596.0,1071.0,3774.0,1777.0,3068.0,838.0,4007.0,741.0,4104.0,...,1476.0,3369.0,383.0,4462.0,1552.0,3293.0,1809.0,3036.0,3067.0,1778.0
count_FAU26,186.0,4659.0,1680.0,3165.0,811.0,4034.0,71.0,4774.0,484.0,4361.0,...,1585.0,3260.0,31.0,4814.0,253.0,4592.0,1475.0,3370.0,2153.0,2692.0


The number of positives for each FAU:

In [15]:
count_desc.sum(1)-count_desc.filter(like="neg",axis=1).sum(1) #subtracting the sum of all columns containing "neg" from the sum of all columns(total count)

count_FAU1      6506.0
count_FAU2      5644.0
count_FAU4     19933.0
count_FAU5      1150.0
count_FAU12    16851.0
count_FAU25    36247.0
count_FAU26    11533.0
dtype: float64
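An equivalent way to get the positives is to sum only the columns whose name does not end in `_neg`. The sketch below uses a toy table whose four columns copy the SN001 and SN002 values for FAU1 and FAU4 from the summary above; the variable names are illustrative.

```python
import pandas as pd

# Toy count table with the same column naming as count_desc.
counts = pd.DataFrame(
    {"SN001": [0.0, 29.0], "SN001_neg": [4845.0, 4816.0],
     "SN002": [152.0, 10.0], "SN002_neg": [4693.0, 4835.0]},
    index=["count_FAU1", "count_FAU4"],
)

# Select the positive columns directly instead of subtracting the negatives.
is_neg = counts.columns.str.endswith("_neg")
positives = counts.loc[:, ~is_neg].sum(axis=1)

# Same result as the subtraction used in the notebook cell.
by_subtraction = counts.sum(axis=1) - counts.filter(like="neg", axis=1).sum(axis=1)
assert positives.equals(by_subtraction)
print(positives)
```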

This is what the data finally looks like:

In [16]:
sample_df=pd.read_csv("DISFA_FAUs/FAU1.csv")
sample_df.head()

Unnamed: 0,SN001,SN001_neg,SN002,SN002_neg,SN003,SN003_neg,SN004,SN004_neg,SN005,SN005_neg,...,SN028,SN028_neg,SN029,SN029_neg,SN030,SN030_neg,SN031,SN031_neg,SN032,SN032_neg
0,,1,414.0,1.0,1629.0,1.0,937.0,1.0,1011.0,1.0,...,,1,128.0,1.0,384.0,1.0,2684.0,1.0,553.0,1.0
1,,2,415.0,2.0,1630.0,2.0,938.0,2.0,1012.0,2.0,...,,2,129.0,2.0,385.0,2.0,2685.0,2.0,554.0,2.0
2,,3,416.0,3.0,1631.0,3.0,939.0,3.0,1013.0,3.0,...,,3,130.0,3.0,386.0,3.0,2686.0,3.0,555.0,3.0
3,,4,417.0,4.0,1632.0,4.0,940.0,4.0,1014.0,4.0,...,,4,131.0,4.0,387.0,4.0,2687.0,4.0,556.0,4.0
4,,5,418.0,5.0,1633.0,5.0,941.0,5.0,1015.0,5.0,...,,5,132.0,5.0,388.0,5.0,2688.0,5.0,557.0,5.0


**Extracting frames**

In [21]:
os.getcwd()

'/home/amogh/cmu/notebooks'

In [22]:
f'{Videos_right_path}*.avi'

'/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/*.avi'

In [23]:
glob.glob(f'{Videos_right_path}*.avi')[0].split('/')[-1].split('_')[0]

'RightVideoSN013'

Create the output folders (kept as a raw cell so that existing folders are not overwritten).

View files in Videos_right_path:

In [None]:
glob.glob(f'{Videos_right_path}*')

Use the following command to convert all frames to bitmaps:<br>
ffmpeg -i "/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN001_comp.avi" "/media/amogh/Stuff/CMU/datasets/DISFA_data/Videos_RightCamera/RightVideoSN001/out-%03d.bmp"<br>
The following code was used to convert the videos into JPEG frames:

In [None]:
for path in tqdm(glob.glob(f'{Videos_right_path}*.avi')):
    print(path)
    folder_name=('RightVideo'+os.path.basename(path).split('_')[0][-5:])
    print(folder_name)
    os.makedirs(Videos_right_path+folder_name,exist_ok=True) #ffmpeg does not create the output folder
    #use the globbed path itself so that rightVideo/_Comp name variants are also found
    !ffmpeg -i "{path}" -q:v 1 "{Videos_right_path}{folder_name}/%01d.jpeg"
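Outside a notebook, the same conversion can be driven with `subprocess` instead of the `!` shell escape. This is only a sketch: the helper names `build_ffmpeg_command` and `extract_frames` are hypothetical, and the example only prints the command rather than running ffmpeg.

```python
import os
import subprocess

def build_ffmpeg_command(video_path, out_dir, quality=1):
    """Build the argument list for dumping every frame of video_path as JPEG."""
    return [
        "ffmpeg",
        "-i", video_path,
        "-q:v", str(quality),  # same quality flag as the notebook command
        os.path.join(out_dir, "%01d.jpeg"),
    ]

def extract_frames(video_path, out_dir):
    # ffmpeg does not create missing output folders, so create it first.
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, out_dir), check=True)

print(build_ffmpeg_command("RightVideoSN001_comp.avi", "RightVideoSN001"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.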

#### End to End functions to save csv files

In [26]:
def saveCSV(threshold=1,relevant_fau=(1,2,4,5,12,25,26)):
    #positives: intensity > threshold; negatives: intensity == 0
    all_dict={}
    for fau in relevant_fau:
        all_dict[f"dict_{fau}"]={}
        for subject_path in subject_files_list:
            subject=os.path.basename(subject_path)
            reqd_file=subject_path+f"/{subject}_au{fau}.txt"
            reqd_df=pd.read_csv(reqd_file,names=["frameNo","value"])
            all_dict[f"dict_{fau}"][f"{subject}"]=(reqd_df['frameNo'][reqd_df["value"]>threshold]).values
            all_dict[f"dict_{fau}"][f"{subject}_neg"]=(reqd_df['frameNo'][reqd_df["value"]==0]).values
    #one output folder per threshold
    dest_path=f'DISFA_FAUs/{threshold}/'
    os.makedirs(dest_path,exist_ok=True)
    count_desc=pd.DataFrame()
    for key in all_dict.keys():
        fau_no=key.split('_')[1]
        #wrapping in pd.Series pads columns of unequal length with NaN
        reqd_df=pd.DataFrame(dict([(k,pd.Series(v)) for k,v in (all_dict[key]).items()]))
        count_df=reqd_df.describe().iloc[0:1,0:]
        count_df.index=[f"count_FAU{fau_no}"]
        count_desc=pd.concat([count_desc,count_df])
        reqd_df.to_csv(dest_path+f'FAU{fau_no}.csv',index=False)
    count_desc.to_csv(dest_path+'count_summary.csv')
    print (f"no of positives for each FAU for threshold : {threshold} are: ")
    print(count_desc.sum(1)-count_desc.filter(like="neg",axis=1).sum(1))
    return count_desc


In [29]:
for thresh in [1,2,3,4]:
    saveCSV(thresh)

no of positives for each FAU for threshold : 1 are: 
count_FAU1      6506.0
count_FAU2      5644.0
count_FAU4     19933.0
count_FAU5      1150.0
count_FAU12    16851.0
count_FAU25    36247.0
count_FAU26    11533.0
dtype: float64
no of positives for each FAU for threshold : 2 are: 
count_FAU1      4757.0
count_FAU2      4710.0
count_FAU4     12297.0
count_FAU5       431.0
count_FAU12     9982.0
count_FAU25    22312.0
count_FAU26     4060.0
dtype: float64
no of positives for each FAU for threshold : 3 are: 
count_FAU1     1948.0
count_FAU2     1205.0
count_FAU4     5711.0
count_FAU5      138.0
count_FAU12    2749.0
count_FAU25    6619.0
count_FAU26     531.0
dtype: float64
no of positives for each FAU for threshold : 4 are: 
count_FAU1      555.0
count_FAU2      369.0
count_FAU4     1383.0
count_FAU5       34.0
count_FAU12     172.0
count_FAU25    1039.0
count_FAU26     217.0
dtype: float64
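The shrinking counts follow directly from the split rule in `saveCSV`: a frame is positive when its intensity is strictly greater than the threshold, and negative only when its intensity is 0, so frames with an intensity between 1 and the threshold land in neither set. A toy sketch of that rule (the six intensities are made up and `split_frames` is a hypothetical helper):

```python
import pandas as pd

# Made-up intensities for six frames, one per intensity level.
labels = pd.DataFrame({"frameNo": [1, 2, 3, 4, 5, 6],
                       "value":   [0, 1, 2, 3, 4, 5]})

def split_frames(df, threshold):
    """Return (positive frames, negative frames) under the saveCSV rule."""
    positives = df.loc[df["value"] > threshold, "frameNo"].to_numpy()
    negatives = df.loc[df["value"] == 0, "frameNo"].to_numpy()
    return positives, negatives

for threshold in [1, 2, 3, 4]:
    pos, neg = split_frames(labels, threshold)
    print(threshold, len(pos), len(neg))
```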


In [21]:
print(count_desc_thresh1.sum(1)-count_desc_thresh1.filter(like="neg",axis=1).sum(1))

count_FAU1      6506.0
count_FAU2      5644.0
count_FAU4     19933.0
count_FAU5      1150.0
count_FAU12    16851.0
count_FAU25    36247.0
count_FAU26    11533.0
dtype: float64


## **FERA 2015 and 2017 Challenges**

### **BP4D**

In [18]:
BP4D_base_path='/home/amogh/cmu/dataset/BP4D/'

**Loading FAUs:**
<br>
AUCoding has a csv file for each sequence, e.g. F001_T1.csv
<br>
Each csv file has one row per frame; column 0 is the frame number, and columns 1-27 represent FAUs.
<br>
Occurrence codes: 0 for absent, 1 for present, or 9 for missing data (unknown).
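Since 9 is a marker for missing data rather than a value, one option is to turn it into NaN before computing statistics, so that unlabelled frames are skipped instead of being counted. A minimal sketch on a toy frame (the values are invented; column '0' stands for the frame number, as in the BP4D files):

```python
import numpy as np
import pandas as pd

# Toy frame: column '0' is the frame number, '1' and '3' are AU columns.
df = pd.DataFrame({"0": [2440, 2441, 2442], "1": [0, 1, 1], "3": [9, 9, 9]})

au_columns = [c for c in df.columns if c != "0"]
cleaned = df.copy()
for col in au_columns:
    # Replace the missing-data code with NaN; the frame-number column is left alone.
    cleaned[col] = cleaned[col].replace(9, np.nan)

# mean() skips NaN, so this is the occurrence rate over labelled frames only.
print(cleaned[au_columns].mean())
```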



In [19]:
BP4D_AU_path=BP4D_base_path+'AUCoding/AUCoding/'

#### **BP4D example functions**

In [20]:
example_subject='F001'
example_sequence='T1'
example_file=f'{example_subject}_{example_sequence}.csv'
example_file

'F001_T1.csv'

In [21]:
df_example=pd.read_csv(BP4D_AU_path+example_file)
df_example.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,90,91,92,93,94,95,96,97,98,99
0,2440,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
1,2441,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
2,2442,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
3,2443,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9
4,2444,0,0,9,0,0,0,0,9,0,...,9,9,9,9,9,9,9,9,9,9


In [None]:
frame_numbers=df_example['0']
frame_numbers

**Seeing the occurrence (0(not present)/1(present)/9(unlabelled)) of an FAU number for a frame**;  *nth* AU value is the *nth* column for that frame

In [23]:
frame_no=2441
fau_no=3
df2=(df_example.loc[df_example['0']==frame_no]).iloc[:,fau_no]
df2
# df_example.loc[(frame_no),str(fau_no)]

1    9
Name: 3, dtype: int64

Counting **how often each FAU occurs (1), does not occur (0), or isn't labelled (9)**: df_example.iloc[:,0:28] selects the frame-number column plus the columns for the 27 FAUs. Applying the value_counts function to each column gives the frequency of every value that occurs in it. The iloc at the end keeps only the rows for '0', '1' and '9'.

In [24]:
df_example.iloc[:,0:28].apply(pd.Series.value_counts).iloc[:3,:]

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,18,19,20,21,22,23,24,25,26,27
0,,233.0,415.0,,527.0,432.0,263.0,553.0,,553.0,...,553.0,553.0,537.0,,553.0,553.0,553.0,,,496.0
1,,320.0,138.0,,26.0,121.0,290.0,,,,...,,,16.0,,,,,,,57.0
9,,,,553.0,,,,,553.0,,...,,,,553.0,,,,553.0,553.0,
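The same per-column counting can be tried on a toy frame. In this sketch the frame-number column is left out, the row order is fixed with `reindex`, and the NaN entries (values that never occur in a column) are replaced by 0; the data is made up.

```python
import pandas as pd

# Toy AU columns: '1' is mixed, '2' is never present, '3' is entirely unlabelled.
df = pd.DataFrame({"1": [0, 1, 1, 0], "2": [0, 0, 0, 0], "3": [9, 9, 9, 9]})

counts = (
    df.apply(pd.Series.value_counts)  # one value_counts per column
      .reindex([0, 1, 9])             # fixed row order: absent, present, missing
      .fillna(0)
      .astype(int)
)
print(counts)
```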


In [25]:
def label_getter(subj_req,seq_req,frame_req):
    au_file_reqd=BP4D_AU_path+f'{subj_req}_{seq_req}.csv'
    print (au_file_reqd)
    df_reqd=pd.read_csv(au_file_reqd)
    list_of_faus=list(df_reqd.loc[df_reqd['0']==frame_req].iloc[0]) #list: [frame_no, fau1, fau2, fau3....]; mask built from df_reqd itself, not the global df_example
    #choose FAUS for which you want the labels.
    list_mask=[1,2,4,6,7,10,12,14,15,17,23]
    list_final=[list_of_faus[i] for i in list_mask]
    return list_final

#running on an example.
label_getter('F001','T1',2440)

/home/amogh/cmu/dataset/BP4D/AUCoding/AUCoding/F001_T1.csv


[0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]

**Analysing occurrence of FAUs**

In [None]:
list_of_AU_files=glob.glob(f'{BP4D_AU_path}/*')
list_of_AU_files

Let's try to create csv files for each relevant FAU.<br>
Rows - frameNo<br>
Columns - subject_sequence

FAUs for which you want a csv file to be created

In [27]:
relevant_fau=[1,2,4,5,7,12,25,26,43]

Initialising FAU dictionaries for the above FAUs

In [28]:
all_dict={}
for fau in relevant_fau:
    all_dict[f"dict_{fau}"]={}
all_dict

{'dict_1': {},
 'dict_2': {},
 'dict_4': {},
 'dict_5': {},
 'dict_7': {},
 'dict_12': {},
 'dict_25': {},
 'dict_26': {},
 'dict_43': {}}

For each csv file:

In [29]:
for csv_file in tqdm(list_of_AU_files):
    csv_df=pd.read_csv(csv_file)
    csv_basename=os.path.basename(csv_file)
    user_seq=os.path.splitext(csv_basename)[0]
    #1st column is for frame_no, returns the occurrence dataframe for relevant FAUs
    csv_relevant_df=csv_df.iloc[:,np.insert(relevant_fau,0,0)] 
    #adding to dictionary for each FAU:
    for fau in relevant_fau:
        occurrence_frames=(csv_relevant_df['0'][csv_relevant_df[str(fau)]==1]).values #array of all frames with AU occurrence
        not_occurrence_frames=(csv_relevant_df['0'][csv_relevant_df[str(fau)]==0]).values
        all_dict[f'dict_{fau}'][f'{user_seq}']=occurrence_frames
        all_dict[f'dict_{fau}'][f'{user_seq}_neg']=not_occurrence_frames

100%|██████████| 328/328 [00:09<00:00, 33.36it/s]


Saving all the data-holding dictionaries as csv files:

In [30]:
count_desc=pd.DataFrame()

In [31]:
for key in tqdm(all_dict.keys()):
    fau_no=key.split('_')[1]
    reqd_df=pd.DataFrame(dict([(k,pd.Series(v)) for k,v in (all_dict[key]).items()])) #so that NaN appears in columns of unequal lengths
    count_df=reqd_df.describe().iloc[0:1,0:]
    count_df.index=[f"count_FAU{fau_no}"]
    count_desc=pd.concat([count_desc,count_df])
    reqd_df.to_csv('BP4D_FAUs/'+f'FAU{fau_no}.csv')

100%|██████████| 9/9 [00:11<00:00,  1.23s/it]


This is what the data finally looks like:

In [32]:
sample_df=pd.read_csv("BP4D_FAUs/FAU1.csv")
sample_df.head()

Unnamed: 0.1,Unnamed: 0,F001_T1,F001_T1_neg,F001_T2,F001_T2_neg,F001_T3,F001_T3_neg,F001_T4,F001_T4_neg,F001_T5,...,M018_T4,M018_T4_neg,M018_T5,M018_T5_neg,M018_T6,M018_T6_neg,M018_T7,M018_T7_neg,M018_T8,M018_T8_neg
0,0,2451.0,2440.0,836.0,721.0,,1.0,,664.0,,...,,1075.0,,237.0,,649.0,610.0,567.0,169.0,1.0
1,1,2452.0,2441.0,837.0,722.0,,2.0,,665.0,,...,,1076.0,,238.0,,650.0,611.0,568.0,170.0,2.0
2,2,2453.0,2442.0,838.0,723.0,,3.0,,666.0,,...,,1077.0,,239.0,,651.0,612.0,569.0,171.0,3.0
3,3,2454.0,2443.0,839.0,724.0,,4.0,,667.0,,...,,1078.0,,240.0,,652.0,613.0,570.0,172.0,4.0
4,4,2455.0,2444.0,840.0,725.0,,5.0,,668.0,,...,,1079.0,,241.0,,653.0,614.0,571.0,173.0,5.0


Here is the count summary for all the FAUs:

In [33]:
count_desc

Unnamed: 0,F001_T1,F001_T1_neg,F001_T2,F001_T2_neg,F001_T3,F001_T3_neg,F001_T4,F001_T4_neg,F001_T5,F001_T5_neg,...,M018_T4,M018_T4_neg,M018_T5,M018_T5_neg,M018_T6,M018_T6_neg,M018_T7,M018_T7_neg,M018_T8,M018_T8_neg
count_FAU1,320.0,233.0,134.0,440.0,0.0,273.0,0.0,509.0,0.0,518.0,...,0.0,500.0,0.0,499.0,0.0,489.0,26.0,459.0,13.0,188.0
count_FAU2,138.0,415.0,32.0,542.0,0.0,273.0,0.0,509.0,0.0,518.0,...,0.0,500.0,0.0,499.0,0.0,489.0,27.0,458.0,0.0,201.0
count_FAU4,26.0,527.0,498.0,76.0,20.0,253.0,0.0,509.0,74.0,444.0,...,0.0,500.0,13.0,486.0,0.0,489.0,0.0,485.0,150.0,51.0
count_FAU5,121.0,432.0,68.0,506.0,54.0,219.0,28.0,481.0,0.0,518.0,...,0.0,500.0,0.0,499.0,0.0,489.0,23.0,462.0,0.0,201.0
count_FAU7,0.0,553.0,0.0,574.0,4.0,269.0,482.0,27.0,351.0,167.0,...,146.0,354.0,439.0,60.0,0.0,489.0,62.0,423.0,160.0,41.0
count_FAU12,553.0,0.0,0.0,574.0,0.0,273.0,509.0,0.0,452.0,66.0,...,357.0,143.0,499.0,0.0,0.0,489.0,133.0,352.0,0.0,201.0
count_FAU25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
count_FAU26,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
count_FAU43,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Save it as count_summary.csv

In [34]:
count_desc.to_csv('BP4D_FAUs/count_summary.csv')

The number of positives for each FAU:

In [35]:
count_desc.sum(1)-count_desc.filter(like="neg",axis=1).sum(1) #subtracting the sum of all columns containing "neg" from the sum of all columns(total count)

count_FAU1     31043.0
count_FAU2     25110.0
count_FAU4     29755.0
count_FAU5      5693.0
count_FAU7     80617.0
count_FAU12    82531.0
count_FAU25        0.0
count_FAU26        0.0
count_FAU43        0.0
dtype: float64

Now since we have the csv files corresponding to each action unit, it is much easier to balance the data.
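As an illustration of such balancing, the sketch below downsamples the larger class of one sequence to the size of the smaller one. The helper `balanced_frames` and its `seed` parameter are hypothetical, and the toy column pair only mimics the NaN-padded layout of the saved FAU csv files.

```python
import numpy as np
import pandas as pd

# Toy column pair: positive and negative frame numbers, NaN-padded to equal length.
fau_df = pd.DataFrame({"F001_T1": [2451, 2452, np.nan, np.nan, np.nan],
                       "F001_T1_neg": [2440, 2441, 2442, 2443, 2444]})

def balanced_frames(df, name, seed=0):
    """Downsample the larger class so both classes have the same number of frames."""
    pos = df[name].dropna().astype(int)
    neg = df[f"{name}_neg"].dropna().astype(int)
    n = min(len(pos), len(neg))
    return (pos.sample(n, random_state=seed).to_numpy(),
            neg.sample(n, random_state=seed).to_numpy())

pos, neg = balanced_frames(fau_df, "F001_T1")
print(pos, neg)
```

Downsampling is only one possible choice; class weighting or oversampling the rarer class would serve the same purpose.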

**Loading images:**
<br>
BP4D training has folders as *subject/sequence/frame_no.jpg*
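Given that layout, the path of a frame image can be assembled from the subject, sequence and frame number. The helper `frame_path` below is hypothetical, and whether the frame numbers in the file names are zero-padded is an assumption to check on disk, which is why the padding width is left as a parameter.

```python
import os

def frame_path(base, subject, sequence, frame_no, pad=0, ext=".jpg"):
    """Build base/subject/sequence/frame_no.jpg; pad is the assumed zero-padding width."""
    filename = str(frame_no).zfill(pad) + ext
    return os.path.join(base, subject, sequence, filename)

print(frame_path("BP4D-training", "F001", "T1", 2440))
print(frame_path("BP4D-training", "F001", "T1", 7, pad=4))
```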

In [126]:
BP4D_training_folder=BP4D_base_path+'BP4D-training/'

In [136]:
!ls $BP4D_training_folder

F001


### **SEMAINE**

* 150 participants' recordings; total 959 conversations, ~5 minutes each
* FACS annotations for 181 frames
