# Moving on to temporal modelling

**The plan is to build two models: a first-order Markov model for the temporal relationships, and then the DBN that will model the co-occurrences.**

To build the Markov model, we need the following steps:
1. Define the variables
2. Discretize the variables (i.e. make them 0/1 variables)
3. Estimate the transition probabilities
4. Construct the transition matrix
5. Combine the transition matrices
6. Use the model for prediction

Steps 1 and 2 are already done, so we can move on to step 3.
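
As a rough illustration of step 6, the sketch below shows how a 2x2 transition matrix can be used for prediction once it exists. The matrix values and the helper name `predict_distribution` are hypothetical and are not taken from the data; the only assumption is that rows index the state at t-1 and columns the state at t.

```python
import numpy as np

def predict_distribution(P, current_state, steps=1):
    """Return the distribution over states `steps` time steps ahead."""
    start = np.zeros(P.shape[0])
    start[current_state] = 1.0
    # k-step prediction of a first-order Markov chain uses the k-th matrix power
    return start @ np.linalg.matrix_power(P, steps)

# hypothetical transition matrix: rows = state at t-1, columns = state at t
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

one_step = predict_distribution(P, current_state=1)           # equals row 1 of P
two_step = predict_distribution(P, current_state=1, steps=2)
most_likely_next = int(np.argmax(one_step))
```

For a one-step prediction this is just a row lookup; the matrix power only matters when predicting further ahead.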

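How to estimate **transition probabilities**:
For each source state, count the transitions from that state to each target state in the dataset and divide by the total number of transitions leaving that source state, so that each row of the transition matrix sums to 1.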
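
A minimal sketch of this counting rule on a toy binary sequence is given below. The function name `estimate_transition_matrix` and the toy data are made up for illustration, and this is not necessarily how the cells further down compute the counts.

```python
import numpy as np
import pandas as pd

def estimate_transition_matrix(series, n_states=2):
    """Count transitions between consecutive time steps and normalise each row."""
    values = pd.Series(series).astype(int).to_numpy()
    counts = np.zeros((n_states, n_states))
    for prev, curr in zip(values[:-1], values[1:]):
        counts[prev, curr] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    # rows for states that never occur as a source are left as zeros
    probs = np.divide(counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0)
    return counts, probs

# toy sequence, not real data
toy = [0, 0, 1, 1, 0, 1, 0, 0]
counts, probs = estimate_transition_matrix(toy)
```

Here `counts[i, j]` is the number of i to j transitions and `probs[i, j]` is the estimated probability of moving to state j given state i at the previous step.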



In [1]:
import os
import pandas as pd
import pprint 
import pickle
import numpy as np
pp = pprint.PrettyPrinter(indent=4)
from sklearn.model_selection import train_test_split
import math

In [2]:
# load data

with open("facetouch_dataframes.pickle", 'rb') as f:
    dataframes = pickle.load(f)

In [3]:
print(dataframes['/home/roni/coding/mastersProject/src/csvOut/p_100/recording_3'].columns)

Index(['Unnamed: 0', 'participant', 'frame', 'pose', 'hand_left', 'hand_right',
       'leftHandTouching', 'rightHandTouching', ' face_id', ' timestamp',
       ' confidence', ' success', ' AU01_r', ' AU02_r', ' AU04_r', ' AU05_r',
       ' AU06_r', ' AU07_r', ' AU09_r', ' AU10_r', ' AU12_r', ' AU14_r',
       ' AU15_r', ' AU17_r', ' AU20_r', ' AU23_r', ' AU25_r', ' AU26_r',
       ' AU45_r', ' AU01_c', ' AU02_c', ' AU04_c', ' AU05_c', ' AU06_c',
       ' AU07_c', ' AU09_c', ' AU10_c', ' AU12_c', ' AU14_c', ' AU15_c',
       ' AU17_c', ' AU20_c', ' AU23_c', ' AU25_c', ' AU26_c', ' AU28_c',
       ' AU45_c', 'GAD Score', 'PHQ Score', 'combinedRightHand',
       'combinedLeftHand'],
      dtype='object')


In [4]:
print(len(dataframes))
relevant_nodes = [' AU17_c', ' AU07_c',' AU14_c' , ' AU12_c',' AU20_c' , 'leftHandTouching', 'rightHandTouching' ]

50


In [5]:
# interlude: resample the dfs to a 1-second time step instead of frame-wise.

ft_dataframes = {}

for d in dataframes:
    df = dataframes[d]
    # convert 'timestamp' column to datetime type
    df[' timestamp'] = pd.to_datetime(df[' timestamp'], unit='s')

    # set 'timestamp' as the DataFrame index
    df = df.set_index(' timestamp')

    # group the DataFrame by 1-second time windows and calculate the mode for each relevant variable
    ft_dataframes[d] = df[relevant_nodes].groupby(pd.Grouper(freq='1S')).agg(lambda x: x.mode()[0])

In [6]:
ft_dataframes['/home/roni/coding/mastersProject/src/csvOut/p_100/recording_3'].info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 491 entries, 1970-01-01 00:00:00 to 1970-01-01 00:08:10
Freq: S
Data columns (total 7 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0    AU17_c            491 non-null    float64
 1    AU07_c            491 non-null    float64
 2    AU14_c            491 non-null    float64
 3    AU12_c            491 non-null    float64
 4    AU20_c            491 non-null    float64
 5   leftHandTouching   491 non-null    bool   
 6   rightHandTouching  491 non-null    bool   
dtypes: bool(2), float64(5)
memory usage: 24.0 KB


In [7]:
ft_dataframes['/home/roni/coding/mastersProject/src/csvOut/p_100/recording_3']['rightHandTouching'].sum()

0

In [8]:
dataframes['/home/roni/coding/mastersProject/src/csvOut/p_100/recording_3']['leftHandTouching'].sum()

3

In [9]:
# drop the recordings that contain no face touches at all

nonzeroes = {}

for key in ft_dataframes:
    df = ft_dataframes[key]
    if df['leftHandTouching'].sum() > 0 or df['rightHandTouching'].sum() > 0:
        nonzeroes[key] = df
print(len(ft_dataframes), len(nonzeroes))

50 13


In [10]:
# copy each recording's 'PHQ Score' and 'GAD Score' onto its 1-second dataframe
for df in nonzeroes:
    nonzeroes[df]['PHQ Score'] = dataframes[df]['PHQ Score'].iloc[0]
    nonzeroes[df]['GAD Score'] = dataframes[df]['GAD Score'].iloc[0]
    

In [11]:
# save the 1-second dataframes: all recordings first, then only those with at least one face touch
with open("onesec_dataframes_unsorted.pickle", 'wb') as f:
    pickle.dump(ft_dataframes, f)
    
with open("onesec_dataframes_sorted.pickle", 'wb') as f:
    pickle.dump(nonzeroes, f)    

From the results of the correlation calculations we want to model **AU17, AU7 and AU14 for the left hand** and **AU14, AU12 and AU20 for the right hand** as those are the largest correlations.

Out of interest, these correspond to the chin raiser, lid tightener and dimpler for the left hand, and the dimpler, lip corner puller and lip stretcher for the right hand.

We need to compute the transition probabilities, e.g. P(X_t = 0 | X_{t-1} = 1), for each of these variables. We also need to split the recordings into train/test/val sets.

In [12]:
video_keys = list(nonzeroes.keys())

# Split the keys into training, validation, and testing sets
train_keys, test_keys = train_test_split(video_keys, test_size=0.1, random_state=42)
train_keys, val_keys = train_test_split(train_keys, test_size=0.111, random_state=42)

# Create the training, validation, and testing sets as dictionaries
train_data = {key: nonzeroes[key] for key in train_keys}
val_data = {key: nonzeroes[key] for key in val_keys}
test_data = {key: nonzeroes[key] for key in test_keys}

print(len(train_data),len(val_data),len(test_data))

9 2 2


In [13]:
transition_cols = [' AU17_c', ' AU07_c', ' AU14_c', ' AU12_c', ' AU20_c', 'leftHandTouching', 'rightHandTouching']
# per recording and variable: [1 to 0, total 1, 0 to 1, total 0, 0 to 0, 1 to 1]
transition_probs = {df: {key: [0, 0, 0, 0, 0, 0] for key in transition_cols} for df in train_data}

In [14]:
for df in train_data:
    for i in transition_cols:
        series = train_data[df][i]
        # diff between consecutive time steps: +1 is a 0 to 1 transition, -1 is a 1 to 0 transition
        diffs = series.astype(int).diff()
        positive_transitions = (diffs == 1).sum()  # this is 0 to 1
        negative_transitions = (diffs == -1).sum()  # this is 1 to 0
        zero_to_zero = ((series == 0) & (diffs == 0)).sum()
        one_to_one = ((series == 1) & (diffs == 0)).sum()
        # state counts include the last time step, which has no outgoing transition
        zeroes = (series == 0).sum()
        ones = (series == 1).sum()

        transition_probs[df][i][0] += negative_transitions
        transition_probs[df][i][1] += ones
        transition_probs[df][i][2] += positive_transitions
        transition_probs[df][i][3] += zeroes
        transition_probs[df][i][4] += zero_to_zero
        transition_probs[df][i][5] += one_to_one

In [15]:
print(transition_probs)

{'/home/roni/coding/mastersProject/src/csvOut/p_33': {' AU17_c': [88, 212, 88, 425, 336, 124], ' AU07_c': [78, 208, 78, 429, 350, 130], ' AU14_c': [35, 61, 35, 576, 540, 26], ' AU12_c': [36, 109, 37, 528, 491, 72], ' AU20_c': [45, 77, 46, 560, 514, 31], 'leftHandTouching': [0, 2, 1, 635, 634, 1], 'rightHandTouching': [1, 1, 1, 636, 634, 0]}, '/home/roni/coding/mastersProject/src/csvOut/p_86/recording_1': {' AU17_c': [3, 35, 3, 45, 41, 32], ' AU07_c': [1, 1, 1, 79, 77, 0], ' AU14_c': [10, 23, 10, 57, 46, 13], ' AU12_c': [4, 30, 4, 50, 45, 26], ' AU20_c': [11, 40, 10, 40, 29, 29], 'leftHandTouching': [0, 1, 1, 79, 78, 0], 'rightHandTouching': [0, 1, 1, 79, 78, 0]}, '/home/roni/coding/mastersProject/src/csvOut/p_109': {' AU17_c': [57, 148, 57, 429, 371, 91], ' AU07_c': [83, 230, 83, 347, 263, 147], ' AU14_c': [66, 180, 67, 397, 330, 113], ' AU12_c': [30, 57, 30, 520, 489, 27], ' AU20_c': [50, 76, 50, 501, 450, 26], 'leftHandTouching': [49, 150, 49, 427, 377, 101], 'rightHandTouching': [0,

In [16]:
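# per recording and variable: [P(1 to 0), P(0 to 1), P(0 to 0), P(1 to 1)]
probabilities = {df: {key: [] for key in transition_cols} for df in train_data}
for df in transition_probs:
    for i in transition_probs[df]:
        t = transition_probs[df][i]
        # a state that never occurs gets probability 0 to avoid dividing by zero
        if t[1] == 0:
            a = 0
            d = 0
        else:
            a = t[0] / t[1]  # 1 to 0
            d = t[5] / t[1]  # 1 to 1

        if t[3] == 0:
            b = 0
            c = 0
        else:
            b = t[2] / t[3]  # 0 to 1
            c = t[4] / t[3]  # 0 to 0

        probabilities[df][i] = [a, b, c, d]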
    
pp.pprint(probabilities)

{   '/home/roni/coding/mastersProject/src/csvOut/p_105/recording_0': {   ' AU07_c': [   0.11858974358974358,
                                                                                        0.2753623188405797,
                                                                                        0.7246376811594203,
                                                                                        0.8782051282051282],
                                                                         ' AU12_c': [   0.3881578947368421,
                                                                                        0.19463087248322147,
                                                                                        0.802013422818792,
                                                                                        0.6118421052631579],
                                                                         ' AU14_c': [   0.17557251908396945,
                        

In [17]:
#for df in probabilities:
#    for i in probabilities[df]:
#        for j in probabilities[df][i]:
#            if( math.isnan(j)==True ):
#                j=0

In [18]:
# now we make the transition matrix

# Define the states
states = [0, 1]
matrices = {df: {key: np.zeros((len(states), len(states))) for key in transition_cols} for df in train_data}

for df in matrices:
    for m in matrices[df]:
        transition_matrix = matrices[df][m]

        # Set the transition probabilities (rows: state at t-1, columns: state at t)
        transition_matrix[0, 0] = probabilities[df][m][2] # 0 to 0
        transition_matrix[0, 1] = probabilities[df][m][1] # 0 to 1
        transition_matrix[1, 0] = probabilities[df][m][0] # 1 to 0
        transition_matrix[1, 1] = probabilities[df][m][3] # 1 to 1

        # Print the transition matrix
        print(transition_matrix)


[[0.79058824 0.20705882]
 [0.41509434 0.58490566]]
[[0.81585082 0.18181818]
 [0.375      0.625     ]]
[[0.9375     0.06076389]
 [0.57377049 0.42622951]]
[[0.92992424 0.07007576]
 [0.33027523 0.66055046]]
[[0.91785714 0.08214286]
 [0.58441558 0.4025974 ]]
[[0.9984252 0.0015748]
 [0.        0.5      ]]
[[0.99685535 0.00157233]
 [1.         0.        ]]
[[0.91111111 0.06666667]
 [0.08571429 0.91428571]]
[[0.97468354 0.01265823]
 [1.         0.        ]]
[[0.80701754 0.1754386 ]
 [0.43478261 0.56521739]]
[[0.9        0.08      ]
 [0.13333333 0.86666667]]
[[0.725 0.25 ]
 [0.275 0.725]]
[[0.98734177 0.01265823]
 [0.         0.        ]]
[[0.98734177 0.01265823]
 [0.         0.        ]]
[[0.86480186 0.13286713]
 [0.38513514 0.61486486]]
[[0.75792507 0.23919308]
 [0.36086957 0.63913043]]
[[0.83123426 0.16876574]
 [0.36666667 0.62777778]]
[[0.94038462 0.05769231]
 [0.52631579 0.47368421]]
[[0.89820359 0.0998004 ]
 [0.65789474 0.34210526]]
[[0.88290398 0.1147541 ]
 [0.32666667 0.67333333]]
[[0.

In [19]:
pp.pprint(matrices)

{   '/home/roni/coding/mastersProject/src/csvOut/p_105/recording_0': {   ' AU07_c': array([[0.72463768, 0.27536232],
       [0.11858974, 0.87820513]]),
                                                                         ' AU12_c': array([[0.80201342, 0.19463087],
       [0.38815789, 0.61184211]]),
                                                                         ' AU14_c': array([[0.75531915, 0.24468085],
       [0.17557252, 0.82061069]]),
                                                                         ' AU17_c': array([[0.75115207, 0.24884793],
       [0.23175966, 0.7639485 ]]),
                                                                         ' AU20_c': array([[0.89825581, 0.10174419],
       [0.32075472, 0.66981132]]),
                                                                         'leftHandTouching': array([[0.99105145, 0.00671141],
       [1.        , 0.        ]]),
                                                                         'right

In [20]:
# sanity check: each row of a transition matrix should sum to 1
for df in matrices:
    for row in matrices[df]['rightHandTouching']:
        print('rightHandTouching', row, row[0] + row[1], '\n')

rightHandTouching [0.99685535 0.00157233] 0.9984276729559749 

rightHandTouching [1. 0.] 1.0 

rightHandTouching [0.98734177 0.01265823] 1.0 

rightHandTouching [0. 0.] 0.0 

rightHandTouching [0.9982669 0.       ] 0.9982668977469671 

rightHandTouching [0. 0.] 0.0 

rightHandTouching [0.99295775 0.        ] 0.9929577464788732 

rightHandTouching [0. 0.] 0.0 

rightHandTouching [0.99724518 0.        ] 0.9972451790633609 

rightHandTouching [0. 0.] 0.0 

rightHandTouching [0.99751244 0.        ] 0.9975124378109452 

rightHandTouching [0. 0.] 0.0 

rightHandTouching [0.99777778 0.        ] 0.9977777777777778 

rightHandTouching [0. 0.] 0.0 

rightHandTouching [0.9964539  0.00177305] 0.99822695035461 

rightHandTouching [0.5 0.5] 1.0 

rightHandTouching [0.99879081 0.        ] 0.9987908101571947 

rightHandTouching [0. 0.] 0.0 
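
Several of the row sums above are slightly below 1, and some rows are all zeros. A likely cause of the small gaps is that the denominators in `transition_probs` are state counts that include the last time step, which has no outgoing transition. If rows that sum to exactly 1 are needed downstream, one option is to renormalise them, as in the sketch below. The function name `normalise_rows` and the choice to leave all-zero rows untouched are assumptions, not part of the original analysis.

```python
import numpy as np

def normalise_rows(matrix):
    """Scale each non-zero row so it sums to 1; all-zero rows are left as zeros."""
    matrix = np.asarray(matrix, dtype=float)
    row_sums = matrix.sum(axis=1, keepdims=True)
    return np.divide(matrix, row_sums, out=np.zeros_like(matrix), where=row_sums > 0)

# first row copied from the output above; second row is an all-zero row like those above
example = np.array([[0.99685535, 0.00157233],
                    [0.0, 0.0]])
normalised = normalise_rows(example)
```

An all-zero row means the state never occurred in that recording, so there is nothing to normalise; how to treat such rows (for example, pooling counts across recordings) is a separate modelling decision.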



In [21]:
# save the transition matrices
with open("markov_probs.pickle", 'wb') as f:
    pickle.dump(matrices, f)

In [22]:
# ideally we also need to keep the test and val sets as pickles

# Conditional Probabilities Calculations

To calculate the conditional probabilities, we assume that each action unit (AU) and face touch (FT) are statistically independent, and thus P(AU and FT) = P(AU)*P(FT).
We do this for every video and for every combination of AU and FT (5 AUs x 2 hands, so there should be 10).
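
Under this independence assumption, each entry of the combined matrix is simply the product of the matching entries of the AU and FT transition matrices. The sketch below illustrates that with made-up numbers; the arrays `au` and `ft` are hypothetical and not taken from any recording.

```python
import numpy as np

# hypothetical per-variable transition matrices (rows = state at t-1, columns = state at t)
au = np.array([[0.8, 0.2],
               [0.4, 0.6]])
ft = np.array([[0.99, 0.01],
               [0.50, 0.50]])

# under the independence assumption, each joint entry is the product of the matching entries
joint = au * ft

# the same thing written out entry by entry
explicit = np.array([[au[0, 0] * ft[0, 0], au[0, 1] * ft[0, 1]],
                     [au[1, 0] * ft[1, 0], au[1, 1] * ft[1, 1]]])
```

Read this way, `joint[0, 1]` is the probability that both the AU and the face touch switch from 0 to 1 in the same time step, assuming the two are independent.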

In [23]:
cps = [ 'LH AU17_c', 'LH AU07_c','LH AU14_c' , 'LH AU12_c','LH AU20_c' , 'RH AU17_c', 'RH AU07_c','RH AU14_c' , 'RH AU12_c','RH AU20_c' ]

cps_matrices = {df:{ key: np.zeros((len(states), len(states))) for key in cps } for df in train_data}

for df in matrices:
    lh = matrices[df]['leftHandTouching']
    rh = matrices[df]['rightHandTouching']

    for m in matrices[df]:
        # AU column names start with a space, so 'LH' + m gives e.g. 'LH AU17_c';
        # the hand columns have no entry in cps and are skipped
        if 'LH' + m in cps_matrices[df]:
            t = matrices[df][m]
            # element-wise product of the AU matrix with each hand's matrix
            cps_matrices[df]['LH' + m] = t * lh
            cps_matrices[df]['RH' + m] = t * rh


In [24]:
# save the conditional probability matrices
with open("conditional_probs.pickle", 'wb') as f:
    pickle.dump(cps_matrices, f)