 **Media Memorability Prediction using Machine Learning**




# Introduction 

Author : **Prasad Govardhankar - 20210305**

This paper summarizes machine learning and statistical methods to predict the memorability of videos within the MediaEval Predicting Media Memorability Task. In this task, we are focused on predicting how video is memorable to viewers. Here we are predicting memorability score for videos that reflect the probability of whether a video will be remembered or not. The aim is to train computational models to get video memorability scores from visual content. Spearman’s rank correlation is used to evaluate the model through standard evaluation metrics.

In this Notebook I have used two Features on below models



1.   Extra Decision Tree
2.   Gradient Booster
3.  MLP Regressor
4. Random Forest
5. Liner Regression
6. XGB

On ***C3D, and HMP***  features seperately and with combination.


# Mounting our drive for Datasets

In [11]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


#Importing libraries which are require for our project



In [12]:

!pip install pyprind 
import numpy as np
import pandas as pd
#import pyprind
#from collections import Counter
#from keras.preprocessing.text import Tokenizer
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
#from keras import preprocessing
import os
import glob



In [13]:
np.random.seed(42)

# **Load Ground Truth For Dev set**

In [14]:
# load the ground truth values
truth_path = '/content/drive/MyDrive/'
gdtruth = pd.read_csv(truth_path+'ground-truth.csv')

In [15]:
gdtruth.head(5)

Unnamed: 0,video,short-term_memorability,nb_short-term_annotations,long-term_memorability,nb_long-term_annotations
0,video3.webm,0.924,34,0.846,13
1,video4.webm,0.923,33,0.667,12
2,video6.webm,0.863,33,0.7,10
3,video8.webm,0.922,33,0.818,11
4,video10.webm,0.95,34,0.9,10


We don't need annotations so need to drop those columns

In [16]:
gdtruth = gdtruth.drop(columns=['nb_short-term_annotations', 'nb_long-term_annotations'])


In [17]:
gdtruth.head(5)

Unnamed: 0,video,short-term_memorability,long-term_memorability
0,video3.webm,0.924,0.846
1,video4.webm,0.923,0.667
2,video6.webm,0.863,0.7
3,video8.webm,0.922,0.818
4,video10.webm,0.95,0.9


In [18]:
gdtruth['video'] = gdtruth['video'].apply(lambda x : x.split('.')[0])

In [19]:
gdtruth.head(5)

Unnamed: 0,video,short-term_memorability,long-term_memorability
0,video3,0.924,0.846
1,video4,0.923,0.667
2,video6,0.863,0.7
3,video8,0.922,0.818
4,video10,0.95,0.9


# Load Dev C3D Features

In [20]:

C3D_feature_list= []
video_names_list = []
path = '/content/drive/MyDrive/C3D/*.txt'
for filename in glob.glob(path):
    name = ((filename.split('/')[-1]).split('.')[0])
    video_names_list.append(name) 
    with open(filename) as f:
        for line in f:
            C3D_features =[float(item) for item in line.split()]
    C3D_feature_list.append(C3D_features)



In [21]:
C3D_features_Dev = pd.DataFrame(np.array(C3D_feature_list).reshape(6000,101))
C3D_features_Dev["video"] = video_names_list

In [23]:
C3D_features_Dev.head(5)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,video
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.75e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.999985,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.12e-06,video6633
1,0.010858,0.010386,0.0,0.0,0.0,0.0,2.7e-07,0.0,1e-08,3.4e-07,8e-08,1e-08,4e-06,0.000105,0.0,4e-08,1e-08,0.0,0.0,0.00013318,0.0,0.0,0.0,0.0,7.6e-07,0.0,9.8e-07,0.0,0.0,0.0,0.0,0.0,6e-08,2.1e-05,5e-08,0.0,1e-08,3e-08,2e-08,0.0,...,0.970125,1.6e-05,0.001298,3.2e-05,1e-06,0.0,0.0,1e-08,2.5e-07,0.0,0.0,5e-08,0.0,1e-08,1e-08,4.2e-07,0.0,3e-08,2e-08,0.0,0.0,0.0,6e-08,0.0,0.0,9e-08,0.0,0.0,1.1e-07,1.4e-07,0.0,0.0,1.7e-07,0.0,0.0,1e-08,1.3e-06,2.6e-06,8e-08,video6632
2,0.0002,6.5e-05,0.993807,2e-07,4.7e-07,7.3e-05,3.7e-06,0.000337,6.71e-06,2.29e-06,6.38e-06,7.34e-06,1.9e-05,7e-06,3.24e-06,3.81e-06,1.411e-05,1e-06,3e-06,2.3e-07,1.9e-07,0.00293762,5.2e-07,9.2e-07,1.136e-05,1.1e-05,0.00033104,9.2e-07,2.2e-05,8e-08,2.8e-05,1.3e-05,5.778e-05,3.7e-05,1.683e-05,6.86e-06,3.99e-06,8.03e-06,1.45e-06,3.8e-07,...,2.7e-05,1.3e-05,7.7e-05,5e-05,0.000138,1.2e-05,7e-06,4.1e-07,5.44e-06,3.4e-07,7e-06,8.399e-05,4.07e-06,3.97e-06,5.9e-07,0.00012853,8.8e-07,2.228e-05,1.105e-05,3e-06,1.5e-05,1.4e-05,4.09e-06,3e-06,4.6e-06,4.92e-06,5.29e-06,6.4e-07,2.372e-05,6.61e-06,1e-05,2.03e-06,5.8e-06,1e-06,1.49e-06,1.17e-05,1.5e-07,8.3e-07,0.000106,video6634
3,0.791961,0.001496,0.00681,9.39e-06,1.22e-06,4e-06,7.452e-05,3e-06,3.659e-05,1.63e-06,8.65e-06,8.063e-05,0.00349,9e-06,1.48e-06,0.00032604,1.254e-05,2.2e-05,5.4e-05,2.084e-05,1.38e-06,3.5e-07,2.585e-05,1.62e-06,0.00045397,8e-06,6.03e-06,1.475e-05,1.5e-05,5.58e-06,2e-06,2.4e-05,0.00024217,0.006136,5.6e-06,1.2e-07,2.86e-06,4.289e-05,4.11e-06,2.33e-06,...,0.030029,0.0046,6.9e-05,3.9e-05,0.004966,6e-06,3e-06,7.55e-06,2.539e-05,1.184e-05,1e-05,6.911e-05,6e-08,2.055e-05,1.99e-06,0.00800232,7.43e-06,1.842e-05,0.00319682,1.1e-05,0.000256,6e-06,3.234e-05,2e-06,6.3e-07,3.683e-05,1.3e-07,7.95e-06,0.00040195,0.00026156,6e-06,3.2e-07,0.00051944,2e-06,9.7e-07,1.735e-05,0.00025745,0.1305835,3.993e-05,video6639
4,0.005782,0.000306,0.004011,1.007e-05,1.034e-05,2e-06,3.16e-06,3e-06,1.984e-05,5.75e-06,6.642e-05,6.69e-06,0.000301,0.004799,2.8e-07,1.669e-05,2.67e-06,1e-06,1.1e-05,6.526e-05,3.1e-07,3.59e-05,1.4e-06,6.26e-06,0.01750103,1.9e-05,0.01190515,4.45e-06,5e-06,2.387e-05,7e-06,3.5e-05,3.63e-06,0.66742,0.00034824,1.9e-07,1.121e-05,6.83e-06,0.00018376,1.25e-06,...,0.005818,0.000535,0.001711,0.112263,0.000408,3.5e-05,0.000267,5.395e-05,3.899e-05,3.619e-05,0.000321,0.00045509,1.51e-06,6.2e-06,7.51e-06,0.00797192,7e-08,1.873e-05,0.00015111,1.1e-05,0.000398,1e-06,4.09e-06,4e-06,4.68e-06,2.08e-06,1.48e-06,0.00016518,2.738e-05,2.106e-05,2e-06,3.72e-06,0.00681835,5e-06,5e-08,2.088e-05,0.00127175,0.00048622,1.965e-05,video6643


In [100]:
C3D_features_Dev.to_pickle('/content/drive/MyDrive/C3D_features_Dev.pkl')

In [101]:
C3D_features_Dev = pd.read_pickle('/content/drive/MyDrive/C3D_features_Dev.pkl')

Combine Ground truth and C3D features

In [24]:
df = pd.merge(gdtruth,C3D_features_Dev)
df.shape

(6000, 104)

In [28]:
#Function to calculate Spearman coefficient scores
def Get_score(Y_pred,Y_true):
    '''Calculate the Spearmann"s correlation coefficient'''
    Y_pred = np.squeeze(Y_pred)
    Y_true = np.squeeze(Y_true)
    if Y_pred.shape != Y_true.shape:
        print('Input shapes don\'t match!')
    else:
        if len(Y_pred.shape) == 1:
            Res = pd.DataFrame({'Y_true':Y_true,'Y_pred':Y_pred})
            score_mat = Res[['Y_true','Y_pred']].corr(method='spearman',min_periods=1)
            print('The Spearman\'s rank correlation coefficient is: %.3f' % score_mat.iloc[1][0])
        else:
            for ii in range(Y_pred.shape[1]):
                Get_score(Y_pred[:,ii],Y_true[:,ii])

load requried models from SKLearn

In [26]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import ExtraTreeRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import AdaBoostRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression

In [25]:
X = df.iloc[:,3:104].values
Y = df.iloc[:, 1:3].values

# Extra Decision tree on Dev C3D

In [29]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=42) # random state for reproducability

extra_tree = ExtraTreeRegressor()
reg = BaggingRegressor(extra_tree).fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

-0.08091399835903747
The Spearman's rank correlation coefficient is: 0.205
The Spearman's rank correlation coefficient is: 0.057




# MLP Regressor on Dev C3D

In [30]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability
reg = MLPRegressor(random_state=1, max_iter=500)
#reg = make_pipeline(StandardScaler(),SGDRegressor(max_iter=1000, tol=1e-3))
reg.fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

0.0201140114673548
The Spearman's rank correlation coefficient is: 0.287
The Spearman's rank correlation coefficient is: 0.103




# Gradient Booster on Dev C3D

In [31]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=10) # random state for reproducability

gbr = GradientBoostingRegressor()
model = MultiOutputRegressor(estimator=gbr)
model.fit(X_train,Y_train)
print(model.score(X_test,Y_test))
predictionsGB = model.predict(X_test)
Get_score(predictionsGB, Y_test)

0.073295338340026
The Spearman's rank correlation coefficient is: 0.319
The Spearman's rank correlation coefficient is: 0.162


# Random Forest on Dev C3D

In [32]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability

reg = RandomForestRegressor(max_depth= 10,n_estimators=100)
reg.fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

0.030594167920888263
The Spearman's rank correlation coefficient is: 0.291
The Spearman's rank correlation coefficient is: 0.128




# Linear Regression on Dev C3D

In [33]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.20)

from sklearn.linear_model import LinearRegression
regressor1 = LinearRegression()
regressor1.fit(X_train, Y_train)

Y_pred1 = regressor1.predict(X_test)
Get_score(Y_pred1, Y_test)

The Spearman's rank correlation coefficient is: 0.272
The Spearman's rank correlation coefficient is: 0.112


# XGB on Dev C3D

In [34]:
import xgboost as xgb
from sklearn.metrics import mean_squared_error

X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=10) # random state for reproducability


xgb = xgb.XGBRegressor(learning_rate = 0.1, max_depth = 9, alpha = 10, n_estimators = 50)
model = MultiOutputRegressor(estimator=xgb)
model.fit(X_train,Y_train)
print(model.score(X_test,Y_test))
XGB = model.predict(X_test)
Get_score(XGB, Y_test)






0.039817269299019464
The Spearman's rank correlation coefficient is: 0.314
The Spearman's rank correlation coefficient is: 0.092


I've tried multiple models and as we can see **Gradient Boost Regressor** out performed other models.

# Now we'll try same Models with HMP feature

In [35]:

def load_hmp(hmp_path):
    files = list(gdtruth["video"].values)
    hmp_features = []
    video_names_list = []
    for file in files:
        file = hmp_path+file+'.txt'
        name = ((file.split('/')[-1]).split('.')[0])
        with open(file) as f:
            for line in f:
                pairs=line.split()
                HMP_temp = { int(p.split(':')[0]) : float(p.split(':')[1]) for p in pairs}
                HMP = np.zeros(6075)
            for idx in HMP_temp.keys():
                HMP[idx-1] = HMP_temp[idx]
            hmp_features.append(HMP)
            video_names_list.append(name)
    return hmp_features

# Load HMP Dev Set

In [36]:
hmp_path = '/content/drive/MyDrive/HMP/'

hmp_features = load_hmp(hmp_path)

In [37]:
len(hmp_features)

6000

In [None]:
print(video_names_list)

In [38]:
HMP_features = pd.DataFrame(np.array(hmp_features).reshape(6000,6075))
HMP_features["video"] = video_names_list
HMP_features.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,6036,6037,6038,6039,6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052,6053,6054,6055,6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068,6069,6070,6071,6072,6073,6074,video
0,0.125563,0.024036,0.000314,0.0,0.015864,0.000358,0.0,0.0,8.6e-05,0.0,0.0,0.0,0.0,0.002795,5.4e-05,0.0,0.0,3.7e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000173,0.000459,0.0,0.000148,0.000104,0.0,0.000121,0.000551,0.0,0.000114,0.000884,2e-06,0.000116,7.7e-05,2e-06,2.7e-05,0.000136,0.0,0.0,2e-06,0.0,9.1e-05,3.5e-05,0.0,0.000163,0.000467,2e-06,1e-05,1.7e-05,0.0,0.000393,0.000279,0.0,0.000289,0.001926,0.0,8.6e-05,0.00058,0.0,video6633
1,0.007526,0.001421,6.8e-05,0.0,0.001184,0.000143,0.0,0.0,7.9e-05,0.0,0.0,0.0,0.0,0.000246,2.4e-05,0.0,0.0,4.2e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000308,0.001054,0.000751,0.000176,6.2e-05,0.0,0.000123,0.000398,8.6e-05,0.000246,0.000433,0.000446,0.000143,5.3e-05,0.0,5.3e-05,9.9e-05,9e-06,4e-06,3.3e-05,4e-06,5.1e-05,3.5e-05,0.0,6.2e-05,0.000358,3.5e-05,2.4e-05,8.3e-05,5.3e-05,0.000244,6.6e-05,0.0,8.1e-05,0.000617,9.4e-05,0.00022,0.000762,0.001224,video6632
2,0.109584,0.018978,0.000289,0.0,0.008774,0.000208,0.0,2e-06,8.8e-05,0.0,0.0,0.0,0.0,0.002046,6.1e-05,0.0,0.0,3.8e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,5.7e-05,0.000158,7.3e-05,2.1e-05,9e-06,2e-06,1.9e-05,9.5e-05,2.1e-05,1.9e-05,9e-05,7.3e-05,5e-05,2.4e-05,0.0,1.2e-05,2.1e-05,0.0,0.0,2e-06,0.0,1.7e-05,7e-06,2e-06,6.6e-05,0.000203,2.6e-05,2e-06,4e-05,7e-06,5.4e-05,4.5e-05,0.0,2.8e-05,0.000291,3.3e-05,5.2e-05,0.000258,0.000215,video6634
3,0.120431,0.013561,0.000277,0.0,0.018974,0.000913,0.0,2.4e-05,0.000713,0.0,0.0,0.0,0.0,0.002496,0.000149,0.0,1.1e-05,0.000157,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000434,0.000543,0.000412,0.000412,4.5e-05,3e-06,0.000144,0.000282,3.7e-05,0.000197,0.000218,0.000157,0.000237,2.1e-05,0.0,4e-05,5.6e-05,8e-06,5e-06,1.3e-05,1.9e-05,0.000168,1.3e-05,0.0,0.000133,0.000202,2.9e-05,2.9e-05,3.5e-05,5.9e-05,0.00111,7.5e-05,8e-06,0.000333,0.000793,0.000101,0.000588,0.000503,0.000452,video6639
4,0.005026,0.001356,5.5e-05,0.0,0.000665,2.9e-05,0.0,0.0,2.4e-05,0.0,0.0,0.0,0.0,0.000147,2e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000996,0.001604,0.000103,0.000768,0.000215,9e-06,0.000415,0.000926,2e-05,0.000538,0.001178,5e-05,0.000518,0.000169,7e-06,0.000134,0.000169,7e-06,2.6e-05,4.6e-05,7e-06,0.000373,8.8e-05,0.0,0.000338,0.000441,2.9e-05,7e-05,0.000149,9e-06,0.000882,0.0002,9e-06,0.000559,0.001097,1.8e-05,0.000632,0.001128,6.4e-05,video6643


In [None]:
#HMP_features = pd.merge(gdtruth,HMP_features)

In [39]:
HMP_features.head(5)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,6036,6037,6038,6039,6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052,6053,6054,6055,6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068,6069,6070,6071,6072,6073,6074,video
0,0.125563,0.024036,0.000314,0.0,0.015864,0.000358,0.0,0.0,8.6e-05,0.0,0.0,0.0,0.0,0.002795,5.4e-05,0.0,0.0,3.7e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000173,0.000459,0.0,0.000148,0.000104,0.0,0.000121,0.000551,0.0,0.000114,0.000884,2e-06,0.000116,7.7e-05,2e-06,2.7e-05,0.000136,0.0,0.0,2e-06,0.0,9.1e-05,3.5e-05,0.0,0.000163,0.000467,2e-06,1e-05,1.7e-05,0.0,0.000393,0.000279,0.0,0.000289,0.001926,0.0,8.6e-05,0.00058,0.0,video6633
1,0.007526,0.001421,6.8e-05,0.0,0.001184,0.000143,0.0,0.0,7.9e-05,0.0,0.0,0.0,0.0,0.000246,2.4e-05,0.0,0.0,4.2e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000308,0.001054,0.000751,0.000176,6.2e-05,0.0,0.000123,0.000398,8.6e-05,0.000246,0.000433,0.000446,0.000143,5.3e-05,0.0,5.3e-05,9.9e-05,9e-06,4e-06,3.3e-05,4e-06,5.1e-05,3.5e-05,0.0,6.2e-05,0.000358,3.5e-05,2.4e-05,8.3e-05,5.3e-05,0.000244,6.6e-05,0.0,8.1e-05,0.000617,9.4e-05,0.00022,0.000762,0.001224,video6632
2,0.109584,0.018978,0.000289,0.0,0.008774,0.000208,0.0,2e-06,8.8e-05,0.0,0.0,0.0,0.0,0.002046,6.1e-05,0.0,0.0,3.8e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,5.7e-05,0.000158,7.3e-05,2.1e-05,9e-06,2e-06,1.9e-05,9.5e-05,2.1e-05,1.9e-05,9e-05,7.3e-05,5e-05,2.4e-05,0.0,1.2e-05,2.1e-05,0.0,0.0,2e-06,0.0,1.7e-05,7e-06,2e-06,6.6e-05,0.000203,2.6e-05,2e-06,4e-05,7e-06,5.4e-05,4.5e-05,0.0,2.8e-05,0.000291,3.3e-05,5.2e-05,0.000258,0.000215,video6634
3,0.120431,0.013561,0.000277,0.0,0.018974,0.000913,0.0,2.4e-05,0.000713,0.0,0.0,0.0,0.0,0.002496,0.000149,0.0,1.1e-05,0.000157,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000434,0.000543,0.000412,0.000412,4.5e-05,3e-06,0.000144,0.000282,3.7e-05,0.000197,0.000218,0.000157,0.000237,2.1e-05,0.0,4e-05,5.6e-05,8e-06,5e-06,1.3e-05,1.9e-05,0.000168,1.3e-05,0.0,0.000133,0.000202,2.9e-05,2.9e-05,3.5e-05,5.9e-05,0.00111,7.5e-05,8e-06,0.000333,0.000793,0.000101,0.000588,0.000503,0.000452,video6639
4,0.005026,0.001356,5.5e-05,0.0,0.000665,2.9e-05,0.0,0.0,2.4e-05,0.0,0.0,0.0,0.0,0.000147,2e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000996,0.001604,0.000103,0.000768,0.000215,9e-06,0.000415,0.000926,2e-05,0.000538,0.001178,5e-05,0.000518,0.000169,7e-06,0.000134,0.000169,7e-06,2.6e-05,4.6e-05,7e-06,0.000373,8.8e-05,0.0,0.000338,0.000441,2.9e-05,7e-05,0.000149,9e-06,0.000882,0.0002,9e-06,0.000559,0.001097,1.8e-05,0.000632,0.001128,6.4e-05,video6643


In [102]:
HMP_features.to_pickle('/content/drive/MyDrive/HMP_features.pkl')

In [103]:
HMP_features = pd.read_pickle('/content/drive/MyDrive/HMP_features.pkl')

In [41]:
X = HMP_features.iloc[:,0:6075].values
Y = gdtruth[['short-term_memorability','long-term_memorability']].values

# EXTRA Decision Tree on Dev HMP

In [42]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=42) # random state for reproducability

extra_tree = ExtraTreeRegressor()
reg = BaggingRegressor(extra_tree).fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)



-0.06009090924388922
The Spearman's rank correlation coefficient is: 0.206
The Spearman's rank correlation coefficient is: 0.046


# Gradient booster on Dev HMP

In [43]:


from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=42) # random state for reproducability

gbr = GradientBoostingRegressor()
model = MultiOutputRegressor(estimator=gbr)
model.fit(X_train,Y_train)
print(model.score(X_test,Y_test))
predictionsGB = model.predict(X_test)
Get_score(predictionsGB, Y_test)

0.04894695876668437
The Spearman's rank correlation coefficient is: 0.310
The Spearman's rank correlation coefficient is: 0.113


# MLP REGRESSOR on Dev HMP

In [44]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability
reg = MLPRegressor(random_state=1, max_iter=500)
#reg = make_pipeline(StandardScaler(),SGDRegressor(max_iter=1000, tol=1e-3))
reg.fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

0.017501924837915333
The Spearman's rank correlation coefficient is: 0.257
The Spearman's rank correlation coefficient is: 0.111




# Random Forest on Dev HMP

In [45]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability

reg = RandomForestRegressor(max_depth= 20,n_estimators=100)
reg.fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

0.03279023738417329
The Spearman's rank correlation coefficient is: 0.315
The Spearman's rank correlation coefficient is: 0.128




# Linear Regression on Dev HMP

In [46]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.20)

from sklearn.linear_model import LinearRegression
regressor1 = LinearRegression()
regressor1.fit(X_train, Y_train)

Y_pred1 = regressor1.predict(X_test)
Get_score(Y_pred1, Y_test)

The Spearman's rank correlation coefficient is: 0.009
The Spearman's rank correlation coefficient is: 0.073


# XGB on Dev HMP

In [47]:
import xgboost as xgb
from sklearn.metrics import mean_squared_error

X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability


xgb = xgb.XGBRegressor(learning_rate = 0.2, max_depth = 5, alpha = 10, n_estimators = 20)
model = MultiOutputRegressor(estimator=xgb)
model.fit(X_train,Y_train)
print(model.score(X_test,Y_test))
XGB = model.predict(X_test)
Get_score(XGB, Y_test)



0.024981902128770994
The Spearman's rank correlation coefficient is: 0.292
The Spearman's rank correlation coefficient is: 0.092


# **Executing same models for combination of C3D and hmp combined Feature!**


merge C3D + HMP + Ground truth

In [50]:
len(C3D_features_Dev)

6000

In [49]:
len(HMP_features)

6000

In [51]:
#Combining captions and c3d with HMP
c3d_hmp  = pd.concat([C3D_features_Dev,HMP_features],axis=1)


In [52]:
len(c3d_hmp)

6000

In [53]:
c3d_hmp.head(10)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,6036,6037,6038,6039,6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052,6053,6054,6055,6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068,6069,6070,6071,6072,6073,6074,video
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.75e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000173,0.000459,0.0,0.000148,0.000104,0.0,0.000121,0.000551,0.0,0.000114,0.000884,2e-06,0.000116,7.7e-05,2e-06,2.7e-05,0.000136,0.0,0.0,2e-06,0.0,9.1e-05,3.5e-05,0.0,0.000163,0.000467,2e-06,1e-05,1.7e-05,0.0,0.000393,0.000279,0.0,0.000289,0.001926,0.0,8.6e-05,0.00058,0.0,video6633
1,0.010858,0.010386,0.0,0.0,0.0,0.0,2.7e-07,0.0,1e-08,3.4e-07,8e-08,1e-08,4e-06,0.000105,0.0,4e-08,1e-08,0.0,0.0,0.00013318,0.0,0.0,0.0,0.0,7.6e-07,0.0,9.8e-07,0.0,0.0,0.0,0.0,0.0,6e-08,2.1e-05,5e-08,0.0,1e-08,3e-08,2e-08,0.0,...,0.000308,0.001054,0.000751,0.000176,6.2e-05,0.0,0.000123,0.000398,8.6e-05,0.000246,0.000433,0.000446,0.000143,5.3e-05,0.0,5.3e-05,9.9e-05,9e-06,4e-06,3.3e-05,4e-06,5.1e-05,3.5e-05,0.0,6.2e-05,0.000358,3.5e-05,2.4e-05,8.3e-05,5.3e-05,0.000244,6.6e-05,0.0,8.1e-05,0.000617,9.4e-05,0.00022,0.000762,0.001224,video6632
2,0.0002,6.5e-05,0.993807,2e-07,4.7e-07,7.3e-05,3.7e-06,0.000337,6.71e-06,2.29e-06,6.38e-06,7.34e-06,1.9e-05,7e-06,3.24e-06,3.81e-06,1.411e-05,1.04e-06,3e-06,2.3e-07,1.9e-07,0.00293762,5.2e-07,9.2e-07,1.136e-05,1.1e-05,0.00033104,9.2e-07,2.2e-05,8e-08,2.8e-05,1.3e-05,5.778e-05,3.7e-05,1.683e-05,6.86e-06,3.99e-06,8.03e-06,1.45e-06,3.8e-07,...,5.7e-05,0.000158,7.3e-05,2.1e-05,9e-06,2e-06,1.9e-05,9.5e-05,2.1e-05,1.9e-05,9e-05,7.3e-05,5e-05,2.4e-05,0.0,1.2e-05,2.1e-05,0.0,0.0,2e-06,0.0,1.7e-05,7e-06,2e-06,6.6e-05,0.000203,2.6e-05,2e-06,4e-05,7e-06,5.4e-05,4.5e-05,0.0,2.8e-05,0.000291,3.3e-05,5.2e-05,0.000258,0.000215,video6634
3,0.791961,0.001496,0.00681,9.39e-06,1.22e-06,4e-06,7.452e-05,3e-06,3.659e-05,1.63e-06,8.65e-06,8.063e-05,0.00349,9e-06,1.48e-06,0.00032604,1.254e-05,2.197e-05,5.4e-05,2.084e-05,1.38e-06,3.5e-07,2.585e-05,1.62e-06,0.00045397,8e-06,6.03e-06,1.475e-05,1.5e-05,5.58e-06,2e-06,2.4e-05,0.00024217,0.006136,5.6e-06,1.2e-07,2.86e-06,4.289e-05,4.11e-06,2.33e-06,...,0.000434,0.000543,0.000412,0.000412,4.5e-05,3e-06,0.000144,0.000282,3.7e-05,0.000197,0.000218,0.000157,0.000237,2.1e-05,0.0,4e-05,5.6e-05,8e-06,5e-06,1.3e-05,1.9e-05,0.000168,1.3e-05,0.0,0.000133,0.000202,2.9e-05,2.9e-05,3.5e-05,5.9e-05,0.00111,7.5e-05,8e-06,0.000333,0.000793,0.000101,0.000588,0.000503,0.000452,video6639
4,0.005782,0.000306,0.004011,1.007e-05,1.034e-05,2e-06,3.16e-06,3e-06,1.984e-05,5.75e-06,6.642e-05,6.69e-06,0.000301,0.004799,2.8e-07,1.669e-05,2.67e-06,1.06e-06,1.1e-05,6.526e-05,3.1e-07,3.59e-05,1.4e-06,6.26e-06,0.01750103,1.9e-05,0.01190515,4.45e-06,5e-06,2.387e-05,7e-06,3.5e-05,3.63e-06,0.66742,0.00034824,1.9e-07,1.121e-05,6.83e-06,0.00018376,1.25e-06,...,0.000996,0.001604,0.000103,0.000768,0.000215,9e-06,0.000415,0.000926,2e-05,0.000538,0.001178,5e-05,0.000518,0.000169,7e-06,0.000134,0.000169,7e-06,2.6e-05,4.6e-05,7e-06,0.000373,8.8e-05,0.0,0.000338,0.000441,2.9e-05,7e-05,0.000149,9e-06,0.000882,0.0002,9e-06,0.000559,0.001097,1.8e-05,0.000632,0.001128,6.4e-05,video6643
5,0.001927,0.003879,0.738982,2.75e-05,2.616e-05,2.9e-05,0.00080219,0.000775,4.875e-05,3.906e-05,0.00043363,0.00033424,5.8e-05,2.6e-05,7.5e-05,0.00034563,4.777e-05,5.93e-05,4.5e-05,9.816e-05,4.38e-06,0.1439418,6.315e-05,0.00010764,9.316e-05,0.000179,9.464e-05,6.445e-05,0.000211,6.314e-05,5e-05,0.005812,0.00340365,0.000247,0.00353672,1.259e-05,5.703e-05,0.00076019,0.00232122,1.756e-05,...,0.000372,0.000608,0.000293,0.000173,8.1e-05,1.5e-05,0.000169,0.00037,5.9e-05,0.000245,0.000449,0.00016,0.000346,9.2e-05,1.1e-05,9.4e-05,0.000114,9e-06,2e-06,1.5e-05,4e-06,0.000103,2e-05,7e-06,0.000263,0.000409,3.9e-05,2e-05,7.2e-05,3.1e-05,0.000249,9.4e-05,2e-06,0.000201,0.000444,5.7e-05,0.000462,0.000698,0.00021,video6641
6,0.001606,0.000703,0.016928,0.00026241,6.521e-05,0.000166,0.00138288,0.000564,2.549e-05,0.00011021,4.727e-05,0.000106,0.000129,0.001323,6.937e-05,0.00018763,0.00051358,0.00010552,0.000179,0.00010611,2.16e-06,0.2306214,7.12e-06,0.00013489,0.00224466,0.008552,0.00264915,4.501e-05,7.3e-05,5.949e-05,7e-05,0.030642,0.00024119,0.000154,0.09763245,0.00098085,0.01003205,0.00190817,0.00104929,8.088e-05,...,0.000453,0.001969,0.000142,0.000215,0.000195,9e-06,0.00041,0.000933,4.6e-05,0.000331,0.000883,7.4e-05,8.8e-05,0.000134,9e-06,7.7e-05,0.000147,7e-06,9e-06,3.5e-05,0.0,0.000151,0.000131,4e-06,0.000226,0.000434,2.2e-05,6.4e-05,0.000138,1.1e-05,0.000162,0.000103,0.0,0.000212,0.00058,2e-05,0.0003,0.000679,5e-05,video6646
7,3.8e-05,0.000295,0.001251,3.434e-05,0.00019411,6.2e-05,8.84e-06,0.000499,3.923e-05,2.291e-05,0.00012775,0.01619368,1.8e-05,2.1e-05,5.5e-07,0.000342,1.09e-06,7.3e-07,0.000356,5.43e-06,4.89e-06,0.4323865,1.476e-05,0.00174491,0.00120853,0.027514,0.00195184,2.14e-06,3.6e-05,0.01447474,5e-06,0.238099,2.738e-05,2e-06,0.00642882,0.01607525,5.994e-05,8.658e-05,2.555e-05,0.00049637,...,0.000204,0.0006,0.000252,0.000135,6.9e-05,2e-06,0.000131,0.000345,6e-05,0.000126,0.000409,0.000175,0.000128,4.4e-05,2e-06,4.2e-05,6.2e-05,7e-06,2e-06,2.2e-05,2e-06,7.5e-05,3.3e-05,0.0,8.4e-05,0.000237,2.9e-05,2.4e-05,5.5e-05,2.4e-05,0.000173,5.3e-05,2e-06,0.000117,0.000401,2.9e-05,0.000193,0.000432,0.000314,video6637
8,0.023067,0.005192,0.001679,6.392e-05,0.00133022,0.01529,0.01064836,4.5e-05,0.00058927,0.00710164,0.00340782,0.00057827,0.064231,0.009031,0.00020088,0.00048604,0.00029634,0.00013754,0.000602,0.00032946,2.154e-05,0.00077521,2.877e-05,0.00058209,0.00268821,9.7e-05,0.09716719,0.00050756,4.4e-05,0.00203105,0.001906,0.00197,7.597e-05,0.038017,0.00070354,3.516e-05,0.00016261,0.00077328,0.04882166,5.065e-05,...,0.000229,0.000397,0.000165,7.5e-05,3.5e-05,2e-05,0.000143,0.000128,2.2e-05,0.00015,0.00022,9.7e-05,5.5e-05,2.4e-05,2e-06,1.1e-05,1.1e-05,0.0,4e-06,0.0,0.0,2.9e-05,1.3e-05,2e-06,4.2e-05,6e-05,4e-06,2.2e-05,7e-06,0.0,4e-05,1.1e-05,2e-06,6.8e-05,6.2e-05,7e-06,0.000119,0.000115,3.5e-05,video6635
9,0.005959,0.004765,0.003757,0.00057871,7.344e-05,0.000129,0.00046653,0.010266,0.00030281,0.00021965,0.00025218,0.00296373,3.2e-05,0.000592,3.26e-05,0.00146643,0.00024307,0.00018084,0.004019,0.00239096,0.00012069,0.00848849,0.00052162,0.00041646,0.0164342,0.049686,0.00147055,5.432e-05,0.001055,0.00544277,0.000173,0.024688,0.00096407,0.00088,0.01140529,0.00452216,0.00085945,0.0006573,0.00069424,0.00182311,...,9.6e-05,0.00026,4.5e-05,5.6e-05,2.4e-05,0.0,7.8e-05,0.00016,4e-06,8.7e-05,0.00018,2.7e-05,3.8e-05,1.3e-05,0.0,2e-05,2.9e-05,0.0,4e-06,4e-06,2e-06,5.3e-05,1.6e-05,0.0,9.3e-05,6.9e-05,1.3e-05,7e-06,1.8e-05,2e-06,4.2e-05,2.2e-05,0.0,6.5e-05,0.000111,7e-06,8e-05,0.000194,5.8e-05,video6645


In [107]:
c3d_hmp.shape

(6000, 6178)

In [108]:
X = c3d_hmp.drop(columns=['video'])
Y = gdtruth[['short-term_memorability','long-term_memorability']].values

# EXTRA Decision tree ON C3D and HMP

In [56]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=42) # random state for reproducability

extra_tree = ExtraTreeRegressor()
reg = BaggingRegressor(extra_tree).fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)



-0.07250704061438354
The Spearman's rank correlation coefficient is: 0.160
The Spearman's rank correlation coefficient is: 0.066


# Gradient Booster ON C3D and HMP

In [110]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=42) # random state for reproducability

gbr = GradientBoostingRegressor()
model = MultiOutputRegressor(estimator=gbr)
model.fit(X_train,Y_train)
print(model.score(X_test,Y_test))
predictionsGB = model.predict(X_test)
Get_score(predictionsGB, Y_test)

0.04428595301461358
The Spearman's rank correlation coefficient is: 0.299
The Spearman's rank correlation coefficient is: 0.117


# MLREGRESSOR ON C3D and HMP

In [58]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability
reg = MLPRegressor(random_state=1, max_iter=500)
#reg = make_pipeline(StandardScaler(),SGDRegressor(max_iter=1000, tol=1e-3))
reg.fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

0.006162197509033656
The Spearman's rank correlation coefficient is: 0.199
The Spearman's rank correlation coefficient is: 0.082




# Random Forest ON C3D and HMP

In [59]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability

reg = RandomForestRegressor(max_depth= 20,n_estimators=100)
reg.fit(X_train, Y_train)
print(reg.score(X_test,Y_test))
pred = reg.predict(X_test)
Get_score(pred, Y_test)

0.029278291877852566
The Spearman's rank correlation coefficient is: 0.275
The Spearman's rank correlation coefficient is: 0.126




# Liner Regression ON C3D and HMP

In [60]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.20)

from sklearn.linear_model import LinearRegression
regressor1 = LinearRegression()
regressor1.fit(X_train, Y_train)

Y_pred1 = regressor1.predict(X_test)
Get_score(Y_pred1, Y_test)

The Spearman's rank correlation coefficient is: 0.010
The Spearman's rank correlation coefficient is: 0.002


# XGB ON C3D and HMP

In [61]:
import xgboost as xgb
from sklearn.metrics import mean_squared_error

X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.20, random_state=42) # random state for reproducability


xgb = xgb.XGBRegressor(learning_rate = 0.2, max_depth = 5, alpha = 10, n_estimators = 20)
model = MultiOutputRegressor(estimator=xgb)
model.fit(X_train,Y_train)
print(model.score(X_test,Y_test))
XGB = model.predict(X_test)
Get_score(XGB, Y_test)



0.02983055559527381
The Spearman's rank correlation coefficient is: 0.273
The Spearman's rank correlation coefficient is: 0.101


# Executing Best performing model on the Test Dataset

#Load C3D test set

In [63]:
os.chdir('/content/drive/MyDrive/C3D_test')

In [64]:
C3D_feature_Test_list= []
video_names_Test_list = []
path = '/content/drive/MyDrive/C3D_test/*.txt'
for filename in glob.glob(path):
    name = ((filename.split('/')[-1]).split('.')[0])
    video_names_Test_list.append(name) 
    with open(filename) as f:
        for line in f:
            C3D_features_Test =[float(item) for item in line.split()]
    C3D_feature_Test_list.append(C3D_features_Test)

In [65]:
C3D_features_Test = pd.DataFrame(np.array(C3D_feature_Test_list).reshape(2000,101))
C3D_features_Test["video"] = video_names_Test_list
C3D_features_Test.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,video
0,0.968162,0.001538,1.7e-07,3e-08,0.0,0.0,3e-08,0.0,0.0,1.7e-07,1e-08,0.0,0.018083,7e-06,0.0,4e-08,3e-08,2e-06,1e-08,9.6e-05,0.0,0.0,0.0,0.0,5.4e-07,0.0,2.9e-07,0.0,0.0,0.0,0.0,0.0,1e-08,0.007307,1.3e-07,0.0,1.7e-07,0.0,0.000201,0.0,...,0.000991,7e-06,6.2e-05,2e-06,6e-06,0.0,2e-08,6e-08,2e-08,1e-08,1e-08,1.2e-07,0.0,0.0,0.0,0.002485,0.0,0.0,0.0,0.0,2e-08,0.0,1e-08,0.0,0.0,0.0,0.0,1e-08,1e-08,1e-07,0.0,0.0,5e-06,0.0,0.0,0.0,7e-06,2e-06,1e-06,video8770
1,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,2e-08,1e-08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.8e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,video8760
2,0.000358,0.003973,0.0088033,0.00774053,0.000403,0.000194,0.00909474,0.017529,0.000285,2.582e-05,0.00032793,0.013337,7e-06,0.000169,0.000122,0.00068516,0.00081646,4e-06,0.00124903,0.000297,0.001731,0.017458,0.00045758,0.011841,0.06208798,0.000916,3.288e-05,0.001933,0.00135,0.00305029,0.000202,0.138158,0.00200605,2.7e-05,0.01112671,0.028939,0.00764601,0.015354,0.006985,0.000424,...,0.000889,0.001144,0.003669,0.006591,0.000239,0.00458847,0.00322734,0.00048361,0.00093505,0.06927425,0.00049766,0.01659992,0.000226,0.086639,0.000103,0.001129,2.6e-05,0.004527,0.003472,0.045509,0.00667972,0.001158,0.00109179,0.01939896,0.00162,0.004252,0.003154,0.00055855,0.01479572,0.01822863,0.058241,0.002662,0.028245,0.000521,5.476e-05,0.002431,0.000706,0.045552,0.001591,video8765
3,0.001573,0.001398,0.07172299,2.488e-05,1.9e-05,0.000827,0.00683807,0.000136,2.9e-05,0.00023931,0.00056579,0.000672,0.000265,0.000454,1.3e-05,0.0003156,0.00051108,8e-06,9.75e-06,4e-06,1e-05,0.001801,1.66e-06,4.7e-05,0.00046028,3.1e-05,0.00042496,0.000133,0.001446,0.00024836,0.000171,5.5e-05,0.00629857,0.000264,5.318e-05,3e-06,8.437e-05,0.000372,0.000109,1.3e-05,...,0.003979,0.003753,0.002594,0.003642,0.030717,0.00348137,0.00059504,1.133e-05,0.00050565,0.0001954,5.344e-05,0.00035296,9e-06,0.000182,9e-05,0.000233,0.0019,0.000345,0.000182,0.000258,0.00101389,5.4e-05,6.851e-05,0.00019253,0.00141,0.000869,1.4e-05,8.568e-05,0.00169846,4.608e-05,0.001467,0.000128,0.000102,7e-06,3.577e-05,0.001544,3.4e-05,0.000837,0.804485,video8762
4,0.003526,0.001376,0.04618705,7.76e-06,4.1e-05,0.000295,0.00034319,0.000406,7e-05,1.534e-05,0.00035398,0.000599,0.000349,0.004517,1.6e-05,5.5e-05,5.68e-05,1.5e-05,1.409e-05,0.000145,4e-06,0.153209,6.9e-07,2.9e-05,0.00232581,0.000133,0.0020681,1.2e-05,2.6e-05,0.00015458,9e-06,0.000481,0.00014286,0.000359,0.00100858,0.000114,4.409e-05,1.4e-05,0.000449,3.2e-05,...,0.000249,0.000621,0.00064,0.012609,0.000678,0.00039446,0.00318399,1.618e-05,0.00039387,1.934e-05,0.00081261,0.00248128,4e-06,1.9e-05,5e-06,0.001956,3e-06,2.6e-05,6.5e-05,0.000133,0.00065311,5.7e-05,9.699e-05,0.00113418,0.00024,3.6e-05,2.2e-05,5.704e-05,0.00049511,0.00114253,0.000321,0.000167,0.005811,0.000252,4.4e-07,0.000568,5.1e-05,6.7e-05,0.682813,video8768


In [66]:
C3D_features_Test = C3D_features_Test.drop(['video'], axis=1)

# Combine C3D Train and Test Features

In [67]:
C3D_final_dataset = pd.concat([C3D_features_Dev, C3D_features_Test],ignore_index=True)
C3D_final_dataset.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,video
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.75e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.999985,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.12e-06,video6633
1,0.010858,0.010386,0.0,0.0,0.0,0.0,2.7e-07,0.0,1e-08,3.4e-07,8e-08,1e-08,4e-06,0.000105,0.0,4e-08,1e-08,0.0,0.0,0.00013318,0.0,0.0,0.0,0.0,7.6e-07,0.0,9.8e-07,0.0,0.0,0.0,0.0,0.0,6e-08,2.1e-05,5e-08,0.0,1e-08,3e-08,2e-08,0.0,...,0.970125,1.6e-05,0.001298,3.2e-05,1e-06,0.0,0.0,1e-08,2.5e-07,0.0,0.0,5e-08,0.0,1e-08,1e-08,4.2e-07,0.0,3e-08,2e-08,0.0,0.0,0.0,6e-08,0.0,0.0,9e-08,0.0,0.0,1.1e-07,1.4e-07,0.0,0.0,1.7e-07,0.0,0.0,1e-08,1.3e-06,2.6e-06,8e-08,video6632
2,0.0002,6.5e-05,0.993807,2e-07,4.7e-07,7.3e-05,3.7e-06,0.000337,6.71e-06,2.29e-06,6.38e-06,7.34e-06,1.9e-05,7e-06,3.24e-06,3.81e-06,1.411e-05,1e-06,3e-06,2.3e-07,1.9e-07,0.00293762,5.2e-07,9.2e-07,1.136e-05,1.1e-05,0.00033104,9.2e-07,2.2e-05,8e-08,2.8e-05,1.3e-05,5.778e-05,3.7e-05,1.683e-05,6.86e-06,3.99e-06,8.03e-06,1.45e-06,3.8e-07,...,2.7e-05,1.3e-05,7.7e-05,5e-05,0.000138,1.2e-05,7e-06,4.1e-07,5.44e-06,3.4e-07,7e-06,8.399e-05,4.07e-06,3.97e-06,5.9e-07,0.00012853,8.8e-07,2.228e-05,1.105e-05,3e-06,1.5e-05,1.4e-05,4.09e-06,3e-06,4.6e-06,4.92e-06,5.29e-06,6.4e-07,2.372e-05,6.61e-06,1e-05,2.03e-06,5.8e-06,1e-06,1.49e-06,1.17e-05,1.5e-07,8.3e-07,0.000106,video6634
3,0.791961,0.001496,0.00681,9.39e-06,1.22e-06,4e-06,7.452e-05,3e-06,3.659e-05,1.63e-06,8.65e-06,8.063e-05,0.00349,9e-06,1.48e-06,0.00032604,1.254e-05,2.2e-05,5.4e-05,2.084e-05,1.38e-06,3.5e-07,2.585e-05,1.62e-06,0.00045397,8e-06,6.03e-06,1.475e-05,1.5e-05,5.58e-06,2e-06,2.4e-05,0.00024217,0.006136,5.6e-06,1.2e-07,2.86e-06,4.289e-05,4.11e-06,2.33e-06,...,0.030029,0.0046,6.9e-05,3.9e-05,0.004966,6e-06,3e-06,7.55e-06,2.539e-05,1.184e-05,1e-05,6.911e-05,6e-08,2.055e-05,1.99e-06,0.00800232,7.43e-06,1.842e-05,0.00319682,1.1e-05,0.000256,6e-06,3.234e-05,2e-06,6.3e-07,3.683e-05,1.3e-07,7.95e-06,0.00040195,0.00026156,6e-06,3.2e-07,0.00051944,2e-06,9.7e-07,1.735e-05,0.00025745,0.1305835,3.993e-05,video6639
4,0.005782,0.000306,0.004011,1.007e-05,1.034e-05,2e-06,3.16e-06,3e-06,1.984e-05,5.75e-06,6.642e-05,6.69e-06,0.000301,0.004799,2.8e-07,1.669e-05,2.67e-06,1e-06,1.1e-05,6.526e-05,3.1e-07,3.59e-05,1.4e-06,6.26e-06,0.01750103,1.9e-05,0.01190515,4.45e-06,5e-06,2.387e-05,7e-06,3.5e-05,3.63e-06,0.66742,0.00034824,1.9e-07,1.121e-05,6.83e-06,0.00018376,1.25e-06,...,0.005818,0.000535,0.001711,0.112263,0.000408,3.5e-05,0.000267,5.395e-05,3.899e-05,3.619e-05,0.000321,0.00045509,1.51e-06,6.2e-06,7.51e-06,0.00797192,7e-08,1.873e-05,0.00015111,1.1e-05,0.000398,1e-06,4.09e-06,4e-06,4.68e-06,2.08e-06,1.48e-06,0.00016518,2.738e-05,2.106e-05,2e-06,3.72e-06,0.00681835,5e-06,5e-08,2.088e-05,0.00127175,0.00048622,1.965e-05,video6643


In [68]:
C3D_final_dataset = C3D_final_dataset.drop(['video'], axis=1)

In [69]:
C3D_final_dataset.shape

(8000, 101)

# Load HMP TEST DATA

In [70]:
def read_HMP(fname):
    """Scan HMP(Histogram of Motion Patterns) features from file"""
    with open(fname) as f:
        for line in f:
            pairs=line.split()
            HMP_temp = { int(p.split(':')[0]) : float(p.split(':')[1]) for p in pairs}
    # there are 6075 bins, fill zeros
    HMP = np.zeros(6075)
    for idx in HMP_temp.keys():
        HMP[idx-1] = HMP_temp[idx]            
    return HMP

In [71]:
HMP_feature_list_test= []
video_names_list_test = []
for filename in glob.glob('/content/drive/MyDrive/HMP_test/*.txt'):
    name = ((filename.split('/')[-1]).split('.')[0]+'.webm')
    video_names_list.append(name)
    HMP_features_test = read_HMP(filename)
    HMP_feature_list_test.append(HMP_features_test)

HMP_features_Test = pd.DataFrame(np.array(HMP_feature_list_test).reshape(2000,6075))


In [72]:
HMP_features_Test.shape

(2000, 6075)

Combine HMP train and test set

In [73]:
HMP_final_dataset = pd.concat([HMP_features, HMP_features_Test],ignore_index=True)
HMP_final_dataset.head(5)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,6036,6037,6038,6039,6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052,6053,6054,6055,6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068,6069,6070,6071,6072,6073,6074,video
0,0.125563,0.024036,0.000314,0.0,0.015864,0.000358,0.0,0.0,8.6e-05,0.0,0.0,0.0,0.0,0.002795,5.4e-05,0.0,0.0,3.7e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000173,0.000459,0.0,0.000148,0.000104,0.0,0.000121,0.000551,0.0,0.000114,0.000884,2e-06,0.000116,7.7e-05,2e-06,2.7e-05,0.000136,0.0,0.0,2e-06,0.0,9.1e-05,3.5e-05,0.0,0.000163,0.000467,2e-06,1e-05,1.7e-05,0.0,0.000393,0.000279,0.0,0.000289,0.001926,0.0,8.6e-05,0.00058,0.0,video6633
1,0.007526,0.001421,6.8e-05,0.0,0.001184,0.000143,0.0,0.0,7.9e-05,0.0,0.0,0.0,0.0,0.000246,2.4e-05,0.0,0.0,4.2e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000308,0.001054,0.000751,0.000176,6.2e-05,0.0,0.000123,0.000398,8.6e-05,0.000246,0.000433,0.000446,0.000143,5.3e-05,0.0,5.3e-05,9.9e-05,9e-06,4e-06,3.3e-05,4e-06,5.1e-05,3.5e-05,0.0,6.2e-05,0.000358,3.5e-05,2.4e-05,8.3e-05,5.3e-05,0.000244,6.6e-05,0.0,8.1e-05,0.000617,9.4e-05,0.00022,0.000762,0.001224,video6632
2,0.109584,0.018978,0.000289,0.0,0.008774,0.000208,0.0,2e-06,8.8e-05,0.0,0.0,0.0,0.0,0.002046,6.1e-05,0.0,0.0,3.8e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,5.7e-05,0.000158,7.3e-05,2.1e-05,9e-06,2e-06,1.9e-05,9.5e-05,2.1e-05,1.9e-05,9e-05,7.3e-05,5e-05,2.4e-05,0.0,1.2e-05,2.1e-05,0.0,0.0,2e-06,0.0,1.7e-05,7e-06,2e-06,6.6e-05,0.000203,2.6e-05,2e-06,4e-05,7e-06,5.4e-05,4.5e-05,0.0,2.8e-05,0.000291,3.3e-05,5.2e-05,0.000258,0.000215,video6634
3,0.120431,0.013561,0.000277,0.0,0.018974,0.000913,0.0,2.4e-05,0.000713,0.0,0.0,0.0,0.0,0.002496,0.000149,0.0,1.1e-05,0.000157,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000434,0.000543,0.000412,0.000412,4.5e-05,3e-06,0.000144,0.000282,3.7e-05,0.000197,0.000218,0.000157,0.000237,2.1e-05,0.0,4e-05,5.6e-05,8e-06,5e-06,1.3e-05,1.9e-05,0.000168,1.3e-05,0.0,0.000133,0.000202,2.9e-05,2.9e-05,3.5e-05,5.9e-05,0.00111,7.5e-05,8e-06,0.000333,0.000793,0.000101,0.000588,0.000503,0.000452,video6639
4,0.005026,0.001356,5.5e-05,0.0,0.000665,2.9e-05,0.0,0.0,2.4e-05,0.0,0.0,0.0,0.0,0.000147,2e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.000996,0.001604,0.000103,0.000768,0.000215,9e-06,0.000415,0.000926,2e-05,0.000538,0.001178,5e-05,0.000518,0.000169,7e-06,0.000134,0.000169,7e-06,2.6e-05,4.6e-05,7e-06,0.000373,8.8e-05,0.0,0.000338,0.000441,2.9e-05,7e-05,0.000149,9e-06,0.000882,0.0002,9e-06,0.000559,0.001097,1.8e-05,0.000632,0.001128,6.4e-05,video6643


In [74]:
HMP_final_dataset.shape

(8000, 6076)

Combine Train and Test

In [75]:
c3d_hmp_Final  = pd.concat([C3D_final_dataset,HMP_final_dataset],axis=1)

In [76]:
c3d_hmp_Final.shape

(8000, 6177)

In [77]:
c3d_hmp_Final = c3d_hmp_Final.drop(['video'], axis=1)

In [78]:
c3d_hmp_Final.head(10)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,6035,6036,6037,6038,6039,6040,6041,6042,6043,6044,6045,6046,6047,6048,6049,6050,6051,6052,6053,6054,6055,6056,6057,6058,6059,6060,6061,6062,6063,6064,6065,6066,6067,6068,6069,6070,6071,6072,6073,6074
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.75e-06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,2e-06,0.000173,0.000459,0.0,0.000148,0.000104,0.0,0.000121,0.000551,0.0,0.000114,0.000884,2e-06,0.000116,7.7e-05,2e-06,2.7e-05,0.000136,0.0,0.0,2e-06,0.0,9.1e-05,3.5e-05,0.0,0.000163,0.000467,2e-06,1e-05,1.7e-05,0.0,0.000393,0.000279,0.0,0.000289,0.001926,0.0,8.6e-05,0.00058,0.0
1,0.010858,0.010386,0.0,0.0,0.0,0.0,2.7e-07,0.0,1e-08,3.4e-07,8e-08,1e-08,4e-06,0.000105,0.0,4e-08,1e-08,0.0,0.0,0.00013318,0.0,0.0,0.0,0.0,7.6e-07,0.0,9.8e-07,0.0,0.0,0.0,0.0,0.0,6e-08,2.1e-05,5e-08,0.0,1e-08,3e-08,2e-08,0.0,...,0.000685,0.000308,0.001054,0.000751,0.000176,6.2e-05,0.0,0.000123,0.000398,8.6e-05,0.000246,0.000433,0.000446,0.000143,5.3e-05,0.0,5.3e-05,9.9e-05,9e-06,4e-06,3.3e-05,4e-06,5.1e-05,3.5e-05,0.0,6.2e-05,0.000358,3.5e-05,2.4e-05,8.3e-05,5.3e-05,0.000244,6.6e-05,0.0,8.1e-05,0.000617,9.4e-05,0.00022,0.000762,0.001224
2,0.0002,6.5e-05,0.993807,2e-07,4.7e-07,7.3e-05,3.7e-06,0.000337,6.71e-06,2.29e-06,6.38e-06,7.34e-06,1.9e-05,7e-06,3.24e-06,3.81e-06,1.411e-05,1.04e-06,3e-06,2.3e-07,1.9e-07,0.00293762,5.2e-07,9.2e-07,1.136e-05,1.1e-05,0.00033104,9.2e-07,2.2e-05,8e-08,2.8e-05,1.3e-05,5.778e-05,3.7e-05,1.683e-05,6.86e-06,3.99e-06,8.03e-06,1.45e-06,3.8e-07,...,8.3e-05,5.7e-05,0.000158,7.3e-05,2.1e-05,9e-06,2e-06,1.9e-05,9.5e-05,2.1e-05,1.9e-05,9e-05,7.3e-05,5e-05,2.4e-05,0.0,1.2e-05,2.1e-05,0.0,0.0,2e-06,0.0,1.7e-05,7e-06,2e-06,6.6e-05,0.000203,2.6e-05,2e-06,4e-05,7e-06,5.4e-05,4.5e-05,0.0,2.8e-05,0.000291,3.3e-05,5.2e-05,0.000258,0.000215
3,0.791961,0.001496,0.00681,9.39e-06,1.22e-06,4e-06,7.452e-05,3e-06,3.659e-05,1.63e-06,8.65e-06,8.063e-05,0.00349,9e-06,1.48e-06,0.00032604,1.254e-05,2.197e-05,5.4e-05,2.084e-05,1.38e-06,3.5e-07,2.585e-05,1.62e-06,0.00045397,8e-06,6.03e-06,1.475e-05,1.5e-05,5.58e-06,2e-06,2.4e-05,0.00024217,0.006136,5.6e-06,1.2e-07,2.86e-06,4.289e-05,4.11e-06,2.33e-06,...,0.00021,0.000434,0.000543,0.000412,0.000412,4.5e-05,3e-06,0.000144,0.000282,3.7e-05,0.000197,0.000218,0.000157,0.000237,2.1e-05,0.0,4e-05,5.6e-05,8e-06,5e-06,1.3e-05,1.9e-05,0.000168,1.3e-05,0.0,0.000133,0.000202,2.9e-05,2.9e-05,3.5e-05,5.9e-05,0.00111,7.5e-05,8e-06,0.000333,0.000793,0.000101,0.000588,0.000503,0.000452
4,0.005782,0.000306,0.004011,1.007e-05,1.034e-05,2e-06,3.16e-06,3e-06,1.984e-05,5.75e-06,6.642e-05,6.69e-06,0.000301,0.004799,2.8e-07,1.669e-05,2.67e-06,1.06e-06,1.1e-05,6.526e-05,3.1e-07,3.59e-05,1.4e-06,6.26e-06,0.01750103,1.9e-05,0.01190515,4.45e-06,5e-06,2.387e-05,7e-06,3.5e-05,3.63e-06,0.66742,0.00034824,1.9e-07,1.121e-05,6.83e-06,0.00018376,1.25e-06,...,0.00016,0.000996,0.001604,0.000103,0.000768,0.000215,9e-06,0.000415,0.000926,2e-05,0.000538,0.001178,5e-05,0.000518,0.000169,7e-06,0.000134,0.000169,7e-06,2.6e-05,4.6e-05,7e-06,0.000373,8.8e-05,0.0,0.000338,0.000441,2.9e-05,7e-05,0.000149,9e-06,0.000882,0.0002,9e-06,0.000559,0.001097,1.8e-05,0.000632,0.001128,6.4e-05
5,0.001927,0.003879,0.738982,2.75e-05,2.616e-05,2.9e-05,0.00080219,0.000775,4.875e-05,3.906e-05,0.00043363,0.00033424,5.8e-05,2.6e-05,7.5e-05,0.00034563,4.777e-05,5.93e-05,4.5e-05,9.816e-05,4.38e-06,0.1439418,6.315e-05,0.00010764,9.316e-05,0.000179,9.464e-05,6.445e-05,0.000211,6.314e-05,5e-05,0.005812,0.00340365,0.000247,0.00353672,1.259e-05,5.703e-05,0.00076019,0.00232122,1.756e-05,...,0.000492,0.000372,0.000608,0.000293,0.000173,8.1e-05,1.5e-05,0.000169,0.00037,5.9e-05,0.000245,0.000449,0.00016,0.000346,9.2e-05,1.1e-05,9.4e-05,0.000114,9e-06,2e-06,1.5e-05,4e-06,0.000103,2e-05,7e-06,0.000263,0.000409,3.9e-05,2e-05,7.2e-05,3.1e-05,0.000249,9.4e-05,2e-06,0.000201,0.000444,5.7e-05,0.000462,0.000698,0.00021
6,0.001606,0.000703,0.016928,0.00026241,6.521e-05,0.000166,0.00138288,0.000564,2.549e-05,0.00011021,4.727e-05,0.000106,0.000129,0.001323,6.937e-05,0.00018763,0.00051358,0.00010552,0.000179,0.00010611,2.16e-06,0.2306214,7.12e-06,0.00013489,0.00224466,0.008552,0.00264915,4.501e-05,7.3e-05,5.949e-05,7e-05,0.030642,0.00024119,0.000154,0.09763245,0.00098085,0.01003205,0.00190817,0.00104929,8.088e-05,...,0.000239,0.000453,0.001969,0.000142,0.000215,0.000195,9e-06,0.00041,0.000933,4.6e-05,0.000331,0.000883,7.4e-05,8.8e-05,0.000134,9e-06,7.7e-05,0.000147,7e-06,9e-06,3.5e-05,0.0,0.000151,0.000131,4e-06,0.000226,0.000434,2.2e-05,6.4e-05,0.000138,1.1e-05,0.000162,0.000103,0.0,0.000212,0.00058,2e-05,0.0003,0.000679,5e-05
7,3.8e-05,0.000295,0.001251,3.434e-05,0.00019411,6.2e-05,8.84e-06,0.000499,3.923e-05,2.291e-05,0.00012775,0.01619368,1.8e-05,2.1e-05,5.5e-07,0.000342,1.09e-06,7.3e-07,0.000356,5.43e-06,4.89e-06,0.4323865,1.476e-05,0.00174491,0.00120853,0.027514,0.00195184,2.14e-06,3.6e-05,0.01447474,5e-06,0.238099,2.738e-05,2e-06,0.00642882,0.01607525,5.994e-05,8.658e-05,2.555e-05,0.00049637,...,0.000299,0.000204,0.0006,0.000252,0.000135,6.9e-05,2e-06,0.000131,0.000345,6e-05,0.000126,0.000409,0.000175,0.000128,4.4e-05,2e-06,4.2e-05,6.2e-05,7e-06,2e-06,2.2e-05,2e-06,7.5e-05,3.3e-05,0.0,8.4e-05,0.000237,2.9e-05,2.4e-05,5.5e-05,2.4e-05,0.000173,5.3e-05,2e-06,0.000117,0.000401,2.9e-05,0.000193,0.000432,0.000314
8,0.023067,0.005192,0.001679,6.392e-05,0.00133022,0.01529,0.01064836,4.5e-05,0.00058927,0.00710164,0.00340782,0.00057827,0.064231,0.009031,0.00020088,0.00048604,0.00029634,0.00013754,0.000602,0.00032946,2.154e-05,0.00077521,2.877e-05,0.00058209,0.00268821,9.7e-05,0.09716719,0.00050756,4.4e-05,0.00203105,0.001906,0.00197,7.597e-05,0.038017,0.00070354,3.516e-05,0.00016261,0.00077328,0.04882166,5.065e-05,...,0.000247,0.000229,0.000397,0.000165,7.5e-05,3.5e-05,2e-05,0.000143,0.000128,2.2e-05,0.00015,0.00022,9.7e-05,5.5e-05,2.4e-05,2e-06,1.1e-05,1.1e-05,0.0,4e-06,0.0,0.0,2.9e-05,1.3e-05,2e-06,4.2e-05,6e-05,4e-06,2.2e-05,7e-06,0.0,4e-05,1.1e-05,2e-06,6.8e-05,6.2e-05,7e-06,0.000119,0.000115,3.5e-05
9,0.005959,0.004765,0.003757,0.00057871,7.344e-05,0.000129,0.00046653,0.010266,0.00030281,0.00021965,0.00025218,0.00296373,3.2e-05,0.000592,3.26e-05,0.00146643,0.00024307,0.00018084,0.004019,0.00239096,0.00012069,0.00848849,0.00052162,0.00041646,0.0164342,0.049686,0.00147055,5.432e-05,0.001055,0.00544277,0.000173,0.024688,0.00096407,0.00088,0.01140529,0.00452216,0.00085945,0.0006573,0.00069424,0.00182311,...,8e-05,9.6e-05,0.00026,4.5e-05,5.6e-05,2.4e-05,0.0,7.8e-05,0.00016,4e-06,8.7e-05,0.00018,2.7e-05,3.8e-05,1.3e-05,0.0,2e-05,2.9e-05,0.0,4e-06,4e-06,2e-06,5.3e-05,1.6e-05,0.0,9.3e-05,6.9e-05,1.3e-05,7e-06,1.8e-05,2e-06,4.2e-05,2.2e-05,0.0,6.5e-05,0.000111,7e-06,8e-05,0.000194,5.8e-05


In [104]:
c3d_hmp_Final.to_pickle('/content/drive/MyDrive/c3d_hmp_Final.pkl')

In [105]:
c3d_hmp_Final = pd.read_pickle('/content/drive/MyDrive/c3d_hmp_Final.pkl')

# Gradient Boost on Final Test Data

In [95]:
X_train = c3d_hmp_Final[0:6000]
X_test = c3d_hmp_Final[6000:]
y_train = gdtruth[['short-term_memorability','long-term_memorability']].values



In [96]:
print('X_train', X_train.shape)
print('X_test', X_test.shape)
print('y_train', y_train.shape)

X_train (6000, 6176)
X_test (2000, 6176)
y_train (6000, 2)


In [97]:
gbr = GradientBoostingRegressor()
model = MultiOutputRegressor(estimator=gbr)
model.fit(X_train,y_train)
gbr_final = model.predict(X_test)


In [98]:
print(gbr_final)

[[0.84662461 0.72596225]
 [0.83719919 0.7492249 ]
 [0.829174   0.72897341]
 ...
 [0.85585285 0.7451828 ]
 [0.84635474 0.77069128]
 [0.87690077 0.76748629]]


# Store results into CSV

In [99]:

print(gb_test_pred)

prediction = pd.DataFrame(gbr_final, columns=['Short-term','long-term']).to_csv('/content/drive/MyDrive/final_result.csv',index=False)



[[0.84662461 0.72596225]
 [0.83719919 0.7492249 ]
 [0.829174   0.72897341]
 ...
 [0.85167743 0.7451828 ]
 [0.84635474 0.77069128]
 [0.87690077 0.76748629]]
