# Video Memorability Prediction
<p>
    In this notebook we will try to determine video memorabilty of users based on certain feature. We will predict score in two classes short term and long term respectively. We will use caption and C3D features for prediction. For prediction we will use Artificial nueral network.
</p>

# Functions
<p>
    This section of notebook will be used for creating all the function that we wil require.
<p>

In [3]:
# for loading data from a text file
def load_txt_file(file,col_name_1,col_name_2):
    """For current task we will use this method to load the captions into a dataframe. In future we can genralize this 
    method for all type of text files"""
    vn = []
    cap = []
    df = pd.DataFrame();
    with open(file) as f:
        for line in f:
            pairs = line.split()
            vn.append(pairs[0])
            cap.append(pairs[1])
        df[col_name_1]=vn
        df[col_name_2]=cap
    return df

#for reading c3d feature from text file
def load_C3D(file):
    with open(file) as f:
        for line in f:
            C3D =[float(item) for item in line.split()] # convert to float type, using default separator
    return C3D

# for computing spearman's correaltion coefficient
def Get_score(Y_pred,Y_true):
    '''Calculate the Spearmann"s correlation coefficient'''
    Y_pred = np.squeeze(Y_pred)
    Y_true = np.squeeze(Y_true)
    if Y_pred.shape != Y_true.shape:
        print('Input shapes don\'t match!')
    else:
        if len(Y_pred.shape) == 1:
            Res = pd.DataFrame({'Y_true':Y_true,'Y_pred':Y_pred})
            score_mat = Res[['Y_true','Y_pred']].corr(method='spearman',min_periods=1)
            print('The Spearman\'s correlation coefficient is: %.3f' % score_mat.iloc[1][0])
        else:
            for ii in range(Y_pred.shape[1]):
                Get_score(Y_pred[:,ii],Y_true[:,ii])

# Connecting Google Drive
<p> Below piece of code will load data from DCU google drive</p>

In [4]:
from google.colab import drive
import os
drive.mount('/content/drive/')
os.chdir('/content/drive/My Drive/CA684_Assignment/')

Mounted at /content/drive/


# Importing Libraries
<p>
    This section will be used for importing all the required packages for accomplishing objective
</p>


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#for c3d loading
from pathlib import Path

#for tf-idf
from sklearn.feature_extraction.text import TfidfVectorizer

#for splitting dataset
from sklearn.model_selection import train_test_split as tts
# for random forest
from sklearn.ensemble import RandomForestRegressor as rf

#for tensor flow implementation
import tensorflow as tf

# Loading Data

<p>
    In this part we will load all the features and required data that we will use for our memorabilty prediction task. Mentioned below are loaded data:
    <p>Features List</p>
    <ol>
        <li>
            <b> Caption</b>
        </li>
        <li>
            <b> C3D</b>
        </li>
    </ol>
    <p>Other Data</p>
    <ol>
        <li>
            <b> Ground Truth</b>
        </li>
    </ol>
</p>

In [5]:
#captions from text file
caption_path = './Dev-set/Captions/dev-set_video-captions.txt'
df_captions = load_txt_file(caption_path,'video','Caption')
df_captions

Unnamed: 0,video,Caption
0,video3.webm,blonde-woman-is-massaged-tilt-down
1,video4.webm,roulette-table-spinning-with-ball-in-closeup-shot
2,video6.webm,khr-gangsters
3,video8.webm,medical-helicopter-hovers-at-airport
4,video10.webm,couple-relaxing-on-picnic-crane-shot
...,...,...
5995,video7488.webm,beautiful-young-woman-in-front-of-fountains
5996,video7489.webm,focus-pull-from-molting-penguin-to-penguin-col...
5997,video7491.webm,students-walking-in-university-of-mexico
5998,video7492.webm,beautiful-black-woman-at-spa


In [6]:
#c3d features from text file
c3d_dir = Path('./Dev-set/C3D/')
c3d_Dict = {}

for file in list(c3d_dir.glob('*.txt')):
  key = file.with_suffix('.webm').name
  c3d_Dict[key] = load_C3D(file)

# it will be easy to use both features together when shape is consistent. That is why transpose is done.
df_C3D = pd.DataFrame(c3d_Dict).T
# adding first column name to facilitate merging operation in later stage.
df_C3D=pd.DataFrame(df_C3D).rename_axis('video')
df_C3D

Unnamed: 0_level_0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
video,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,Unnamed: 55_level_1,Unnamed: 56_level_1,Unnamed: 57_level_1,Unnamed: 58_level_1,Unnamed: 59_level_1,Unnamed: 60_level_1,Unnamed: 61_level_1,Unnamed: 62_level_1,Unnamed: 63_level_1,Unnamed: 64_level_1,Unnamed: 65_level_1,Unnamed: 66_level_1,Unnamed: 67_level_1,Unnamed: 68_level_1,Unnamed: 69_level_1,Unnamed: 70_level_1,Unnamed: 71_level_1,Unnamed: 72_level_1,Unnamed: 73_level_1,Unnamed: 74_level_1,Unnamed: 75_level_1,Unnamed: 76_level_1,Unnamed: 77_level_1,Unnamed: 78_level_1,Unnamed: 79_level_1,Unnamed: 80_level_1,Unnamed: 81_level_1
video6632.webm,0.010858,0.010386,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,2.700000e-07,0.000000e+00,1.000000e-08,3.400000e-07,8.000000e-08,1.000000e-08,0.000004,0.000105,0.000000e+00,4.000000e-08,1.000000e-08,0.000000,0.000000e+00,1.331800e-04,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,7.600000e-07,0.000000e+00,9.800000e-07,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000,6.000000e-08,0.000021,5.000000e-08,0.000000e+00,1.000000e-08,3.000000e-08,2.000000e-08,0.000000e+00,...,0.001623,0.970125,0.000016,0.001298,0.000032,0.000001,0.000000,0.000000e+00,1.000000e-08,2.500000e-07,0.000000e+00,0.000000e+00,5.000000e-08,0.000000e+00,1.000000e-08,1.000000e-08,4.200000e-07,0.000000e+00,3.000000e-08,2.000000e-08,0.000000,0.000000,0.000000e+00,6.000000e-08,0.000000e+00,0.000000e+00,9.000000e-08,0.000000e+00,0.000000e+00,1.100000e-07,1.400000e-07,0.000000e+00,0.000000e+00,1.700000e-07,0.000000e+00,0.000000e+00,1.000000e-08,1.300000e-06,2.600000e-06,8.000000e-08
video6634.webm,0.000200,0.000065,0.993807,2.000000e-07,4.700000e-07,7.339000e-05,3.700000e-06,3.371100e-04,6.710000e-06,2.290000e-06,6.380000e-06,7.340000e-06,0.000019,0.000007,3.240000e-06,3.810000e-06,1.411000e-05,0.000001,2.710000e-06,2.300000e-07,1.900000e-07,2.937620e-03,5.200000e-07,9.200000e-07,1.136000e-05,1.055000e-05,3.310400e-04,9.200000e-07,2.189000e-05,8.000000e-08,2.769000e-05,0.000013,5.778000e-05,0.000037,1.683000e-05,6.860000e-06,3.990000e-06,8.030000e-06,1.450000e-06,3.800000e-07,...,0.000045,0.000027,0.000013,0.000077,0.000050,0.000138,0.000012,7.190000e-06,4.100000e-07,5.440000e-06,3.400000e-07,6.870000e-06,8.399000e-05,4.070000e-06,3.970000e-06,5.900000e-07,1.285300e-04,8.800000e-07,2.228000e-05,1.105000e-05,0.000003,0.000015,1.361000e-05,4.090000e-06,3.320000e-06,4.600000e-06,4.920000e-06,5.290000e-06,6.400000e-07,2.372000e-05,6.610000e-06,1.024000e-05,2.030000e-06,5.800000e-06,1.490000e-06,1.490000e-06,1.170000e-05,1.500000e-07,8.300000e-07,1.060000e-04
video6633.webm,0.000000,0.000000,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,7.750000e-06,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000,0.000000e+00,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000,0.999985,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,0.000000e+00,7.120000e-06
video6645.webm,0.005959,0.004765,0.003757,5.787100e-04,7.344000e-05,1.288200e-04,4.665300e-04,1.026567e-02,3.028100e-04,2.196500e-04,2.521800e-04,2.963730e-03,0.000032,0.000592,3.260000e-05,1.466430e-03,2.430700e-04,0.000181,4.019180e-03,2.390960e-03,1.206900e-04,8.488490e-03,5.216200e-04,4.164600e-04,1.643420e-02,4.968638e-02,1.470550e-03,5.432000e-05,1.055230e-03,5.442770e-03,1.727700e-04,0.024688,9.640700e-04,0.000880,1.140529e-02,4.522160e-03,8.594500e-04,6.573000e-04,6.942400e-04,1.823110e-03,...,0.021621,0.000639,0.014109,0.000472,0.002461,0.004046,0.000905,9.967200e-04,2.514660e-03,1.587440e-03,1.125030e-03,1.008410e-03,2.133877e-02,6.474700e-04,1.373010e-03,9.568000e-05,5.635083e-02,1.628000e-05,5.379910e-03,6.020020e-03,0.046499,0.023942,2.065825e-02,3.349520e-03,2.637110e-03,9.376000e-05,1.170705e-02,2.290900e-04,9.789210e-03,2.764390e-03,8.052350e-03,3.991550e-03,4.136610e-03,3.670390e-02,7.665100e-04,3.692100e-04,1.251980e-02,8.422000e-05,1.159825e-02,1.155040e-03
video6643.webm,0.005782,0.000306,0.004011,1.007000e-05,1.034000e-05,1.740000e-06,3.160000e-06,3.320000e-06,1.984000e-05,5.750000e-06,6.642000e-05,6.690000e-06,0.000301,0.004799,2.800000e-07,1.669000e-05,2.670000e-06,0.000001,1.062000e-05,6.526000e-05,3.100000e-07,3.590000e-05,1.400000e-06,6.260000e-06,1.750103e-02,1.921000e-05,1.190515e-02,4.450000e-06,5.270000e-06,2.387000e-05,6.930000e-06,0.000035,3.630000e-06,0.667420,3.482400e-04,1.900000e-07,1.121000e-05,6.830000e-06,1.837600e-04,1.250000e-06,...,0.068799,0.005818,0.000535,0.001711,0.112263,0.000408,0.000035,2.667000e-04,5.395000e-05,3.899000e-05,3.619000e-05,3.206000e-04,4.550900e-04,1.510000e-06,6.200000e-06,7.510000e-06,7.971920e-03,7.000000e-08,1.873000e-05,1.511100e-04,0.000011,0.000398,1.230000e-06,4.090000e-06,4.260000e-06,4.680000e-06,2.080000e-06,1.480000e-06,1.651800e-04,2.738000e-05,2.106000e-05,1.710000e-06,3.720000e-06,6.818350e-03,4.920000e-06,5.000000e-08,2.088000e-05,1.271750e-03,4.862200e-04,1.965000e-05
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
video2119.webm,0.012528,0.000761,0.000007,3.500000e-07,1.200000e-07,8.000000e-08,5.587000e-05,9.400000e-07,1.413000e-05,1.961000e-05,1.034000e-05,6.500000e-06,0.097656,0.000685,3.090000e-06,2.959000e-05,8.920000e-06,0.000008,2.490000e-06,2.150000e-05,1.400000e-07,8.700000e-07,5.700000e-07,6.800000e-07,1.338200e-04,9.700000e-07,1.403200e-04,1.400000e-06,6.800000e-07,5.000000e-07,9.300000e-07,0.000003,1.398000e-05,0.715524,2.270000e-06,3.000000e-08,8.010000e-06,2.916100e-04,3.590000e-06,5.500000e-07,...,0.008800,0.063876,0.064314,0.026831,0.000108,0.000162,0.000002,5.500000e-07,5.198000e-05,3.760000e-06,7.740000e-06,1.630000e-06,1.182160e-03,1.127000e-05,5.500000e-07,5.640000e-06,8.994500e-04,1.000000e-08,3.988000e-05,1.270400e-04,0.000029,0.000359,8.800000e-07,1.127000e-05,6.800000e-07,4.500000e-07,8.860000e-06,2.000000e-08,7.300000e-07,5.360000e-06,2.583000e-05,1.300000e-07,6.900000e-07,3.190400e-04,3.900000e-07,3.000000e-08,1.190000e-06,5.204000e-05,3.012800e-04,4.497800e-04
video212.webm,0.522914,0.414735,0.000003,1.222000e-05,9.000000e-08,5.300000e-07,1.470900e-04,3.020000e-06,6.650000e-06,2.660000e-06,3.260000e-06,2.850000e-06,0.000358,0.000852,5.800000e-07,2.870000e-06,1.078000e-05,0.000001,2.600000e-07,1.442952e-02,1.290000e-06,2.400000e-07,3.000000e-07,1.000000e-07,9.419000e-05,1.100000e-07,5.430000e-06,5.800000e-07,2.240000e-06,6.700000e-07,1.260000e-06,0.000004,5.300000e-07,0.032278,2.014000e-05,4.000000e-08,2.358000e-05,6.925000e-05,2.773000e-05,1.330000e-06,...,0.000038,0.003485,0.000021,0.000036,0.000085,0.000038,0.000002,6.700000e-07,5.200000e-07,5.230000e-06,1.112000e-05,4.300000e-07,7.437000e-05,9.000000e-08,4.710000e-06,1.900000e-07,9.723320e-03,2.600000e-07,3.090000e-06,1.220000e-06,0.000006,0.000041,2.000000e-07,1.313000e-05,5.770000e-06,2.590000e-06,2.030000e-06,2.000000e-08,2.800000e-07,9.790000e-06,2.703000e-05,1.290000e-06,1.100000e-07,8.772000e-05,4.600000e-07,5.300000e-07,2.250000e-06,2.816000e-05,3.242000e-05,2.290000e-05
video2122.webm,0.000308,0.020098,0.000202,4.437977e-01,3.672000e-05,6.878000e-05,3.877300e-04,2.330830e-03,4.581000e-05,2.726000e-05,7.896000e-05,6.476519e-02,0.002513,0.005903,1.230000e-06,1.288200e-03,4.531100e-04,0.000158,2.119660e-03,4.290605e-02,4.303900e-04,1.165200e-04,1.563390e-03,5.867000e-05,3.502304e-02,8.022000e-05,3.005000e-05,3.052700e-04,1.276000e-05,2.730610e-03,1.320000e-05,0.021779,1.406000e-05,0.000243,3.942000e-05,4.301630e-03,8.485000e-05,7.271800e-04,4.828000e-05,3.712300e-04,...,0.009430,0.000964,0.020027,0.000054,0.003669,0.000494,0.000034,1.046390e-03,4.648500e-04,2.252170e-03,9.982570e-03,4.288000e-05,2.997000e-05,2.279400e-04,8.422200e-04,4.191000e-05,1.009720e-03,2.200000e-05,1.238100e-04,5.244900e-04,0.000204,0.001305,2.060600e-04,6.574840e-03,1.608600e-04,7.670000e-05,3.534600e-04,9.049000e-05,1.094464e-01,5.318130e-03,1.574500e-04,4.720000e-04,6.641000e-05,8.255000e-05,4.363000e-04,1.664550e-03,1.077400e-04,1.027500e-04,4.090985e-02,4.467380e-03
video2121.webm,0.010868,0.000942,0.006639,2.173930e-03,2.588400e-04,2.615000e-05,9.530000e-06,1.088000e-03,9.477000e-05,1.435600e-04,4.784000e-04,3.235300e-04,0.035501,0.002720,1.499000e-05,5.817800e-04,1.212900e-04,0.002276,1.222240e-03,8.132900e-04,4.320000e-05,1.281620e-03,8.023000e-05,1.521600e-04,2.565624e-02,4.381500e-03,4.189000e-05,1.160300e-04,7.560000e-05,5.451000e-05,1.367000e-05,0.018576,6.923000e-05,0.093578,4.310212e-02,2.725900e-04,1.775951e-02,1.043718e-02,5.947526e-02,3.519000e-05,...,0.003495,0.001021,0.000699,0.000397,0.000198,0.000061,0.000293,9.500000e-05,2.000130e-03,1.604900e-04,1.718040e-03,8.541000e-05,7.697712e-02,1.059350e-02,3.364200e-04,2.610000e-06,1.823974e-01,5.710000e-06,8.780000e-04,1.724180e-03,0.002050,0.007040,2.466000e-05,1.197400e-04,2.179000e-04,2.608000e-05,6.464800e-04,8.470000e-06,2.134300e-04,9.055000e-05,5.711080e-03,4.488400e-04,2.415000e-04,5.310532e-02,2.351200e-04,3.744000e-05,2.715500e-04,2.610970e-02,1.942440e-01,6.364000e-05


In [7]:
#converted to csv because while merging with caption header was getting distorted. Hence, failing merge operation
df_C3D.to_csv('/content/drive/My Drive/dfC3D.csv')
df_C3D = pd.read_csv('/content/drive/My Drive/dfC3D.csv')

In [53]:
#df_C3D = df_C3D.sort_values(by=['video'],ascending=False)
df_C3D.head()

Unnamed: 0,video,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,...,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
998,video999.webm,0.038641,0.002332,0.000602,9e-08,9.3e-07,3.5e-07,9e-08,2e-06,4.9e-07,4.5e-05,1e-06,5e-06,4.8e-05,0.000168,8e-08,3.6e-05,1.71e-06,1.2e-07,8e-07,2e-06,5e-08,3.877e-05,8e-08,1e-08,2.8e-05,1.048e-05,1.397e-05,6.8e-07,4.4e-07,3.6e-07,6e-08,1e-06,6.9e-07,0.899517,7.9e-05,4e-07,3e-06,3e-06,9e-06,...,0.047685,0.000115,0.000739,0.000107,0.000292,5.6e-05,6.6e-07,5.23e-06,3e-06,4.8e-05,1.07e-06,2e-06,0.000162,6.8e-07,9e-07,2.1e-07,0.001956,0.0,1.7e-05,1.1e-05,2e-06,5e-06,1e-08,5e-06,0.00012449,6.6e-07,2.43e-06,3e-08,4e-07,2.5e-05,7e-06,2.171e-05,3.08e-06,0.005029,1.01e-06,0.0,1.584e-05,1.7e-05,9e-06,7e-06
997,video998.webm,0.001812,0.015487,0.000276,0.00075734,1.066e-05,0.0002148,0.0015702,0.000611,0.00038964,9e-05,0.000445,0.021026,0.001217,0.021055,8.24e-06,0.004102,7.165e-05,1.171e-05,0.00033001,5.3e-05,7.441e-05,0.0004466,0.00019428,0.00013813,0.007079,4.25e-06,0.00055254,4.649e-05,9.028e-05,0.0025549,0.00022037,0.022218,0.00074573,0.000206,4.2e-05,3.636e-05,0.000125,0.000333,9.9e-05,...,0.009826,0.001525,0.021722,0.003581,0.062846,0.000721,7.031e-05,0.00082412,0.00078036,0.00034,0.00065885,0.000457,0.006023,9.265e-05,0.00053228,0.00016892,0.000162,2.9e-06,0.001457,0.448254,0.002445,0.001534,0.00035224,0.003156,0.00178531,3.946e-05,0.00010501,4.304e-05,0.00018468,0.001375,0.000251,0.00175431,1.035e-05,0.000663,0.00028295,1.601e-05,0.0033391,0.000226,0.012095,0.197223
999,video997.webm,0.02796,0.002547,0.102731,0.0011686,0.00041828,0.00023607,0.00391812,0.00065,0.00019489,0.000154,0.000171,0.001118,0.001181,0.082492,4.361e-05,0.000466,0.00098931,0.00018318,8.189e-05,0.001726,2.085e-05,0.04500515,0.00033953,0.0003604,0.028801,0.00069819,0.07892211,0.00028942,0.00130564,0.00026123,0.0002232,0.000989,0.00624346,0.007435,0.001419,0.00023707,0.003199,0.002955,0.000319,...,0.050067,0.015414,0.021531,0.019409,0.081935,0.022812,0.00026478,0.00208777,0.00126368,0.000282,0.00016538,0.00029,0.082006,0.00020614,0.0002198,0.00024191,0.075567,5.474e-05,0.00043,0.000612,0.000453,0.002898,0.00110078,0.000628,0.00101726,0.00010441,0.00050801,0.00099028,0.00068047,0.001243,0.000151,0.00013001,2.216e-05,0.006989,0.00035539,3.791e-05,0.0048081,4.5e-05,0.001269,0.083461
991,video995.webm,0.948379,0.024559,5e-06,1.32e-06,1e-08,8e-08,0.00021564,1e-06,4.89e-06,1e-06,1.9e-05,5e-06,0.000157,0.000217,7e-08,4e-06,1.5e-07,8e-08,1.1e-07,0.001894,3e-08,1.4e-07,4e-08,1.4e-07,1.6e-05,1e-08,6.1e-07,4.7e-07,1.52e-06,7e-08,1.76e-06,2e-06,3.43e-06,0.001102,2e-06,0.0,7e-06,2e-06,0.000762,...,0.009079,0.000496,4.6e-05,0.005001,0.005528,0.000121,3.9e-07,4e-07,6e-08,2e-06,8e-08,1e-06,7e-06,4e-08,6.1e-07,3e-08,0.000512,4e-08,2e-06,2e-06,2e-06,3e-06,5e-08,3e-06,1.2e-07,6.1e-07,4.9e-07,1e-08,5e-08,6.1e-05,1.5e-05,4e-08,3e-08,1.2e-05,1.9e-07,6e-08,2.7e-07,0.000278,0.000317,3.1e-05
990,video994.webm,0.000173,0.000676,0.000116,0.00030357,6.86e-06,0.06496482,0.00053886,0.000498,0.00022739,9e-06,0.001276,0.000321,0.001146,0.006001,1.64e-06,0.001074,1.853e-05,3.425e-05,2.817e-05,6e-05,6.16e-06,5.234e-05,2.772e-05,4.717e-05,0.000292,2.94e-06,0.00601434,0.0001443,2.099e-05,0.6625988,8.589e-05,0.000343,1.315e-05,0.00024,1e-05,2.47e-06,8e-06,0.000493,2e-06,...,0.00015,0.002289,0.000846,0.00218,0.001198,0.000387,2.459e-05,0.00023406,8.57e-06,0.000106,0.00017351,0.00104,0.00124,1.895e-05,0.00030114,2.118e-05,0.000275,6.7e-07,0.000243,0.000233,3.1e-05,0.005595,1.749e-05,0.00013,0.00074859,0.00030452,2.32e-06,3.55e-06,0.00010676,6.2e-05,8e-06,3.15e-06,1.752e-05,0.000137,0.00148323,0.00116327,5.802e-05,4e-06,0.000969,0.09848


In [54]:
# ground truth
label_path = './Dev-set/Ground-truth/'
df_ground_truth=pd.read_csv(label_path+'ground-truth.csv')
df_ground_truth

Unnamed: 0,video,short-term_memorability,nb_short-term_annotations,long-term_memorability,nb_long-term_annotations
0,video3.webm,0.924,34,0.846,13
1,video4.webm,0.923,33,0.667,12
2,video6.webm,0.863,33,0.700,10
3,video8.webm,0.922,33,0.818,11
4,video10.webm,0.950,34,0.900,10
...,...,...,...,...,...
5995,video7488.webm,0.921,33,1.000,9
5996,video7489.webm,0.909,53,0.839,31
5997,video7491.webm,0.713,33,0.818,11
5998,video7492.webm,0.954,34,1.000,16


In [55]:
# merging caption with ground truth
df_caption_ground_truth = df_captions.merge(df_ground_truth, on = 'video', how ='inner')

# Data Preprocessing

<p>
    In this part of notebook we will process loaded data in format that will be required for our analysis.
    For caption, we need to convert string into numbers. Gathering knowledge from tutorials provided, we choose implement tf-idf for caption feature transformation. For C3D feature we will use them as is. 
</p>

<ol>
    <li>
        Removing Punctuation from string
    </li>
    <li>
        Apply tf-idf(learned from: 
        <a href="https://towardsdatascience.com/tf-idf-explained-and-python-sklearn-implementation-b020c5e83275" target="_blank">https://towardsdatascience.com/tf-idf-explained-and-python-sklearn-implementation-b020c5e83275</a>)
    </li>
    <li>
        merge processed caption with C3D features
    </li>
</ol>

In [56]:
#1. removing punctuations in caption dataFrame
df_caption_ground_truth['Cleaned_Caption'] = df_caption_ground_truth['Caption'].str.replace(r'[^\w\s]+', '')
df_caption_ground_truth.head(1)

Unnamed: 0,video,Caption,short-term_memorability,nb_short-term_annotations,long-term_memorability,nb_long-term_annotations,Cleaned_Caption
0,video3.webm,blonde-woman-is-massaged-tilt-down,0.924,34,0.846,13,blondewomanismassagedtiltdown


<p>
    for setting max_features, we set a threshold value of 5 where words occuring less than 5 will not be used, as it can add unwanted dimensions to input features.
    Refrence taken from 
    <a href="https://stackoverflow.com/questions/46118910/scikit-learn-vectorizer-max-features" target="_blank">https://stackoverflow.com/questions/46118910/scikit-learn-vectorizer-max-features</a>
</p>

In [57]:
# Applying tf-idf
#1477
tfIdfVectorizer=TfidfVectorizer(use_idf=True, max_features=1477)

# did not used cleaned caption beacuse results were distorted as model was treating every entity as whole word. 
tfIdf = tfIdfVectorizer.fit_transform(df_caption_ground_truth['Caption'])

# merged refrence taken from https://stackoverflow.com/questions/18646076/add-numpy-array-as-column-to-pandas-data-frame
df_caption_ground_truth['vectorized_data'] = tfIdf.toarray().tolist()
df_caption_ground_truth.head()

Unnamed: 0,video,Caption,short-term_memorability,nb_short-term_annotations,long-term_memorability,nb_long-term_annotations,Cleaned_Caption,vectorized_data
0,video3.webm,blonde-woman-is-massaged-tilt-down,0.924,34,0.846,13,blondewomanismassagedtiltdown,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
1,video4.webm,roulette-table-spinning-with-ball-in-closeup-shot,0.923,33,0.667,12,roulettetablespinningwithballincloseupshot,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
2,video6.webm,khr-gangsters,0.863,33,0.7,10,khrgangsters,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
3,video8.webm,medical-helicopter-hovers-at-airport,0.922,33,0.818,11,medicalhelicopterhoversatairport,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
4,video10.webm,couple-relaxing-on-picnic-crane-shot,0.95,34,0.9,10,couplerelaxingonpicniccraneshot,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."


In [58]:
#merging final required features for analysis
df_C3d_Captions_final = pd.DataFrame(df_caption_ground_truth['vectorized_data'].tolist())
df_C3d_Captions_final['video'] = df_caption_ground_truth['video']
df_C3d_Captions_final = pd.merge(df_C3D,df_C3d_Captions_final, on='video')
df_C3d_Captions_final = df_C3d_Captions_final.drop(['video'], axis=1)
df_C3d_Captions_final

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,1437,1438,1439,1440,1441,1442,1443,1444,1445,1446,1447,1448,1449,1450,1451,1452,1453,1454,1455,1456,1457,1458,1459,1460,1461,1462,1463,1464,1465,1466,1467,1468,1469,1470,1471,1472,1473,1474,1475,1476
0,0.038641,0.002332,0.000602,9.000000e-08,9.300000e-07,3.500000e-07,9.000000e-08,0.000002,4.900000e-07,4.470000e-05,0.000001,0.000005,0.000048,0.000168,8.000000e-08,0.000036,1.710000e-06,1.200000e-07,8.000000e-07,0.000002,5.000000e-08,3.877000e-05,8.000000e-08,1.000000e-08,0.000028,1.048000e-05,1.397000e-05,6.800000e-07,4.400000e-07,3.600000e-07,6.000000e-08,1.120000e-06,6.900000e-07,0.899517,7.853000e-05,4.000000e-07,0.000003,0.000003,0.000009,1.900000e-07,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.001812,0.015487,0.000276,7.573400e-04,1.066000e-05,2.148000e-04,1.570200e-03,0.000611,3.896400e-04,8.995000e-05,0.000445,0.021026,0.001217,0.021055,8.240000e-06,0.004102,7.165000e-05,1.171000e-05,3.300100e-04,0.000053,7.441000e-05,4.466000e-04,1.942800e-04,1.381300e-04,0.007079,4.250000e-06,5.525400e-04,4.649000e-05,9.028000e-05,2.554900e-03,2.203700e-04,2.221770e-02,7.457300e-04,0.000206,4.223000e-05,3.636000e-05,0.000125,0.000333,0.000099,5.347500e-04,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.132285,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.027960,0.002547,0.102731,1.168600e-03,4.182800e-04,2.360700e-04,3.918120e-03,0.000650,1.948900e-04,1.536200e-04,0.000171,0.001118,0.001181,0.082492,4.361000e-05,0.000466,9.893100e-04,1.831800e-04,8.189000e-05,0.001726,2.085000e-05,4.500515e-02,3.395300e-04,3.604000e-04,0.028801,6.981900e-04,7.892211e-02,2.894200e-04,1.305640e-03,2.612300e-04,2.232000e-04,9.893400e-04,6.243460e-03,0.007435,1.418720e-03,2.370700e-04,0.003199,0.002955,0.000319,5.100000e-06,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.173126,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.466512,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.948379,0.024559,0.000005,1.320000e-06,1.000000e-08,8.000000e-08,2.156400e-04,0.000001,4.890000e-06,1.350000e-06,0.000019,0.000005,0.000157,0.000217,7.000000e-08,0.000004,1.500000e-07,8.000000e-08,1.100000e-07,0.001894,3.000000e-08,1.400000e-07,4.000000e-08,1.400000e-07,0.000016,1.000000e-08,6.100000e-07,4.700000e-07,1.520000e-06,7.000000e-08,1.760000e-06,1.770000e-06,3.430000e-06,0.001102,2.290000e-06,0.000000e+00,0.000007,0.000002,0.000762,8.000000e-08,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.225164,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.000173,0.000676,0.000116,3.035700e-04,6.860000e-06,6.496482e-02,5.388600e-04,0.000498,2.273900e-04,8.650000e-06,0.001276,0.000321,0.001146,0.006001,1.640000e-06,0.001074,1.853000e-05,3.425000e-05,2.817000e-05,0.000060,6.160000e-06,5.234000e-05,2.772000e-05,4.717000e-05,0.000292,2.940000e-06,6.014340e-03,1.443000e-04,2.099000e-05,6.625988e-01,8.589000e-05,3.427600e-04,1.315000e-05,0.000240,1.014000e-05,2.470000e-06,0.000008,0.000493,0.000002,1.812930e-03,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5995,0.046734,0.000868,0.000005,8.050000e-06,2.000000e-08,5.600000e-07,1.303000e-05,0.000004,1.300000e-06,5.990000e-06,0.000027,0.000006,0.000381,0.275751,6.400000e-07,0.000004,4.420000e-06,2.700000e-07,4.600000e-07,0.061782,2.010000e-06,7.000000e-08,1.600000e-07,2.400000e-07,0.000021,2.000000e-08,1.309000e-04,3.600000e-07,6.190000e-06,8.000000e-08,1.400000e-06,4.800000e-07,7.500000e-07,0.003979,2.120000e-06,4.000000e-08,0.000048,0.000179,0.000006,9.000000e-08,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5996,0.014036,0.000492,0.000233,2.793000e-05,6.320000e-06,1.330000e-06,5.740000e-06,0.000002,6.950000e-06,4.002000e-05,0.000020,0.000048,0.009390,0.000211,3.862000e-05,0.001238,1.984000e-05,1.536000e-05,2.010000e-06,0.000181,7.780000e-06,3.948000e-05,3.894000e-05,4.240000e-06,0.000071,1.220000e-06,4.570000e-06,5.500000e-06,4.530000e-06,7.750000e-06,2.590000e-06,1.481000e-05,6.413000e-05,0.008073,1.450000e-06,1.210000e-06,0.000015,0.003940,0.000264,2.410000e-06,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.140605,0.142837,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5997,0.468035,0.489501,0.000079,9.680000e-06,2.000000e-07,1.540000e-06,1.678300e-04,0.000010,3.437000e-05,5.200000e-07,0.000008,0.000030,0.009825,0.006902,2.630000e-06,0.000060,4.840000e-06,3.430000e-06,9.900000e-07,0.000559,4.350000e-06,7.000000e-08,4.500000e-06,1.490000e-06,0.000030,1.000000e-08,8.070000e-06,1.920000e-06,1.480000e-06,2.400000e-07,2.850000e-06,1.310000e-06,8.350000e-06,0.000146,2.600000e-07,3.000000e-08,0.000002,0.000001,0.000618,4.000000e-08,...,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5998,0.002312,0.009966,0.027439,5.850000e-06,2.155300e-04,1.394600e-04,1.109700e-04,0.000029,4.636000e-05,2.388820e-03,0.000007,0.001819,0.000069,0.000278,3.072000e-05,0.003634,1.073300e-04,2.290720e-03,8.030000e-06,0.000083,1.650100e-04,7.040000e-06,2.242490e-03,1.152100e-04,0.000096,5.674500e-04,7.576000e-05,3.734000e-05,1.468240e-03,1.172500e-04,5.400000e-06,3.619000e-05,1.406900e-04,0.000051,6.652770e-03,2.504900e-04,0.001197,0.000333,0.000121,4.010250e-03,...,0.0,0.0,0.299265,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


#### preprocessing for neural network
we will process features individually and merged togather. Then we will save these file in npz format which is a tensor freindly format. we will then use these npz file for further prediction.

In [59]:
# importing target feature
target_data = df_ground_truth[['short-term_memorability', 'long-term_memorability']].to_numpy()

#importing caption feature
input_caption_data = pd.DataFrame(df_caption_ground_truth['vectorized_data'].tolist()).to_numpy()

#importing C3D feature
input_C3D_data = df_C3D.drop('video', axis=1).to_numpy()

#importing merged(caption + C3D) feature
input_C3D_caption_data = df_C3d_Captions_final.to_numpy()

#getting shape of each entity
print('target data :', target_data.shape)
print('caption data :', input_caption_data.shape)
print('C3D data:', input_C3D_data.shape)
print('merged data :', input_C3D_caption_data.shape)

target data : (6000, 2)
caption data : (6000, 1477)
C3D data: (6000, 101)
merged data : (6000, 1578)


In [60]:
dataset_count = target_data.shape[0]
training_set_count = int(0.8 * dataset_count)
validation_set_count = int(0.1 * dataset_count)

#splitting all dataset into train, validation, test

#---------------------------------------training set-------------------------------------------#
# caption feature
train_input_caption = input_caption_data[:training_set_count]
train_target_caption = target_data[:training_set_count]

# C3D feature
train_input_C3D = input_C3D_data[:training_set_count]
train_target_C3D = target_data[:training_set_count]

# merged feature
train_input_C3D_caption = input_C3D_caption_data[:training_set_count]
train_target_C3D_caption = target_data[:training_set_count]

#---------------------------------------validation set------------------------------------------#
#caption feature
validation_input_caption = input_caption_data[training_set_count:training_set_count+validation_set_count]
validation_train_caption = target_data[training_set_count:training_set_count+validation_set_count]

validation_input_C3D = input_C3D_data[training_set_count:training_set_count+validation_set_count]
validation_train_C3D = target_data[training_set_count:training_set_count+validation_set_count]

validation_input_C3D_caption = input_C3D_caption_data[training_set_count:training_set_count+validation_set_count]
validation_train_C3D_caption = target_data[training_set_count:training_set_count+validation_set_count]

#-----------------------------------------test set----------------------------------------------#
# caption feature
test_input_caption = input_caption_data[training_set_count+validation_set_count:]
test_target_caption = target_data[training_set_count+validation_set_count:]

#C3D feature
test_input_C3D = input_C3D_data[training_set_count+validation_set_count:]
test_target_C3D = target_data[training_set_count+validation_set_count:]

#merged feature
test_input_C3D_caption = input_C3D_caption_data[training_set_count+validation_set_count:]
test_target_C3D_caption = target_data[training_set_count+validation_set_count:]

In [61]:
# saving data into npz file.
#caption feature
np.savez('/content/drive/My Drive/Caption_data_train', inputs=train_input_caption, targets=train_target_caption)
np.savez('/content/drive/My Drive/Caption_data_validation', inputs=validation_input_caption, targets=validation_train_caption)
np.savez('/content/drive/My Drive/Caption_data_test', inputs=test_input_caption, targets=test_target_caption)

#C3D feature
np.savez('/content/drive/My Drive/C3D_data_train', inputs=train_input_C3D, targets=train_target_C3D)
np.savez('/content/drive/My Drive/C3D_data_validation', inputs=validation_input_C3D, targets=validation_train_C3D)
np.savez('/content/drive/My Drive/C3D_data_test', inputs=test_input_C3D, targets=test_target_C3D)

#merged feature
np.savez('/content/drive/My Drive/C3D_caption_data_train', inputs=train_input_C3D_caption, targets=train_target_C3D_caption)
np.savez('/content/drive/My Drive/C3D_caption_data_validation', inputs=validation_input_C3D_caption, targets=validation_train_C3D_caption)
np.savez('/content/drive/My Drive/C3D_caption_data_test', inputs=test_input_C3D_caption, targets=test_target_C3D_caption)

# Modelling
<p>
    In this section of notebook, firstly we will analyse and do prediction on individuals feature (Captions and C3D). Following that we will do analysis by using merged dataframe. We will pass input features with target features to random forest model and nueral network.
    For understanding the accuracy of model we will use spearman's correlation coefficient. This will be calculated using Get_score method referenced from tutorial.
</p>

### Random Forest Regression
<b>1. Using Captions</b>

In [62]:
# defining input(x) and target(y) for model
x=pd.DataFrame(df_caption_ground_truth['vectorized_data'].tolist())
y = df_ground_truth[['short-term_memorability', 'long-term_memorability']].values

# dividing input in test and train.
x_train,x_test,y_train,y_test = tts(x,y,test_size=0.2,random_state=365)
# checking shape
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_train.shape)

# Applying Random forest
rfModel = rf(n_estimators=70,random_state=20)
rfModel.fit(x_train,y_train)

(4800, 1477)
(1200, 1477)
(4800, 2)
(4800, 2)


RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',
                      max_depth=None, max_features='auto', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=70, n_jobs=None, oob_score=False,
                      random_state=20, verbose=0, warm_start=False)

In [63]:
#testing trained model
y_pred = rfModel.predict(x_test)
Get_score(y_pred, y_test)

The Spearman's correlation coefficient is: 0.399
The Spearman's correlation coefficient is: 0.176


<b>2. Using C3D</b>

In [64]:
# defining input(x) and target(y) for model
x=df_C3D.drop('video',axis=1)
y = df_ground_truth[['short-term_memorability', 'long-term_memorability']].values

# dividing input in test and train.
x_train,x_test,y_train,y_test = tts(x,y,test_size=0.2,random_state=365)
# checking shape
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_train.shape)

# Applying Random forest
rfModel = rf(n_estimators=100,random_state=20)
rfModel.fit(x_train,y_train)

(4800, 101)
(1200, 101)
(4800, 2)
(4800, 2)


RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',
                      max_depth=None, max_features='auto', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=100, n_jobs=None, oob_score=False,
                      random_state=20, verbose=0, warm_start=False)

In [65]:
#testing trained model
y_pred = rfModel.predict(x_test)
Get_score(y_pred, y_test)

The Spearman's correlation coefficient is: -0.012
The Spearman's correlation coefficient is: -0.032


<b>3. Using C3D+Caption</b>

In [66]:
# defining input(x) and target(y) for model
x= df_C3d_Captions_final
y = df_ground_truth[['short-term_memorability', 'long-term_memorability']].values

# dividing input in test and train.
x_train,x_test,y_train,y_test = tts(x,y,test_size=0.2,random_state=365)

# checking shape
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_train.shape)

# Applying Random forest
rfModel = rf(n_estimators=100,random_state=20)
rfModel.fit(x_train,y_train)

(4800, 1578)
(1200, 1578)
(4800, 2)
(4800, 2)


RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',
                      max_depth=None, max_features='auto', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=100, n_jobs=None, oob_score=False,
                      random_state=20, verbose=0, warm_start=False)

In [67]:
y_pred = rfModel.predict(x_test)
Get_score(y_pred, y_test)

The Spearman's correlation coefficient is: 0.001
The Spearman's correlation coefficient is: 0.010


### Neural Network Implementation

In [68]:
# Importing required npz files for nueral network
#------------------------------------Caption feature------------------------------------------------
npz = np.load('/content/drive/My Drive/Caption_data_train.npz')
train_inputs_caption, train_targets_caption = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

npz = np.load('/content/drive/My Drive/Caption_data_validation.npz')
validation_inputs_caption, validation_targets_caption = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

npz = np.load('/content/drive/My Drive/Caption_data_test.npz')
test_inputs_caption, test_targets_caption = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

#-------------------------------------C3D feature------------------------------------------
npz = np.load('/content/drive/My Drive/C3D_data_train.npz')
train_inputs_c3d, train_targets_c3d = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

npz = np.load('/content/drive/My Drive/C3D_data_validation.npz')
validation_inputs_c3d, validation_targets_c3d = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

npz = np.load('/content/drive/My Drive/C3D_data_test.npz')
test_inputs_c3d, test_targets_c3d = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

#----------------------------------------------merged feature-------------------------------------
npz = np.load('/content/drive/My Drive/C3D_caption_data_train.npz')
train_inputs_c3d_caption, train_targets_c3d_caption = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

npz = np.load('/content/drive/My Drive/C3D_caption_data_validation.npz')
validation_inputs_c3d_caption, validation_targets_c3d_caption = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

npz = np.load('/content/drive/My Drive/C3D_caption_data_test.npz')
test_inputs_c3d_caption, test_targets_c3d_caption = npz['inputs'].astype(np.float), npz['targets'].astype(np.float)

<b>1. Using Caption</b>

In [69]:
print('train input ',train_inputs_caption.shape)
print('train target ',train_targets_caption.shape)
print('validation input ',validation_inputs_caption.shape)
print('validation target ',validation_targets_caption.shape)
print('test input ',test_inputs_caption.shape)
print('test target ',test_targets_caption.shape)

train input  (4800, 1477)
train target  (4800, 2)
validation input  (600, 1477)
validation target  (600, 2)
test input  (600, 1477)
test target  (600, 2)


In [70]:
# applying neural network using tensor flow 2
# Set the input and output sizes
input_size = 1477
output_size = 2
# Use same hidden layer size.
hidden_layer_size = 1500
    
# define how the model will look like
model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu', input_dim=input_size), # 1st hidden layer and input layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 3rd hidden layer
    tf.keras.layers.Dense(output_size, activation='sigmoid') # output layer
])
#optimizer = ['adam','sgd','adamax']
#loss = ['mse','mean_squared_error']
model.compile(optimizer='adamax', loss='mean_squared_error', metrics=['accuracy'])

# set the batch size
#batch_size = 50

# set a maximum number of training epochs
max_epochs = 100

# set an early stopping mechanism
early_stopping = tf.keras.callbacks.EarlyStopping(patience = 2)

model.fit(
          train_inputs_caption, 
          train_targets_caption, 
          #batch_size=batch_size, 
          epochs=max_epochs, 
          callbacks=[early_stopping],
          validation_data=(validation_inputs_caption, validation_targets_caption),
          verbose = 2
         )

Epoch 1/100
150/150 - 9s - loss: 0.0156 - accuracy: 0.7029 - val_loss: 0.0127 - val_accuracy: 0.7233
Epoch 2/100
150/150 - 8s - loss: 0.0115 - accuracy: 0.7033 - val_loss: 0.0128 - val_accuracy: 0.7233
Epoch 3/100
150/150 - 9s - loss: 0.0099 - accuracy: 0.7033 - val_loss: 0.0134 - val_accuracy: 0.7217


<tensorflow.python.keras.callbacks.History at 0x7ffb07950510>

In [71]:
#testing model
prediction = model.predict(test_inputs_caption)
print(prediction.shape)

(600, 2)


In [72]:
Get_score(prediction, test_targets_caption)

The Spearman's correlation coefficient is: 0.382
The Spearman's correlation coefficient is: 0.153


<b>2. Using C3D</b>

In [73]:
print('train input ',train_inputs_c3d.shape)
print('train target ',train_targets_c3d.shape)
print('validation input ',validation_inputs_c3d.shape)
print('validation target ',validation_targets_c3d.shape)
print('test input ',test_inputs_c3d.shape)
print('test target ',test_targets_c3d.shape)

train input  (4800, 101)
train target  (4800, 2)
validation input  (600, 101)
validation target  (600, 2)
test input  (600, 101)
test target  (600, 2)


In [74]:
# applying neural network using tensor flow 2
# Set the input and output sizes
input_size = 101
output_size = 2
# Use same hidden layer size.
hidden_layer_size = 500
    
# define how the model will look like
model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu', input_dim=input_size), # 1st hidden layer and input layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 3rd hidden layer
    tf.keras.layers.Dense(output_size, activation='sigmoid') # output layer
])
#optimizer = ['adam','sgd','adamax']
#loss = ['mse','mean_squared_error']
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

# set the batch size
#batch_size = 50

# set a maximum number of training epochs
max_epochs = 100

# set an early stopping mechanism
early_stopping = tf.keras.callbacks.EarlyStopping(patience = 3)

model.fit(
          train_inputs_c3d, 
          train_targets_c3d, 
          #batch_size=batch_size, 
          epochs=max_epochs, 
          callbacks=[early_stopping],
          validation_data=(validation_inputs_c3d, validation_targets_c3d),
          verbose = 2
         )

Epoch 1/100
150/150 - 2s - loss: 0.0183 - accuracy: 0.7033 - val_loss: 0.0147 - val_accuracy: 0.7233
Epoch 2/100
150/150 - 1s - loss: 0.0142 - accuracy: 0.7033 - val_loss: 0.0144 - val_accuracy: 0.7233
Epoch 3/100
150/150 - 1s - loss: 0.0139 - accuracy: 0.7033 - val_loss: 0.0150 - val_accuracy: 0.7233
Epoch 4/100
150/150 - 1s - loss: 0.0140 - accuracy: 0.7033 - val_loss: 0.0145 - val_accuracy: 0.7233
Epoch 5/100
150/150 - 1s - loss: 0.0140 - accuracy: 0.7033 - val_loss: 0.0146 - val_accuracy: 0.7233


<tensorflow.python.keras.callbacks.History at 0x7ffaea1f7e10>

In [75]:
#testing model
prediction = model.predict(test_inputs_c3d)
print(prediction.shape)

(600, 2)


In [76]:
Get_score(prediction, test_targets_c3d)

The Spearman's correlation coefficient is: -0.008
The Spearman's correlation coefficient is: 0.007


<b>3. Using C3D+Caption</b>

In [77]:
print('train input ',train_inputs_c3d_caption.shape)
print('train target ',train_targets_c3d_caption.shape)
print('validation input ',validation_inputs_c3d_caption.shape)
print('validation target ',validation_targets_c3d_caption.shape)
print('test input ',test_inputs_c3d_caption.shape)
print('test target ',test_targets_c3d_caption.shape)

train input  (4800, 1578)
train target  (4800, 2)
validation input  (600, 1578)
validation target  (600, 2)
test input  (600, 1578)
test target  (600, 2)


In [78]:
# applying neural network using tensor flow 2
# Set the input and output sizes
input_size = 1578
output_size = 2
# Use same hidden layer size.
hidden_layer_size = 350
    
# define how the model will look like
model = tf.keras.Sequential([
    tf.keras.layers.Dense(hidden_layer_size, activation='relu', input_dim=input_size), # 1st hidden layer and input layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 3rd hidden layer
    tf.keras.layers.Dense(output_size, activation='sigmoid') # output layer
])
#optimizer = ['adam','sgd','adamax']
#loss = ['mse','mean_squared_error']
model.compile(optimizer='adamax', loss='mean_squared_error', metrics=['accuracy'])

# set the batch size
#batch_size = 50

# set a maximum number of training epochs
max_epochs = 100

# set an early stopping mechanism
early_stopping = tf.keras.callbacks.EarlyStopping(patience = 2)

model.fit(
          train_inputs_c3d_caption, 
          train_targets_c3d_caption, 
          #batch_size=batch_size, 
          epochs=max_epochs, 
          callbacks=[early_stopping],
          validation_data=(validation_inputs_c3d_caption, validation_targets_c3d_caption),
          verbose = 2
         )

Epoch 1/100
150/150 - 2s - loss: 0.0186 - accuracy: 0.7000 - val_loss: 0.0144 - val_accuracy: 0.7233
Epoch 2/100
150/150 - 1s - loss: 0.0133 - accuracy: 0.7033 - val_loss: 0.0148 - val_accuracy: 0.7233
Epoch 3/100
150/150 - 1s - loss: 0.0125 - accuracy: 0.7033 - val_loss: 0.0146 - val_accuracy: 0.7233


<tensorflow.python.keras.callbacks.History at 0x7ffae804f210>

In [79]:
#testing model
prediction = model.predict(test_inputs_c3d_caption)
print(prediction.shape)

(600, 2)


In [80]:
Get_score(prediction, test_targets_c3d_caption)

The Spearman's correlation coefficient is: -0.013
The Spearman's correlation coefficient is: -0.033


<p>
    After running random forest model and Neural Net with combination of caption and C3D feature. We found that caption feature performed better than others in both random forest and neural network. So, we picked caption feature with random forest to run test data and generate ground truth.  
</p>

In [81]:
#loading test set
df_test_caption_set = load_txt_file('./Test-set/Captions_test/test-set-1_video-captions.txt','video','caption')
df_test_caption_set

Unnamed: 0,video,caption
0,video7494.webm,green-jeep-struggling-to-drive-over-huge-rocks
1,video7495.webm,hiking-woman-tourist-is-walking-forward-in-mou...
2,video7496.webm,close-up-of-african-american-doctors-hands-usi...
3,video7497.webm,slow-motion-of-a-man-using-treadmill-in-the-gy...
4,video7498.webm,slow-motion-of-photographer-in-national-park
...,...,...
1995,video10004.webm,astronaut-in-outer-space-against-the-backdrop-...
1996,video10005.webm,young-women-lying-on-sunbed-and-applying-sun-c...
1997,video10006.webm,doctor-talking-to-patient-using-a-tablet-to-ex...
1998,video10007.webm,businessman-sitting-on-the-beach-on-inflatable...


In [82]:
# Running test caption through tf-idf
tfIdf_test_vectorizer=TfidfVectorizer(use_idf=True, max_features=1477)
tfIdf = tfIdf_test_vectorizer.fit_transform(df_test_caption_set['caption'])
df_test_caption_set['test_vectorized_data'] = tfIdf.toarray().tolist()
df_test_caption_set.tail()

Unnamed: 0,video,caption,test_vectorized_data
1995,video10004.webm,astronaut-in-outer-space-against-the-backdrop-...,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
1996,video10005.webm,young-women-lying-on-sunbed-and-applying-sun-c...,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
1997,video10006.webm,doctor-talking-to-patient-using-a-tablet-to-ex...,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
1998,video10007.webm,businessman-sitting-on-the-beach-on-inflatable...,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
1999,video10008.webm,woman-eating-ice-cream-and-sitting-in-the-stre...,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."


In [83]:
# framing data into test and train sets
x_train = pd.DataFrame(df_caption_ground_truth['vectorized_data'].tolist())
y_train = df_caption_ground_truth[['short-term_memorability', 'long-term_memorability']].values
x_test = pd.DataFrame(df_test_caption_set['test_vectorized_data'].tolist())

In [84]:
x_test.shape

(2000, 1477)

In [85]:
# running regression using random forest
rfModel = rf(n_estimators=70,random_state=20)
rfModel.fit(x_train,y_train) 

RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',
                      max_depth=None, max_features='auto', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=70, n_jobs=None, oob_score=False,
                      random_state=20, verbose=0, warm_start=False)

In [86]:
#testing trained model
y_pred = rfModel.predict(x_test)
y_pred.shape

(2000, 2)

In [87]:
final_prediction = pd.DataFrame()
final_prediction['video'] = df_test_caption_set['video']
final_prediction['short-term_memorability'] = y_pred[:,0]
final_prediction['long-term_memorability'] = y_pred[:,1]

In [88]:
final_prediction.head()

Unnamed: 0,video,short-term_memorability,long-term_memorability
0,video7494.webm,0.845436,0.754116
1,video7495.webm,0.875253,0.768315
2,video7496.webm,0.869814,0.823184
3,video7497.webm,0.843085,0.739059
4,video7498.webm,0.871036,0.77929


In [89]:
final_prediction.tail()

Unnamed: 0,video,short-term_memorability,long-term_memorability
1995,video10004.webm,0.882039,0.753941
1996,video10005.webm,0.865112,0.787713
1997,video10006.webm,0.83385,0.713388
1998,video10007.webm,0.861453,0.802143
1999,video10008.webm,0.899958,0.799442


In [91]:
final_prediction.to_csv('/content/drive/My Drive/Vishu_Bhatnagar_20210896_predictions.csv', index=False)