
This notebook shows the prediction results of Sec 3.2 (including Tables 3-5) in

    <Deriving information from missing data: implications for mood prediction>
    
    Note that each participant is labelled as borderline (0), healthy (1) or bipolar (2).

In [1]:
import os
import random
import time
import csv
import math
import copy
import datetime
from datetime import date

import numpy as np
import scipy
import h5py
import iisignature
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import r2_score
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import LabelEncoder

from prediction_functions import *

Load cohort dataset 

In [3]:
test_path='./all-true-colours-matlab-2017-02-27-18-38-12-nick/'
participants_list, participants_data_list, participants_time_list=loadParticipants(test_path)


Participants=make_classes(participants_data_list,
                          participants_time_list,
                          participants_list)

cohort=cleaning_sameweek_data(cleaning_same_data(Participants))

14050


In [13]:
class_dic={0: "BPD",
           1: "HC",
           2: "BD"}

**1.** Run the state prediction of Sec 3.2.1 with the following two models, both wrapped in the single function 'comprehensive_model':
    
    * missing-response-incorporated signature-based predictive model (MRSPM, level 2)
    * naive predictive model 
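
To make "signature-based, level 2" concrete, the sketch below computes the level-1 and level-2 signature terms of a piecewise-linear path with plain NumPy. It is an illustration only: the helper `signature_level2`, the two-channel toy stream, and the reading of `cumsum=True` as a cumulative sum applied to the stream before the signature is taken are all assumptions, not the code inside `comprehensive_model`.

```python
import numpy as np


def signature_level2(path):
    """Level-1 and level-2 signature terms of a piecewise-linear path.

    path: array of shape (n_points, n_channels).
    Hypothetical helper for illustration; not part of prediction_functions.
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)      # step taken on each segment
    displacement = path[:-1] - path[0]      # offset from the start before each step
    level1 = increments.sum(axis=0)         # total increment per channel
    # Iterated integral over each linear segment:
    # (X_k - X_0)_i * dX_k_j + 0.5 * dX_k_i * dX_k_j, summed over segments k.
    level2 = displacement.T @ increments + 0.5 * (increments.T @ increments)
    return level1, level2


# Toy two-channel stream (made-up values, not cohort data).
raw = np.array([[1.0, 0.0],
                [3.0, 1.0],
                [2.0, 0.0],
                [5.0, 1.0]])

# Assumption: cumsum=True means the stream is cumulatively summed first.
path = np.cumsum(raw, axis=0)

level1, level2 = signature_level2(path)
features = np.concatenate([level1, level2.ravel()])  # flat feature vector
print(features)
```

The flat vector has `d + d**2` entries for `d` channels, which is the kind of fixed-length feature a random forest can consume regardless of how many observations the path contains.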

In [22]:
if __name__ == "__main__":

 
    sample_size=10

    minlen=10

    for class_ in [0, 1, 2]:
        
        print('Class', class_dic[class_], 'with min length',minlen)
        print("____________________")
        
        # accuracy[0] holds the ASRM results and accuracy[1] the QIDS results;
        # each is a pair [naive model, MRSPM (level 2)].
        accuracy=comprehensive_model(Participants,
                                     class_,
                                     minlen=minlen,
                                     training=0.7,
                                     sample_size=sample_size,
                                     cumsum=True)
        
        print("ASRM state accuracy for naive prediction model and MRSPM (level 2):")
        print(accuracy[0])
        
        print("QIDS state accuracy for naive prediction model and MRSPM (level 2):")
        print(accuracy[1])
        print('\n')

Class BPD with min length 10
____________________
ASRM state accuracy for naive prediction model and MRSPM (level 2):
[0.6175, 0.7062]
QIDS state accuracy for naive prediction model and MRSPM (level 2):
[0.595, 0.6475]


Class HC with min length 10
____________________
ASRM state accuracy for naive prediction model and MRSPM (level 2):
[0.70125, 0.79875]
QIDS state accuracy for naive prediction model and MRSPM (level 2):
[0.71625, 0.78875]


Class BD with min length 10
____________________
ASRM state accuracy for naive prediction model and MRSPM (level 2):
[0.586, 0.652]
QIDS state accuracy for naive prediction model and MRSPM (level 2):
[0.556, 0.602]




**2.** Run the score prediction of Sec 3.2.2 with the following two models, both wrapped in the single function 'comprehensive_nomissing_model':
    
    * missing-response-incorporated signature-based predictive model (scoreMRSPM, level 2)
    * naive predictive model 
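
The score models are compared by mean absolute error (MAE). The sketch below shows how that metric is computed for a simple baseline. The baseline used here, which predicts each week's score as the previous week's score, is only an assumed stand-in for the naive model; this notebook does not spell out its definition. The scores are made-up values.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted scores."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one pair of scores is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up weekly scores for one hypothetical participant.
scores = [4, 6, 5, 9, 7]

# Assumed baseline: carry the previous week's score forward.
predicted = scores[:-1]   # forecasts for weeks 2..5
actual = scores[1:]       # what was actually reported in weeks 2..5

mae = mean_absolute_error(actual, predicted)
print(mae)
```

A lower MAE means the predicted scores lie closer to the reported ones, which is how the two values in each printed pair below should be read.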

In [29]:
if __name__ == "__main__":



    sample_size=50

    minlen=10

    for class_ in [0, 1, 2]:
        
        print('Class', class_dic[class_], 'with min length',minlen)
        print("____________________")

        # scaling=False keeps the raw predicted scores.
        accuracy,mae=comprehensive_nomissing_model(Participants,
                                                   class_,
                                                   minlen=minlen,
                                                   sample_size=sample_size,
                                                   scaling=False)
       
        print("MAE of ASRM score prediction for naive predictive model and scoreMRSPM (level 2):")
        print(mae[0])
        
        print("MAE of QIDS score prediction for naive predictive model and scoreMRSPM (level 2):")
        print(mae[1])
        print('\n')




Class BPD with min length 10
____________________
MAE of ASRM score prediction for naive predictive model and scoreMRSPM (level 2):
[2.57167095, 2.11735167]
MAE of QIDS score prediction for naive predictive model and scoreMRSPM (level 2):
[4.67122329, 3.74499667]


Class HC with min length 10
____________________
MAE of ASRM score prediction for naive predictive model and scoreMRSPM (level 2):
[1.13730578, 0.82641092]
MAE of QIDS score prediction for naive predictive model and scoreMRSPM (level 2):
[1.89942284, 1.53168044]


Class BD with min length 10
____________________
MAE of ASRM score prediction for naive predictive model and scoreMRSPM (level 2):
[3.28666201, 2.38695222]
MAE of QIDS score prediction for naive predictive model and scoreMRSPM (level 2):
[4.60083124, 3.43730616]




**3.** Run the severity prediction of Sec 3.2.2 with the following model, using the function 'comprehensive_nomissing_model':
    
    * missing-response-incorporated signature-based predictive model (scoreMRSPM, level 2)
    
 but with the parameter "scaling" of 'comprehensive_nomissing_model' set to True, which maps the raw predicted score to the corresponding severity of symptoms.
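
As an illustration of what mapping a raw score to a severity level can look like, the sketch below bins QIDS scores using the commonly cited QIDS-SR16 bands (0-5 none, 6-10 mild, 11-15 moderate, 16-20 severe, 21-27 very severe). Whether 'comprehensive_nomissing_model' uses exactly these cut-offs is an assumption; the names `QIDS_LOWER_BOUNDS`, `QIDS_LABELS` and `qids_severity` are hypothetical.

```python
from bisect import bisect_right

# Lower bound of every band above "none" (assumed cut-offs, see lead-in).
QIDS_LOWER_BOUNDS = [6, 11, 16, 21]
QIDS_LABELS = ["none", "mild", "moderate", "severe", "very severe"]


def qids_severity(score):
    """Return the severity band index (0..4) for a raw, possibly fractional, score."""
    return bisect_right(QIDS_LOWER_BOUNDS, score)


# Made-up predicted and reported scores.
predicted_scores = [4.2, 9.8, 12.0, 22.5]
reported_scores = [5, 11, 13, 19]

predicted_bands = [qids_severity(s) for s in predicted_scores]
reported_bands = [qids_severity(s) for s in reported_scores]

# Accuracy and MAE are then taken over the band indices, not the raw scores.
n = len(reported_bands)
accuracy = sum(p == r for p, r in zip(predicted_bands, reported_bands)) / n
band_mae = sum(abs(p - r) for p, r in zip(predicted_bands, reported_bands)) / n

print([QIDS_LABELS[b] for b in predicted_bands], accuracy, band_mae)
```

Because the bands are coarse, an MAE measured on band indices is much smaller than one measured on raw scores, so the two kinds of MAE are not directly comparable.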

In [38]:
if __name__ == "__main__":



    sample_size=50

    minlen=10

    for class_ in [0, 1, 2]:
        
        print('Class', class_dic[class_], 'with min length',minlen)
        print("____________________")

        # scaling=True maps the raw predicted scores to severity levels.
        accuracy,mae=comprehensive_nomissing_model(Participants,
                                                   class_,
                                                   minlen=minlen,
                                                   sample_size=sample_size,
                                                   scaling=True)
        
        print("accuracy and MAE of (ASRM) severity prediction from scoreMRSPM (level 2):")
        print([accuracy[0][-1],mae[0][-1]])
        
        print("accuracy and MAE of (QIDS) severity prediction from scoreMRSPM (level 2):")
        print([accuracy[1][-1],mae[1][-1]])
        print('\n')

Class BPD with min length 10
____________________
accuracy and MAE of (ASRM) severity prediction from scoreMRSPM (level 2):
[0.82433, 0.625]
accuracy and MAE of (QIDS) severity prediction from scoreMRSPM (level 2):
[0.697524, 0.79427]


Class HC with min length 10
____________________
accuracy and MAE of (ASRM) severity prediction from scoreMRSPM (level 2):
[0.95825, 0.19069]
accuracy and MAE of (QIDS) severity prediction from scoreMRSPM (level 2):
[0.949011, 0.13825]


Class BD with min length 10
____________________
accuracy and MAE of (ASRM) severity prediction from scoreMRSPM (level 2):
[0.74327, 1.04623]
accuracy and MAE of (QIDS) severity prediction from scoreMRSPM (level 2):
[0.76425, 0.684]


