## Week 8 - Session 1: Hidden Markov Model (HMM)
 - Explore an intelligent tutoring system called Deep Thought.
 - Deep Thought chooses between two actions, 1) Problem Solving (PS) and 2) Worked Example (WE), based on the student's current state, to maximize learning gain.
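
As background for the model used below, here is a minimal sketch of how a two-state HMM turns a sequence of correct/incorrect answers into a likelihood and a belief about the hidden state. It does not use the course's `HMM` module: `forward_filter` is a hypothetical helper, and the emission matrix `E` is made up for illustration. `Pi` and `T` are the initial values used later in this notebook, with state 0 = not learned and state 1 = learned.

```python
import numpy as np

def forward_filter(obs, Pi, T, E):
    """Scaled forward pass for a discrete HMM (illustrative helper).

    Returns the log-likelihood of `obs` and the filtered distribution
    P(state | all observations so far) after the last observation.
    E[s, o] is assumed to be P(observation o | state s).
    """
    alpha = Pi * E[:, obs[0]]            # joint of first state and first observation
    log_lik = 0.0
    for t, o in enumerate(obs):
        if t > 0:
            # propagate one step through T, then weight by the emission
            alpha = (alpha @ T) * E[:, o]
        scale = alpha.sum()              # P(obs_t | earlier observations)
        log_lik += np.log(scale)
        alpha = alpha / scale            # normalise to avoid underflow
    return log_lik, alpha

Pi = np.array([0.5, 0.5])                        # initial state prob. (from this notebook)
T = np.array([[0.86, 0.14], [0.09, 0.91]])       # transition prob. (from this notebook)
E = np.array([[0.8, 0.2], [0.2, 0.8]])           # hypothetical emission prob.

obs = [0, 0, 1, 1, 1]                            # 0: incorrect, 1: correct
log_lik, posterior = forward_filter(obs, Pi, T, E)
print("log-likelihood:", log_lik)
print("P(not learned), P(learned):", posterior)
```

Baum-Welch training repeatedly adjusts `Pi`, `T`, and `E` to increase this kind of log-likelihood summed over the training sequences, which is the quantity printed per epoch in the training log.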

In [1]:
__author__ = 'yemao'
from HMM import hmm
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression, LinearRegression
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, recall_score, roc_auc_score, mean_squared_error

### Data

In [2]:
np.random.seed(1000)
folder = "assignment_data/"             # Specify your folder here
qlg_y = np.load(folder + "y_qlg.npy")   # target label for learning gain
post_y = np.load(folder + "post_test.npy", allow_pickle=True)   # target value for post-test score
qlg_test_actual, post_test_actual = [], []

### Training (5-CV)

In [3]:
%%time
kf = KFold(n_splits=5, shuffle=True)
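qlg_train_pred, post_train_pred = [], []
qlg_train_actual, post_train_actual = [], []
qlg_test_pred, post_test_pred = [], []
qlg_test_actual, post_test_actual = [], []   # reset here as well so re-running this cell does not accumulate labels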

numkc = 7      # Q2-1. Change the number of KCs (knowledge components) to match the data set

for train_index, test_index in kf.split(qlg_y):
    print("======================================================")
    qlg_y_train, qlg_y_test = qlg_y[train_index], qlg_y[test_index]
    post_y_train, post_y_test = post_y[train_index], post_y[test_index]

    # symbols here refer to the two possible observations:
    # 1: correct, 0: incorrect
    symbols = [['0', '1']]
    
    #------------------------------------------
    # Q2-2. Explore different parameters: 
    #   1) Pi : Initial state prob.
    #   2) T : State transition prob.
    # -----------------------------------------

    #h = hmm(2, Pi=np.array([0.5, 0.5]), T=np.array([[0.86, 0.14], [0.09, 0.91]]), obs_symbols=symbols)
    
    
    nlg_train = [[] for x in range(numkc)]
    nlg_test = [[] for x in range(numkc)]
    
    for i in range(numkc):
        print("-----------------------------------------")
        print(" KC: {}".format(i))
        h = hmm(2, Pi=np.array([0.5, 0.5]), T=np.array([[0.86, 0.14], [0.09, 0.91]]), obs_symbols=symbols)
        
        X = np.load(folder + "perf_kc" + str(i+1) + ".npy", allow_pickle=True)
        X_train, X_test = X[train_index], X[test_index]

        train = [each for each in X_train if each]
        test = [each for each in X_test if each]
        
        
        if train and test:
            h.baum_welch(train, debug=False)        # Baum-Welch algorithm : training part
        
        
        nlg_train[i].extend(h.predict_nlg(X_train))
        nlg_test[i].extend(h.predict_nlg(X_test))

    print(len(nlg_train), len(nlg_train[0]))
    nlg_train = np.transpose(nlg_train)
    nlg_test = np.transpose(nlg_test)

    nlg_train = pd.DataFrame(nlg_train).fillna(value=0)
    nlg_test = pd.DataFrame(nlg_test).fillna(value=0)

    # ---------------------------------------------------------
    # logistic regression for learning gain prediction
    logreg = LogisticRegression()                   
    logreg.fit(nlg_train, qlg_y_train)
    predict = logreg.predict(nlg_train)
    qlg_train_pred.extend([each for each in predict])
    qlg_train_actual.extend(qlg_y_train)

    predict = logreg.predict(nlg_test)
    # print (logreg.predict_proba(nlg_test))
    qlg_test_pred.extend([each for each in predict])
    qlg_test_actual.extend(qlg_y_test)


    lg = LinearRegression()              # linear regression for post-test scores prediction
    lg.fit(nlg_train, post_y_train)
    predict = lg.predict(nlg_train)
    post_train_pred.extend([each for each in predict])
    post_train_actual.extend(post_y_train)

    predict = lg.predict(nlg_test)
    post_test_pred.extend([each for each in predict])
    post_test_actual.extend(post_y_test)

-----------------------------------------
 KC: 0


************************
HMM Initialization
************************

1) Numerber of hidden states:2
2) Number of observable symbols:[['0', '1']]	(0: wrong, 1: correct)
3) The symbol mapping in HMM:[{'0': 0, '1': 1}]	(0: Not learned, 1: Learned)
4) The transition proability matrix T:
[[0.86 0.14]
 [0.09 0.91]]
5) The emission probability matrix E:
[array([[0.51479232, 0.48520768],
       [0.48570504, 0.51429496]])]
6) The initial state probability Pi: 
[0.5 0.5]

Epoch 0
-568.7291398968408
Epoch 10
-321.11940400362545
Epoch 20
-319.2228187712828
Epoch 30
-317.06799377778214
Epoch 40
-315.14243950245964
Epoch 50
-313.40541069958545
Epoch 60
-312.6209314296527
Epoch 70
-312.42450122070926
Epoch 80
-312.3763035951694
The loglikelihood improvement falls below threshold, training terminates at epoch 88! 


************************
After training
************************

1) Numerber of hidden states:2
2) Number of observable symbols:[['0', 

### Results

In [4]:
# test code data ###
# qlg_test_pred, post_test_pred = [0]*len(qlg_y), [0.5]*len(post_y)
# qlg_test_actual, post_test_actual = qlg_y, post_y
print(" ")
print("<<<<<<< student learning gain")
print("Training accuracy:" + str(accuracy_score(qlg_train_actual, qlg_train_pred)))
print("Accuracy: " + str(accuracy_score(qlg_test_actual, qlg_test_pred)))


# Flip P and N here because we care about the low learning gain group: qlg = 0
qlg_test_actual = [1 if each == 0 else 0 for each in qlg_test_actual]
qlg_test_pred = [1 if each == 0 else 0 for each in qlg_test_pred]


print("f1_score: " + str(f1_score(qlg_test_actual, qlg_test_pred)))
print("Recall: " + str(recall_score(qlg_test_actual, qlg_test_pred)))
print("AUC: " + str(roc_auc_score(qlg_test_actual, qlg_test_pred)))
print("Confusion Matrix: ")
print(confusion_matrix(qlg_test_actual, qlg_test_pred))
print(" ")
print("<<<<<<< student modeling")
print("Training MSE: ", mean_squared_error(post_train_actual, post_train_pred))
print("Test MSE: ", mean_squared_error(post_test_actual, post_test_pred))

 
<<<<<<< student learning gain
Training accuracy:0.6305970149253731
Accuracy: 0.5746268656716418
f1_score: 0.6415094339622641
Recall: 0.6891891891891891
AUC: 0.5612612612612612
Confusion Matrix: 
[[26 34]
 [23 51]]
 
<<<<<<< student modeling
Training MSE:  1.3146155446625902
Test MSE:  1.4362419524563843
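
As a sanity check, the classification metrics above can be recomputed by hand from the printed confusion matrix. The sketch below assumes scikit-learn's layout (rows are actual classes, columns are predicted classes) and that class 1 is the low learning gain group after the flip. It also shows that, because `roc_auc_score` was given hard 0/1 predictions rather than probabilities, the reported AUC equals the balanced accuracy.

```python
import numpy as np

# Confusion matrix printed above: [[TN, FP], [FN, TP]]
cm = np.array([[26, 34],
               [23, 51]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
recall = tp / (tp + fn)                   # true positive rate
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)              # true negative rate
balanced_accuracy = (recall + specificity) / 2   # equals AUC for hard 0/1 predictions

print("Accuracy:", accuracy)
print("Recall:", recall)
print("Precision:", precision)
print("f1_score:", f1)
print("Balanced accuracy:", balanced_accuracy)
```

To get a threshold-independent AUC, one option is to pass the scores from `logreg.predict_proba` (already hinted at in the commented-out line of the training cell) instead of the predicted labels.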


### Report your observations on
 - the HMM parameters after training compared with the initial parameters (select one KC in the first fold of the 5-fold CV).
 - changing the initial parameters of the HMM.
 - comparing parameters of HMMs trained for different KCs.
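
One possible way to make these comparisons concrete is to reduce each transition matrix to a few interpretable numbers. The sketch below is only a suggestion: `summarize_transitions` is a hypothetical helper, not part of the `HMM` module, and it is applied here to the initial `T` from the training cell. The same call could be made on a trained matrix, assuming you can read it from the "After training" printout.

```python
import numpy as np

def summarize_transitions(T):
    """Return expected dwell times and the stationary distribution of T.

    Assumes T is a row-stochastic matrix with no diagonal entry equal to 1.
    """
    T = np.asarray(T, dtype=float)
    # Expected number of consecutive steps spent in state i: 1 / (1 - T[i, i])
    dwell = 1.0 / (1.0 - np.diag(T))
    # Stationary distribution: left eigenvector of T for eigenvalue 1
    eigvals, eigvecs = np.linalg.eig(T.T)
    v = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    stationary = v / v.sum()
    return dwell, stationary

T_init = np.array([[0.86, 0.14], [0.09, 0.91]])   # initial T from the training cell
dwell, stationary = summarize_transitions(T_init)
print("Expected dwell time (not learned, learned):", dwell)      # about [7.14, 11.11]
print("Stationary distribution (not learned, learned):", stationary)  # about [0.391, 0.609]
```

Comparing these summaries before and after training, or across KCs, shows how much the off-diagonal entries moved: `T[0, 1]` is the probability of going from not learned to learned, and `T[1, 0]` the probability of the reverse.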