# Data
## File Paths
Set the data paths. The training data path (`tr_data_file`, used by `fit_model` in Section 3) only needs to be set when training data is available.

In [None]:
te_data_file = './mhd_sample_te.txt'

# Models

## 1. Logistic Regression Models
Loads the pre-trained model `./lrdialog_ovr.pkl`

In [None]:
from models import LogRegDialogModel

lr = LogRegDialogModel(lr_type='ovr')
lr.load_model(model_file="./lrdialog_ovr.pkl")

Loads test data and predicts using the loaded model. <br>
Utterance-level results will be saved to an output file.

In [None]:
lr.predict(te_data_file, verbose=1, output_filename="./utter_level_results_lrovr.txt")

Display the evaluation scores.

In [None]:
lr.result.scores

The scores can also be printed as CSV and saved to a file.

In [None]:
lr.result.print_scores(filename='./result_in_diff_metrics.csv')

## 2. HMM on top of LR
Running the HMM requires a `base_model`, which must be trained in advance and passed as an argument.

In [None]:
from models import HMMDialogModel
hmmlr = HMMDialogModel(base_model=lr)
hmmlr.load_model(model_file='hmmdialog_lrovr.pkl')

Predicts the output labels using HMM and Viterbi decoding. <br>
Also outputs the utterance-level results to a file.

In [None]:
hmmlr.predict_viterbi(te_data_file, output_filename="./utter_level_results_hmm_lrovr.txt")

In [None]:
hmmlr.result.scores
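
For intuition about the decoding step, here is a minimal sketch of Viterbi decoding over one session. It is not the code inside `HMMDialogModel`: the function name `viterbi`, the toy transition matrix, and the choice to use the base model's per-utterance probabilities directly as emission scores are all assumptions made for illustration.

```python
import numpy as np

def viterbi(emission_probs, transition, initial):
    """Return the most likely label sequence for one session (hypothetical helper).

    emission_probs: (N, T) array, row n holds the per-topic scores of utterance n
    transition:     (T, T) array, transition[i, j] = p(topic j | previous topic i)
    initial:        (T,) array of starting probabilities
    """
    n_steps, n_topics = emission_probs.shape
    eps = 1e-12  # avoids log(0)
    log_e = np.log(emission_probs + eps)
    log_t = np.log(transition + eps)

    score = np.log(initial + eps) + log_e[0]
    backptr = np.zeros((n_steps, n_topics), dtype=int)
    for t in range(1, n_steps):
        # cand[i, j]: best log-score of a path ending in topic i, then moving to j
        cand = score[:, None] + log_t
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_e[t]

    # Follow the back-pointers from the best final topic
    path = [int(score.argmax())]
    for t in range(n_steps - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]

# Toy session: 4 utterances, 2 topics, "sticky" transitions (assumed values)
probs = np.array([[0.8, 0.2], [0.45, 0.55], [0.8, 0.2], [0.3, 0.7]])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])
init = np.array([0.5, 0.5])

decoded = viterbi(probs, trans, init)
independent = probs.argmax(axis=1).tolist()  # per-utterance argmax, no HMM
print(independent, decoded)
```

In this toy case the per-utterance argmax flips between the two topics, while the decoded sequence stays on one topic because switching is penalized by the transition matrix.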

## 3. HMM on top of other output probabilities

If we have results from another, independently trained base model (e.g. output from an RNN), <br>
we can load its predictions and output probabilities and plug them into the HMM. <br>
They must be computed on the same data as `mhdtest`.
- `predictions`: A list of sessions, where each session is a list of predicted labels, one per utterance in that session. <br> Type: `list[list[int]]` or `list[np.array[int]]`
- `output_probs`: A list of sessions, where each session is a 2-d array of size `(N,T)`, where `N` is the number of utterances in the session and `T` is the number of topics (labels). Each entry is $p(topic|utterance)$. <br> Type: `list[ 2-d np.array[float] ]`.
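
As a rough illustration of the two shapes described above, the sketch below builds toy per-session probability arrays and the matching integer label lists. The variable names (`probs`, `labels`), the session lengths, and the random values are made up for the example; how the real files are produced depends on the base model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_topics = 3              # T: number of topic labels (toy value)
session_lengths = [4, 2]  # N for each of two toy sessions

# One (N, T) float array per session; each row is a distribution over topics.
probs = [rng.dirichlet(np.ones(n_topics), size=n) for n in session_lengths]

# One integer label per utterance: the most probable topic in each row.
labels = [p.argmax(axis=1) for p in probs]

# Sanity checks on the expected structure
for p, y in zip(probs, labels):
    assert p.shape == (len(y), n_topics)
    assert np.allclose(p.sum(axis=1), 1.0)
```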


After loading the predictions and probabilities, the base model object should have the following attributes,
and it can then be passed as the `base_model` argument to `HMMDialogModel`:
- `base_model.result`
- `base_model.result.output_prob`
- `base_model.model_info`

In [None]:
from models import DialogModel, HMMDialogModel

In [None]:
predfile = './sample_pred.pkl'
outprobfile = './sample_prob.pkl'

The sample files do not actually come from an RNN (they hold the LR results), but for this example we treat them as if they were loaded from an RNN model.

In [None]:
rnn = DialogModel()
rnn.load_results(te_data_file, model_info="RNN", marginals=None, predictions=predfile, output_probs=outprobfile)

In [None]:
hmmrnn = HMMDialogModel(base_model=rnn)
# tr_data_file must point to the training data; it is not defined earlier in this notebook
hmmrnn.fit_model(tr_data_file)
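
What `fit_model` does internally is not shown in this notebook. One common way to fit the HMM part is to count topic-to-topic transitions in the labeled training sessions and normalize each row. The sketch below assumes that approach; `estimate_transitions` and the add-one smoothing are hypothetical choices, not the library's API.

```python
import numpy as np

def estimate_transitions(label_sessions, n_topics, smoothing=1.0):
    """Row-normalized transition counts with additive smoothing (hypothetical helper)."""
    counts = np.full((n_topics, n_topics), smoothing, dtype=float)
    for labels in label_sessions:
        # Count each consecutive (previous topic, current topic) pair
        for prev, cur in zip(labels[:-1], labels[1:]):
            counts[prev, cur] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Two toy training sessions with gold topic labels (made-up data)
sessions = [[0, 0, 1, 1], [0, 1, 1, 1]]
trans = estimate_transitions(sessions, n_topics=2)
print(trans)
```

Each row of `trans` sums to 1, so it can be read as $p(topic_t | topic_{t-1})$.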

In [None]:
hmmrnn.predict_viterbi(te_data_file, output_filename="./utter_level_results_hmm_rnn.txt")

Since the loaded results actually come from the LR model, the scores here should match those in Section 2.

In [None]:
hmmrnn.result.scores