# 'Terminal' for Model Training
This notebook trains the D-ETMs by running `main.py` with argparse command-line arguments. Training progress can be inspected both during and after training.

To keep the notebook clean while retaining an overview of how all models perform, the model outputs (alpha, theta, etc.) are saved, and the final performance together with all hyperparameters is written automatically to an Excel file called training_results.xlsx. Outputs are included in this notebook only for the models that have been discussed in more detail (V10, V12, V13, V14, V15).

Please note that the training-related argparse arguments do not need to be passed again for evaluation. They are nevertheless repeated in the evaluation commands below for printing purposes and convenience.
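As a rough illustration of how such a command line might be parsed, the sketch below builds an argparse parser with the flags that appear in the commands of this notebook. It is not the actual parser in `main.py`: the function name `build_parser`, all default values, and the `train` choice for `--mode` are assumptions made for this example.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in this notebook."""
    parser = argparse.ArgumentParser(description="D-ETM training and evaluation (sketch)")
    # The flag names are taken from the %run commands; every default is assumed.
    parser.add_argument("--version", type=str, required=True)
    parser.add_argument("--mode", type=str, default="train", choices=["train", "eval"])
    parser.add_argument("--load_from", type=str, default="")
    parser.add_argument("--num_topics", type=int, default=50)
    parser.add_argument("--rho_size", type=int, default=200)
    parser.add_argument("--emb_type", type=str, default="Word2Vec")
    parser.add_argument("--emb_path", type=str, default="")
    parser.add_argument("--theta_act", type=str, default="relu")
    parser.add_argument("--theta_hidden_size", type=int, default=800)
    parser.add_argument("--optimizer", type=str, default="adam")
    parser.add_argument("--anneal_lr", type=int, default=0)
    parser.add_argument("--batch_size", type=int, default=1000)
    return parser


# shlex.split handles the quoted checkpoint path the way a shell would.
command = (
    "--version V15 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 "
    "--anneal_lr 1 --batch_size 512 --num_topics 75 --mode eval "
    "--load_from 'Results/V15/DETM_V15_Exec_17-12-2020_09h17m'"
)
args = build_parser().parse_args(shlex.split(command))

# Print every argument, including the training-related ones that
# evaluation itself would not need.
for name, value in sorted(vars(args).items()):
    print(f"{name:>18}: {value}")
```

Passing the full set of arguments in evaluation mode means the printed configuration matches the one used for training, which is the convenience mentioned above.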

### Settings

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

In [None]:
import time
import os

In [None]:
%cd '/content/drive/My Drive/Thesis/Topic-Modeling/'

In [None]:
from IPython.display import HTML, display

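def set_css(*_args):
  """Inject CSS so that long output lines wrap instead of overflowing."""
  # Any arguments IPython passes to the callback are accepted and ignored.
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))

# Re-apply the style before every cell is executed.
get_ipython().events.register('pre_run_cell', set_css)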

## D-ETM Training and Evaluation

### V15
--> number of topics = 75

In [None]:
%run main.py --version V15 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --num_topics 75

In [None]:
%run main.py --version V15 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --num_topics 75 --mode eval --load_from 'Results/V15/DETM_V15_Exec_17-12-2020_09h17m'

### V14 
--> fastText

In [None]:
%run main.py --version V14 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --rho_size 300 --emb_type fastText --emb_path Data/Embeddings/fastText/fastText_300.txt

In [None]:
%run main.py --version V14 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --rho_size 300 --emb_type fastText --emb_path Data/Embeddings/fastText/fastText_300.txt --mode eval --load_from Results/V14/DETM_V14_Exec_16-12-2020_13h41m

### V13 
--> GloVe

In [None]:
%run main.py --version V13 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --rho_size 300 --emb_type GloVe --emb_path Data/Embeddings/GloVe/GloVe_300.txt

In [None]:
%run main.py --version V13 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --rho_size 300 --emb_type GloVe --emb_path Data/Embeddings/GloVe/GloVe_300.txt --mode eval --load_from Results/V13/DETM_V13_Exec_07-12-2020_14h13m

### V12 
--> 300-dimensional Word2Vec

In [None]:
%run main.py --version V12 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --rho_size 300 --emb_path Data/Embeddings/Word2Vec/Word2Vec_300.txt

In [None]:
%run main.py --version V12 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --rho_size 300 --emb_path Data/Embeddings/Word2Vec/Word2Vec_300.txt --mode eval --load_from Results/V12/DETM_V12_Exec_05-12-2020_13h13m

### V10
--> 200-dimensional Word2Vec

In [None]:
%run main.py --version V10 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512

In [None]:
%run main.py --version V10 --optimizer adamw --theta_act tanh --theta_hidden_size 1000 --anneal_lr 1 --batch_size 512 --mode eval --load_from 'Results/V10/DETM_V10_Exec_03-12-2020_11h16m'
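The `--load_from` paths in the evaluation cells above appear to follow the pattern `DETM_<version>_Exec_<dd-mm-yyyy>_<HHhMMm>`. Assuming that pattern holds for all runs, the sketch below parses such a directory name and picks the most recent run from a list. The names `RunInfo`, `parse_run_dir`, and `latest_run` are hypothetical helpers and not part of `main.py`.

```python
import re
from datetime import datetime
from typing import Iterable, NamedTuple


class RunInfo(NamedTuple):
    version: str
    started: datetime


# Assumed naming pattern, inferred from the paths used in this notebook.
_RUN_DIR = re.compile(
    r"DETM_(?P<version>V\d+)_Exec_(?P<stamp>\d{2}-\d{2}-\d{4}_\d{2}h\d{2}m)$"
)


def parse_run_dir(path: str) -> RunInfo:
    """Extract the model version and start time from a run directory path."""
    match = _RUN_DIR.search(path.rstrip("/"))
    if match is None:
        raise ValueError(f"Unrecognised run directory: {path!r}")
    started = datetime.strptime(match["stamp"], "%d-%m-%Y_%Hh%Mm")
    return RunInfo(match["version"], started)


def latest_run(paths: Iterable[str]) -> str:
    """Return the path whose timestamp is the most recent."""
    return max(paths, key=lambda p: parse_run_dir(p).started)


runs = [
    "Results/V10/DETM_V10_Exec_03-12-2020_11h16m",
    "Results/V12/DETM_V12_Exec_05-12-2020_13h13m",
    "Results/V13/DETM_V13_Exec_07-12-2020_14h13m",
    "Results/V14/DETM_V14_Exec_16-12-2020_13h41m",
    "Results/V15/DETM_V15_Exec_17-12-2020_09h17m",
]
print(parse_run_dir(runs[0]))
print(latest_run(runs))
```

Because the date is written day first, sorting the directory names as plain strings would not order runs chronologically across months, so the timestamp is parsed before comparing.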