You will need TensorFlow 1.12.0 to run the experiments.
- Put `arzta_daten_anonym1.csv`, `arzta_daten_anonym2.csv`, `arzta_daten_anonym3.csv` and `arzta_daten_anonym4.csv` into the `data_path` folder.
- Run `python data_prep.py --data_dir data_path`
- Be sure to run the data preparation script before any of the following steps.
- Run `python pretrain_embeddings.py --data_path data_path/full.csv` (`full.csv` is the file created by the `data_prep.py` script)
- Choose the model: `swem_aver`, `swem_max`, `swem_max_features`, `gru` or `gru_feats` (see `model/model_fn.py` for detailed information)
- Create an experiment folder `exp_path`
- Put `experiments/config.yaml` into `exp_path` and specify the model params inside the yaml file.
- Run `python train.py --model_dir exp_path --data_dir data_path --architecture swem_max --use_pretrained`
This command will initialize embeddings from `word2vec_filename` (specified in `exp_path/config.yaml`) and train the model (`swem_max`).
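The steps above can be sketched as a small driver script. This is only an illustration, not part of the repository: the helper names `missing_inputs`, `pipeline_commands` and `run_pipeline` are hypothetical, while the file names and command-line flags are the ones listed in the steps. It assumes the script is started from the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Input file names as listed in the first step.
INPUT_FILES = [f"arzta_daten_anonym{i}.csv" for i in range(1, 5)]


def missing_inputs(data_dir):
    """Return the expected input CSVs that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in INPUT_FILES if not (data_dir / name).is_file()]


def pipeline_commands(data_dir, model_dir, architecture="swem_max",
                      python=sys.executable):
    """Build the three commands in the order the steps require."""
    full_csv = str(Path(data_dir) / "full.csv")  # written by data_prep.py
    return [
        [python, "data_prep.py", "--data_dir", str(data_dir)],
        [python, "pretrain_embeddings.py", "--data_path", full_csv],
        [python, "train.py", "--model_dir", str(model_dir),
         "--data_dir", str(data_dir),
         "--architecture", architecture, "--use_pretrained"],
    ]


def run_pipeline(data_dir, model_dir, architecture="swem_max"):
    """Check the inputs, then run each step and stop at the first failure."""
    missing = missing_inputs(data_dir)
    if missing:
        raise FileNotFoundError(f"missing input files in {data_dir}: {missing}")
    for cmd in pipeline_commands(data_dir, model_dir, architecture):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("data_path", "exp_path")` would then execute data preparation, embedding pretraining and training in sequence, provided `exp_path/config.yaml` has already been set up.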
After training, `exp_path/config.yaml` will be updated with ROC AUC and other metrics.
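For readers unfamiliar with the ROC AUC metric, here is a minimal pure-Python sketch of how it can be computed from labels and predicted scores using the rank-sum (Mann-Whitney U) formulation. It is not the code used by `train.py`, whose metric implementation is not shown here; the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: sequence of 0/1 ground-truth values
    scores: sequence of predicted scores, higher means "more likely positive"
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative example")

    # Walk over groups of tied scores and give each group its average 1-based rank.
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for _, y in pairs[i:j] if y == 1)
        i = j

    # U statistic of the positive class, normalised by the number of pos/neg pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)
```

The result is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.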

- `xgb.py` will train an `XGBClassifier`
- `search_hyperparams.py` will iterate over hyperparams (hard-coded inside the script) and run `train.py` multiple times
- `calculate_metrics.py` will calculate metrics
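As a rough illustration of the kind of loop a hyperparameter search performs, the sketch below creates one experiment folder per parameter combination and builds the matching `train.py` command. It is an assumption-laden example, not the contents of `search_hyperparams.py`: the grid keys `learning_rate` and `dropout`, the folder naming and the `search` function are all hypothetical, and only the `train.py` flags come from the steps above.

```python
import itertools
import subprocess
import sys
from pathlib import Path

# Hypothetical grid: the real values are hard-coded inside search_hyperparams.py.
GRID = {
    "learning_rate": [0.001, 0.0001],
    "dropout": [0.2, 0.5],
}


def search(parent_dir, data_dir, architecture="swem_max", grid=GRID, dry_run=True):
    """Create one experiment folder per combination and optionally run train.py."""
    commands = []
    names = sorted(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        exp_dir = Path(parent_dir) / "_".join(f"{n}_{v}" for n, v in params.items())
        exp_dir.mkdir(parents=True, exist_ok=True)

        # Flat "key: value" lines are valid YAML, so no YAML library is needed.
        # A real run would start from a copy of experiments/config.yaml and
        # change only these keys; this sketch writes the overrides alone.
        config = "".join(f"{n}: {v}\n" for n, v in params.items())
        (exp_dir / "config.yaml").write_text(config)

        cmd = [sys.executable, "train.py", "--model_dir", str(exp_dir),
               "--data_dir", str(data_dir), "--architecture", architecture]
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop if a training run fails
    return commands
```

With the default `dry_run=True` the function only prepares the folders and returns the commands, which makes it easy to inspect what would be run before starting any training.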