source: https://github.com/swabhs/open-sesame

### Prerequisites:

Some global settings are defined in `configurations/global_config.json`, including the default FrameNet version (**1.7**).

Assuming FrameNet version 1.7, the following files must be located under the **data** directory:
* fndata1.7 
* glove.6B.100d.txt (download: http://nlp.stanford.edu/data/glove.6B.100d.zip)
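
Before running any step, it can help to confirm that both resources are in place. The following is a minimal sketch, assuming the two entries listed above sit directly under `data`; the helper name `missing_prerequisites` is hypothetical and not part of open-sesame.

```python
from pathlib import Path

# Names taken from the list above; adjust them if your layout differs.
REQUIRED = ["fndata1.7", "glove.6B.100d.txt"]


def missing_prerequisites(data_dir="data", required=REQUIRED):
    """Return the required entries that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in required if not (root / name).exists()]


# An empty list means everything was found.
print(missing_prerequisites())
```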


In [None]:
# in case needed
# import nltk
# nltk.download('averaged_perceptron_tagger')

## 1. preprocess 
Generate the train, dev, and test input files used to train the parser:
```
python -m sesame.preprocess --exp_name $EXP_NAME --data_dir $DATA_DIR --version $VERSION
```
Files will be saved to: ```$OUTPUT_DIR/$EXP_NAME```

**Default** values are as follows:
- exp_name: none
- data_dir: **data/open_sesame_v1_data/fn1.7**
- version: 1.7

In [None]:
! python -m sesame.preprocess --exp_name 'original' --data_dir data/open_sesame_v1_data/fn1.7

## 2. train
```
python -m sesame.$MODEL --mode train --model_name $MODEL_NAME --exp_name $EXP_NAME --data_dir $DATA_DIR --output_dir $OUTPUT_DIR
```
- MODEL: one of targetid, frameid, argid
- MODEL_NAME: the model will be saved to ```$OUTPUT_DIR/$EXP_NAME/$MODEL_NAME```
- EXP_NAME: a subdirectory of ```$DATA_DIR``` where the data files exist; a directory with the same name is also created in ```$OUTPUT_DIR```

Optional flags and default values:
- data_dir: data/open_sesame_v1_data/fn1.7
- output_dir: logs/fn1.7
- version: 1.7
- fixseed
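
The command template and the save location described above can be sketched in Python. This is only an illustration built from the flags listed in this section; `build_sesame_command` and `model_dir` are hypothetical helpers, not part of open-sesame, and optional flags are left out when unset on the assumption that the documented defaults then apply.

```python
import os


def build_sesame_command(model, mode, model_name, exp_name,
                         data_dir=None, output_dir=None, fixseed=False):
    """Assemble the argument list for `python -m sesame.<model>`."""
    if model not in ("targetid", "frameid", "argid"):
        raise ValueError(f"unknown model: {model}")
    cmd = ["python", "-m", f"sesame.{model}",
           "--mode", mode,
           "--model_name", model_name,
           "--exp_name", exp_name]
    # Optional flags are only added when set (assumption: defaults apply otherwise).
    if data_dir is not None:
        cmd += ["--data_dir", data_dir]
    if output_dir is not None:
        cmd += ["--output_dir", output_dir]
    if fixseed:
        cmd.append("--fixseed")
    return cmd


def model_dir(output_dir, exp_name, model_name):
    """Path $OUTPUT_DIR/$EXP_NAME/$MODEL_NAME, as described above."""
    return os.path.join(output_dir, exp_name, model_name)


cmd = build_sesame_command("targetid", "train", "fn1.7-trained-targetid",
                           "original", output_dir="logs/fn1.7", fixseed=True)
print(" ".join(cmd))
print(model_dir("logs/fn1.7", "original", "fn1.7-trained-targetid"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting and line-continuation issues.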

In [None]:
! python -m sesame.targetid \
--mode='train' \
--model_name='fn1.7-trained-targetid' \
--data_dir='../parser_workdir/data/open_sesame_v1_data/fn1.7' \
--output_dir='../parser_workdir/step_logs' \
--exp_name='original' \
--num_steps=27460 \
--fixseed

## 3. test
```
python -m sesame.$MODEL --mode test --model_name $MODEL_NAME --exp_name $EXP_NAME --data_dir $DATA_DIR --output_dir $OUTPUT_DIR
```

In [None]:
! python -m sesame.targetid \
--mode='test' \
--model_name='fn1.7-trained-targetid' \
--data_dir='../parser_workdir/data/open_sesame_v1_data/fn1.7' \
--output_dir='../parser_workdir/step_logs' \
--exp_name='original'


# How to run multiple experiments


## 1. Define all your experiments in a JSON file as follows:

In [None]:
import os
import json


base_data_dir = "data/open_sesame_v1_data/fn1.7"
base_output_dir = "step_logs/fn1.7"


exps = [
    'original',
]

models = ['targetid', 'frameid', 'argid']
output_json_file = 'all_models_original'

exp_configs = []

# Build one config entry per (experiment, model) pair.
for e in exps:
    for model in models:
        exp_configs.append({
            "name": f"{e}-{model}",
            "args": {
                "model_id": model,
                "model_name": f"fn1.7-trained-{model}",
                "exp_name": e,
                "data_dir": base_data_dir,
                "output_dir": base_output_dir,
            },
        })




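print(len(exp_configs))
exp_names = [exp['name'] for exp in exp_configs]
print(','.join(exp_names))

# Create the target directory if it does not exist yet.
os.makedirs('configs_parser', exist_ok=True)
with open(f'configs_parser/{output_json_file}.json', 'w') as fp:
    json.dump(exp_configs, fp, indent=4)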
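
As a sanity check, the generated list can be validated before it is handed to the runner. A minimal sketch, assuming each entry has the shape built above (a `name` plus an `args` dict with the five keys shown); `validate_configs` is a hypothetical helper and not part of `lexsub`.

```python
# Keys every "args" dict is expected to carry (taken from the cell above).
REQUIRED_ARGS = {"model_id", "model_name", "exp_name", "data_dir", "output_dir"}


def validate_configs(configs):
    """Return a list of readable problems; an empty list means no issue was found."""
    problems = []
    seen = set()
    for i, cfg in enumerate(configs):
        name = cfg.get("name")
        if not name:
            problems.append(f"entry {i}: missing 'name'")
        elif name in seen:
            problems.append(f"entry {i}: duplicate name '{name}'")
        seen.add(name)
        missing = REQUIRED_ARGS - set(cfg.get("args", {}))
        if missing:
            problems.append(f"entry {i}: missing args {sorted(missing)}")
    return problems


example = [{"name": "original-targetid",
            "args": {"model_id": "targetid",
                     "model_name": "fn1.7-trained-targetid",
                     "exp_name": "original",
                     "data_dir": "data/open_sesame_v1_data/fn1.7",
                     "output_dir": "step_logs/fn1.7"}}]
print(validate_configs(example))
```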
        

## 2. Run experiments via the lexsub.run_parser module


The following command will train and test the models:

```
python -m lexsub.run_parser --configs_traintest/all_models_original.json --workers 3
```

Optionally, you can specify:
- --exp_names 'original-targetid'
- --mode 'test'


Possible options for **mode**: 'train', 'refresh', 'test'
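
To illustrate what selecting experiments by name amounts to, here is a rough sketch. It is not the actual `lexsub.run_parser` implementation: the function `select_experiments` is hypothetical, and treating `--exp_names` as a comma-separated string is an assumption based on how the names are printed in the config cell above.

```python
def select_experiments(configs, exp_names=None):
    """Keep only configs whose 'name' appears in a comma-separated string.

    Assumption: exp_names looks like 'original-targetid,original-frameid'.
    With no exp_names, every config is kept.
    """
    if not exp_names:
        return list(configs)
    wanted = {n.strip() for n in exp_names.split(",") if n.strip()}
    unknown = wanted - {c["name"] for c in configs}
    if unknown:
        raise ValueError(f"unknown experiment names: {sorted(unknown)}")
    return [c for c in configs if c["name"] in wanted]


configs = [{"name": f"original-{m}"} for m in ("targetid", "frameid", "argid")]
selected = select_experiments(configs, "original-targetid")
print([c["name"] for c in selected])
```

Failing loudly on an unknown name is a design choice in this sketch; it catches typos in the experiment list before any training starts.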