Clean up and add configs+scripts #681
Conversation
W4ngatang
commented
May 5, 2019
edited
- add config files used by BoW and BERT baselines
- move edgeprobe configs to their own folder and adjust paths throughout
Some kibitzing, some suggested edits (trim GLUE tasks, mention SuperGLUE tasks). Also, could you add a .sh file with the actual runs, or at least an example for one task in a comment here?
optimizer = adam
batch_size = 16
max_epochs = 10
lr = .00001
FWIW, this seems pretty low for non-BERT.
:/
config/superglue-bow.conf
Outdated
wnli_val_interval = 100
wnli_lr = 0.00001

mrpc = {}
Delete these non-SuperGLUE tasks?
...and add the SuperGLUE ones?
Will delete, but not sure what you mean about adding (there aren't any task-specific hyperparams).
Got it. Assumed there were some that would be specified in the script, but makes sense.
dropout = 0.1 // following BERT paper
optimizer = bert_adam
batch_size = 4
max_epochs = 10
Make sure this is in the paper.
It is.
config/superglue-bert.conf
Outdated
// These override the task-specific configs set in defaults.conf
cola = {}
cola_classifier_dropout = 0.1
cola_val_interval = 100
For the smallest SuperGLUE tasks, this is pretty high—potentially over an epoch, even with our tiny batch size.
It's also a bit odd that this doesn't match the BOW setup due to the batch size diff, though not fatal.
Oh these settings aren't actually used, I forget why I copied them here.
I was planning on adding the validation intervals to the actual script, because it depends directly on batch size and I wanted to make that obvious.
The BOW point is fair though...
- For the moved conf files, make sure that any imports (i.e., defaults.conf) still work. I think they use paths relative to the location of the conf file.
Ah good catch. @pruksmhc can you take a look at this and make sure the command I wrote for WSC is correct?
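The relative-import concern above is worth illustrating. Assuming these .conf files use HOCON-style includes that resolve relative to the including file (the path and filename below are illustrative, not taken from this PR), a config moved into a subfolder needs its include path adjusted:

```hocon
// config/edgeprobe/example.conf  (hypothetical path, for illustration)
// Before the move, this file sat next to defaults.conf and could use:
//   include "defaults.conf"
// After moving into a subfolder, the include must point one level up,
// because includes resolve relative to the location of the conf file:
include "../defaults.conf"
```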
scripts/superglue-baselines.sh
Outdated
}

function wsc() {
    python main.py --config config/superglue-bert.conf --overrides "random_seed = ${seed}, cuda = ${gpuid}, run_name = wsc, pretrain_tasks = \"winograd-coreference\", target_tasks = \"winograd-coreference\", do_pretrain = 1, do_target_task_training = 0, do_full_eval = 1, batch_size = 4, val_interval = 125"
Add optimizer="adam" here since that's what we report in the paper (BERT Adam performance was acting a bit funky, I'll open up an issue).
Also, val_interval should be 139 (554/4, i.e., train size / batch size, rounded up).
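The arithmetic above (one validation pass per training epoch) can be sketched as a tiny shell helper; the variable names are illustrative, not part of the actual script:

```shell
# val_interval = train size / batch size, rounded up,
# so validation runs once per epoch. For WSC: 554 / 4 -> 139.
train_size=554
batch_size=4
val_interval=$(( (train_size + batch_size - 1) / batch_size ))
echo "val_interval = ${val_interval}"   # prints "val_interval = 139"
```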
* Add configs for bow and bert superglue baselines
* Move edgeprobe configs to a common folder and change paths throughout
* Add superglue baseline commands
* Fix relative imports in edgeprob confs
* Remove printing test results
* Update wsc run params