Clean up and add configs+scripts #681

Merged: 6 commits from superglue-conf into master, May 6, 2019
Conversation

@W4ngatang (Collaborator) commented May 5, 2019:

  • add config files used by BoW and BERT baselines
  • move edgeprobe configs to their own folder and adjust paths throughout

@sleepinyourhat (Contributor) left a comment:

Some kibitzing, some suggested edits (trim GLUE tasks, mention SuperGLUE tasks). Also, could you add a .sh file with the actual runs, or at least an example for one task in a comment here?

optimizer = adam
batch_size = 16
max_epochs = 10
lr = .00001
Contributor:

FWIW, this seems pretty low for non-BERT.

Collaborator (author):

:/

wnli_val_interval = 100
wnli_lr = 0.00001

mrpc = {}
Contributor:

Delete these non-SuperGLUE tasks?

Contributor:

...and add the SuperGLUE ones?

Collaborator (author):

Will delete, but not sure what you mean about adding (there aren't any task-specific hyperparams).

Contributor:

Got it. I assumed there were some that would be specified in the script, but that makes sense.

dropout = 0.1 // following BERT paper
optimizer = bert_adam
batch_size = 4
max_epochs = 10
Contributor:

Make sure this is in the paper.

Collaborator (author):

It is.

// These override the task-specific configs set in defaults.conf
cola = {}
cola_classifier_dropout = 0.1
cola_val_interval = 100
Contributor:

For the smallest SuperGLUE tasks, this is pretty high—potentially over an epoch, even with our tiny batch size.

Contributor:

It's also a bit odd that this doesn't match the BOW setup due to the batch size diff, though not fatal.

Collaborator (author):

Oh these settings aren't actually used, I forget why I copied them here.
I was planning on adding the validation intervals to the actual script, because it depends directly on batch size and I wanted to make that obvious.
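
A minimal sketch of what that could look like in the script (the helper name is hypothetical; 554 is the WSC train size cited further down):

# Hypothetical helper: validate once per epoch,
# i.e. val_interval = ceil(train_size / batch_size).
function set_val_interval() {
    local train_size=$1 batch_size=$2
    echo $(( (train_size + batch_size - 1) / batch_size ))
}

# Example: WSC has 554 training examples; with batch_size = 4,
# ceil(554 / 4) = 139.
val_interval=$(set_val_interval 554 4)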

Collaborator (author):

The BOW point is fair though...

@W4ngatang changed the title from "Add configs for bow and bert superglue baselines" to "Clean up and add configs+scripts" on May 5, 2019
@sleepinyourhat (Contributor) left a comment:

  • For the moved conf files, make sure that any imports (i.e., defaults.conf) still work; I think they use paths relative to the location of the conf file (see the sketch below).
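
For instance (a sketch: the file path is illustrative, and this assumes the HOCON-style include these .conf files use):

// config/edgeprobe/edgeprobe_bert.conf (illustrative path)
// The include is resolved relative to this file's own directory,
// so after the move it may need to point at "../defaults.conf".
include "defaults.conf"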

@W4ngatang (Collaborator, author):

Ah good catch.

@pruksmhc can you take a look at this and make sure the command I wrote for WSC is correct?

}

function wsc() {
python main.py --config config/superglue-bert.conf --overrides "random_seed = ${seed}, cuda = ${gpuid}, run_name = wsc, pretrain_tasks = \"winograd-coreference\", target_tasks = \"winograd-coreference\", do_pretrain = 1, do_target_task_training = 0, do_full_eval = 1, batch_size = 4, val_interval = 125"
Contributor:

Add optimizer="adam" here since that's what we report in the paper (BERT Adam performance was acting a bit funky, I'll open up an issue).

Contributor:

Also, val_interval should be 139 (train size / batch size = 554 / 4, rounded up).
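
Taken together with the optimizer comment above, the amended run would presumably look like this (a sketch, not the final committed line):

function wsc() {
python main.py --config config/superglue-bert.conf --overrides "random_seed = ${seed}, cuda = ${gpuid}, run_name = wsc, pretrain_tasks = \"winograd-coreference\", target_tasks = \"winograd-coreference\", do_pretrain = 1, do_target_task_training = 0, do_full_eval = 1, batch_size = 4, val_interval = 139, optimizer = adam"
}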

@W4ngatang merged commit 6b57c07 into master on May 6, 2019
@W4ngatang deleted the superglue-conf branch on May 6, 2019 at 04:02
phu-pmh pushed a commit that referenced this pull request on Apr 17, 2020:
* Add configs for bow and bert superglue baselines

* Move edgeprobe configs to a common folder and change paths throughout

* Add superglue baseline commands

* Fix relative imports in edgeprob confs

* Remove printing test results

* Update wsc run params
@jeswan added the jiant-v1-legacy label (Relevant to versions <= v1.3.2) on Sep 17, 2020