RES-2190: Fix labels set to -100 in finetuning tasks #517
Conversation
Good find. The solution is simple and straightforward. Note, there could be other ways. For one, we could override
```python
@@ -144,6 +144,12 @@
)

initial_finetuning_tiny_bert50k_glue = deepcopy(finetuning_bert700k_glue)
initial_finetuning_tiny_bert50k_glue.update(
```
Why `initial_finetuning_tiny_bert50k_glue` as opposed to just `finetuning_tiny_bert50k_glue`? What does the `initial` signify?
projects/transformers/run.py
Outdated
```python
@@ -52,7 +52,7 @@
from transformers.integrations import is_wandb_available
from transformers.trainer_utils import get_last_checkpoint, is_main_process

from experiments import CONFIGS
```
I think you added an extra space at the end of `CONFIGS`. Make sure you run flake8 in whatever repo you're submitting a PR for to catch these issues:

```shell
cd ~/nta/nupic.research
flake8
```
projects/transformers/run.py
Outdated
```python
test_dataset = None
if ((data_args.task_name is not None or data_args.test_file is not None)
        and training_args.do_predict):
```
You removed some whitespace here. Flake8 would have caught this as well.
projects/transformers/run_utils.py
Outdated
```python
vals, cnts = np.unique(np.array(eval_dataset['label']), return_counts=True)
logging.info(f"Label distribution for {task} before calling trainer.evaluate")
for val in range(len(vals)):
    logging.info(f"Label={vals[val]}: {cnts[val]} examples")
```
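For context, a minimal self-contained sketch of what this logging computes, using toy labels rather than the actual GLUE data (`labels` here is a stand-in for `eval_dataset["label"]`):

```python
import numpy as np

# Toy stand-in for eval_dataset["label"]; -100 marks padded examples.
labels = np.array([0, 1, 1, 0, 2, 1, -100])

# np.unique with return_counts=True yields each distinct label (sorted)
# alongside how many times it occurs.
vals, cnts = np.unique(labels, return_counts=True)
distribution = dict(zip(vals.tolist(), cnts.tolist()))
print(distribution)  # {-100: 1, 0: 2, 1: 3, 2: 1}
```

A stray `-100` showing up in this distribution is exactly the symptom this PR is tracking down.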
Did you intend to keep this logging? Do you think we need it?
projects/transformers/run_utils.py
Outdated
```python
preds = preds[np.where(ep.label_ids != -100)]
logging.info(f"Removing {1-(len(label_ids) / len(ep.label_ids)):.2%} samples "
             "from evaluation set where label == -100")
assert -100 not in ep.label_ids, "unknown source of -100 labels"
```
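As a toy illustration of the filtering above (the array names are stand-ins for the real `EvalPrediction` fields, not the PR's actual variables):

```python
import numpy as np

# Stand-ins for ep.label_ids and preds; -100 marks padding examples
# appended by the dataloader, not real labels.
label_ids_all = np.array([1, 0, -100, 1, -100])
preds_all = np.array([1, 1, 0, 1, 0])

# Keep only positions whose label is a real class, and apply the same
# mask to the predictions so the two arrays stay aligned.
keep = np.where(label_ids_all != -100)
label_ids = label_ids_all[keep]
preds = preds_all[keep]

removed = 1 - (len(label_ids) / len(label_ids_all))
print(preds.tolist())    # [1, 1, 1]
print(f"{removed:.2%}")  # 40.00%
```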
Nitpick: could you fix the punctuation? "Unknown source of -100 labels."
projects/transformers/run_utils.py
Outdated
```python
@@ -916,7 +930,7 @@ def model_init():
    "HP search with saved models not supported."
logging.info("Pretraining new model from scratch")

# Instantiate model; possibly one of our custom sparse models.
```
The indentation looks like it was changed here as well.
Also seems like a good solution to me. It is a hack, but it should work fine, with no extra computation or memory required. I left some very minor comments; please take a look.
```python
test_dataset = tokenized_datasets[
    "test_matched" if data_args.task_name == "mnli" else "test"
]
if (data_args.task_name is not None or data_args.test_file is not None):
```
I didn't understand this change. What is the difference between the old version and the new one?
No semantic difference. I just couldn't figure out how to make it PEP 8 compliant. I was getting issues about the indentation or the line length, so I just broke it up.
projects/transformers/run_utils.py
Outdated
```python
# if you want drop_last for training, this toggles it back on
if drop_last:
    trainer.args.dataloader_drop_last = True
    logging.info("Switched trainer.args.dataloader_drop_last"
```
I think we should lower the level of this logging; `logging.debug` should be enough. There is no useful information for the user here; it is more of a print for debugging purposes.
Got it. Those were mainly to verify that my fix was working properly and aren't needed going forward. I removed the bit above and a couple of related ones.
@benja-matic Do you know how this change affects the fine-tuning results? We should probably rerun those from the README (including
@mvacaporale Don't know just yet. I'm rerunning fine-tuning on baseline models this afternoon.
Regarding RES-2190: it looks like the HuggingFace `Trainer` consults `args.dataloader_drop_last` for both the train and eval loaders. The workaround is to flip that to `False` during evaluation in `compute_metrics_task` and flip it back to `True` at the end (if it was originally `True`). A few minor ignorable formatting edits will merge in as well. Finally, there's a finetuning experiment for tiny_bert50k.
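A rough sketch of the toggle-and-restore pattern described above. This is an illustration, not the PR's actual code: `evaluate_without_drop_last` and the `SimpleNamespace` trainer are hypothetical stand-ins for `compute_metrics_task` and a real HuggingFace `Trainer`, which only need to share an `args.dataloader_drop_last` attribute.

```python
from types import SimpleNamespace


def evaluate_without_drop_last(trainer, evaluate_fn):
    """Disable dataloader_drop_last during evaluation so no eval samples
    are dropped (or padded with -100 labels), then restore the flag."""
    drop_last = trainer.args.dataloader_drop_last
    trainer.args.dataloader_drop_last = False
    try:
        return evaluate_fn()
    finally:
        # Restore the original setting so training keeps its behavior.
        trainer.args.dataloader_drop_last = drop_last


# Minimal stand-in for a HuggingFace Trainer, just for illustration.
trainer = SimpleNamespace(args=SimpleNamespace(dataloader_drop_last=True))
result = evaluate_without_drop_last(trainer, lambda: "metrics")
print(result, trainer.args.dataloader_drop_last)  # metrics True
```

Using `try`/`finally` (rather than flipping the flag back inline, as the PR describes) ensures the original setting is restored even if evaluation raises.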