meaning of DEV_FILE_NAME
#7
Comments
Hi @chris-aeviator,
a) The labeled (training) data is not split at all. In case of fewglue, this means that
b) Yes, but only for the individual models and not for the final distilled classifier. If you need predictions for the unlabeled data, you can simply set
I'm closing this issue for now. Feel free to reopen it if you have further questions.
Hi @timoschick, I am running PET for a custom task with --model_type bert. In the --data_dir I have 4 files: train.csv, test.csv, dev.csv, and unlabeled.csv. In the shell script, I have:
Now, in the output, I always get the predictions.jsonl file. UNLABELED_FILE_NAME is set to "unlabeled.csv", so it does not point to any of the other datasets. However, the predictions file seems to contain the model predictions for dev.csv: I tested it with a different number of samples in each file (train/test/dev/unlabeled), and the number of rows in predictions.jsonl matched that of the dev set. Does the predictions file (located in the final folder) show the predictions for dev.csv by default?
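The row-count comparison described above can be automated. Below is a minimal sketch, assuming the four CSV files have no header row and that predictions.jsonl holds one JSON object per line; the helper names (`count_csv_rows`, `count_jsonl_rows`, `matching_splits`) are hypothetical and not part of PET.

```python
import csv
import json
from pathlib import Path


def count_csv_rows(path, has_header=False):
    """Count rows in a CSV file, optionally not counting a header row."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = sum(1 for _ in csv.reader(f))
    return rows - 1 if has_header and rows > 0 else rows


def count_jsonl_rows(path):
    """Count non-empty lines in a JSON Lines file; fails if a line is not valid JSON."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                json.loads(line)
                count += 1
    return count


def matching_splits(data_dir, predictions_path,
                    splits=("train.csv", "test.csv", "dev.csv", "unlabeled.csv")):
    """Return the split files whose row count equals the number of predictions."""
    n_pred = count_jsonl_rows(predictions_path)
    data_dir = Path(data_dir)
    return [
        name for name in splits
        if (data_dir / name).exists() and count_csv_rows(data_dir / name) == n_pred
    ]
```

Giving each split a different size, as done in the test described above, makes the match unambiguous: `matching_splits(data_dir, predictions_path)` then returns at most one file name.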
More info on what I said earlier... @timoschick, I ran two similar experiments on the same dataset (varying the unlabeled sample size). Experiment A settings:
Experiment B settings:
My task is a classification problem with two labels, but I don't understand what role the unlabeled data plays in this case and why it is impacting the result.
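For context on why the unlabeled set can matter: the thread mentions individual models and a final distilled classifier. A common way to connect the two is to let the individual models annotate the unlabeled examples with soft labels, which the final classifier is then trained on. The sketch below illustrates that idea only; it is not taken from the PET code base, and the function names, the uniform averaging, and the `temperature` parameter are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn a list of logits into probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_labels(per_model_logits, temperature=1.0):
    """Average the logits of several models per example, then apply softmax.

    per_model_logits[m][i] is the list of class logits that model m assigns
    to unlabeled example i (hypothetical layout for this sketch).
    """
    n_models = len(per_model_logits)
    n_examples = len(per_model_logits[0])
    result = []
    for i in range(n_examples):
        n_classes = len(per_model_logits[0][i])
        mean = [
            sum(per_model_logits[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        result.append(softmax(mean, temperature))
    return result
```

Under this scheme, changing the number of unlabeled examples changes the training set of the final classifier, which is one plausible reason two otherwise similar experiments could give different results.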
Thanks for sharing this repo. When looking at the /examples dir, you split your dataset (labeled data?) to & further
Two questions arise:
a) How do you split the labeled data (distribution, e.g. are you splitting the 32 training examples from fewglue into DEV / TRAIN / TEST equally)?
b) Will UNLABELED be automatically predicted, and how is the result stored?