What do you mean by "We apply DIRECTPROBE on the training and test set separately"? #1
Comments
DirectProbe does not need to distinguish between the training and test sets. It takes a labeled dataset and produces a set of clusters; it has no notion of a training or test set.
Oh, so the current code actually probes only entities_path and embeddings_path and does not probe the test files, right?
Yes, you are correct. Each run of DirectProbe clusters only one dataset. test_entities_path is a leftover from the previous version. We do not use it in the paper "A Closer Look at How Fine-tuning Changes BERT."
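To make "separately" concrete, here is a minimal sketch of the calling pattern only. It is not DirectProbe's code: `cluster_dataset` is a hypothetical stand-in that simply groups point indices by label, and the toy data is made up. The point it illustrates is that each call sees one labeled dataset and returns one set of clusters, so the training and test sets are handled by two independent calls.

```python
from collections import defaultdict


def cluster_dataset(points, labels):
    """Hypothetical stand-in for one probing run: labeled dataset in, clusters out.

    This only groups point indices by label. It is NOT DirectProbe's
    clustering algorithm; it exists to show the calling pattern.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)
    return dict(clusters)


# Made-up toy data: two small labeled datasets.
train_points = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1)]
train_labels = ["A", "A", "B"]
test_points = [(0.1, 0.1), (0.9, 1.0)]
test_labels = ["A", "B"]

# Two independent runs; neither call knows the other dataset exists.
train_clusters = cluster_dataset(train_points, train_labels)
test_clusters = cluster_dataset(test_points, test_labels)
```

Under this reading, no train/test split happens inside a run; the two result sets are only compared afterwards.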
By the way, you may want to pull the latest version. We recently fixed a minor bug in the code.
Thanks! I will update the code to the latest version.
@flyaway1217 Do you have any solution to this problem? Or should I just wait until the clustering finishes?
I sometimes get the same warning. Usually, I just wait until the end.
Also, the running time depends on how many CPUs you have, because DirectProbe uses multiple processes for the linearity check. More CPUs usually means faster clustering.
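For readers wondering why more CPUs help, here is a rough sketch of how independent linearity checks can be spread over worker processes. It is an illustration under assumptions, not DirectProbe's implementation: `linearly_separable` and `check_pairs` are hypothetical names, and the check itself is a capped perceptron, a heuristic swapped in for brevity that can report `False` for a separable pair if it does not converge within `max_epochs`.

```python
from multiprocessing import Pool


def linearly_separable(pair, max_epochs=1000):
    """Heuristic linearity check for two groups of points (assumed, not DirectProbe's).

    Runs a perceptron with a bias term and returns True once a full pass
    over the data makes no mistakes. Returns False if that never happens
    within max_epochs passes.
    """
    group_a, group_b = pair
    samples = [(p, 1) for p in group_a] + [(p, -1) for p in group_b]
    if not samples:
        raise ValueError("both groups are empty")
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for point, target in samples:
            score = sum(w * x for w, x in zip(weights, point)) + bias
            if target * score <= 0:
                # Misclassified (or on the boundary): nudge the separator.
                weights = [w + target * x for w, x in zip(weights, point)]
                bias += target
                mistakes += 1
        if mistakes == 0:
            return True
    return False


def check_pairs(pairs, processes=2):
    """Check each pair of groups; use a process pool when processes > 1."""
    if processes <= 1:
        return [linearly_separable(pair) for pair in pairs]
    with Pool(processes=processes) as pool:
        # Each pair is checked independently, so the work splits cleanly.
        return pool.map(linearly_separable, pairs)


if __name__ == "__main__":
    separable = ([(0.0, 0.0), (0.0, 1.0)], [(2.0, 0.0), (2.0, 1.0)])
    xor_like = ([(0.0, 0.0), (1.0, 1.0)], [(0.0, 1.0), (1.0, 0.0)])
    print(check_pairs([separable, xor_like], processes=2))
```

Because each check is independent of the others, the total time shrinks roughly with the number of worker processes until the machine runs out of cores.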
Hi,
In your paper, "A Closer Look at How Fine-tuning Changes BERT", Section 4.1 says: "We apply DIRECTPROBE on the training and test set separately".
For DirectProbe, we need a training set and a test set. Does that mean you split the training set into train/test and also split the test set into train/test?
Or do you just use the training set as the test set too?
Thanks. :D