Pretty low accuracy for yahoo answers mixtext #1
Comments
And one thing we forgot to mention about pre-processing Yahoo! Answers: we concatenate the question title, the question content, and the best answer together to form the text to be classified. (Just uploaded the pre-processing code for Yahoo! Answers.) We could also provide a link to download the processed data later.
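A minimal sketch of that concatenation step (not the repo's exact script): it assumes the standard Yahoo! Answers CSV row layout of `(label, question title, question content, best answer)` and a hypothetical helper name `build_text`.

```python
# Hypothetical sketch of the described pre-processing: join the question
# title, question content, and best answer into one text to classify.
def build_text(row):
    label, title, content, answer = row
    # Skip empty fields -- in the raw CSV many question contents are empty.
    parts = [p for p in (title, content, answer) if p]
    return int(label), " ".join(parts)

row = ["5", "Why is the sky blue?", "", "Rayleigh scattering."]
label, text = build_text(row)
# text == "Why is the sky blue? Rayleigh scattering."
```

Using only the question content, as in the original dataset, would leave many of these texts empty, which matches the accuracy drop described above.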
If you use the original dataset, the text to be classified is only the question content (many of which are empty). That is probably the reason.
Here is the link to download the data: https://drive.google.com/file/d/1IoX9dp_RUHwIVA2_kJgHCWBOLHsV9V7A/view?usp=sharing
The reason is the pre-processing of the Yahoo! Answers dataset. We tested with
The performance is:
Hi,
Firstly, I quite liked the paper and enjoyed reading it. I tried using this code on the Yahoo! Answers dataset, but unfortunately the best accuracy does not cross 0.24. Since I have 2 GPUs at my disposal, I set --batch-size to 2 and --batch-size-u to 4 (I also tried batch size 3 with batch-size-u 4) and --val-iteration to 2000. So I was wondering whether you could let me know if I should change other parameters to make it work. (I understand that reducing the batch size should have some impact, but I am not sure the impact can be this significant, so I thought it might have to do with some other factors or the loss weighting.)
The command I used (I downloaded the dataset from the link provided and placed it in the directory):
python ./code/train.py --gpu 0,1 --n-labeled 10 --data-path ./data/yahoo_answers_csv/ --batch-size 2 --batch-size-u 4 --epochs 20 --val-iteration 2000 --lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 --lrmain 0.000005 --lrlast 0.0005
The second question I had was about the use of args.val_iteration. As I understand it, it is the number of batches processed in an epoch. So how does it work if my number of labeled examples per class is 10, i.e. 100 labeled examples (for the 10 classes in Yahoo! Answers)? With a batch size of 2, that would be only 50 batches for the data loader. Does the loader repeat instances in a cyclic manner, or does it always just randomly pick 2 instances?
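For reference, the usual way semi-supervised loaders reconcile a small labeled set with a large val-iteration is sampling with replacement: each iteration draws a fresh random batch, so the 100 labeled examples are revisited many times per epoch. A sketch under that assumption (a plain Python generator standing in for the actual DataLoader, with a hypothetical name `labeled_batches`):

```python
import random

# Sketch, not the repo's loader: with 100 labeled examples, batch size 2,
# and val-iteration 2000, one "epoch" draws 2000 random batches, so each
# labeled example is sampled roughly 40 times on average (4000 / 100).
def labeled_batches(dataset, batch_size, val_iteration, seed=0):
    rng = random.Random(seed)
    for _ in range(val_iteration):
        # Sample with replacement; no cyclic order is assumed.
        yield [rng.choice(dataset) for _ in range(batch_size)]

data = list(range(100))  # stand-in for 100 labeled examples
batches = list(labeled_batches(data, batch_size=2, val_iteration=2000))
drawn = sum(len(b) for b in batches)  # 4000 samples drawn from 100 items
```

Whether the real code samples with replacement or cycles deterministically is exactly the question being asked here; the sketch only illustrates the replacement variant.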
Thanks