Pretty low accuracy for yahoo answers mixtext #1

Closed
sb1992 opened this issue Jun 3, 2020 · 5 comments

sb1992 commented Jun 3, 2020

Hi,
Firstly, I quite liked the paper and enjoyed reading it. I tried using this code on the Yahoo! Answers dataset, but unfortunately the best accuracy does not cross 0.24. Since I have 2 GPUs at my disposal, I used --batch-size 2 and --batch-size-u 4 (I also tried --batch-size 3 with --batch-size-u 4) and --val-iteration 2000. Could you let me know whether I should change other parameters to make it work? (I understand that reducing the batch size should have some impact, but I am not sure the impact can be this significant, so I thought it might be down to some other factors or weighting.)

The command I used (I downloaded the dataset from the link provided and placed it in that directory):

python ./code/train.py --gpu 0,1 --n-labeled 10 --data-path ./data/yahoo_answers_csv/ --batch-size 2 --batch-size-u 4 --epochs 20 --val-iteration 2000 --lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 --lrmain 0.000005 --lrlast 0.0005

The second question I had was about args.val_iteration. As I understand it, it is effectively the number of batches processed per epoch. How does that work when I have 10 labeled examples per class, i.e. 100 labeled examples in total (for the 10 classes in Yahoo! Answers)? With a batch size of 2, that would be 50 batches for the data loader. Does it repeat instances in a cyclic manner, or just always pick 2 instances at random?

Thanks


jiaaoc commented Jun 3, 2020

  1. The parameters work well with "--batch-size 4 --batch-size-u 8" on 4 GPUs (this reproduces the reported performance), and performance improves further with larger batch sizes. We haven't tested "--batch-size 2 --batch-size-u 4". Note that TMix happens within a batch, i.e. we randomly sample pairs from a batch to perform TMix, so the batch size matters for performance (see the sketch after this list). Two suggestions: (1) decrease the number of back-translated paraphrases per sentence so that you can fit a larger batch size; (2) try to increase your batch size.
    In the meantime, we will test smaller batch sizes to check whether we can make them work.

  2. For args.val_iteration: we set shuffle=True in the DataLoader, so after each full pass through the labeled set it is reshuffled, and you will see a different pair of instances in each batch. With 100 labeled examples and a batch size of 2, the loader cycles through 50 batches per pass; instances repeat across passes, but not within a single pass.
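
To illustrate point 1: here is a minimal sketch of in-batch TMix, assuming the hidden states up to the chosen mix layer have already been computed. This is an illustration only, not the repository's actual code; the function name, the lambda-flipping line, and the alpha default are assumptions here.

```python
import numpy as np
import torch

def tmix(hidden, labels, alpha=16.0):
    # Minimal sketch of in-batch TMix (illustrative, not the repo's code).
    # hidden: (batch, seq_len, dim) activations at the chosen encoder layer
    # labels: (batch, num_classes) label distributions
    lam = np.random.beta(alpha, alpha)     # mixing coefficient from Beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)              # keep the mix biased toward the original example
    idx = torch.randperm(hidden.size(0))   # random pairing within the batch
    mixed_hidden = lam * hidden + (1.0 - lam) * hidden[idx]
    mixed_labels = lam * labels + (1.0 - lam) * labels[idx]
    return mixed_hidden, mixed_labels
```

With a batch size of 2, torch.randperm(2) admits only two orderings, so each batch yields at most one distinct mixed pair, which is why very small batches limit the diversity of the mixed examples.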

And one thing we forgot to mention about pre-processing Yahoo! Answers: we concatenate the question title, question content, and best answer together to form the text to be classified. (We just uploaded the pre-processing code for Yahoo! Answers.) We can also provide a link to download the processed data later.
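
For reference, here is a sketch of the concatenation described above, assuming the standard yahoo_answers_csv layout (no header row; columns are label, question title, question content, best answer). The column names and output path are illustrative, not the repository's actual script.

```python
import pandas as pd

# Assumed layout of the original yahoo_answers_csv files (no header row):
# label, question title, question content, best answer.
cols = ["label", "title", "content", "answer"]
df = pd.read_csv("./data/yahoo_answers_csv/train.csv", header=None, names=cols)

# Concatenate title, content, and best answer into one classification text,
# treating missing fields as empty strings.
df["text"] = (
    df[["title", "content", "answer"]]
    .fillna("")
    .astype(str)
    .agg(" ".join, axis=1)
    .str.strip()
)

df[["label", "text"]].to_csv(
    "./data/yahoo_answers_csv/train_processed.csv", header=False, index=False
)
```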


jiaaoc commented Jun 3, 2020

If you use the original dataset, the text to be classified is only the question content (which is empty for many examples). That is probably the reason.


jiaaoc commented Jun 3, 2020

Here is the link to download the data: https://drive.google.com/file/d/1IoX9dp_RUHwIVA2_kJgHCWBOLHsV9V7A/view?usp=sharing


jiaaoc commented Jun 4, 2020

The reason was the Yahoo! Answers dataset pre-processing.

We tested with

python ./code/train.py --gpu 2,7 --n-labeled 10 \
--data-path ./data/yahoo_answers_csv/ --batch-size 2 --batch-size-u 4 --epochs 20 --val-iteration 1000 \
--lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 \
--lrmain 0.000005 --lrlast 0.0005

The performance is:

[screenshot: training log showing the achieved test accuracy]

jiaaoc closed this as completed on Jun 4, 2020

jiaaoc commented Jun 4, 2020

python ./code/train.py --gpu 2,7 --n-labeled 10 \
--data-path ./data/yahoo_answers_csv/ --batch-size 2 --batch-size-u 4 --epochs 20 --val-iteration 2000 \
--lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 \
--lrmain 0.000005 --lrlast 0.0005

[screenshot: training log showing the achieved test accuracy]
