Question about the back translations. #13

Closed
callmeYe opened this issue Nov 18, 2020 · 9 comments

Comments

@callmeYe

Can I skip data augmentation on the unlabeled data?

@jiaaoc
Member

jiaaoc commented Nov 18, 2020

We use back-translation to create paraphrases for unlabeled data and perform consistency training. You could use other ways to generate paraphrases.
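
For a concrete starting point, here is a minimal back-translation sketch in the spirit of the approach described above, not the repository's exact script. The fairseq WMT19 model names loaded through torch.hub and the sampling settings are assumptions; swap in whatever translation models you prefer.

import torch

# Pretrained English<->German translation models from torch.hub
# (requires fairseq, fastBPE and sacremoses to be installed).
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model',
                       tokenizer='moses', bpe='fastbpe')
de2en = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.de-en.single_model',
                       tokenizer='moses', bpe='fastbpe')

def back_translate(sentence, temperature=0.9):
    # Sampling with a temperature (instead of plain beam search) tends to give
    # more varied paraphrases; the exact settings here are assumptions.
    german = en2de.translate(sentence, sampling=True, temperature=temperature)
    return de2en.translate(german, sampling=True, temperature=temperature)

print(back_translate("Can I skip data augmentation on the unlabeled data?"))

More diverse paraphrases generally give a stronger consistency-training signal, which is why sampling is used here rather than greedy decoding.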

@callmeYe
Author

callmeYe commented Nov 19, 2020

So I have to create paraphrases, right? Also, when I look at the code, I see that only the first 100,000 examples in the dataset have been back-translated. Do I not need to perform back translation for the entire dataset?

@jiaaoc
Member

jiaaoc commented Nov 19, 2020

It depends on the size of the unlabeled set you are going to use. In this work we used 100,000 unlabeled examples, so we only did back translation on those, not on the whole dataset.
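
As a hypothetical illustration of restricting the work to the subset you actually train on, the sketch below back-translates only the first 100,000 rows of the Yahoo Answers training CSV. The file path and column layout are assumptions based on the standard Yahoo Answers CSV format, not the repository's exact preprocessing.

import pandas as pd

# Only the rows you will use as unlabeled training data need paraphrases.
df = pd.read_csv('./data/yahoo_answers_csv/train.csv', header=None)
unlabeled = df.iloc[:100000]  # first 100,000 rows (assumed layout: label, title, content, answer)
texts = (unlabeled[1].fillna('') + ' ' + unlabeled[2].fillna('')).tolist()
# ...run back_translate() over `texts` and store the paraphrases for training...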

@callmeYe
Author

Sorry, I'm still a little confused.
When I test with:
python ./code/train.py --gpu 0,1 --n-labeled 10 \
    --data-path ./data/yahoo_answers_csv/ --batch-size 2 --batch-size-u 4 --epochs 20 --val-iteration 1000 \
    --lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 \
    --lrmain 0.000005 --lrlast 0.0005
The number of unlabeled examples per class seems to be 5,000. Do they add up to exactly 100,000?
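
As a quick back-of-the-envelope check (an assumption about the split, not taken from the repository): Yahoo Answers has 10 classes, so 5,000 unlabeled examples per class would be 10 × 5,000 = 50,000 in total, which is within the 100,000 examples that were back-translated.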

@jiaaoc
Member

jiaaoc commented Nov 19, 2020

You could use up to 100,000

@jiaaoc
Member

jiaaoc commented Nov 19, 2020

10,000

@jiaaoc
Member

jiaaoc commented Nov 19, 2020

In any case, the number of examples you need to paraphrase depends only on the number of unlabeled examples you are going to use.

@callmeYe
Author

Are they in one-to-one correspondence?

@jiaaoc
Member

jiaaoc commented Nov 19, 2020

One unlabeled example can be associated with multiple paraphrases. Please refer to the paper/code for details.
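
To make "multiple paraphrases per unlabeled example" concrete, below is a rough sketch of one common way to combine them for consistency training: average the model's predictions over the original sentence and its back-translated versions, then sharpen the result with the temperature T from the command line. This is an assumed simplification, not the repository's exact code (the paper may weight the predictions differently), and `model` returning class logits is also an assumption.

import torch
import torch.nn.functional as F

def guess_label(model, original, paraphrases, T=0.5):
    # `original` is a batch of token ids; `paraphrases` is a list of batches,
    # one per back-translated version of the same unlabeled examples.
    with torch.no_grad():
        versions = [original] + list(paraphrases)
        probs = torch.stack([F.softmax(model(x), dim=-1) for x in versions])
        avg = probs.mean(dim=0)                      # average over all versions
        sharpened = avg ** (1.0 / T)                 # temperature sharpening
        return sharpened / sharpened.sum(dim=-1, keepdim=True)

The sharpened distribution then serves as the guessed label for the unlabeled example and all of its paraphrases in the consistency loss.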

@jiaaoc jiaaoc closed this as completed Nov 19, 2020