
train ipet with zero training examples #66

Closed
EneruMin opened this issue Nov 9, 2021 · 5 comments

Comments

EneruMin commented Nov 9, 2021

Hi,
I am training iPET with zero training examples. I ran the following command:
python3 cli.py --method ipet --pattern_ids 0 1 2 3 4 --data_dir /share/home/zqzeng/wmni/data/ag_news_csv/ag_news_csv --model_type roberta --model_name_or_path /share/home/zqzeng/transformers/roberta-large --task_name agnews --output_dir /share/home/zqzeng/wmni/data/output/unsupervised-ipet --do_train --do_eval --pet_repetitions 1 --ipet_n_most_likely 100 --reduction mean --train_examples 0
I got the following output:
2021-11-09 20:22:31,904 - INFO - tasks - Creating features from dataset file at ag_news_csv/ (num_examples=0, set_type=train)
2021-11-09 20:22:34,978 - INFO - tasks - Returning 120000 train examples with label dist.: [('3', 30000), ('4', 30000), ('2', 30000), ('1', 30000)]
I followed the program flow and found that all 120,000 training examples were used to train each individual model.
When I used "--train_examples 10", it worked as expected, as shown below:
2021-11-09 20:19:13,402 - INFO - tasks - Creating features from dataset file at ag_news_csv/ (num_examples=10, set_type=train)
2021-11-09 20:19:16,127 - INFO - tasks - Returning 10 train examples with label dist.: [('1', 3), ('4', 4), ('2', 2), ('3', 1)]
Does training with zero examples not work?
I would be grateful for your prompt reply.

@EneruMin (Author)

The above problem is caused by the following method in tasks.py:

def _shuffle_and_restrict(examples: List[InputExample], num_examples: int, seed: int = 42) -> List[InputExample]:
    """
    Shuffle a list of examples and restrict it to a given maximum size.

    :param examples: the examples to shuffle and restrict
    :param num_examples: the maximum number of examples
    :param seed: the random seed for shuffling
    :return: the first ``num_examples`` elements of the shuffled list
    """
    if 0 < num_examples < len(examples):
        random.Random(seed).shuffle(examples)
        examples = examples[:num_examples]
    return examples

When num_examples equals 0, this method returns all the training examples, which are then used to fine-tune the language model.
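A standalone repro of the condition (InputExample replaced by plain integers for illustration):

import random

def _shuffle_and_restrict(examples, num_examples, seed=42):
    if 0 < num_examples < len(examples):  # bug: num_examples == 0 fails this check
        random.Random(seed).shuffle(examples)
        examples = examples[:num_examples]
    return examples

print(len(_shuffle_and_restrict(list(range(120000)), 0)))   # 120000 -- all examples kept
print(len(_shuffle_and_restrict(list(range(120000)), 10)))  # 10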
So I wonder how you used iPET with zero training examples.

@timoschick (Owner)

Hi @EneruMin, you are absolutely correct, this is an error in the code: the condition should be if 0 <= num_examples < len(examples). I'll update the code accordingly.
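In code, the patched method (only the comparison changes; the type annotations are simplified here):

import random
from typing import List

def _shuffle_and_restrict(examples: List, num_examples: int, seed: int = 42) -> List:
    if 0 <= num_examples < len(examples):  # <= so that num_examples == 0 keeps nothing
        random.Random(seed).shuffle(examples)
        examples = examples[:num_examples]
    return examples

With num_examples == 0, the condition now holds and examples[:0] yields an empty list, as intended for the zero-shot setting.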

The reason why things still worked in our experiments is that we additionally specified the --split_examples_evenly option. Without this error, that option shouldn't have any effect in the zero-shot setting (it basically just tells the script to choose training examples so that there is the same number of examples for each label), but it causes the script to skip _shuffle_and_restrict and instead use a LimitedExampleList, which handles the case of 0 examples correctly:

if num_examples is not None:
    examples = _shuffle_and_restrict(examples, num_examples, seed)

elif num_examples_per_label is not None:
    limited_examples = LimitedExampleList(processor.get_labels(), num_examples_per_label)
    for example in examples:
        limited_examples.add(example)
    examples = limited_examples.to_list()
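
For illustration, here is a minimal sketch of a per-label limiter along these lines (an assumption about the internals; the real LimitedExampleList in tasks.py may differ):

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Example:  # hypothetical stand-in for pet's InputExample
    text: str
    label: str

class PerLabelLimiter:
    """Hypothetical stand-in for LimitedExampleList: keeps at most
    max_per_label examples of each label and drops the rest."""

    def __init__(self, labels: List[str], max_per_label: int):
        self._max_per_label = max_per_label
        self._buckets: Dict[str, List[Example]] = {label: [] for label in labels}

    def add(self, example: Example) -> bool:
        """Keep the example only if its label's quota is not yet full."""
        bucket = self._buckets[example.label]
        if len(bucket) < self._max_per_label:
            bucket.append(example)
            return True
        return False

    def to_list(self) -> List[Example]:
        return [ex for bucket in self._buckets.values() for ex in bucket]

With max_per_label == 0, add() never keeps anything and to_list() returns an empty list, which is why this code path handles the zero-shot case correctly.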

So to fix this issue, you can either (a) replace 0 < num_examples [...] with 0 <= num_examples [...] (or just wait for the next update), (b) specify the --split_examples_evenly option, or (c) simply change your TaskProcessor so that it doesn't return any training examples in the first place.


EneruMin commented Nov 17, 2021

Hi @timoschick, thanks for your suggestion.
I specified the --split_examples_evenly option, and this time it returned zero training examples. But another error occurred:

Traceback (most recent call last):
  File "cli.py", line 283, in <module>
    main()
  File "cli.py", line 271, in main
    eval_data=eval_data, do_train=args.do_train, do_eval=args.do_eval, seed=args.seed)
  File "/share/home/zqzeng/wmni/pet-master-edited/pet-master/pet/modeling.py", line 216, in train_ipet
    eval_data=eval_data, do_train=do_train, do_eval=do_eval)
  File "/share/home/zqzeng/wmni/pet-master-edited/pet-master/pet/modeling.py", line 298, in train_classifier
    do_eval=do_eval, seed=seed)
  File "/share/home/zqzeng/wmni/pet-master-edited/pet-master/pet/modeling.py", line 358, in train_pet_ensemble
    unlabeled_data=unlabeled_data))
  File "/share/home/zqzeng/wmni/pet-master-edited/pet-master/pet/modeling.py", line 461, in train_single_model
    temperature=config.temperature
  File "/share/home/zqzeng/wmni/pet-master-edited/pet-master/pet/wrapper.py", line 229, in train
    train_sampler = RandomSampler(train_dataset)
  File "/share/home/zqzeng/anaconda3/envs/wmni/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0

In the train() method, task_train_data is empty, so RandomSampler(train_dataset) raises an error.
I worked around this by checking whether task_train_data is empty before creating the sampler.
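For reference, a standalone sketch of such a check (hypothetical variable names; the actual guard would go into wrapper.py's train()):

import torch
from torch.utils.data import RandomSampler, TensorDataset

train_dataset = TensorDataset(torch.empty(0, 3))  # empty, as in the zero-shot case

# RandomSampler raises "num_samples should be a positive integer value,
# but got num_samples=0" for an empty dataset, so only create it when
# there is labeled data:
if len(train_dataset) > 0:
    train_sampler = RandomSampler(train_dataset)
else:
    train_sampler = None  # skip supervised fine-tuning entirely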
However, my final accuracy is 0.823, while the accuracy in your paper is 0.875 (on the AG's News task). Is that reasonable?

@timoschick (Owner)

Interesting... I'll check why we didn't get a similar error as soon as I find the time. Regardless, the final accuracy should be much better than the one you've reported. There are a couple of differences between your command and the one we used, so I cannot tell what exactly causes the gap. Could you share the results after each iteration (the contents of the result_test.txt file in each iteration's directory)? That would help identify the point where things diverge.

A couple of notes regarding possible differences (an adjusted command is sketched after this list):

  • We have trained 3 models per pattern, so --pet_repetitions 3, whereas you've been using --pet_repetitions 1. Using more models stabilizes results.
  • We've been using a batch size of 16.
  • We've been using --lm_training (with a ratio of 1:3 training examples and unlabeled examples).
  • We've been using 40,000 unlabeled examples. It's not clear from your command how many unlabeled examples you've been using.
  • I don't remember the exact number of iterations that we've been using, but I think it was more than the default 3 (--ipet_generations). You can check that in our paper.
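For concreteness, the original command adjusted along these lines (a sketch; --unlabeled_examples and --pet_per_gpu_train_batch_size are assumed flag names, so verify them against python3 cli.py --help for your version):

python3 cli.py --method ipet --pattern_ids 0 1 2 3 4 --data_dir /share/home/zqzeng/wmni/data/ag_news_csv/ag_news_csv --model_type roberta --model_name_or_path /share/home/zqzeng/transformers/roberta-large --task_name agnews --output_dir /share/home/zqzeng/wmni/data/output/unsupervised-ipet --do_train --do_eval --pet_repetitions 3 --ipet_n_most_likely 100 --reduction mean --train_examples 0 --split_examples_evenly --lm_training --pet_per_gpu_train_batch_size 16 --unlabeled_examples 40000 --ipet_generations 4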

Finally, you may get different results due to the random selection of examples and model initialization (but those should not account for more than a 5% difference in performance). If you want to reproduce our exact results and none of the above helps, you can check out the v1.1.0 branch, which contains the script we used for iPET.

@EneruMin (Author)

The results of each iteration are shown below.
g0

acc-p0: 0.6531578947368422 +- 0
acc-p1: 0.7471052631578947 +- 0
acc-p2: 0.5906578947368422 +- 0
acc-p3: 0.7082894736842106 +- 0
acc-p4: 0.7942105263157895 +- 0
acc-all-p: 0.6986842105263158 +- 0.07953688886139995

g1

acc-p0: 0.7794736842105263 +- 0
acc-p1: 0.7235526315789473 +- 0
acc-p2: 0.6317105263157895 +- 0
acc-p3: 0.7476315789473684 +- 0
acc-p4: 0.7796052631578947 +- 0
acc-all-p: 0.7323947368421052 +- 0.06101826675772052

g2

acc-p0: 0.8057894736842105 +- 0
acc-p1: 0.7156578947368422 +- 0
acc-p2: 0.7773684210526316 +- 0
acc-p3: 0.8132894736842106 +- 0
acc-p4: 0.7723684210526316 +- 0
acc-all-p: 0.7768947368421053 +- 0.03850371874690442

final

acc-p0: 0.8228947368421052 +- 0
acc-all-p: 0.8228947368421052 +- 0

According to Figure 4 in your paper, I think I should perhaps use 4 or 5 iterations.
