Why is there a cnt variable in get_collate_function? #15

Closed
ari9dam opened this issue Aug 9, 2021 · 2 comments


ari9dam commented Aug 9, 2021

In https://github.com/jingtaozhan/DRhard/blob/dc17f3d1f7f59d13d15daa1a728dc8d6efc48b92/dataset.py, if we take a look at the data collator,

def get_collate_function(max_seq_length):
    cnt = 0
    def collate_function(batch):
        nonlocal cnt
        length = None
        if cnt < 10:
            length = max_seq_length
            cnt += 1

        input_ids = [x["input_ids"] for x in batch]
        attention_mask = [x["attention_mask"] for x in batch]
        data = {
            "input_ids": pack_tensor_2D(input_ids, default=1, 
                dtype=torch.int64, length=length),
            "attention_mask": pack_tensor_2D(attention_mask, default=0, 
                dtype=torch.int64, length=length),
        }
        ids = [x['id'] for x in batch]
        return data, ids
    return collate_function  

We see that there is a cnt variable that decides whether collate_function pads each batch to max_seq_length. I couldn't work out why it is needed. Could you please explain the significance of cnt?

Thank you
AM

@jingtaozhan
Owner

It is a simple trick I used. Inappropriate hyperparameters may trigger an out-of-memory error during training. To catch this early, the code forces the input to have the max sequence length at the beginning of training (the first 10 calls to collate_function). If the batch size or max seq length is too big, the error is triggered right at the start and I find out immediately.
You can also delete this code.
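
The trick described above can be seen in a small sketch. It is only an illustration under stated assumptions: pack_2d is a hypothetical stand-in for pack_tensor_2D (whose body is not shown in this thread), plain Python lists stand in for torch tensors so the sketch runs without PyTorch, and padding to the longest sequence in the batch when length is None is assumed behavior. The warmup_batches parameter is also an added name for the hard-coded 10.

```python
def pack_2d(rows, default, length=None):
    """Hypothetical stand-in for pack_tensor_2D.

    Pads every row to `length`, or to the longest row in the batch
    when `length` is None (assumed behavior, not confirmed in the thread).
    """
    width = length if length is not None else max(len(r) for r in rows)
    return [list(r[:width]) + [default] * (width - len(r)) for r in rows]


def get_collate_function(max_seq_length, warmup_batches=10):
    cnt = 0  # number of batches already padded to the full length

    def collate_function(batch):
        nonlocal cnt
        length = None
        if cnt < warmup_batches:
            # Use the worst-case shape early, so an out-of-memory error
            # shows up in the first steps instead of later in training.
            length = max_seq_length
            cnt += 1
        input_ids = [x["input_ids"] for x in batch]
        attention_mask = [x["attention_mask"] for x in batch]
        data = {
            "input_ids": pack_2d(input_ids, default=1, length=length),
            "attention_mask": pack_2d(attention_mask, default=0, length=length),
        }
        ids = [x["id"] for x in batch]
        return data, ids

    return collate_function


collate = get_collate_function(max_seq_length=8)
batch = [
    {"id": 0, "input_ids": [5, 6, 7], "attention_mask": [1, 1, 1]},
    {"id": 1, "input_ids": [5, 6], "attention_mask": [1, 1]},
]
# Width of the padded batch over 12 consecutive calls.
widths = [len(collate(batch)[0]["input_ids"][0]) for _ in range(12)]
print(widths)  # first 10 entries are 8, the last two are 3
```

Under these assumptions, the first 10 batches take the full max_seq_length width regardless of their content, and later batches shrink to their longest sequence. Removing the cnt logic would simply make every batch use the shorter width.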

Author

ari9dam commented Aug 10, 2021

I see. Thanks for the explanation.

ari9dam closed this as completed Aug 10, 2021