
Out of memory issue #36

Closed · taherehmarvdashti opened this issue May 17, 2020 · 2 comments

@taherehmarvdashti commented May 17, 2020

I have a list of 2200 long documents. In a for loop I apply Longformer to each document and append output[1] to an output list. However, memory usage grows significantly with each iteration, so after only a few iterations I run out of memory (256 GB). I can't figure out what is consuming all the memory. A snippet of my code:

```python
import torch
from transformers import RobertaTokenizer
# imports as used in the longformer repo
from longformer.longformer import Longformer
from longformer.sliding_chunks import pad_to_window_size

model = Longformer.from_pretrained(longformer_base_dir, config=config)
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
tokenizer.max_len = model.config.max_position_embeddings

output_vec = []
for pg in pages:
    pg = f'{tokenizer.cls_token}{pg}{tokenizer.eos_token}'
    input_ids = torch.tensor(tokenizer.encode(pg)).unsqueeze(0)  # batch of size 1
    attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)

    input_ids, attention_mask = pad_to_window_size(
        input_ids, attention_mask, config.attention_window[0], tokenizer.pad_token_id)

    output = model(input_ids, attention_mask=attention_mask)
    output_vec.append(output[1])
```

@ibeltagy (Collaborator) commented
As far as I can tell, this is expected PyTorch behavior and is not specific to Longformer. Every time you call `output = model(input_ids, attention_mask=attention_mask)`, PyTorch stores the activations computed during the forward pass so it can use them in the backward pass, and the outputs you append keep references to that graph, so nothing is ever freed.
To avoid that, wrap your code in a `with torch.no_grad():` block. You might also want to call `model.eval()` if you are only doing inference. Check here for the difference between the two.
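
For concreteness, here is a minimal sketch of the fixed loop, reusing the `model`, `tokenizer`, `config`, and `pages` from the snippet above; the trailing `.detach().cpu()` is an optional extra safeguard rather than something strictly required once `no_grad` is active:

```python
import torch

model.eval()  # switch off dropout etc.; we are only doing inference

output_vec = []
with torch.no_grad():  # no autograd graph is built, so activations are freed after each call
    for pg in pages:
        pg = f'{tokenizer.cls_token}{pg}{tokenizer.eos_token}'
        input_ids = torch.tensor(tokenizer.encode(pg)).unsqueeze(0)  # batch of size 1
        attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)

        input_ids, attention_mask = pad_to_window_size(
            input_ids, attention_mask, config.attention_window[0], tokenizer.pad_token_id)

        output = model(input_ids, attention_mask=attention_mask)
        # detach/copy the result so nothing keeps a reference into the model's buffers
        output_vec.append(output[1].detach().cpu())
```

Inside `torch.no_grad()`, `output[1]` has no autograd history attached, so appending it no longer pins every intermediate activation from the forward pass, and memory usage stays flat across iterations.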

@taherehmarvdashti (Author) commented

Thank you, that solved my problem.
