
Out of memory issue #36

Closed · taherehmarvdashti opened this issue May 17, 2020 · 2 comments

@taherehmarvdashti commented May 17, 2020

I have a list of 2200 long documents. In a for loop I apply Longformer to each document and append output[1] to an output list. However, memory usage grows significantly with each iteration, so after only a few iterations I run out of memory (256 GB). I can't figure out what is consuming all the memory. A snippet of my code:

```python
import torch
from transformers import RobertaTokenizer
# imports as used in the longformer repo
from longformer.longformer import Longformer
from longformer.sliding_chunks import pad_to_window_size

model = Longformer.from_pretrained(longformer_base_dir, config=config)
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
tokenizer.max_len = model.config.max_position_embeddings

output_vec = []
for pg in pages:
    pg = f'{tokenizer.cls_token}{pg}{tokenizer.eos_token}'
    input_ids = torch.tensor(tokenizer.encode(pg)).unsqueeze(0)  # batch of size 1
    attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)

    input_ids, attention_mask = pad_to_window_size(
        input_ids, attention_mask, config.attention_window[0], tokenizer.pad_token_id)

    output = model(input_ids, attention_mask=attention_mask)
    output_vec.append(output[1])
```

@ibeltagy (Collaborator) commented
As far as I can tell, this is expected PyTorch behavior and is not specific to Longformer. Every time you call `output = model(input_ids, attention_mask=attention_mask)`, PyTorch stores the activations computed during the forward pass so it can use them in the backward pass, and the outputs you append keep references to that graph, so nothing is ever freed.
To avoid that, wrap your code in a `with torch.no_grad():` block. You might also want to call `model.eval()` if you are only doing inference. Check here for the difference between the two.
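
For concreteness, here is a minimal sketch of the fixed loop, reusing the `model`, `tokenizer`, `config`, and `pages` from the snippet above; the trailing `.detach().cpu()` is an optional extra safeguard rather than something strictly required once `no_grad` is active:

```python
import torch

model.eval()  # switch off dropout etc.; we are only doing inference

output_vec = []
with torch.no_grad():  # no autograd graph is built, so activations are freed after each call
    for pg in pages:
        pg = f'{tokenizer.cls_token}{pg}{tokenizer.eos_token}'
        input_ids = torch.tensor(tokenizer.encode(pg)).unsqueeze(0)  # batch of size 1
        attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)

        input_ids, attention_mask = pad_to_window_size(
            input_ids, attention_mask, config.attention_window[0], tokenizer.pad_token_id)

        output = model(input_ids, attention_mask=attention_mask)
        # detach/copy the result so nothing keeps a reference into the model's buffers
        output_vec.append(output[1].detach().cpu())
```

Inside `torch.no_grad()`, `output[1]` has no autograd history attached, so appending it no longer pins every intermediate activation from the forward pass, and memory usage stays flat across iterations.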

@taherehmarvdashti (Author) commented

Thank you, that solved my problem.
