I have a list of 2200 long documents. In a for loop I apply Longformer to each document and append `output[1]` to an output list. The memory grows significantly with each iteration; as a result, after only a few iterations I run out of memory (256 GB). I can't figure out what is consuming all the memory. A snippet of my code:
```python
import torch
from transformers import RobertaTokenizer
from longformer.longformer import Longformer

model = Longformer.from_pretrained(longformer_base_dir, config=config)
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
tokenizer.max_len = model.config.max_position_embeddings

output_vec = []
for pg in pages:
    # Wrap each page in CLS/EOS tokens before encoding
    pg = f'{tokenizer.cls_token}{pg}{tokenizer.eos_token}'
    input_ids = torch.tensor(tokenizer.encode(pg)).unsqueeze(0)  # batch of size 1
    attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)
    output = model(input_ids, attention_mask=attention_mask)
    output_vec.append(output[1])
```

As far as I can tell, this is expected PyTorch behavior and is not specific to Longformer. Every time you call `output = model(input_ids, attention_mask=attention_mask)`, PyTorch stores the activations computed during the forward pass so they can be reused in the backward pass. Because you append `output[1]` to a list, each stored tensor keeps a reference to its entire computation graph, so none of those activations can ever be freed.

To avoid that, wrap your code in `with torch.no_grad():`. You might also want to call `model.eval()` if you are only doing inference. Check here for the difference between the two.
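For concreteness, here is a minimal sketch of the same loop with gradients disabled, assuming the setup from the snippet above (under `no_grad` the appended tensors carry no graph, so nothing extra accumulates):

```python
model.eval()  # switch off dropout etc. for deterministic inference

output_vec = []
with torch.no_grad():  # no autograd graph is built, so activations are freed after each forward pass
    for pg in pages:
        pg = f'{tokenizer.cls_token}{pg}{tokenizer.eos_token}'
        input_ids = torch.tensor(tokenizer.encode(pg)).unsqueeze(0)  # batch of size 1
        attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device)
        output = model(input_ids, attention_mask=attention_mask)
        output_vec.append(output[1])  # plain tensor with no grad history, so memory stays flat
```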