
How to release gpu memory of intermediate result tensor #29802

@akikaaa

Description

In the example below, after calling torch.matmul the GPU memory usage increases by 181796864 bytes, which is almost the sum of the sizes of c and b.transpose(2,3). So I guess the unreferenced intermediate result b.transpose(2,3) is being kept in GPU memory. How can I release the GPU memory allocated to this intermediate result?

import torch
from torch.autograd import Variable

a = Variable(torch.rand(32, 8, 151, 1024), requires_grad=True).cuda()
b = Variable(torch.rand(32, 8, 151, 1024), requires_grad=True).cuda()
torch.cuda.memory_allocated(0)  # 316669952 (a and b, ~158 MB each)

c = torch.matmul(a, b.transpose(2, 3))
torch.cuda.memory_allocated(0)  # 498466816, increased by 181796864

c.element_size() * c.nelement()  # 23348224 (bytes held by c)
b.transpose(2, 3).element_size() * b.transpose(2, 3).nelement()  # 158334976 (same logical size as b)
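
A possible workaround, not from the original report: assuming the extra ~158 MB is the contiguous copy that matmul makes of the non-contiguous view b.transpose(2,3), you can materialize that copy yourself and drop it with del once c is computed. This only frees memory when no autograd graph holds a reference to the copy; matmul saves its inputs for backward, so with requires_grad the copy would stay alive until the graph is freed. A minimal sketch (variable names are illustrative):

import torch

a = torch.rand(32, 8, 151, 1024, device="cuda")
b = torch.rand(32, 8, 151, 1024, device="cuda")

with torch.no_grad():                    # nothing is saved for backward here
    bt = b.transpose(2, 3).contiguous()  # the ~158 MB copy, made explicit
    c = torch.matmul(a, bt)
    del bt                               # drop the only reference to the copy
torch.cuda.memory_allocated(0)           # counts only a, b and c again
torch.cuda.empty_cache()                 # optionally return cached blocks to the driver

Note that torch.cuda.memory_allocated() only tracks live tensors; the caching allocator keeps freed blocks around for reuse, so tools like nvidia-smi will not show a drop until torch.cuda.empty_cache() is called.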

Environment

  • PyTorch Version (e.g., 1.0): 1.0.1
  • OS (e.g., Linux): CentOS
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source):
  • Python version: 3.6.9
  • CUDA/cuDNN version: CUDA 9.2 / cuDNN 7.4.2
  • GPU models and configuration: NVIDIA 1080 Ti
  • Any other relevant information:

cc @ngimel

Labels

  • module: cuda (Related to torch.cuda, and CUDA support in general)
  • module: memory usage (PyTorch is using more memory than it should, or it is leaking memory)
  • triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
