Memory leak when processing multiple meshes #6

Closed
dendenxu opened this issue Feb 26, 2022 · 2 comments

@dendenxu

GPU memory is not properly freed when switching between meshes, eventually leading to CUSPARSE_STATUS_ALLOC_FAILED:

Traceback (most recent call last):
  File "scripts/show_largesteps_memory_leak.py", line 16, in <module>
    v = from_differential(M, u, 'Cholesky')
  File "/home/xuzhen/miniconda3/envs/flame/lib/python3.8/site-packages/largesteps/parameterize.py", line 51, in from_differential
    solver = CholeskySolver(L)
  File "/home/xuzhen/miniconda3/envs/flame/lib/python3.8/site-packages/largesteps/solvers.py", line 130, in __init__
    self.solver_1 = prepare(self.L, False, False, True)
  File "/home/xuzhen/miniconda3/envs/flame/lib/python3.8/site-packages/largesteps/solvers.py", line 68, in prepare
    _cusparse.scsrsm2_analysis(
  File "cupy_backends/cuda/libs/cusparse.pyx", line 2103, in cupy_backends.cuda.libs.cusparse.scsrsm2_analysis
  File "cupy_backends/cuda/libs/cusparse.pyx", line 2115, in cupy_backends.cuda.libs.cusparse.scsrsm2_analysis
  File "cupy_backends/cuda/libs/cusparse.pyx", line 1511, in cupy_backends.cuda.libs.cusparse.check_status
cupy_backends.cuda.libs.cusparse.CuSparseError: CUSPARSE_STATUS_ALLOC_FAILED

To reproduce, run the code example below with this example mesh (extract armadillo.npz and place it in the directory where you run the script):

import torch
import numpy as np
from tqdm import tqdm

from largesteps.parameterize import from_differential, to_differential
from largesteps.geometry import compute_matrix
from largesteps.optimize import AdamUniform

# Load the example mesh (vertices and faces) and move it to the GPU.
armadillo = np.load('armadillo.npz')
verts = torch.tensor(armadillo['v'], device='cuda')
faces = torch.tensor(armadillo['f'], device='cuda')

for i in tqdm(range(3000)):
    # Pretend each iteration processes a different mesh with a different
    # topology, so the matrix and its Cholesky solver are rebuilt every time.
    M = compute_matrix(verts, faces, 10)
    u = to_differential(M, verts)
    u.requires_grad_()
    optim = AdamUniform([u], 3e-2)
    for j in range(5):
        v: torch.Tensor = from_differential(M, u, 'Cholesky')
        # Dummy loss, just so gradients flow back through from_differential.
        loss: torch.Tensor = (v.norm(dim=-1) - 1).mean()
        optim.zero_grad()
        loss.backward()
        optim.step()

While running the code above, you should see GPU memory usage increase continuously, whereas the expected behavior is for it to stay constant.

For example, the result of nvidia-smi dmon -s m while running the code should be something like:
[screenshot: output of nvidia-smi dmon -s m showing GPU memory usage steadily increasing]
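If you prefer to log memory usage from inside the script instead of watching nvidia-smi, a minimal sketch could look like the following (assuming the pynvml package is available; the device index and the helper name log_gpu_memory are just illustrative choices):

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0; adjust the index if needed

def log_gpu_memory(step):
    # Memory used on the device in MiB, as reported by the driver. This also
    # counts allocations made outside PyTorch's caching allocator (e.g. by cuSPARSE).
    used_mib = pynvml.nvmlDeviceGetMemoryInfo(handle).used / 2**20
    print(f"step {step}: {used_mib:.0f} MiB used")

# e.g. call log_gpu_memory(i) every few hundred iterations of the outer loop above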

@dendenxu
Author

I opened a PR for this and explained my theory in its description.

The memory leak is caused by never freeing the Csrsm2Info and MatDescr handles created in the prepare function of the solvers module: here
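For context, a minimal sketch of pairing those create calls with the corresponding destroy calls (assuming cupy exposes destroyCsrsm2Info and destroyMatDescr alongside the create functions; this is an illustration, not the actual patch in the PR):

from cupy_backends.cuda.libs import cusparse as _cusparse

# Illustration only: pair each create call with a destroy call once the
# analysis/solve handles are no longer needed, so their device-side
# workspace can be released.
desc = _cusparse.createMatDescr()
info = _cusparse.createCsrsm2Info()
try:
    pass  # ... scsrsm2_analysis / scsrsm2_solve would use desc and info here ...
finally:
    _cusparse.destroyCsrsm2Info(info)
    _cusparse.destroyMatDescr(desc)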

@bathal1
Collaborator

bathal1 commented Jun 1, 2022

Thank you for this report. We replaced cusparse with cholespy for the solver part, so this should not be a problem anymore. nrhs is also no longer fixed to 3 (it can be up to 128 on the GPU).

Closing this and the PR.
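For anyone landing on this issue later, a minimal sketch of the cholespy interface the solver now builds on (based on my reading of the cholespy README; the identity system below is just a placeholder):

import torch
from cholespy import CholeskySolverF, MatrixType

# Placeholder system: a 20x20 identity matrix in COO form.
n_rows = 20
rows = torch.arange(n_rows, device='cuda')
cols = torch.arange(n_rows, device='cuda')
data = torch.ones(n_rows, device='cuda')

solver = CholeskySolverF(n_rows, rows, cols, data, MatrixType.COO)

# Multiple right-hand sides are supported (up to 128 columns on the GPU).
b = torch.ones(n_rows, 3, device='cuda')
x = torch.zeros_like(b)
solver.solve(b, x)  # the solution is written into x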

bathal1 closed this as completed Jun 1, 2022