Memory leak when processing multiple meshes #6

Closed
dendenxu opened this issue Feb 26, 2022 · 2 comments

@dendenxu

GPU memory is not properly freed when switching between meshes, eventually leading to CUSPARSE_STATUS_ALLOC_FAILED:

Traceback (most recent call last):
  File "scripts/show_largesteps_memory_leak.py", line 16, in <module>
    v = from_differential(M, u, 'Cholesky')
  File "/home/xuzhen/miniconda3/envs/flame/lib/python3.8/site-packages/largesteps/parameterize.py", line 51, in from_differential
    solver = CholeskySolver(L)
  File "/home/xuzhen/miniconda3/envs/flame/lib/python3.8/site-packages/largesteps/solvers.py", line 130, in __init__
    self.solver_1 = prepare(self.L, False, False, True)
  File "/home/xuzhen/miniconda3/envs/flame/lib/python3.8/site-packages/largesteps/solvers.py", line 68, in prepare
    _cusparse.scsrsm2_analysis(
  File "cupy_backends/cuda/libs/cusparse.pyx", line 2103, in cupy_backends.cuda.libs.cusparse.scsrsm2_analysis
  File "cupy_backends/cuda/libs/cusparse.pyx", line 2115, in cupy_backends.cuda.libs.cusparse.scsrsm2_analysis
  File "cupy_backends/cuda/libs/cusparse.pyx", line 1511, in cupy_backends.cuda.libs.cusparse.check_status
cupy_backends.cuda.libs.cusparse.CuSparseError: CUSPARSE_STATUS_ALLOC_FAILED

To reproduce, run the code example below with this example mesh (extract armadillo.npz and place it in the directory where you run the script):

import torch
import numpy as np
from tqdm import tqdm

from largesteps.parameterize import from_differential, to_differential
from largesteps.geometry import compute_matrix
from largesteps.optimize import AdamUniform

# Load the example mesh (vertices and faces) and move it to the GPU.
armadillo = np.load('armadillo.npz')
verts = torch.tensor(armadillo['v'], device='cuda')
faces = torch.tensor(armadillo['f'], device='cuda')

for i in tqdm(range(3000)):
    # Pretend each iteration processes a different mesh with a different
    # topology, so the matrix and its Cholesky solver are rebuilt every time.
    M = compute_matrix(verts, faces, 10)
    u = to_differential(M, verts)
    u.requires_grad_()
    optim = AdamUniform([u], 3e-2)
    for j in range(5):
        v: torch.Tensor = from_differential(M, u, 'Cholesky')
        # Dummy loss, just so gradients flow back through from_differential.
        loss: torch.Tensor = (v.norm(dim=-1) - 1).mean()
        optim.zero_grad()
        loss.backward()
        optim.step()

While running the code above, you should see GPU memory usage increase continuously, whereas the expected behavior is for it to stay constant.

For example, the result of nvidia-smi dmon -s m while running the code should be something like:
[screenshot: output of nvidia-smi dmon -s m showing GPU memory usage steadily increasing]
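If you prefer to log memory usage from inside the script instead of watching nvidia-smi, a minimal sketch could look like the following (assuming the pynvml package is available; the device index and the helper name log_gpu_memory are just illustrative choices):

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0; adjust the index if needed

def log_gpu_memory(step):
    # Memory used on the device in MiB, as reported by the driver. This also
    # counts allocations made outside PyTorch's caching allocator (e.g. by cuSPARSE).
    used_mib = pynvml.nvmlDeviceGetMemoryInfo(handle).used / 2**20
    print(f"step {step}: {used_mib:.0f} MiB used")

# e.g. call log_gpu_memory(i) every few hundred iterations of the outer loop above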

@dendenxu
Author

I opened a PR for this and explained my theory in its description.

The memory leak is caused by never freeing the Csrsm2Info and MatDescr handles created in the prepare function of the solvers module: here
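For context, a minimal sketch of pairing those create calls with the corresponding destroy calls (assuming cupy exposes destroyCsrsm2Info and destroyMatDescr alongside the create functions; this is an illustration, not the actual patch in the PR):

from cupy_backends.cuda.libs import cusparse as _cusparse

# Illustration only: pair each create call with a destroy call once the
# analysis/solve handles are no longer needed, so their device-side
# workspace can be released.
desc = _cusparse.createMatDescr()
info = _cusparse.createCsrsm2Info()
try:
    pass  # ... scsrsm2_analysis / scsrsm2_solve would use desc and info here ...
finally:
    _cusparse.destroyCsrsm2Info(info)
    _cusparse.destroyMatDescr(desc)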

@bathal1
Collaborator

bathal1 commented Jun 1, 2022

Thank you for this report. We replaced cusparse with cholespy for the solver part, so this should not be a problem anymore. nrhs is also no longer fixed to 3 (it can be up to 128 on the GPU).

Closing this and the PR.
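For anyone landing on this issue later, a minimal sketch of the cholespy interface the solver now builds on (based on my reading of the cholespy README; the identity system below is just a placeholder):

import torch
from cholespy import CholeskySolverF, MatrixType

# Placeholder system: a 20x20 identity matrix in COO form.
n_rows = 20
rows = torch.arange(n_rows, device='cuda')
cols = torch.arange(n_rows, device='cuda')
data = torch.ones(n_rows, device='cuda')

solver = CholeskySolverF(n_rows, rows, cols, data, MatrixType.COO)

# Multiple right-hand sides are supported (up to 128 columns on the GPU).
b = torch.ones(n_rows, 3, device='cuda')
x = torch.zeros_like(b)
solver.solve(b, x)  # the solution is written into x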

bathal1 closed this as completed Jun 1, 2022