Cuda error while initializing SparseTensor #338

Closed
victoryc opened this issue Mar 30, 2021 · 4 comments

victoryc commented Mar 30, 2021

I am getting the following error while initializing a SparseTensor:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  CUDA free failed: cudaErrorIllegalAddress: an illegal memory access was encountered

Below is a small bit of code that reproduces this error [This is derived from the sparse_tensor_basic.py file in MinkowskiEngine repo].

import torch
import MinkowskiEngine as ME

data_batch_0 = [
    [0, 0, 2.1, 0, 0],  #
    [0, 1, 1.4, 3, 0],  #
    [0, 0, 4.0, 0, 0]
]


def to_sparse_coo(data):
    # An intuitive way to extract coordinates and features
    coords, feats = [], []
    for i, row in enumerate(data):
        for j, val in enumerate(row):
            if val != 0:
                coords.append([i, j])
                feats.append([val])
    return torch.IntTensor(coords), torch.FloatTensor(feats)

def sparse_tensor_initialization():
    coords0, feats0 = to_sparse_coo(data_batch_0)
    coords0 = coords0.cuda()
    feats0 = feats0.cuda()
    coords0, feats0 = ME.utils.sparse_collate(coords=[coords0], feats=[feats0], device=coords0.device)
    A = ME.SparseTensor(coordinates=coords0, features=feats0)
    return A


if __name__ == '__main__':
    sparse_tensor_initialization()

I am using CUDA version 11.2.
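
As a sanity check that the input data itself is well formed, the coordinate extraction can be run on plain Python lists without touching the GPU. This is only a minimal sketch: `nonzero_entries` is a hypothetical helper that mirrors `to_sparse_coo` above but skips the tensor conversion, so it needs neither torch nor MinkowskiEngine.

```python
data_batch_0 = [
    [0, 0, 2.1, 0, 0],
    [0, 1, 1.4, 3, 0],
    [0, 0, 4.0, 0, 0],
]


def nonzero_entries(data):
    # Hypothetical helper: same loop as to_sparse_coo, but returns plain lists.
    coords, feats = [], []
    for i, row in enumerate(data):
        for j, val in enumerate(row):
            if val != 0:
                coords.append([i, j])  # (row, column) of each nonzero value
                feats.append([val])    # one feature channel per coordinate
    return coords, feats


coords, feats = nonzero_entries(data_batch_0)
print(coords)  # [[0, 2], [1, 1], [1, 2], [1, 3], [2, 2]]
print(feats)   # [[2.1], [1], [1.4], [3], [4.0]]
```

The five coordinates are small, non-negative and unique, so the inputs look like ordinary COO data.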

@chrischoy
Contributor

I can't replicate the error. Could you report your environment as specified in the issue template?

@victoryc
Author

Unfortunately, after I posted the issue, I had to upgrade PyTorch to the latest version (1.8.1) for other reasons, and as I understand it, that version isn't compatible with MinkowskiEngine. So I am currently unable to provide the complete and exact environment info for the state in which I had the issue. Below is partial info about the environment.

  • OS: Ubuntu 20.04 in WSL2 (Windows Subsystem for Linux 2) running on top of Windows 10
  • Python version: 3.8.5
  • Output of the following command. (If you installed the latest MinkowskiEngine, simply call MinkowskiEngine.print_diagnostics())
    ==========System==========
    Linux-5.4.91-microsoft-standard-WSL2-x86_64-with-glibc2.10
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=20.04
    DISTRIB_CODENAME=focal
    DISTRIB_DESCRIPTION="Ubuntu 20.04 LTS"
    3.8.5 (default, Sep 4 2020, 07:30:14)
    [GCC 7.3.0]
    ==========Pytorch==========
    1.7.1
    torch.cuda.is_available(): True
    ==========NVIDIA-SMI==========
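
For anyone who needs to gather similar details when the full diagnostics are unavailable, the following is a minimal sketch of a version report. `collect_environment` is a hypothetical helper, not part of MinkowskiEngine. It assumes only that torch exposes `__version__`, `torch.version.cuda` and `torch.cuda.is_available()`, and it records a placeholder for any package that cannot be imported.

```python
import importlib
import platform
import sys


def collect_environment():
    """Gather version details commonly requested in bug reports (hypothetical helper)."""
    info = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in ("torch", "MinkowskiEngine"):
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # missing package, or a binary extension that fails to load
            info[name] = f"not importable ({type(exc).__name__})"
            continue
        info[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # CUDA version torch was built with (None for CPU-only builds)
            info["torch.version.cuda"] = module.version.cuda
            info["torch.cuda.is_available"] = module.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which can differ from the system toolkit version.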

@chrischoy
Contributor

I see. It seems this was a PyTorch 1.8-related issue. PyTorch 1.8 compatibility will be fixed in the next version.

@chrischoy
Contributor

This is the PyTorch 1.8.1 + CUDA 11 error tracked in #330. Please refer to #330 for updates.
