Cuda error while initializing SparseTensor #338

victoryc · 2021-03-30T00:14:17Z

I am getting the following error while initializing a SparseTensor:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  CUDA free failed: cudaErrorIllegalAddress: an illegal memory access was encountered

The short script below reproduces the error. It is derived from the sparse_tensor_basic.py file in the MinkowskiEngine repo.

import torch
import MinkowskiEngine as ME

data_batch_0 = [
    [0, 0, 2.1, 0, 0],  #
    [0, 1, 1.4, 3, 0],  #
    [0, 0, 4.0, 0, 0]
]


def to_sparse_coo(data):
    # An intuitive way to extract coordinates and features
    coords, feats = [], []
    for i, row in enumerate(data):
        for j, val in enumerate(row):
            if val != 0:
                coords.append([i, j])
                feats.append([val])
    return torch.IntTensor(coords), torch.FloatTensor(feats)

def sparse_tensor_initialization():
    coords0, feats0 = to_sparse_coo(data_batch_0)
    coords0 = coords0.cuda()
    feats0 = feats0.cuda()
    coords0, feats0 = ME.utils.sparse_collate(coords=[coords0], feats=[feats0], device=coords0.device)
    A = ME.SparseTensor(coordinates=coords0, features=feats0)
    return A


if __name__ == '__main__':
    sparse_tensor_initialization()

I am using CUDA version 11.2.
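
To rule out the input data as the cause, the coordinate extraction in the script above can be checked on the CPU without torch or MinkowskiEngine. The sketch below is a plain-Python restatement of to_sparse_coo that returns lists instead of tensors. The name to_sparse_coo_lists is hypothetical and not part of the original report; it only shows which coordinates and features the script hands to sparse_collate.

```python
# Same sample data as in the reproduction script.
data_batch_0 = [
    [0, 0, 2.1, 0, 0],
    [0, 1, 1.4, 3, 0],
    [0, 0, 4.0, 0, 0],
]


def to_sparse_coo_lists(data):
    """Collect (row, column) coordinates and values of the non-zero entries."""
    coords, feats = [], []
    for i, row in enumerate(data):
        for j, val in enumerate(row):
            if val != 0:
                coords.append([i, j])
                feats.append([val])
    return coords, feats


coords, feats = to_sparse_coo_lists(data_batch_0)
print(coords)  # [[0, 2], [1, 1], [1, 2], [1, 3], [2, 2]]
print(feats)   # [[2.1], [1], [1.4], [3], [4.0]]
```

Five non-negative 2D coordinates with one feature each come out, so the input looks well formed. That fits the later diagnosis in this thread that the failure is tied to the library versions rather than to the data.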


chrischoy · 2021-03-31T06:11:20Z

I can't replicate the error. Could you report the environment as specified in the issue template?

victoryc · 2021-03-31T19:53:47Z

Unfortunately, after I posted the issue, I had to upgrade PyTorch to the latest version (1.8.1) for other reasons, and as I understand it, that version isn't compatible with MinkowskiEngine. So I am currently unable to provide the complete and exact environment info for the state in which I hit the issue. Below is partial info about the environment.

OS: Ubuntu 20.04 in WSL2 (Windows Subsystem for Linux 2) running on top of Windows 10
Python version: 3.8.5
Output of the diagnostics command (with the latest MinkowskiEngine, this is MinkowskiEngine.print_diagnostics()):
==========System==========
Linux-5.4.91-microsoft-standard-WSL2-x86_64-with-glibc2.10
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04 LTS"
3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0]
==========Pytorch==========
1.7.1
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
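
The diagnostics above stop at the NVIDIA-SMI header and do not show which CUDA version PyTorch was built against. As a minimal sketch, assuming only the standard library plus torch if it is installed, the same kind of information could be gathered as follows. The function name collect_env is hypothetical and is not a MinkowskiEngine API.

```python
import platform
import shutil
import subprocess
import sys


def collect_env():
    """Gather basic platform, Python, PyTorch, and GPU driver information."""
    info = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
    }
    try:
        import torch
    except ImportError:
        # torch is optional here; record its absence instead of failing.
        info["torch"] = None
    else:
        info["torch"] = torch.__version__
        # CUDA version PyTorch was compiled against (None for CPU-only builds).
        info["torch_cuda_build"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()

    smi = shutil.which("nvidia-smi")
    info["nvidia_smi"] = None
    if smi is not None:
        try:
            result = subprocess.run(
                [smi], capture_output=True, text=True, timeout=30
            )
            info["nvidia_smi"] = result.stdout
        except (OSError, subprocess.SubprocessError):
            pass  # leave as None if the tool cannot be run
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Comparing torch_cuda_build with the toolkit version reported by the system is one way to spot a mismatch between the PyTorch build and the installed CUDA.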

chrischoy · 2021-04-01T01:37:24Z

I see. It seems like this was a PyTorch 1.8-related issue. PyTorch 1.8 compatibility will be fixed in the next version.

chrischoy · 2021-04-08T02:57:48Z

This is the PyTorch 1.8.1 + CUDA 11 error tracked in #330. Please refer to #330 for updates.
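
For readers who want a script to warn early instead of aborting with the thrust error, a version guard along these lines may help. It is only a sketch: the helper names parse_version and looks_like_issue_330 are hypothetical, and the condition (PyTorch 1.8.x together with CUDA 11.x) is an assumption read off the comments in this thread, not a documented compatibility matrix. It applies only to MinkowskiEngine releases from before the 1.8 compatibility fix.

```python
import re


def parse_version(text):
    """Return the leading numeric components, e.g. '1.8.1+cu111' -> (1, 8, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text or "")
    if match is None:
        return ()
    return tuple(int(part) for part in match.group().split("."))


def looks_like_issue_330(torch_version, cuda_version):
    """Assumed condition from this thread: PyTorch 1.8.x with CUDA 11.x."""
    torch_v = parse_version(torch_version)
    cuda_v = parse_version(cuda_version)
    return torch_v[:2] == (1, 8) and cuda_v[:1] == (11,)


print(looks_like_issue_330("1.8.1+cu111", "11.1"))  # True
print(looks_like_issue_330("1.7.1", "11.2"))        # False
```

In a real script the two arguments could come from torch.__version__ and torch.version.cuda.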

victoryc mentioned this issue Mar 30, 2021

Cuda 11.1 - Coordinate manager #330

Closed

chrischoy closed this as completed Apr 8, 2021
